Preemptive vs. Cooperative Multitasking


Some Arduino users may not have heard of the ESP8266.

It is a 32-bit processor running at 80/160 MHz with about 80 kB of data RAM

and 35 kB of instruction RAM, plus WiFi hardware, on one chip.

To build firmware for that machine you have to use a special toolchain which has

nothing to do with Arduino. This is what Kartman talked about.

 

Introduction:

http://hackaday.com/2015/03/18/how-to-directly-program-an-inexpensive-esp8266-wifi-module/#cite4

 

Datasheet:

https://drive.google.com/file/d/0B_ctPy0pJuW6Y0FHcDlVY09Xdjg/view?pli=1

 

From the point of view of an Arduino user this works a bit like an old modem with

AT-commands:

https://nurdspace.nl/ESP8266#AT_Commands

 

It is a good example, because Arduino users normally do not want to build new

firmware for that chip; they want help using it.

 

My points:

Building firmware for that machine is quite different from programming an Arduino.

The Arduino is limited by its SRAM and stack space. Porting an OS (like Linux) is not possible

because of these limitations.

 

Well, the audience of this thread is Arduino users. For them, multitasking (MT) with a stack-switching OS is often not the best choice, because:

- It is error prone if the stack size of each task is not calculated very carefully. The stack space a task needs depends on which library modules it uses.

Tests showed: 75 bytes of stack are enough for some tasks, 200 are not enough for others.

When you add new interrupt sources you may have to recalculate the stack sizes.

- Libraries are not built for multitasking. Being able to interrupt a library routine at any line of its code will cause problems.

- Cooperative multitasking (CO) is one way to overcome these problems, because:

- Just one stack is used.

- All the usual elements of an OS, such as atomic access, semaphores etc., can be built as well.

- Using libraries is less error prone because task switches are deterministic.

- Task switch time is short, because fewer registers are saved.

- Building a small "state machine" on top of timer interrupts is no problem as long as the execution time

of all modules of this "RTOS" is less than the tick time.
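As an illustration of the point that semaphores can be built on a single stack, here is a minimal sketch of a binary semaphore for cooperative tasks. The names `CoSemaphore`, `coTryTake` and `coGive` are hypothetical and not taken from the examples in this thread. The sketch assumes the semaphore is only touched by tasks, never by interrupt routines, so no interrupt masking is needed.

```cpp
#include <stdint.h>

// Hypothetical binary semaphore for cooperative tasks sharing one stack.
// A task switch can only happen where a task returns to the scheduler, so
// the test-and-set below cannot be interrupted by another task.
// (Sharing it with an ISR would need additional protection.)
struct CoSemaphore {
  bool taken = false;
  uint8_t owner = 0;   // id of the task holding the semaphore (illustrative)
};

// Try to take the semaphore; returns true on success.
// A task that gets false returns to the scheduler and retries on its next turn.
inline bool coTryTake(CoSemaphore &s, uint8_t taskId) {
  if (s.taken) return false;
  s.taken = true;
  s.owner = taskId;
  return true;
}

// Release the semaphore; only the current owner may do so.
inline bool coGive(CoSemaphore &s, uint8_t taskId) {
  if (!s.taken || s.owner != taskId) return false;
  s.taken = false;
  return true;
}
```

A task would typically call `coTryTake()` at the top of a state and simply return to the scheduler when it fails, which is the cooperative equivalent of blocking.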

 

I have built several examples using these techniques in the last weeks, and all of them run without problems.

For instance:

Using the UART for the serial line and building a soft serial line with:

115200 baud, half duplex, send without interrupts, receive characters with a pin change interrupt

and its own buffer.

Both lines can be used as streams, printing for instance floating point numbers, correctly

rounded and with a selectable number of fraction digits.

The CO task scheduler is guaranteed to be called every 500 us in the worst case (mostly every 17 to 23 us).

The timer tick time is 100 us.

 

Here the output (TX) of Hardware Serial is connected to the input (RX) of Software Serial to test

Soft Serial reception using interrupts.

Task 1 sends to Hardware Serial (which is connected to the Soft Serial input).

Task 2 sends to Soft Serial directly.

A third task sends the characters received by Software Serial to the output of Software Serial.

A semaphore regulates access to the Software Serial output, so that the output of tasks 2 and 3 can be mixed.

Task 3 sends if the input buffer watermark of 80% is reached OR if an LF is detected in the character stream received from task 1.

Channel 1: Hardware Serial out = task 1, Channel 2: Software Serial out = output of tasks 2 and 3, Channel 3: signals from the CO scheduler

The 473 us gap caused by library usage (printf) could be shortened with additional programming.

BTW: Faster baud rates could be used. Maybe in the future the Arduino IDE Monitor will allow this?

The Software Serial receive interrupt is called only at the start bit. Higher baud rates will improve the performance.

 

With CO I have tested:

- Sending events from interrupts and tasks to other waiting tasks

- Using a mailbox to send messages from many tasks to one receiver
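A mailbox with many senders and one receiver can be sketched in a few lines. This is only an illustration under assumptions: the `Message` layout, the `Mailbox` template and the names `post` and `fetch` are hypothetical and not the code that was tested above. It assumes all senders and the receiver are cooperative tasks, so no locking is needed.

```cpp
#include <stdint.h>
#include <stddef.h>

// Hypothetical message type: which task sent it, plus a small payload.
struct Message {
  uint8_t sender;    // id of the sending task
  uint16_t payload;  // illustrative payload
};

// Hypothetical fixed-size mailbox: many sender tasks, one receiver task.
template <size_t N>
struct Mailbox {
  Message slots[N];
  size_t head = 0;   // next slot to write
  size_t tail = 0;   // next slot to read
  size_t count = 0;  // messages currently stored

  // Called by any sender task. Returns false when the mailbox is full;
  // the sender then returns to the scheduler and retries on its next turn.
  bool post(const Message &m) {
    if (count == N) return false;
    slots[head] = m;
    head = (head + 1) % N;
    ++count;
    return true;
  }

  // Called by the single receiver task. Returns false when empty.
  bool fetch(Message &out) {
    if (count == 0) return false;
    out = slots[tail];
    tail = (tail + 1) % N;
    --count;
    return true;
  }
};
```

If an interrupt routine should post as well, the updates of `head` and `count` would have to be protected, for example by briefly disabling interrupts.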

 

It is easy to build unreliable constructs when writing programs, no matter which language,

processor, OS etc. you use.

But there are a lot of helpful techniques to avoid such situations, and they do not depend on

preemptive multitasking.

 

Once again: we are not talking about building firmware for the mighty ESP8266 or building

a new Linux, but about pushing the Arduino to its limits. And here I am sure:

with CO you are able to build solutions which could not be done using PE (preemptive multitasking).

I will show some examples on YouTube in the next weeks, and all visitors will have the chance

to do it better, with an example, not with words.

What do I want to say? Try all those techniques Kartman is not willing to use and be happy with the fast

and reliable solutions. Some of us are willing to be taught even if they are experts.

Some of us are willing to accept rules without asking; some of us ask about the sense

of each rule and try to modify the rules. Some of us call it science.

Some are sure they know how the world is built because they can read it in the Bible.

Do not take the words of an expert as the one and only truth.

 

Teaching means: show me how I can do it; show me why I should not do it.

In this thread „Preemptive vs. Cooperative Multitasking“ in the Arduino Forum I have

shown how to do your own experiments using CO, and I have tested and shown that

CO can be a mighty tool to program an Arduino.

It is dedicated to the audience of Arduino programmers.

„Isr code is usually hand crafted to do the bare minimum and hand off the rest to tasks external to the isrs“

That's what I have tried to show here.

 

BTW: Experts may have different opinions: "Tanenbaum–Torvalds debate" (Wikipedia)

" Torvalds wants it understood that he holds no animosity towards Tanenbaum, and Tanenbaum underlines that

disagreements about ideas or technical issues should not be interpreted as personal feuds"

 

I hope, this is true here as well ;)

 

Last Edited: Mon. Mar 23, 2015 - 03:08 PM

Helmut, I think you're rambling or maybe pontificating. I can tell you there is a product in Europe that is #1 in its product segment and has generated significant income, where I pulled just about every trick in the book as I was balancing code size, RAM size and CPU cycles. Unfortunately, it was a handful to maintain and a source of defects. Yes, you can cross a road blindfolded, but I wouldn't recommend it. You are free to choose how you cross a road.

Also, don't quote people of note to support your flimsy theories. Especially out of context.


The "flimsy" idea of saving the state of a task in order to resume it later at that point is as old as computer science. PE is one possibility; CO and/or state machines, interrupts etc. are other implementations. You will find them around the world in most big programs and OSs.

Calling coroutines, which are well explained by "Don" Knuth, "flimsy" is a way of preaching your point of view with general statements, never being concrete. I will not accept that any longer.

I think the many views in the last weeks show that there are some people interested in doing multitasking on small MCUs without PE.

But I guess they are more interested in usable examples than in your comments. We will see what you have to offer to this part of the audience.

Well, you stated more than once that this topic is not interesting for your audience, and I have to accept that, but again: I don't think you are right.

In the Arduino world we are now living with the upcoming yield() called by delay(). There are better methods using ANSI C to build multitasking.

It is obvious that you do not like my posts and I will stop posting at AVRFreaks, congratulations.

Who is "rambling" and "pontificating" is up to the reader, if he is able to read this.

Bye.

 

PS: Helping blind people cross roads using microcontrollers is a current research project at German universities.

Last Edited: Wed. Mar 25, 2015 - 02:49 AM

Hello, it seems this thread on AVRFreaks could end soon. Could somebody please help and post at least some links to other, similar sites and topics? I found this very interesting, but only by chance, and it may now be hard to find a continuation. So: good work, many thanks, but the social discussions are a problem :)

By links I mean links to more purely technical threads about microcontroller internals, because it is a common problem to find the limits and define them, although everybody knows they exist. So no lists of official RTOSes or libraries etc., but rather detailed discussions of design ideas with functional code samples of the architecture attached, ideally backed by physical measurements of the results, like timing, on real MCUs.

Setting up the conditions for systematic work like that presented here is often not feasible in ordinary low-budget development. One is then forced to rely solely on the package recommendations, sometimes with unwanted "too safe" limits, or, vice versa, ends up straying while chasing inscrutable faults.


Helmut,

 

I have just returned from a 2 week holiday. I was following this thread before I left and have just finished reading the new posts. I encourage you to "not go away". I intend to replicate some of your test code and "play with them a little".

 

Cheers,

 

Ross

 

Ross McKenzie ValuSoft Melbourne Australia


Thank you, Ross - I will give it one more try.

Multitasking is a big topic, but not every Arduino user knows how to do it.

I see 4 possibilities:

1) The Arduino IDE introduced loop(), which is called from the (invisible) main() again and again.

That's not multitasking.

But if you introduce some timing, it's easy to create the illusion of MT:

void setup() {
  Serial.begin(115200);
}

void loop() {
static unsigned long counter;
  counter++;
  Serial.println("Task1-Part-1");
  if ((counter % 10==0)) Serial.println("Task2");
  Serial.println("Task1-Part-2");
}

This gives the illusion of 2 tasks running in parallel - and the illusion is all that counts.

We only have one CPU, which can only run one task (or part of a task) at a time.

We can extend this example using a trick to avoid delay():

unsigned long start1=micros();
unsigned long start2=micros();

void setup() {
  pinMode(13,OUTPUT);      // LED
  Serial.begin(115200);
}

void loop() {
static unsigned long counter;

  counter++;
  Serial.println("Task1-Part-1");
  if ((counter % 10==0)) Serial.println("Task2");
  
  if ((micros()-start1) > 20000) {
    Serial.println("Task-3");
    start1=micros();
  }
  
  if ((micros()-start2) > 500000) {
    digitalWrite(13,!digitalRead(13));
    start2=micros();
  }  
  
  Serial.println("Task1-Part-2");
}

Task1-Part-1
Task1-Part-2
Task1-Part-1
Task1-Part-2
Task1-Part-1
Task2
Task1-Part-2
Task1-Part-1
Task-3
Task1-Part-2
Task1-Part-1
Task1-Part-2
Task1-Part-1
Task1-Part-2
Task1-Part-1
Task1-Part-2
Task1-Part-1
Task1-Part-2
Task1-Part-1
Task1-Part-2
Task1-Part-1
Task1-Part-2
Task1-Part-1
Task1-Part-2
Task1-Part-1
Task1-Part-2
Task1-Part-1
Task2
Task-3
Task1-Part-2
Task1-Part-1
Task1-Part-2

 

And the LED blinks once per second.

 

Well, it's all in one function in LOOP, but we are near real MT.
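One detail of the `micros()-start` trick is worth spelling out: on AVR-based Arduinos the counter is a 32-bit unsigned value that wraps around after roughly 71 minutes, and the subtraction still gives the right answer across that wrap because unsigned arithmetic is modulo 2^32. The following is a small host-testable sketch. The helper names are hypothetical, and the time is passed in as a parameter instead of calling `micros()` so that it can be tried on a PC.

```cpp
#include <stdint.h>

// Elapsed microseconds between two readings of a free-running 32-bit counter.
// Unsigned subtraction is modulo 2^32, so the result stays correct even if
// the counter wrapped around once between 'start' and 'now'.
inline uint32_t elapsedMicros(uint32_t now, uint32_t start) {
  return now - start;
}

// True once at least 'interval' microseconds have passed since 'start'.
inline bool intervalElapsed(uint32_t now, uint32_t start, uint32_t interval) {
  return elapsedMicros(now, start) >= interval;
}
```

Comparing absolute times instead (for example `now > start + interval`) would misbehave around the wrap, which is why the subtraction form is the safer habit.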

 

 

 

2) Using interrupts is the most common way to do real MT on MCUs like the Arduino.

With an interrupt (external, timer, ...) the running program is suspended.

An interrupt function is called. The state of the CPU (registers, program counter,

status register) is saved by pushing values on the stack.

Then the body of the interrupt function does its job.

At the end of the interrupt function the saved values are popped from the stack and

the suspended program continues where it was stopped.

That's what we are using when we use millis() and micros(), because they are

updated by timer interrupts running in parallel to our program.

 

3) You can build a state machine inside each called task:

unsigned long start1=micros();

void task1() {
static unsigned int state;
  switch (state) {
    case 0: Serial.println("Task1-Part1"); state++;
            break;
    case 1: Serial.println("Task1-Part2"); state++;
            break;
    case 2: Serial.println("Task1-Part3"); state++;
            break;
    case 3: Serial.println("Task1-Part4"); state=0;
            break;
  }         
}

void task2() {
static unsigned int state;
  switch (state) {
    case 0: Serial.println("Task2-Part1"); state++;
            break;
    case 1: Serial.println("Task2-Part2"); state=0;
            break;
  }         
}


void task3() {
  if ((micros()-start1) > 500000) {
    digitalWrite(13, !digitalRead(13));
    start1=micros();
  }  
}


void setup() {
  pinMode(13,OUTPUT);      // LED
  Serial.begin(115200);
  Serial.println("Start...");
}

void loop() {
  task1();
  task2();
  task3();
}

There is no doubt, we have some kind of MT.

We decide inside our functions when to return to loop(). This seems to give unpredictable results, because we don't know exactly when each task restarts.

But as we can see, the blinking is done pretty precisely:

 

 

 

4) The only MT some people will accept is preemptive MT:

(simplified!)

A timer is used for interrupts. Each timer interrupt saves the state of the CPU as described. Then it changes the stack pointer to another (prepared) area and pops the values which another task

pushed some time before, so that this task runs. The next timer interrupt switches to the next task ...

Advantages:

Every task seems to own the CPU. It is guaranteed to get time to work, and it is easy to guarantee that it will get CPU time at least, for instance, every 5 ms.

This is the way all our OSs work. We know well how to build them, and there are a lot of add-ons for interprocess communication (semaphores, mutexes, ...).

There are a lot of commercial and free OSs around which have been tested over years, and many of them are available for Arduino.

If you can: use them!

 

But there are some disadvantages:

The ATmega328 has 32 registers. To save them all plus additional information, each task needs maybe 40 bytes. For interrupts, you need another 40 bytes. And each task

needs its own stack space. My experiments show: 200 bytes of stack space for each task is a good value.

Making the stack space too small will end in a crash, because you overwrite the stack space of another task. If you allocate more stack space to be safe, you waste

some memory. All in all you may get 5 tasks if you want to use half of the SRAM for the tasks.

And: saving all registers costs some time. For that reason the usual task switch interval is 1 ms.

And: problems with the timing of libraries are possible.
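The memory estimate above can be written down as a small calculation. This is a sketch under assumptions: the 2048 bytes of SRAM are the ATmega328 datasheet value (not stated in the post), the roughly 40 bytes of saved context are assumed to live inside each task's 200-byte stack, and `maxTasks` is a hypothetical helper name.

```cpp
#include <stdint.h>

// Number of preemptive tasks that fit into a RAM budget when every task
// needs its own stack (the saved context is assumed to be part of that stack).
constexpr uint16_t maxTasks(uint16_t budgetBytes, uint16_t stackPerTask) {
  return stackPerTask == 0 ? 0 : budgetBytes / stackPerTask;
}

// Figures used in the discussion: 2048 bytes of SRAM on the ATmega328
// (datasheet value), half of it reserved for task stacks, 200 bytes per task.
constexpr uint16_t kSramBytes    = 2048;
constexpr uint16_t kTaskBudget   = kSramBytes / 2;   // 1024 bytes
constexpr uint16_t kStackPerTask = 200;

static_assert(maxTasks(kTaskBudget, kStackPerTask) == 5, "about 5 tasks fit");
```

If the 40 bytes of context had to be added on top of the 200 bytes, `maxTasks(1024, 240)` would give 4 instead, which shows how quickly the budget shrinks.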

 

1-3) are called cooperative MT, 4) is preemptive MT.

 

I have tested a lot, and FOR ME the combined methods 1-3:

* work faster (no stack manipulation necessary)

* need less space when working with an Arduino (or smaller) - (all tasks and interrupts share one stack, without wasted space and far away from a stack overrun)

And to kill a fairy tale: everything you may need for MT (semaphores, mutual exclusion, events, message boxes, task lists for different task states such as waiting for a timed next call, waiting for a resource, blocked ...)

can be built as you need it!

Indeed, some of these can be implemented much more easily, because you know very well when task switching is done - no other task will disturb you.

Here is a fully working example. Maybe for some people it is primitive. But I don't want to show how clever I am, but how you can manage MT with less than 80 lines of code and 4 tasks - no libraries used:

// (C) 2015 Helmut Weber

// to be used later
#define FINISHED   4000000000UL   // UL: the values do not fit into a signed 32-bit long
#define ENDED      4000000001UL
#define WAITMUT    4000000002UL


#define crReturn do{ state=__LINE__; return 0; \
  case __LINE__:; } while (0)

#define crBegin static uint32_t state=0; static uint32_t strt; \
static bool first=true; \
if (first) { first=false; strt=micros(); } \
switch(state) { case 0:

#define crFinish state=0; first=true; return FINISHED;}

#define crEnd    state=0; first=true; return ENDED;}

#define crWait(X)  \
do{ \
  state=__LINE__;   return 0; \
  case __LINE__: ;\
} while ((micros()-strt) < (unsigned long)(X)); first=true

unsigned long start1=micros();

// Here are the tasks

uint32_t task1() {
  crBegin;
  Serial.println("Task1-Part1");
  // stop this task for 5000 us = 5 ms
  // the other tasks will go on !
  crWait(5000);
  Serial.println("Task1-Part2");
  crWait(5000);
  Serial.println("Task1-Part3");
  crWait(5000);
  Serial.println("Task1-Part4"); 
  crWait(5000);
  crFinish;  
}


uint32_t task2() {
  crBegin;
  Serial.println("Task2-Part1");
  crWait(100000);
  Serial.println("Task2-Part2");
  crWait(100000);
  crFinish;
}


// blink LED with 20 ms pulses
uint32_t task3() {
  crBegin;
  digitalWrite(13, !digitalRead(13));
  crWait(20000);
  crFinish;
}

// test round robin time (pin 2 toggles once per scheduler round, for the scope)
uint32_t task4() {
  crBegin;

  // ATTENTION: you will find these WHILE(1) constructs in many preemptive MT programs
  // you are able to use them in cooperative MT as well !
  
  while(1) {
    digitalWrite(2, !digitalRead(2));
    crReturn;
  }
  crFinish;
}


void setup() {
  pinMode(13,OUTPUT);      // LED
  pinMode(2,OUTPUT);       // for scope to test round robin time
  Serial.begin(115200);
  Serial.println("Start...");
}

void loop() {
  task1();
  task2();
  task3();
  task4();
}

 

I am sure: this is a very simple way to begin MT with Arduino, and it is useful for beginners.

Please experiment with this template and report on your tests.
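For readers who want to experiment, the mechanism behind macros like crBegin and crReturn can be isolated in a few lines. The following is a stripped-down sketch, not the template itself: the macro names `coBegin`, `coYield` and `coEnd` are hypothetical, timing is left out, and the function returns a value on every yield so that it can be tried on a PC. The idea is that a static variable remembers the source line where the function last returned, and a `switch` jumps straight back to that line on the next call.

```cpp
// Illustrative macros: 'state' stores the line of the last yield, and the
// switch at the top of the function jumps to the matching case label.
#define coBegin    static unsigned int state = 0; switch (state) { case 0:
#define coYield(v) do { state = __LINE__; return (v); case __LINE__:; } while (0)
#define coEnd(v)   } state = 0; return (v)

// Yields 1, 2, 3 on successive calls, then 0, then starts over.
int counterTask() {
  static int i;   // variables that must survive a yield have to be static
  coBegin;
  for (i = 1; i <= 3; i++) {
    coYield(i);   // return to the caller, resume here on the next call
  }
  coEnd(0);
}
```

Two restrictions follow directly from this construction: each yield has to sit on its own source line, and ordinary local variables lose their values across a yield, which is why the tasks in this thread use `static` variables.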

 

Next time I will show you how to combine it with interrupts.

 

To be continued ...

 

Last Edited: Fri. Apr 10, 2015 - 01:41 PM

Interrupt-driven SoftSerial receiver at 115200 baud - that is not possible. That is what you can read everywhere.

Here I show you how to do it together with MT.

About 6300 characters per second are transferred!

The pin change interrupt starts with the falling edge of the start bit. Then the bits are sampled using timed loops with interrupts disabled.

This stops the interrupt system for about 82 us.
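The numbers above can be put into context with a little arithmetic. This is a sketch assuming 8N1 framing (1 start bit, 8 data bits, 1 stop bit, the default of Arduino's `Serial.begin()`); the helper names are hypothetical.

```cpp
#include <stdint.h>

// Duration of one UART bit in nanoseconds (integer division, rounded down).
constexpr uint32_t bitTimeNs(uint32_t baud) {
  return 1000000000UL / baud;
}

// Duration of a frame of 'bitsPerFrame' bits in nanoseconds.
// An 8N1 frame has 10 bits: 1 start bit, 8 data bits, 1 stop bit.
constexpr uint32_t frameTimeNs(uint32_t baud, uint32_t bitsPerFrame) {
  return (bitsPerFrame * 1000000000ULL) / baud;
}

// Upper bound on characters per second for back-to-back 8N1 frames.
constexpr uint32_t maxCharsPerSecond(uint32_t baud) {
  return baud / 10;
}
```

At 115200 baud one bit lasts about 8.68 us and a complete frame about 86.8 us, so an interrupt routine that blocks for about 82 us covers almost the whole frame. The theoretical ceiling is 11520 characters per second, so about 6300 characters per second is a bit more than half of what the line could carry.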

 

It worked well until a timer 0 interrupt started just before the receiver interrupt and delayed the SoftSerial receiver pin change interrupt.

I rewrote the timer 0 interrupt so that it can be interrupted by the pin change interrupt.

In effect this gives the SoftSerial interrupt a higher priority.

Without this we would get some corrupted characters due to the delays caused by the timer interrupts.

SoftSerial got a buffer of 500 bytes. The buffer is printed when it is more than 80% full.
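The buffer with its 80% watermark can be sketched independently of the interrupt code. This is an illustration only: the class `RxRing` and its method names are hypothetical, and the sketch leaves out the interrupt protection that the fill counter needs when an interrupt routine fills the buffer while a task empties it.

```cpp
#include <stdint.h>

// Hypothetical receive ring buffer with a fill-level watermark.
// CAPACITY and WATERMARK_PERCENT mirror the 500 bytes / 80 % used here.
template <uint16_t CAPACITY, uint8_t WATERMARK_PERCENT>
class RxRing {
 public:
  // Store one byte; returns false (byte dropped) when the buffer is full.
  bool push(uint8_t b) {
    if (count_ == CAPACITY) return false;
    data_[head_] = b;
    head_ = (head_ + 1 == CAPACITY) ? 0 : head_ + 1;
    ++count_;
    return true;
  }

  // Remove the oldest byte; returns false when the buffer is empty.
  bool pop(uint8_t &out) {
    if (count_ == 0) return false;
    out = data_[tail_];
    tail_ = (tail_ + 1 == CAPACITY) ? 0 : tail_ + 1;
    --count_;
    return true;
  }

  uint16_t size() const { return count_; }

  // True when the fill level is strictly above the watermark.
  bool aboveWatermark() const {
    return (uint32_t)count_ * 100UL > (uint32_t)CAPACITY * WATERMARK_PERCENT;
  }

 private:
  uint8_t data_[CAPACITY];
  uint16_t head_ = 0;   // next write position
  uint16_t tail_ = 0;   // next read position
  uint16_t count_ = 0;  // bytes currently stored
};
```

With `RxRing<500, 80>` the watermark triggers from the 401st stored byte on, which corresponds to the "more than 400 characters" test used by the printing task.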

 

The Hardware:

Some text is printed with the normal Serial. Pin 1 (TX) is connected to pin 8, which is RX of SoftSerial.

SoftSerial collects the incoming bytes and prints the buffer from time to time via Serial.

While printing the buffer we have to stop buffering THESE characters to avoid recursion.

The Tasks:

Task-1 prints a text line every 5 ms

Task-2 tests every 2 ms whether the buffer holds more than 400 characters (80%). If it does, it prints the buffer.

          Task-2 could be used to implement an XON/XOFF protocol!

Task-3 prints the number of characters received every 100 ms

 

Here you can see the traffic on the SoftSerial line

The first line at "ReadBit" shows when the SoftSerial IRQ is started by the start bit.

The next 8 lines show when the bits are read.

A normal timer 0 interrupt. The test of the SoftSerial receiver interrupt flag is done in the gap of 1.8 us

Here a timer interrupt happens just before the start bit. But the pin change interrupt flag is tested in the timer 0 interrupt routine.

The pin change interrupt (PCI) routine is called from the timer 0 interrupt routine.

The normal delay of 3 us from the edge of the start bit to the PCI routine is now a little longer: 5.5 us

This is compensated in the PCI routine, and the bits are read at the correct points.

When the PCI routine returns control to the timer interrupt routine (marked as IRQ nesting), the timer interrupt goes on to do its job.

This stretches the timer 0 interrupt to 92 us

 

Here is the output of the program.

 

Start...
<Here is Task-1 64288>
           Task-3 42
<Here is Task-1 69596>
<Here is Task-1 74620>
<Here is Task-1 79648>
<Here is Task-1 84720>
<Here is Task-1 89792>
<Here is Task-1 94816>
<Here is Task-1 99844>
<Here is Task-1 104872>
<Here is Task-1 109908>
<Here is Task-1 114948>
<Here is Task-1 119988>
<Here is Task-1 125028>
<Here is Task-1 130012>
<Here is Task-1 135052>
<Here is Task-1 140092>
<Here is Task-1 145132>

SoftSerial received:
[
Here is Task-1 64288
Here is Task-1 69596
Here is Task-1 74620
Here is Task-1 79648
Here is Task-1 84720
Here is Task-1            Task-3 606
89792
Here is Task-1 94816
Here is Task-1 99844
Here is Task-1 104872
Here is Task-1 109908
Here is Task-1 114948
Here is Task-1 119988
Here is Task-1 125028
Here is Task-1 130012
Here is Task-1 135052
Here is Task-1 140092
Here is Task-1 145132
]
<Here is Task-1 208008>
<Here is Task-1 213044>
<Here is Task-1 218032>
<Here is Task-1 223068>
<Here is Task-1 228108>
<Here is Task-1 233172>
<Here is Task-1 238232>
<Here is Task-1 243272>
<Here is Task-1 248312>
<Here is Task-1 253352>
<Here is Task-1 258372>
<Here is Task-1 263412>
<Here is Task-1 268452>
           Task-3 1215
<Here is Task-1 274184>
<Here is Task-1 279248>
<Here is Task-1 284308>
<Here is Task-1 289348>

 

 

This is the code of the program. No libraries are used - fully functional:     (Do not forget the wire from pin 1 to pin 8!)

// (C) 2015 Helmut Weber

// SoftSerial receiver 115200 baud using pin change interrupt
// Receiver buffer of 500 Bytes
// Receiving about 6300 characters per second !
// Cooperative Multitasking
// New Timer0 Interrupt

// _________________________________________________________________________________________
// ______________________ Atomic access __ _________________________________________________
// _________________________________________________________________________________________
#define ATOMIC_ON     uint8_t sreg; sreg = SREG;  cli()      // Clear interrupts
#define ATOMIC_OFF    SREG = sreg;                           // Original state of Interupts



// _________________________________________________________________________________________
// ______________________ Bit manipulation _________________________________________________
// _________________________________________________________________________________________

// set bit
static inline void BIT_SET(volatile uint8_t *target, uint8_t bit) __attribute__((always_inline));
static inline void BIT_SET(volatile uint8_t *target, uint8_t bit) {
  *target |= (1 << bit);
};

// clear bit
static inline void BIT_CLEAR(volatile uint8_t *target, uint8_t bit) __attribute__((always_inline));
static inline void BIT_CLEAR(volatile uint8_t *target, uint8_t bit) {
  *target &= ~(1 << bit);
};

// test bit
static inline bool BIT_TEST(volatile uint8_t *target, uint8_t bit) __attribute__((always_inline));
static inline bool BIT_TEST(volatile uint8_t *target, uint8_t bit) {
  return (*target & (1 << bit));
};

#define BIT_PULSE(PORT,BIT) BIT_SET(PORT, BIT); BIT_CLEAR(PORT, BIT)
// _________________________________________________________________________________________


// _________________________________________________________________________________________
// ______________________ Prepare new Timer0 interrupt______________________________________
// _________________________________________________________________________________________


// the prescaler is set so that timer0 ticks every 64 clock cycles, and the
// the overflow handler is called every 256 ticks.
#define MICROSECONDS_PER_TIMER0_OVERFLOW (clockCyclesToMicroseconds(64 * 256))

// the whole number of milliseconds per timer0 overflow
#define MILLIS_INC (MICROSECONDS_PER_TIMER0_OVERFLOW / 1000)

// the fractional number of milliseconds per timer0 overflow. we shift right
// by three to fit these numbers into a byte. (for the clock speeds we care
// about - 8 and 16 MHz - this doesn't lose precision.)
#define FRACT_INC ((MICROSECONDS_PER_TIMER0_OVERFLOW % 1000) >> 3)
#define FRACT_MAX (1000 >> 3)

volatile unsigned long _timer0_overflow_count = 0;
volatile unsigned long _timer0_millis = 0;
static unsigned char timer0_fract = 0;

unsigned long m = _timer0_millis;
unsigned char f = timer0_fract;


// _________________________________________________________________________________________
// ______________________ New _micros(), _delay() and Timer0 interrupt _____________________
// _________________________________________________________________________________________


unsigned long _micros() {
  unsigned long m;
  uint8_t t;
  ATOMIC_ON;
  m = _timer0_overflow_count;
  t = TCNT0;
  ATOMIC_OFF;  
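  if ((TIFR0 & _BV(TOV0)) && (t < 255)) m++;
  // Note: with OCR0A = 249 (CTC mode, see setup()) the counter wraps every 250
  // ticks, so this scaling, which assumes 256, reads about 2.4 % fast.
  return ((m << 8) + t) * (64 / clockCyclesPerMicrosecond());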
}

volatile bool shortenStopBit;

ISR (PCINT0_vect);

ISR(TIMER0_COMPA_vect) {
  BIT_PULSE(&PORTD, 6);
  if ( BIT_TEST(&PCIFR, PCIF0)) {   // First do pending SoftSerial Interrupt
    shortenStopBit=true;
    PCINT0_vect();
    shortenStopBit=false;
    
    BIT_PULSE(&PORTD, 5);  
  }
  m += MILLIS_INC;
  f += FRACT_INC;
  BIT_PULSE(&PORTD,6);

  sei();                        // allow SoftSerial IRQ as fast as possible !!!
  if (f >= FRACT_MAX) {
    f -= FRACT_MAX;
    m += 1;
  }
  timer0_fract = f;
  _timer0_millis = m;
  _timer0_overflow_count++;

}



void _delay(unsigned long ms)
{
  uint16_t start = (uint16_t)_micros();

  while (ms > 0) {
    if (((uint16_t)_micros() - start) >= 1000) {
      ms--;
      start += 1000;
    }
  }
}


// _________________________________________________________________________________________
// ______________________ Multitasking state machine _______________________________________
// _________________________________________________________________________________________


// to be used later
#define FINISHED   4000000000UL   // UL: the values do not fit into a signed 32-bit long
#define ENDED      4000000001UL
#define WAITMUT    4000000002UL


#define crReturn do{ state=__LINE__; return 0; \
  case __LINE__:; } while (0)

#define crBegin static uint32_t state=0; static uint32_t strt; \
  static bool first=true; \
  if (first) { first=false; strt=_micros(); } \
  switch(state) { case 0:

#define crFinish state=0; first=true; return FINISHED;}

#define crEnd    state=0; first=true; return ENDED;}

#define crWait(X)  \
  do{ \
    state=__LINE__;   return 0; \
  case __LINE__: ;\
  } while ((_micros()-strt) < (unsigned long)(X)); first=true

unsigned long start1 = _micros();

// _________________________________________________________________________________________
// ______________________ Build SoftSerial Receiver Ringbuffer _____________________________
// _________________________________________________________________________________________


#define   RBUFMAX  500
uint8_t            RBuf[RBUFMAX];
volatile uint16_t  RBufNums = 0;
volatile uint16_t  RHead = 0;
volatile uint16_t  RTail = 0;
volatile bool      LF = false;

volatile bool      ReceiveAllowed = false;
volatile bool      SendAllowed = true;



// _________________________________________________________________________________________
// ______________________ SoftSerial Interrupt Receiver buffer _____________________________
// _________________________________________________________________________________________



#define PINX        0
#define DDBX        (DDB0+PINX)     // parenthesized so the macros are safe in any expression
#define PORTBX      (PORTB0+PINX)

uint8_t RByte = 0;

// __________________ Receive one bit and delay ________________________________
// (returns nothing: the sampled bit is shifted into RByte)

inline void fromPin(unsigned int dely)  {
  bool HiLo;
  HiLo = (BIT_TEST(&PINB, 0) != 0);
  switch (HiLo) {
    case (HIGH): RByte >>= 1; BIT_SET(&RByte, 7);
      //BIT_PULSE(&PORTD,6);
      break;
    case (LOW):  RByte >>= 1; BIT_CLEAR(&RByte, 7);
      asm("nop"); asm("nop");
      break;
  }                                  // 0.875 us until now
  while (dely) {                     // 1 - 1.875us 2 - 2.375  10 - 6.375 14-8.375
    asm("nop"); dely--;
  }
  asm("nop");                        // Fine tune delay
  asm("nop");
  asm("nop");
  asm("nop");
  asm("nop");
}

uint32_t ReceivedChars;


// __________________ SoftSerial Receive Interrupt ________________________________
// Receive at PortB, Pin 0 = Arduino digital pin 8

uint8_t  ii;
uint8_t RRByte;

ISR (PCINT0_vect)
{
  ATOMIC_ON;
  BIT_PULSE(&PORTD, 7);
  // shorten the start-bit delay when called from the timer interrupt
  shortenStopBit ? fromPin(9) : fromPin(14);
  RByte = 0;
  for (ii = 0; ii < 8; ii++) {
    BIT_PULSE(&PORTD, 7);
    fromPin(11);
  }
  RRByte = RByte; if (RByte == '\n') LF = true;
  fromPin(12);     // Stopbit
  BIT_SET(&PCIFR, PCIF0);
  if (ReceiveAllowed) {
    if (RBufNums < RBUFMAX) {
      RBuf[RHead++] = RRByte; if (RHead == RBUFMAX) RHead = 0;
      RBufNums++;
    }
  }
  ReceivedChars++;
  ATOMIC_OFF;
}


// _________________________________________________________________________________________
// ______________________ Tasks __________ _________________________________________________
// _________________________________________________________________________________________



// Send a message at Serial pin TX = Arduino digital pin 1
uint32_t task1() {
  crBegin;
  while (1) {
    ReceiveAllowed = true;        // Allow SoftSerial to receive characters
    if (SendAllowed) {
      Serial.print("<Here is Task-1 "); Serial.print(_micros()); Serial.println(">");
    }
    ReceiveAllowed = false;       // Disallow SoftSerial to receive characters
    crWait(5000);
    crReturn;
  }
  crFinish;
}


// Print the characters received by SoftSerial
uint32_t task2() {
static uint16_t nums;
  crBegin;
  while (1) {
    //Serial.print("Here is Task-2 "); Serial.println(RBufNums);
    ATOMIC_ON;
    nums=RBufNums;
    ATOMIC_OFF;
    if (nums > 400) {
      SendAllowed=false;    // Avoid recursion
      Serial.print("\nSoftSerial received:\n[\n");
      while (nums) {
        if ((RBuf[RTail]=='<') || (RBuf[RTail]=='>')) { // Filter < and > to show: This is comming from Buffer
          RTail++;
          if (RTail == RBUFMAX) RTail = 0;
          nums--;
        }
        else {
          Serial.write(RBuf[RTail++]); if (RTail == RBUFMAX) RTail = 0;
          nums--;
        }
        ATOMIC_ON;
        RBufNums--;
        ATOMIC_OFF;
        crReturn;
      }
      Serial.println("]");
      SendAllowed=true;
      crWait(2000);
    }
    crReturn;
  }
  
  crFinish;
}

// Print "Task-3" 10 times a second
uint32_t task3() {
  crBegin;
  while (1) {
    Serial.print("           Task-3 "); Serial.println(ReceivedChars);
    crWait(100000);
  }
  crFinish;
}





void setup() {

  // set timer0 compare interrupt at 1 kHz
  TCCR0A = 0;  // set entire TCCR0A register to 0
  TCCR0B = 0;  // same for TCCR0B
  TCNT0  = 0;  // initialize counter value to 0
  // set compare match register for 1 kHz (a 2 kHz rate would need OCR0A = 124)
  //OCR0A = 124;// = (16*10^6) / (2000*64) - 1 (must be <256)
  OCR0A = 249;   // = (16*10^6) / (1000*64) - 1 (must be <256)
  
  // turn on CTC mode
  TCCR0A |= (1 << WGM01);
  // Set CS01 and CS00 bits for 64 prescaler
  TCCR0B |= (1 << CS01) | (1 << CS00);
  // enable timer compare interrupt
  TIMSK0 |= (1 << OCIE0A);


  pinMode(7, OUTPUT);      // for scope to test round robin time
  pinMode(6, OUTPUT);      // for scope to test round robin time
  pinMode(5, OUTPUT);      // for scope to test round robin time

  Serial.begin(115200);
  Serial.println("Start...");
  _delay(1000);
  // Prepare SoftSerial receive interrupt at Arduino pin 8 = PORTB, pin 0
  DDRB &= ~(1 << DDBX);       // Set the PB0 pin as input
  PORTB |= (1 << PORTBX);     // turn on the pull-up
  PCICR |= (1 << PCIE0);      // set PCIE0 to enable PCMSK0 scan = PORTB 0-7
  PCMSK0 |= (1 << PINX);      // set PCINT0 to trigger an interrupt on state change of PB0
}


// Do Multitasking
void loop() {
  task1();
  task2();
  task3();
}

Conclusion:

A SoftSerial receiver at 115200 baud together with multitasking is possible!

 

Last Edited: Fri. Apr 17, 2015 - 02:40 AM

This is the best thread I have ever seen!

I spent the whole day reading through every single post and learned a lot.

You guys are definitely smarter than me, and it was a pleasure to read your discussion.

I'm developing Modbus-based PLCs where plus or minus 20-30 ms doesn't matter to me, and I am really impressed by how precise you are.

From tomorrow, I shall do my job much more accurately because of you.

Thank you very much, and I do hope that you will continue this thread.

 

Regards,

Peter

 


Very good thread. This is a very relevant discussion.  An ATMEGA1284 has 128k bytes (64k instructions) of program memory and 16k bytes of RAM. That is enough to do a whole lot of multitasking.

I have three major projects for products that will go to mass production where this whole cooperative versus preemptive discussion is a current topic.

Here is the difference as I see it:

In cooperative multitasking, a process has to run to completion before there can be a context switch, because 'C' only allows you one stack unless you do some nasty hacks; so you cannot save registers on the stack halfway through execution and resume another process by pulling its registers off the stack. Actually you could have 33 bytes of dedicated RAM for each process to save its registers, but this would not help you, because it is the call address trace that would screw you up.

So cooperative multitasking is fine providing no process will run long enough to delay a more urgent process from getting cycles. 

In fact, you have to look at the sum of the execution times of processes with priorities 1 to N and ensure that process N+1 can live with that delay.
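
As a worked example of that budget check, here is a minimal sketch. The function name, the microsecond units and the sample run times are assumptions made up for illustration, not measurements from any real project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Worst-case wait for the task at index n_higher: every task ahead of
   it (indices 0 .. n_higher-1) is assumed to run once to completion
   before this task gets any cycles. */
uint32_t worst_case_wait_us(const uint32_t *run_time_us, size_t n_higher)
{
    assert(run_time_us != NULL || n_higher == 0);
    uint32_t total = 0;
    for (size_t i = 0; i < n_higher; i++) {
        total += run_time_us[i];
    }
    return total;
}
```

With assumed run times of 200, 350 and 1200 us for the three higher-priority processes, the fourth one may have to wait up to 1750 us, so it must be able to tolerate that delay.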

You might be able to arrange that if you are writing all the code yourself. But if there are multiple programmers, the above restriction means one can affect the operation of another's code. I don't like that at all!

 

In my projects, although we lived with that restriction up to a point in development, everybody designing a real product as opposed to an Arduino hobby/toy will likely reach a point, as we have, where you can't live with that restriction forever.

 

So we come to preemptive multitasking.

The concept is simple. Each context gets its own stack on which its registers are saved if there is a context switch.

That allows all contexts free use of all registers without regard to what anything else is doing (Although so does cooperative multitasking).

There are a handful of system utilities that are useful for preemptive multitasking:

1. Suspend a process until reactivated (usually it will be completion of an IO operation that reactivates) and resume executing the highest priority active context.

    There is a small table of status flags that can be scanned to find the highest active. It will be one of lower priority than the one being suspended, otherwise it wouldn't have been running. So you only need to scan from its priority downwards.

    Anyway, it's such a tight loop of 3 assembler instructions that it takes no time at all.

 

Q0000:  LD   R1, Z+     ; READ A STATUS FLAG INTO R1, INCREMENTING POINTER
        TST  R1         ; IF IT IS ZERO, KEEP LOOKING
        BREQ Q0000      ; THE LAST PROCESS MUST ALWAYS BE ACTIVE

 

If nothing needs cycles, the lowest priority process, which you ensure is always active, can be a "go to sleep" program to save power if you want.

 

Many of the contexts that I have are Device Control Programs (DCPs). They all have their own stacks but need very little, typically just the 33 bytes for context save.

I assume any other programmer who is writing a context may call a DCP to do something like change the pins that he is allowed to control on a port that he is allowed to control, without screwing up the pins that somebody else might be allowed to control. There are other ways to mediate that using the CBI and SBI instructions, but that is just an example of something where you achieve mediation by queueing requests for action by non-reentrant resources.

Having multiple things hanging off the SPI port is another example. So another useful utility routine is

2. A QUEUER that chains requests for such a DCP by linking Device Control Blocks (DCBs) into a linked list on a first-come basis. This is also a good way of passing messages to and from processes.

You can think of the QUEUER as putting messages in a process's mailbox which it will check when it has time, and reply to on a first-come basis.
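
To make the mailbox idea concrete, here is a minimal sketch of such a first-come queue in C. The struct layout and the names `dcb`, `mailbox`, `queuer_put` and `queuer_get` are hypothetical and chosen for illustration; they are not the actual structures from my projects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical Device Control Block: one queued request. */
struct dcb {
    struct dcb *next;   /* link to the next request in the chain */
    int request;        /* placeholder payload */
};

/* Hypothetical per-DCP mailbox: head is served first, tail is newest. */
struct mailbox {
    struct dcb *head;
    struct dcb *tail;
};

/* Append a request at the tail (first come, first served). */
void queuer_put(struct mailbox *mb, struct dcb *d)
{
    assert(mb != NULL && d != NULL);
    d->next = NULL;
    if (mb->tail != NULL) {
        mb->tail->next = d;
    } else {
        mb->head = d;
    }
    mb->tail = d;
}

/* Remove and return the oldest request, or NULL if the mailbox is empty. */
struct dcb *queuer_get(struct mailbox *mb)
{
    assert(mb != NULL);
    struct dcb *d = mb->head;
    if (d != NULL) {
        mb->head = d->next;
        if (mb->head == NULL) {
            mb->tail = NULL;
        }
        d->next = NULL;
    }
    return d;
}
```

In a preemptive system both functions would have to run with interrupts disabled (or be protected in some other way), because an interrupt between the pointer updates would corrupt the list. That protection is left out of the sketch.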

 

If the sending/calling process requested it, it may have been suspended until such a reply was received. Then it will be reactivated when the reply is received.

It may not run immediately if it is not the highest priority active process. When a process is reactivated as a result of an interrupt-driven IO operation completing, you can check whether the process that was interrupted was of a higher or lower priority than the process being reactivated.

If higher, then you just do an RETI. No context switch needed! Else you do a reschedule and a context switch.
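
The decision at the end of the interrupt handler can be sketched in C as below. The table size, the name `wake_from_isr` and the convention that index 0 is the highest priority are assumptions for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_PROCESSES 4   /* assumed table size */

/* Status flags, one per process; index 0 is assumed to be the highest priority. */
static uint8_t active[NUM_PROCESSES];

/* Mark 'woken' as active again and report whether the interrupt handler
   must reschedule. If the interrupted process has the higher priority,
   a plain return from interrupt is enough. */
bool wake_from_isr(uint8_t woken, uint8_t interrupted)
{
    assert(woken < NUM_PROCESSES && interrupted < NUM_PROCESSES);
    active[woken] = 1;
    return woken < interrupted;
}
```

The common case, where a low priority process is woken while something more urgent is running, costs only one store and one compare.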

 

I find the preemptive OS way easier to write complex software for.

The negative is, I haven't figured out how best to mix ATMEL macroassembler and gnu C.

Ideally, there would be a linker that linked object modules (.obj) produced by either assembler or C, but I haven't dug deep enough to see how to do that yet.

That is something I could use some help on. If you've figured that out, please let us know how!

 

But actually, if you use Macroassembler in a smart way, it's an even higher-level language than C, because you define your own macros in your own language, English, Chinese or WHY, and basically write code in your mother tongue.

 

 

 

 


Thanks, your post increased my awareness.

PaulWDent wrote:
... because 'C' only allows you one stack ...
The word 'stack' is not in the C standard.

Stacks in AVR C compilers :

  • AVR GCC, one
  • IAR EW AVR, two (data, return addresses)
  • CodeVisionAVR, two (data, hardware)
  • ImageCraft JumpStart C for Atmel AVR, two (hardware, software)

PaulWDent wrote:
But if there are multiple programmers, the above restriction means one can affect the operation of another's code. I don't like that at all!
That's not so bad and it's common for small memory embedded systems.

What's given is cooperation between all designers, a rule set, checklists, and code reviews (at least two reviewers, better three, at each code review).

Everyone brings something to the table; all learn.

PaulWDent wrote:
... where you can't live with that restriction forever.
It does come to that ... multi-core ... multi-processor ... "oceans" of RAM ... complex I/O

But, complexity can be reduced only to a point and what complexity there is will roost somewhere.

Experience is gained by deadlocks, task starvation, indeterminism, race conditions, etc

Trade one small headache for a larger headache ;)

PaulWDent wrote:
... (usually it will be completion of an IO operation that reactivates) ...
Some computer languages define synchronization points where a context switch could occur; I/O is a typical one.

PaulWDent wrote:
There is a small table of status flags that can be scanned to find the highest active. ...

Anyway its such a tight loop of 3 assembler instructions it takes no time at all.

Can be done in C (portability)
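
A minimal sketch of the same scan in C, assuming one status byte per process with index 0 as the highest priority and the last entry reserved for the always-active idle process. The function name and table layout are my assumptions, not PaulWDent's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first non-zero status flag at or after 'start'.
   The last entry is treated as the always-active idle process, so the
   scan stops there even if its flag happens to be zero. */
size_t highest_active(const uint8_t *status, size_t count, size_t start)
{
    assert(status != NULL && count > 0 && start < count);
    size_t i = start;
    while (i + 1 < count && status[i] == 0) {
        i++;
    }
    return i;
}
```

Unlike the three-instruction loop, this version checks the table bound, which costs a few cycles but cannot run off the end of the table if the idle flag is ever cleared by mistake.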

PaulWDent wrote:
The negative is, I haven't figured out how best to mix ATMEL macroassembler and gnu C.
AtomicZombie had that dilemma for his XMEGA384C3 VGA project where the video engine is in ISRs (assembly language) with C for the interface to the application.

His solution was to move from preferred Atmel AVR assembly to GNU assembly.

 


ISO/IEC JTC1/SC22/WG14 - C

http://www.open-std.org/JTC1/SC22/WG14/

https://gcc.gnu.org/wiki/avr-gcc#Frame_Layout

http://ftp.iar.se/WWWfiles/AVR/webic/doc/EWAVR_CompilerReference.pdf (page 193 for Available Stacks)

via

https://www.iar.com/support/user-guides/user-guides-iar-embedded-workbench-for-atmel-avr/

http://hpinfotech.ro/cvavr_documentation.html

https://imagecraft.com/help/ICCV8AVR/iccavr/6-programmingavr/stacks.htm#IX_Stacks

EmbeddedGurus

Fast, Deterministic, and Portable Counting Leading Zeros « State Space

by Miro Samek

September 8th, 2014

http://embeddedgurus.com/state-space/2014/09/fast-deterministic-and-portable-counting-leading-zeros/

Counting leading zeros in an integer number is a critical operation in many DSP algorithms, such as normalization of samples in sound or video processing, as well as in real-time schedulers to quickly find the highest-priority task ready-to-run.

...

 

P.S.

There are more than several ways to design a computer scheduler :

EmbeddedGurus

Beyond the RTOS: A Better Way to Design Real-Time Embedded Software « State Space

by Miro Samek

April 27th, 2016

http://embeddedgurus.com/state-space/2016/04/beyond-the-rtos-a-better-way-to-design-real-time-embedded-software/

...

But it [RTOS] is also the design strategy that implies a certain programming paradigm, which leads to particularly brittle designs that often work only by chance. I’m talking about sequential programming based on blocking.

...

 

P.P.S.

Though Miro's QP/C and QP/C++ for large megaAVR did not make the transition from version 4 to version 5 it's still available under GPLv3.

QP-nano is available on Arduino.

http://state-machine.com/licensing/QP-Arduino_GPL_Exception.txt

http://state-machine.com/qpn/ports_native.html

http://playground.arduino.cc/Code/QP

 

"Dare to be naïve." - Buckminster Fuller


PaulWDent wrote:

In cooperative multitasking, a process has to run to completion before there can be a context switch...

 

Not really. In cooperative multitasking the running process has to voluntarily 'yield' its execution to other processes. It can do this by running to completion or by reaching a point in its execution where it can give away its 'right' to run in anticipation of being given execution time at a later date. This can be achieved with the use of state machines.
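
As a rough illustration, a lengthy job can be split into bounded slices with an explicit state machine. Everything here (the summing job, the `sum_task` struct, the chunk size of 4) is a made-up example, not code from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK 4   /* elements processed per call; chosen arbitrarily */

enum sum_state { SUM_IDLE, SUM_RUNNING, SUM_DONE };

struct sum_task {
    enum sum_state state;
    const uint8_t *data;
    size_t len;
    size_t pos;       /* where the next slice resumes */
    uint32_t total;
};

/* Set up a new job; no work is done yet. */
void sum_start(struct sum_task *t, const uint8_t *data, size_t len)
{
    assert(t != NULL && (data != NULL || len == 0));
    t->state = SUM_RUNNING;
    t->data = data;
    t->len = len;
    t->pos = 0;
    t->total = 0;
}

/* Do one bounded slice of work, then return to the main loop.
   Returns true once the whole job is finished. */
bool sum_step(struct sum_task *t)
{
    assert(t != NULL);
    if (t->state == SUM_RUNNING) {
        size_t end = t->pos + CHUNK;
        if (end > t->len) {
            end = t->len;
        }
        while (t->pos < end) {
            t->total += t->data[t->pos++];
        }
        if (t->pos == t->len) {
            t->state = SUM_DONE;
        }
    }
    return t->state == SUM_DONE;
}
```

The main loop calls `sum_step()` alongside the other tasks, so no single call runs longer than one chunk. All state that must survive between calls lives in the struct rather than on the stack.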

'This forum helps those who help themselves.'

 

pragmatic  adjective dealing with things sensibly and realistically in a way that is based on practical rather than theoretical consideration.

Last Edited: Wed. Feb 15, 2017 - 10:44 PM

Brian Fairchild wrote:
Not really.

Indeed.

 

In fact, if a process has lengthy work to do, it may have to yield control before "completion" - to avoid locking-out the rest of the system for too long.

 

can be achieved with the use of state machines

+1

 


The OP might want to consider that stack sizing a bit more: a context is 32 general-purpose registers + PC + SREG, then you've got local vars and calls, then another context for ISRs.
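
A back-of-the-envelope version of that sum, as a sketch. It assumes a 2-byte program counter (larger AVRs use 3 bytes) and that an ISR saves a full context in the worst case; the function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define GP_REGS    32u  /* r0..r31 */
#define PC_BYTES    2u  /* assumed: device with a 16-bit program counter */
#define SREG_BYTES  1u

/* One saved context: all registers plus return address plus status. */
#define CONTEXT_BYTES (GP_REGS + PC_BYTES + SREG_BYTES)

/* Rough lower bound for one task's stack, in bytes. */
uint16_t min_task_stack(uint16_t locals_bytes, uint8_t call_depth)
{
    uint32_t total = CONTEXT_BYTES                     /* the task's own saved context */
                   + locals_bytes                      /* deepest local-variable use */
                   + (uint32_t)call_depth * PC_BYTES   /* nested return addresses */
                   + CONTEXT_BYTES;                    /* an ISR arriving at the worst moment */
    assert(total <= UINT16_MAX);
    return (uint16_t)total;
}
```

For example, 20 bytes of locals and a call depth of 4 give 35 + 20 + 8 + 35 = 98 bytes, before any library routine adds its own locals.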


In cooperative multitasking, a process has to run to completion before there can be a context switch, because 'C' only allows you one stack unless you do some nasty hacks

 Huh?  The nasty hacks are essentially identical to what you have to do in a task switch for a preemptive system.  And they're not that "nasty" - just save all of this task's registers to the stack, then save the SP to the Task Control Block and fetch a new SP from the next task's TCB.   AVRs tend to have a depressing ratio of "amount of context" to "amount of RAM", but it's the same for either type of multitasking.
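
To show that a cooperative task can yield halfway through a function when it has its own stack, here is a sketch that runs on a hosted POSIX system (for example Linux with glibc), not on an AVR. It swaps in the `ucontext` calls `getcontext`, `makecontext` and `swapcontext` as a stand-in for the hand-written "save registers, swap SP" routine; the task names and the stack size are assumptions.

```c
#include <assert.h>
#include <ucontext.h>

#define TASK_STACK_SIZE (64 * 1024)   /* generous size for a desktop demo */

static ucontext_t main_ctx;            /* plays the role of the scheduler's TCB */
static ucontext_t task_ctx;            /* plays the role of the task's TCB */
static char task_stack[TASK_STACK_SIZE];
static int progress = 0;
static int task_done = 0;

/* Cooperative yield: save this task's context, resume the scheduler. */
static void task_yield(void)
{
    swapcontext(&task_ctx, &main_ctx);
}

/* The task yields in the middle of its loop; 'i' is an ordinary local
   that survives because it lives on the task's private stack. */
static void worker(void)
{
    for (int i = 0; i < 3; i++) {
        progress++;
        task_yield();
    }
    task_done = 1;
}

/* Give the task its own stack and entry point. */
static void task_create(void)
{
    int rc = getcontext(&task_ctx);
    assert(rc == 0);
    (void)rc;
    task_ctx.uc_stack.ss_sp = task_stack;
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link = &main_ctx;      /* where to continue when worker() returns */
    makecontext(&task_ctx, worker, 0);
}

/* Scheduler side: run the task until its next yield or its end. */
static void task_resume(void)
{
    swapcontext(&main_ctx, &task_ctx);
}
```

After `task_create()`, each `task_resume()` advances the worker by one loop iteration, and a fourth call lets it finish. On an AVR the same switch is a short assembler routine, and the cost is the RAM for each task's stack.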

 


PaulWDent wrote:
In cooperative multitasking, a process has to run to completion before there can be a context switch, because 'C' only allows you one stack unless you do some nasty hacks; so you cannot save registers on the stack half way through execution and resume another process by pulling its registers off the stack.
If I set SP to 0x100 and then create some local vars, some PUSHd registers and some CALL/RET address then later set SP to 0x200 and do the same and then continue at 0x300, 0x400 etc then how is that limiting me to "one stack"? What's more the AVR has X, Y, Z - they can all be used to implement stacks because of opcodes like "ST -X, Rn" and "LD Rm, Y+" etc. (which are effectively PUSH and POP). So I can achieve multiple stacks either by remembering the value of SP in several memory areas or perhaps by using each of SP, X, Y, Z as separate stack pointers.

 

As it happens the avr-gcc compiler does everything with the hardware SP but I believe the other compilers that gchapman listed as using "two" are probably using SP and one of X, Y or Z.

 

Also in co-op multitasking it could well be that some OS service functions (wait4Sem4() etc) may well do an implied yield() while they are waiting for the resource flag to become available.

 


I started this thread and I am amused that it is still alive!

 

This concept of multitasking is for small machines, but it works with ANSI C.

So it is NOT machine or compiler dependent and needs no assembler!

 

Besides Atmel processors, I tested it with

Apple II with Aztec C

Xenix 386

DOS 6.22

Raspberry Pi

Wemos D1 ESP8266 (computing pi with 1000000000 iterations in the background - just for fun ;)

Windows 7

with different compilers.

 

You may combine it with timer interrupts for small tasks or to set some flags.

For Arduino you will not get timing problems with library functions, which are not reentrant!

 

For some situations this will be the easiest way to do multitasking - best for very small microcontrollers:

Just three "defines" will do the work.

In fact, it is nothing other than a state machine: it remembers the line where it left a job and starts at the next line when it is called again.
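
For readers who have not seen the trick, here is a minimal sketch of how three such defines can look in standard C, based on the well-known switch/`__LINE__` technique. The macro names `CR_BEGIN`, `CR_YIELD` and `CR_END` are hypothetical and are not necessarily the defines used earlier in this thread.

```c
#include <assert.h>

/* Each task remembers the source line it last yielded at. */
#define CR_BEGIN  static int cr_line = 0; switch (cr_line) { case 0:
#define CR_YIELD  do { cr_line = __LINE__; return; case __LINE__:; } while (0)
#define CR_END    } cr_line = 0

static int steps = 0;

/* Counts to three, giving up the CPU after every step. */
static void counter_task(void)
{
    static int i;   /* must be static: ordinary locals are lost on return */
    CR_BEGIN;
    for (i = 0; i < 3; i++) {
        assert(i >= 0 && i < 3);
        steps++;
        CR_YIELD;   /* return now, resume here on the next call */
    }
    CR_END;         /* next call starts again from the top */
}
```

Calling `counter_task()` repeatedly from `loop()` advances it one step per call. The usual limits apply: only one `CR_YIELD` per source line, no yielding from inside a nested `switch`, and every variable that must survive a yield has to be static.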

 

Try it and have fun ;)

 

 

 


heweb wrote:
I started this thread and i am amused it is still living!

 

It was sleeping for 7 months until you bumped it.

And if Microchip allows it you can bump it again in 5 years.

Older threads have been revived on occasion.

 

But why did you revive it?

It seems you forgot to post your "multitasking" with "3 defines"

Does this refer to protothreads?

Paul van der Hoeven.
Bunch of old projects with AVR's:
http://www.hoevendesign.com
