16 bit subtract corrupt in interrupt?

Go To Last Post
14 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

This code exhibits the problem with the timer0 interrupt or soft timer write.

The length of the delay created by pause_timer switches between the intended
delay of 1.55s and 420ms. A delay of 420ms corresponds to a a delay value equal to
95 or 350 - 256, so it looks like the high byte has been masked off. It seems
to do this semi-randomly. There may be a pattern but the pattern is dependant
on the delay value.

Now it appears to be in the test "while( pause_timer )" that the value is being
corrupted. Only the lower 8 bits are bing tested then it overwrites it when the
lower 8 bits count down to 0. But then only sometimes. Hmm, what gives?

It must be something dumb that I am not seeing, any help wold be greatly
appreciated.

/* Main
*/

#include 
#include 
#include 
#include 
#include 

volatile uint16_t pause_timer; // must be volatile or the interrupt routine
								// will not change the value


/* ISR for timer0 *************************************************************
 * Generates a tic every 4443.09us for fosc = 3.6864MHz and prescale of 1024
*/
SIGNAL (SIG_OVERFLOW0)
{
	cli();
	if (pause_timer)
		pause_timer--;
	sei();
}


/* timer_write ****************************************************************
 * Turn off ints, initialize the 16 bit timer value, turn on ints.
*/
void TimerWrite16 (volatile uint16_t *timer, uint16_t tval)
{
	cli();								// turn off global interrupts
	*timer = tval;						// write to timer location 
	sei ();								// turn on global interrupts
}


/* pause timer - pause a number of timer ticks ********************************
 * initialize the timer then do nothing until the timer0 isr decrements the
 * timer value down to 0.
*/
void Pause (uint16_t tval)
{
	TimerWrite16( &pause_timer, tval );
	while( pause_timer )
		;
}


void Init (void) // initialize peripherals ***********************************
{

// set ports
	DDRB = 0xff; 						// set for all outputs
	PORTB = 0xff;						// turn off all leds
	
// set timer0
	TCCR0 |= (1 << CS11)|(1 << CS10);	// MCU clock/64 
	TIMSK |= _BV (TOIE0);				// set timer0, timer1 int 

	sei();  // turn on interrupts
}



int main(void) //  ***********************************************************
{
uint8_t i = 0;

	Init();
	
	for (;;) {
		Pause (350);					// Delay for a preset time
		if (i == 1) {					// Then toggle a bit for testing
			PORTB |=  _BV(1);
			i = 0;
		}
		else {
			PORTB &=  ~_BV(1);
			i = 1;
		}
	}

	return (0);
}
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I don't know if these issues affect your problem, but you'll eventually want to take a look at these:

- Do NOT put in cli() and sei() in your Interrupt Service Routine (ISR): SIGNAL (SIG_OVERFLOW0). The compiler automatically takes care of this for you. Just look at the generated code to see it. So, remove those two lines.

- Did you read how to read and write 16 bit timer values in the data sheet? There's a specific order in how to do it. Verify that your code in TimerWrite16() is doing this correctly (by looking at the generated assembly). I have a hunch that it is not writing the register correctly....

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I believe your problem is with this piece of code:

while ( pause_timer );

What's happening is that intermittently, it's detecting that the low byte is zero (i.e., when pause_timer decrements to 256). So then, it tests the high byte to see if it's zero. pause_timer has decremented to 255 by now and the high byte is indeed zero. The test incorrectly concludes that pause_timer is zero.

Try this instead:

for ( ; ; ) {
  cli( );
  if ( !pause_timer ) break;
  sei( );
} // end for
sei( );

Don

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I think this is what's going on:

your pause_timer is a two-byte variable. The AVR can only check one byte at a time.
ASSUMPTION: in the generated assembly code for "while(pause_timer) ;", first the low byte is checked, then the high byte.
Now what would happen if just between these two checks, you receive a timer interrupt, which decrements the variable from 0x0100 to 0x00FF ?
First check on low byte : low(0x0100) == 0; then interrupt -> counter decreases to 0x00FF; then check high byte: high(0x00FF) == 0 --> end wait loop!!

to prevent this problem you could e.g. disable interrupts while reading out pause_timer, e.g. like this:

void Pause (uint16_t tval) 
{ 
   uint16_t t;

   TimerWrite16( &pause_timer, tval ); 

   do
   {
      cli();
      t = pause_timer;
      sei();
   } while(t != 0); 
}
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Nice catch Don, I would agree with your assessment 100%. It is highly likely that any 16 bit operation is non-atomic, so one must take care to guarantee that it remains atomic in any code that is interruptable.

Eric also makes some good points. While technically the CLI/SEI are incorrect, their redundant nature would only lead to a potential stack overflow. His other point about access order is a good one as well, however again, since hardware is not being affected/accessed here, it does not matter what order the bytes are written out into SRAM.

Writing code is like having sex.... make one little mistake, and you're supporting it for life.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Excellent! Thank you all for the fast response. Removing the cli, sei from the interrupt routine (not needed) and putting it around the 16 bit timer variable test (not monotonic) fixed the problem. The explanation for why it it seems random makes good sense. Next time I'll know what to look out for. Again, thanks for helping me understand the problem and how best to fix it.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

glitch wrote:
Nice catch Don

And (simultaneous independent posts) nice catch geraetsr.

Don

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

donblake wrote:
glitch wrote:
Nice catch Don

And (simultaneous independent posts) nice catch geraetsr.

Don


Yep, that one appeared while I was replying.

Writing code is like having sex.... make one little mistake, and you're supporting it for life.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

So I have been thinking about this and I am wondering is this behavior compiler dependant? Do all compilers test the low byte first and then the high byte? My initial thought is that if I test the high byte first then the low byte it will behave as it is intended. I can now see in the disassembly where the test occurs and that it does indeed test the low byte then the high byte.

Any comments on portability?

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Actually the test order does not matter, since in between the two byte wise tests both values can change, corrupting your result. To avoid this, you must make it atomic with CLI/SEI. Reversing the order might reduce the occurrences of corruption, but will not remove them entirely.

Writing code is like having sex.... make one little mistake, and you're supporting it for life.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Oh boy, so the code the other guy wrote for the 8051 project I worked on is likely also subject to the same problem. It used similar timers without turning off interrupts prior to testing the timer value. Let me run through this one time to make sure I understand it thoroughly

The test looks like this:
if (timer_value)
//do stuff

The 16bit timer value is updated (decremented) in an interrupt. The interrupt can occur between the test of the high byte of timer_value and the low byte of timer_value.

Test 1:
Assume the high byte is tested first then the low byte.
Assume the timer value is at 0x0100.

Test of high byte shows that it is not 0. True, so it's ok.

Test 2:
Assume the timer value is at 0x0001

Test of the high byte value is 0. True so far
Interrupt occurs and decrements timer_value to 0x0000
Test of the low byte value is 0. True? Yes, it is within one count of being true. Actually, it is true.

Test 3:
Assume the low byte is tested first then the high byte
Assume the timer_value = 0x0100.

Test of the low byte value is 0. True
Interrupt occurs and decrements timer_value to 0x00ff.
Test of the high byte value is 0. False, the overall timer value is still 0x00ff.

Does that look right? It seems like it works fine with the test order reversed. It's not perhaps the best practice but it seems to give a valid test result. I also did not see the same problem with the 8051 code but that could be because I was not looking for it. If I get the chance, I'll look and see what byte order the 8051 does the test on the 16 bit values.

Thanks again.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

How about the condition where the counter is zero, and the interrupt occurs rolling it back around to 0xffff? So instead of getting a false trigger, you may not see the trigger at all. Admittedly this is not likely going to happen in your construction above, but I don't want you to get in the habit of coding that way, without understanding the pitfalls. Had your code actually done something else during the pause loop, it could happen.

Counter=0x0000
Test high byte for 0? True
Interrupt occurs 0x0000 -> 0xFFFF
Test low byte for 0? False

Overall Test False.

Also be wary of reversing the operands. Compares are typically performed as subtracts, and by reversing the byte compare order, you will need to add a 3rd step to account for any borrow that needs to occur as a result of the subtraction in the low byte. This is why compilers will test the low order value before the high order value, when they cannot perform the operation atomically.

Writing code is like having sex.... make one little mistake, and you're supporting it for life.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

glitch wrote:
How about the condition where the counter is zero, and the interrupt occurs rolling it back around to 0xffff?

If you check the original code, the counter is only decremented if it's not zero, so it should never roll over.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Indeed, I missed the 'if' for some reason.

Writing code is like having sex.... make one little mistake, and you're supporting it for life.