saving unused registers on the stack in ISR

Go To Last Post
28 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Hi guys,

using inline assembler, gcc and atmega32..

void TIMER0_OVF_vect( void ) __attribute__ ( ( signal, naked ) );
void TIMER0_OVF_vect( void )
{
	asm volatile (	"push	r0						\n\t" \
					"push	r1		; ?				\n\t" \
					"in		r0, __SREG__			\n\t" \
					"push	r0		; ?				\n\t" \
					"push	r24						\n\t" \
					"lds	r24, Timer0Overflows	\n\t" \
					"subi	r24, 255 ; add 1		\n\t" \
					"sts	Timer0Overflows, r24	\n\t" \
					"pop	r24						\n\t" \
					"pop	r0						\n\t" \
					"out	__SREG__, r0			\n\t" \
					"pop	r1						\n\t" \
					"pop	r0						\n\t" \
					"reti							\n\t" \
	);
}

if i remove the lines marked "?" (and the corresponding pops, of course), the application doesn't work (i'm not quite sure what goes wrong).. why do i need to save R0 and R1? as the code shows, they are not being used in the heart of the ISR.. does any of the other instructions use them as temp registers or something?

btw: the only reason i tried it this way is, that the compiler emits something similar when using the C-equivalent of the ISR code..

cheers...

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The compiler generally relies on R1 holding 0 and this is why you may see it labelled as __zero_reg__ most of the time. It uses R0 as __tmp_reg__ which, as the name suggests, is used as a temporary register in various library code. The exception is when MUL is being used but the contents of R1 is set back to 0 after use.

Also any ISR code that might change SREG (using an instruction such as SUBI) must preserve the contents of SREG which is what the second "push r0" in the above is actually doing.

Cliff

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

cliff:

i see what you mean, but since i'm not calling any library code and not using mul, R0 and R1 should *not* get affected by the lds/subi/sts part...

i was aware that the compiler uses R0 and R1 for special purposes, btw..

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

But I'll bet you main() code has generated things like:

CPI R16,31
BREQ .+8
LDI R24, 17
...

consider what happens if R16 is 31 and that timer interrupt were to "fire" between the CPI and the BREQ and that it destroyed the state of SREG. The jump that should have occurred would not have done.

Similarly the code might be doing:

MOV R30,__zero_reg__
MOV R31,0xDF

where it *thinks* R1 is 0 but it is is no longer as the ISR destoyed it

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

take a look at my ISR code and tell me how it can destroy the contents of R1 if I don't save it on the stack, please.. because I don't see how that could happen..

cheers..

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:
and the corresponding pops, of course
maybe you are removing the wrong 'pop's (put question marks by the ones you are removing).

Should be something like-

push r0         //save r0
in r0,__SREG__  //store sreg
push r24        //save r24
lds r24,Timer0Overflows
subi r24,255
sts Timer0Overflows, r24
pop r24         //restore r24
out __SREG__,r0 //restore sreg
pop r0          //restore r0
reti
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

hooverphonique wrote:
take a look at my ISR code and tell me how it can destroy the contents of R1 if I don't save it on the stack, please.. because I don't see how that could happen..

cheers..


But that's not the issue here - it's likely the corruption of SREG that is the problem.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Hi curt,

your code is what i would believe to work, and is similar to mine with the push/pops removed:

	asm volatile (	"push	r0						\n\t" \
					";push	r1		; ?				\n\t" \
					"in		r0, __SREG__			\n\t" \
					";push	r0		; ?				\n\t" \
					"push	r24						\n\t" \
					"lds	r24, Timer0Overflows	\n\t" \
					"subi	r24, 255 ; add 1		\n\t" \
					"sts	Timer0Overflows, r24	\n\t" \
					"pop	r24						\n\t" \
					";pop	r0						\n\t" \
					"out	__SREG__, r0			\n\t" \
					";pop	r1						\n\t" \
					"pop	r0						\n\t" \
					"reti							\n\t" \
	);

code copied from my source file.. which doesn't work as expected..

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

As already stated you're likely mismatching the ones you removed

try:

void TIMER0_OVF_vect( void ) __attribute__ ( ( signal, naked ) );
void TIMER0_OVF_vect( void )
{
   asm volatile (   "push   r0                  \n\t" \
               "in      r0, __SREG__         \n\t" \
               "push   r24                  \n\t" \
               "lds   r24, Timer0Overflows   \n\t" \
               "subi   r24, 255 ; add 1      \n\t" \
               "sts   Timer0Overflows, r24   \n\t" \
               "pop   r24                  \n\t" \
               "out   __SREG__, r0         \n\t" \
               "pop   r0                  \n\t" \
               "reti                     \n\t" \
   );
} 

note that "similar" is not good enough. If yours is different, please show us your non-working version with the push's and pop's removed.

Writing code is like having sex.... make one little mistake, and you're supporting it for life.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

glitch: the last one i posted is the non-working one, with the push/pops commented out..

sorry, english is not my first language.. "similar" should be "functionally identical"..

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

hooverphonique wrote:
why do i need to save R0 and R1?
You don't unless you do something in the code that might modify them. Try this:

asm volatile ( \
  "push   r24                  \n\t" \
  "in     r24, __SREG__        \n\t" \
  "push   r24                  \n\t" \
  "lds    r24, Timer0Overflows \n\t" \
  "subi   r24, 255 ; add 1     \n\t" \
  "sts    Timer0Overflows, r24 \n\t" \
  "pop    r24                  \n\t" \
  "out    __SREG__, r24        \n\t" \
  "pop    r24                  \n\t" \
  "reti                        \n\t" \
); 

Don Kinzer
ZBasic Microcontrollers
http://www.zbasic.net

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

thats exactly the last thing i tried yesterday, and that didn't work either.. somehow it only works if i also save R0 and R1.. i wonder if the issue is something completely different, like a timing thing or something.... i'm puzzled...

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

hooverphonique wrote:
thats exactly the last thing i tried yesterday, and that didn't work either.
Clearly, something is going on that is not yet understood. It's not sufficient to look at the source code alone; you must look at the actual code generated by the compiler which can be found in the .lss file. Can you excerpt and post the portion related to this ISR?

If your makefile doesn't have a rule for generating the .lss file, use this:

%.lss: %.elf
  avr-objdump -h -S $< > $@

Don Kinzer
ZBasic Microcontrollers
http://www.zbasic.net

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The second example that the OP provided will work just fine, I believe, if he will just remove the "signal" attribute. I may be misremembering things, but doesn't "signal" cause the compiler to emit what it believes are the necessary prelude and epilogue instructions to allow the global interrupt enable to be turned back on inside the ISR, so that it can be interrupted by other ISRs?

If so, I can't see any reason why the OP's ISR code should need to allow itself to be interrupted by other sources, because, as it's written, it will probably finish its work in less time than the prelude/epilogue instructions will need to allow nesting in the first place (or darn close to it).

P.S. If I'm right, this is another instance where being concerned about, and able to interpret, the assembly code generated by the compiler is important. Rather than moaning about what one's statements ought to have caused the compiler to do, look at the .lss file to see what the compiler DID!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Whoops; overanxious responder finally RTFMs....

The "signal" attribute doesn't cause prelude/epilogue code for reenabling interrupts, instead it causes prelude/epilogue code for saving and restoring the registers normally used by the compiler. Since the OP's code doesn't call any compiled-from-C routine, and correctly saves/restores everything it clobbers, though, I still think that the "signal" attribute should be deleted.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

from the listing file...

without saving r0/r1 (non-working):

00000d08 <__vector_11>:
     d08:	8f 93       	push	r24
     d0a:	8f b7       	in	r24, 0x3f	; 63
     d0c:	8f 93       	push	r24
     d0e:	80 91 c0 01 	lds	r24, 0x01C0
     d12:	8f 5f       	subi	r24, 0xFF	; 255
     d14:	80 93 c0 01 	sts	0x01C0, r24
     d18:	8f 91       	pop	r24
     d1a:	8f bf       	out	0x3f, r24	; 63
     d1c:	8f 91       	pop	r24
     d1e:	18 95       	reti

with saving r0/r1 (working):

00000d08 <__vector_11>:
     d08:	0f 92       	push	r0
     d0a:	1f 92       	push	r1
     d0c:	8f 93       	push	r24
     d0e:	8f b7       	in	r24, 0x3f	; 63
     d10:	8f 93       	push	r24
     d12:	80 91 c0 01 	lds	r24, 0x01C0
     d16:	8f 5f       	subi	r24, 0xFF	; 255
     d18:	80 93 c0 01 	sts	0x01C0, r24
     d1c:	8f 91       	pop	r24
     d1e:	8f bf       	out	0x3f, r24	; 63
     d20:	8f 91       	pop	r24
     d22:	1f 90       	pop	r1
     d24:	0f 90       	pop	r0
     d26:	18 95       	reti

so the code emitted by the compiler is as expected/desired.. i don't get it...

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

the code as it is is fine, your problem must be something external. Is this "not working" in the simulator, or on a real part?

Writing code is like having sex.... make one little mistake, and you're supporting it for life.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:
somehow it only works if i also save R0 and R1.. i wonder if the issue is something completely different, like a timing thing or something.... i'm puzzled...
Well, I have a completely different (unlikely correct) theory for you...

Are you maybe experiencing a stack overflow issue? The extra push and pops just happen to move things around enough that something really critical is not getting corrupted? As a test, try pushing and popping two registers other than R0 and R1. This will tell you if it is something specific about R0 and R1 or if it is just the change in stack locations used.

If the above doesn't shed any light on the problem, I suggest stripping your program down to the smallest possible that still shows the problem. In the process, you may find the problem. If not, post the complete compilable code here.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

i'm working on the real part...

@kevin: that's actually an interesting suggestion, since i'm using FreeRTOS and therefore a lot of stack exercising is going on.. I will have a look at it..

edit: that did the trick.. saving r2/r12 (just to pick some) has the same effect as saving r0/r1.. so now on to figuring out, where stack space is lacking..

thanks.. :)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

turned not to be a stack issue at all, but a timing issue with another interrupt and the Timer0Overflow variable.. if the stack massaging code is replaced with a sufficient number of nop's, it also works.. 16-bit software timers are a bitch when other interrupts are on as well...

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

hooverphonique wrote:
16-bit software timers are a bitch when other interrupts are on as well...

Hi hooverphonique,
can you elaborate on that please?

Cheers
Thomas

pycrc -- a free CRC calculator and C source code generator

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Presumably that access to 16bit vars needs protecting?

int counter;

int main(void) {
 display(counter); // 'counter' updated in ISR
}

would be unwise as in the preamble to the call to display() the code could be half way through accessing the two bytes of 'counter' when an interrupt fires that increments it. Say it had been 0x00FF, the code had collected the 0xFF but but not the 0x00 upper byte then the interrupt fires, the variable is incremented from 0x00FF to 0x0100 and now the main() code collects the upper byte. The value passed to display() is now 0x01FF and not either 0x0100 or 0x00FF. To avoid this, non atomic accesses (usually 16 bit then) need protecting:

int counter;

int main(void) {
 int counter_copy;
 uint8_t old_SREG;
 old_SREG = SREG;
 cli();
 counter_copy = counter;
 SREG = old_SREG;
 display(counter_copy); // 'counter' updated in ISR
}

or similar.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Thanks clawson.

I was expecting some very weird behaviour of stacking interrupts or the like. :-)

Thomas

pycrc -- a free CRC calculator and C source code generator

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

if my case only had been so simple...

i was reading the 16-bit timer value from another interrupt (external int1), so protection was not the issue..

after entering the int1 interrupt, the counter/timer can still increase, causing the "software" high byte to be out of sync with the low one, when the counter wraps.. even checking the overflow flag just before reading TCNT0 is not enough, since the counter can overflow *between* reading the flag and reading the counter.. normally this can be solved by enabling interrupts just before reading TCNT0 and disabling them immediately after again, causing the counter overflow interrupt to be executed just at the right time to increase the high byte.. but when other interrupts than int1 and counter 0 overflow are enabled, any other pending interrupt than the overflow one can get executed in place of the overflow interrupt, causing the high and low bytes to still be out of sync..

hope this makes sense to you guys, otherwise ask..

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

hooverphonique wrote:
i was reading the 16-bit timer value from another interrupt (external int1), so protection was not the issue..

after entering the int1 interrupt, the counter/timer can still increase, causing the "software" high byte to be out of sync with the low one, when the counter wraps..

That sort of thing is why AVRs have a high byte buffer.
There is no way to directly access the high byte of a 16-bit timer.
Accessing the low byte causes a 2-byte transfer.
One of the bytes is transferred between the high byte and the high byte buffer.
Trying to access the high byte actually accesses the high byte buffer.
To read a 16-bit timer, read the low byte before reading the high byte.
If interrupts are disabled, as in an interrupt handler,
that will do the trick.
If you have been accessing them in the reverse order,
I'm surprised it worked at all.
I'd expect the high byte to lag behind when the low byte wrapped.

Also, signal implies not naked.

Iluvatar is the better part of Valar.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

@skeeve: if you read the thread, you will realize that i was using an 8-bit hw timer and generating the high byte using the overflow interrupt :-)

i'm completely aware of the high byte buffering for (some) 16-bit registers, but I only had an 8-bit timer available..

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

hooverphonique wrote:
@skeeve: if you read the thread, you will realize that i was using an 8-bit hw timer and generating the high byte using the overflow interrupt :-)
Ah. I'd taken '"software"' to mean 'as seen by software'.
Examining your assembly more closely would
have made me realize what you were doing.

I did something like that without interrupts
to make a 32-bit timer out of a 16-bit timer.
The fetch function kept track of the old hw value.
If the current hw value was less than the old,
it incremented the upper word.
The fetch function could be fooled
if it wasn't called often enough.

Iluvatar is the better part of Valar.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

yes.. i did something similar, but has now changed it, since i realized, i could actually use the 16-bit timer, even if it was also used for the RTOS kernel tick.. so no more (potentially unsafe) trickery needed..