SOLVED: Separate stack for interrupts with GNU AVR


Hello,

 

I am running a few threads under a simple scheduler. Since I have interrupts as well, I have to add the interrupt's stack requirement to every thread's stack. I would like to avoid that by allocating a dedicated stack area for the interrupt function alone.

 

Is there a way of tweaking the stack pointer before the prologue and after the epilogue with GNU AVR?   I would like to avoid the __naked__ attribute; otherwise I'd have to deal with the registers myself (which I do not know before compiling).   A stub function would do the trick but at the expense of another call / return.

 

I am sure this problem has been discussed before but the search function here and reading GNU documentation did not yield the desired results; maybe I was using the wrong keywords.   Could anybody point me in the right direction?

 

Regards,

Hagen

.

This topic has a solution.
Last Edited: Fri. Apr 10, 2015 - 07:14 AM

Afaik, FemtoOS does exactly this (this stack switching for interrupt use is selectable via a config switch).

Einstein was right: "Two things are unlimited: the universe and the human stupidity. But i'm not quite sure about the former..."


How is it done?    Are you using it and could you have a look at the source / executable code?


Just checked. He has a __naked__ attribute and then he is calling "portSaveContext" in "femtoos_port.c", where pretty much everything is saved.

 

Thank you very much.   Not what I am looking for.   I want to save stack space and I want it fast.


Just to say that is pretty much what FreeRTOS does too. I can't help thinking that if the authors of two respected RTOS could not find a "better way" then there probably isn't one.

 

(of course you could hack the compiler source but that's probably a bit extreme!)

This reply has been marked as the solution. 

Ok, good point.   Btw I always thought the RT in RTOS stands for "real time".    How do they want to achieve anything close to real time when stacking the whole context for every interrupt?

 

Hacking the GNU compiler?    Yes, it crossed my mind for a few seconds (very few) :-).

 

Stub function with __naked__ it is then; the call and return is not gonna make a big difference.

 

Just noticed that the stack usage of one thread is excessive; about 120 bytes 0x00 in it too.    Would you know of any stack hungry library function?

Last Edited: Fri. Apr 10, 2015 - 02:46 PM

hugo_habicht wrote:

...Btw I always thought the RT in RTOS stands for "real time".

 

There is absolutely no such thing as a "Real Time Operating System"*; the best you'll ever get is a "Fast Enough Operating System". And "Fast Enough" is application dependent so shouldn't be applied to anything without a load of caveats.

 

 

*of course, should Quantum Computing ever become a reality then maybe we will have true real time performance.

#1 This forum helps those that help themselves

#2 All grounds are not created equal

#3 How have you proved that your chip is running at xxMHz?

#4 "If you think you need floating point to solve the problem then you don't understand the problem. If you really do need floating point then you have a problem you do not understand." - Heater's ex-boss

Last Edited: Thu. Mar 26, 2015 - 11:09 AM

BTW I'm a little confused. A task switch certainly needs the complete AVR context to be stored but surely ISR()s have their own register protection. Are you saying that all the hardware vectors are pointing to code that has to do a full context switch? Why?


As far as I can see, FemtoOS does a complete context save for an interrupt function if you select it. Please double check in the file I mentioned above; I may have overlooked something.

I think you have the choice to simply use ISR():

From FemtoOS web site: "Thus, in the Femto OS a separate stack is used for the OS. Of course this implies an extra change of stack on every save context. Something similar applies to the interupt service routines. You may give it its own space of let it make use of the OS stack. A similar apprach is used for the isr's, which may use the OS stack or a stack on their own. "

Using the system stack would be slow but the stack space of the threads is not affected by the interrupt function.

 

@Brian: FEOS, I like that.
 

Last Edited: Thu. Mar 26, 2015 - 01:19 PM

Do the interrupts not just save/restore on the current task's stack then? I suppose I could see a scenario where an ISR re-enabled interrupts then the timer went off and context switched to a different stack before the interrupted ISR completed but don't you just make a rule that ISRs cannot re-enable interrupts?


As far as I can see you have the choice, task stack or separate stack.   Reenabling interrupts cannot be done then (or should not be; it would still finish after the other tasks are done).   But I'm not sure (and I don't care), I am not using FemtoOS.


I'm still not getting how there can be a separate stack for interrupts. The whole point of an interrupt is that you cannot know when one is going to occur (they "interrupt" things!) so how could you switch SP to a "special" stack for ISRs ?


Where is the problem? In the ISR prologue you change the stack pointer to an area in RAM dedicated to that purpose, save the registers the ISR is gonna use, run the ISR, and in the epilogue reverse the whole thing. Or simply save the registers into the dedicated RAM area without using SP. This way you do not have to allocate the interrupt stack space in each task/thread stack just because the interrupt can happen anywhere. FemtoOS took the path of least resistance and simply saves the whole context (if you choose a separate stack space).
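To make the sequence above concrete, here is a minimal host-side model in C. It is not AVR code and not what any particular RTOS does; the `Cpu` struct, the RAM size and the stack addresses are made up for illustration. It only shows where the bytes end up when a stub switches SP around the handler: the hardware-pushed return address still lands on the interrupted task's stack, everything after the switch goes to the dedicated area.

```c
#include <stdint.h>

/* Hypothetical host-side model; sizes and addresses are illustrative only. */
enum { RAM_SIZE = 256, ISR_STACK_TOP = 255, TASK_STACK_TOP = 127 };

typedef struct {
    uint8_t  ram[RAM_SIZE];
    uint16_t sp;   /* AVR style: points at the next free byte, grows downwards */
} Cpu;

static void    push(Cpu *c, uint8_t v) { c->ram[c->sp--] = v; }
static uint8_t pop(Cpu *c)             { return c->ram[++c->sp]; }

/* Simulates one interrupt whose handler needs isr_bytes of stack
 * (keep isr_bytes below 128 so the two areas of this toy RAM do not overlap).
 * Returns how many bytes of the interrupted task's stack were consumed. */
unsigned isr_with_own_stack(Cpu *c, unsigned isr_bytes)
{
    uint16_t sp_at_entry = c->sp;

    /* Hardware: the 2-byte return address goes onto whichever stack is active. */
    push(c, 0x12);
    push(c, 0x34);

    /* Stub prologue: remember the task's SP, switch to the ISR stack.
     * A real stub would keep task_sp in a fixed RAM location. */
    uint16_t task_sp = c->sp;
    c->sp = ISR_STACK_TOP;

    /* Compiler-generated prologue, handler body and callees use the ISR stack. */
    for (unsigned i = 0; i < isr_bytes; i++) push(c, (uint8_t)i);
    for (unsigned i = 0; i < isr_bytes; i++) (void)pop(c);

    /* Stub epilogue: switch back, then RETI pops the return address. */
    c->sp = task_sp;
    (void)pop(c);
    (void)pop(c);

    return (unsigned)(sp_at_entry - task_sp);
}
```

In this model a handler needing 40 bytes costs the task stack only the 2 bytes of return address; a real stub would also need a byte or two for a scratch register and SREG before it can touch SP.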

 

My question was really this: is there a guru out there who knows the magic compiler keyword that lets you insert a few statements before the ISR prologue and after the epilogue, or is there a way to make the compiler tell you which registers it will use in the ISR?


 In the ISR prologue

But by the time execution has got to any ISR code the AVR CPU has already PUSH'd the interrupted address on the stack (wherever that stack was at the time). My point is how can you know that an interrupt vector is about to be called and change SP before the call via the vector is made?


Ahhh, that's what you meant. Yes, of course you are right: the return address is already on the stack at that time, you cannot prevent that. But a whole bunch of registers can potentially follow, and that's what I wanted to avoid. I have to add that I have an odd case here where the interrupt function calls a lot of stuff, and multiplying that stack space by the number of tasks / threads really hurts. I know what you will say now: don't do a lot of processing in an interrupt function :-)


but at least I will say if it "calls a lot of stuff" it will push and pop a lot of registers. 


In which case I don't need to say it ;-)

 

But what's the "worst case" I wonder? Is it something like 14 registers that might be pushed when an ISR() calls out to a function? Sure that means you have to make an allowance for an additional 14 bytes per task but is that *really* onerous?


Well that's the thing: if it "calls a lot of stuff" it will be more than 14 registers.   Otherwise I agree with you; 14 bytes per stack will not make a difference (really); typically my interrupt functions are even much smaller.

 

Are there actually any stack usage tools available for the GNU AVR compiler? I am not so happy about the testing method. I have a "Fast Enough" stack check at every task switch and I can look at the usage through a fill pattern, but there is still a good chance of missing something.
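For reference, the pattern method mentioned above can be sketched as follows. This is a generic illustration in plain C, not the poster's code; the fill value `0xA5` and the function names are arbitrary choices. It assumes a stack that grows downwards from the highest address of the buffer, as on AVR.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL 0xA5u   /* arbitrary fill value (assumption) */

/* Fill a task's stack area with the pattern before the task first runs. */
void stack_paint(uint8_t *stack, size_t size)
{
    memset(stack, STACK_FILL, size);
}

/* The stack grows down from stack[size - 1], so untouched bytes stay at the
 * low end. Count them from the bottom; the rest is the high-water mark. */
size_t stack_high_water(const uint8_t *stack, size_t size)
{
    size_t untouched = 0;
    while (untouched < size && stack[untouched] == STACK_FILL)
        untouched++;
    return size - untouched;
}
```

The weakness is the one already noted: it only reports what the test runs actually reached, and a deepest byte that happens to hold the fill value is not counted. On the static side, GCC has a `-fstack-usage` option that writes per-function stack sizes to `.su` files; whether its output is sufficient for a given AVR project would need to be checked.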

Last Edited: Thu. Mar 26, 2015 - 04:54 PM

Well that's the thing: if it "calls a lot of stuff" it will be more than 14 registers.   

Nope, it's never going to be more than the 14 (or whatever the real number is). Just write an empty ISR() that calls an external function and compile it. You will see the added PUSH/POP - that is the only additional overhead.

 

In fact I just tried that:

	.text
.global	__vector_1
	.type	__vector_1, @function
__vector_1:
	push __zero_reg__
	push r0
	in r0,__SREG__
	push r0
	clr __zero_reg__
	push r18
	push r19
	push r20
	push r21
	push r22
	push r23
	push r24
	push r25
	push r26
	push r27
	push r30
	push r31
/* prologue: Signal */
/* frame size = 0 */
/* stack size = 15 */
.L__stack_usage = 15
	call foo
/* epilogue start */
	pop r31
	pop r30
	pop r27
	pop r26
	pop r25
	pop r24
	pop r23
	pop r22
	pop r21
	pop r20
	pop r19
	pop r18
	pop r0
	out __SREG__,r0
	pop r0
	pop __zero_reg__
	reti
	.size	__vector_1, .-__vector_1

That is the worst case possible unless you really succeed in writing an ISR() so complex that it uses all 32 registers, in which case I suppose you might see additional ones there. The very worst it could ever be is 32 bytes.


And in the external function there are more pushes and pops....
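To put numbers on that point: the ISR's total stack demand is its own pushes plus, for the deepest call chain, a return address and the callee's own usage at every level. A small sketch of that sum in C follows; the `Fn` type and any per-function byte counts fed into it are hypothetical, and it assumes a call graph without recursion.

```c
#include <stddef.h>

/* Hypothetical description of one function in the call graph. */
typedef struct Fn {
    unsigned own_bytes;          /* this function's pushes and locals */
    const struct Fn **callees;   /* NULL-terminated list, or NULL for a leaf */
} Fn;

enum { RET_ADDR_BYTES = 2 };     /* 3 on AVRs with more than 128 KB of flash */

/* Worst-case stack below the entry to f: own usage plus the deepest callee
 * chain, each call adding a return address. Assumes no recursion. */
unsigned worst_case_stack(const Fn *f)
{
    unsigned deepest = 0;
    if (f->callees != NULL) {
        for (const Fn **c = f->callees; *c != NULL; c++) {
            unsigned d = RET_ADDR_BYTES + worst_case_stack(*c);
            if (d > deepest)
                deepest = d;
        }
    }
    return f->own_bytes + deepest;
}
```

With the 15-byte prologue from the listing above, a made-up `foo` using 10 bytes and a made-up leaf using 6, this gives 15 + (2 + 10) + (2 + 6) = 35 bytes, plus 2 more for the return address the hardware pushes on interrupt entry.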




I've never written an RTOS but I'm with clawson, why do you need a separate stack?

 

Also it seems you only have two choices;

1) You only deal with the regs that the interrupt uses and set a flag indicating that an interrupt has occurred, which makes it an "I'll get to it when I get to it" OS, or

2) You do a full context switch when an interrupt occurs.

 

 


If function calls from an ISR are OP's issue, tampering with the ISR prologue is insufficient.

One can do approximately what OP wants:

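  .global __wrap_vector666
  __wrap_vector666:
  sbic GPIO0, x            ; bit set: already inside an ISR (nested interrupt),
    jmp __real_vector666   ;   so run the real handler on the current stack
  sbi GPIO0, x  ; disable context switching
  store Ry      ; pseudo-op: save a scratch register
  store SP      ; pseudo-op: save the interrupted task's SP
  load SP       ; pseudo-op: point SP at the dedicated ISR stack
  call __real_vector666
  cli           ; the handler's reti has re-enabled interrupts
  cbi GPIO0, x ; enable context switching
  restore SP
  restore Ry
  reti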

See avr-ld --wrap .

'Tis rather slow.

The first three instructions and the cbi are only necessary if interrupts might be enabled within an ISR.

They add at most four cycles.

 

Edit: enable context switching

"SCSI is NOT magic. There are *fundamental technical
reasons* why it is necessary to sacrifice a young
goat to your SCSI chain now and then." -- John Woods

Last Edited: Wed. Apr 15, 2015 - 02:43 PM

@skeeve: Thank you for the sample code of the wrapper function although I don't see where GPIO0 comes into the equation.

 

@Mike32217: An interrupt function requiring n bytes of stack space, running in a multitasking system under a scheduler with t tasks, requires t x n bytes of RAM for stack allocation. If it uses its own dedicated stack space, it only needs n bytes of RAM. My question was whether anybody knows a way of telling the GNU compiler to insert lines of code before the prologue (changing SP) and after the epilogue.
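The arithmetic above can be written out as a short C sketch. The function names and the idea of a small per-task "residue" (the return address and whatever a stub saves before switching SP) are illustrative additions, not something from an existing library.

```c
#include <stddef.h>

/* Every task stack must also hold the ISR's worst case: t x (task + n). */
size_t ram_shared(size_t tasks, size_t task_bytes, size_t isr_bytes)
{
    return tasks * (task_bytes + isr_bytes);
}

/* Dedicated ISR stack: each task stack keeps only the few bytes that land
 * on it before SP is switched, and the ISR's n bytes are reserved once. */
size_t ram_dedicated(size_t tasks, size_t task_bytes,
                     size_t isr_bytes, size_t residue_bytes)
{
    return tasks * (task_bytes + residue_bytes) + isr_bytes;
}
```

For example, with 4 tasks of 64 bytes each and an ISR needing 100 bytes, the shared layout reserves 4 x 164 = 656 bytes, while a dedicated ISR stack with a 2-byte residue reserves 4 x 66 + 100 = 364 bytes. The numbers are invented; the saving only matters when n is large compared with the residue.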

Last Edited: Wed. Apr 8, 2015 - 08:18 AM

I picked GPIOR0 (written GPIO0 in the code above) because you might need a bit that can be manipulated and tested without affecting SREG or R0-R31.

Performing a context switch during an ISR would be difficult at best.

As noted in the comment, the bit is used to disable context switching.

The switcher would need to test the bit and lay off if the bit was set.

Of course, the issue only arises if interrupts are re-enabled during an ISR.
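The switcher side of this can be sketched in C as below. It is a host-testable illustration only: `flags` stands in for the I/O register, and `IN_ISR_BIT`, `isr_enter`, `isr_leave`, `try_context_switch` and `switch_is_pending` are hypothetical names. On a real AVR the bit would be set and cleared with sbi/cbi, which are single instructions; the read-modify-write on a RAM variable shown here is not atomic.

```c
#include <stdbool.h>
#include <stdint.h>

static volatile uint8_t flags;          /* stands in for an I/O register */
#define IN_ISR_BIT 0x01u                /* hypothetical bit position */
static volatile bool switch_pending;

void isr_enter(void) { flags |= IN_ISR_BIT; }            /* sbi in the stub */
void isr_leave(void) { flags &= (uint8_t)~IN_ISR_BIT; }  /* cbi in the stub */

bool switch_is_pending(void) { return switch_pending; }

/* Called by the scheduler when it wants to switch tasks. Returns true if
 * the switch may go ahead now, false if it has to lay off until later. */
bool try_context_switch(void)
{
    if (flags & IN_ISR_BIT) {
        switch_pending = true;   /* an ISR is running on the ISR stack */
        return false;
    }
    switch_pending = false;
    return true;
}
```

How a deferred switch is picked up again (for instance at the next tick) is a design choice this sketch leaves open.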



hugo_habicht wrote:

Just checked.    He has  a __naked__ attribute and then he is calliing "portSaveContext" in "femtoos_port.c" where pretty much everything is saved.   

 

Thank you very much.   Not what I am looking for.   I want to save stack space and I want it fast.

 

I didn't follow this thread for a few days, but I still think FemtoOS does exactly what you're after [from the FemtoOS features web page]:

"Regular operating systems have no special stack for the OS. The code for save/restore context is simple, but on every task stack a copy of variables used in the OS appear. This is such a waste! Thus, in the Femto OS a separate stack is used for the OS. Of course this implies an extra change of stack on every save context. Something similar applies to the interupt service routines. You may give it its own space of let it make use of the OS stack. A similar apprach is used for the isr's, which may use the OS stack or a stack on their own. "

 

If you have a look at the function portEnterISR in file femtoos_port.c, you will see how the switch to a dedicated ISR stack is done. So at least your first requirement, saving stack space, is fulfilled. Whether it also fulfills your request for speed is a different matter. But then, we don't live in a perfect world, do we?



Thank you for your comment and for educating me regarding stack switching. 

I have marked the solution (due to assumed lack of compiler directive) for my original question.


Hi!

skeeve wrote:

If function calls from an ISR are OP's issue, tampering with the ISR prologue is insufficient.

One can do approximately what OP wants:

  .global __wrap_vector666
  __wrap_vector666:
  sbic GPIO0, x
    jmp __real_vector666
  sbi GPIO0, x  ; disable context switching
  store Ry
  store SP
  load SP
  call __real_vector666
  cli
  restore SP
  restore Ry
  reti

See avr-ld --wrap .

'Tis rather slow.

The first three instructions are only necessary if interrupts might be enabled within an ISR.

They add at most four cycles.

Operations from 'disable context switching' to 'load sp' have to be atomic (think about xMega)!

 

Ilya


501-q wrote:
skeeve wrote:

 

If function calls from an ISR are OP's issue, tampering with the ISR prologue is insufficient.

One can do approximately what OP wants:

  .global __wrap_vector666
  __wrap_vector666:
  sbic GPIO0, x
    jmp __real_vector666
  sbi GPIO0, x  ; disable context switching
  store Ry
  store SP
  load SP
  call __real_vector666
  cli
  restore SP
  restore Ry
  reti

See avr-ld --wrap .

'Tis rather slow.

The first three instructions are only necessary if interrupts might be enabled within an ISR.

They add at most four cycles.

 

 

Operations from 'disable context switching' to 'load sp' have to be atomic (think about xMega)!

An xMega would be more complicated, but similar code would still work.

More importantly, I'd forgotten to re-enable context-switching.

I've edited my original post.



Hi!

skeeve wrote:

An xMega would be more complicated, but similar code would still work.

 

Similar but not the same. "reti" on xMega does more work than on ATmega.

 

I use the value of SPH as the flag for context switching (in my programs there is a gap of more than 256 bytes, used for variables, between the ISR stack and the threads' stacks). An interrupt then uses only 4 bytes of the thread's stack: return address, work register, SREG (on xMega it may be 12 bytes: 4 per interrupt priority in use).
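The SPH test can be sketched in C as below. The memory layout is an assumption made for the example, not the poster's actual map: the ISR stack is taken to sit above the threads' stacks, and `ISR_STACK_SPH_MIN` is a made-up threshold chosen so that every address on the ISR stack has a high byte at or above it and every thread stack address has a lower one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: ISR stack at 0x0800 and above, thread stacks all
 * below 0x0700, so the high byte of SP alone tells the two apart. */
enum { ISR_STACK_SPH_MIN = 0x08 };

/* True if the given stack pointer value lies on the dedicated ISR stack,
 * i.e. an interrupt is in progress and a context switch must not happen. */
bool on_isr_stack(uint16_t sp)
{
    uint8_t sph = (uint8_t)(sp >> 8);
    return sph >= ISR_STACK_SPH_MIN;
}
```

The attraction of this scheme is that no separate flag bit has to be set and cleared: switching SP is itself the marker.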

 

Ilya