Overhead for interrupts context switching


Hello guys,
What is the approximate total overhead, in clock cycles, for the context switch to and from an (external) interrupt? In other words, if thread A is running and gets interrupted, after how many cycles will it resume, assuming all I do inside the interrupt is increment a byte?

Thanks

EDIT: Just to be precise, I'm using AVR Studio 4.


Ok, nvm, I was being lazy.
Did some searching and found this:

theusch wrote:
Quote:

Quote:
t2overflow interrupt cost 11 cycles.
No, it doesn't. There is additional overhead to set up the interrupt (4 cycles), plus the rjmp or jmp (2 or 3 cycles) to get from the vector to the code.

I usually use 12 cycles as my overhead number for an empty ISR. Now, Atomic Zombie will make it deterministic in his video work, but in a general app it is not nearly so.

-- Finish current instruction. This is 0+ (current instruction is almost finished) to nominally 4 cycles. But wait--what if you are doing an EEPROM read and the processor is halted for 4 cycles? What if you have CLI to do an atomic operation for a few cycles?

-- 4 cycles to invoke the interrupt
-- 2/3 cycles for the RJMP/JMP
-- 4 cycles for the RETI

So, you have 10/11 as hard overhead, plus the current instruction. On an RJMP model, I call that an average of 2 and come up with my 12 cycles.

The minimal (useful) ISR guts is an SBI/CBI at 2 cycles. As it affects no SREG flags, I treat this as the minimum. SBR is one cycle but affects SREG so then the save/restore overhead is there. I suppose if one had dedicated registers you could do an SER or MOV at 1 cycle.

IF YOU REALLY HAVE TO shave off a couple cycles and you have unused following vector slots, you could put the ISR right in the vector table and save 2/3 cycles.

So I guess I should add 4 cycles for the status register push/pop and 4 more cycles for each register used, right?

4 more cycles for the counter and 1 for the increment, for a total of about 20 am I right?
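The estimate above can be tallied in a few lines of C. This is only a sketch of the arithmetic in this thread, not a measurement: the function name and its parameters are made up for illustration, and the per-item costs (4 cycles to enter, 2 or 3 for RJMP/JMP, 4 for RETI, 4 for saving and restoring SREG, 4 per saved register) are simply the figures quoted in the posts. It assumes an ISR that does touch SREG, so the SREG cost is always included.

```c
#include <assert.h>

/* Cycle figures quoted in this thread (classic AVR core assumed). */
enum {
    CYCLES_ENTER   = 4, /* hardware interrupt entry */
    CYCLES_RETI    = 4, /* return from interrupt */
    CYCLES_SREG    = 4, /* save + restore of SREG, as estimated in the post */
    CYCLES_PER_REG = 4  /* one PUSH (2) plus one POP (2) */
};

/* Hypothetical helper: estimated cycles taken from the main thread
 * by one interrupt.
 *   jump_cycles:   2 for RJMP, 3 for JMP in the vector table
 *   finish_cycles: cycles spent finishing the interrupted instruction
 *                  (about 2 on average, per the quoted post)
 *   regs_saved:    working registers the ISR pushes and pops
 *   body_cycles:   cycles of useful work inside the ISR */
unsigned isr_total_cycles(unsigned jump_cycles, unsigned finish_cycles,
                          unsigned regs_saved, unsigned body_cycles)
{
    assert(jump_cycles == 2 || jump_cycles == 3);
    return finish_cycles + CYCLES_ENTER + jump_cycles
         + CYCLES_SREG + regs_saved * CYCLES_PER_REG
         + body_cycles + CYCLES_RETI;
}
```

With the numbers from the post, `isr_total_cycles(2, 2, 1, 1)` comes to 21, which matches the "about 20" guess for hand-written assembly.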


Depends on your C compiler. In GCC you'd be hard pushed to get in and out under 30. GCC has a nasty habit of expecting R1 to always hold 0 (it's historical). Because the interrupt may have occurred right after a MUL (which leaves its result in R1:R0), R1 may not hold 0. So the standard entry/exit code is:

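	push r1	 ; save r1 (__zero_reg__), it may hold part of a MUL result
	push r0	 ; save r0
	in r0,__SREG__	 ; copy the status register into r0
	push r0	 ; save the status register on the stack
	clr __zero_reg__	 ; make r1 = 0 again, as compiled code expects

ISR

	pop r0	 ; fetch the saved status register
	out __SREG__,r0	 ; restore the status register
	pop r0	 ; restore r0
	pop r1	 ; restore r1
	reti	 ; return and re-enable interrupts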

A missed optimisation is that this minimal prologue/epilogue is used even if the body of "ISR" does not change SREG and does not actually use R1=__zero_reg__=0.
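To see where the "under 30" figure comes from, the entry/exit sequence above can be added up instruction by instruction. The sketch below is just that sum; the struct and function names are invented for illustration, and the cycle counts assume a classic AVR core with a 16-bit program counter (PUSH and POP 2 cycles each, IN, OUT and CLR 1 cycle each, RETI 4 cycles).

```c
#include <assert.h>
#include <stddef.h>

/* One instruction of the sequence and its assumed cycle cost. */
struct insn {
    const char *text;
    unsigned cycles;
};

/* Entry code quoted in the post above. */
static const struct insn gcc_prologue[] = {
    { "push r1", 2 },
    { "push r0", 2 },
    { "in r0,__SREG__", 1 },
    { "push r0", 2 },
    { "clr __zero_reg__", 1 },
};

/* Exit code quoted in the post above. */
static const struct insn gcc_epilogue[] = {
    { "pop r0", 2 },
    { "out __SREG__,r0", 1 },
    { "pop r0", 2 },
    { "pop r1", 2 },
    { "reti", 4 },
};

/* Add up the cycle cost of a sequence of instructions. */
unsigned sum_cycles(const struct insn *seq, size_t count)
{
    unsigned total = 0;
    assert(seq != NULL);
    for (size_t i = 0; i < count; i++)
        total += seq[i].cycles;
    return total;
}

unsigned gcc_prologue_cycles(void)
{
    return sum_cycles(gcc_prologue, sizeof gcc_prologue / sizeof gcc_prologue[0]);
}

unsigned gcc_epilogue_cycles(void)
{
    return sum_cycles(gcc_epilogue, sizeof gcc_epilogue / sizeof gcc_epilogue[0]);
}
```

Under those assumptions the prologue costs 8 cycles and the epilogue 11 (including the RETI). Adding the 4 cycles of interrupt entry and 2 for an RJMP gives 25 cycles before the body has saved a single working register or done any work.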

Of course maybe you are talking about a different C compiler?


If you do a subroutine call in an interrupt handler, 31 regs get pushed and pulled, which takes 2usec for the pushes and 2usec for the pulls. Moral of the story is: don't call subroutines in ISRs.

Imagecraft compiler user


Quote:
So I guess I should add 4 cycles for the status register push/pop and 4 more cycles for each register used, right?

4 more cycles for the counter and 1 for the increment, for a total of about 20 am I right?


Yes. In an ideal world, the body of the ISR() could be that minimal. You also have the initial jump to the vector, the JMP to your ISR(), and the RETI.

The simple answer is to just use the Simulator. It will not only tell you the cycles but the actual time in microseconds.

We don't live in an ideal world. So most C compilers may have less efficient sequences.

You have probably noticed that if you can sacrifice registers or guarantee well behaved programs, you can reduce the PUSH and POP count.

David.


Quote:
If you do a subroutine call in an interrupt handler, 31 regs gets pushed and pulled
Not true. First of all, the worst case is that R18 through R31 will be pushed and popped (in addition to R0 and R1); it will never be 31. Second, there are some situations (such as when the function being called is in the same .c file) where the compiler will be able to push only the registers necessary (or even inline the function).
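For a rough feel of what the two register counts cost, here is a small sketch. The helper names are hypothetical, it assumes PUSH and POP take 2 cycles each (true of the classic AVR core, not of every newer one), and the 16 MHz clock in the usage note is only an example since no clock speed is given in this thread.

```c
#include <assert.h>

/* Cycles to push and later pop `regs` registers,
 * assuming 2 cycles per PUSH and 2 per POP. */
unsigned save_restore_cycles(unsigned regs)
{
    return regs * (2u + 2u);
}

/* Convert a cycle count to nanoseconds at a given CPU clock. */
unsigned long long cycles_to_ns(unsigned cycles, unsigned long clock_hz)
{
    assert(clock_hz > 0);
    return (unsigned long long)cycles * 1000000000ULL / clock_hz;
}
```

R18 through R31 plus R0 and R1 is 16 registers, so `save_restore_cycles(16)` is 64 cycles, or 4000 ns at an assumed 16 MHz. The claimed 31 registers would be 124 cycles, or 7750 ns at the same clock.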

Regards,
Steve A.

The Board helps those that help themselves.


And as we all know, any function called by an ISR usually has to be reentrant.


Yeah, I ran the simulator and was wondering about the R1 stuff (that's like 5 cycles wasted right there). Seems like I'm getting about 30 cycles for a 16-bit counter, which is fine for my application (I'm using 2 high-precision motor encoders).

Btw, it's not really relevant to my case, but about the function call, I really don't see why the ISR should push all those registers... can't the called function be responsible for pushing the registers it is using? Why should the ISR worry about that?

Thanks


Yes, any ISR() must leave no footprints. e.g. all registers, SREG etc should be unchanged.

You appear to be talking about avr-gcc in a general forum. avr-gcc provides you with sufficient rope to screw your ISR(). Use it if you want.

As a general answer for ANY compiler, you can always put your ISR() in an ASM file.

Regarding the compiler model: either a called function looks after every register it uses, or you publish the rules and expect punters to obey them.

David.


Quote:

Btw, it's not really relevant to my case, but about the function call, I really don't see why the ISR should push all those registers... can't the called function be responsible for pushing the registers it is using? Why should the ISR worry about that?

As hinted, if you are indeed going to go into the low-level details, then you had better explicitly state which toolchain and version you are using.

Painting a picture and then decrying "Why does the >>compiler<< do it this way?!?" is somewhere between n/a and inaccurate. It is like complaining about e.g. some feature on an automobile, particular brand and model, and then asking "Why do car makers do it this way?" when most other brands/models do it differently.

What frequency are you trying to achieve? How close have you come? If you then also post some code, experienced people here will enjoy the challenge.

But if you are trying for skinny ISRs and are using a toolchain not known for them--it is hard to get much sympathy from this corner.

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


Yeah, I just remembered AVR Studio 4 didn't come with gcc by default. I think I have one of the 2011 versions of avr-gcc, although I don't have my computer right now to check.

My objective is to use the interrupt for motor encoders, each of which will generate about 12k interrupts/s at full speed. At 30-40 cycles per interrupt that makes less than 1M cycles per second for 2 encoders (2 x 12,000 x 40 = 960,000), which is good for me since I need CPU for other tasks.
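The budget above is easy to check in code. This is a sketch with made-up helper names; the interrupt rate, encoder count and cycles per interrupt are the figures from this post, while the 16 MHz clock in the usage note is purely an assumption, since the actual clock is not stated.

```c
#include <assert.h>

/* Total cycles per second spent in the encoder ISRs. */
unsigned long long isr_cycles_per_second(unsigned long irq_per_second,
                                         unsigned encoders,
                                         unsigned cycles_per_irq)
{
    return (unsigned long long)irq_per_second * encoders * cycles_per_irq;
}

/* CPU load in parts per thousand (60 means 6.0%). */
unsigned long long cpu_load_permille(unsigned long long isr_cycles,
                                     unsigned long clock_hz)
{
    assert(clock_hz > 0);
    return isr_cycles * 1000ULL / clock_hz;
}
```

`isr_cycles_per_second(12000, 2, 40)` gives 960,000 cycles per second for the worst case of 40 cycles per interrupt. On an assumed 16 MHz part, `cpu_load_permille(960000, 16000000UL)` is 60, i.e. about 6% of the CPU.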