avr-gcc is uselessly saving r24 in ISR.

Go To Last Post
6 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I have this code:

 

#include <avr/io.h>
#include <avr/interrupt.h>

register uint8_t val_off asm("r2");
register uint8_t val_on asm("r3");

ISR (PCINT0_vect) {			// Pin change int on PB4 triggers PCINT0. Enters ISR in about 11 cycles.
	PORTB = val_on;
	PORTB = val_off;
}



int main(void) {
	val_off = 0;
	val_on = 1 << PINB3;
	
	DDRB = 1 << PINB3;		// set PB3 as output

	PCMSK0 = 1 << PCINT4;		// enable pin change int on PB4 (MISO/PCINT4)
	PCICR = 1 << PCINT0;		// enable PCINT0
	sei();				// enable global interrupts
	
    while (1);
}

 

I was expecting the ISR to be something like:

out	0x05, r3
out	0x05, r2
reti

 

But, with avr-gcc 5.4.0 (-O3), I get:

 

00000080 <__vector_3>:
  80:	1f 92       	push	r1
  82:	0f 92       	push	r0
  84:	0f b6       	in	r0, 0x3f	; 63
  86:	0f 92       	push	r0
  88:	11 24       	eor	r1, r1
  8a:	8f 93       	push	r24
  8c:	35 b8       	out	0x05, r3	; 5
  8e:	25 b8       	out	0x05, r2	; 5
  90:	8f 91       	pop	r24
  92:	0f 90       	pop	r0
  94:	0f be       	out	0x3f, r0	; 63
  96:	0f 90       	pop	r0
  98:	1f 90       	pop	r1
  9a:	18 95       	reti

 

Now, I'm not complaining about the r0/r1 stuff, this is a know problem that is more or less solved in recent versions. The issue here is why is r24 saved and restored if it's not even used?

This problem is even worse in recent versions (avr-gcc 8.0.1):

 

00000080 <__vector_3>:
  80:	8f 93       	push	r24
  82:	35 b8       	out	0x05, r3	; 5
  84:	82 2d       	mov	r24, r2
  86:	25 b8       	out	0x05, r2	; 5
  88:	8f 91       	pop	r24
  8a:	18 95       	reti

 

Now, the old prologue/epilogue issue is solved, but still r24 is being uselessly saved, restored, and now even clobbered for no reason.

 

Lastly, I'll just apologize for reporting gcc problems here instead of a proper bug report, but hey, I know SprinterSB is watching, and besides, this is a more public place.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 3

> The issue here is why is r24 saved and restored if it's not even used?
 
This basically means that r24 was used at the time when register allocation and epilogue / prologue layout was computed, but a later pass applied some optimization after which r24 is no more needed. In particular, at the time prologue is set up, r24 is used as intermediate register for the values you are storing to the SFRs. As R2/R3 are fixed registers (because they are global), these register are not as accessible to the register allocator as ordinary registers, and the allocator generates an intermediate move through r24. Then in hard-register copy-propagation, the superfluous move is optimized to a direct one that no more needs r24, and the superfluous move of r2 to r24 is cleand up an a pass that runs even later (not so in v8 for some reason). To see how it works, compile with -fno-cprop-registers. If you like to study what exactly the passes are doing, then add -fdump-rtl-all. The optimization of interest can be seen in .cprop_hardreg.

avrfreaks does not support Opera. Profile inactive.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Ok, I understand. So for some gcc design reason the epilogue and prologue layouts are computed before all optimizations are applied, so they are unaffected by subsequent optimizations. Still, there seems to be a bug in version 8 leaving that "mov r24, r2" instruction behind.

 

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Maybe the R24 thing would go away with one more round of register optimization.

I think gcc has an option for that.

"Demons after money.
Whatever happened to the still beating heart of a virgin?
No one has any standards anymore." -- Giles

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

> So for some gcc design reason the epilogue and prologue layouts are computed before all optimizations are applied, so they are unaffected by subsequent optimizations.
 
Yes. Reason is that some back-ends want to optimize against the prologue / epilogue instructions. A target can, in principle, reorder passes according to its needs, so you can play around with rearranging the passes. However you cannot reorder passes at will; there are restrictions.
 
> Still, there seems to be a bug in version 8 leaving that "mov r24, r2" instruction behind.
 
For v8, the situation is even more complicated. The gccisr pseudo-instruction picks some reg to safe/restore Rtmp, Rzero, SREG as needed. If there is some GPR in use, this register (actually only its register number) is passed to the assembler as arg to gccisr. Hence r24 *is* actually used, as the gccisr reg is computed at prologue time. Hence even if you manage to get rid of the MOV, you won't get rid of PUSH/POP r24.
 
To get rid of the MOV, you can try to extend gasisr insn to also include a clobber of the payload reg (provided it's GPR, otherwise just clobber 0 to have the same insn anatomy for simplicity). This is an easy and mechanical change. It's a bit odd, though, because you'd clobber a reg as you are restoring its content, therefore you might prefer some unspec magic instead. If that doesn't help I'm out, you'll have to dig into gcc for further clues.
 
If you want to get rid of the PUSH/POP, then you can add one more target pass that patches gasisr to use Rtmp instead provided the old payload reg is not life (where you must not consider gasisr insns in life analysis, of course).

avrfreaks does not support Opera. Profile inactive.

Last Edited: Mon. Feb 26, 2018 - 09:32 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 1

Thanks for your analysis, but I don't know enough to do that. I'd rather write the ISR in assembly if needed.