avr-gcc optimization of call-used registers...

Call-used registers (r18-r27, r30-r31): May be allocated by gcc for local data. You may use them freely in assembler subroutines. Calling C subroutines can clobber any of them - the caller is responsible for saving and restoring.
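For reference, the register split in that FAQ entry can be written down as a small host-side C sketch. The call-used range comes from the quote above; the fixed registers (r0, r1) and the call-saved range (r2-r17, r28-r29) are filled in from the rest of the avr-gcc register convention, and all function names are made up for illustration. The `takes_immediate` check encodes the AVR rule that LDI and the other immediate instructions only accept r16-r31.

```c
#include <assert.h>
#include <stdbool.h>

/* Register classes in the avr-gcc calling convention. */
enum reg_class { REG_FIXED, REG_CALL_SAVED, REG_CALL_USED };

/* Classify AVR register rN (hypothetical helper, not part of any toolchain). */
static enum reg_class avr_reg_class(int r)
{
    assert(r >= 0 && r <= 31);
    if (r <= 1)
        return REG_FIXED;               /* r0 = temp, r1 = zero */
    if ((r >= 18 && r <= 27) || r >= 30)
        return REG_CALL_USED;           /* r18-r27, r30-r31 */
    return REG_CALL_SAVED;              /* r2-r17, r28-r29 */
}

/* LDI, CPI, ANDI, ORI, SUBI and friends only work on r16-r31. */
static bool takes_immediate(int r)
{
    assert(r >= 0 && r <= 31);
    return r >= 16;
}

/* How many registers survive a call AND can be loaded with a single LDI? */
static int count_saved_immediate_regs(void)
{
    int n = 0;
    for (int r = 0; r < 32; r++)
        if (avr_reg_class(r) == REG_CALL_SAVED && takes_immediate(r))
            n++;
    return n;
}
```

Under these assumptions only r16, r17, r28 and r29 are both call-saved and LDI-capable, so any further value that has to survive a function call ends up in r2-r15.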

 

So I have a program (the Optiboot bootloader) where the compiler is apparently going to some lengths to avoid putting "long-term" data into call-used registers. Instead, it puts that data in the second-class low registers. In fact, the final program includes:

    7e40:	35 e0       	ldi	r19, 0x05	; 5
    7e42:	c3 2e       	mov	r12, r19

Even though r19 is not used anywhere else in the program! Apparently, since the call-used registers could be obliterated by any function call, and my code does indeed occasionally call functions, avr-gcc avoids even trying to use these registers.

 

Is there any way that I can mess with the compiler's idea of which registers are actually used by subroutines? I would sort of have expected this to be done by -flto, but that doesn't seem to be the case.  :-(

(and yes, I'm at a stage where having the 510-byte program be a dozen bytes shorter would be helpful...)

 


I had excellent results with -flto when writing a bootloader to fit into 512 bytes. It reduced the flash use by approximately 20%. The resulting code was virtually unreadable, however, with some variables allocated to a register for the entire lifetime of the code. In fact, I think the optimiser had actually inspected/modified the register use of some functions and had used some of the "call-used" registers with the knowledge that my function(s) wouldn't clobber them.

 

Does your program call some assembly that -flto cannot examine?

 

 


Hmm. The SPM code is inline asm, and the EEPROM code is a .s file off in avr-libc...
LTO reduces code size by one instruction...

Last Edited: Thu. Aug 9, 2018 - 09:24 AM

Yeah, my SPM stuff, being small, was inline asm. I think the compiler gets reasonable visibility of that, but assembly files are probably opaque as far as the compiler is concerned. I had no .S files in my bootloader, so the -flto optimiser could examine everything. I guess that's why I got much better results than you.
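To illustrate what "visibility" means for inline asm, here is a minimal sketch (not taken from either bootloader; `swap_nibbles` is a made-up example). With GCC extended asm the operand constraints and clobber list tell the compiler exactly which registers the statement touches, so it can keep allocating everything else freely. A separate .S file gives it no such information, and it has to assume the full calling convention. The non-AVR branch is only there so the sketch also builds on a host compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the high and low nibble of a byte. */
static inline uint8_t swap_nibbles(uint8_t v)
{
    assert(sizeof v == 1);
#if defined(__AVR__)
    /* "+r": v is read and written in a register of the compiler's choice.
       The empty clobber list says no other register is touched. */
    __asm__ ("swap %0" : "+r" (v));
#else
    /* Portable fallback with the same result, for host builds. */
    v = (uint8_t)((v << 4) | (v >> 4));
#endif
    return v;
}
```

If the asm did use a scratch register, naming it in the clobber list (for example `: "r18"`) would be the way to tell the compiler about it.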

 


(actually, I get similar behavior if I leave out the EEPROM support, leaving only C with a bit of inline assembler.)

It DOES do the keep-a-value-in-a-register-for-the-whole-program thing, but some of those it's putting in low registers instead of high registers that are available:

 

// Loading constants that are used MUCH later

    7e3a:	03 e0       	ldi	r16, 0x03	; 3 in r16

    7e3c:	dd 24       	eor	r13, r13    ; 1 in r13
    7e3e:	d3 94       	inc	r13

    7e40:	35 e0       	ldi	r19, 0x05	; 5 in r12
    7e42:	c3 2e       	mov	r12, r19
    // But at least r19, r21, and r23 are otherwise unused...
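The cost difference in that listing can be put into numbers with a small host-side C sketch. It assumes only the three load patterns visible above, and `const_load_bytes` is a made-up helper: LDI accepts r16-r31 only, so a constant headed for a low register needs either the eor/inc trick or a detour through a high register. Each of these instructions is one 16-bit word, i.e. 2 bytes of flash.

```c
#include <assert.h>
#include <stdint.h>

/* Flash bytes needed to load the 8-bit constant `value` into register rN,
   using the patterns seen in the listing (hypothetical cost model). */
static int const_load_bytes(int r, uint8_t value)
{
    assert(r >= 0 && r <= 31);
    if (r >= 16)
        return 2;       /* ldi rN, value */
    if (value == 0)
        return 2;       /* eor rN, rN */
    if (value == 1)
        return 4;       /* eor rN, rN ; inc rN */
    return 4;           /* ldi rTmp, value ; mov rN, rTmp */
}

/* Total for the three constants as actually allocated: r16=3, r13=1, r12=5. */
static int listing_bytes(void)
{
    return const_load_bytes(16, 3) + const_load_bytes(13, 1)
         + const_load_bytes(12, 5);
}
```

By this model the three loads cost 10 bytes as compiled, versus 6 bytes if all three constants lived in high registers, so the saving available from this one spot is 4 bytes.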

 


Which version of gcc are you using? I noticed that newer versions like 7.x or 8.x sometimes save a few bytes.


Which version of gcc are you using?

This is the latest from Atmel: gcc 5.4, I believe.

avr-gcc (AVR_8_bit_GNU_Toolchain_3.6.1_495) 5.4.0

 


The 8.x versions have the ISR prologue/epilogue optimization (-mgas-isr-prologues). Could save a few bytes!


Here are some other options that may or may not save a few extra bytes: -mrelax -fno-jump-tables


It doesn't use ISRs, and I've already got all that other stuff (or it isn't relevant).

The complaint about using low registers instead of high registers is the only thing I can find to complain about, from reading the object code :-(  (well, I suppose I could rewrite the EEPROM functions...)

 

It's pretty amazing that what's there compiles as small as it does. Code size is down to 450 bytes, but I have two features I want to add (EEPROM and an application-callable "SPM" function), and they only fit one at a time.

https://github.com/Optiboot/opti...