How to enforce avr-gcc to treat assembly code as a function call


Happy New Year to everyone!

I’m working on a bootloader for an XMEGA with 256 kB of flash. The bootloader contains some functions that will eventually be called by the regular application code. So, as usual, I made a jump table, placed in the bootloader code, through which the bootloader functions are called. So far, everything seems to be OK.

The problem is the application. Because the bootloader code starts at 0x40000, I obviously have to use a far CALL/JMP or EICALL/EIJMP to call a bootloader function. The regular code generated by avr-gcc uses EICALL:

static inline void NVM_ExecCommand_SPM(uint16_t address, uint8_t cmd)
{
    ((PF_BL_TwoUInt816) (BOOTLOADERBASE/2 + 4))(address, cmd);
}

     2c6:     6a e1         ldi    r22, 0x1A     ; 26
     2c8:     80 e0         ldi    r24, 0x00     ; 0
     2ca:     90 e0         ldi    r25, 0x00     ; 0
     2cc:     e2 e0         ldi    r30, 0x02     ; 2
     2ce:     f1 e0         ldi    r31, 0x01     ; 1
     2d0:     19 95         eicall

Of course the generated code is wrong, because the EIND register is not set correctly. I can fix that with a small piece of code that sets EIND before the call and resets it to zero after the return. No problem with that.
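A minimal sketch of the arithmetic behind that remark, written as plain C so it can be checked on a PC. EICALL jumps to the word address formed by EIND (upper bits) and Z (lower 16 bits), so any target above the first 128 kB needs a non-zero EIND. The type and function names are hypothetical, and the 0x40200 example address is simply the call target that appears in the disassembly further down in this post.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two values EICALL consumes. */
typedef struct {
    uint8_t  eind; /* bits 16 and up of the word address */
    uint16_t z;    /* low 16 bits of the word address (r31:r30) */
} eicall_target;

/* Split a flash byte address into EIND and Z.
 * AVR program memory is word-addressed, so the byte address is halved first. */
static eicall_target split_for_eicall(uint32_t byte_address)
{
    assert((byte_address & 1u) == 0); /* instructions start on even byte addresses */

    uint32_t word_address = byte_address / 2u;
    eicall_target t;
    t.eind = (uint8_t)(word_address >> 16);
    t.z    = (uint16_t)(word_address & 0xFFFFu);
    return t;
}
```

For the byte address 0x40200 this gives EIND = 2 and Z = 0x0100. With EIND left at 0, the same Z value would reach word address 0x0100 in application flash instead.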

However, using EICALL is not very efficient, as the compiler has to load the Z register and I have to take care of EIND manually. So I’d like to use a regular CALL, which on an AVR with more than 128 kB of flash can address the whole program space. But how do I force the compiler to use CALL? The obvious solution is to use assembly:

static inline void NVM_ExecCommand_SPM(uint16_t address, uint8_t cmd)
{
       asm volatile(
       "call  %[label]" "\n\t"
       :: [address] "r" (address), [cmd] "r" (cmd), [label] "i" (BL_Vector1):
       );
}

Works almost great, but… the compiler doesn’t treat my function as a function; that is, it doesn’t set up the registers for parameter passing. The address variable is placed in R24:R25 according to the calling convention, but cmd is placed in R18 instead of R22:

     2c0:     2a e1         ldi    r18, 0x1A     ; 26
     2c2:     80 e0         ldi    r24, 0x00     ; 0
     2c4:     90 e0         ldi    r25, 0x00     ; 0
     2c6:     1e 94 00 01   call   0x40200       ;

I can assign the registers manually by moving R18 to R22 (“mov R22, %[cmd]”), but that generates unnecessary code.

So my question is: do you know how to force the compiler to treat my function as a function?

Of course, removing static inline and defining the function in a .c file will not help.
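For reference, the registers a real call would use can be worked out by hand. The sketch below is plain C for a PC and models the rule for register-passed arguments as described in the avr-gcc ABI notes: start at r26, round each argument size up to an even number of bytes, and count downwards, falling back to the stack below r8. The function name is hypothetical, and variadic functions are not covered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of avr-gcc argument passing (non-variadic functions).
 * sizes[i] is the size in bytes of argument i; first_reg[i] receives the
 * lowest register number holding it, or -1 once arguments go to the stack. */
static void avr_arg_registers(const size_t *sizes, size_t count, int *first_reg)
{
    int next = 26;    /* allocation starts just below r26 */
    int on_stack = 0; /* once one argument spills, all later ones follow */

    for (size_t i = 0; i < count; ++i) {
        assert(sizes[i] > 0);
        int rounded = (int)((sizes[i] + 1u) & ~(size_t)1); /* round up to even */

        if (!on_stack && next - rounded >= 8) {
            next -= rounded;
            first_reg[i] = next; /* low byte lives in this register */
        } else {
            on_stack = 1;
            first_reg[i] = -1;
        }
    }
}
```

With sizes {2, 1} for (uint16_t address, uint8_t cmd) this yields r24 (that is, r24:r25) and r22, which are the registers the bootloader routine expects.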

This topic has a solution.
Last Edited: Wed. Jan 2, 2019 - 04:46 PM

Can you try reducing the optimization level, maybe to -O1, or not inlining that function, to see what is generated (just for testing; I understand you want it inlined for optimization)? Maybe the parameter was in r22 originally, was then moved to r18, and that move was optimized out.

That is, I suspect code like this is generated before optimization, and the compiler then decides it can just use r18 when the function is inlined.

ldi    r22, 0x1A
ldi    r24, 0x00
ldi    r25, 0x00
call   function

function:
mov    r18, r22
call   0x40200
ret

 


With -O1 the problem persists. If I define the function body in assembler (a .S file), C treats it as a function and sets up the registers according to the ABI. However, that costs me an additional rcall/ret, as the function is not inlined in that case.

As I understand it, the compiler inlines my function (which is my intention) but leaves me to move the function parameters into the appropriate registers myself. My problem is how to convince the compiler that my call really is a call to a function.


The ABI applies to callable functions.  If you inline the function, the ABI doesn't really apply anymore.

 

How do you expect to call a function that was inlined, anyway?  How do you propose that function to return?  An inlined function is inlined precisely to avoid the call/ret and prologue/epilogue overhead.  Without it, you can't call it because there's no return and no preservation of registers.

This reply has been marked as the solution. 

Ok, after a bit of searching I found a feature that is supposed to solve this: https://gcc.gnu.org/onlinedocs/g...

 

So:

static inline void NVM_ExecCommand_SPM(uint16_t address, uint8_t cmd)
{
	/* Pin each argument to the register the callee expects (avr-gcc ABI). */
	register uint16_t address_ asm("r24") = address;	/* occupies r25:r24 */
	register uint8_t cmd_ asm("r22") = cmd;
	asm volatile(
	"call  %[label]" "\n\t"
	:: [address] "r" (address_), [cmd] "r" (cmd_), [label] "i" (BL_Vector1):
	);
}

 

edit: as explained in #4, the compiler doesn't need to follow the ABI for inlined functions, since they are not called, but this method should force what you want.

Last Edited: Wed. Jan 2, 2019 - 04:22 PM

joeymorin wrote:
The ABI applies to callable functions. If you inline the function, the ABI doesn't really apply anymore. How do you expect to call a function that was inlined, anyway? How do you propose that function to return? An inlined function is inlined precisely to avoid the call/ret and prologue/epilogue overhead. Without it, you can't call it because there's no return and no preservation of registers.

Yes, I know. But this is a very specific situation in which only the wrapper is inlined, while inside the wrapper there is a call to a regular callable function. The compiler doesn’t know about that, so it optimizes the code as for any inlined function. The problem is how to tell the compiler that the call inside the inlined function should follow the ABI.

El Tangas wrote:
Ok, after a bit of searching I found a feature that is supposed to solve this: https://gcc.gnu.org/onlinedocs/g...

Thanks, it did the trick. Now everything compiles perfectly. No more tricky EIND modification.


One more thing, I think it's safer if you add all the call-used registers defined in the ABI to the clobber list: https://gcc.gnu.org/wiki/avr-gcc...
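A sketch of what that could look like, building on the register-variable version above. Two things here are assumptions that this thread does not confirm: BL_Vector1 is taken to be a compile-time constant byte address (0x40200, the call target in the earlier disassembly), and the bootloader routine is taken to follow the normal avr-gcc calling convention. Because a register used as an input cannot also be named in the clobber list, the argument registers are declared as read-write operands instead, and the remaining call-used registers are listed as clobbers. The non-AVR branch is only a stand-in so the file also compiles on a PC.

```c
#include <stdint.h>

#define BL_Vector1 0x40200UL /* assumed byte address of the jump-table entry */

#if defined(__AVR__)
static inline void NVM_ExecCommand_SPM(uint16_t address, uint8_t cmd)
{
    /* Pin the arguments to the registers the callee expects. */
    register uint16_t address_ asm("r24") = address; /* occupies r25:r24 */
    register uint8_t  cmd_     asm("r22") = cmd;

    asm volatile(
        "call  %[label]" "\n\t"
        /* "+r": the callee may overwrite its own argument registers */
        : "+r" (address_), "+r" (cmd_)
        : [label] "i" (BL_Vector1)
        /* remaining call-used registers, plus any memory the callee touches */
        : "r18", "r19", "r20", "r21", "r23",
          "r26", "r27", "r30", "r31", "memory"
    );
}
#else
/* Host stand-in: records the arguments instead of calling the bootloader. */
static uint16_t last_address;
static uint8_t  last_cmd;

static inline void NVM_ExecCommand_SPM(uint16_t address, uint8_t cmd)
{
    last_address = address;
    last_cmd     = cmd;
}
#endif
```

The clobber list costs nothing when those registers are idle, but it stops the compiler from keeping a live value in, say, r18 across the call and finding it overwritten afterwards.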


It's probably a thing that "needs" to be solved in the link stage, rather than in the C source.

I'd think it would be possible to simply have:

extern int bootLoaderFunc(int, int, int);
   :
   
  i = bootLoaderFunc(a, b, 0x1000);

and then tell the linker "bootLoaderFunc = 0x40002", so that the call isn't reduced to an rcall by -mrelax, etc.

avr-gcc -mmcu=atmega2560 defsymbol.c -Os -g -Wl,--defsym=bootLoaderFunc=0x40002

seems to work, for example.

 

I suspect that there's a way to create a .o file containing such definitions that don't actually contain code...  objcopy, perhaps?

 


westfw wrote:
I suspect that there's a way to create a .o file containing such definitions that don't actually contain code... objcopy, perhaps?

That’s another great idea. I will try it too. In the meantime, I have another problem: my bootloader doesn’t fit in the bootloader section, so parts of it must be moved to regular FLASH. But that is another story.


If it's that big, it's too complex. As the bootloader is the only thing that must be 100% error free when you ship, I'd look at throwing away any unnecessary eye candy and just keep the essential comms/program loop.


clawson wrote:
If it's that big, it's too complex. As the bootloader is the only thing that must be 100% error free when you ship, I'd look at throwing away any unnecessary eye candy and just keep the essential comms/program loop.

I’d like to do that. Unfortunately, I need to implement a relatively complex multimaster RS485 protocol; the implementation takes about 3.7 kB, so I have only 350 bytes for the bootloader itself. I can save some bytes if I modify the interrupt vector table and replace the push/pop sequences in the ISRs with calls to a common procedure. But that is still not enough. I will try to identify the code sequences that can be improved (I use rather complex data structures, so maybe gcc doesn’t translate them into optimal assembly code).


TFrancuz wrote:
my bootloader doesn’t fit in the bootloader section, so parts of it must be moved to regular FLASH.

TFrancuz wrote:
implementation takes about 3.7 kB, so I have only 350 bytes for the bootloader itself.

???

Biggest bootloader section size on an Atmega2560 is 8k Bytes.

 

Stefan Ernst


The target device is XMEGA32E5/16E5; I’m only using an XMEGA256A3BU temporarily because of USB and easier debugging. So I have only 4 kB of boot space. The 32E5 is still overkill, as the application code is about 2-3 kB; I chose this part because of its 4 kB boot section. It seems that I will be able to shrink my code to fit into 4 kB.


TFrancuz wrote:
The target device is XMEGA32E5/16E5
Ah, I saw the ATmega2560 in the command line given by westfw. I should have looked at the OP, sorry.

Stefan Ernst