I'm using the version of avr-gcc currently shipping with Atmel Studio.
Consider this minimal example:
    /*
     * library.c
     *
     * Created: 6/4/2018 9:25:36 AM
     * Author : Luke_M
     */

    #include <stdint.h>

    typedef struct {
        uint8_t value1;
        uint16_t value2;
    } myStruct;

    extern const __flash2 myStruct myArray[2][4];

    void myFunc1(uint8_t x, uint8_t y) __attribute__((noinline));
    extern void myFunc2(void) __attribute__((noinline));
    extern void myFunc3(uint16_t value) __attribute__((noinline));

    void myFunc1(uint8_t x, uint8_t y)
    {
        if(myArray[x][y].value1)
        {
            myFunc2();
        }
        else
        {
            myFunc3(myArray[x][y].value2);
        }
    }

    /* Replace with your library code */
    int myfunc(void)
    {
        myFunc1(1,1);
        return 0;
    }
When this library is compiled with any level of optimization, avr-gcc appears to mishandle the read of myArray[x][y].value1. It correctly loads RAMPZ with 0x02 before issuing the ELPM instruction, but afterwards it fails to clear RAMPZ back to zero. In contrast, when it reads myArray[x][y].value2 a few lines later, it does correctly clear RAMPZ when it's done with it.
As a consequence, if myFunc1 takes the path through myFunc2, RAMPZ is still non-zero after it returns. For most AVRs this isn't a problem, because any later use of ELPM is expected to initialize RAMPZ itself before use. However, on XMEGA processors that support an EBI, the LD/ST Z family of instructions always implicitly uses RAMPZ, while the compiler treats SRAM pointers as though they were pure 16-bit. In that case, the failure to clear RAMPZ will cause any later code to fail as soon as it attempts to access a RAM variable through the Z pointer. (For example, avr-libc's implementation of memcpy uses the Z pointer, and thus it would crash.)
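To make the failure mode concrete, here is a minimal host-side sketch of it. This is not AVR code: cpu_model and the three helper functions are hypothetical names of mine, and the model assumes only what is described above, namely that on an EBI-capable XMEGA a load through Z takes the upper address byte from RAMPZ.

```c
#include <stdint.h>

/* Simplified host-side model (assumption: not real AVR code).
   On XMEGA parts with an EBI, LD/ST through Z use RAMPZ as
   bits 23..16 of the data-space address. */
typedef struct {
    uint8_t  rampz;  /* models the RAMPZ I/O register */
    uint16_t z;      /* models the Z pointer (r31:r30) */
} cpu_model;

/* Address that a data-space "ld rX, Z" would actually use. */
static uint32_t ld_z_effective_address(const cpu_model *cpu)
{
    return ((uint32_t)cpu->rampz << 16) | cpu->z;
}

/* Models the value1 path of myFunc1: RAMPZ is set to 2 for ELPM
   and never cleared afterwards. */
static void flash2_read_without_restore(cpu_model *cpu)
{
    cpu->rampz = 2;
}

/* Models the value2 path: RAMPZ is set to 2 for ELPM and then
   cleared again, as in "out __RAMPZ__,__zero_reg__". */
static void flash2_read_with_restore(cpu_model *cpu)
{
    cpu->rampz = 2;
    cpu->rampz = 0;
}
```

With Z pointing at an SRAM variable at, say, 0x2100 (an arbitrary example address), the model yields 0x002100 after flash2_read_with_restore but 0x022100 after flash2_read_without_restore, so the same 16-bit pointer now refers to a different location in the data space.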
For example, here's the assembly generated for myFunc1 with optimization -Os:
    .global myFunc1
        .type   myFunc1, @function
    myFunc1:
    .LFB0:
        .file 1 ".././library.c"
        .loc 1 24 0
        .cfi_startproc
    .LVL0:
    /* prologue: function */
    /* frame size = 0 */
    /* stack size = 0 */
    .L__stack_usage = 0
        .loc 1 25 0
        ldi r23,0
        movw r30,r22
        lsl r30
        rol r31
        add r22,r30
        adc r23,r31
    .LVL1:
        ldi r25,lo8(12)
        mul r24,r25
        add r22,r0
        adc r23,r1
        clr __zero_reg__
        movw r30,r22
        subi r30,lo8(-(myArray))
        sbci r31,hi8(-(myArray))
        ldi r18,2
        out __RAMPZ__,r18
        elpm r24,Z
    .LVL2:
        cpse r24,__zero_reg__
        .loc 1 27 0
        jmp myFunc2
    .LVL3:
    .L2:
        .loc 1 31 0
        adiw r30,1
        ldi r18,2
        out __RAMPZ__,r18
        elpm r24,Z+
        elpm r25,Z+
        out __RAMPZ__,__zero_reg__
        jmp myFunc3
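For anyone reading the listing, the arithmetic before the first ELPM is just the array index calculation: y is multiplied by 3 (shift left, then add), x is multiplied by 12 (the mul with r25), and the later adiw r30,1 steps from value1 to value2. A small sketch of the same calculation follows. The constant and function names are mine, and the 3-byte element size is what the listing implies for AVR (no padding); a host compiler would normally pad the struct to 4 bytes, which is why the sizes are spelled out instead of using sizeof.

```c
#include <stdint.h>

enum {
    STRUCT_SIZE   = 3,  /* uint8_t + uint16_t with no padding, as on AVR */
    COLS          = 4,  /* second dimension of myArray[2][4] */
    VALUE2_OFFSET = 1   /* value2 follows the one-byte value1 */
};

/* Byte offset of myArray[x][y].value1 from the start of myArray:
   12 bytes per row plus 3 bytes per column. */
static uint16_t value1_offset(uint8_t x, uint8_t y)
{
    return (uint16_t)(x * (COLS * STRUCT_SIZE) + y * STRUCT_SIZE);
}

/* Byte offset of myArray[x][y].value2, one byte past value1. */
static uint16_t value2_offset(uint8_t x, uint8_t y)
{
    return (uint16_t)(value1_offset(x, y) + VALUE2_OFFSET);
}
```

For the myFunc1(1,1) call in the example, this gives offset 15 for value1 and 16 for value2.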
Can anybody else confirm? Is this still a problem in the latest upstream avr-gcc 8.1?