AVR-GCC optimization bug? (reported by a newbie..)


Hello, I'm reporting a bug here, since I'm 99.9% sure it's not a real bug and you guys can help me! :D

I'm having trouble with the optimization levels of the avr-gcc compiler. My code works fine if I use -O0 or -O1, but it doesn't work if I use -O2.

I'm using the MACB of a UC3A. In general the Ethernet communication works fine, but in a certain part of my code, with -O2 on, a packet is not sent when it should be. I believe the error is in this part of the code:

My code calls an Ethernet driver function and passes it a pointer to a struct, which contains the data to be sent and a pointer to the size of the data (int *). Immediately after the driver call returns, the size of the data is reset to zero:

//Set the size pointer
stTx.puiSize=&stCommunication.stTx.uiBufferSize;

//Write to ethernet driver
dev_eth_write(&stTx);

// Initialise size to zero
stCommunication.stTx.uiBufferSize=0;

The Ethernet driver function formulates the packet headers and calls the software framework function lMACBSend to send out the packet:

int dev_eth_write( const transmit_struct *stPtr )
{  
// Set protocol parameters
ucaEthHeader[ APCI_PROT_PLACE + APCI_HEADER_PLACE_OFFSET ] = stPtr->uiProtId;

// Set destination mac address  
memcpy( &ucaEthHeader[ETH_HEADER_PLACE_OFFSET + ETH_DEST_MAC_PLACE],  
stPtr->pucMac, ETH_DEST_MAC_SIZE);

// Copy header data to the frame
memcpy(ucaEthernetUserData, ucaEthHeader, CONST_ETH_HEADER_SIZE);

// Copy user data to the frame
memcpy( &ucaEthernetUserData[ CONST_ETH_HEADER_SIZE ], stPtr->pucData, *stPtr->puiSize);

// Send the data
lMACBSend(&AVR32_MACB, ucaEthernetUserData, *stPtr->puiSize + CONST_ETH_HEADER_SIZE,TRUE);

    return 0;
}

I use this same function for sending data all the time and it works just fine. Only when -O2 is used and the function is called from the code above does nothing get sent.

If I modify the Ethernet driver slightly by introducing a variable 'a' that holds the size of the packet, and change the call to lMACBSend to use 'a' as the size, everything suddenly works:

int dev_eth_write( const transmit_struct *stPtr )
{  
// Copy size to a
int a = *stPtr->puiSize+CONST_ETH_HEADER_SIZE;

// Set protocol parameters
ucaEthHeader[ APCI_PROT_PLACE + APCI_HEADER_PLACE_OFFSET ] = stPtr->uiProtId;

// Set destination mac address  
memcpy( &ucaEthHeader[ETH_HEADER_PLACE_OFFSET + ETH_DEST_MAC_PLACE],  
stPtr->pucMac, ETH_DEST_MAC_SIZE);

// Copy header data to the frame
memcpy(ucaEthernetUserData, ucaEthHeader, CONST_ETH_HEADER_SIZE);

// Copy user data to the frame
memcpy( &ucaEthernetUserData[ CONST_ETH_HEADER_SIZE ], stPtr->pucData, *stPtr->puiSize);

// Send the data with size a
    lMACBSend(&AVR32_MACB, ucaEthernetUserData, a,TRUE);

    return 0;
}

Or if I add a GPIO toggle between the call to the driver and setting the size to zero, everything works.

Have you heard of anything like this before? Any ideas on how to investigate and fix this? There are some minor differences between the disassemblies with and without 'a', but I don't understand (optimized) assembler well enough to spot any problems. :(


Hello,

I have also seen problems with -O2 optimization, and I told Atmel about it, but they didn't seem to care. It is not related to missing volatiles or to code that could produce undefined behavior. For some reason (especially with large functions) avr32-gcc simply seems to mess up the code. The only solution I have found is to stick to -O1.
Once I checked all the optimization flags that -O2 adds on top of -O1 (one by one), I found the one causing the problem and disabled it, but the fix was only temporary, because other problems appeared later as the code grew. To me it seems to be related to large source files and long routines.

Daniel Campora http://www.wipy.io


OK, probably the same thing here. The problems started appearing when the project got larger. This isn't the first time optimization has messed things up, but this time it almost caused something rather fatal, so I started investigating further. I guess we have to stick to -O0, since all the debugging has been done with it and it is the least likely to cause problems.

I have a hunch that the actual problem might also have something to do with interrupts or timing, but it is really hard to track down which part fails in a large project like this when several people are writing code at the same time. :?


Can you try adding this compiler memory barrier between calling dev_eth_write and setting the size to zero? Then try -O2 again.

asm volatile("" ::: "memory");

Quote:

I guess we have to stick to -o0 since all the debugging has been done with it and it is the most likely not to cause any problems.

Jamirant, what I have been doing for two years with the AVR32 is to debug with -O0 and then use -O1 for the release build. With that approach everything has worked fine. Do not use -O0 for the release build, since -O0 produces a very large and inefficient binary. AFAIK with -O1 you are safe from optimization bugs.

Daniel Campora http://www.wipy.io


I highly doubt this is an optimization bug. I use -O3 exclusively and have never encountered any problems, including a couple of UC3 projects with over 60,000 lines of source code and some very large functions. I'll bet the OP's problem is a simple race condition. Look at what he's doing: setting up some data, passing it to a send() function, and then immediately messing with the data being sent. The send() is probably still in progress when he zeros out the buffer size parameter. How do you know the send is complete at that point?

Letting the smoke out since 1978


digitalDan wrote:
I highly doubt this is an optimization bug. I use -O3 exclusively and have never encountered any problems. This includes a couple UC3 projects that have over 60,000 lines of source code and some very large functions.
I’m with you on that. I use -O3 for developing, debugging and the release as well. No problems (except for freaking hardware errors).
digitalDan wrote:
I'll bet the OP's problem is a simple race condition. Look at what he's doing; setting up some data, passing it to a send() function and then immediately messes with the data being sent. The send() is probably still in progress when he zeros out the buffer size parameter. How do you know the send is complete at that point?
While I expect a race condition as well, overwriting the size parameter should be safe at that point, because it got passed to lMACBSend(...) via call-by-value.

The compiler can break your program by optimising, but that’s because you haven’t told it what to optimise and what not (hence my compiler optimisation barrier mentioned before). It’s a stupid rule-driven program after all. We programmers should be glad that we’re not obsolete yet. ;-)


catweax wrote:
While I expect a race condition as well, overwriting the size parameter should be safe at that point, because it got passed to lMACBSend(...) via call-by-value.

Ah, I missed that first time around, good point. I'd like to see the assembly dumps from the OP's two examples...

Letting the smoke out since 1978


digitalDan wrote:
I'd like to see the assembly dumps from the OP's two examples...

Thank you all for helping me with this! I'll post the dumps tomorrow.

I'm also pretty sure it is some kind of race condition at some level since at least sometimes when debugging through the optimized code, it seems to work fine.


OK, I need to take back what I said...
There are no optimization bugs in the GCC port for the AVR32; thanks digitalDan and catweax for your comments. A couple of days ago I found a few sections of my code that invoked undefined behavior. The thing is that other compilers like IAR are a lot more forgiving about code that might produce undefined behavior (even at the highest optimization level): 99% of the time they generate correctly working code, and the other 1% of the time IAR warns you about the possible undefined behavior.
GCC, on the other hand, at optimization levels -O2 and -O3 will happily compile such code without letting you know. Here is what I have learned to avoid undefined-behavior issues. I applied it to a very large project that wasn't working with -O2 or -O3, and now it runs like a champ:

1. Always write defensive code, which means:

a. The first thing a function should do is check for NULL pointers passed to it, and return immediately if a pointer is NULL. It doesn't matter if you are 100% sure that you will never pass a NULL pointer to this function; GCC can't know that.
b. Always check array indexes to make sure they never go beyond the array size or become negative. sizeof() is your friend here.
c. When incrementing or decrementing a variable, always check that it doesn't overflow or underflow.
d. Check for division by zero (this one is super obvious, but anyway...).

The thing is, all of this is obvious to any programmer, but sometimes when you write code and are sure that none of those situations can happen, because you are passing your parameters correctly, you omit the NULL pointer, array index, etc. checks, and that's when undefined behavior bites you...
The "funny" thing about undefined behavior is that it doesn't show up all the time, and it often doesn't even seem to come from the problematic function; it is erratic and confusing, and that's why it is so hard to detect and correct.

I hope this helps anyone having problems with the optimizations.

Best regards,

Daniel Campora http://www.wipy.io