UPDATE: I think I can say this is "solved" now. You may read more on this here: https://www.avrfreaks.net/index.p... , and of course Catweax's topic: https://www.avrfreaks.net/index.p... as this was discovered to be the same bug.
Following up on one of my previous posts: I have tracked down what looks like a processor bug.
Linux + avr32-gcc 3.4.1-348
The processor is used on a custom board; we did not use any evaluation board, so I cannot give a "ready to use" test case. I tested it on several different boards of ours and the failure reproduces on each, so I think the fault really is in the processor (that is, not in our application).
The failure described:
When using a USART (this may also apply to other peripherals; I did not test), if the USART's interrupts are disabled at the peripheral through its 'idr' register, and this register was accessed indirectly, then some read accesses a few cycles later partially fail, apparently reading zero for the bottom or the top halfword of a 32-bit word.
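To make the symptom easier to check for, here is a minimal host-side sketch in C. It is not taken from the attached test case. The names mock_usart_t, classify_read and disable_then_read are hypothetical, the register struct is a stand-in that does not reflect the real USART layout, and I am assuming "accessed indirectly" means a write through a pointer. It only shows the access pattern (write 'idr', then read a known word) and how a zeroed halfword can be told apart from any other mismatch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the USART register block. Only 'idr' is
   modelled and the layout is NOT the real one. */
typedef struct {
    volatile uint32_t idr;   /* interrupt disable register */
} mock_usart_t;

typedef enum {
    READ_OK,          /* observed value matches the expected one     */
    LOW_HALF_ZERO,    /* bottom halfword read as zero, top intact    */
    HIGH_HALF_ZERO,   /* top halfword read as zero, bottom intact    */
    OTHER_MISMATCH    /* differs in some other way                   */
} read_fault_t;

/* Compare a read against the value known to be stored there. */
static read_fault_t classify_read(uint32_t expected, uint32_t observed)
{
    if (observed == expected)
        return READ_OK;
    if (observed == (expected & 0xFFFF0000u))
        return LOW_HALF_ZERO;
    if (observed == (expected & 0x0000FFFFu))
        return HIGH_HALF_ZERO;
    return OTHER_MISMATCH;
}

/* The pattern described above: disable interrupts through a pointer to
   the register block (assumed meaning of "indirect"), then read a word
   whose content is known and classify the result. */
static read_fault_t disable_then_read(mock_usart_t *usart, uint32_t irq_mask,
                                      const volatile uint32_t *word,
                                      uint32_t expected)
{
    assert(usart != NULL && word != NULL);
    usart->idr = irq_mask;              /* pointer-based write to idr */
    return classify_read(expected, *word);
}
```

On a PC this will of course always report READ_OK; on the target the read would have to land a few cycles after the 'idr' write, which is what the 'nop' padding in the test case is for.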
A test case is attached; it works as follows:
It is set up for one of our boards, but I don't think it will be hard to adapt it to another board.
It sets up a USART interrupt and runs a stream of instructions that triggers the bug. On the USART it repeatedly spits out an 8-byte sequence which includes a fault counter and the failing read (this is only updated on a fault, so it can be watched in real time). It seems to work reliably with any optimization setting; if it does not (or after modifying the USART routines), the number of 'nop's in the source that generates the fault may need adjusting. The fault seems to slide around depending on various factors I could not identify, probably including the absolute placement of the fault-generating code.
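For anyone who wants to decode or mimic the output, a rough C sketch of the bookkeeping follows. The byte layout is an assumption made purely for illustration (4 bytes of fault counter followed by 4 bytes of the failing read, most significant byte first); the attached test case defines the real layout. REPORT_LEN, record_fault and pack_report are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPORT_LEN 8   /* the sequence on the USART is 8 bytes long */

/* Update the counter and the stored failing value only when a read
   does not match, so the report stays stable between faults. */
static void record_fault(uint32_t *fault_count, uint32_t *last_bad,
                         uint32_t expected, uint32_t observed)
{
    assert(fault_count != NULL && last_bad != NULL);
    if (observed != expected) {
        (*fault_count)++;
        *last_bad = observed;
    }
}

/* ASSUMED layout: bytes 0..3 = fault counter, bytes 4..7 = failing
   read, both most significant byte first. */
static void pack_report(uint8_t out[REPORT_LEN],
                        uint32_t fault_count, uint32_t failing_read)
{
    for (int i = 0; i < 4; i++) {
        out[i]     = (uint8_t)(fault_count  >> (24 - 8 * i));
        out[4 + i] = (uint8_t)(failing_read >> (24 - 8 * i));
    }
}
```

With a layout like this, a zeroed halfword shows up directly as two 00 bytes in the second half of the sequence.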
Anyone willing to verify this?