I do not want to start a new 'compiler war'; I just need some advice from experienced users of the GCC tools (WinAVR):
I have been a defender of the IAR compiler for a long time. At least the generated code is always very small AND fast. Support is OK, bugs are fixed only in the 'next release' 3-4 months later, and the annual maintenance fees are higher than for most other compilers .... we have heard all this before.
After all the good advice found in this forum regarding the GNU tools, I just downloaded the new AVR Studio and WinAVR. Installation takes just a few mouse clicks, and the compiler is very nicely integrated into AVR Studio. Within minutes I could compile and debug a first project. Very impressive!!!
But now I have some questions regarding the code generation. If I compile some of my old routines and compare the runtime to the IAR-generated code, the GCC code is much slower. So far I have only tried some very specialized routines that have to be fast for me: two different square root routines, both of which take about 60-80% longer.
Example:
/* Bitwise trial square root. The start bit is 512, so the result is limited to 0..1023. */
unsigned int IntSqrt(unsigned long w1)
{
    unsigned long i1;
    unsigned int k0, k1;

    k0 = 512;
    k1 = 0;
    while (k0 > 0)
    {
        i1 = k1 + k0;           /* candidate root with the next bit set */
        i1 = i1 * i1;
        if (i1 <= w1)
            k1 = k1 + k0;       /* keep the bit if the square still fits */
        k0 = k0 >> 1;
    }
    return (k1);
}

/* Shift-and-subtract square root, one result bit per loop pass. */
unsigned int Sqrt_FL(unsigned long value)
{
    unsigned int PosCnt = 0x8000;
    unsigned int RetVal = 0;
    unsigned long Tmp = 0;
    unsigned char Flag = 0;

    while (PosCnt)
    {
        Tmp = (Tmp & 0x0000FFFF) | ((unsigned long)PosCnt << 16);
        Tmp = (Tmp >> 1) | ((unsigned long)RetVal << 16);
        if (Flag || (Tmp <= value))
        {
            value -= Tmp;
            RetVal |= PosCnt;
        }
        Flag = !(!(value & 0x80000000L));   /* top bit of value before the shift */
        value = value << 1;
        PosCnt = PosCnt >> 1;
    }
    return RetVal;
}

void main(void)
{
    volatile unsigned long value;
    volatile unsigned int result;

    value = 123456789;
    while (1)
    {
        result = IntSqrt(value);
        result = Sqrt_FL(value);
        value++;
    }
}
Runtime, IAR first and GCC in parentheses:
IntSqrt: 402 (750) cycles
Sqrt_FL: 607 (900) cycles
This is quite a big difference for what seem to me to be quite straightforward routines.
I have tried the different optimization levels, -O0 to -O3 and -Os. Is this what I have to expect with the GCC compiler, or am I doing something wrong?
Another detail:
I have to declare result as volatile, because IAR at least would otherwise remove the complete calculation. If I remove the volatile qualifier from value, where it serves no real purpose, then the GCC code for Sqrt_FL takes ~70 cycles longer!?!?
Any advice would be greatly appreciated, because I am seriously considering switching from IAR to GCC (on AVR, MSP430 and ARM).
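To illustrate the volatile point: one possible way to keep the calculation alive is to touch volatile objects only once on the way in and once on the way out, and to do the arithmetic on plain local copies. This is only a sketch under assumptions. The names int_sqrt, input_value, result_sink and measure_once are made up for illustration, the routine is the same algorithm as IntSqrt above (including the start bit of 512), and whether this changes the cycle counts on either compiler has not been measured.

```c
#include <stdint.h>

/* Same algorithm as IntSqrt above, written with fixed-width types.
   The start bit is 512, so the result is limited to 0..1023. */
uint16_t int_sqrt(uint32_t w1)
{
    uint16_t k0 = 512;
    uint16_t k1 = 0;

    while (k0 > 0)
    {
        uint32_t i1 = (uint32_t)k1 + k0;   /* candidate root */
        i1 = i1 * i1;
        if (i1 <= w1)
            k1 = (uint16_t)(k1 + k0);      /* keep the bit */
        k0 >>= 1;
    }
    return k1;
}

/* Hypothetical globals: the only volatile objects in the test. */
volatile uint32_t input_value = 123456789UL;
volatile uint16_t result_sink;

/* One measured pass: a single volatile read, the calculation on a
   non-volatile local copy, and a single volatile write. */
uint16_t measure_once(void)
{
    uint32_t v = input_value;   /* compiler cannot assume a constant here */
    uint16_t r = int_sqrt(v);
    result_sink = r;            /* store cannot be optimized away */
    return r;
}
```

Calling measure_once() in the main loop would replace the volatile locals; the volatile read stops the compiler from folding the result at compile time, and the volatile write stops it from dropping the call.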
Regards,
Jörg.