I just stumbled upon this document at Atmel and I want to share it with you. It may have been posted before, but it probably won't hurt to be reminded of it again:
It discusses how to optimize code for size or speed through the way you write it. It does not discuss compiler optimization.
Did you know that an increment loop will produce slightly larger code than a decrement loop?
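
To make the loop point concrete, here is a minimal sketch in C. It is my own illustration, not code from the Atmel document, and the function names sum_up and sum_down are made up. The usual explanation is that a loop counting down to zero can branch on the zero flag set by the decrement itself, while a loop counting up needs an extra compare against the upper bound. Whether you actually save anything depends on your compiler and optimization level, so check the generated listing.

```c
#include <stdint.h>

/* Hypothetical example: count-up loop.
 * Each pass compares the counter i against the bound n. */
uint8_t sum_up(const uint8_t *buf, uint8_t n)
{
    uint8_t sum = 0;
    for (uint8_t i = 0; i < n; i++) {
        sum += buf[i];
    }
    return sum;
}

/* Hypothetical example: count-down loop.
 * The loop ends when n reaches zero, so the compiler may be able to
 * reuse the zero flag from the decrement instead of a separate compare. */
uint8_t sum_down(const uint8_t *buf, uint8_t n)
{
    uint8_t sum = 0;
    while (n != 0) {
        n--;
        sum += buf[n];
    }
    return sum;
}
```

Both functions return the same result (the sum wraps at 8 bits); only the direction of the loop differs. The count-down version walks the buffer from the end, which is fine when the order of the elements does not matter.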