AvrDelayLoop.exe is a handy tool but it generates some really complicated code. Example:

    ; =============================
    ;    delay loop generator
    ;     50000000 cycles:
    ; -----------------------------
    ; delaying 49939965 cycles:
              ldi  R18, $FF
    WGLOOP0:  ldi  R19, $FF
    WGLOOP1:  ldi  R20, $FF
    WGLOOP2:  dec  R20
              brne WGLOOP2
              dec  R19
              brne WGLOOP1
              dec  R18
              brne WGLOOP0
    ; -----------------------------
    ; delaying 60030 cycles:
              ldi  R18, $57
    WGLOOP3:  ldi  R19, $E5
    WGLOOP4:  dec  R19
              brne WGLOOP4
              dec  R18
              brne WGLOOP3
    ; -----------------------------
    ; delaying 3 cycles:
              ldi  R18, $01
    WGLOOP5:  dec  R18
              brne WGLOOP5
    ; -----------------------------
    ; delaying 2 cycles:
              nop
              nop
    ; =============================

I put together a web page using JavaScript that generates much simpler code, such as:

    ; Delay 50 000 000 cycles
        ldi  r18, 254
        ldi  r19, 167
        ldi  r20, 102
    L1: dec  r20
        brne L1
        dec  r19
        brne L1
        dec  r18
        brne L1
        nop
        nop

I think I got the math right, and I tested it in Firefox and IE on Windows, but if you have any problems let me know. If I've reinvented the wheel and there's already something that calculates this type of loop, let me know that too. I looked but didn't find one.

The math works out like this:

Zero, one, or two "nop" instructions pad whatever is left over after the loops, since the innermost loop can only add cycles in multiples of 3.

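For up to 768 cycles, a single loop is enough. Each pass takes 3 cycles (DEC plus a taken BRNE), except the last, which takes 2 because the branch falls through; the LDI adds 1 cycle back, so the total is exactly 3 cycles per pass.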

For up to 197121 cycles, an outer loop is added. On the first pass of the outer loop, the inner loop runs for whatever short count was loaded into it; on every pass after that it runs 256 times, because its counter has wrapped around to zero. Each of those later passes takes 770 cycles (256*3, -1 for the final branch falling through, +3 for the outer DEC/BRNE).
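The 770-cycle figure can be checked by stepping through the loop one instruction at a time. This is a minimal sketch in Python, not the JavaScript from my page; the function name simulate is made up for illustration, and it assumes the usual AVR timings (LDI and DEC 1 cycle, BRNE 2 cycles when taken and 1 when it falls through) and that every BRNE jumps back to the single L1 label.

```python
def simulate(counters):
    """Cycle-count the nested DEC/BRNE delay loop, one instruction at a time.

    counters[0] is the outermost register and counters[-1] the innermost,
    matching the order of the LDI lines. A loaded value of 0 acts as 256.
    NOP padding is not included.
    """
    regs = list(counters)
    cycles = len(regs)              # one cycle per LDI
    innermost = len(regs) - 1
    level = innermost               # execution starts at the innermost DEC
    while level >= 0:
        regs[level] = (regs[level] - 1) % 256   # DEC wraps 0 -> 255
        if regs[level] != 0:
            cycles += 3             # DEC (1) + taken BRNE (2)
            level = innermost       # every BRNE jumps back to L1
        else:
            cycles += 2             # DEC (1) + BRNE that falls through (1)
            level -= 1              # carry on with the next outer DEC
    return cycles


# One extra count in the outer register costs one full wrap of the inner loop.
print(simulate([2, 1]) - simulate([1, 1]))
# Longest delays for one and two loops (both counters loaded with 0).
print(simulate([0]), simulate([0, 0]))
```

Stepping like this is only practical for short delays; for long ones the arithmetic is the way to go.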

After a third loop is added, the two inner loops together, with their counters wrapped to zero, take 197122 cycles per pass. The same figure for three inner loops is 50463234 cycles, for four about 13 billion, and so on (256 * previous + 2).
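The "256 * previous + 2" rule is easy to tabulate. Here is a rough Python sketch of it (again not the page's JavaScript; wrap_cost and max_delay are names invented for this example). The maximum assumes every counter is loaded with 0, which acts as 256, and it leaves out the NOP padding.

```python
def wrap_cost(level):
    """Cycles added by one extra count in the register at `level`.

    Level 1 is the innermost loop (3 cycles per pass). Each level outward
    costs 256 passes of the level inside it, -1 for the BRNE that falls
    through, +3 for its own DEC/BRNE.
    """
    cost = 3
    for _ in range(level - 1):
        cost = 256 * cost + 2
    return cost


def max_delay(loops):
    """Longest delay reachable with `loops` registers, before NOP padding."""
    # 3 cycles minimum per loop (LDI + DEC + untaken BRNE), plus 255 extra
    # counts at every level.
    return 3 * loops + sum(255 * wrap_cost(k) for k in range(1, loops + 1))


for n in range(1, 6):
    print(n, wrap_cost(n), max_delay(n))
```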

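This example for 50 million cycles takes 253 * 197122 + 166 * 770 + 101 * 3 + 9 + 2 = 50 000 000. Each counter contributes its loaded value minus one, times the cost at its level; the 9 is the minimum for three loops (three LDIs plus three DEC/BRNE pairs that fall through at 2 cycles each), and the 2 is the two NOPs.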
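Going the other way, from a cycle count to the register values, is a division at each level starting from the outermost loop. The following is a Python sketch of that idea, not the actual JavaScript behind the page; delay_loop is a hypothetical name. Clamping each count to 255 extra passes is my assumption for handling the rare remainders that don't fit in 8 bits, and in those cases it can leave more than two NOPs.

```python
def delay_loop(cycles):
    """Return (counters, nops) for a nested DEC/BRNE delay of `cycles` cycles.

    counters holds the LDI values, outermost register first; 0 means 256.
    """
    if cycles < 3:
        return [], cycles                   # too short for a loop: NOPs only
    # Extra cycles per additional count at each level: 3, 770, 197122, ...
    weights = [3]
    # Add loops until the outermost count fits in 8 bits.
    while (cycles - 3 * len(weights)) // weights[-1] > 255:
        weights.append(256 * weights[-1] + 2)
    remaining = cycles - 3 * len(weights)   # minimum cost: 3 cycles per loop
    counters = []
    for weight in reversed(weights):        # outermost level first
        extra = min(remaining // weight, 255)
        remaining -= extra * weight
        counters.append((extra + 1) % 256)  # 256 passes are loaded as 0
    return counters, remaining              # leftover cycles become NOPs


print(delay_loop(50_000_000))   # ([254, 167, 102], 2)
```

The result matches the ldi values and the two NOPs in the 50 million cycle listing.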

Here's a one-day loop for a 20MHz clock:

    ; Delay 1 728 000 000 000 cycles
        ldi  r18, 134
        ldi  r19, 195
        ldi  r20, 193
        ldi  r21, 122
        ldi  r22, 166
    L1: dec  r22
        brne L1
        dec  r21
        brne L1
        dec  r20
        brne L1
        dec  r19
        brne L1
        dec  r18
        brne L1
        nop
        nop
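A delay this long can't be timed by stepping through it, but the register values can be plugged back into the arithmetic. This is a small Python sketch of that check (loop_cycles is a name made up here; it assumes the same timings as above, with a loaded 0 counting as 256).

```python
def loop_cycles(counters, nops=0):
    """Closed-form cycle count for the nested DEC/BRNE loop.

    counters are the LDI values, outermost register first; 0 means 256.
    """
    total = 3 * len(counters) + nops    # LDI + DEC + untaken BRNE per loop
    weight = 3                          # cost of one extra innermost pass
    for value in reversed(counters):    # innermost register first
        count = value if value else 256
        total += (count - 1) * weight
        weight = 256 * weight + 2       # cost of one extra pass, next level out
    return total


one_day = 20_000_000 * 24 * 60 * 60     # cycles in a day at 20 MHz
print(loop_cycles([134, 195, 193, 122, 166], nops=2) == one_day)   # True
```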