AVR Delay Functions


Hi Reader,
I'm not new to asm programming, but I am new to CodeVision. Anyway, my problem is this: I want to write a function that takes the delay I want, in seconds or milliseconds, and generates that delay. C or asm is OK, but asm is preferred. Thanks for your help.

please send to me on: hazem_x_h_work@hotmail.com

urs hazem hegazy


There is a DELAY.h header that you should include in your C program before using the functions below.
Be sure to disable all interrupts before calling the functions so that the delay works properly.
You can then use these functions anywhere in your program:
void delay_us(unsigned int n);
void delay_ms(unsigned int n);


Hello,
I've got the same problem, and would like to have precise delays without timers, only that delay.h seems too big to be used on an ATtiny2313 with its 2kB Flash!


josefbs wrote:
Hello,
I've got the same problem, and would like to have precise delays without timers, only that delay.h seems too big to be used on an ATtiny2313 with its 2kB Flash!

In AVR-LIBC the main delay function accepts float type of variables, using even a single float variable can take 2-4kB of memory. I always custom wrap the inline assembler loops to my own macro. Maybe this is possible in your compiler too.

- Jani

//-------------------
void delnms(unsigned int n){
//delay n ms
unsigned int x;

  while(n--){
    x=2600;       //empirically determined fudge factor 16mhz
    while(x--);
  }
}

//-------------------
void del100us(unsigned int n){
//delay n 100us
unsigned int x;

  while(n--){
    x=260; 
    while(x--);
  }
}

//-------------------
void del10us(unsigned int n){
//delay n 10us
unsigned int x;

  while(n--){
    x=26; 
    while(x--);
  }
}

MHz N
14 2500
16 2600
18 2800
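
With an optimizing compiler such as avr-gcc, an empty counting loop like the ones above may be removed entirely, because it has no observable effect. Below is a minimal sketch of one common workaround, declaring the counter volatile. The names delnms_volatile and inner_count_for_mhz are hypothetical, the counts are simply the table above, and the volatile accesses change the loop timing, so the fudge factors would have to be measured again on real hardware.

```c
#include <stdint.h>

/* Hypothetical helper: inner-loop count per millisecond, copied from the
   empirically determined table above. Returns 0 for clocks not listed. */
uint16_t inner_count_for_mhz(uint8_t mhz)
{
    switch (mhz) {
    case 14: return 2500;
    case 16: return 2600;
    case 18: return 2800;
    default: return 0;
    }
}

/* Busy-wait for n outer passes of 'inner' iterations each.
   The volatile counter forces the compiler to keep every decrement,
   so the loop survives optimization. Timing is NOT calibrated here. */
void delnms_volatile(uint16_t n, uint16_t inner)
{
    while (n--) {
        volatile uint16_t x = inner;
        while (x--) {
            /* intentionally empty */
        }
    }
}
```

A call such as delnms_volatile(10, inner_count_for_mhz(16)) would then mirror delnms(10) from the post above, subject to recalibration.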

Imagecraft compiler user


Hi..

When using the delay.h library from CodeVision, you need to disable interrupts when you call the delay functions. If you don't disable the interrupts, the delay time may not be right.

You can check this in the CodeVision Help.

Regards,

Bruno Muswieck


It'll just be longer, but usually that's why you want a delay... to give something a few ms to happen....

Imagecraft compiler user


I mean that if you want a precise delay with the delay.h library, you need to disable the interrupts.

Regards,

Bruno Muswieck


brunomusw wrote:
I mean that if you want a precise delay with the delay.h library, you need to disable the interrupts.

Offtopic yes for using delay loops, but if precise delays are required, I would use a timer interrupt to generate that, so that other interrupts would have little effect on delay length.
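
As a rough illustration of the timer-based approach, here is a minimal sketch that counts milliseconds in a tick counter and tests for elapsed time instead of spinning in a calibrated loop. Everything here is an assumption for illustration: tick_isr is a hypothetical hook that would be called from a 1 ms timer compare-match interrupt, and the timer setup itself is hardware specific and not shown.

```c
#include <stdint.h>
#include <stdbool.h>

/* Millisecond tick counter, incremented from a timer interrupt. */
static volatile uint32_t ms_ticks;

/* Hypothetical hook: call this once per millisecond from the timer ISR. */
void tick_isr(void)
{
    ms_ticks++;
}

/* Snapshot of the tick counter. On an 8-bit AVR a 32-bit read is not
   atomic, so real code would briefly disable interrupts around it. */
uint32_t ticks_now(void)
{
    return ms_ticks;
}

/* True once at least duration_ms ticks have passed since 'start'.
   Unsigned subtraction keeps this correct across counter wrap-around. */
bool delay_elapsed(uint32_t start, uint32_t duration_ms)
{
    return (uint32_t)(ticks_now() - start) >= duration_ms;
}
```

Other interrupts can postpone a tick slightly, but they do not stretch the overall delay the way they stretch a busy loop.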

- Jani


Quote:
josefbs wrote:
Hello,
I've got the same problem, and would like to have precise delays without timers, only that delay.h seems too big to be used on an ATtiny2313 with its 2kB Flash!

In AVR-LIBC the main delay function accepts float type of variables, using even a single float variable can take 2-4kB of memory. I always custom wrap the inline assembler loops to my own macro. Maybe this is possible in your compiler too.


There will be no floating-point operations at run time if your delay parameter is a constant and you have enabled the optimizer. In any case, I recommend using at least -O1. Then the floating-point calculation will be evaluated at compile time by the compiler!

Pay attention to the usable delay range, which depends on F_CPU. Refer to the avr-libc documentation for details.

Peter


josefbs wrote:
Hello,
I've got the same problem, and would like to have precise delays without timers, only that delay.h seems too big to be used on an ATtiny2313 with its 2kB Flash!

Presumably you are using AVR Studio as an IDE for avr-gcc? For some completely odd reason the guys at Atmel decided to default projects to build with -O0 rather than the more usual -Os. The functions in avr-libc's delay.h will only resolve to optimised delays at compile time IF the optimisation level is something other than -O0. That's yet another reason why AVR Studio should NOT default to -O0, but I guess it's done so there's a one-to-one correspondence between the source and the object code when debugging/simulating.

Cliff

 


Quote:

Hello,
I've got the same problem, and would like to have precise delays without timers, only that delay.h seems too big to be used on an ATtiny2313 with its 2kB Flash!

Huh? CodeVision's delay library?

delay_us() is done in-line; maybe 10 words per invocation. The delay_ms() routine is small; call overhead is maybe half a dozen words.

You must have a whole bunch of invocations to make a noticeable impact on flash usage.

Lee

You can put lipstick on a pig, but it is still a pig.


For a small flash processor, I vote for not using inlines if the flash starts filling up. There should be 2 versions of the delays.. one that takes the call and return overhead into the calc, and one for inline/macro use. That 'count an int down from 2400' trick is quick and easy and scales well... I tested it by calling it 10 times in a loop that repeated 1000 times, so the loop overhead is small.

Imagecraft compiler user


Although it is for WinAVR, take a look at this project:
http://www.avrfreaks.net/index.php?module=Freaks%20Academy&func=viewItem&item_type=project&item_id=665
It creates delays as accurate as a single clock cycle.
It requires compile time constants and at least -O1 optimization.

Heinrichs.hj


Quote:

There should be 2 versions of the delays.. one that takes the call and return overhead into the calc, and one for inline/macro use.

CodeVision already has that, Bob-- delay_us() is inline for accurate timing. delay_ms() is a function call. You can, however, do whatever you want. :) With the library routines in CV, and if you tell the truth about your AVR clock speed, and if you don't let ISRs get in the way, the CV times are going to be quite accurate down to a us or so for delay_us(), and nearly that for delay_ms(), regardless of your AVR clock speed.

Lee

You can put lipstick on a pig, but it is still a pig.


Hi,
I am using AVR Studio. I want to generate us delays.
I included delay.h and called the function as _delay_us(1) to get a 1us delay.
The compiler optimization is set to -O0.
When I compile, I see severe code bloat: the flash consumption increased from 33% to 50%.
Can anyone suggest why this is happening?
What is the workaround?


I understood from the header description that this is because the floating point libraries are linked.
Please suggest a solution.


Perhaps a solution could also be found in the documentation... In case actually checking is a bit much to ask... ALWAYS compile with optimization >= -O1, and with a constant parameter, so that the floating-point libraries are not pulled in.

Martin Jay McKee

As with most things in engineering, the answer is an unabashed, "It depends."


Quote:
obviously you need to figure out how to prevent optimization from taking your code out
And why would the optimizer take that code out?

Regards,
Steve A.

The Board helps those that help themselves.


Quote:

Quote:
delay_us() is inline for accurate timing

it depends on what you mean by "accurate".


OK, we are mixing apples and oranges here. But you did get a rise out of me.

The quote is from the old posts, referring to CodeVision. The new posts are talking about GCC.

[and as always, the question of accurate long delays, and even short delays in C, on a microcontroller is a tempest in a teapot. Any real app has a lot more to do than sit in one place and twiddle, And those apps that need cycle-counting (e.g., video generation)--need cycle-counting. Again the implementation of delay routines as C library functions is immaterial.]

So let's get back to your quote and question. What do >>you<< mean by accurate? AFAIK the CV routines will indeed give the correct microseconds. I don't know if at 3.6864MHz clock and a requested delay of (say) 4us whether you get 15 clocks or 16. I'd speculate it is consistent.

AFAIK CodeVision doesn't have a delay_cycles(). IIRC someone developed/posted a nice one for GCC.

So, are you implying that the CV delay_us() is >>not<< accurate?

[edit] You got me curious. So I poked a few delay_us() into a test program and ran it through Simulator2 using a 3.6864MHz AVR clock setting.

AVR Clock 3.6864MHz
_us     cycles      time (us)
6       22          5.968
5       18          4.883
4       15          4.069
3       12          3.255
2       7           1.899
1       3           .814
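
The time column above is just the cycle count divided by the clock frequency. A small host-side sketch of that conversion, with hypothetical helper names (cycles_to_us, us_to_cycles), might look like this:

```c
#include <stdint.h>

/* Time in microseconds taken by 'cycles' CPU cycles at f_cpu_hz. */
double cycles_to_us(uint32_t cycles, uint32_t f_cpu_hz)
{
    return (double)cycles * 1e6 / (double)f_cpu_hz;
}

/* Ideal (fractional) cycle count for a requested delay in microseconds. */
double us_to_cycles(double us, uint32_t f_cpu_hz)
{
    return us * (double)f_cpu_hz / 1e6;
}
```

For a 4 us request at 3.6864MHz the ideal count is 14.7456 cycles, so a whole-cycle delay has to land on either 14 or 15, and the measured 15 cycles is the nearer of the two.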

You can put lipstick on a pig, but it is still a pig.


Quote:

I am implying nothing but saying exactly what I said.

Then tell me what >>is<< accurate, if as you are implying they are not accurate?

LOL -- but as I've said before, in the real world of AVR microcontroller apps WTFC (Who The Freak Cares) whether the delay_us() at 3.6864MHz is 15 or 16 cycles as long as the result is consistent and as close as practical?

Let's use the energy to solve practical situations rather than debating how many angels can fit through the eye of a needle.

Lee

You can put lipstick on a pig, but it is still a pig.


Quote:
Quote:
AFAIK the CV routines will indeed give the correct microseconds.

what does that mean?
Since delay_us takes an int, why would you expect anything better than +/- 0.5us?

Regards,
Steve A.

The Board helps those that help themselves.


Can someone explain how the optimizer and the delay are related??


Quote:

Can someone explain how the optimizer and the delay are related??

Talk about a co-incidence. I just posted an explanation about this to the following thread:

http://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&t=99192

In that my _delay_ms(1.7) example boils down to just FOUR opcodes but this is because all the "heavy work" (calculating how many times to iterate the tight loop) was done at COMPILE time and not RUN time. But it was only done at compile time because the optimiser was enabled and if it can (because values are fixed at compile time) it will do all the heavy maths during the compilation rather than leaving it to be done at run time.

It's the same reason why if I write:

int a,b,c;
a = 34;
b = 56;
c = a * b;
OCR1A = c;

then with optimization the code produced is:

	int a,b,c;
	a = 34;
	b = 56;
	c = a * b;
	OCR1A = c;
  96:	80 e7       	ldi	r24, 0x70	; 112
  98:	97 e0       	ldi	r25, 0x07	; 7
  9a:	90 93 89 00 	sts	0x0089, r25
  9e:	80 93 88 00 	sts	0x0088, r24

a,b,c never exist and the result 34*56=1904=0x0770 is simply loaded straight into OCR1A.
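
The arithmetic can be checked the same way the optimizer does it, at compile time. This is only a sketch with hypothetical helper names: the static assertion confirms the folded constant, and the two helpers return the bytes that appear as immediates in the two LDI instructions.

```c
#include <stdint.h>

/* 34 * 56 is a constant expression, so it can be verified while compiling. */
_Static_assert(34 * 56 == 0x0770, "34 * 56 should be 1904 (0x0770)");

/* Low byte of the product: the 0x70 loaded into r24. */
uint8_t product_low_byte(void)
{
    return (uint8_t)((34 * 56) & 0xFF);
}

/* High byte of the product: the 0x07 loaded into r25. */
uint8_t product_high_byte(void)
{
    return (uint8_t)((34 * 56) >> 8);
}
```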

Built without optimisation this would produce:

  9a:    00 d0           rcall    .+0          ; 0x9c 
  9c:    00 d0           rcall    .+0          ; 0x9e 
  9e:    00 d0           rcall    .+0          ; 0xa0 
  a0:    cd b7           in    r28, 0x3d    ; 61
  a2:    de b7           in    r29, 0x3e    ; 62
    int a,b,c;
    a = 34;
  a4:    82 e2           ldi    r24, 0x22    ; 34
  a6:    90 e0           ldi    r25, 0x00    ; 0
  a8:    9e 83           std    Y+6, r25    ; 0x06
  aa:    8d 83           std    Y+5, r24    ; 0x05
    b = 56;
  ac:    88 e3           ldi    r24, 0x38    ; 56
  ae:    90 e0           ldi    r25, 0x00    ; 0
  b0:    9c 83           std    Y+4, r25    ; 0x04
  b2:    8b 83           std    Y+3, r24    ; 0x03
    c = a * b;
  b4:    2d 81           ldd    r18, Y+5    ; 0x05
  b6:    3e 81           ldd    r19, Y+6    ; 0x06
  b8:    8b 81           ldd    r24, Y+3    ; 0x03
  ba:    9c 81           ldd    r25, Y+4    ; 0x04
  bc:    ac 01           movw    r20, r24
  be:    24 9f           mul    r18, r20
  c0:    c0 01           movw    r24, r0
  c2:    25 9f           mul    r18, r21
  c4:    90 0d           add    r25, r0
  c6:    34 9f           mul    r19, r20
  c8:    90 0d           add    r25, r0
  ca:    11 24           eor    r1, r1
  cc:    9a 83           std    Y+2, r25    ; 0x02
  ce:    89 83           std    Y+1, r24    ; 0x01
    OCR1A = c;
  d0:    e8 e8           ldi    r30, 0x88    ; 136
  d2:    f0 e0           ldi    r31, 0x00    ; 0
  d4:    89 81           ldd    r24, Y+1    ; 0x01
  d6:    9a 81           ldd    r25, Y+2    ; 0x02
  d8:    91 83           std    Z+1, r25    ; 0x01
  da:    80 83           st    Z, r24

In this case a, b and c are created on the stack (the three RCALLs at the start reserve the space), the values 34 and 56 are loaded into a and b, and then the multiplication a*b is calculated before the result is finally loaded into OCR1A.

This is one of many reasons to never use -O0.

If I build the 4 opcode delay from that thread I've linked to without optimisation then the loop calculation is done using floating point at run time:

  9a:    cd b7           in    r28, 0x3d    ; 61
  9c:    de b7           in    r29, 0x3e    ; 62
  9e:    2e 97           sbiw    r28, 0x0e    ; 14
  a0:    0f b6           in    r0, 0x3f    ; 63
  a2:    f8 94           cli
  a4:    de bf           out    0x3e, r29    ; 62
  a6:    0f be           out    0x3f, r0    ; 63
  a8:    cd bf           out    0x3d, r28    ; 61
  aa:    8a e9           ldi    r24, 0x9A    ; 154
  ac:    99 e9           ldi    r25, 0x99    ; 153
  ae:    a9 ed           ldi    r26, 0xD9    ; 217
  b0:    bf e3           ldi    r27, 0x3F    ; 63
  b2:    8b 87           std    Y+11, r24    ; 0x0b
  b4:    9c 87           std    Y+12, r25    ; 0x0c
  b6:    ad 87           std    Y+13, r26    ; 0x0d
  b8:    be 87           std    Y+14, r27    ; 0x0e
 */
void
_delay_ms(double __ms)
{
    uint16_t __ticks;
    double __tmp = ((F_CPU) / 4e3) * __ms;
  ba:    6b 85           ldd    r22, Y+11    ; 0x0b
  bc:    7c 85           ldd    r23, Y+12    ; 0x0c
  be:    8d 85           ldd    r24, Y+13    ; 0x0d
  c0:    9e 85           ldd    r25, Y+14    ; 0x0e
  c2:    26 e6           ldi    r18, 0x66    ; 102
  c4:    36 e6           ldi    r19, 0x66    ; 102
  c6:    46 ee           ldi    r20, 0xE6    ; 230
  c8:    53 e4           ldi    r21, 0x43    ; 67
  ca:    0e 94 55 01     call    0x2aa    ; 0x2aa <__mulsf3>
  ce:    dc 01           movw    r26, r24
  d0:    cb 01           movw    r24, r22
  d2:    8f 83           std    Y+7, r24    ; 0x07
  d4:    98 87           std    Y+8, r25    ; 0x08
  d6:    a9 87           std    Y+9, r26    ; 0x09
  d8:    ba 87           std    Y+10, r27    ; 0x0a
    if (__tmp < 1.0)
  da:    6f 81           ldd    r22, Y+7    ; 0x07
  dc:    78 85           ldd    r23, Y+8    ; 0x08
  de:    89 85           ldd    r24, Y+9    ; 0x09
  e0:    9a 85           ldd    r25, Y+10    ; 0x0a
  e2:    20 e0           ldi    r18, 0x00    ; 0
  e4:    30 e0           ldi    r19, 0x00    ; 0
  e6:    40 e8           ldi    r20, 0x80    ; 128
  e8:    5f e3           ldi    r21, 0x3F    ; 63
  ea:    0e 94 d4 00     call    0x1a8    ; 0x1a8 <__cmpsf2>
  ee:    88 23           and    r24, r24
  f0:    2c f4           brge    .+10         ; 0xfc 
        __ticks = 1;
  f2:    81 e0           ldi    r24, 0x01    ; 1
  f4:    90 e0           ldi    r25, 0x00    ; 0
  f6:    9e 83           std    Y+6, r25    ; 0x06
  f8:    8d 83           std    Y+5, r24    ; 0x05
  fa:    3f c0           rjmp    .+126        ; 0x17a 
    else if (__tmp > 65535)
  fc:    6f 81           ldd    r22, Y+7    ; 0x07
  fe:    78 85           ldd    r23, Y+8    ; 0x08
 100:    89 85           ldd    r24, Y+9    ; 0x09
 102:    9a 85           ldd    r25, Y+10    ; 0x0a
 104:    20 e0           ldi    r18, 0x00    ; 0
 106:    3f ef           ldi    r19, 0xFF    ; 255
 108:    4f e7           ldi    r20, 0x7F    ; 127
 10a:    57 e4           ldi    r21, 0x47    ; 71
 10c:    0e 94 51 01     call    0x2a2    ; 0x2a2 <__gesf2>
 110:    18 16           cp    r1, r24
 112:    4c f5           brge    .+82         ; 0x166 
    {
        //    __ticks = requested delay in 1/10 ms
        __ticks = (uint16_t) (__ms * 10.0);
 114:    6b 85           ldd    r22, Y+11    ; 0x0b
 116:    7c 85           ldd    r23, Y+12    ; 0x0c
 118:    8d 85           ldd    r24, Y+13    ; 0x0d
 11a:    9e 85           ldd    r25, Y+14    ; 0x0e
 11c:    20 e0           ldi    r18, 0x00    ; 0
 11e:    30 e0           ldi    r19, 0x00    ; 0
 120:    40 e2           ldi    r20, 0x20    ; 32
 122:    51 e4           ldi    r21, 0x41    ; 65
 124:    0e 94 55 01     call    0x2aa    ; 0x2aa <__mulsf3>
 128:    dc 01           movw    r26, r24
 12a:    cb 01           movw    r24, r22
 12c:    bc 01           movw    r22, r24
 12e:    cd 01           movw    r24, r26
 130:    0e 94 d8 00     call    0x1b0    ; 0x1b0 <__fixunssfsi>
 134:    dc 01           movw    r26, r24
 136:    cb 01           movw    r24, r22
 138:    9e 83           std    Y+6, r25    ; 0x06
 13a:    8d 83           std    Y+5, r24    ; 0x05
 13c:    0f c0           rjmp    .+30         ; 0x15c 
 13e:    8e e2           ldi    r24, 0x2E    ; 46
 140:    90 e0           ldi    r25, 0x00    ; 0
 142:    9c 83           std    Y+4, r25    ; 0x04
 144:    8b 83           std    Y+3, r24    ; 0x03
    milliseconds can be achieved.
 */
void
_delay_loop_2(uint16_t __count)
{
    __asm__ volatile (
 146:    8b 81           ldd    r24, Y+3    ; 0x03
 148:    9c 81           ldd    r25, Y+4    ; 0x04
 14a:    01 97           sbiw    r24, 0x01    ; 1
 14c:    f1 f7           brne    .-4          ; 0x14a 
 14e:    9c 83           std    Y+4, r25    ; 0x04
 150:    8b 83           std    Y+3, r24    ; 0x03
        while(__ticks)
        {
            // wait 1/10 ms
            _delay_loop_2(((F_CPU) / 4e3) / 10);
            __ticks --;
 152:    8d 81           ldd    r24, Y+5    ; 0x05
 154:    9e 81           ldd    r25, Y+6    ; 0x06
 156:    01 97           sbiw    r24, 0x01    ; 1
 158:    9e 83           std    Y+6, r25    ; 0x06
 15a:    8d 83           std    Y+5, r24    ; 0x05
        __ticks = 1;
    else if (__tmp > 65535)
    {
        //    __ticks = requested delay in 1/10 ms
        __ticks = (uint16_t) (__ms * 10.0);
        while(__ticks)
 15c:    8d 81           ldd    r24, Y+5    ; 0x05
 15e:    9e 81           ldd    r25, Y+6    ; 0x06
 160:    00 97           sbiw    r24, 0x00    ; 0
 162:    69 f7           brne    .-38         ; 0x13e 

That doesn't even include the floating point library functions. One thing's for sure it's a whole heap more than four opcodes!
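
To see what that run-time arithmetic amounts to, here is a host-side sketch of the same calculation as it appears in the listing: __tmp = (F_CPU / 4e3) * __ms, clamped to 1 at the low end, with a separate 1/10 ms loop when the count would exceed 65535. The function names are hypothetical. The figures used in the note below are my reading of the float constants in the listing (0x43E66666 as 460.8, 0x3FD9999A as 1.7), not something stated in the post.

```c
#include <stdint.h>

/* True when the count would not fit one 16-bit loop, i.e. the
   "__tmp > 65535" branch that falls back to repeated 1/10 ms loops. */
int needs_tenth_ms_loop(double f_cpu_hz, double ms)
{
    return (f_cpu_hz / 4e3) * ms > 65535.0;
}

/* Iteration count for the 4-cycle loop on the direct path.
   Check needs_tenth_ms_loop() first; the clamp to 65535 is only my
   guard against an out-of-range conversion, not library behaviour. */
uint16_t direct_ticks(double f_cpu_hz, double ms)
{
    double tmp = (f_cpu_hz / 4e3) * ms;
    if (tmp < 1.0)
        return 1;
    if (tmp > 65535.0)
        return 65535;
    return (uint16_t)tmp;
}

/* Iteration count of one 1/10 ms loop: (F_CPU / 4e3) / 10. */
uint16_t tenth_ms_ticks(double f_cpu_hz)
{
    return (uint16_t)((f_cpu_hz / 4e3) / 10.0);
}
```

With F_CPU = 1843200 and a 1.7 ms request this gives 783 iterations, and the 1/10 ms loop count comes out as 46, which matches the 0x2E loaded in the listing. With optimization and a constant argument, the compiler does all of this itself and only the final count survives into the code.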

PS I wrote this tutorial forum article about Optimization

 


theusch wrote:

LOL -- but as I've said before, in the real world of AVR microcontroller apps WTFC (Who The Freak Cares) whether the delay_us() at 3.6864MHz is 15 or 16 cycles as long as the result is consistent and as close as practical?

Let's use the energy to solve practical situations rather than debating how many angels can fit through the eye of a needle.

Lee

I do. And so should others.
I think it does come down to exactly what you said:
"is consistent and as close as practical".
The current functions (that use the basic delay loops) fail on both counts.

Yes, a given delay is consistent for a given clock rate, but the behaviour and final delay are inconsistent across different clock rates due to the rounding involved, i.e. you might get the delay you asked for, or you might get 2 to 3 clock cycles less than you asked for, depending on the AVR clock rate.

Consider this real world example.
If you write an open source "library" you will
have no idea what speed AVR that code will be running on.
And a good library should "just work".
In some cases, you need to guarantee a short (microsecond or less) minimum hardware setup delay.
Sure you can be way conservative and ask for delays that are much longer than you need, but then the performance needlessly suffers.

This is the world I've lived in for the past 12+ months doing an open source GLCD Arduino library and why I've often spoken
about the limitations of the functions.

The current functions round delays down to the nearest 3 or 4 cycle boundary and also don't have the ability to guarantee that the delay is at least as long as requested.

The functions broke down the most for the situations where you needed them the most because for very short (microsecond and less) delays there is no alternative to
CPU busy wait type delays.
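
To make the rounding issue concrete, here is a small host-side sketch comparing the two policies. The helper names are hypothetical and the numbers are only an illustration, not the avr-libc implementation: one helper finds the smallest whole cycle count that covers a request, the others convert cycles to loop passes by rounding down or up to the loop granularity.

```c
#include <stdint.h>

/* Smallest whole number of CPU cycles that covers delay_ns at f_cpu_hz. */
uint32_t cycles_at_least(uint32_t f_cpu_hz, uint32_t delay_ns)
{
    uint64_t num = (uint64_t)f_cpu_hz * delay_ns;
    return (uint32_t)((num + 999999999ULL) / 1000000000ULL);
}

/* Loop passes when the cycle count is rounded DOWN to the loop
   granularity (3 or 4 cycles per pass): the delay can come up short. */
uint32_t passes_round_down(uint32_t cycles, uint32_t cycles_per_pass)
{
    return cycles / cycles_per_pass;
}

/* Loop passes when rounded UP: the delay is never shorter than asked. */
uint32_t passes_round_up(uint32_t cycles, uint32_t cycles_per_pass)
{
    return (cycles + cycles_per_pass - 1) / cycles_per_pass;
}
```

For example, a 450 ns setup time at 16 MHz is 7.2 cycles. Truncating to 7 cycles and rounding down to a 4-cycle loop gives one pass, or 250 ns. Taking 8 cycles and rounding up gives two passes, or 500 ns, which is the only one of the two that meets the minimum.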

Luckily, moving forward, there is an update that resolves all these types of issues and should soon be available in a future AVR libc update.

--- bill


Clawson,
Thanks for the detailed explanation.
That really helped.


@millwood

Do you feel good when you write those short "unconstructive" messages?

While everybody is entitled to their own opinion, their expressions ought to be constructive.

I have been following your postings for a while, and I'm starting to wonder if you're the author of the Besserwisser game. Or just acting like that.

/Bingo


Quote:

Do you feel good when you write those short "unconstructive" messages?

He clearly didn't bother to follow the links as one is the delay_x.h that's already been discussed and that has been subject to intense scrutiny by bperrybap in previous threads about delays.

The other is a tool to directly generate a sequence of Asm code with an exact cycle count. There's no question of "C loop iteration calculation overheads" or anything like that. The Asm can be lifted straight into Studio and proven there with the cycle counter. Asked to delay 10000 cycles it produced:

; ============================= 
;    delay loop generator 
;     10000 cycles:
; ----------------------------- 
; delaying 9999 cycles:
          ldi  R17, $21
WGLOOP0:  ldi  R18, $64
WGLOOP1:  dec  R18
          brne WGLOOP1
          dec  R17
          brne WGLOOP0
; ----------------------------- 
; delaying 1 cycle:
          nop
; ============================= 

In the sim this executes 9999 cycles including the final NOP so it would appear to be 1 cycle short in fact.

 


Quote:
In the sim this executes 9999 cycles including the final NOP so it would appear to be 1 cycle short in fact.
No, I get 9999 for the double loop, plus the one cycle for the NOP, so it looks spot on.

Regards,
Steve A.

The Board helps those that help themselves.


Steve,

That's very curious then. I built/simulated for mega48. At the reset the yellow arrow was positioned on the LDI R17,$21 with cycles=0. I then put a breakpoint on a second NOP beyond this code and ran to it. As you'll see, it's saying 9999 ...


 


I just found a bug in the simulator! Using the code as you have it, I come out with 9999 as well. But if you start the debugger and single step over the first LDI, you will notice that the cycle counter doesn't change! If you put a NOP before the code, single step over it, then reset the cycle counter (which should do nothing since the cycle counter still said 0 at that point), then you end up with 10000 instead of 9999.

Regards,
Steve A.

The Board helps those that help themselves.


If the question of using optimization >= -O1 is only due to tons of lines of disassembled code... What if I am using a Mega128 (which has a 'lot' of flash) and using the delay functions, i.e. _delay_ms() and _delay_us(), with integer delay values at optimization level -O0? Will it still produce precise delays, or is it still dicey?


kulu2002 wrote:
If the question of using optimization >= -O1 is only due to tons of lines of disassembled code... What if I am using a Mega128 (which has a 'lot' of flash) and using the delay functions, i.e. _delay_ms() and _delay_us(), with integer delay values at optimization level -O0? Will it still produce precise delays, or is it still dicey?

Not sure what you mean by "dicey".

The thing to keep in mind is that the delay code uses floating point math to convert what you pass in into a number of CPU cycles.
The compiler will eliminate the floating point calculation by doing the calculations at compile time if the following conditions are met:
- The argument is a constant, not a variable.
- The optimizer is enabled.

If either one of those is not true, then the code generated will be terrible and more than likely include calls to floating point routines to calculate the delay cycles at run time.
It definitely won't be creating the expected delays particularly if the delay is fairly small as the floating point calculation itself could dwarf the actual desired delay.
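
When the delay length is only known at run time, one common way to still meet the "constant argument" condition is to call the library delay with a constant 1 inside a loop. The sketch below assumes avr-gcc with avr-libc and an 8 MHz clock. The F_CPU value, the wrapper name delay_ms_var and the host-side stand-in used when not compiling for AVR are all assumptions for illustration.

```c
#include <stdint.h>

#ifdef __AVR__
#ifndef F_CPU
#define F_CPU 8000000UL          /* assumption: set this to your real clock */
#endif
#include <util/delay.h>
#define DELAY_1MS() _delay_ms(1) /* constant argument, folded at compile time */
#else
/* Host stand-in so the logic can be exercised off-target:
   it only counts how many 1 ms steps were requested. */
static unsigned long simulated_ms;
#define DELAY_1MS() (simulated_ms++)
#endif

/* Delay for a run-time number of milliseconds by repeating a 1 ms
   constant delay. Loop overhead adds a few cycles per millisecond. */
void delay_ms_var(uint16_t ms)
{
    while (ms--) {
        DELAY_1MS();
    }
}
```

The optimizer still has to be enabled (-O1 or higher) for the constant delay inside the loop to compile down to a short counted loop.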

--- bill