AVR delay routine using a parameter as the delay needed


I wrote a routine (label) that makes a precise delay of 1us:

 

.equ wus=((16000000/1000000 - 4) / 4); used crystal 1us delay

rjmp WaitUs
sleep

WaitUs:
	ldi  zh, HIGH(wus)	; used crystal 1us delay (high Byte) 1cy
	ldi  zl, Low(wus)       ; used crystal 1us delay (Low Byte)  1cy
WAITUZ:	sbiw  zl, 1		; count down, 2 Clock cycles
        brne  WAITUZ		; Not zero: 2 Clock cycles, zero: 1 Clock cycle
	ret			; back: 4 Clock cycles     

 

 The code above works perfectly for a delay of 1us. Simulated in the AVR simulator, I get 1.0us.

 

    

 

 I want to write a delay routine that accepts the delay time, so that I don't need a separate routine for each delay (e.g. 1us, 2us ... 100us etc). So I wrote this code, which does not work as expected:

 

 

.equ wus=((16000000/1000000 - 4) / 4); used crystal 1us delay

ldi r16, 0x02             ; make 2us delay
rjmp WaitUs
sleep


WaitUs:
	ldi  zh, HIGH(wus)	; used crystal 1us delay (high Byte) 1cy
	ldi  zl, Low(wus)       ; used crystal 1us delay (Low Byte)  1cy
WAITUZ:	sbiw  zl, 1		; count down, 2 Clock cycles
	brne  WAITUZ		; Not zero: 2 Clock cycles, zero: 1 Clock cycle
        dec r16                 ; count down, 1 Clock cycle
        brne WAITUS             ; Not zero: 2 Clock cycles, zero: 1 Clock cycle
        ret			; back: 4 Clock cycles

 

 But I get this delay (instead of 2us):

 

 

 

 Could I get any help fixing the code so that it takes a parameter for the us delay? I made the .equ wus constant, which calculates from the crystal used (16MHz in my case) the clocks needed for 1us, and then I loop as many times as r16 (a user-defined value in a register) to achieve a 2us delay.

 

So using ldi r16, 0x02 I need to get a 2us delay, using ldi r16, 0x08 I need to get an 8us delay, etc.

 

Thanks

 


Why don't you use a timer ?

 

 


Because I am writing code for a frequency meter and T0 and T1 will be used for counting pulses, so I need a precise delay with a parameter in us.


What kind of frequencies are you looking at?

How would you deal with timer overflow? (An ISR will change the length of the delay.)


A frequency counter counting from 0.45Hz to 10MHz...but that is for another thread. For now I am focusing on the precise delay routine. Any code showing how to get a 1us delay using a parameter? I can use push and pop on the stack to improve the code above.


Then make a delay loop that always takes 16 clk!

And make sure that you also know how long it takes from when you clear/read the timers (and whether that is also a whole multiple of 16).

And then you should have a timer ISR to tell you about overflow (and perhaps lose count).

And one way could be to read the timers 10 times, once for every 1/10 of the time, and that way make the timers "bigger".

 

 

 

 

 


Can you give example code for how to do that? I made code without a parameter and it took 16 cycles for 1us...that works...but when I add a parameter it doesn't work.


You simply MUST count all instruction cycles used by EVERYTHING used in creating & calling the delay.  Are you doing that?

 

If you want it extremely exact, you also have to say what you count as part of the delay?  Are you including the call & the return?  What happens when an IRQ happens during the delay?

 

Account for your fixed & repetitive timings separately when calculating.  Doubling the repetitive & ignoring the fixed does not give double time!
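
To make that last point concrete, here is a minimal Python sketch (an illustration only, not anyone's posted code) that keeps fixed and repeated cycles apart for a single countdown loop like the one in #1. It assumes classic-AVR timings on a part with a 16-bit program counter (rcall 3, ldi 1, sbiw 2, brne 2 taken / 1 not taken, ret 4), and the function name is made up for the example.

```python
# Cycle model for: rcall WaitUs / ldi zh / ldi zl / loop: sbiw, brne loop / ret
# Timings are assumptions for a classic AVR with a 16-bit program counter.
FIXED_CYCLES = 3 + 1 + 1 + 4   # rcall + ldi + ldi + ret, paid once
CYCLES_PER_COUNT = 2 + 2       # sbiw + brne (taken), paid once per count


def single_loop_cycles(count):
    """Total cycles for a 16-bit countdown starting at `count` (1..65535)."""
    # The final brne falls through, which is one cycle cheaper than a taken branch.
    return FIXED_CYCLES + CYCLES_PER_COUNT * count - 1


for count in (2, 4):
    print(count, single_loop_cycles(count))
```

Doubling the count from 2 to 4 takes the total from 16 to 24 cycles, not 32, because the fixed part does not scale with the count.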

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

Last Edited: Mon. Apr 13, 2020 - 01:55 PM

What will your microsecond timer be used for, give an example please.

Jim

 

 

FF = PI > S.E.T

 


I am finishing a low-level driver for a 16x2 LCD. There I need a 1us delay for toggling the E pin from high to low. I know it is not time critical (the code above works) but I want to write a time-precise routine, so that I will have it for future projects.

As for the timer, I will rewrite the code from this meter and instead of 7-segment displays I will use a 16x2 LCD. This removes the multiplexing, which is a problem when you put the board into an analog function generator. And most important, I will learn how it works: I simulate line by line to see what happens in the AVR core.
http://danyk.cz/avr_fmetr3_en.html

Later I will modify the time base from 1 sec to 0.5 sec (to get a faster display update, e.g. 2 readings per second).


It looks like you're trying to solve something that is rarely needed in embedded programming and, as you have found, is very difficult to do with any accuracy: you don't know the overhead of calling or looping the function, and if interrupts are enabled that can throw off all your calculations. It also needs to be adjusted for different CPU clock speeds, which makes a universal delay almost impossible!

Most apps that need a small microsecond delay seldom need a variable delay; best to use the fixed delay provided by the toolchain.

Millisecond delays are fairly easy to do, as a few microseconds either way are not critical most of the time, and they lend themselves to h/w timer use in a system tick function.

 

Jim

 

 


robydream wrote:
to 10MHz
Impossible with a 16 MHz system clock.  Fastest you can get is less than Fclk/2.  Recommended is Fclk/2.5, or 6.4 MHz with a 16 MHz system clock.

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 


I know...16MHz is what I have at home right now...it will be 20MHz...and as the limit is Fclk/2, 10MHz is the maximum. As it will be used as the frequency counter in an analog signal generator, it will count frequencies from 15Hz to 165kHz, so that is overkill. But this is not the subject of this topic; the delay is more important. After the stack is initialized the program calls LcdInit, which uses the ms and us delay routines to initialize the display; no sei or other interrupts are used at the start of the program...


robydream wrote:
LcdInit, which uses the ms and us delay routines to initialize the display; no sei or other interrupts are used at the start of the program...

But fixed delays will work fine here.

 

 


I know...there is no reason to insist on 1.0us or 1.25us, both work on the LCD...but the point here is that I want to learn, and to make WaitUs code that accepts a parameter for future projects (so that I don't need to write wait1us, wait2us and so on...)


robydream wrote:
20MHz will be...as it is Fclk/2 so 10MHz is maximum
As I said, Fclk/2.5 is the highest recommended for reliable operation:

Each half period of the external clock applied must be longer than one system clock cycle to ensure correct sampling. The external clock must be guaranteed to have less than half the system clock frequency (f_ExtClk < f_clk_I/O / 2) given a 50/50% duty cycle. Since the edge detector uses sampling, the maximum frequency of an external clock it can detect is half the sampling frequency (Nyquist sampling theorem). However, due to variation of the system clock frequency and duty cycle caused by Oscillator source (crystal, resonator, and capacitors) tolerances, it is recommended that maximum frequency of an external clock source is less than f_clk_I/O / 2.5.
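
As a quick check of the two limits in that datasheet passage, here is a small Python sketch; the function name is invented for the example and the two clock values are simply the ones mentioned in this thread.

```python
def max_counter_input_hz(f_clk_hz):
    """Return (hard limit, recommended limit) for an external timer clock.

    The input must stay below f_clk/2; the datasheet text quoted above
    recommends staying below f_clk/2.5.
    """
    return f_clk_hz / 2, f_clk_hz / 2.5


for f_clk in (16_000_000, 20_000_000):
    hard_limit, recommended = max_counter_input_hz(f_clk)
    print(f_clk, hard_limit, recommended)
```

So a 16 MHz part should be kept below 6.4 MHz on the counter input, and a 20 MHz part below 8 MHz.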

 

Last Edited: Mon. Apr 13, 2020 - 04:23 PM

A busy wait that is not screwed up by

interrupts pretty much requires a timer.

For a timer-based busy wait:

Read timer

Calculate end time

Wait for end time.

Without compare match, that is a 4- or 6-cycle loop.
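
The read / calculate / wait steps above can be sketched on the host in Python. This is only a model of the polling logic, not AVR code: FakeTimer is a hypothetical stand-in for a free-running 8-bit hardware counter, and its step parameter plays the role of the 4- or 6-cycle polling loop. Instead of computing an absolute end time it compares the masked elapsed count, which is one common way to survive a counter rollover.

```python
class FakeTimer:
    """Stand-in for a free-running hardware counter (hypothetical helper).

    Each read advances the count by `step`, mimicking a polling loop that
    takes `step` timer ticks per pass.
    """

    def __init__(self, start, step, width=8):
        self.count = start
        self.step = step
        self.mask = (1 << width) - 1

    def read(self):
        value = self.count & self.mask
        self.count += self.step
        return value


def busy_wait_ticks(read_timer, ticks, width=8):
    """Poll until at least `ticks` have elapsed; return the ticks observed."""
    mask = (1 << width) - 1
    start = read_timer()
    while True:
        # Masked subtraction stays correct across one counter rollover.
        elapsed = (read_timer() - start) & mask
        if elapsed >= ticks:
            return elapsed


print(busy_wait_ticks(FakeTimer(start=250, step=4).read, 16))
print(busy_wait_ticks(FakeTimer(start=250, step=6).read, 16))
```

With a 4-tick loop the 16-tick wait ends exactly on time; with a 6-tick loop it ends at 18, so the overshoot is bounded by the length of the polling loop. The sketch assumes the requested ticks plus one loop pass fit inside the counter width.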

Moderation in all things. -- ancient proverb


skeeve wrote:

A busy wait that is not screwed up by

interrupts pretty much requires a timer.

For a timer-based busy wait:

Read timer

Calculate end time

Wait for end time.

Without compare match, that is a 4- or 6-cycle loop.

But then what happens if an interrupt fires just before the end time ?

(I'm assuming here you are polling timer to find out when the end time is reached).

 


MrKendo wrote:
But then what happens if an interrupt fires just before the end time ?

the effect is the same as a software timer, although with a h/w timer, you will know how far off you are(by reading the timer value, something a soft timer will never do).

 

Jim

 

 


robydream wrote:

 

.equ wus=((16000000/1000000 - 4) / 4); used crystal 1us delay

ldi r16, 0x02             ; make 2us delay
rjmp WaitUs
sleep


WaitUs:
	ldi  zh, HIGH(wus)	; used crystal 1us delay (high Byte) 1cy
	ldi  zl, Low(wus)       ; used crystal 1us delay (Low Byte)  1cy
WAITUZ:	sbiw  zl, 1		; count down, 2 Clock cycles
	brne  WAITUZ		; Not zero: 2 Clock cycles, zero: 1 Clock cycle
        dec r16                 ; count down, 1 Clock cycle
        brne WAITUS             ; Not zero: 2 Clock cycles, zero: 1 Clock cycle
        ret			; back: 4 Clock cycles

 

 But I get this delay (instead of 2us):

 

 

 

 

The cause of your problem is simple.  dec r16 and brne WAITUS both take time to execute.  They add 3 additional cycles of overhead each time through the outer loop, so you need to include those 3 cycles in each loop delay cycle count.  But it's even more involved than this, because there are other instructions that only execute once (ret and the fall-thru of brne WAITUS) so you need to account for those cycle times only once, not each time through the outer loop.  Since the fall-thru is actually 1 cycle faster than taking the branch, you have a total of 3 additional clock cycles that you need to account for one time.
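
A minimal Python sketch of that accounting, for the nested routine quoted above. It is a host-side model, not a fix: it assumes the timings written in the code comments (ldi 1, sbiw 2, brne 2 taken / 1 not taken, dec 1, ret 4) and a 2-cycle rjmp entry as posted, and the function names are invented for the example.

```python
def inner_loop_cycles(z):
    """sbiw/brne countdown from z (z >= 1): 4 cycles per count, 3 on the last."""
    return 4 * z - 1


def nested_delay_cycles(n, z, entry_cycles=2, ret_cycles=4):
    """Total cycles for n outer passes, each reloading Z with z."""
    total = entry_cycles                 # rjmp WaitUs (use 3 for rcall)
    for i in range(n):
        total += 1 + 1                   # ldi zh, ldi zl
        total += inner_loop_cycles(z)    # WAITUZ loop
        total += 1                       # dec r16
        total += 1 if i == n - 1 else 2  # brne WAITUS: falls through on the last pass
    return total + ret_cycles


wus = (16_000_000 // 1_000_000 - 4) // 4  # same integer arithmetic as the .equ line
print(wus, nested_delay_cycles(2, wus))
```

With wus = 3 each outer pass already costs 16 cycles, so a request for 2us comes out at 37 cycles (about 2.31us at 16 MHz): two 16-cycle passes, minus 1 for the final fall-through, plus 6 cycles that are paid only once.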


you have a total of 3 additional clock cycles that you need to account for one time.

Zactly...see my comment in #8 


I tried, but every time I get a different timing...so could someone look at the code above and rewrite it to make, for example, 1us, 5us and 12us delays?


I tried, but every time I get a different timing

What do you mean by that? Isn't different timings what you want?  Go through your code & calculate exactly how many cycles each takes...it is that simple.

 

If you need 537 cycles...make sure your code adds up to exactly 537 cycles


sparrow2 wrote:
Why don't you use a timer ?
robydream wrote:
Because i im writing code for frequency meter and t0 and t1 will be used for counting pulses...so i need precision delay using parameter in us.

Ever think that you should be using a different chip such as a Cortex M0+ for this project instead?

 

More timers, faster clock, etc... means no having to fudge with a software delay or the Fclk/2.5 limitation at all!

"I may make you feel but I can't make you think" - Jethro Tull - Thick As A Brick

"void transmigratus(void) {transmigratus();} // recursio infinitus" - larryvc

"It's much more practical to rely on the processing powers of the real debugger, i.e. the one between the keyboard and chair." - JW wek3

"When you arise in the morning think of what a privilege it is to be alive: to breathe, to think, to enjoy, to love." -  Marcus Aurelius

Last Edited: Mon. Apr 13, 2020 - 10:50 PM

Here is the code with which I try to achieve a 40us delay:

 



.equ wus=((16000000/1000000 - 15+1) / 1); used crystal 1us delay

ldi  r16, 40			; 1 cycle
rjmp WAITUS			; 2 cycles

WAITUS:
        push zh
	push zl			; 2cycle
        ldi zh, HIGH(wus)	; 1cycle
	ldi zl, LOW(wus)	; 1cycle
WAITUZ:
	sbiw zl, 1		; 2cycle
	brne WAITUZ		; non zero: 2cycle, zero: 1cycle
        pop zh
        pop zl			; 2cycle
	dec r16			; 1 cycle
	brne WAITUS		; non zero: 2cycle, zero: 1cycle
	ret			; back 4cycle   

 

 When I run the simulation I get a 40.3125us delay (instead of the desired 40.000us). When I try for a 10us delay I get 10.3125us, and when I try for a 1us delay I get 1.3125us.

At a 16MHz clock I get 5 clocks more than I need (so there is 0.3125us more delay than I need)...can you look at the code and correct where I am going wrong?

Last Edited: Mon. Apr 13, 2020 - 09:53 PM

can you look at the code and correct where I am going wrong?

No, did you bother to count up how many cycles the code will take?  

Request 1 integer less than you do now (if requesting 157, request 156 instead)...add in enough nops to fill in the fractional remainder to give you the exact time.  Then  all of your requests will be correct. 

 

.equ wus=((16000000/1000000 - 15+1) / 1); used crystal 1us delay 

ldi  r16, 40			; 1 cycle
rjmp WAITUS			; 2 cycles

WAITUS:
        push zh                 ; 2cycles
	push zl			; 2cycles
        ldi zh, HIGH(wus)	; 1cycle
	ldi zl, LOW(wus)	; 1cycle
WAITUZ:
	sbiw zl, 1		; 2cycle
	brne WAITUZ		; non zero: 2cycle, zero: 1cycle
        pop zh                  ; 2cycle
        pop zl			; 2cycle
	dec r16			; 1 cycle
	brne WAITUS		; non zero: 2cycle, zero: 1cycle
	ret			; back 4cycle

 

 Cycle calculation:

 

    8: rjmp, push, push, ldi, ldi

    1: 1 clock cycle per count - 1

    9: 3 clock cycles for last count

    4: ret

    = 1 * Z + 8 - 1 + 9 + 4

    = 1 * Z + 20

    Z = (clock cycles - 20) / 1

 

  So I will define at the beginning of the program:

.equ wus=((16000000/1000000 - 20) / 1); used crystal 1us delay

  And I get an endless loop, because the constant subtracted must NOT be more than 15 (remember, 16 cycles at 16MHz is 1us). When I put -15 I get a 1.3125us delay, so I need to get 5 cycles lower to reach 1.00us...but this is not possible. Can you show me your code? I tried counting the cycles above, and the count is correct, but Z cannot go below 0 (a negative number gives an endless loop)...so I don't know where to start digging to fix this problem...

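One way to cross-check a count like this is to model the posted push/pop routine instruction by instruction. The Python sketch below is such a model, under stated assumptions: push and pop 2 cycles each, ldi 1, sbiw 2, brne 2 taken / 1 not taken, dec 1, ret 4, entry through the 2-cycle rjmp as posted, and Z loaded with 1 (the "-15" constant). The function name is made up for the example.

```python
def pushpop_delay_cycles(n, z, entry_cycles=2, ret_cycles=4):
    """Total cycles for n outer passes of the push/pop routine."""
    total = entry_cycles                 # rjmp WAITUS as posted
    for i in range(n):
        total += 2 + 2                   # push zh, push zl
        total += 1 + 1                   # ldi zh, ldi zl
        total += 4 * z - 1               # sbiw/brne loop, last brne not taken
        total += 2 + 2                   # pop, pop
        total += 1                       # dec r16
        total += 1 if i == n - 1 else 2  # brne WAITUS
    return total + ret_cycles


for n in (1, 10, 40):
    cycles = pushpop_delay_cycles(n, z=1)
    print(n, cycles, cycles / 16)        # cycles and microseconds at 16 MHz
```

Under these assumptions every pass but the last costs exactly 16 cycles and the totals are 21, 165 and 645 cycles, i.e. 16n + 5, which agrees with the 1.3125us, 10.3125us and 40.3125us reported above. That suggests the surplus is fixed entry and exit overhead, which no value of Z can remove; it has to come out of the first or last pass instead.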

Why in the world don't you simulate this ? (in studio7 it would take less than 10 min to setup and run).

 

 


Thanks for replying...but I am learning AVR asm delay functions, so I started this from the beginning:

 

.equ wus=((4*16000000/1000000 - 4)/4); used crystal 1us delay


rjmp WaitUs			; 2 cycles
sleep				; Stop here To see how many cycles we have

WaitUs:
	ldi  zh, HIGH(wus)	; used crystal 1us delay (upper nibble) - 1 cycle
	ldi  zl, Low(wus)       ; used crystal 1us delay (lower nibble) - 1 cycle
WAITUZ:	sbiw  zl, 1		; count down - 2 Clock cycles
	brne  WAITUZ		; Non zero: 2 Clock cycles, zero: 1 Clock cycle
	ret			; back: 4 Clock cycles

 

 This code works very well, without error. I tried 40us, 4us and 1us and every time I get the right delay. I started again from the beginning and arrived at this code, and it works very nicely. The problem, and the reason it would take you very long to set up in AVR Studio, is that I did not correctly set up the stack pointer (push, pop); that is the next part of asm I will study, and then I need to add the parameter. For now I put the parameter into the formula above: as you can see, 4* means a 4us delay. Try it, but add stack pointer initialization at the beginning of the code and choose the right crystal (16MHz). It works, but when it returns through ret the simulator shows a message about the stack pointer not being initialized. So I am studying where I need to put that, and how to change the formula above to be compatible with push and pop.


rjmp then ret,

that's not right is it, should be call then ret ?


Yes, you are right...I am reading the Atmel AVR instruction set PDF and see that it must be rcall...this takes 1 cycle more than rjmp (3 cycles vs 2 cycles), so the code above, which used rjmp, must become:

 

.equ wus=((1*16000000/1000000 - 11+7)/4); used crystal 1us delay 

 I will keep you posted when I have the stack pointer in use...


robydream wrote:
the problem (sic) you have to have very long to setup un AVR IDE Studio is this way that i did not correctly put stack pointer (push, pop)

The whole point of programming in assembler is precisely that you are entirely responsible for everything!

 

If you consider it a "problem" that you have to spend your time setting up the stack pointer, managing pushes & pops, etc, etc - then that's exactly why we have languages like 'C' and C++

 


 

Top Tips:

  1. How to properly post source code - see: https://www.avrfreaks.net/comment... - also how to properly include images/pictures
  2. "Garbage" characters on a serial terminal are (almost?) invariably due to wrong baud rate - see: https://learn.sparkfun.com/tutorials/serial-communication
  3. Wrong baud rate is usually due to not running at the speed you thought; check by blinking a LED to see if you get the speed you expected
  4. Difference between a crystal, and a crystal oscillator: https://www.avrfreaks.net/comment...
  5. When your question is resolved, mark the solution: https://www.avrfreaks.net/comment...
  6. Beginner's "Getting Started" tips: https://www.avrfreaks.net/comment...

MrKendo wrote:
But then what happens if an interrupt fires just before the end time ?

(I'm assuming here you are polling timer to find out when the end time is reached).

Emphasis added.

In that case, the busy wait will end a little late.

Timers keep running during interrupt-handling:

Interrupts have to be strategically timed to mess up a timer-based busy wait.

Usually a timer-based busy wait need not interfere with other timer-using tasks.

All you need is a timer that ticks at CPU speed and to which no task assigns.

 


Make sure you set SPH/SPL so you can do rcall (or possibly call) & ret, they use the stack. This should be the first lines of code for any project!

 

Your delay routine does NOT need to use push and pop, unless you want to do so. Using push/pop would allow the delay to be called from anywhere at any time, without worrying about registers (variables) being affected.


OK...I finally got code that works using the stack pointer (it does not use push and pop, but that is OK; as you say, I don't need it in this application):

 

.equ wus=((1*16000000/1000000 - 8)/4); used crystal 1us delay

.def tmp1=r16					; temporary register

	.CSEG
.ORG	0x0000	

Reset:
	ldi		tmp1, Low(RAMEND)		; initialize stack pointer
	out		SPL, tmp1
	ldi		tmp1, High(RAMEND)
	out		SPH, tmp1

	rcall WaitUs					; 3 cycles

MAIN:
	sleep						; Stop here To see how many cycles we have
	rjmp MAIN

WaitUs:
	ldi  zh, HIGH(wus)		; used crystal 1us delay (upper nibble) - 1 cycle
	ldi  zl, Low(wus)       ; used crystal 1us delay (lower nibble) - 1 cycle
WAITUZ:	sbiw  zl, 1				; count down - 2 Clock cycles
	brne  WAITUZ			; Non zero: 2 Clock cycles, zero: 1 Clock cycle
	ret						; back: 4 Clock cycles

    Now the interesting thing I need to work out is how to make the wus value above accept a first argument. Above, 1* means a 1us delay; for example, I would like to set it from asm code like this:

 

   

ldi tmp1, 40    ; set variable 40us delay
rcall WaitUs    ; call WaitUs routine

    So the wus value above would go from 1* to 40* and everything would be automated...can I use an .equ value to pass in the first parameter? If yes, can I see some example code?

 

   Thanks
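
For reference, the arithmetic behind the working single-shot routine above can be written out in Python. This is a host-side sketch only: it assumes rcall 3, ldi 1, sbiw 2, brne 2 taken / 1 not taken and ret 4, so the routine costs 4*Z + 8 cycles, and it solves that for Z. The function names are invented here, and loading Z at run time would add cycles of its own that this sketch does not count.

```python
def routine_cycles(z):
    """rcall + ldi + ldi + sbiw/brne loop from z + ret."""
    return 3 + 1 + 1 + (4 * z - 1) + 4


def z_for_delay(us, f_cpu_hz=16_000_000):
    """Return (Z, leftover cycles) for a delay of `us` microseconds."""
    cycles = us * f_cpu_hz // 1_000_000
    z, leftover = divmod(cycles - 8, 4)
    if not 1 <= z <= 0xFFFF:
        raise ValueError("delay not reachable with a 16-bit counter")
    return z, leftover                   # leftover cycles would need nop padding


for us in (1, 4, 40):
    z, leftover = z_for_delay(us)
    print(us, z, leftover, routine_cycles(z))
```

At 16 MHz the leftover is always zero and Z works out to 4*us - 2, so 1us needs Z = 2 (matching the .equ above) and 40us needs Z = 158.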


BTW a group of eight bits that a processor handles

as a unit is generally called a byte, not a nibble.

Half such a byte is often called a nybble.

 

Though it's rare, a byte need not be eight bits.

That is why some standards refer to octets.


skeeve wrote:
Half such a byte is often called a nybble.

also often spelled "nibble"

 

https://en.wikipedia.org/wiki/Nibble

 

 

ldi tmp1, 40    ; set variable 40us delay

NO, you need to load ZH & ZL,with your desired amount, then call your delay

 

sbiw ZH:ZL   is preferred over sbiw ZL (in fact I'm not 100% sure your notation is proper, though it prob is....that is not the way the command is described by Atmel).  

 

from command help file:

Example:

sbiw r25:r24,1 ; Subtract 1 from r25:r24

sbiw YH:YL,63  ; Subtract 63 from the Y pointer(r29:r28)

 


???

Studio 7 has the AVR assembler built in; run the code directly from there, so you don't need all these questions about why the clk cycles don't match!

    

Last Edited: Tue. Apr 14, 2020 - 09:23 PM

Another option: just put what you want into a C compiler using the values you need via the _delay_nn macros, copy the resulting few lines, and place them inline in your code.

 

You can use an online compiler to do the job-

https://godbolt.org/z/uPhw4B

 

When you need 40us, 37us, 150us, 1us, just plug in the F_CPU, delay time needed, copy the few lines and you are done. No call/return, no rcall vs call, no counting instructions, etc. 

 

These things are small enough so they can be placed inline without any trouble, and you typically will not need many of them per project.
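
The conversion such a delay macro has to do at build time is just microseconds times clock frequency. A small Python sketch of that arithmetic follows; the function name is hypothetical, and rounding up to a whole cycle is a choice made for this example rather than a statement about what any particular toolchain does.

```python
def delay_cycles(us, f_cpu_hz):
    """CPU cycles needed for `us` microseconds, rounded up to a whole cycle."""
    return (us * f_cpu_hz + 999_999) // 1_000_000


for us in (40, 37, 150, 1):
    print(us, delay_cycles(us, 16_000_000))
```

At 16 MHz these four delays come to 640, 592, 2400 and 16 cycles.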

 

 

.equ wus=((1*16000000/1000000 - 15)); used crystal 1us delay

.def tmp1=r16 ; temporary register

.CSEG
.ORG 0x0000

Reset:
         ldi tmp1, Low(RAMEND) ; initialize stack pointer
         out SPL, tmp1
         ldi tmp1, High(RAMEND)
         out SPH, tmp1

         ldi tmp1, 5 ; max 255us delay - i choose here 5us delay
         rcall WaitUs ; 3 cycles

MAIN:
         sleep ; Stop here To see how many cycles we have
         rjmp MAIN

WaitUs:
        adiw  ZH:ZL, wus+1
        WAITUZ: sbiw  zl, 1 ; count down - 2 Clock cycles
        brne  WAITUZ ; Non zero: 2 Clock cycles, zero: 1 Clock cycle
        dec   tmp1      ;  1 Clock cycle
        brne  WaitUs  ; Non zero: 2 Clock cycles, zero: 1 Clock cycle
        ret ; back: 4 Clock cycles

When I run a 1us delay I need to remove the +1 from wus in adiw, and when I run 2us to 255us I need to add the +1 to wus in adiw. The problem is the gap between the cycles calculated for 16MHz and the cycles simulated (errors marked here):

1us => 16 cycles, got 14: 2 cycles missing
2us => 32 cycles, got 30: 2 cycles missing
3us => 48 cycles, got 42: 6 cycles missing
4us => 64 cycles, got 54: 10 cycles missing
5us => 80 cycles, got 66: 14 cycles missing

As you can see, from 2us to 255us I am missing 4 more cycles with every added us...so how do I fix this? I know that cycles are missing from my count and how to calculate them, but I am not sure about the formula, because with the .equ wus formula above I am constantly missing 4 cycles...could someone rewrite the code? I have tried in the simulator but have no clue how to fix this.

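The figures in that table can be reproduced with a small host-side model of the adiw version. The Python sketch below assumes Z is 0 on entry (the posted code never initialises it, so this is a guess at the simulator's reset state), rcall 3, adiw 2, sbiw 2, brne 2 taken / 1 not taken, dec 1 and ret 4. The function name is invented for the example.

```python
def adiw_version_cycles(n, add_value, z_start=0):
    """Total cycles for n passes when each pass does `adiw ZH:ZL, add_value`."""
    total = 3                            # rcall WaitUs
    z = z_start
    for i in range(n):
        z = (z + add_value) & 0xFFFF
        total += 2                       # adiw
        count = z if z else 0x10000      # Z == 0 would wrap through 65536 counts
        total += 4 * count - 1           # sbiw/brne loop, last brne not taken
        z = 0                            # the loop leaves Z at zero
        total += 1                       # dec tmp1
        total += 1 if i == n - 1 else 2  # brne WaitUs
    return total + 4                     # ret


print(adiw_version_cycles(1, 1))                         # wus without the +1
print([adiw_version_cycles(n, 2) for n in range(2, 6)])  # wus+1
print([adiw_version_cycles(n, 3) for n in range(1, 6)])  # 16 cycles per pass
```

With an add value of 2 each pass costs 12 cycles, not 16, which is where the extra 4 missing cycles per microsecond come from. An add value of 3 gives 16 cycles per pass, but the total is then 16n + 6, so the fixed call and return overhead still has to be absorbed somewhere.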

What a mess!

Hint (I'm not writing your code; I would still say use a timer, you have 3 of them): make it so that it takes 16 clk with a value of 1 and no loop (breq forward), adding nops so it takes 16 clk.

Then make a loop that takes 16 clk.


This seems to work.  Doesn't use any registers, other than the passed-in us count.  7 instructions (one more than your looping version.)
It'll need to be adjusted for other clock frequencies...

 

;;us delay: enter with us Count (1-255) in R24
;; the call and return time is included as part of the delay.
usdelay:
;; one us is 16 cycles at 16MHz
   ;; 3 cycles for call
   rcall cycdelay7  ;  10 cycles
   subi r24,1		;   11 cycles
   brne delayp6     ;    12 cycles if not taken
   ret              ;     16 cycles
delayp6:
   ;; Extra cycles for each loop that doesn't include call/return
   ;;   but does include an extra "successful jump" cycle.
   ;; this'll be at 10 cycles without the call.  
   rjmp PC+1		;11, 12		            
   rjmp PC+1		; 13, 14
   rjmp usdelay     ;  15, 16
  
;; a call to a routine containing only a ret takes 7 cycles.
;; (normally, stick this on any convenient "ret" instruction;
;;  separated out here, for clarity.)
cycdelay7: ret
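
The cycle comments in that routine can be checked with a short host-side model. The Python sketch below walks the same instruction sequence, assuming the timings the comments use (rcall 3, ret 4, subi 1, brne 2 taken / 1 not taken, rjmp 2) and counting the caller's 3-cycle rcall as part of the delay. The function name simply mirrors the label.

```python
def usdelay_cycles(us_count, call_cycles=3):
    """Cycles from the caller's rcall usdelay through the final ret."""
    total = call_cycles
    remaining = us_count & 0xFF
    while True:
        total += 3 + 4                   # rcall cycdelay7 + its ret
        remaining = (remaining - 1) & 0xFF
        total += 1                       # subi r24,1
        if remaining == 0:
            total += 1                   # brne not taken
            total += 4                   # ret
            return total
        total += 2                       # brne delayp6 taken
        total += 2 + 2 + 2               # rjmp PC+1, rjmp PC+1, rjmp usdelay


print([usdelay_cycles(n) for n in (1, 2, 5, 255)])
```

Under these assumptions the total is exactly 16 cycles per microsecond for every count from 1 to 255. A count of 0 wraps around and behaves like 256.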

 


The ldi or mov to load r24 costs you a cycle, so replace one of the rjmp PC+1 with nop.  Won't work for 1us, though.  To cover that case you'd need to replace the rcall cycdelay7 with 3 x rjmp PC+1 instead.

Here's how I did cycle-accurate delays in one project:

tmp  = 16	; Scratch register
.macro delay_c cycles								;
  ; Fail on negative.								;
  .if \cycles < 0								;
	.error "delay_c called with a negative cycle count"			;
  ; Fail on too many cycles.							;
  .elseif \cycles > (((3 * 256) + 7) + 2)					;
	.error "delay_c called with too many cycles"				;
  ; Use subroutine for long delays.						;
  .elseif \cycles >= 10								;
	; Value of 256 passed as 0.						;
	ldi	tmp, ((\cycles - 7) / 3) % 256					; 1
	rcall	delay_loop							; 3+
    ; Extra one or two cycles not possible with subroutine's 3-cycle loop.	;
    .if ((\cycles - 10) % 3) == 2						;
	rjmp	.+0								; 2
    .elseif ((\cycles - 10) % 3) == 1						;
	nop									; 1
    .endif									;
  ; Too short for subroutine, inline loop instead.				;
  .elseif \cycles >= 9								;
	ldi	tmp, \cycles / 3						; 1
0:	dec	tmp								; 1
	brne	0b								; 1/2
    ; Extra 1 or two cycles not possible with 3-cycle loop.			;
    .if (\cycles % 3) == 2							;
	rjmp	.+0								; 2
    .elseif (\cycles % 3) == 1							;
	nop									; 1
    .endif									;
  ; Short delays can be done in the same or fewer words, without clobbers.	;
  .else										;
    .if \cycles >= 8								;
	rjmp	.+0								; 2
    .endif									;
    .if \cycles >= 6								;
	rjmp	.+0								; 2
    .endif									;
    .if \cycles >= 4								;
	rjmp	.+0								; 2
    .endif									;
    .if \cycles >= 2								;
	rjmp	.+0								; 2
    .endif									;
    ; Odd cycle not possible with rjmp.						;
    .if (\cycles % 2) == 1							;
	nop									; 1
    .endif									;
  .endif									;
.endm										;
delay_loop:									;
	dec	tmp								; 1
	brne	delay_loop							; 1/2
	ret									; 4

Use like this:

	delay_c	46

I used it in an app running at 8.25 MHz, where 46 cycles works out to about 5.576 µs.

 

Most it can do is 777 cycles, which at 16 MHz is 48.5625 µs.
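
To see that the macro's branches really add up, here is a Python sketch that mirrors its conditionals and sums the cycles of the instructions each branch would emit. It assumes ldi 1, rcall 3, dec 1, brne 2 taken / 1 not taken, ret 4, rjmp 2 and nop 1. The function is a model written for this check, not part of the posted code.

```python
def delay_c_cycles(cycles):
    """Cycles consumed by the code that `delay_c cycles` would emit."""
    if cycles < 0:
        raise ValueError("negative cycle count")
    if cycles > 3 * 256 + 7 + 2:
        raise ValueError("too many cycles")
    if cycles >= 10:
        count = ((cycles - 7) // 3) % 256    # ldi operand; 0 means 256 passes
        passes = count if count else 256
        used = 1 + 3 + (3 * passes - 1) + 4  # ldi + rcall + delay_loop + ret
        return used + (cycles - 10) % 3      # padding: none, nop (1) or rjmp (2)
    if cycles >= 9:
        passes = cycles // 3
        used = 1 + (3 * passes - 1)          # ldi + inline dec/brne loop
        return used + cycles % 3             # padding, as above
    used = sum(2 for threshold in (8, 6, 4, 2) if cycles >= threshold)  # rjmp .+0
    return used + cycles % 2                 # one nop for an odd count


print(delay_c_cycles(46), delay_c_cycles(777))
```

Under these timings every request from 0 to 777 cycles comes out exact, including the 256-pass case where the loop count is passed as 0.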

 

EDIT:  sp

 

Last Edited: Thu. Apr 16, 2020 - 03:17 PM

Here's a tweaked version of Bill's code (clobbers r0):

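dlyrpt:
    lpm                 ; 3 cycles (result in r0 is discarded)
    lpm                 ; 3 cycles
usdelay:
    rcall cycdelay7     ; 7 cycles including the ret
    dec r24             ; 1 cycle
    brne dlyrpt         ; 2 cycles if taken, 1 if not
cycdelay7:
    ret                 ; 4 cycles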

 

I have no special talents.  I am only passionately curious. - Albert Einstein

 


I hope that Z doesn't point at anything where a read has a side effect.

 


sparrow2 wrote:

I hope that Z doesn't point at anything where a read has a side effect.

 

LPM is Load Program Memory (flash), not LD, which reads from SRAM.  SRAM reads on traditional AVRs can have side-effects, but not flash.

 


Point taken; that is only a problem on the newer ones. (And we all assume that a read from undefined flash is just remapped.)

 

 

 


sparrow2 wrote:

Point taken; that is only a problem on the newer ones. (And we all assume that a read from undefined flash is just remapped.)

 

I don't assume that.  Not only have I read about other people's tests, I've tested flash address wraparound on several tiny/mega AVRs.  It's also well documented (search for " --pmem-wrap-around ").

 


ralphd wrote:
sparrow2 wrote:

Point taken; that is only a problem on the newer ones. (And we all assume that a read from undefined flash is just remapped.)

 

I don't assume that.  Not only have I read about other people's tests, I've tested flash address wraparound on several tiny/mega AVRs.  It's also well documented (search for " --pmem-wrap-around ")

In the usual case with power-of-two flash, I believe that wraparound is implicitly guaranteed.

The datasheet specifies which bits of Z are used: just enough to cover the entire range.
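
Read that way, wraparound on a power-of-two flash is just a mask on Z. A minimal C sketch of that reading follows. It models the claim in this post, not measured hardware behaviour, and the function name and the flash_bytes parameter are hypothetical. It assumes flash_bytes is a power of two no larger than 64 KiB.

```c
#include <stdint.h>

/*
 * Hypothetical model: if only the low address bits of Z are decoded,
 * an out-of-range Z lands back inside the flash.
 * Assumes flash_bytes is a power of two and at most 65536.
 */
uint16_t lpm_effective_address(uint16_t z, uint32_t flash_bytes)
{
    return (uint16_t)(z & (flash_bytes - 1u));
}
```

For example, with an 8 KiB flash (0x2000 bytes) this model maps Z = 0x2004 to address 0x0004, while any Z already inside the flash is unchanged.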

Moderation in all things. -- ancient proverb


I also read somewhere here that --pmem-wrap-around doesn't always work, but I can't remember if it was a HW error or a compiler error.

Also, one of the chips (I think a 16k flash version) always pushes an illegal return address for (r)call, so it has to wrap all the time.

 

I have not seen anywhere that Atmel has guaranteed wrap to work other than for 8k chips.

 

Added:

But this is about a SW delay that the OP for some reason wants to do.

I wanted the OP to do it himself, but now it's solved.

     

Last Edited: Thu. Apr 16, 2020 - 05:26 PM

joeymorin wrote:
Here's how I did cycle-accurate delays in one project:

I've lost track here.  Your solution, joey, is a build-time solution, right?  Didn't OP want run-time?

robydream wrote:
 I want to write delay routine that is accepting delay time

 

I thought that the "newer" GCC facilities were cycle-accurate?

 

If there is a reasonable range of values, couldn't a Duff's device-type-thing be used?
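
To make the Duff's-device idea concrete: the delay value picks an entry point into a run of identical steps, so the number of steps executed is decided at run time. The C sketch below only models that control flow on a host and is not cycle-accurate. The name delay_sled and the limit of 4 steps are made up for illustration, and on an AVR the same idea would presumably be a computed jump into a run of nops.

```c
#include <stdint.h>

/*
 * Hypothetical illustration: each `steps++` stands in for one
 * fixed-cost instruction (for example a nop). Returns how many ran.
 */
uint8_t delay_sled(uint8_t n)
{
    uint8_t steps = 0;

    switch (n) {            /* entry point chosen at run time */
    default:                /* requests above 4 get the longest delay */
    case 4: steps++;        /* fall through */
    case 3: steps++;        /* fall through */
    case 2: steps++;        /* fall through */
    case 1: steps++;        /* fall through */
    case 0: break;
    }
    return steps;
}
```

The trade-off is code size: one step per unit of delay, so this only suits a small range of values.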

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


theusch wrote:

I thought that the "newer" GCC facilities were cycle-accurate?

In C, via __builtin_avr_delay_cycles, yes.  It's a GCC extension.
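
A minimal sketch of how that built-in is typically used. The argument to __builtin_avr_delay_cycles must be a compile-time constant, so this is a build-time delay. The names DELAY_CYCLES, F_CPU_HZ, US_TO_CYCLES and wait_two_us are hypothetical, the 16 MHz clock is taken from #1, and the non-AVR branch is only a stand-in counter so the snippet also compiles on a host.

```c
#include <stdint.h>

#ifdef __AVR__
/* avr-gcc built-in; the cycle count must be a compile-time constant. */
#define DELAY_CYCLES(n) __builtin_avr_delay_cycles(n)
#else
/* Host stand-in (assumption for illustration): just count requested cycles. */
static unsigned long host_cycles;
#define DELAY_CYCLES(n) (host_cycles += (n))
#endif

#define F_CPU_HZ 16000000UL                              /* assumed clock */
#define US_TO_CYCLES(us) ((F_CPU_HZ / 1000000UL) * (us)) /* whole-MHz clocks only */

/* Fixed 2 us delay: 32 cycles at 16 MHz, resolved at build time. */
void wait_two_us(void)
{
    DELAY_CYCLES(US_TO_CYCLES(2));
}
```

Note that the requested time is baked in when the code is compiled; a value that is only known at run time cannot be passed this way.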

 

For my purposes, I needed assembly language, and I needed something that would use a minimum of code space via a (re)callable subroutine.  Looks like the OP wants assembly language, too.

theusch wrote:

Didn't OP want run-time?

Where did he say that, exactly?

theusch wrote:

Didn't OP want run-time?

joeymorin wrote:
Where did he say that, exactly?

Isn't that what "using parameter as delay needed" (in the title) means ?

Top Tips:

  1. How to properly post source code - see: https://www.avrfreaks.net/comment... - also how to properly include images/pictures
  2. "Garbage" characters on a serial terminal are (almost?) invariably due to wrong baud rate - see: https://learn.sparkfun.com/tutorials/serial-communication
  3. Wrong baud rate is usually due to not running at the speed you thought; check by blinking a LED to see if you get the speed you expected
  4. Difference between a crystal, and a crystal oscillator: https://www.avrfreaks.net/comment...
  5. When your question is resolved, mark the solution: https://www.avrfreaks.net/comment...
  6. Beginner's "Getting Started" tips: https://www.avrfreaks.net/comment...

awneil wrote:

Isn't that what "using parameter as delay needed" (in the title) means ?

I don't see how 'runtime' is implied.  Macros take parameters, too.

Fair point - and it's the way the 'C' delay works, too.


joeymorin wrote:

Where did he say that, exactly?

I quoted, from #1.  At least that was the way I took it.

"  I want to write delay routine that is accepting delay time ..."

And the "not working" code immediately following sets a "parameter" and then jumps to the "routine".

 

So you can't use _builtin_xxx except in a C source?  [not a GCC person]


Last Edited: Fri. Apr 17, 2020 - 01:07 PM

joeymorin wrote:

I don't see how 'runtime' is implied.  Macros take parameters, too.

C'mon; you are stretching things now, aren't you?  Are you arguing that there is, then, no difference between e.g. delay routines that are expanded at build time such as your example code, and those that pass a parameter at run-time?


theusch wrote:
Are you arguing that
Of course not.  Why be specious?  This is irrelevant to the thread.  If the OP can achieve what they're looking for by extracting from my or your or anyone else's posts, what difference does it make?

 

theusch wrote:
C'mon; you are stretching things now, aren't you?
If you say so.

 

Macros have parameters, as you well know, and as has already been established in this thread.  A suggestion was made in #41 to examine and emulate _delay_nn macros' code generation, but you don't have a problem with that.

 

theusch wrote:
So you can't use _builtin_xxx except in a C source?  [not a GCC person]
Possibly you can, but not in any way that I know of.  It is a C extension.

 

theusch wrote:
I quoted, from #1.  At least that was the way I took it.

"  I want to write delay routine that is accepting delay time ..."

Have you >>looked<< at the code in #46?  It is a macro, yes, but it generates a call to a subroutine.

joeymorin wrote:

Have you >>looked<< at the code in #46?  It is a macro, yes, but it generates a call to a subroutine.

 

Perhaps I'm getting too old, or the virus is affecting me.

joeymorin wrote:

Use like this:

	delay_c	46

 

is, to me, at build time.  It cannot accept a parameter at run time...let's say 8-bit (for convenience) ADCH, can it?

 

Why am I having such a hard time grasping how

 

joeymorin wrote:

Of course not.  Why be specious?  This is irrelevant to the thread.  If the OP can achieve what they're looking for by extracting from my or your or anyone else's posts, what difference does it make?

is relevant to the thread?  Are you saying that your code can take, e.g. the value in ADCH at run time after a conversion and produce a proportional delay?  I thought that is what OP was asking for in #1.  Which post can be extracted to produce this varying ADCH-proportional delay?


theusch wrote:
Why am I having such a hard time grasping how
How should I know?

 

The first (and only) mention of 'runtime' was by you.

 

The first (and only) mention of ADCH was by you.

 

theusch wrote:
Perhaps I'm getting too old, or the virus is affecting me.
Look at the macro in #46 again.  It generates two instructions.  An 'ldi' and an 'rcall'.  If you wish, you can rcall delay_loop yourself, having arranged to load tmp with whatever value you wish, perhaps with the contents of ADCH.

 

The code in #46 was offered with:

Here's how I did cycle-accurate delays in one project:

No other claims were made.

 

I fail to understand the value of this diversion.  It is not helping the OP.  Who, as you can see, hasn't been here for a while.

 

Last Edited: Fri. Apr 17, 2020 - 09:00 PM

Boy, I must have tickled your isolation bone.  OK, tell me the exact wording to distinguish between build-time and runtime?  Call them case A and case B.  I read that OP wanted case B.  Did you read that differently?

 

The answers to this might be important, as it may well be the niggle.

joeymorin wrote:

The first (and only) mention of ADCH was by you.

And was that a bad example to illustrate a runtime/dynamic/not-the-same-value-every-time parameter?

joeymorin wrote:

If you wish, you can rcall delay_loop yourself, having arranged to load tmp with whatever value you wish, perhaps with the contents of ADCH.

I'm very confused again -- no, you can't pass that and have your macro lines such as

.elseif \cycles > (((3 * 256) + 7) + 2)	

  work.  How?  "cycles" is ADCH at build time when the macro is expanded.  How do you know if ADCH is less than 0 or greater than a million?  Won't you be working with the I/O address at this time?

 

joeymorin wrote:
I fail to understand the value of this diversion.  It is not helping the OP. 

Who asked for run-time delay, and you gave build-time.  And no, substituting ADCH value as "cycles" ain't gonna carry out your statements.


I can't quite figure out why this is a bone you can't seem to let go of, but whatever gets you through the day...

 

theusch wrote:
Who asked for run-time delay
You've failed to convince me.  But this is a fun game.

 

theusch wrote:
How?
Never said.

 

Did say:

joeymorin wrote:

If you wish, you can rcall delay_loop yourself, having arranged to load tmp with whatever value you wish