[DIS] [ASM] Program Tricks: Shrinking the OSCCAL Routine

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

SUBTITLE: I WISH I WAS AN OSCCAL MYER WINNER

UPDATE APR 5/06: C VERSION ONLINE!

Many thanks to ARod (alejmrm), who ported and posted a "C" version of the following ASM OSCCAL routine. There are two versions on the third page, fourth post down. Thanks a million, ARod!

PURPOSE:

Discuss methods used to reduce the size of a typical OSCCAL routine using AVR machine language.

Quote:
Your code will not properly calibrate the oscillator!

Quote:
It is clear from this that your code does not do the same operations that the designers of the Butterfly thought were required for accurate calibration!

Quote:
What value does TMP have when entering the fragment?

Quote:
My main question also revolves around TMP, indirectly.

Quote:
Seems to be an interest in the way that you get the OSCCAL value, and I am interested in the details... do you mind to move it to a new thread to talk about it? As some body said, what kind of asumptions are you taking?, etc...

Thanks.. really clever ideas!

PREAMBLE:

When Giorgos announced the latest version of his Butterfly Bootloader, his first complaint about condensing its size was the length of the OSCCAL routine. Since I had to deal with the same problem myself when writing bootloaders, I thought I might help him out by giving him my much smaller routine. At the time I did not expect the critical reaction and wide interest in the routine.

The many comments about the routine and how it works were messing up his original thread, and I was getting private mail requests as well, so I started this thread to answer all those questions without hijacking his. I hope you find this discussion informative as well as entertaining.

CAVEAT LECTOR:

Some of the techniques contained herein are outside normal coding practices and may be offensive to some programmers.

Last Edited: Fri. Apr 7, 2006 - 06:24 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

INTRODUCTION: THE HEART OF THE OSCCAL ROUTINE

The heart of the OSCCAL routine is very simple. The basic idea is to set up a counter on the Oscillator; after a fixed amount of time you examine the counter to see if the Oscillator is running fast or slow. You then adjust the OSCCAL register and go back and check again. You repeat this process until the Oscillator is running within an acceptable range.
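
For readers who think more easily in C, the measure/compare/adjust idea can be sketched on a PC. This is only a model, not the Butterfly code: measure_ticks() is a made-up stand-in for "count Timer1 ticks during the 200/32768 s gate", its linear response to OSCCAL is invented for illustration, and the +/- 50 window is just one typical choice.

```c
#include <stdint.h>

/* Hypothetical oscillator model: ticks counted during the fixed gate
   time as a function of the OSCCAL setting. Slope and offset are
   invented; a real part is neither this linear nor this tidy. */
static uint16_t measure_ticks(uint8_t osccal)
{
    return (uint16_t)(5000u + 20u * osccal);
}

#define LOWER_LIMIT 6053u   /* 6103 - 50 */
#define UPPER_LIMIT 6153u   /* 6103 + 50 */

/* Classic loop: measure, nudge OSCCAL by one, repeat until in range. */
static uint8_t calibrate(uint8_t osccal)
{
    for (;;) {
        uint16_t ticks = measure_ticks(osccal);
        if (ticks < LOWER_LIMIT)
            osccal++;               /* too slow: speed up  */
        else if (ticks >= UPPER_LIMIT)
            osccal--;               /* too fast: slow down */
        else
            return osccal;          /* within the window   */
    }
}
```

With this invented model, calibrate(0) climbs up to 53 and calibrate(100) walks down to 57; both ends of the window are reached one step at a time.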

Because the Oscillator rate can change with temperature etc. I go back and re-calibrate it each time the Butterfly wakes, something the other Bootloaders seem to have missed.

I am going to dig up an older bootloader with the original OSCCAL routine, compare it to my current condensed version, and try to remember the steps and logic I used to get from one to the other.

CONTENTS AND CODE FRAGMENTS:

For the purpose of this discussion I am only focusing on the actual OSCCAL adjustment loop and not the pre-amble or set-up.

As you examine the code fragments and techniques used, please keep in mind that the goal is to produce the smallest code possible and many traditional and accepted coding practices may go out-the-window.

FINAL WARNING:

Make sure that all the necessary precautions have been taken... you are about to enter the mind of a certified Hack.

Last Edited: Mon. Apr 3, 2006 - 06:12 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Here's a typical OSCCAL routine. This one is clipped from George Kolovos' disassembly of the Atmel *.hex file.

The first thing that jumps-out-at-me are the two routines at the bottom that adjust the OSCCAL register. To me they stand out like two warts on an otherwise straight-forward routine. So I would focus on those first.

; //////////////////////////////////////////////////////////////////////////////
; //    Description:   An enhanced bootloader for the ATMEL Butterfly unit    //
; //    Copyright (C) 2006  George Kolovos       //
; //                                                                          //
; //////////////////////////////////////////////////////////////////////////////

-------------------------------------------------

		; Test the internal oscillator speed
 OSC_Test:	ser	tmp1		; Reset any TC1 and TC2 flags
		 out	TIFR1,tmp1
		 out	TIFR2,tmp1
		sts	TCNT1H,Zero	; Reset TCNT1
		 sts	TCNT1L,Zero
		sts	TCNT2,Zero	; Reset TCNT2
		ldi	tmp1,1		; Start TC1
		 sts	TCCR1B,tmp1
		; Wait for the TC2 compare-match
		sbis	TIFR2,OCF2A	; 200 *	1/32768	= 6103.52us
		 rjmp	PC-1
		sts	TCCR1B,Zero	; Time-out: Stop the TC1
		sbic	TIFR1,TOV1	; TC1 overflowed?
		 rjmp	OSC_Too_Fast
		lds	XL,TCNT1L	; X = TCNT1
		 lds	XH,TCNT1H
		;  Is the oscillator slow?
		cpi	XL,Byte1(OSC_Lo) ; cpi TCNT1,6120
		 ldi	tmp1,Byte2(OSC_Lo)
		 cpc	XH,tmp1
		 brlo	OSC_Too_Slow
		;  Is the oscillator fast?
		cpi	XL,Byte1(OSC_Hi) ; cpi TCNT1,6251
		 ldi	tmp1,Byte2(OSC_Hi)
		 cpc	XH,tmp1
		 brsh	OSC_Too_Fast
		; The oscillator frequency is within the acceptable limits
 OSC_Done:	ret

		; Decrease the oscillator frequency
 OSC_Too_Fast:	lds	tmp1,OSCCAL	; OSCCAL--;
		 dec	tmp1
		 sts	OSCCAL,tmp1
		rjmp	OSC_Test

		; Increase the oscillator frequency
 OSC_Too_Slow:	lds	tmp1,OSCCAL	; OSCCAL++;
		 inc	tmp1
		 sts	OSCCAL,tmp1
		rjmp	OSC_Test

---------------------------------------------------------
Last Edited: Fri. Apr 7, 2006 - 03:27 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Here are the two program "warts" that bothered me.
I capitalized them for my own benefit.

OSC_Too_Fast:
        LDS    TMP1,OSCCAL  ;DECREASE FREQ
        DEC    TMP1
        STS    OSCCAL,tmp1
         RJMP  OSC_Test

OSC_Too_Slow:
        LDS    TMP1,OSCCAL  ;INCREASE FREQ
        INC    TMP1
        STS    OSCCAL,TMP1
                 RJMP  OSC_Test

The first thing that strikes me is that, with the exception of the INC & DEC statements, the two routines are identical. Perhaps we can combine them somehow and turn two warts into just one.

We'll start by combining the two writes into one by using a programming technique I call "Sharing-a-Piece-of-Arse!"

OSC_FAST: LDS    TMP1,OSCCAL
          DEC    TMP1        ;<=== STS OSCCAL REMOVED
           RJMP  OSC_ADJ     ;<=== RE-ADJUSTED
OSC_SLOW: LDS    TMP1,OSCCAL
          INC    TMP1
OSC_ADJ:  STS    OSCCAL,TMP1 ;<===  SHARED ARSE-END
           RJMP  OSC_TEST

The next thing I notice is that there are two reads of the OSCCAL register. If we move that read back into the main program BEFORE these routines are called we can eliminate another line of code:

OSC_FAST: DEC    TMP1        ;<=== LDS REMOVED
           RJMP  OSC_ADJ
OSC_SLOW: INC    TMP1
OSC_ADJ:  STS    OSCCAL,TMP1 ;<=== LDS REMOVED
           RJMP  OSC_TEST

Now we've really shrunk those two routines down into one rather small extension of the main program.

Last Edited: Thu. Apr 6, 2006 - 10:39 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Next I'd like to move that write (STS OSCCAL) out of this program "extension" and back to the main routine. There's no savings in terms of memory space for doing this so you'll just have to trust me on this for the moment, I have something in mind for later.

Since the write is the last thing we do before jumping back to the start of our calibration loop, the ideal place to move it would be right at the start.

We're not saving anything by doing this, but I'm half-way to doing something that will. So just wait a bit.

Remember the LDS TMP1,OSCCAL statement that we removed earlier? A good place for it would be just before we enter our main calibration loop, but we'll need to switch to another unused register so we can hold that value undisturbed. Since tmp1 gets used inside the routine, I'll call this new register TMP and make it equal to R0.

Remember we haven't saved anything yet, however I am working up to something that requires that I make this move. So after moving the write out and making the other changes, re-adjusting the RJMPs and cleaning up, this is what we have:

; MAIN OSCCAL CALIBRATION LOOP
       LDS TMP,OSCCAL    ;<===MOVED HERE
OSC_TEST:
       STS OSCCAL,TMP    ;<===MOVED HERE
       ; (REST OF MAIN ROUTINE AS BEFORE)
       RET

; OSCCAL ADJUSTMENT ROUTINES
OSC_FAST: DEC TMP
           RJMP OSC_TEST ;<===RE-ADJUSTED
OSC_SLOW: INC TMP
           RJMP OSC_TEST ;<===RE-ADJUSTED

At this point I think I've answered some of the questions that were quoted at the start of this thread concerning the initial value of TMP prior to entering the calibration loop:

Quote:
What value does TMP have when entering the fragment?

Quote:
My main question also revolves around TMP, indirectly.

Quote:
As some body said, what kind of asumptions are you taking?

Well obviously from the code, the value of TMP prior to entering the main loop is the current value of OSCCAL that we are going to adjust within the routine. So I hope that answers the above questions to everyone's satisfaction.

       LDS TMP,OSCCAL

;MAIN OSCCAL CALIBRATION LOOP
OSC_TEST: 

I should have posted this one extra line. I didn't expect to be cross-examined on the routine; I was posting it to help someone who knew exactly what TMP's value would be.

Last Edited: Mon. Apr 3, 2006 - 06:25 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Well, the two "warts" are each down to a single line with a jump back to the main routine. We've taken two warts, combined them into a single wart, then turned it into two small blemishes.

I don't think there is much more we can do with them at this point. We're gonna' have to go back and re-examine the main routine. We'll start by looking at where these two routines are called from:

		sts	TCCR1B,Zero
		sbic	TIFR1,TOV1
		 rjmp	OSC_Too_Fast   ;<== CALL TOO FAST

		lds	XL,TCNT1L
		 lds	XH,TCNT1H
		cpi	XL,Byte1(OSC_Lo)
		 ldi	tmp1,Byte2(OSC_Lo)
		 cpc	XH,tmp1
		 brlo	OSC_Too_Slow  ;<== CALL TOO SLOW

		cpi	XL,Byte1(OSC_Hi)
		 ldi	tmp1,Byte2(OSC_Hi)
		 cpc	XH,tmp1
		 brsh	OSC_Too_Fast   ;<== CALL TOO FAST

The sequence is TOO_FAST / too_slow / TOO_FAST.

For reasons that will become apparent shortly I want the sequence to change from 2FAST/2slow/2FAST to 2FAST/2FAST/2slow. We can do this easily by just switching the last two tests around.

If you'd like to see an actual example of the routine we've worked out so far, even though we're only half-way into my next reduction, Giorgos somehow got this incomplete version into his latest Bootloader.

The following is a snippet from his latest source code.
I've highlighted the areas of interest for us: (Assume TMP=R0) and notice that the sequence is now 2FAST/2FAST/2slow and it contains all the exact modifications we've made so far.

		 lds	r0,OSCCAL ;<=== FETCHING OSCCAL PRIOR TO ENTERING LOOP
		
		
 OSC_Test:	sts	OSCCAL,r0 ;<=== SETTING OSCCAL AT START OF LOOP
		ser	tmp1
		 out	TIFR1,tmp1
		 out	TIFR2,tmp1
		sbis	TIFR2,TOV2
		 rjmp	PC-1
		lds	XL,TCNT1L
		 lds	XH,TCNT1H
		 sts	TCNT1H,Zero
		 sts	TCNT1L,Zero
		sbic	TIFR1,TOV1
		 rjmp OSC_Too_Fast ;<============== TOO_FAST
		cpi	XL,Byte1(Upper_Limmit)
		 ldi	tmp1,Byte2(Upper_Limmit)
		 cpc	XH,tmp1
		 brsh	OSC_Too_Fast ;<============ TOO_FAST

		cpi	XL,Byte1(Lower_Limmit) 
		 ldi	tmp1,Byte2(Lower_Limmit)
		 cpc	XH,tmp1
		 brlo	OSC_Too_Slow ;<============ TOO_SLOW
		
 OSC_Done:	sts	TCCR1B,Zero
		 sts	TCCR2A,Zero
		ret

 OSC_Too_Fast:	dec	r0 ;<========== LDS & STS REMOVED
		 rjmp	OSC_Test

 OSC_Too_Slow:	inc	r0 ;<========== LDS & STS REMOVED
		 rjmp	OSC_Test	

Last Edited: Fri. Apr 7, 2006 - 06:31 AM

TITLE: DANGEROUS DAN's SHRINKAGE OF THE OSCCAL ROUTINE

One programming "Trick" when you have an A-or-B option like we have above with the TOO_FAST-or-TOO_SLOW option, is to ASSUME that one of them is true, and later if it turns out not to be true, you re-adjust.

DANGEROUS DAN PROGRAM TIP: USE ASSUMPTIONS TO SIMPLIFY YOUR CODE


Obviously it's best to assume the case that will be true most often. Here we don't know which that is, but we do have TWO calls to the TOO_FAST routine and only one to the TOO_SLOW, so let's choose to ASSUME that our oscillator is always running too fast.

I notice that when the Oscillator is too fast we decrement TMP=R0 so I add this as the first line of our routine. Now when the Oscillator actually turns out to be too fast our assumption is correct so we just jump back to the start of the main loop and totally by-pass the old TOO_FAST routine.

Making this change and removing the TOO_FAST routine we end-up saving another program line and streamline our routine.

While we're at it another trick with the AVRs with so many registers is to pre-define them to values that you might find handy. Just about everyone has a ZERO, but I also define ONE, TWO, THREE, FOUR, V128 and FF=255 because I find they come-in-handy.

I notice that in the 2nd line of the following code segment Giorgos is setting the TMP1 to 255 using the SER command and writing it to the TIFRn ports. If we use my pre-defined FF Register we can knock off another word from this routine.

DANGEROUS DAN PROGRAM TIP: USE PRE-DEFINED REGISTERS

OSC_TEST: DEC R0	;<======= NEW: ASSUME TOO FAST
          STS OSCCAL,R0
;------------------------------
;          ser TMP1
;          out TIFR1,TMP1
;          out  TIFR2,TMP1
;------------------------------
          OUT TIFR1,FF  ;<======== CHANGED
          OUT TIFR2,FF  ;<======== CHANGED
		sbis	TIFR2,TOV2
		 rjmp	PC-1
		lds	XL,TCNT1L
		 lds	XH,TCNT1H
		 sts	TCNT1H,Zero
		 sts	TCNT1L,Zero

		sbic	TIFR1,TOV1
		 RJMP	OSC_TEST ;<=== RE-ADJUSTED

		cpi	XL,Byte1(Upper_Limmit) 
		 ldi	tmp1,Byte2(Upper_Limmit)
		 cpc	XH,tmp1
		 BRSH	OSC_TEST ;<=== RE-ADJUSTED

		cpi	XL,Byte1(Lower_Limmit)
		 ldi	tmp1,Byte2(Lower_Limmit)
		 cpc	XH,tmp1
		 brlo	OSC_Too_Slow

 OSC_Done:	sts	TCCR1B,Zero
		 sts	TCCR2A,Zero
		 RET

;--------------------------------------
; OSC_Too_Fast:	dec	r0              ;<=== UN-NEEDED!
;		 rjmp	OSC_Test  ;<=== UN-NEEDED!
;--------------------------------------

 OSC_Too_Slow:	inc	r0
		 rjmp	OSC_Test
Last Edited: Mon. Apr 3, 2006 - 06:51 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

We're assuming that the oscillator is running fast and DECrementing R0 without permission so if it turns out that we're wrong we need to adjust for this.

The simplest solution is to add one to get us back to where we should be; then we add another one to increment the OSCCAL value. My first reaction is to simply add another INC R0 to the TOO_SLOW routine. But we have a register called TWO=2 so instead of adding any more program steps I simply change the INC R0 to ADD R0,TWO.
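
A small C model may help convince the sceptical that "always decrement first, add two when the guess was wrong" has the same net effect as the separate INC and DEC handlers. Both function names are hypothetical, and the too_slow flag simply stands in for the outcome of the timer comparison.

```c
#include <stdint.h>

/* Original style: one handler decrements, the other increments. */
static uint8_t step_two_handlers(uint8_t cal, int too_slow)
{
    return too_slow ? (uint8_t)(cal + 1u) : (uint8_t)(cal - 1u);
}

/* Assume-too-fast style: decrement unconditionally (DEC R0), then
   add two (ADD R0,TWO) only when the oscillator was really too slow. */
static uint8_t step_assume_fast(uint8_t cal, int too_slow)
{
    cal = (uint8_t)(cal - 1u);
    if (too_slow)
        cal = (uint8_t)(cal + 2u);
    return cal;
}
```

For every one of the 256 byte values the two functions return the same result, including at the wrap-around points 0 and 255.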

Now let's sweep away the discarded code fragment and see where we're at.

OSC_TEST: DEC R0
          STS OSCCAL,R0
          OUT TIFR1,FF
          OUT TIFR2,FF
		 sbis	TIFR2,TOV2
		 rjmp	PC-1
		 lds	XL,TCNT1L
		 lds	XH,TCNT1H
		 sts	TCNT1H,Zero
		 sts	TCNT1L,Zero

		 SBIC	TIFR1,TOV1
		 RJMP	OSC_TEST

		 cpi	XL,Byte1(Upper_Limmit) 
		 ldi	tmp1,Byte2(Upper_Limmit)
		 CPC	XH,tmp1
		 BRSH	OSC_TEST

		 cpi	XL,Byte1(Lower_Limmit)
		 ldi	tmp1,Byte2(Lower_Limmit)
		 cpc	XH,tmp1
		 brlo	OSC_Too_Slow

	    sts	TCCR1B,Zero
		 sts	TCCR2A,Zero
		 RET

 OSC_Too_Slow:	ADD R0,TWO    ;<=== MODIFIED!
		 RJMP  OSC_Test
Last Edited: Mon. Apr 3, 2006 - 06:55 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Remember that earlier I wanted the structure of the main routine switched from 2FAST/2slow/2FAST to 2FAST/2FAST/2slow? This next move will explain why.

If you remember, we moved the DEC R0 out of our old OSC_TOO_FAST routine and stuck it in the main-line to eliminate the entire routine. We can do the same with the ADD R0,TWO in the TOO_SLOW routine and remove it also. This simplifies our code and eliminates another program statement.

The reason I wanted the 2FAST/2FAST/2slow structure was to make this move. If we left it as it was, we'd have to INC R0, then later ADD R0,TWO then SUB R0,TWO.

OSC_TEST: DEC R0
          STS  OSCCAL,R0
          OUT  TIFR1,FF
          OUT  TIFR2,FF
          sbis TIFR2,TOV2
           rjmp PC-1
          lds  XL,TCNT1L
          lds  XH,TCNT1H
          sts  TCNT1H,Zero
          sts  TCNT1L,Zero

          SBIC TIFR1,TOV1
           RJMP OSC_TEST
          cpi  XL,Byte1(Upper_Limmit) 
          ldi  tmp1,Byte2(Upper_Limmit)
          CPC  XH,tmp1
           BRSH  OSC_TEST

          ADD  R0,TWO   ;<=== NEW ADDITION!
          cpi  XL,Byte1(Lower_Limmit)
          ldi  tmp1,Byte2(Lower_Limmit)
          cpc  XH,tmp1
           BRLO  OSC_TEST  ;<=== RE-ADJUSTED

          sts  TCCR1B,Zero
          sts  TCCR2A,Zero
           RET

So we've finally removed those two blemishes and have a fairly decent piece of code now.

There are two final things I'd like to do. One is to remove the two lines that shut off the timers; I deal with that in another part of my code.

The other is to get rid of that ugly rjmp PC-1 on line six. No programmer I know ever uses this nomenclature for relative jumps. It's a sure sign that this was a disassembly of someone's HEX file.

Last Edited: Thu. Apr 6, 2006 - 10:41 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

What we have so far looks like this:

OSC_TEST: DEC R0            ;SET-UP TIMERS
          STS  OSCCAL,R0
          OUT  TIFR1,FF
          OUT  TIFR2,FF
W6103:    SBIS TIFR2,TOV2   ;WAIT
           RJMP W6103
          lds  XL,TCNT1L    ;READ TIMER
          lds  XH,TCNT1H
          sts  TCNT1H,Zero
          sts  TCNT1L,Zero

          SBIC TIFR1,TOV1   ;CHECK TOO FAST
           RJMP OSC_TEST
          cpi  XL,Byte1(Upper_Limmit) 
          ldi  tmp1,Byte2(Upper_Limmit)
          CPC  XH,tmp1
           BRSH  OSC_TEST

          ADD  R0,TWO       ;CHECK TOO SLOW
          cpi  XL,Byte1(Lower_Limmit)
          ldi  tmp1,Byte2(Lower_Limmit)
          cpc  XH,tmp1
           BRLO  OSC_TEST
            RET             ;RETURN IF WE'RE JUST RIGHT

PRELIMINARY CONCLUSION:

So far we've taken a fairly large piece of code, straightened it out by removing two ugly routines hanging off the end, and reduced it from about 32 lines to 22, about 2/3rds its original size.

Even if you're not hell-bent on reducing a routine's size, reducing its complexity is always a good thing. Bugs are directly proportional to some power of the length and complexity of the code. With Microshaft Windoze being millions of lines of code, is it any wonder it crashes on a regular basis?

With less "going on" in the silicon, there's less chance of anything going awry. Also, trying to debug a "flat, smooth" routine is always far easier and much faster than trying to sort out some lumpy, bent and twisted piece of "spaghetti code" from an unskilled code-smithy.


BUGS ~= [ SIZE x COMPLEXITY ]**P, where P >1

PROGRAMING TIP: REDUCE SIZE & COMPLEXITY OF CODE

Now that we've removed all the ugliness from the code and reduced it in size, most people would expect me to stop at this point.

They obviously don't know me very well, because this is the exact point where I start pulling out my bag of "dirty" tricks and try to squeeze the program down even further. I'm not happy until I've beaten a routine down so far it changes from Coal to Diamond.

TO BE CONTINUED AFTER THE INTERMISSION...

Last Edited: Fri. Apr 14, 2006 - 06:12 AM

read the post, it was a pm from me :)

---
ARod

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

THE SAGA CONTINUES: REDUCTUM AD ABSURDUM

From the time we are children, we are taught integers with the visual aid of a ruler. So we normally think of a single computer integer byte as running in a straight line from the value of 0 at one end to 255 at the other, like a ruler.

However, your microprocessor thinks of integers in a different way than us humans. To a digital MCU a byte wraps around on itself. The "distance" between 0 and 255 is not 255 but 1: if you subtract one from zero you get 255, and if you add one to 255 you get zero. So it's best to think of unsigned integers the way a "digital brain" does, as little circles that wrap around on themselves.


DAN's PROGRAM TIP: THINK OF UNSIGNED INTEGERS AS LITTLE CIRCLES
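
The "little circle" behaviour is easy to see in C with uint8_t, which wraps the same way an 8-bit register does. The helper names below are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* One step backwards or forwards around the 256-position circle. */
static uint8_t dec8(uint8_t v) { return (uint8_t)(v - 1u); }
static uint8_t inc8(uint8_t v) { return (uint8_t)(v + 1u); }

/* How many decrements it takes to get from 'from' to 'to' when we
   are only allowed to go downwards (wrapping through zero). */
static uint8_t steps_down(uint8_t from, uint8_t to)
{
    return (uint8_t)(from - to);
}
```

So dec8(0) is 255, inc8(255) is 0, and steps_down(0, 255) is 1: on the circle, 0 and 255 are neighbours. Going "the wrong way" from 10 to 12 costs steps_down(10, 12) = 254 decrements instead of 2 increments.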

In our routine we are adjusting the OSCCAL value, which is a single byte that will wrap around on itself. So instead of incrementing it when we are too slow and decrementing it when we are too fast, perhaps we can just take it in ONE direction, knowing it will eventually wrap around to the value we need. Sure, it will take a little longer at the microprocessor level, but will that translate into any real difference on a human scale?

OSC_TEST: DEC TMP           ;SET-UP TIMERS
          STS  OSCCAL,TMP
          OUT  TIFR1,FF
          OUT  TIFR2,FF
W6103:    SBIS TIFR2,TOV2   ;WAIT
           RJMP W6103
          lds  XL,TCNT1L    ;READ TIMER
          lds  XH,TCNT1H
          sts  TCNT1H,ZERO
          sts  TCNT1L,ZERO

          SBIC TIFR1,TOV1   ;CHECK TOO FAST?
           RJMP OSC_TEST
          cpi  XL,Byte1(Upper_Limmit) 
          ldi  tmp1,Byte2(Upper_Limmit)
          CPC  XH,tmp1      ;CHECK TOO FAST?
           BRSH  OSC_TEST

          ADD  TMP,TWO ;<=============== CAN WE REMOVE THIS?
          cpi  XL,Byte1(Lower_Limmit)
          ldi  tmp1,Byte2(Lower_Limmit)
          cpc  XH,tmp1      ;CHECK TOO SLOW?
           BRLO  OSC_TEST
            RET             ;OSCCAL IS FINE
Last Edited: Wed. Apr 5, 2006 - 11:07 PM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

My latest Bootloader, the [CRICKET], will "chirp" when reset and "chirp" again after the oscillator's been calibrated; then it waits a while for you to start your upload, and if nothing happens it will give a final chirp before going to sleep. I did this because I got tired of having to press on the joystick each time I wanted to upload a new program.

The time between the two initial "chirps" is the time that it takes to calibrate the oscillator. So I loaded a version of the Bootloader with the ADD TMP,TWO included into a Butterfly and another without it into another and compared results. The one with it missing was slightly slower, but without the audible clues, no one would ever notice. So on a human scale there's not much difference.

By safely removing the ADD TMP, TWO line we can save another program step:

OSC_TEST: DEC TMP           ;SET-UP TIMERS
          STS  OSCCAL,TMP
          OUT  TIFR1,FF
          OUT  TIFR2,FF
W6103:    SBIS TIFR2,TOV2   ;WAIT
           RJMP W6103
          lds  XL,TCNT1L    ;READ TIMER
          lds  XH,TCNT1H
          sts  TCNT1H,ZERO
          sts  TCNT1L,ZERO

          SBIC TIFR1,TOV1   ;CHECK TOO FAST?
           RJMP OSC_TEST
          cpi  XL,Byte1(Upper_Limmit) 
          ldi  tmp1,Byte2(Upper_Limmit)
          CPC  XH,tmp1      ;CHECK TOO FAST?
           BRSH  OSC_TEST
                                     ;<===== ADD TMP,TWO REMOVED!
          cpi  XL,Byte1(Lower_Limmit)
          ldi  tmp1,Byte2(Lower_Limmit)
          cpc  XH,tmp1      ;CHECK TOO SLOW?
           BRLO  OSC_TEST
            RET             ;OSCCAL IS FINE

So the logic of our routine has changed: instead of incrementing or decrementing based on whether the oscillator is fast or slow, now it knows the oscillator is out, and decrements the OSCCAL register. If we happen to be moving it in the "wrong" direction, no problem because, once it hits ZERO it will wrap to 255 and start working downward towards the correct setting. In fact the high bit is not used so it will "wrap" at 127 so the entire process is very fast.
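
The decrement-only search can be modelled in C to see what the "wrong direction" costs. As before this is a sketch under loud assumptions: measure_ticks() and its linear response are invented, the model uses a full 8-bit wrap rather than the 7-bit wrap mentioned above, and the window is 6103 +/- 40.

```c
#include <stdint.h>

/* Hypothetical oscillator model, invented for illustration only. */
static uint16_t measure_ticks(uint8_t osccal)
{
    return (uint16_t)(5000u + 20u * osccal);
}

static int in_range(uint16_t ticks)
{
    return ticks >= 6063u && ticks <= 6143u;   /* 6103 +/- 40 */
}

/* Always step downwards, wrapping through zero, until a reading
   lands in the window. Returns the number of steps taken and leaves
   the final setting in *cal. Like the assembly, it decrements once
   before the first measurement. */
static unsigned calibrate_down_only(uint8_t *cal)
{
    unsigned steps = 0;
    do {
        *cal = (uint8_t)(*cal - 1u);
        steps++;
    } while (!in_range(measure_ticks(*cal)));
    return steps;
}
```

In this model a start value of 100 needs 43 steps to reach 57, while a start value of 10 has to go the long way round and needs 209 steps to reach the same 57. Note that the loop never exits if no setting at all falls inside the window, which is equally true of the assembly version.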

The fact that it takes a tiny bit longer is actually a bonus, because when first started, it's best to let the oscillator "stabilize" and the more time that passes, the better. So not only have we removed a program step, we've actually improved over the "standard" routine.

Last Edited: Fri. Apr 14, 2006 - 06:06 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Since the entire logic has changed, we should rewrite the test part of the routine. Now that we're not concerned with whether we're running fast or slow, but only with whether we're outside the acceptable range, maybe there are some savings to be had.

Since the Upper and Lower Limits are both constants, perhaps we can calculate the difference at assembly time. Then we just read the Oscillator count, subtract the ideal count of 6103 and see if the difference is within that range.

At this point I really did not expect to see much savings since a compare is almost the same as a subtraction, so instead of testing the Oscillator reading against an Upper_Limit and a Lower_Limit, we're subtracting the Ideal_Limit and comparing the difference to the difference between the Upper_Limit and the Lower_Limit.

Both approaches would take about the same number of program steps, two 16-bit compares is going to be the same as a 16-bit subtraction and a 16-bit compare. So I stop here for a while.

Last Edited: Fri. Apr 7, 2006 - 06:44 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

All that leaves us with is that first test for the Oscillator running too fast. I wonder if we could do something with these lines of code that check if our timer has overflowed.

          SBIC TIFR1,TOV1 
           RJMP OSC_TEST 

I've seen OSCCAL routines with 6103 +/- 100uS but most seem to use +/- 50uS and I use a "tighter" +/- 40uS.

This means that our "correct" range is 80uS long. What are the chances that when the timer overflows that it will co-incidentally fall within this range and give us a "False Positive?"

Well based on pure randomness it would be:

Probability = Correct_Range/Total_Range x 100%
Probability = 80/65,536 x 100%
Probability = 0.12%

One in a thousand...Hmm, not too bad, however, the real probability is much, much lower than this.

There's a small chance that the Oscillator can be so fast that it's outside our range test by 10. There's an even smaller chance that it will be over by 100, and a smaller chance still that it will be out by 1000.

The chances that the oscillator could be far enough out to overflow the 16-bit timer (a count of 65,536 against an ideal of 6,103, or more than ten times too fast) AND still land within my small range to give a false positive are slim-to-none. We can safely eliminate this test from the code and save ourselves two more program lines.

OSC_TEST: DEC TMP         ;SET-UP TIMERS 
          STS  OSCCAL,TMP 
          OUT  TIFR1,FF 
          OUT  TIFR2,FF 
W6103:    SBIS TIFR2,TOV2 ;WAIT 
           RJMP W6103 
          lds  XL,TCNT1L  ;READ TIMER 
          lds  XH,TCNT1H 
          sts  TCNT1H,ZERO 
          sts  TCNT1L,ZERO 
                             ;<=== TIMER OVERFLOW TEST REMOVED!
          cpi  XL,Byte1(Upper_Limmit) 
          ldi  tmp1,Byte2(Upper_Limmit) 
          CPC  XH,tmp1      ;OSCILLATOR OUT? 
           BRSH  OSC_TEST 

          cpi  XL,Byte1(Lower_Limmit) 
          ldi  tmp1,Byte2(Lower_Limmit) 
          cpc  XH,tmp1      ;OSCILLATOR OUT? 
           BRLO  OSC_TEST 
            RET           ;OSCCAL IS FINE 
Last Edited: Mon. Apr 3, 2006 - 09:44 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Then it dawned on me...

The reason that re-writing the comparison section wasn't a great idea was that a 16-bit subtraction and a 16-bit compare are essentially the same as two 16-bit compares.

However, from the calculations I just made I realize that my "correct range" is only 80, which will fit into a byte, so that translates into a 16x8 bit compare, not a 16x16. This may save us another line of code.

So the "new" concept was to subtract the lower limit from our reading and check if the result fell within our range:

TST_OSC: DEC TMP         ;SET-UP TIMERS 
          STS  OSCCAL,TMP 
          OUT  TIFR1,FF 
          OUT  TIFR2,FF 
W6103:    SBIS TIFR2,TOV2 ;WAIT 
           RJMP W6103 
          lds  XL,TCNT1L  ;READ TIMER 
          lds  XH,TCNT1H 
          sts  TCNT1H,ZERO 
          sts  TCNT1L,ZERO 
          
          SUBI  XL,LOW(6103-40)  ;CALC SAFE-RANGE
          SBCI  XH,HIGH(6103-40)
          CPI   XL,LOW(UP_LIMIT-LO_LIMIT)
          CPC   XH,ZERO          ;WITHIN RANGE?
           BRPL  TST_OSC      
            RET 

The above code looks great: nice and small. Then I realize that if the clock reading is less than 6,103 - 40, I've got to deal with a "negative" number. I like to avoid "signed" integers whenever I can because they have a habit of coming back to bite-you-in-the-butt!
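
For what it's worth, the subtract-then-compare idea can be kept entirely unsigned. The C sketch below is one way to look at it and is not what the thread goes on to do: a reading below the lower limit wraps around to a very large 16-bit value, so a single unsigned compare rejects it along with the readings that are too high. On the AVR this would mean branching on the carry flag (BRSH/BRLO) rather than on the sign, which is my reading and not something stated in the post. The macro and function names are hypothetical.

```c
#include <stdint.h>

#define LO_LIMIT 6063u   /* 6103 - 40 */
#define UP_LIMIT 6143u   /* 6103 + 40 */

/* Returns 1 when LO_LIMIT <= ticks <= UP_LIMIT, using one subtraction
   and one unsigned compare. If ticks < LO_LIMIT the subtraction wraps
   to a large 16-bit value, which fails the same compare. */
static int within_range(uint16_t ticks)
{
    uint16_t offset = (uint16_t)(ticks - LO_LIMIT);
    return offset <= (uint16_t)(UP_LIMIT - LO_LIMIT);
}
```

For example, within_range(6062) gives 0 because 6062 - 6063 wraps to 65535, which is far bigger than the window width of 80.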

Last Edited: Mon. Apr 3, 2006 - 10:59 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

A SMALL DEVIATION: TOWARDS A SUPER-FAST CALIBRATION ROUTINE

This approach of checking the size of the Oscillator's "Error", rather than using the traditional method of just seeing if it falls between an upper and lower boundary and incrementing or decrementing the OSCCAL register, might be useful to some.

If you wanted a super-fast calibration, you could measure the size of the error and adjust the OSCCAL register accordingly, rather than in small increments/decrements of one, which is the method that seems to be used by most.
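
A rough C sketch of such an error-proportional adjustment follows. Everything numeric in it is an assumption: measure_ticks() is an invented linear model, and TICKS_PER_STEP (how many timer ticks one OSCCAL step is worth) is a made-up figure that would have to be measured on real hardware, where the response is not linear and a step this large could overshoot.

```c
#include <stdint.h>

#define IDEAL_TICKS    6103
#define TOLERANCE      40
#define TICKS_PER_STEP 20    /* hypothetical: ticks gained per OSCCAL step */

/* Hypothetical oscillator model, invented for illustration only. */
static uint16_t measure_ticks(uint8_t osccal)
{
    return (uint16_t)(5000u + 20u * osccal);
}

/* Jump by (error / TICKS_PER_STEP) instead of by one. Returns the
   number of measurements taken; the final setting is left in *cal. */
static unsigned calibrate_proportional(uint8_t *cal)
{
    unsigned passes = 0;
    for (;;) {
        int error = (int)measure_ticks(*cal) - IDEAL_TICKS;
        passes++;
        if (error >= -TOLERANCE && error <= TOLERANCE)
            return passes;
        int step = error / TICKS_PER_STEP;
        if (step == 0)                    /* always move at least one */
            step = (error > 0) ? 1 : -1;
        *cal = (uint8_t)(*cal - step);    /* too fast: go down; too slow: go up */
    }
}
```

In this idealised model both a start value of 100 and a start value of 0 land inside the window on the second measurement, compared with dozens of single steps.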

Since this particular routine is done shortly after power-up (or reset) the longer it takes to adjust the OSCCAL register, the better, because it gives the oscillator more time to "settle."

Okay, I think I've been a "deviant" long enough, time to get back to the main topic of this thread...the shrinking of the OSCCAL Routine.

Last Edited: Mon. Apr 3, 2006 - 11:19 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

I went back and looked at the actual typical values of the Upper and Lower Limits.

Typical Lower-Limit:

LOWER = IDEALTIME - 50uS
LOWER = 6103 - 50
LOWER = 6053 = $17:A5

Typical Upper-Limit:

UPPER = IDEAL + 50uS
UPPER = 6103 + 50
UPPER = 6153 = $18:09

Notice that the high bytes are only out by one, and that if I reduced the upper value by 10, then the high byte would drop to $17 and both high bytes would be the same:

My Lower-Limit:

LOWER = IDEALTIME - 40uS
LOWER = 6103 - 40
LOWER = 6063 = $17:AF

My Upper-Limit:

UPPER = IDEAL + 40uS
UPPER = 6103 + 40
UPPER = 6143 = $17:FF

This means that if we use +/- 40uS instead of +/- 50uS we can simplify part of our routine by simply checking if the high byte equals $17. You'll learn in the next post why it is important that I use exactly +/- 40uS.
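
The byte splits are easy to double-check in C. The two helpers are hypothetical names for what the assembler's HIGH() and LOW() operators do; the decimal limits are 6103 +/- 50 and 6103 +/- 40, where 6103 - 40 works out to 6063.

```c
#include <stdint.h>

/* Upper and lower byte of a 16-bit timer count. */
static uint8_t high_byte(uint16_t v) { return (uint8_t)(v >> 8); }
static uint8_t low_byte(uint16_t v)  { return (uint8_t)(v & 0xFFu); }
```

With +/- 50 the limits 6053 and 6153 split into $17:A5 and $18:09, so the high bytes differ. With +/- 40 the limits 6063 and 6143 split into $17:AF and $17:FF, so one test of the high byte against $17 covers both.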

TSTOSC: DEC TMP         ;SET-UP TIMERS 
        STS  OSCCAL,TMP 
        OUT  TIFR1,FF 
        OUT  TIFR2,FF 
W6103:  SBIS TIFR2,TOV2 ;WAIT 
         RJMP W6103 
        LDS  XL,TCNT1L  ;READ TIMER 
        LDS  XH,TCNT1H 
        STS  TCNT1H,ZERO 
        STS  TCNT1L,ZERO 
        CPI  XH,23  ;<===== CHECK IF HIGH BYTE IS $17
         BRNE TSTOSC
        (INCOMPLETE AS YET)
          RET 
Last Edited: Mon. Apr 3, 2006 - 11:50 AM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Now that we've "taken-care-of" the high byte, all we need to do is check if the lower byte is above or below our selected range. That amounts to just two single-byte compares.

However, if we can "set-things-up" so that our "range" begins or ends at ZERO or 255 then we need only check in "one direction." That means a single compare statement instead of two.

You might have missed it first-time-around so look again at the Lower-Byte value of the Upper-Limit in hex notation, it's $FF:

My Upper-Limit:

UPPER = IDEAL + 40uS
UPPER = 6103 + 40
UPPER = 6143 = $17:FF

I chose to use +40uS so that the test for our lower-byte would fall on the byte's upper "boundary" at $FF=255. This means that all we need to do is compare our lower-byte against our lower-boundary value, and if we fall below it then our oscillator is out-of-range.

Let me expand on this "trick" in case some readers don't get it, because it is a little hard to follow...

Remember earlier I said that single unsigned bytes are actually like little circles and not little rulers. Well, suppose we compare our timer against our range, and the range just happens to end at $FF=255. If we go over 255, the low byte "wraps-around" to ZERO (and the high byte moves on to $18), so now we're actually UNDER and that counts as a failure.

Also if we compare against our range and we actually do fall under, then that's a failure also. So we've magically combined two tests, one for over and another for under, into a single test for being under, because the over will "wrap" to being under.

So our final program looks like this:

TSTOSC: DEC TMP         ;SET-UP TIMERS 
        STS  OSCCAL,TMP 
        OUT  TIFR1,FF 
        OUT  TIFR2,FF 
W6103:  SBIS TIFR2,TOV2 ;WAIT 
         RJMP W6103 
        LDS  XL,TCNT1L  ;READ TIMER 
        LDS  XH,TCNT1H 
        STS  TCNT1H,ZERO 
        STS  TCNT1L,ZERO 
        CPI  XH,23  ;<===== CHECK IF HIGH BYTE IS $17
         BRNE TSTOSC
        CPI XL,175 ;<===== CHECK IF UNDER $AF
         BRLO TSTOSC ;<=== ALSO SNEAKY TEST IF OVER $FF
          RET 
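
As a sanity check of the trick, the two byte tests can be compared against a plain two-sided window test for every possible 16-bit count. This is only a host-side sketch in C; the function names are invented for the example, and the window limits $17AF and $17FF are the ones implied by CPI XH,23 and CPI XL,175.

```c
#include <assert.h>
#include <stdint.h>

/* Conventional window test: two 16-bit compares. */
int in_window_two_compares(uint16_t count)
{
    return count >= 0x17AF && count <= 0x17FF;
}

/* The routine's test: the high byte must equal $17 (CPI XH,23 / BRNE),
   then the low byte must not be below $AF (CPI XL,175 / BRLO).
   No upper compare is needed: any count above $17FF changes the high byte. */
int in_window_byte_tests(uint16_t count)
{
    uint8_t xh = (uint8_t)(count >> 8);
    uint8_t xl = (uint8_t)(count & 0xFF);
    if (xh != 23) return 0;
    if (xl < 175) return 0;
    return 1;
}

/* Number of 16-bit counts on which the two tests disagree. */
uint32_t count_disagreements(void)
{
    uint32_t bad = 0;
    for (uint32_t c = 0; c <= 0xFFFF; c++) {
        if (in_window_two_compares((uint16_t)c) !=
            in_window_byte_tests((uint16_t)c))
            bad++;
    }
    return bad;
}
```

A check such as assert(count_disagreements() == 0) confirms that the one-sided low-byte compare plus the high-byte compare accept exactly the counts 6063 through 6143.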
Last Edited: Mon. Apr 3, 2006 - 08:30 PM

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

So after we add a few lines of set-up the final routine looks like the below, which should be fairly close to the routine I posted in Giorgos' thread:

        LDS TMP,OSCCAL
TSTOSC: DEC TMP         ;SET-UP TIMERS 
        STS  OSCCAL,TMP 
        OUT  TIFR1,FF 
        OUT  TIFR2,FF
        STS TCNT1H,ZERO
        STS TCNT1L,ZERO
        STS TCCR1B,ONE
        STS TCNT2,ZERO
W6103:  SBIS TIFR2,1    ;WAIT 
         RJMP W6103 
        LDS  XL,TCNT1L  ;READ TIMER 
        LDS  XH,TCNT1H 
        CPI  XH,23  ;<=== HIGH BYTE = $17?
         BRNE TSTOSC
        CPI XL,175  ;<=== LOW BETWEEN $AF-$FF? 
         BRLO TSTOSC
          RET 

IN CONCLUSION:

Well I hope I've answered all your questions about my condensed OSCCAL Routine. How it works, why it works and how it got to its present form.

I certainly hope you found the trip entertaining as well as informative.

I've used this routine now for hundreds of uploads to Butterflies and have not experienced a single problem. There have been hundreds of downloads of my Bootloaders which use this OSCCAL Routine, and I have yet to receive any reports of problems.

CONTINUING EDUCATION:

To learn more about the AVR Butterfly in general, you can visit the Butterfly & Beginner's Web Site at: http://retrodan.tripod.com/ or you can visit the Butterfly & Beginners Forum at: http://groups.yahoo.com/group/AVRButterFly/ or the AVR Assembler Site at: http://avr-asm.tripod.com

REQUEST FOR FEEDBACK:

If you found this tutorial discussion interesting and/or entertaining and would like to see more like it, please let the moderator(s) know.

Thank you for your time and consideration.
Have a wonderful day!


Last Edited: Fri. Apr 14, 2006 - 06:07 AM

RetroDan wrote:
TITLE: DANGEROUS DAN's SHRINKAGE OF THE OSCCAL ROUTINE

[RESERVED FOR FUTURE USE BY THE OP]


Could you please stop this nonsense? Don't post a new message when you don't have anything to say. (Or are you desperate to get more stars after your name?)


What the f*ck are you doing, RetroDan ????
Remove those empty mails !

Board Admin / Moderators ! Why can't we get rid of this nuisance ? I know you've had several complaints !

/Jesper
http://www.yampp.com
The quick black AVR jumped over the lazy PIC.
What boots up, must come down.


Retro,

Now, I got it! and thanks for taking time to write this tutorial about your micro-osccal-function.

I will try your functions all the way down to the last post, but I can tell you that I have tested the functions that you posted up until yesterday (April 2), and so far so good...

Footnote:
I noticed that people don't like the way that you post your tutorials... In the beginning I also didn't like it, and I still don't like it, BUT I understand why you do it. Each post is a "chapter" and you are just "reserving" space so that your tutorial has a sequence without interruptions... anyway, these are my 2 cents (dos centavos).

thanks again for taking time for this tutorial. :)

---
ARod


Retro,
I like your stuff too! I also like the "chapter" style of posting. I bet that if you concatenated it all together it would not all fit in 1 message.

Those other guys got all hot about the "reserved" spaces before you filled them. Maybe you should write the stuff in notepad or something then do the filling really quickly before some people get hot under the collar.

Personally, I have not used your stuff, but I might need it in the future. I like the controversy it has created here and there. Personal attacks aside, the discussion of why your small stuff is not new or not "good" is interesting. Keep it up. If I learn why some stuff is not "good" and still works, or even works "better", then I have learned stuff that is very useful. But then I am an iconoclast from way back. I love new stuff, but the new stuff has to be BETTER than what it replaces. Otherwise I stick with the old stuff.

-Tony


Quote:
One in a thousand...Hmm, not too bad, however, the real probability is much, much lower than this.

The probability of this is, in fact, 0 (assuming that the crystal attached to the TOSC lines is really 32.768kHz and the cpu clock prescaler is set to 8 ).

Consider this. 6103 comes from this:

N = (F_CPU / 32.768kHz) * 200

Where the nominal cpu frequency is 1 MHz, and 200 is the number of ticks of the 32.768kHz crystal that we wait for in the sample. To find the actual cpu frequency we turn it around:

F_CPU = N * 32.768kHz / 200

If timer 1 overflowed then the count value would be at least 65536. Putting that into the equation we find that the actual F_CPU would have to be almost 11 MHz. Since OSCCAL at maximum can achieve at most twice the nominal frequency, this will be impossible. If you forgot to set the clock prescaler to 8, it may have gotten this high with OSCCAL at its highest. But even with the lowest OSCCAL, it would still result in a count of at least 32768, so it would not be possible to adjust it to within range. So if timer1 overflows, this is necessarily an error condition.
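
The turned-around formula is easy to evaluate in a few lines of C. This is just an illustrative sketch: the constant and function names are invented here, and it assumes, as in the post, a 32.768 kHz crystal and 200 ticks per sample.

```c
#include <assert.h>
#include <stdint.h>

#define CRYSTAL_HZ       32768u  /* watch crystal on the TOSC pins */
#define TICKS_PER_SAMPLE 200u    /* crystal ticks per measurement  */

/* CPU frequency in Hz implied by a timer1 count:
   F_CPU = N * 32768 / 200, rounded down. */
uint32_t cpu_hz_from_count(uint32_t count)
{
    /* 64-bit intermediate so that count * 32768 cannot overflow. */
    return (uint32_t)(((uint64_t)count * CRYSTAL_HZ) / TICKS_PER_SAMPLE);
}
```

A count of 6103 gives 999915 Hz, and the smallest overflowing count of 65536 gives 10737418 Hz, which is the "almost 11 MHz" figure.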

Both your routine and Atmel's original code are not fault tolerant. If any condition occurs that doesn't allow the count to get into range, it will loop forever (which in real time is a significant amount ;)). You've got to be careful about the acceptable range you use. If you get too small, you might not be able to get within that range. Calculating it I find that an adjustment of 1 to OSCCAL results in a change of about 18 in the count. So your range of 80 should work without a problem, but making it any narrower or lowering the number of ticks per sample could be dangerous.

Quote:
The one with it missing was slightly slower, but without the audible clues, no one would ever notice. So on a human scale there's not much difference.

If the OSCCAL setting is within range on the first pass, then the wait time is virtually zero. But if the setting is just out of range on the low side, the delay will be 1.5 secs. Whether or not this is acceptable would depend on the circumstances. If the calibration is being done upon waking from sleep in order to service a signal on the UART, this could be entirely unacceptable. Your suggestion of adjusting according to the size of the error is a good one for this type of thing. You could also use successive approximation.

Quote:
Since this particular routine is done shortly after power-up (or reset) the longer it takes to adjust the OSCCAL register, the better, because it gives the oscillator more time to "settle."

This is not a good thing since if the OSCCAL happens to be set so that it is within range in the first couple of passes, then the oscillator will drift out of acceptable range after the routine is run. You must let the oscillator settle before you run the calibration routine.

Regards,
Steve A.

The Board helps those that help themselves.


Koshchi wrote:
Since this particular routine is done shortly after power-up (or reset) the longer it takes to adjust the OSCCAL register, the better, because it gives the oscillator more time to "settle."

This is not a good thing since if the OSCCAL happens to be set so that it is within range in the first couple of passes, then the oscillator will drift out of acceptable range after the routine is run. You must let the oscillator settle before you run the calibration routine.

Excellent point Steve, you are 100% correct of course. One of the pit-falls of listing code fragments is that people have no idea what is going on before or after the routine is called. Typically there is a wait period before the OSCCAL routine is called so we are not relying on this one "delay" as our only way to allow the Oscillator to settle after power-up. The point I was trying to make is that the "extra" time spent trying to calibrate the oscillator is not a bad thing and might actually be beneficial for this particular application.

Thank you ever-so-much for verifying my "gut feeling" about the timer over-flow. I knew the probability of a "false positive" would be infinitely small, but I'm over-joyed to hear it's actually Zero. How did you get so smart?

If the objective was not to optimize for minimal size, perhaps a slow expanding of the tolerance range might be one way to prevent an infinite loop rather than a direct time-out. A less-than-perfect calibration might be better than something that is way off. I use +/- 40uS but I think it should work for up to +/-100uS as well. I haven't done the math; has anyone worked backwards from BAUD Rate tolerances to the OSCCAL tolerances for 19,200 to get some exact figures?

Thank you very much for the excellent feed-back in this, and many of my other threads.

Have a great day, Steve!


I got to thinking about successive approximation. Here's the algorithm:

Mask = 0b10000000; // start with the highest bit
Value = Mask;
for(8 times)	// for 8 bits of accuracy
{
	OSCCAL = Value;
	Wait for timeout;
	if (CountIsTooHigh)
	{	
		 //OSCCAL needs to be lower, so remove the bit
		Value = Value - Mask;
	}
	Mask >>= 1;	//shift the mask one bit
	Value = Value | Mask; //add the bit into the value
}

This is pretty simple. You only need to compare the actual count with the desired count. No need to test against a range since we are guaranteed to be accurate to the nearest value. This gets the count to within around 20, which is even better than Dan's routine. It is also much quicker. The average time for Dan's is around 0.75 sec. For this it is about 0.05 sec. Furthermore, the time is constant (within microseconds). It also has the advantage that it is more fault tolerant. The only way it could get stuck is if there is no crystal at all on the TOSC pins.

Using Dan's routine as a base here is my solution (CAVEAT: I have not had the time to check it yet, but I believe it is correct):

		CLR  TMP
		LDI  MASK, 0X40
TSTOSC: OR  TMP, MASK
		STS  OSCCAL, TMP
		OUT  TIFR1, FF 
		OUT  TIFR2, FF 
		STS TCNT1H, ZERO 
		STS TCNT1L, ZERO 
		STS TCCR1B, ONE 
		STS TCNT2, ZERO 
W6103: SBIS TIFR2,1    ;WAIT 
		 RJMP W6103 
		LDS  XL, TCNT1L
		LDS  XH, TCNT1H
		CPI  XH, 26
		BRLT NEXT
		SUB  TMP, MASK
NEXT: LSR  MASK
		BRCC TSTOSC
		RET

You'll notice that I changed the check of the high byte of TCNT1 to 26. What I did was increase the ticks per sample to 218 instead of 200. This makes the target count 6652, which is just 4 short of 0x1a00 (6656), which is close enough as to not matter. This eliminates the need for checking the low byte. Also I am ending the loop by seeing if the bit has been shifted into the carry, eliminating the need for a counter. This routine is only one line longer than Dan's in the loop.
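
To see how the bit-by-bit search converges without a Butterfly on the bench, here is a small host-side simulation in C. It is a sketch under loud assumptions: model_count is a HYPOTHETICAL stand-in for the timed sample (a made-up base of 2000 counts plus 75 counts per OSCCAL step), and sar_calibrate is an invented name. Only the loop structure mirrors the assembly above.

```c
#include <assert.h>
#include <stdint.h>

/* HYPOTHETICAL oscillator model, for illustration only: the timer1 count
   rises by 75 per OSCCAL step from a made-up base of 2000. */
uint16_t model_count(uint8_t osccal)
{
    return (uint16_t)(2000u + 75u * osccal);
}

/* 7-bit successive approximation, mirroring the assembly loop:
   set a bit, take a sample, drop the bit again if the high byte of the
   count has reached target_high, then move on to the next lower bit. */
uint8_t sar_calibrate(uint8_t target_high)
{
    uint8_t value = 0;
    for (uint8_t mask = 0x40; mask != 0; mask >>= 1) {
        value |= mask;                        /* OR   TMP,MASK            */
        uint16_t count = model_count(value);  /* stands in for the sample */
        if ((count >> 8) >= target_high)      /* CPI  XH,26 / BRLT NEXT   */
            value -= mask;                    /* SUB  TMP,MASK            */
    }
    return value;
}
```

With this made-up model, sar_calibrate(26) settles on 62 after seven samples: setting 62 reads 6650, just under 0x1A00, while setting 63 would read 6725.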

Edit: Corrected code
Changed BRGE NEXT to BRLT NEXT.
Changed LDI MASK, 0x80 to LDI MASK, 0x40

Regards,
Steve A.

The Board helps those that help themselves.

Last Edited: Tue. Apr 4, 2006 - 06:00 AM

You bring up a few excellent points Steve.

I have a confession to make: I'm actually even more of a Hack than I have already told you. However I'm a little gun-shy about posting my stuff because people seem so eager to tear-it-apart even though it works.

The ACTUAL routine that I use at HOME is only 15 lines long. I know there'll be questions about it, so I might as well just post it now and get-it-over-with. I was going to post in the future as an update.

So far I have not had a single failed connection; the tolerances are such that "Mister 19200 BAUD" never has a problem with it.

Here's the actual Code I'm using:

;------------------------------------;
; RETRO DAN's 15 LINE OSCCAL ROUTINE ;
;------------------------------------;
TSTOSC:  DEC    TMP        ;ADJUST OSCCAL
         STS    OSCCAL,TMP
         OUT	TIFR1,FF    ;RESET
         OUT	TIFR2,FF    ;OSC COUNTER
         STS	TCNT1H,ZERO ;START IT
         STS	TCNT1L,ZERO ;FROM ZERO
         STS	TCCR1B,ONE  ;READY...GO!
         STS    TCNT2,ZERO ;CLEAR TIMER
W6103:   SBIS	TIFR2,1    ;CHECK TIMER
          RJMP	W6103     ;6103uS PASSED?
         LDS    XL,TCNT1L  ;READ COUNTER
         LDS	XH,TCNT1H   ;CHECK ACCURACY
         CPI    XH,23      ;CHECK HIGH BYTE
          BRNE  TSTOSC     ;OUT-OF-RANGE
           RET

Wildman that I am, what I've done to save 2 program steps (4 bytes) is to remove any check on the lower-byte. Yeeha!

You hinted at a similar short-cut in your last post. As long as the "ball-is-in-the-park" it doesn't have a problem putting it into play! It calibrates extremely quickly and I've never experienced a failed connection.

If I'd offered Giorgos this version, he'd probably have accused me of trying to sabotage his work. Ha ha ha!


Last Edited: Fri. Apr 14, 2006 - 06:16 AM

Quote:
Wildman that I am, what I've done to save 2 program steps (4 bytes) is to remove any check on the lower-byte. Yeeha!

And throw your accuracy out the window.

With successive approximation, you are guaranteed to get better accuracy with every bit, which is why I always do all 8 iterations and don't try to bail early. My routine will always result in a count within about 20 of 6656, making the accuracy better than 0.3%.

Your routine needs a range. With your routine only checking the high byte, your range is now 5888 to 6143. At 5888 you are now 215 off of the target count of 6103. That makes the clock about 3.5% slow.
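
The 215 and 3.5% figures can be reproduced with a short C sketch. The helper names below are invented for the example; the only inputs taken from the thread are the target count of 6103 and the high-byte value 23 ($17).

```c
#include <assert.h>
#include <stdint.h>

#define TARGET_COUNT 6103u

/* Lowest and highest 16-bit counts whose high byte equals xh. */
uint16_t lowest_count(uint8_t xh)  { return (uint16_t)(xh << 8); }
uint16_t highest_count(uint8_t xh) { return (uint16_t)((xh << 8) | 0xFF); }

/* Largest distance from TARGET_COUNT that still passes a high-byte-only test. */
uint16_t worst_case_error(uint8_t xh)
{
    uint16_t lo = lowest_count(xh);
    uint16_t hi = highest_count(xh);
    assert(lo <= TARGET_COUNT && TARGET_COUNT <= hi);
    uint16_t below = (uint16_t)(TARGET_COUNT - lo);
    uint16_t above = (uint16_t)(hi - TARGET_COUNT);
    return below > above ? below : above;
}

/* The same error in tenths of a percent of TARGET_COUNT, rounded down. */
uint16_t worst_case_permille(uint8_t xh)
{
    return (uint16_t)((worst_case_error(xh) * 1000u) / TARGET_COUNT);
}
```

For xh = 23 the accepted counts run from 5888 to 6143, the worst-case error is 215 counts, and worst_case_permille returns 35, i.e. about 3.5%.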

Regards,
Steve A.

The Board helps those that help themselves.


That's why I only use it at home, Steve; it's nowhere near as accurate, and I would not have offered this version to Giorgos.

However, if we are to be true to our initial goal of reducing size, all other considerations such as accuracy go out the window as long as the routine continues to perform as well as or better than the original. This one hasn't failed yet.

Don't get me wrong however, I highly value your input, and another version of OSCCAL that can calibrate to a much higher accuracy in a fraction of the time must have much-needed applications also.

As proof that I value your routine, I actually tested it. I remembered that the branch after the test needed to be pointed to the top of the routine. However, I must have done something else wrong because it failed on me. I set OCR2A=218 (not 200). Is there something else I overlooked? I'd love to see this routine work. It's still much smaller than the original 30-plus line version and should be super-fast and super-accurate.


TITLE: COMPLETE SUPER-CONDENSED OSCCAL ROUTINE IN ASM

For those of you that want to put your programs on a diet, here is the complete useable routine including set-up. It's about 24 lines. Most OSCCALs I've seen run 50 to 60 lines. If you decide to use it, please remember to mention me.

;------------------------------------------;
; CALIBRATE INTERNAL RC OSCILLATOR: OSCCAL ;
;------------------------------------------;
RETRO_CAL: STS  CLKPR,V128  ;DROP CLOCK TO 1Mhz
           STS  CLKPR,THREE ;
           STS  TIMSK1,ZERO ;SETUP TIMER1
           STS  TCCR1B,ONE  ;TC1 = 1MHz = 1uS 
           STS  TIMSK2,ZERO ;SETUP TIMER2
           STS  ASSR,EIGHT  ;CLOCK FROM 32KHz OSCILLATOR
           LDI  TMP1,200    ;OCR2A = 200
           STS  OCR2A,TMP1  ;200 / 32768Hz ~= 6103uS
           STS  TCCR2A,ONE  ;TC2 = 32768Hz ~= 30.5176uS 
           LDS  TMP,OSCCAL  ;READ OSCCAL
TSTOSC:    STS  OSCCAL,TMP  ;WRITE OSCCAL
           DEC  TMP         ;ADJUST OSCCAL
           OUT  TIFR2,FF    ;RESET FLAG
           STS  TCNT1H,ZERO ;STARTEM @ZERO
           STS  TCNT1L,ZERO ;
           STS  TCNT2,ZERO  ;
           STS  TCCR1B,ONE  ;READY...GO!
W6103:     SBIS TIFR2,1     ;CHECK TIMER
            RJMP W6103      ;6103uS PASSED?
           LDS  XL,TCNT1L   ;READ COUNTER
           LDS  XH,TCNT1H   ;
           CPI  XH,23       ;CHECK HIGH BYTE
            BRNE TSTOSC     ;WAY OUT-OF-RANGE
             RET            ;ITS MILLER TIME!

For the less adventurous among you that would like to maintain that nice "tight" +/-40uS calibration, simply add these two lines between the BRNE TSTOSC and final RET.

         CPI XL,175     ;CHECK LOW BYTE
          BRLO TSTOSC   ;OUT-OF-RANGE

If nothing else, perhaps this routine can form the basis for your very own OSCCAL routine, like Steve's super-fast, super-accurate version. Happy Programming!

Last Edited: Wed. Apr 5, 2006 - 09:38 PM

I have corrected an error in my successive approximation code above. The line:

BRGE NEXT

has been changed to:

BRLT NEXT

Also, in looking at the datasheet, I noticed that OSCCAL is only 7 bits, not 8, so I could change my initial mask value to 0x40 instead of 0x80.

A couple things I noticed in testing the code.

First, it is possible that the second to last approximation could actually be closer to the target value than the last approximation, so the value could be one off. However the maximum error will still be about 1.1%

Second, the count value between two successive OSCCAL values is much greater than I had expected. For my routine (with 218 ticks) it is about 80. For 200 ticks it's about 75. So Dan, narrowing your range to 80 is getting pretty dangerous.

Both of these things affect my accuracy. The error is about four times what I had calculated. I'm still thinking about how to get my extra bit back. If I can, I'll get half my accuracy back.

One more thing, and this applies to both our routines. The line:

STS TCCR1B,ONE

to enable timer1 doesn't need to be there in the loop since we are never disabling it. This line can be moved to before the loop.

Regards,
Steve A.

The Board helps those that help themselves.


This line can go also:

         OUT   TIFR1,FF    ;RESET 

Thereby saving another program step.


Hey,

Wow - that's an in depth tutorial! You weren't kidding ;-)

However this is a note to RetroDan and other people who might follow suit:

Please try to keep as much in one message as possible, and not to post multiple messages for each section.

The reasons for this are multiple:

-The posts will skew search results, one search for "OSCCAL" will yield individual hits for each message and not one hit for the whole message

-Seems to inflate post count

Again thanks for your tutorial though, it is a great asset! However if possible it would be greatly appreciated if these could be put in a few longer messages. The search function is one of the most useful things at AVRFreaks, and you can see with this example. Notice how it takes until the end of page 2...

Warm Regards,

-Colin O'Flynn


Actually Dan,

Why can't you just put your tutorials into PDF files? I think that would be much more beneficial.

Great job, though!

You can avoid reality, for a while.  But you can't avoid the consequences of reality! - C.W. Livingston


I was looking at the C code and noticed a couple of things. After setting the registers for timer2, the code waits for the update of the registers to be complete (with timer2 in asynchronous mode the update is not immediate). However when timer2 is cleared within the loop, this wait is not done. I wondered if this would make a difference in the count, so I tested it. I am getting values that are consistently lower by about 32 when I wait for the update busy flag to clear than when I don't. (By the way, I moved the clearing of timer2 to just above the clearing of timer1 so that timer1 is not busy counting while I'm waiting for timer2.) I also noticed that the C code is centering on a count value of 6185 instead of 6103. This may have been done to compensate for the low readings that I observed. They may have even used an oscilloscope to calibrate the calibration routine. I don't have a scope. Maybe someone with a scope can check and see what actual frequency we are getting with our routines.

Regards,
Steve A.

The Board helps those that help themselves.


In another thread you said:

Quote:
I've actually run tests taking it down to +/-5uS to see if it would speed-up communications by lowering transmission errors. At +/-5uS it can take some time to calibrate.

You're fooling yourself. You couldn't possibly have calibrated to within +/-5us. As I have stated above, changing the OSCCAL by one changes the count by around 75us. +/-5us would put the oscillator to within less than 0.1%. The oscillator itself will vary far more than this at any single OSCCAL setting.
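
One way to picture the problem is to treat the possible readings as a grid with a spacing of about 75 counts and ask how often any grid point lands inside the window. The C sketch below does that. It is an idealised model (perfectly even spacing, no jitter), which is an assumption made here and not something measured, and the function name is invented.

```c
#include <assert.h>

/* Counts the grid alignments (0 .. step-1) for which at least one reading
   of the form offset + k*step falls inside the window [lo, hi].
   ASSUMPTION: readings are perfectly evenly spaced and free of jitter. */
unsigned alignments_with_hit(unsigned step, unsigned lo, unsigned hi)
{
    assert(step > 0 && lo <= hi);
    unsigned hits = 0;
    for (unsigned offset = 0; offset < step; offset++) {
        for (unsigned x = lo; x <= hi; x++) {
            if (x % step == offset) {   /* x is reachable with this alignment */
                hits++;
                break;
            }
        }
    }
    return hits;
}
```

With a step of 75, the +/-40 window (6063 to 6143) is hit for all 75 alignments, while the +/-5 window (6098 to 6108) is hit for only 11 of the 75. In this model a +/-5 window would usually contain no OSCCAL setting at all, so an exit from the loop has to come from the oscillator's own variation.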

The maximum number of loops through your routine before it finds the best OSCCAL setting should be 127 since this is the number of settings that OSCCAL is capable of, and you are stepping through them. This means that the maximum time your routine should take is about 0.8 sec (I had said 1.5 earlier, but I had thought that OSCCAL was 8 bits).

The only reason why you get out of your calibration routine at all is that through the natural variance of the oscillator, the count happens to fall within your range on one of your passes. The answer will be correct, but at any given time if you checked the count again (at that OSCCAL setting), it will fall out of your range most of the time.
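
The timing estimates in this thread all come from the same product: number of samples times the length of one sample. Here is a rough C sketch of that arithmetic. The function name is invented, and loop overhead is ignored, which is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Time in microseconds for `samples` timed samples of `ticks` crystal
   ticks each, with a 32.768 kHz crystal. Loop overhead is ignored. */
uint32_t calibration_time_us(uint32_t samples, uint32_t ticks)
{
    assert(ticks > 0);
    return (uint32_t)(((uint64_t)samples * ticks * 1000000u) / 32768u);
}
```

Stepping through 127 settings at 200 ticks each comes to about 0.78 sec, 255 settings to about 1.56 sec (the earlier 1.5 sec figure), and seven successive-approximation samples of 218 ticks to under 0.05 sec.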

Regards,
Steve A.

The Board helps those that help themselves.


Of course you are correct Steve. I was just screwin' around one day and wanted to see how small I could make the target range before I ran into trouble. I really wanted to see how long on a human scale it would take to really push-the-limit, I was also curious as to how much it actually "wandered" around the target point. Sometimes there's nothing more fun than just doing something crazy just to see what happens. Quite often the results are not what is expected and can be useful.

One idea that I had that might speed-up the original routine without the penalty of adding more lines is to step by three instead of one. It will "wrap" three-times-as-fast, and if by chance it can't find a good value first-time-around, the second and third trips will hit all those values it missed the first time through (3 and 256 share no common factor).
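
Whether a bigger stride still reaches every OSCCAL value is easy to check on a PC. The C sketch below simply walks an 8-bit value with wrap-around and counts the distinct values it lands on. The function name is invented for the example, and a stride of 253 is the 8-bit equivalent of stepping down by three.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct 8-bit values visited in `steps` steps of size `stride`,
   starting at `start` and wrapping modulo 256 like a real register. */
unsigned distinct_values_visited(uint8_t start, uint8_t stride, unsigned steps)
{
    uint8_t seen[256] = {0};
    unsigned distinct = 0;
    uint8_t v = start;
    for (unsigned i = 0; i < steps; i++) {
        if (!seen[v]) {
            seen[v] = 1;
            distinct++;
        }
        v = (uint8_t)(v + stride);
    }
    return distinct;
}
```

A stride of 3 (or 253) visits 86 values on its first trip around and all 256 values after 256 steps, i.e. three trips. An even stride such as 2 never gets past 128 distinct values, so the stride has to be odd.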

Last Edited: Wed. Apr 5, 2006 - 09:44 PM

alejmrm wrote:

Hey guys,

If this helps, I have the C version of OSCCAL_retrocal(), but right now I am in the office so allow me 8 hours to return home, and I will post the OSCAL_minical (function that is small from retroDan tutorial and still readable for human beings) and the OSCCAL_retrocal(function that is even small but not readable for normal mortals)... and the functions takes the same space as the asm... I am using gcc... so I think is really to go in any project that use the win-avr gcc compiler...


You ported my routine to "C"? That's fantastic!!! Guess that's one benefit of doing a detailed tutorial.

You don't mind posting it when you are done do you? I wonder how much "overhead" C will add, if any to length? Be interesting to find-out. Can it really be the same size?

And thanks for giving the routine a name, although it sounds like a new diet food to me: "Retro-Cal the new diet craze for over-weight OSCCALs." Ha ha ha!

Happy Programming!


koshchi wrote:

The maximum number of loops through your routine before it finds the best OSCCAL setting should be 127 since this is the number of settings that OSCCAL is capable of. . .
but I had thought that OSCCAL was 8 bits.

I checked '169 datasheet and OSCCAL uses all 8 bits, only not as one would suspect. There are two frequency ranges that are overlapped/interleaved:

Atmega169.pdf wrote:
. . .The oscillator can be calibrated to any frequency in the range 7.3 - 8.1 MHz within ±1% accuracy. . .

The CAL7 bit determines the range of operation for the oscillator. Setting this bit to 0 gives the lowest frequency range, setting this bit to 1 gives the highest frequency range.

The two frequency ranges are overlapping, in other words a setting of OSCCAL = 0x7F gives a higher frequency than OSCCAL = 0x80.

The CAL6..0 bits are used to tune the frequency within the selected range. A setting of 0x00 gives the lowest frequency in that range, and a setting of 0x7F gives the highest frequency in the range. Incrementing CAL6..0 by 1 will give a frequency increment of less than 2% in the frequency range 7.3 - 8.1 MHz.

Last Edited: Thu. Apr 6, 2006 - 01:21 AM

Ok, it took me a little bit more time because Retro made some changes to reduce few more bytes. but any way, here is my C version of retro's code.

/*********************************************************** OSCCAL_retrocal */
/* NAME               : OSCCAL_retrocal
**
** PURPOSE            : Calibrate the internal OSC, using an 32,768 kHz crystal 
**                      as reference
**
** DESIGN REFERENCE   : Original author: retroDan from avrfreaks.net forum:
**
**                      [DIS] [ASM] Program Tricks: Shrinking the OSCCAL Routine
**
**                      converted to C function by alejmrm.
**
**                      LOWER = IDEALTIME - 40uS
**                      LOWER = 6103 - 40
**                      LOWER = 6063 = $17:AF
**
**                      UPPER = IDEAL + 40uS
**                      UPPER = 6103 + 40
**                      UPPER = 6143 = $17:FF
** 
** INPUT PARAMETERS   : None
** PERMITTED RANGE    : OSCCAL register
**
** RETURN PARAMETERS  : None
**
** REVISION HISTORY   : 1.0, 04/01/2006, AR, First release.
**
******************************************************************************/
void OSCCAL_retrocal(void)
{
    register uint8_t    temp;
    register uint8_t    val_255 =0xFF;
    register uint8_t    val_1 = 0x01;
    union x_t{
        uint8_t     val[2];
        uint16_t    ref;
    }regx;
    
    // set the CPU Frequency to 1MHz (8Mhz / 8 = 1Mhz)
    CLKPR   = (1<<CLKPCE);        
    CLKPR   = (1<<CLKPS1)|(1<<CLKPS0);
    // setup timer1 TC1= 1MHz = 1usec
    TIMSK1  = 0;
    TCCR1B  = val_1;
    // setup timer2
    TIMSK2  = 0;
    // set 32,768kHz osc as source for timer2
    ASSR    = (1<<AS2);
    // 200 / 32768Hz ~= 6103uS
    OCR2A   = 200;
    // TC2 = 32768Hz ~= 30.5176uS
    TCCR2A  = val_1;

    temp = OSCCAL;
    for(;;)
    {
        OSCCAL  = temp;
        // adjust oscal
        temp--;
        // reset counter
        TIFR2   = val_255;
        // start them @ zero
        TCNT1H  = 0;
        TCNT1L  = 0;
        TCNT2   = 0;
        // ready..go!
        TCCR1B  = val_1;
        //check timer, 6103 usec passed?
        while ( !(TIFR2 & (1<<OCF2A)) );


        regx.ref     = TCNT1;
        if( regx.val[1] != 23){
            continue;
        }else if(regx.val[0] < 175){
            continue;
        }
        break;
    }
    return;
}

After compiling and a quick verification of the assembler generated by the compiler, this function is about 6 or 8 bytes bigger than retroDan's RETRO_CAL assembler function. The main reason is the "wildcard" declarations that he has implemented within registers (ZERO, ONE, TWO, THREE, V128, FF, etc.). In my case, these declarations are local to the function at best, and not global to the application.
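
A side note on the union used above to pick the count apart: val[1] is the high byte only on a little-endian target, which the AVR with gcc is. A shift gives the same byte regardless of byte order. The C sketch below compares the two approaches on whatever machine it runs on; all names in it are invented for the illustration.

```c
#include <assert.h>
#include <stdint.h>

union count_bytes {
    uint8_t  val[2];
    uint16_t ref;
};

/* High byte read through the union: correct only on little-endian targets. */
uint8_t high_via_union(uint16_t count)
{
    union count_bytes u;
    u.ref = count;
    return u.val[1];
}

/* High byte read with a shift: independent of byte order. */
uint8_t high_via_shift(uint16_t count)
{
    return (uint8_t)(count >> 8);
}

/* Returns 1 when the machine stores the low byte first. */
int host_is_little_endian(void)
{
    union count_bytes u;
    u.ref = 1;
    return u.val[0] == 1;
}
```

On a little-endian host both functions return 0x17 for a count of 0x17AF. Whether the shift version costs any extra bytes on the AVR depends on the compiler and optimisation level, so that would need checking in the generated assembler.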

One more thing (Jobs quote): I tried this function before posting. In fact, I have been using it for 2 days, and so far so good.

---
ARod


For those interested in a function that is less fat than Atmel's OSCCAL function, but where we are still able to follow the flow of events, here is OSCCAL_minical.

/************************************************************* OSCCAL_minical */
/* NAME               : OSCCAL_minical
**
** PURPOSE            : Calibrate the internal OSC, using an 32,768 kHz crystal 
**                      as reference
**
** DESIGN REFERENCE   : Original author: retroDan from avrfreaks.net
**                      [DIS] [ASM] Program Tricks: Shrinking the OSCCAL Routine
**                      converted to C function by alejmrm.
**
** 
** INPUT PARAMETERS   : None
** PERMITTED RANGE    : OSCCAL register
**
** RETURN PARAMETERS  : None
**
** REVISION HISTORY   : 1.0, 04/01/2006, AR, First release.
**
******************************************************************************/
void OSCCAL_minical(void)
{
    register uint8_t    temp;
    register uint8_t    val_256 =0xFF;
    uint16_t            ref;

    // set the CPU Frequency to 2MHz (8Mhz / 4 = 2Mhz)
    CLKPR   = (1<<CLKPCE);        
    CLKPR   = (1<<CLKPS1);
    // set 32,768kHz osc as source for timer2
    ASSR    = (1<<AS2);
    // disable any interrupt sources
    TIMSK1  = 0;
    TIMSK2  = 0;
    // start timers, both timers have the same bit location
    temp    = 1;
    TCCR1B  = temp;
    TCCR2A  = temp;
    // wait for TCR2UB to be cleared
    while(ASSR & (1<<TCR2UB));

    // Wait 1 sec for external crystal to stabilise
    // (31 * 32.77ms) approx. 1 sec.
    temp    = 31;
    do
    {
        TIFR1   = (1<<TOV1);    // flag bits are cleared by writing a one
        while( !(TIFR1 & (1<<TOV1)) );
    }while(temp--);
    // reset TCNT2 only
    TCNT2   = 0;

    temp = OSCCAL;
    for(;;)
    {
        // adjust oscal
        temp--;
        OSCCAL  = temp;
        // reset counters
        TIFR1   = val_256;
        TIFR2   = val_256;
        // wait for the overflow on timer2 (256 crystal ticks)
        while ( !(TIFR2 & (1<<TOV2)) );
        // read timer
        ref     = TCNT1;
        TCNT1   = 0;

        // check too fast?
        if( TIFR1 & (1<<TOV1) ) continue;
        if( ref > (uint16_t)UPPER_LIMIT ) continue;
        temp   += 2;
        // check too slow?
        if( ref < (uint16_t)LOWER_LIMIT ) continue;
        // we are ok!
        break;
    }
    return;
}

Also here are the constants that I am using.

// 19200bps, 2 MHz, Double Speed, Error +0.2%
#define UPPER_LIMIT 0x3D6C      // 15724
#define LOWER_LIMIT 0x3C73      // 15475
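
For readers wondering where limits around 15600 come from, here is a hedged C sketch of the arithmetic. It assumes that one sample in OSCCAL_minical is a full timer2 overflow, i.e. 256 ticks of the 32.768 kHz crystal, with timer1 running at 2 MHz. The helper names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define CRYSTAL_HZ 32768u

/* Ideal timer1 count for one sample of `ticks` crystal ticks at cpu_hz. */
uint32_t ideal_count(uint32_t cpu_hz, uint32_t ticks)
{
    return (uint32_t)(((uint64_t)cpu_hz * ticks) / CRYSTAL_HZ);
}

/* Signed deviation of `limit` from `ideal`, in hundredths of a percent. */
int32_t deviation_centipercent(uint32_t limit, uint32_t ideal)
{
    assert(ideal > 0);
    return (int32_t)((((int64_t)limit - (int64_t)ideal) * 10000) / (int64_t)ideal);
}
```

Under those assumptions the ideal count is exactly 15625, so UPPER_LIMIT sits about 0.63% above it and LOWER_LIMIT about 0.96% below it.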

Edited 4/6/06 function name match comments.

---
ARod

Last Edited: Thu. Apr 6, 2006 - 02:39 PM

Quote:
I checked '169 datasheet and OSCCAL uses all 8 bits, only not as one would suspect. There are two frequency ranges that are overlapped/interleaved:

This is for the mega169P. The previous version (the one that is probably in your Butterfly) did not have this feature. I checked it on mine and found that I am getting the same results whether or not bit 7 is set. Also, in reading the OSCCAL the high bit is always clear.

Regards,
Steve A.

The Board helps those that help themselves.


Thanks a million Steve!

Geez, had no idea I could not rely on Data Sheets.. so much for RTFM eh?

My Butterflies are only a month old. They have ATmega169Vs. Does that mean the OSCCAL is only 128 steps "wide", as far as you know?

Koshchi wrote:
. . . changing the OSCCAL by one changes the count by around 75us. +/-5us would put the oscillator to within less than 0.1%.

I must be doing something wrong, Steve. I got a value of 12.5uS per OSCCAL step. What are your calculations for getting 75? I think my math's off.

ARod (alejmrm), thank you so very much for posting the "C" version. As most people are programming in "C" these days, I'm sure it will be greatly appreciated. I'm afraid ASM programmers like me are headed for the bone-yard. I posted a special thanks and notice on Page 1 also.


Quote:
Geez, had no idea I could not rely on Data Sheets.. so much for RTFM eh?

The mega169 and the mega169P datasheets are listed separately on Atmel's site. You just have to look at the correct datasheet. There is a mega169PV as well that has the 8-bit OSCCAL. I don't believe that the P versions are available yet, so I'm pretty sure yours are the older version.

Quote:
what are your calculations for getting 75 I think my math's off

I didn't use math. I tried that myself and got something completely different than 75 as well. The 75 I got from actual values from my butterfly. I have debug code for my successive approximation routine that saved out all the counts and OSCCAL settings.

Quote:
I'm afraid ASM programmers like me are headed for the bone-yard.

There will always be a need for us. There are some things that you simply can't do in C.

Regards,
Steve A.

The Board helps those that help themselves.


Koshchi wrote:

The mega169 and the mega169P datasheets are listed separately on Atmel's site. You just have to look at the correct datasheet.

Here's the confusing part Steve. First page of Datasheet says it's for the Atmega169V(?) Go figure!

Koshchi wrote:
I didn't use math. I tried that myself and got something completely different than 75 as well. The 75 I got from actual values from my butterfly. I have debug code for my successive approximation routine that saved out all the counts and OSCCAL settings.

Wow, thanks for clearing that mystery up. I tried a lot of things but never got anything that resembled 75uS.

I found a graph of OSCCAL versus Frequency for the '169s and it seems that it's "set-up" so that 8MHz occurs at OSCCAL=96. I think you mentioned somewhere about Atmel starting at 96 and were wondering why.

I'm going to make an immediate change to my routine and start my OSCCAL value slightly above this. Since I decrement on each error, it should "fall into place" super-fast; on the odd occasion that it doesn't, it will "wrap around."
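
The decrement-and-wrap idea can be tried out on a PC with a toy model. This is only a sketch: `model_count` is a made-up linear stand-in for the real oscillator (its numbers are invented, and it ignores bit 7 as Steve observed on the older '169), and `decrement_search` is a hypothetical name, not the ASM routine from this thread.

```c
#include <stdint.h>

#define UPPER_LIMIT 15724u   /* limits from ARod's post */
#define LOWER_LIMIT 15475u

/* HYPOTHETICAL oscillator model: the count rises 75 per OSCCAL step,
   and only the low 7 bits of OSCCAL have any effect. */
static uint16_t model_count(uint8_t osccal)
{
    return (uint16_t)(9600u + 75u * (uint16_t)(osccal & 0x7Fu));
}

/* Decrement on every out-of-window reading; uint8_t wraps 0 -> 255.
   Returns the OSCCAL value that landed in the window and stores the
   number of decrements taken in *steps. */
static uint8_t decrement_search(uint8_t start, unsigned *steps)
{
    uint8_t  cal = start;
    unsigned n   = 0;

    for (;;) {
        uint16_t ref = model_count(cal);
        if (ref >= LOWER_LIMIT && ref <= UPPER_LIMIT)
            break;          /* inside the window: done */
        cal--;              /* any error: step down (wraps below 0) */
        n++;
    }
    *steps = n;
    return cal;
}
```

With this invented model, a start just above the target settles after a handful of decrements, while a start below the target has to run down through zero and wrap before it lands, which is the slow "odd occasion" case.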

A fellow mentioned in another thread that you can't move OSCCAL in large steps or you might screw up everything, including the MCU.

Randy Ott wrote:
Later devices, such as the tiny2313 have a warning in the data sheet that says:

Avoid changing the calibration value in large steps when calibrating the calibrated internal RC Oscillator to ensure stable operation of the MCU. A variation in frequency of more than 2% from one cycle to the next can lead to unpredictable behavior. Changes in OSCCAL-register should not exceed 0x20 for each calibration.

I wonder if this could also apply to the M169?

The above quote is very confusing to me, because the data sheet says that a single step of OSCCAL will vary the oscillator by 1% (or was it 2%), which is the maximum according to the above, and then they say you can push it $20 at a time(?)
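
If one reads the 0x20 figure simply as a cap on how far a single write may move OSCCAL, that reading could be sketched as below. `step_toward` and `MAX_OSCCAL_STEP` are hypothetical names for illustration, not from any datasheet or from the routines in this thread, and whether the limit applies to the M169 at all is still the open question.

```c
#include <stdint.h>

#define MAX_OSCCAL_STEP 0x20u   /* per-write limit quoted from the tiny2313 data sheet */

/* Return the next value to write so that OSCCAL moves toward target
   without changing by more than MAX_OSCCAL_STEP in one write. */
static uint8_t step_toward(uint8_t current, uint8_t target)
{
    if (target > current) {
        uint8_t gap = (uint8_t)(target - current);
        if (gap > MAX_OSCCAL_STEP)
            gap = MAX_OSCCAL_STEP;
        return (uint8_t)(current + gap);
    } else {
        uint8_t gap = (uint8_t)(current - target);
        if (gap > MAX_OSCCAL_STEP)
            gap = MAX_OSCCAL_STEP;
        return (uint8_t)(current - gap);
    }
}
```

On the target, a loop along the lines of "while OSCCAL differs from the target, write step_toward(OSCCAL, target)" would walk the register there in bounded jumps. A routine that only ever moves OSCCAL by one per pass stays well under such a cap anyway.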



Quote:
I found a graph of OSCCAL versus Frequency for the '169s and it seems that it's "set-up" so that 8MHz occurs at OSCCAL=96.

Hmmm... In one of my Butterflies the preset value is 80, and in the other it is 74.

Regards,
Steve A.

The Board helps those that help themselves.


Is that after the factory bootloader does its own OSCCAL calibration and launches your program, or are those figures from a "fresh" startup, i.e. no bootloader interference?
