Calibration sometimes goes wrong

#1

Hi Folks,

I have a small problem.
I'm using an external 32 kHz XTAL to calibrate the internal 8 MHz oscillator of an ATmega644P to 7.3728 MHz.

The method I use is from Application Note "AVR055: Using a 32kHz XTAL for run-time calibration of the internal RC".

But sometimes this calibration goes completely wrong: the CPU runs much faster (the LED blinks faster) and the serial connection is broken.

I have a bootloader which calibrates to 8 MHz, and because we cannot change the bootloader in this project phase, I have to do the calibration to 7.3728 MHz at the beginning of the application. But I have no idea why this happens.

I'm getting lost on this whole calibration thing.

#2

I presume that you are intending to calibrate a whole production run of mega644P's.

You can look at the factory calibration byte and observe its value.

If your re-calibration is going to encounter the discontinuity, you can either use smaller steps or adjust the algorithm.

Looking into my crystal ball, it says that you want to achieve 115200 baud. With 7.3728 MHz you can have UBRRxL = 3, or UBRRxL = 7 with U2X=1.

You can actually achieve 115200 baud with UBRRxL = 8 and U2X=1 at an oscillator frequency of 8.2944 MHz. That involves a +3.7% frequency change rather than the -7.8% change needed to get down to 7.3728 MHz.
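
If anyone wants to double-check that arithmetic, here is a minimal plain-C sketch (the baud() helper is my own, not an AVR library function); the datasheet formula is baud = f_osc / (16*(UBRR+1)), or f_osc / (8*(UBRR+1)) with U2X=1:

    #include <stdio.h>

    /* Datasheet formula: baud = f_osc / (16*(UBRR+1)),
       or baud = f_osc / (8*(UBRR+1)) when U2X=1. */
    static unsigned long baud(unsigned long f_osc, unsigned int ubrr, int u2x)
    {
        return f_osc / ((u2x ? 8UL : 16UL) * (ubrr + 1));
    }

    int main(void)
    {
        printf("%lu\n", baud(7372800UL, 3, 0));  /* 115200 */
        printf("%lu\n", baud(7372800UL, 7, 1));  /* 115200 */
        printf("%lu\n", baud(8294400UL, 8, 1));  /* 115200 */
        return 0;
    }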

If you have a substantial number of AVRs to "calibrate" you want a foolproof system.

David.

#3

david.prentice wrote:

Looking into my crystal ball, it says that you want to achieve 115200 baud. With 7.3728 MHz you can have UBRRxL = 3, or UBRRxL = 7 with U2X=1.

Damn, I knew I should have lived in Harry Potter country.

I always have to read the DS :-(

/Bingo

#4

In your data sheet look at:
Figure 28-38. Calibrated 8 MHz RC Oscillator Frequency vs. OSCCAL Value.

If the factory calibration value is going to cross the discontinuity, you will see a strange effect.

As a rule of thumb, if the factory calibration byte is more than 140 (0x8C), you can probably get down to 7.3728 MHz before hitting 0x80.
Likewise, if the calibration byte is less than 120 (0x78), you can probably get up to 8.2944 MHz without hitting 0x80.

But as Lee (theusch) noted, in a large batch of AVRs you will probably get some end-of-range individuals. And the factory calibration is only promised to be within 10%.

David.

#5

Ah, "that" discontinuity. :D
Basically the algorithm in AVR055 works like this:
Set OSCCAL to 0x40 and count the clock.
If the count is smaller than expected, add 0x20; if bigger, subtract 0x20.
Count the clock again.
If smaller, add 0x10; if bigger, subtract 0x10.
And so on, until the add/subtract value reaches 0x00.
Then do a neighborhood search: test some values around the one found and check whether they are a better match.

If the result is not OK, do a search in the upper range of the OSCCAL value, starting at 0xC0.

This way the algorithm shouldn't be able to run into the discontinuity.
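
A minimal sketch of that successive-approximation search, as I understand it (my own reconstruction, not the app note's code; count_clock() is an assumed helper standing in for AVR055's timer-versus-32kHz counting loop, and TARGET_COUNT is just an example value):

    #include <avr/io.h>

    extern unsigned int count_clock(void);  /* assumed: cycles counted per XTAL window */
    #define TARGET_COUNT 4098               /* example target count */

    /* Binary search within one half of the OSCCAL range:
       start = 0x40 for the lower half, 0xC0 for the upper half. */
    unsigned char search_half(unsigned char start)
    {
        unsigned char osccal = start;
        unsigned char step;

        for (step = 0x20; step != 0; step >>= 1) {
            OSCCAL = osccal;
            if (count_clock() < TARGET_COUNT)
                osccal += step;  /* too slow: raise the frequency */
            else
                osccal -= step;  /* too fast: lower the frequency */
        }
        return osccal;  /* the neighborhood search then refines this value */
    }

Starting at 0x40 with a first step of 0x20, the value can never leave the 0x01-0x7F range, which is why the search itself should never cross the discontinuity.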

From my understanding, the discontinuity cannot cause problems when you cross it on purpose; e.g. starting the calibration by switching from the factory value 0xA3 to the start value 0x40 should not be a problem. It only causes problems when the calibration algorithm cannot handle the jump in frequency when stepping from OSCCAL 0x7F to 0x80.

My calibration gives a good result in the lower OSCCAL range, and sometimes in the higher one, which means the lower search failed. And sometimes the chip just runs way too fast, because neither the lower-range nor the upper-range search worked correctly.

Oh, and not to forget: of course the global interrupt flag is cleared, so that nothing can interrupt while counting.

#6

OK, now I have found out that one controller here, which fails a lot of the time, sometimes gets a calibration value of 128, and sometimes, when it ends up running too fast, a calibration value of 127, which makes it way too fast. I can see that it is possible for the algorithm to run over the discontinuity during the neighbor search, but the algorithm should recognize this, because the count value will go way up when switching to 127, and so it should go back to the old value (128) and use that.
Or can there be some strange behaviour when switching over the discontinuity that results in bad counting, so that the algorithm thinks the value is better?

Maybe I should just stay in the lower OSCCAL region when calibrating for 7.3728 MHz.

#7

Baldrian wrote:

OK, now I have found out that one controller here, which fails a lot of the time, sometimes gets a calibration value of 128, and sometimes, when it ends up running too fast, a calibration value of 127, which makes it way too fast. I can see that it is possible for the algorithm to run over the discontinuity during the neighbor search, but the algorithm should recognize this, because the count value will go way up when switching to 127, and so it should go back to the old value (128) and use that.

128 is at a local minimum. If 128 is too fast, there is no way to get where you want with a local improving search. The algorithm "if too fast, decrease by 1; if too slow, increase by 1" will work if there is a solution.

Quote:

Or can there be some strange behaviour when switching over the discontinuity that results in bad counting, so that the algorithm thinks the value is better?

Maybe I should just stay in the lower OSCCAL region when calibrating for 7.3728 MHz.

Iluvatar is the better part of Valar.

#8

skeeve wrote:

128 is at a local minimum. If 128 is too fast, there is no way to get where you want with a local improving search. The algorithm "if too fast, decrease by 1; if too slow, increase by 1" will work if there is a solution.

Yeah, but the algorithm should normally go like this:
Search the lower range. The result is not 100%, so search the upper range.
In the upper range, the worst case gets down to 129 at most. Then the neighbor search starts: because the last step went down without reaching a 100% good value, test the next 4 smaller values:

128: hmm, still too fast, but better than 129.
127: wth, much too fast and worse than 128.
126: holy moses, still much faster than the count at 128.
125: merde, that is also way faster than 128.
So go back to 128 and live with that.

But it seems that this is not what happens. Most of the time the algorithm ends up with a value in the 125-127 region and runs way too fast. That's why I'm asking whether there can be side effects when switching through the discontinuity.

But if there are no hardware side effects, I will find the problem in the algorithm. :evil:
I don't want to just take the easy way out and calibrate only in the lower region. I want to know why it happens, not just how to avoid it without knowing the reason. :lol:

#9

Quote:

I don't want to just take the easy way out and calibrate only in the lower region.

What about the simpler algorithm: just step (by one) in the "right" direction until you reach Nirvana, or until you run out of counts?

david.prentice referred to where I discussed this before.

In practice, once you get close, continued recals probably only move a bit one way or the other over time and temperature.

[As mentioned, I also tried to be "cute" and take proportional steps until I got caught in a time-space continuum situation--errr, jumped into the discontinuity. So I reverted to the brute-force one-step-at-a-time.]
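
A sketch of that brute-force version (my illustration, not Lee's actual code; count_clock() and TARGET_COUNT are the same assumed helper and example target as in the earlier sketch):

    #include <avr/io.h>

    extern unsigned int count_clock(void);  /* assumed measurement helper */
    #define TARGET_COUNT 4098               /* example target count       */
    #define TOLERANCE    2                  /* "close enough" band        */
    #define MAX_STEPS    128                /* give up: ran out of counts */

    void step_calibrate(void)
    {
        unsigned char i;
        for (i = 0; i < MAX_STEPS; i++) {
            unsigned int count = count_clock();
            if (count > TARGET_COUNT + TOLERANCE)
                OSCCAL--;   /* too fast: step down */
            else if (count + TOLERANCE < TARGET_COUNT)
                OSCCAL++;   /* too slow: step up   */
            else
                break;      /* reached Nirvana     */
        }
    }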

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

#10

Baldrian wrote:

But it seems that this is not what happens. Most of the time the algorithm ends up with a value in the 125-127 region and runs way too fast. That's why I'm asking whether there can be side effects when switching through the discontinuity.

An algorithm that worries about the goodness of the last change will need to deal explicitly with the discontinuity. For a correct value of alpha, the following algorithm will work:

while too fast or too slow:
    OSCCAL += alpha * (desired_speed - speed)

alpha should be big enough to ensure that OSCCAL will be adjusted when adjustment is needed, and small enough that there is no more than 50 percent over-correction. There is no need to wth.
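
In C, that could look something like this (my sketch of the idea above; alpha is implemented as an assumed power-of-two divisor so the scaling stays in integer arithmetic, and count_clock()/TARGET_COUNT are the same assumed helper and example value as earlier in the thread):

    #include <avr/io.h>

    extern unsigned int count_clock(void);  /* assumed measurement helper */
    #define TARGET_COUNT 4098               /* example target count       */
    #define TOLERANCE    2                  /* acceptable error band      */
    #define ALPHA_DIV    32                 /* alpha = 1/32, assumed gain */

    void proportional_calibrate(void)
    {
        for (;;) {
            int error = (int)TARGET_COUNT - (int)count_clock();
            if (error >= -TOLERANCE && error <= TOLERANCE)
                break;                       /* close enough */
            int step = error / ALPHA_DIV;
            if (step == 0)
                step = (error > 0) ? 1 : -1; /* always move at least one notch */
            OSCCAL += step;                  /* positive error = too slow: speed up */
        }
    }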

Iluvatar is the better part of Valar.

#11

Did you ever resolve this issue? I've just started bringing a product to production that seems to have the same problem. Some units calibrate fine, but others end up way off. I also have a "recalibrate" piece of code that triggers calibration periodically or on temperature/voltage change, and sometimes a unit that had previously calibrated OK gets a new, bad cal.

I've double-checked my code against the AVR055 app note and it's not my version of the code that's the problem.

I suppose that if the binary + neighbor searches in both halves of the OSCCAL range led to results that crossed the 127/128 boundary, that could produce a bad cal, but otherwise I thought the algorithm was pretty sound. I'm debating adding code that recalibrates whenever I encounter communication problems, but I'd rather solve the problem at the source...

#12

I finally found the problem with the AVR055 application note! The application note declares countDiff and bestCountDiff as unsigned char instead of unsigned int.

The COMPUTE_COUNT_VALUE macro defines countVal as ((EXTERNAL_TICKS*CALIBRATION_FREQUENCY)/(XTAL_FREQUENCY*LOOP_CYCLES)), which equates to 1111, so if a measured count is 256 or more away from that, the 8-bit difference wraps around and the chip calibrates incorrectly.

In my case I was calibrating to 3.6864 MHz with EXTERNAL_TICKS set to 255, which meant the desired countVal was 4098. While searching through the OSCCAL values, if the code measured a countVal of 4354 (3.916 MHz), the difference of 256 truncated to 0 in the unsigned char, so it thought it had found a perfect calibration!

So in summary: change countDiff and bestCountDiff to unsigned shorts (and initialize to 0xFFFF instead of 0xFF).
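
A small host-side demo of the truncation (plain C; the variable names follow the app note, and the values are the ones from my case above):

    #include <stdio.h>

    int main(void)
    {
        unsigned int countVal      = 4098;  /* desired count for 3.6864 MHz */
        unsigned int measuredCount = 4354;  /* measured count at 3.916 MHz  */
        unsigned int diff = measuredCount - countVal;     /* real error: 256 */

        unsigned char countDiff8  = (unsigned char)diff;  /* as in AVR055    */
        unsigned int  countDiff16 = diff;                 /* after the fix   */

        printf("real difference:      %u\n", diff);        /* 256                  */
        printf("unsigned char result: %u\n", countDiff8);  /* 0 -> looks "perfect" */
        printf("unsigned int result:  %u\n", countDiff16); /* 256 -> rejected      */
        return 0;
    }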

Best of luck!