## F_CPU at 8388608 Hz?


I've been using timer3 with the MCU clock for interrupts to run the Raven radio duty cycle. With a 1024 prescaler at 8 MHz each count is 128 microseconds, which is about the right time precision. That is at first sight an attractive number, but turns out to be 7812 counts per second, and that requires division to calculate the expected turnon time of another radio.

It would be nice if the counts per second were a power of two, as then the calculations could be done by masking and shifting. It would also give more accuracy after an MCU sleep. The sleep is ended by a timer2 wake from the 32768 Hz watch crystal, and any error in catching timer3 up afterwards will jitter the next receive window.

Everything would be a lot simpler if F_CPU were a power of two! 4 MHz would not be fast enough to unload the radio frames, but OSCCAL can easily drive it to 8388608.
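
To make the arithmetic concrete, here is a minimal sketch in plain C. The macro and function names are made up for illustration and are not from the Raven or Contiki sources. It only shows that 8388608/1024 is exactly 8192 = 2^13, so splitting a tick count into whole seconds and a remainder becomes a shift and a mask, while the 8 MHz case needs a 32-bit divide and modulo.

```c
#include <stdint.h>

#define PRESCALER   1024UL
#define TICKS_8MHZ  (8000000UL / PRESCALER)  /* 7812; the true rate is 7812.5 */
#define TICKS_POW2  (8388608UL / PRESCALER)  /* 8192 = 2^13, exact */

/* General case: a 32-bit divide and a 32-bit modulo. */
static uint32_t whole_seconds_div(uint32_t ticks)  { return ticks / TICKS_8MHZ; }
static uint32_t tick_in_second_div(uint32_t ticks) { return ticks % TICKS_8MHZ; }

/* Power-of-two case: a shift and a mask, no division at all. */
static uint32_t whole_seconds_pow2(uint32_t ticks)  { return ticks >> 13; }
static uint32_t tick_in_second_pow2(uint32_t ticks) { return ticks & (TICKS_POW2 - 1UL); }
```

Note that the 8 MHz version also throws away the half tick per second, so it drifts against real time even before any oscillator error is considered.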

It would take a bit of time to modify things to work with a new F_CPU define, so before I try it can anyone advise how stable such 5% overclocking would be?

Quote:

but turns out to be 7812 counts per second, and that requires division to calculate the expected turnon time of another radio.

You've lost me. Are you using the internal oscillator? OSCCAL doesn't apply to crystals AFAIK.

For longer time intervals, why not just set up a compare match and trip at 1 second or whatever. 7812 or 4567 or 1234 or whatever.

Division required? 1 division per setup? So what; do the divide.

As this "would take a bit of time to work out", what problem needs to be solved? Saving a few microseconds of calculation time every xxx seconds/minutes?

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Well low power is one of the considerations and unnecessary division cuts into the sleep time. The radio is turned on for a channel check at 8 Hz and that is timed by the timer3 compare match, but transmission time to another Raven has to be calculated modulo the channel check interval. One divide is not so bad but drift correction adds three more. True, for tx the CPU idles anyway until the radio turnon time but the calculation delay could occasionally bump that into the next time slot.

```
#if PHASE_DRIFT_CORRECT
{ int32_t s;
  if (e->drift > cycle_time) {
    s = e->drift % cycle_time / (e->drift / cycle_time);  //drift per cycle
    s = s * (now - sync) / cycle_time; //estimated drift to now
    sync += s;                   //add it in
  }
}
#endif
#if 0
/* Faster if cycle_time is a power of two */
wait = ((sync - now) & (cycle_time - 1));
#else
/* Works generally */
wait = cycle_time - ((now - sync) % cycle_time);
#endif
while (TCNT3 < (now + wait)) {};
```
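
For anyone comparing the two `wait` formulas, here is a small stand-alone sketch. It assumes 16-bit timer values (as on the AVR) and a `cycle_time` that is a power of two; the function names are hypothetical. The two forms agree except when `now - sync` is an exact multiple of `cycle_time`: the masked form then gives 0, while the general form gives a full `cycle_time`.

```c
#include <stdint.h>

/* Masked form: only valid when cycle_time is a power of two. */
static uint16_t wait_mask(uint16_t now, uint16_t sync, uint16_t cycle_time)
{
    return (uint16_t)((uint16_t)(sync - now) & (uint16_t)(cycle_time - 1u));
}

/* General form: works for any cycle_time, at the cost of a modulo. */
static uint16_t wait_mod(uint16_t now, uint16_t sync, uint16_t cycle_time)
{
    return (uint16_t)(cycle_time - ((uint16_t)(now - sync) % cycle_time));
}
```

Whether 0 or a full cycle is the right answer at the boundary depends on whether the caller can still hit a slot that starts right now.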

The up to 128 millisecond deep sleep is done using the external crystal and that has to adjust timer3 to compensate for the lapse, which can't be done accurately when the timer3 to timer2 tick ratio is 7812/1024. And it needs another 32 bit division as well.
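
A rough sketch of the catch-up step after a sleep, assuming timer2 ticks at 1024 Hz (which is what the 7812/1024 ratio implies) and using made-up function names. With 8192 timer3 ticks per second the ratio is exactly 8, so the elapsed timer2 ticks are simply shifted left by three; with 7812 the product has to be formed in 32 bits and the result is truncated.

```c
#include <stdint.h>

/* 8 MHz case: ratio 7812/1024, needs a 32-bit intermediate and truncates. */
static uint16_t t3_from_t2_8mhz(uint16_t t2_ticks)
{
    return (uint16_t)(((uint32_t)t2_ticks * 7812UL) / 1024UL);
}

/* 8388608 Hz case: ratio 8192/1024 = 8 exactly, a single shift. */
static uint16_t t3_from_t2_pow2(uint16_t t2_ticks)
{
    return (uint16_t)(t2_ticks << 3);
}
```

For a 128 ms sleep (131 timer2 ticks) the first form adds 999 to timer3 and the second adds 1048.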

You can speed up the arithmetic using the fact that cycle_time is a constant.
'Twould also be good to know the range of e->drift.

Moderation in all things. -- ancient proverb

That is the problem, cycle time is in general not a constant but might be a power of two. This code should work on all platforms, from mighty OSX down to the lowliest PIC :)

e->drift can be anything modulo 16 bits for AVRs, but e->drift per cycle time is by derivation less than a cycle time.

You mean overclocking the internal oscillator? Or are you also using a cpu/voltage rated at only 8MHz? Tuning the internal oscillator to run at 2^23 Hz is an interesting idea (you should try it and let us know how it works!), but I'm not sure you'd get more accuracy from your "even power of two" than you'd lose from not using a crystal.

Crystals are available in some "common" frequencies that divide more nicely to seconds or milliseconds or baudrates, without being exact powers of 2. 8.192MHz is one, for example...

There were some "interesting" routines for developing reasonably accurate millisecond-scale timing based on a clock tick that is "not quite" an even fraction of a millisecond. Developed for Arduino, of all things, although it got corrupted with extra code before actually being released. See http://arduino.cc/forum/index.ph...

Yes, the question is about overclocking the 1284p on the Raven so that F_CPU is a power of two. They nominally run at 8 MHz (but usually ~1% slow) using the internal RC oscillator.

There is a 32768 Hz watch crystal that can be used to set OSCCAL and provide wake interrupts. Accuracy or drift is not really the issue, assuming it gets no worse than it is now. Atmel recommends not overclocking more than 10% but don't say what problems could arise.

Pinging from another radio (this one crystal controlled) and looking at the phase of the response, the higher clock speed seems to be just as stable. But have not tried long term operation or changing temperature or voltage other than 3v3. Would the OSCCAL routine have to be run periodically, or could it use a baked value once and for all?

dak664 wrote:
Accuracy or drift is not really the issue, assuming it gets no worse than it is now.
The datasheet provides the "typical" values for temperature/voltage variation of the internal RC oscillator. It surely may drift over time, too, but I'd say it's negligible compared to the first two. The "tuning step" is around 40kHz or 0.5% , so there's no way you can get better precision anyway. I doubt aging alone (with stable temperature/voltage) would change the frequency more than a fraction of % within days.

dak664 wrote:
Atmel recommends not overclocking [...]
I'd avoid the word "overclocking" unless you are talking about clock above maximum frequency for given supply voltage - which you are probably not. I'd use "adjusting the internal RC oscillator's frequency", even if it is a bit longer.

dak664 wrote:
[...] more than 10% but don't say what problems could arise.
The only place I've found any reference to maximum deviation of the RC oscillator from the nominal is the following quote from the datasheet - but that's quite specific with regards of what problems could arise, so I'd like to know what other "Atmel recommendation" you are referring to.
ATMega1284 etc. datasheet wrote:
If the EEPROM or Flash are written, do not calibrate to more than 8.8 MHz. Otherwise, the EEPROM or Flash write may fail.

JW

I got the 10% from AVR055 http://atmel.com/dyn/resources/prod_documents/doc8002.pdf It does warn about flash and eeprom write problems at high tunings and concludes with

`Knowing the fundamental characteristics of the RC oscillators, it is possible to make an efficient calibration routine that calibrates the RC oscillator to a given frequency, within 10% of the base frequency, at any operating voltage and at any temperature with an accuracy of +/-1%.`

So there is no down side to running at 0x800000 Hz? It seems odd that is not a common choice.

dak664 wrote:
I got the 10% from AVR055 http://atmel.com/dyn/resources/prod_documents/doc8002.pdf
OK but that's in the same way specific in the consequences of tuning beyond 10%, while you above said that "Atmel recommends not overclocking more than 10% but don't say what problems could arise." That confused me.

dak664 wrote:
So there is no down side to running at 0x800000 Hz? It seems odd that is not a common choice.
The traditional mcu clock frequencies were chosen to provide UART baudrates, i.e. 300Hz up. This, together with the 8051-induced factor of 12 results in the most common crystal frequencies being integer multiples of 1.8432MHz.

JW

That is the next step, to recover the 57600 baud connection to the STK500 level shifters which has never worked at 115200. It would be useful if it did at the higher clock rate as it is a bottleneck for network communication.
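
A quick way to see what the higher clock does to the baud rates is to compute the USART divisor error. The sketch below uses the datasheet relation for double-speed (U2X) mode, baud = F_CPU / (8 * (UBRR + 1)); the function names are hypothetical, and the error is reported in units of 0.1% and truncated toward zero.

```c
#include <stdint.h>

/* Nearest UBRR value for the requested baud rate in U2X mode. */
static uint16_t ubrr_u2x(uint32_t f_cpu, uint32_t baud)
{
    return (uint16_t)((f_cpu + 4UL * baud) / (8UL * baud) - 1UL);
}

/* Baud rate actually produced by a given UBRR in U2X mode. */
static uint32_t actual_baud_u2x(uint32_t f_cpu, uint16_t ubrr)
{
    return f_cpu / (8UL * ((uint32_t)ubrr + 1UL));
}

/* Signed baud rate error in units of 0.1 %, truncated toward zero. */
static int16_t baud_error_permille(uint32_t f_cpu, uint32_t baud)
{
    int32_t actual = (int32_t)actual_baud_u2x(f_cpu, ubrr_u2x(f_cpu, baud));
    return (int16_t)(((actual - (int32_t)baud) * 1000L) / (int32_t)baud);
}
```

By this arithmetic 115200 baud is about 3.5% slow at 8000000 Hz and about 1.1% fast at 8388608 Hz, and 57600 baud improves from about +2.1% to about +1.1%. That only covers the divisor; it says nothing about the level shifters or the wiring.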

Quote:
Atmel recommends not overclocking more than 10% but don't say what problems could arise.

First requirement is that on some AVRs (particularly this millennium versions) there is no extra asynchronous clock source for flash/eeprom timing. These use a main RC and thus OSCCAL value affects the memory timings (only erase/write/erase+write). As your 2^N is within 110% limit - I do not see a problem there.

Second is that OSCCAL is not recommended to be modified by more than small steps at once.

Third thing is that the RC oscillator is controlled by a low quality R2R ladder, so you can hit a huge step in the F_CPU value from one OSCCAL value to the next.

My suggestion is that you could drop the idea of OSCCAL modification by making an OCRn register value for CTC a function of ICRn and RTC counter overflow signal. Some kind of a software PLL. OCRn value will vary from run to run, but eventually I think this can be made non-accumulating error and with arbitrarily limited jitter down to +-1 timer tick.

How to make it?
Connect OC pin of RTC to trigger an ICPn (via ICP pin, ADC channel or comparator). When ICP happens, you can update OCRn register in ISR(ICP) so that TCNTn will lock on the frequency desired (could be 2^N or whatever, faster or slower than 32kHz). Remember OCRn registers are buffered only in PWM modes!

The value of the OCRn will vary in time of course (introducing some jitter) but I think that accuracies better than 2^-16 could be achieved (15ppm) plus the reference accuracy of RTC quartz(20ppm). If you would like it to be non-accumulating (I have no idea if that is necessary in your application), you need some integral value of the time base shift in OCRn calculations.
It looks doable. I think the main challenge is to build it robust enough to withstand a step injected into the RC clock, that is, dRC/dt=f(dtemperature/dt, dvoltage/dt, dOSCCAL/dt) can make the software PLL unstable if badly designed.
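
One possible shape for the non-accumulating part, as a sketch only: it is not a full PLL and leaves out the ICP and compare ISRs. It assumes the input capture has measured how many timer ticks elapsed during one RTC reference period (`measured`), and that a power-of-two number of compare events per reference period is wanted. All names (`soft_pll_t`, `EVENTS`, and so on) are invented for the example. The leftover ticks are spread with an error accumulator, so the periods sum to exactly `measured` and the jitter stays within one timer tick.

```c
#include <stdint.h>

#define EVENTS_LOG2 10u                  /* assumed: 1024 events per reference period */
#define EVENTS      (1u << EVENTS_LOG2)

typedef struct {
    uint16_t base;  /* whole timer ticks per output period */
    uint16_t rem;   /* leftover ticks to spread over one reference period */
    uint16_t acc;   /* running fractional error, always < EVENTS */
} soft_pll_t;

/* Call once per reference period with the captured tick count. */
static void soft_pll_update(soft_pll_t *p, uint32_t measured)
{
    p->base = (uint16_t)(measured >> EVENTS_LOG2);
    p->rem  = (uint16_t)(measured & (EVENTS - 1u));
}

/* Call at each compare match; the result is the amount to add to OCRn. */
static uint16_t soft_pll_next_period(soft_pll_t *p)
{
    uint16_t period = p->base;
    p->acc = (uint16_t)(p->acc + p->rem);
    if (p->acc >= EVENTS) {              /* carry: stretch this period by one tick */
        p->acc = (uint16_t)(p->acc - EVENTS);
        period++;
    }
    return period;
}
```

With `measured` = 8000000 this alternates between 7812 and 7813 ticks. Smoothing `measured` against a step in the RC clock, which is the robustness problem mentioned above, is not addressed here.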

No RSTDISBL, no fun!

dak664 wrote:
That is the problem, cycle time is in general not a constant but might be a power of two. This code should work on all platforms, from mighty OSX down to the lowliest PIC :)
Now I'm really confused.
Something that is not a constant is not going to stay a power of two.
Do you mean that cycle_time is not a mathematical constant?
A compile-time constant is good enough.
Quote:
e->drift can be anything modulo 16 bits for AVRs, but e->drift per cycle time is by derivation less than a cycle time.
Can e->drift be negative?
If so, what is the desired sign of e->drift % cycle_time?

At compile-time, what is known about the value(s) of cycle_time?

With what precision do you need to do the correction?
Would a small correction now cause a compensating correction later?

My guess is that some kind of correct the old value algorithm would be fast enough for you.

All that said, I suspect that we could do better for you if you took a step back and stated more precisely what you are trying to do.


For generality the routine uses a call parameter of uint16_t or uint32_t cycle_time, depending on platform. At present the only calls to the routine that I know about use a predefined constant CYCLE_TIME for that parameter. Although the existing routine assumes cycle_time is a power of two, I suspect the idea was to allow cycle_time to be doubled after a period of inactivity and halved when traffic picks up. That is similar to the method used for routing information sent through the RPL/ROLL protocol. Also different neighbors might have different cycle times; door sensors might be faster than temperature sensors for example.

Here is a description of what I am doing http://www.sics.se/contiki/wiki/index.php/RDC_Phase_Optimization

Perhaps it sounds stupid, or I'm missing something, but why not just buy an 8.388... MHz crystal for $1?

No place on the Raven board to solder that. For new low power designs crystal control would be a consideration, but if time-slot reception can work on the Raven as-is then it should work even better on a tailored design.

The original question of the thread was why not run F_CPU at a power of two for efficiency in calculating modulo remainders. Nothing to do with frequency stability. An 8.000000 MHz crystal would have the same issues with unnecessary divide time.

dak664 wrote:
Here is a description of what I am doing http://www.sics.se/contiki/wiki/index.php/RDC_Phase_Optimization
If I understand it correctly,
e->drift is positive and e->drift/cycle_time should not be much larger than 8.
The division is easy and e->drift % cycle_time can be computed as
e->drift - e->drift/cycle_time*cycle_time.
I suspect that (now-sync) % cycle_time could be computed with similar ease.

It seems to me that s should be uint16_t.

Computing s*(now-sync)/cycle_time is a little trickier.
The trick of multiplying to divide by a constant (bear with me) would involve 64-bit arithmetic,
which IIRC is rather slow with avr-gcc.
Since s< cycle_time and now-sync fits in sixteen bits,
s*(now-sync)/cycle_time will also fit in sixteen bits.

To divide by a near power of two:
Mathematically,
q=n/(2**p-e) satisfies q=(n+eq)*2**-p assuming all values are positive.
An iteration based on second equality converges from any starting point.
In particular, it converges from n*2**-p and n*2**(1-p).
In C unsigned integer arithmetic, one can use q=(n+e*q)>>p.
Assume 2e< 2**p.
Starting from n>>(p-1), it will reach the correct value rounded down.
Starting from n>>p, it might be one too low,
but I think q=(n+eq+e)>>p would work.
The rate of convergence depends on e*2**-p.
7812=2**13-380
380*2**-13=.0464
I think that three iterations would be enough.
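
Here is a sketch of that iteration for the divisor 7812 = 2**13 - 380. The names and the input limit are my own choices, not from the post. It starts from n>>(p-1), which is never below the true quotient, and repeats q=(n+e*q)>>p until the value stops changing, rather than using a fixed count of three, so it also covers inputs well beyond 16 bits. The remainder then follows as n - q*7812, as suggested earlier for e->drift.

```c
#include <assert.h>
#include <stdint.h>

#define DIV_P 13u    /* 2^13 = 8192 */
#define DIV_E 380u   /* 8192 - 380 = 7812 */

/* Exact floor(n / 7812) using only multiply, add and shift. */
static uint32_t div_7812(uint32_t n)
{
    uint32_t q, next;
    assert(n <= 0x7FFFFFFFUL);           /* keeps n + DIV_E * q inside 32 bits */
    next = n >> (DIV_P - 1u);            /* starting guess, never too low */
    do {
        q = next;
        next = (n + (uint32_t)DIV_E * q) >> DIV_P;
    } while (next != q);                 /* stops at the exact quotient */
    return q;
}

/* n % 7812 from the quotient, with no division. */
static uint32_t mod_7812(uint32_t n)
{
    return n - div_7812(n) * 7812UL;
}
```

The multiply by 380 is still a multiply, so whether this beats avr-gcc's library divide on a real 1284p would have to be measured.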


A negative phase drift shows up as a large positive modulo drift and distorts my per-cycle extrapolation. I've since gone to a 64 bit multiplication to do per tick extrapolation.

Interesting, did not occur to me to use an approximation when exact calculation was possible. I now have four methods to compare the calculation times (note I have not actually run this on the Raven yet and the speed is not an issue on the ARM7 side).

I've found that calibrating to 8.00 MHz at boot comes close to phase stability most of the time. OSCCAL can vary by 2 or 3 with a resulting phase drift in either direction. If the calibration is going to be done anyway I don't see any reason not to calibrate to 0x800000 Hz.

dak664 wrote:
A negative phase drift shows up as a large positive modulo drift and distorts my per-cycle extrapolation. I've since gone to a 64 bit multiplication to do per tick extrapolation.
Oh Gawd. You're doing something wrong.
Quote:
Interesting, did not occur to me to use an approximation when exact calculation was possible. I now have four methods to compare the calculation times (note I have not actually run this on the Raven yet and the speed is not an issue on the ARM7 side).
My arithmetic was not approximate.
Quote:
I've found that calibrating to 8.00 MHz at boot comes close to phase stability most of the time. OSCCAL can vary by 2 or 3 with a resulting phase drift in either direction. If the calibration is going to be done anyway I don't see any reason not to calibrate to 0x800000 Hz.
You won't get it to exactly 0x800000 Hz,
so you still won't have an exact power of two.

I strongly suspect that the problem has been poorly expressed so far.
Let me try.

The last two sync signals occurred when the timer read t1 and t2.
The last sync signal occurred on the count listen time after its predecessor.
The listen times occurred at listen_period timer cycles apart.
The problem is to compute a new value for listen_period.

```
uint16_t diffu=t2-t1-count*listen_period;
int16_t diff=(int16_t)diffu; // assumes correct value in range -2**15..2**15-1
int16_t correction=diff/count;
listen_period+=correction;
```

The coercion is implementation-defined (not strictly portable),
but will work if the compiler is not vicious.
If one wants to be technically correct,
one could rewrite to explicitly test the high-order bit.

In any case, no 64-bit numbers are required.
If avr-gcc's 16-bit division is fast enough,
no 32-bit numbers are required.
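
For the technically correct variant, a sketch that tests the high-order bit explicitly might look like the following. The function names are hypothetical, and `count` is assumed to be between 1 and 32767. The `(unsigned)` cast is there so the product wraps instead of overflowing a signed int on platforms where int is wider than 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a 16-bit unsigned difference as signed without relying on
 * an out-of-range unsigned-to-signed conversion. */
static int16_t to_signed16(uint16_t u)
{
    if (u & 0x8000u) {
        /* Two's-complement value is u - 65536, computed without overflow. */
        return (int16_t)(-(int16_t)(uint16_t)(0xFFFFu - u) - 1);
    }
    return (int16_t)u;
}

/* New listen period after `count` periods elapsed between syncs t1 and t2. */
static uint16_t corrected_period(uint16_t listen_period,
                                 uint16_t t1, uint16_t t2, uint16_t count)
{
    uint16_t diffu;
    assert(count != 0u && count <= 0x7FFFu);
    diffu = (uint16_t)(t2 - t1 - (uint16_t)((unsigned)count * listen_period));
    return (uint16_t)(listen_period + to_signed16(diffu) / (int16_t)count);
}
```

As in the original fragment, this only gives the right answer while the true difference stays within -2**15..2**15-1 timer ticks.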


I'll give that a try. What's working right now is adjusting OSCCAL once per second based on the drift of the 8192 tick/second TIMER3 relative to the crystal-controlled 128 tick/second TIMER2. This is similar to Brutte's idea of phase lock but straightforward when both are a power of two. It takes 12 seconds to lock and then OSCCAL wobbles between 140 and 142. The startup has no effect on 57600 baud to a real COM1 port, but also does not get me 115200 baud. That may be a STK500 limitation combined with my breadboarding.

It does a random walk due to last_phase being updated each time. But I think it could do a lock to any desired phase, once it gets past the startup change.

```
ISR(TIMER2_COMPA_vect)
{
  if(++scount >= CLOCK_SECOND) {
    scount = 0;
    seconds++;
#if AVR_CONF_USE32KCRYSTAL && AVR_CONF_CAL32KCRYSTAL
    /* Adjust OSCCAL once per second based on rtimer value */
    {
      volatile static int16_t last_phase;
      int16_t rtimer_phase = TCNT3 & 0x1fff;
      if (last_phase > rtimer_phase) OSCCAL++; else OSCCAL--;
      last_phase = rtimer_phase;
    }
#endif
  }
}
```
```
seconds, rtimer_phase, OSCCAL
1 7712 129
2 7309 130
3 6939 131
4 6609 132
5 6301 133
6 6035 134
7 5804 135
8 5615 136
9 5429 137
10 5286 138
11 5178 139
12 5117 140
13 5079 141
14 5088 140
15 5051 141
....
2343 3776 140
2344 3729 141
2345 3726 142
2346 3761 141
2347 3759 142
2348 3793 141
2349 3787 142
2350 3821 141
2351 3820 142
2352 3854 141
2353 3849 142
2354 3885 141
2355 3885 140
2356 3838 141
```

Quote:

then OSCCAL wobbles between 140 and 142.

Then I'd look at your algorithm.

Quote:

if (last_phase > rtimer_phase) OSCCAL++;else OSCCAL--;

Not correct--as you can see you can be [nearly] perfect and still force a correction. (I'll dig out one on Mon. if I remember to...)


Quote:
then OSCCAL wobbles between 140 and 142.

Then the output of the filter is unstable. When made properly and not subjected to external disturbances, once stabilized it should at most jump between two consecutive values, it seems.

Anyway, dRC/dOSCCAL is huge, and as OSCCAL is an integer value you will not get high accuracy either in the relative frequency of the two signals or in their relative phase (phase shift is not considered in your example), when compared to the proposed 16-bit timer with its much better granularity.

Even for the high end U4 series (there dRC/dOSCCAL~=const) that granularity is about 40kHz per OSCCAL count. At 8MHz this means taking 0.5% steps with each OSCCAL modification (for other AVRs the relation is not very linear and not even monotonic, which will complicate the design greatly).

This means that for a USART at baud rates requiring a nominal UBRR>100, a more accurate approach than the OSCCAL++/OSCCAL-- idea would be to leave OSCCAL alone and tweak the UBRR/U2X values at run-time to lock them to the RTC. This does not seem very practical in most cases, as UBRR>100 falls somewhere below 9600 baud. Besides, UBRR is the TOP of an inaccessible (hidden state) counter, which further complicates things.

But for timer clocked peripherals, like PWM, ADC triggering or USI (not for USART, as AFAIK the USART cannot be clocked with a timer), any reasonable frequency, both lower and higher than 2^15Hz, can be generated with much better granularity with a timer/counter than with OSCCAL. The bordering frequency below which the timer is more accurate seems to be somewhere under RC/200[Hz], that is under 40kHz.

So for example, if you have an RC oscillator at 8e6Hz+-10% plus an RTC at 2^15Hz +-20ppm and would like to generate a 1kHz PWM, this could be done by tweaking OSCCAL, giving a worst case frequency error of 0.5% (+20ppm). But the accuracy of a tweaked TCNTn locked to the RTC will be forty times better than that, i.e. 125ppm (+20ppm).
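
The numbers in that example can be checked with a few lines of integer arithmetic. This is just a sketch with invented function names: one OSCCAL step of about 40 kHz on an 8 MHz oscillator is 1 part in 200, while one timer tick in a 1 kHz period is 1 part in 8000.

```c
#include <stdint.h>

/* Relative size of one OSCCAL step, in ppm (step assumed to divide f_rc sensibly). */
static uint32_t osccal_step_ppm(uint32_t f_rc_hz, uint32_t step_hz)
{
    return 1000000UL / (f_rc_hz / step_hz);
}

/* Relative size of one timer tick for an output at f_out_hz, in ppm. */
static uint32_t timer_step_ppm(uint32_t f_rc_hz, uint32_t f_out_hz)
{
    uint32_t ticks_per_period = f_rc_hz / f_out_hz;
    return 1000000UL / ticks_per_period;
}
```

That gives 5000 ppm against 125 ppm, the factor of forty quoted above. The two become equal when the output period is 200 ticks long, which is where the RC/200 (40 kHz) crossover comes from.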


theusch wrote:
Quote:

if (last_phase > rtimer_phase) OSCCAL++;else OSCCAL--;

Not correct--as you can see you can be [nearly] perfect and still force a correction. (I'll dig out one on Mon. if I remember to...)

```
if(last_phase> rtimer_phase+delta) OSCCAL++;
if(last_phase< rtimer_phase-delta) OSCCAL--;
```

I've assumed the variables are signed and the results are representable.
delta should correspond to just over a half step in OSCCAL.
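
The same dead band can be written as a small testable function. This is a sketch with a hypothetical name, under the same assumptions (signed variables, representable results). Going by the 0.5% step size quoted earlier and 8192 ticks per second, one OSCCAL step is roughly 41 ticks of phase per second, so a delta a little over 20 would be the starting point; that figure is an estimate, not a measured value.

```c
#include <stdint.h>

/* Returns +1, -1 or 0: the change to apply to OSCCAL this second.
 * Phase wrap-around at 0x1fff is deliberately not handled here. */
static int8_t osccal_adjust(int16_t last_phase, int16_t rtimer_phase, int16_t delta)
{
    int16_t diff = (int16_t)(last_phase - rtimer_phase);
    if (diff > delta)  return  1;   /* phase fell by more than the dead band */
    if (diff < -delta) return -1;   /* phase rose by more than the dead band */
    return 0;                       /* close enough: leave OSCCAL alone */
}
```

The ISR would then do `OSCCAL += osccal_adjust(last_phase, rtimer_phase, delta);` in place of the unconditional increment or decrement.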


1% is certainly stable enough for the UART.
If he can tweak the PC serial port frequency, he should be good to go.


Some amusing plots trying to stabilize at a specified phase.

```
#define TARGET_PHASE 0xfff
/* Adjust OSCCAL once per second based on rtimer value */
{
  volatile static int16_t last_phase=0x1fff;
  rtimer_phase = TCNT3 & 0x1fff;
  if (last_phase > rtimer_phase) OSCCAL++; else OSCCAL--;
  if (seconds < 60) { //give a minute to stabilize
    last_phase = rtimer_phase;
    target_osccal = OSCCAL;
  } else if (last_phase == TARGET_PHASE) {  //locked
  } else if (last_phase <= (TARGET_PHASE - 10)) {
    last_phase+=10;                         //walk toward lock
  } else if (last_phase >= (TARGET_PHASE + 10)) {
    last_phase-=10;
  } else last_phase = TARGET_PHASE;
}
```

gives unstable feedback. Don't know if the parameters can be adjusted to fix that, but limiting the range of OSCCAL works. Kind of ugly though...

```
#define TARGET_PHASE 0xfff
/* Adjust OSCCAL once per second based on rtimer value */
{
  volatile static int16_t last_phase=0x1fff;
  volatile static uint8_t target_osccal;
  rtimer_phase = TCNT3 & 0x1fff;
  //   if (last_phase > rtimer_phase) OSCCAL++;else OSCCAL--;
  if (seconds < 60) { //give a minute to stabilize
    if (last_phase > rtimer_phase) OSCCAL++; else OSCCAL--;
    last_phase = rtimer_phase;
    target_osccal = OSCCAL;
  } else {
    if (last_phase == TARGET_PHASE) {  //locked
    } else if (last_phase <= (TARGET_PHASE - 10)) {
      last_phase+=10;                         //walk toward lock
    } else if (last_phase >= (TARGET_PHASE + 10)) {
      last_phase-=10;
    } else last_phase = TARGET_PHASE;
    if (last_phase > (rtimer_phase+100)) OSCCAL=target_osccal+1;
    else if (last_phase < (rtimer_phase-100)) OSCCAL=target_osccal-1;
    else OSCCAL = target_osccal;
  }
}
```

Wonder if constant writing of OSCCAL can wear something out?