## Efficient Division with constant numerator.

22 posts / 0 new
Author
Message

Hi, I'm working on a PID algorithm that controls a computer fan by watching its blades go past an IR reflective sensor.

I'm finding that my divide is very computationally expensive and taking about 74uS which is about 600 clock cycles.

`uint16_t bladeFrequency = F_CPU / clocksSinceLastBlade; // clocksSinceLastBlade is a uint32_t and varies from 35,000 to 700,000 in general use`

There are plenty of articles written about efficiently dividing with constant denominator, but none that I can find addressing a constant numerator.

I was wondering if you guys have any suggestions on how I can tackle this more efficiently? It doesn't need to be entirely accurate.

I can think of ways, but before using them, you should consider the answers to a couple questions:

Why is 74uS too slow?

My suspicion is that you want to process data way more often than necessary or useful.

Can you use the reciprocal?

Shift the numerator and denominator until the latter occupies a single byte.

The numerator will occupy two bytes

Dividing two bytes by one should be faster than four by four.

If the denominator requires precisely 9 bits, 0x10000/denominator can be found from a 0x100-byte table.

Successive approximation can be useful.

If one of the inequalities is false, bladeFrequency can be adjusted by one.

Iluvatar is the better part of Valar.

Simple - work in time not frequency. If you need frequency for display to a human, then convert.

First I think it's a overkill if uS 76 is a problem.

But if you want higher speed of the div use MUL by 1/x and scale so >>8 (16 or 24) can be used.

like say you want to divide with 100 then you do a MUL by (x*65536/100)>>16 = x*655 and use the high 16 bit.

Make sure nothing overflow. Using ASM it can be done fast, but I don't think it save a lot  using C. (because the calc has to be done in 32 bit)

Last Edited: Thu. Jun 11, 2015 - 08:50 AM

Thanks Guys,

Skeeve, these are both great suggestions. I tried your first and this got it down to about 40uS which is a good improvement. I think the successive approximation approach will slow down my response time too much in my example but I can see how it would work in other cases.

Kartman, that's a really lucid approach. I'm re-writing my code right now to see how I fare.

Sparrow2, 76uS was only a problem because I'm a nut and was carrying out the calculation inside an ISR (naughty I know), and I had other ISRs firing every 64uS. I think Skeeve was also alluding to the fact that I'm doing something a little dodgy if I think 76uS is too slow. I got around it by running sei() at the start of the routine, and it actually works, but I'd much rather speed up the ISR or delegate the calculation to main() if necessary.

I got around it by running sei() at the start of the routine

You mean in one of the ISR()? A dangerous strategy. Are you sure an interrupt of the same type won't occur before you finish the current processing? If it does your ISR() will be interrupted by itself. The prologue registers will be stacked again and there's even an issue of whether the code is designed to be re-entrant anyway. If it keeps occuring those prologue registers will eat through all the RAM

Bosch used the same strategy in an early ignition controller. With a 800kHz cpu, you don't have too much time to do much, so they worked everything in period. Of course, the period doesn't make much sense to a human, so I converted the period back to frequency then to rpm to make sense of what it was doing.

One would wonder why you need such tight control over a computer fan. You could average a few periods then convert to frequency for when you do your pid calc. Considering the fan is mechanical, running the pid loop much faster than 100Hz might just be a waste of time. Feel free to enlighten us.

clawson wrote:

I got around it by running sei() at the start of the routine

You mean in one of the ISR()? A dangerous strategy.

Yep. I agree. It's just a hack fix for now. The ISR shouldn't fire again before completion since even if it takes 76uS, the fan would have to spin at 112,000RPM for that to happen, but it still gives me the willies. I'm working on kartman's suggestion right now and I think it'll speed it up enough that I don't need to re-enable interrupts during the ISR.

kartman wrote:

One would wonder why you need such tight control over a computer fan.

It's a pretty unique case I think. I'm using the fan as the motor for a magnetic stirrer and I'm digitally setting the speed between 100 to 2000rpm. The trouble is that at 100RPM, the fan has very little inertia and only having 7 blades to sense, there is very limited resolution for the PID loop. Rather than running a loop with a fixed time window and counting how many blades have been sensed in a set period of time, I've done the inverse and I measure the period between the blades. This works about as good as possible for low RPMs, but it's overkill at 2000RPM. Should still be well within the AVR's capabilities though.

If you have a timer ISR every 64uS why don't you check the blade speed there, (then it could be time consistent ), and for every ? 100ms or so set a flag so main know to run the PID again.(so it's outside the ISR).

How do you control the speed? (PWM)

I think i'd be using a bldc motor and the speed is synchronous.

sparrow2 wrote:

How do you control the speed? (PWM)

Speed is controlled with PWM. I'm using OCR1B, Fast PWM, 9 Bits which gives about 16KHz with an 8MHz CPU. I'm using a ULN2003 to drive the fan (along with some other stuff), but in future I'll use a discrete MOSFET for the fan drive as the ULN2003A has strange issues with cross-talk generated during the fan's commutation.

Kartman wrote:

I think i'd be using a bldc motor and the speed is synchronous.

I wish! Or one better, forget the rotor and use the rotating magnetic fields from the BLDC's stator to drive the magnetic stir bar - no moving parts! Alas, a DC fan is \$1.77 from digikey @ qty=100 whereas a BLDC motor, plus 3 phase h-bridge + drivers et.al really costs a lot more.

Regarding the original post, I ended up moving the computation into my main thread so interrupts are now lightning fast again. I really appreciate all the advice and learned a lot!

whereas a BLDC motor, plus 3 phase h-bridge + drivers et.al really costs a lot more.

That kind of depends on what kind of BLDC you are talking about. To fly R/C planes I've rewound many a CD-ROM motor for the cost of just some N45 magnets and some copper wire. I also used to buy kits from www.gobrushless.com back when they had \$5..\$10 kits. These days there's no point when you can buy read made motors for half nothing anyway..

http://www.hobbyking.com/hobbyki...

(and those aren't the cheapest BLDC by any means - Hoobyking is just "easy"). Same goes for BLDC ESCs. There used to be loads of interest in using micros (often AVRs) to build them from scratch but, again, they're commercially available for half nothing - no point building something that's cheaper ready-made.

Quote:
Skeeve, these are both great suggestions. I tried your first and this got it down to about 40uS which is a good improvement. I think the successive approximation approach will slow down my response time too much in my example but I can see how it would work in other cases.
Moving the computation to the main line was the right answer.

That said, I'd like to clarify successive approximation:

The assumption behind it is that the fan speed would not change very fast,

rarely, if ever, more the one unit between computations.

bladeFrequency would change by at most one each 76 uS.

Iluvatar is the better part of Valar.

Last Edited: Fri. Jun 12, 2015 - 01:03 AM

Why does one need PID for this?

Why burden yourself with doing time-between-pulses?  Use timer-as-counter, with a second timer for e.g. 100ms to read the other timer and close the loop.  Overhead is practically nil.

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

skeeve wrote:

I'd like to clarify successive approximation:

The assumption behind it is that the fan speed would not change very fast,

rarely, if ever, more the one unit between computations.

bladeFrequency would change by at most one each 76 uS.

That's exactly right, in this case at low RPM the fan can completely stall in half a revolution and I don't think the successive approximation would ramp the power up in time.

theusch wrote:

Why burden yourself with doing time-between-pulses?  Use timer-as-counter, with a second timer for e.g. 100ms to read the other timer and close the loop.  Overhead is practically nil.

I think the trouble with this is that with a 100mS window, when the fan is running at 100RPM (~11 blades per second) it will only count 1 or 2 blades in the 100mS window meaning the resolution is far too low. It would be great if the fan had an optical encoder with 100's of pulses per revolution, but in this case there's only 7 blades to sense.

EDIT:

clawson wrote:

whereas a BLDC motor, plus 3 phase h-bridge + drivers et.al really costs a lot more.

That kind of depends on what kind of BLDC you are talking about.

Thanks clawson. I was into the foamie/CDRom craze back in the day too but I just can't make it work economically for this project. If you know of a BLDC combo that rivals \$1.77 please let me know.

Last Edited: Fri. Jun 12, 2015 - 12:13 AM

m3gabyte wrote:
skeeve wrote:

I'd like to clarify successive approximation:

The assumption behind it is that the fan speed would not change very fast,

rarely, if ever, more the one unit between computations.

bladeFrequency would change by at most one each 76 uS.

That's exactly right, in this case at low RPM the fan can completely stall in half a revolution and I don't think the successive approximation would ramp the power up in time.

If one starts at 115, at most 115 iterations will be needed.

If that is too many, then likely another technique should be chosen.

I can think of ways to make it faster, but they're not worth the effort.

Iluvatar is the better part of Valar.

If you know of a BLDC combo that rivals \$1.77 please let me know.

Take this:

from an old PC. Dismantle it and remove the motor. Remove the stator. Unwind the existing wire. Rewind with higher gauge wire packed more densely. Job done ;-) For extra oomph remove the ring magnet in the cap and glue 12 square N45 magnets.

clawson wrote:
Dismantle it and remove the motor. Remove the stator. Unwind the existing wire. Rewind with higher gauge wire packed more densely. Job done ;-)

m3gabyte wrote:
Alas, a DC fan is \$1.77 from digikey @ qty=100 whereas a BLDC motor, plus 3 phase h-bridge + drivers et.al really costs a lot more.

 "Experience is what enables you to recognise a mistake the second time you make it." "Good judgement comes from experience.  Experience comes from bad judgement." "Wisdom is always wont to arrive late, and to be a little approximate on first possession." "When you hear hoofbeats, think horses, not unicorns." "Fast.  Cheap.  Good.  Pick two." "We see a lot of arses on handlebars around here." - [J Ekdahl]

theusch wrote:

Why burden yourself with doing time-between-pulses?  Use timer-as-counter, with a second timer for e.g. 100ms to read the other timer and close the loop.  Overhead is practically nil.

I think the trouble with this is that with a 100mS window, when the fan is running at 100RPM (~11 blades per second) it will only count 1 or 2 blades in the 100mS window meaning the resolution is far too low.

Resolution far too low for what?  We aren't positioning nuclear power plant control rods.  I assume this "computer fan" is for cooling either the processor, or the case exhaust.  I'd think that you could close the loop once a second, or once every few seconds, and get good results.

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

theusch wrote:
Resolution far too low for what?  We aren't positioning nuclear power plant control rods.  I assume this "computer fan" is for cooling either the processor, or the case exhaust.  I'd think that you could close the loop once a second, or once every few seconds, and get good results.

m3gabyte wrote:
It's a pretty unique case I think. I'm using the fan as the motor for a magnetic stirrer and I'm digitally setting the speed between 100 to 2000rpm. The trouble is that at 100RPM, the fan has very little inertia and only having 7 blades to sense, there is very limited resolution for the PID loop. Rather than running a loop with a fixed time window and counting how many blades have been sensed in a set period of time, I've done the inverse and I measure the period between the blades. This works about as good as possible for low RPMs, but it's overkill at 2000RPM. Should still be well within the AVR's capabilities though.

m3gabyte wrote:
That's exactly right, in this case at low RPM the fan can completely stall in half a revolution and I don't think the successive approximation would ramp the power up in time.

 "Experience is what enables you to recognise a mistake the second time you make it." "Good judgement comes from experience.  Experience comes from bad judgement." "Wisdom is always wont to arrive late, and to be a little approximate on first possession." "When you hear hoofbeats, think horses, not unicorns." "Fast.  Cheap.  Good.  Pick two." "We see a lot of arses on handlebars around here." - [J Ekdahl]

Sorry, joey, I missed that part.

So, just limit the lowest allowed speed to above "stall".  Again, in steady-state I can't see where any instant response is needed.  But perhaps for your app it is needed.

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

In case theusch wasn't clear to OP:

If there is a speed below which the precise speed does not matter,

that case can be handled with an if-statement.

Again, if OP has code in the main line that works,

there is no point in "fixing" it.

Iluvatar is the better part of Valar.