Posted: Jun 01, 2006 - 04:08 PM
Joined: Oct 08, 2005
Posts: 36

Hi,
I have the following PID routine running at 1 kHz; it is very basic. I want to make it faster. Can somebody please suggest a clever solution?
I am using an AT90CAN128 with AVR Studio.
The clock is running at 16 MHz.
Code:
//PID routine
Xk = adcout/KconRadByte; //converting to radians
ADCSRA &= ~(1<<ADATE); //disable ADC auto-trigger after reading the corresponding position (ADSC clears itself when the conversion ends)
Ek = Wk-Xk; //calc error
sum = sum + Ek; //adding the errors
Pk = Kp*Ek; //prop error correction
Ik = Ki*(sum); //integral error correction
Dk = Kd*(Ek-Ekp); //diff error correction
Uk = (Pk+Ik+Dk); //total correction
Ekp = Ek; //storing the present error value for the next pass
Ek = 0;
if (Uk<0) //changing the direction signal
    PORTC = 0x00;
else
    PORTC = 0xff;
//End of routine
Kaushik

Posted: Jun 01, 2006 - 04:10 PM
Joined: Oct 08, 2005
Posts: 36

I am using this code to control the position of a servo motor; I forgot to mention that.
Kaushik

Posted: Jun 01, 2006 - 04:21 PM
Joined: Feb 19, 2001
Posts: 26115
Location: Wisconsin USA

Atmel app note AVR221 claims to do PID in 877 cycles.
Lee

Posted: Jun 01, 2006 - 04:30 PM
Joined: Sep 04, 2002
Posts: 21396
Location: Orlando Florida

Convert everything to scaled integers. That's faster than software floats. I bet someone could speed up the program if you showed all of it to us. You could also run at 18.432 MHz.

Posted: Jun 01, 2006 - 04:43 PM
Joined: Jun 28, 2002
Posts: 164
Location: Boulder, Colorado, USA

Bob is right!
Replace every floating-point value with a fixed-point (integer) representation. Say you have a 16-bit signed integer and want to represent 1.5: with my convention, S7.8 (1 sign bit, 7 integer bits, 8 fractional bits), 1.5 => 0x0180. The smallest positive value is 0.00390625 (1/256) => 0x0001, and the smallest negative step is -0.00390625 => 0xFFFF (two's complement).
Hope this helps!
_________________ ---
ARod
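
A minimal sketch of the S7.8 convention described above, in plain C. The type and function names (`s7_8_t`, `s7_8_from_double`, `s7_8_mul`) are made up for illustration and do not come from any library; the double conversions are meant for checking values on a PC, not for use on the AVR.

```c
#include <stdint.h>

typedef int16_t s7_8_t;          /* 1 sign bit, 7 integer bits, 8 fractional bits */
#define S7_8_ONE 256             /* 1.0 in S7.8 */

/* Convert a real number to S7.8, rounding to the nearest step of 1/256. */
s7_8_t s7_8_from_double(double x)
{
    double scaled = x * S7_8_ONE;
    return (s7_8_t)(scaled >= 0 ? scaled + 0.5 : scaled - 0.5);
}

/* Convert back to a real number (for debugging on a PC). */
double s7_8_to_double(s7_8_t q)
{
    return (double)q / S7_8_ONE;
}

/* Multiply two S7.8 values: the 32-bit product carries 16 fractional bits,
   so divide by 256 to return to 8 (the division truncates toward zero). */
s7_8_t s7_8_mul(s7_8_t a, s7_8_t b)
{
    return (s7_8_t)(((int32_t)a * b) / S7_8_ONE);
}
```

For example, `s7_8_mul(0x0180, 0x0200)` multiplies 1.5 by 2.0 and yields 0x0300, which is 3.0 in the same format. The caller is responsible for keeping results inside the roughly -128 to +128 range the format can hold.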

Posted: Jun 01, 2006 - 05:19 PM
Joined: Oct 08, 2005
Posts: 36

Thanks guys for your advice. I changed all the variable types from float to int but didn't notice a significant improvement in speed.
But I have found an interesting situation. I think the problem is with the compiler: if I just disable the if condition in the routine, or use the if with some variable other than Uk, the routine speed is almost 800 kHz and the size of the generated hex file is around 300 bytes, which is more than 10 times smaller than it was originally (3600 bytes)!
Can somebody please help me out of this bind, as I need the if condition to set my direction output?
I have attached the full code.
Code:
//PID-Controller
/* Table
*********************************************************  Tc=0.92s
Controller   Kp (*Kc)   Ti (*Tc)   Td (*Tc)
P            0.50       -          -
PI           0.45       0.83       -
PID          0.60       0.50       0.125
**********************************************************
*/
#include <avr/io.h>
#include <avr/interrupt.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define SEI() asm volatile ("sei")
#define NOP() asm volatile ("nop")
#define reload 49535
#define prescaler0 1 //prescaler timer0
#define prescaler1 1 //prescaler timer1
#define AngReqd 90.0 //req posn
#define Wk 2.5 //=AngReqd*M_PI/180 ; reqd posn in rads
#define KconRadByte 81.169 //81.169 = 255/M_PI for conv to 8 bit value from radians
#define Kc 1.75 //Critical prop constant obtained using pcont for stable constant ossci.
#define Tsamp 0.001 //Tsamp=(65535-reload)*prescaler1/crystal.freq(=16 MHz)
#define Ti 0.46
#define Td 0.115
#define Kp 1.05
#define Ki 0.002283 //=Tsamp*Kp/Ti
#define Kd 120.75 //=Td*Kp/Tsamp
volatile unsigned int counter = 0; // value to be written to OCRA
volatile unsigned int adcout = 0; //value read from the ADC gives posn
volatile float Ekp=0; //previous error
volatile float sum=0; //sum
volatile float Xk=0; //present posn from adcout
volatile float Ek=0; //present error
volatile float Dk=0; //diff error corrtn
volatile float Ik=0; //int error corrtn
volatile float Pk=0; //prop error corrtn
volatile float Uk=0; //net corrtn
int main(void)
{
    // Setup Timer1
    TCCR1A = 0;
    TCCR1B = (1<<CS10); // Prescale = 1 (prescaler1), as assumed in the Tsamp calculation
    // note that the higher register is always written first
    TCNT1 = reload;
    TIMSK1 = (1<<TOIE1);
    // Setup timer0 WITH PWM,
    TCCR0A |= (1<<WGM00)|(1<<COM0A1)|(1<<CS00);
    // Write a proper value to the compare register
    OCR0A = counter; //initial setting is 0
    // output compare interrupt enabled
    TIMSK0 = (1<<OCIE0A);
    // Setup the ADC
    ADMUX = (1<<ADLAR);
    //ADCSRB = (1<<ADTS2)|(1<<ADTS1);
    ADCSRA = (1<<ADEN)|(1<<ADIE)|(1<<ADPS2)|(1<<ADPS1)|(1<<ADPS0); //not started yet
    // Set the B-Pins to OUTPUT (since OC0A is located at PB7)
    DDRB = 0xff;
    DDRD = 0x00;
    DDRC = 0xff;
    DDRE = 0xff;
    DDRA = 0xff;
    PORTA = 0xff;
    PORTC = 0xff;
    PORTE = 0xff;
    //enable all interrupts
    SEI();
    while (1) //endless loop
    {}
    return 0;
}
//Timer 1 overflow
ISR (TIMER1_OVF_vect)
{
    TCNT1 = reload;
    ADCSRA |= (1<<ADSC)|(1<<ADATE); //Start ADC
    //It doesn't seem like a wait routine is required for the completion of the ADC conversion
    short int i=0;
    while(i<18)
    {
        NOP();
        i++;
    }
    //to check the frequency of the loop
    int temp=0;
    temp = PORTA;
    PORTA = 255-temp;
    //PID routine
    Xk = adcout/KconRadByte;
    ADCSRA &= ~(1<<ADATE); //disable ADC auto-trigger after reading the corresponding position (ADSC clears itself when the conversion ends)
    Ek = Wk-Xk;
    sum = sum + Ek; //adding the errors
    Pk = Kp*Ek; //prop error correction
    Ik = Ki*(sum); //integral error correction
    Dk = Kd*(Ek-Ekp); //diff error correction
    Uk = (Pk+Ik+Dk); //total correction
    Ekp = Ek; //storing the present error value for the next pass
    Ek = 0;
    if (Uk<0) //changing the direction signal
    {
        PORTC = 0x00;
    }
    else PORTC = 0xff;
    //End of routine
    counter = fabs(KconRadByte*Uk); //value between 0 and 255 for the PWM compare reg
}
//Timer 0 PWM
ISR (TIMER0_COMP_vect)
{
    OCR0A = counter;
}
//ADC
ISR (ADC_vect)
{
    adcout = ADCH;
}
Kaushik
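
The constants in the listing appear to follow the classic Ziegler-Nichols ultimate-gain table quoted in its header comment, combined with the discrete-time formulas noted next to `Ki` and `Kd`. The sketch below only reproduces that arithmetic so the numbers can be checked on a PC; the struct and function names are invented for illustration.

```c
/* Gains for the discrete PID form used in the listing:
   Uk = Kp*Ek + Ki*sum(Ek) + Kd*(Ek - Ekp) */
typedef struct {
    double kp;
    double ki;
    double kd;
} pid_gains_t;

/* kc    : critical (ultimate) proportional gain
   tc    : oscillation period at that gain, in seconds
   tsamp : sample period of the control loop, in seconds */
pid_gains_t zn_pid_gains(double kc, double tc, double tsamp)
{
    pid_gains_t g;
    double ti = 0.50 * tc;        /* PID row of the table: Ti = 0.50*Tc  */
    double td = 0.125 * tc;       /* PID row of the table: Td = 0.125*Tc */

    g.kp = 0.60 * kc;             /* PID row of the table: Kp = 0.60*Kc  */
    g.ki = tsamp * g.kp / ti;     /* Ki = Tsamp*Kp/Ti */
    g.kd = td * g.kp / tsamp;     /* Kd = Td*Kp/Tsamp */
    return g;
}
```

With Kc = 1.75, Tc = 0.92 s and Tsamp = 0.001 s this gives Kp = 1.05, Ki of about 0.002283 and Kd = 120.75, matching the `#define` values in the listing.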

Posted: Jun 01, 2006 - 05:29 PM
Joined: Feb 19, 2001
Posts: 26115
Location: Wisconsin USA

Well, I'd never structure my program that way. I'd start the next conversion after the completion of the previous one, and always have a fresh A/D reading available.
And you are concerned about speed, yet you have wasted time in the ISR.
But what the other posters have said is certainly true: you are only starting with an 8-bit reading. So everything can be done in integers, using ratios to get your factors. With careful attention to possible overflows, 16 bits might even be enough for all calculations. At worst, only an "overflow" to 32 bits would be needed during the ratio operations.
Lee
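
One way to read the "integers with ratios" suggestion is sketched below. It is an illustration under assumptions, not code from the thread: the error is kept in raw 8-bit ADC counts rather than radians, the gains Kp = 1.05, Ki = 0.002283 and Kd = 120.75 are approximated by the ratios 21/20, 3/1314 and 483/4, and the names (`pid_state_t`, `pid_step`) are hypothetical. The products are widened to 32 bits, as the post anticipates.

```c
#include <stdint.h>

/* Gains as integer ratios (assumed approximations of the thread's values). */
#define KP_NUM 21
#define KP_DEN 20        /* 21/20   = 1.05      */
#define KI_NUM 3
#define KI_DEN 1314      /* 3/1314  ~ 0.002283  */
#define KD_NUM 483
#define KD_DEN 4         /* 483/4   = 120.75    */

/* Limit the error sum so the integral term alone stays within +/-255. */
#define SUM_LIMIT ((int32_t)255 * KI_DEN / KI_NUM)

typedef struct {
    int16_t prev_err;    /* error from the previous sample */
    int32_t sum;         /* running sum of errors          */
} pid_state_t;

/* One PID step on 8-bit readings; returns a signed drive value in -255..255.
   The sign would select the direction, the magnitude the PWM duty. */
int16_t pid_step(pid_state_t *s, uint8_t setpoint, uint8_t measured)
{
    int16_t err = (int16_t)setpoint - (int16_t)measured;

    s->sum += err;
    if (s->sum > SUM_LIMIT)  s->sum = SUM_LIMIT;
    if (s->sum < -SUM_LIMIT) s->sum = -SUM_LIMIT;

    int32_t p = ((int32_t)err * KP_NUM) / KP_DEN;
    int32_t i = (s->sum * KI_NUM) / KI_DEN;
    int32_t d = ((int32_t)(err - s->prev_err) * KD_NUM) / KD_DEN;
    s->prev_err = err;

    int32_t u = p + i + d;
    if (u > 255)  u = 255;
    if (u < -255) u = -255;
    return (int16_t)u;
}
```

Testing the sign of the returned integer replaces the float comparison `Uk<0` from the original routine. The clamp on the sum is an addition that the original code does not have.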

Posted: Jun 01, 2006 - 06:11 PM
Joined: Sep 04, 2002
Posts: 21396
Location: Orlando Florida

Looks like you are doing dozens of software floating-point calculations in the timer interrupt handler?? Get those out of there!

Posted: Jun 01, 2006 - 08:04 PM
Joined: Feb 19, 2001
Posts: 26115
Location: Wisconsin USA

Uncle Bob,
If the primary or sole job of the app is to close the PID loop every millisecond, there wouldn't be anything fundamentally wrong with doing it right in the timer interrupt. But I tend to agree that you or I would structure the program differently. It also limits what can be done, especially during development: say you want to dump the position to the UART; now you'd be asking for trouble.
Lee

Posted: Jun 01, 2006 - 09:01 PM
Joined: Sep 20, 2003
Posts: 705
Location: Semmering, Austria

From my own tests, the fastest way to do this type of operation is to use scaled unsigned integers wherever possible. Also determine whether any variables can be scaled to unsigned char, which is even faster. Then it is good to go through the code assigning the most-used variables to registers. I concur about getting as much as possible out of the ISR.
_________________ Ralph Hilton

Posted: Jun 01, 2006 - 10:11 PM
Joined: Sep 04, 2002
Posts: 21396
Location: Orlando Florida

But you can get a timer overflow every 16 usec on an 8-bit timer with a prescaler of 1 (256 counts at 16 MHz). That gives nice fine delta-t resolution if you are measuring time, but not if he's doing 1 ms of calcs (60 fp ops!) in the handler!

Posted: Jun 02, 2006 - 08:17 AM
Joined: Oct 08, 2005
Posts: 36

After a lot of experimenting, I finally figured out that the part of the whole routine which took the most time was the comparison of floats in the 'if' (the rate-determining step). If I do the same thing with an int, my code is much faster. I also found that the general opinion that using floats for calculations makes the code slower is not totally correct (for small calculation routines): for my small piece of code, using floats or scaled integers didn't make much of a difference.
Thanks all!!
Kaushik

Posted: Jun 02, 2006 - 08:42 AM
Joined: May 18, 2006
Posts: 142

kaushikj wrote:
I also found that the general opinion that using floats for calculations makes the code slower is not totally correct (for small calculation routines): for my small piece of code, using floats or scaled integers didn't make much of a difference.
FWIW:
There are tricks one can play with integer scaling that are guaranteed to blow away floats every time. They often involve "lying" about where the decimal point really is, as well as moving it around during a given calculation. Go back to your code in 6 months and you'll think you were drunk when you wrote that section.
That said, you might have to drop down to ASM (or asm("")) to really gain the benefit, as the C compiler just might not be efficient enough in some cases for you to tighten up the int code and see any benefit.
In fact I rarely even use floats on a PC, or on an MPU that has them, unless I really need the dynamic range they offer.
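
To make "moving the decimal point around" concrete, here is a small sketch using the gains from the original listing. The scalings are assumptions chosen for illustration: Ki = 0.002283 is too small for 8 fractional bits, so it is stored with 16 (150/65536 is about 0.002289), while Kd = 120.75 only needs 2 fractional bits (483/4). The macro and function names are hypothetical.

```c
#include <stdint.h>

#define KI_Q16 150   /* round(0.002283 * 65536): binary point 16 bits from the right */
#define KD_Q2  483   /* 120.75 * 4:              binary point 2 bits from the right  */

/* Integral term: the error sum is a plain integer, so the product carries
   16 fractional bits; dividing by 65536 moves the point back.
   Assumes |error_sum| has been clamped to well below 14 million so the
   32-bit product cannot overflow and the result fits in 16 bits. */
int16_t integral_term(int32_t error_sum)
{
    return (int16_t)((error_sum * KI_Q16) / 65536L);
}

/* Derivative term: same idea with only 2 fractional bits.
   Assumes delta is a difference of 8-bit readings (-255..255). */
int16_t derivative_term(int16_t delta)
{
    return (int16_t)(((int32_t)delta * KD_Q2) / 4);
}
```

Each constant gets its own scale, and the only bookkeeping is the divisor at the end of each expression. That is also where the "you'll think you were drunk" effect comes from, so a comment stating the position of the binary point next to every constant is worth the typing.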

Posted: Jun 02, 2006 - 09:44 AM
Joined: Aug 21, 2004
Posts: 2226
Location: Germany

Hi
@Lee
Quote:
I'd start the next conversion after the completion of the previous,
If you start the A/D conversion after the PID calculation, you don't have equal time steps for the PID.
You therefore get a different time from one sample to the next.
This changes the timing behaviour of the whole PID routine, and you risk the PID loop oscillating.
I recommend running the ADC with a CONSTANT sample frequency.
The software should then look like this:
* init
* clear PID output value
* start conversion
Loop:
* wait till ADC finished
* (start a new conversion if not in free running / timer controlled mode)
* output PID output value
* read ADC value in
* process PID
* goto Loop
Then you have equal timing:
* constant sample frequency
* constant dead time (time between data input and data output)
BTW: I often check the runtime by setting an AVR output when the ADC is finished and clearing it before going back to Loop. You can then check the processing time with an oscilloscope.
_________________ Klaus
********************************
Look at: www.megausb.de (German)
********************************
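
The loop structure listed above could be sketched roughly as follows. The hardware access is hidden behind hypothetical function pointers (on an AVR they might poll ADIF, set ADSC, read ADCH and write OCR0A), and the `sim_*` stand-ins exist only so the ordering can be tried on a PC; none of these names come from the thread.

```c
#include <stdint.h>

/* Hypothetical hardware hooks. */
typedef struct {
    void    (*wait_adc_done)(void);
    void    (*start_conversion)(void);
    uint8_t (*read_adc)(void);
    void    (*write_output)(int16_t u);
} loop_hw_t;

/* Runs `cycles` passes of the loop; firmware would loop forever instead. */
void control_loop(const loop_hw_t *hw, int16_t (*pid)(uint8_t), uint16_t cycles)
{
    int16_t u = 0;                      /* clear PID output value */
    hw->start_conversion();             /* start the first conversion */
    while (cycles--) {
        hw->wait_adc_done();            /* wait till ADC finished */
        hw->start_conversion();         /* not needed in free-running mode */
        hw->write_output(u);            /* output the value computed last pass */
        u = pid(hw->read_adc());        /* read ADC value in, process PID */
    }
}

/* --- Host-side stand-ins so the sketch can be exercised on a PC --- */
static uint8_t sim_samples[4] = { 10, 20, 30, 40 };
static uint8_t sim_index;
int16_t sim_outputs[4];
uint8_t sim_out_count;

static void    sim_wait(void)  { }
static void    sim_start(void) { }
static uint8_t sim_read(void)  { return sim_samples[sim_index++]; }
static void    sim_write(int16_t u) { sim_outputs[sim_out_count++] = u; }
static int16_t sim_pid(uint8_t sample) { return (int16_t)(100 - sample); }

const loop_hw_t sim_hw = { sim_wait, sim_start, sim_read, sim_write };

void run_simulation(void)
{
    sim_index = 0;
    sim_out_count = 0;
    control_loop(&sim_hw, sim_pid, 4);
}
```

Because the output is written at the top of each pass, before the new PID value is computed, the value applied always lags the sample it was computed from by exactly one sample period, however long the calculation itself takes. That is the constant dead time the post refers to.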

Posted: Jun 02, 2006 - 01:59 PM
Joined: Sep 04, 2002
Posts: 21396
Location: Orlando Florida

He wants it to run as fast as possible. It was running 1 ms per loop, and he was griping that was not fast enough! I'd time it in a loop until I got it to run as fast as possible, let's say 400 usec, then set a timer interrupt for 500 usec that sets a flag, and hang in main waiting for the flag to change. Then you have nice low-jitter timing.
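
A rough sketch of the flag approach, written as plain C so it can be tried off-target. On the AVR, `on_timer_tick()` would be the whole body of the timer ISR and `service_tick()` would be called from the endless loop in `main`; both names, the missed-tick counter and `demo_control_step()` are illustrative assumptions rather than anything from the thread.

```c
#include <stdbool.h>
#include <stdint.h>

volatile bool     tick_pending = false;  /* set by the timer ISR, cleared by main */
volatile uint16_t missed_ticks = 0;      /* ticks that arrived before the last one was served */

/* Body of the (hypothetical) 500 usec timer ISR: just raise the flag. */
void on_timer_tick(void)
{
    if (tick_pending)
        missed_ticks++;                  /* the previous control step overran its slot */
    tick_pending = true;
}

/* Called repeatedly from main's endless loop. Runs one control step per tick
   and returns true if a step was executed. */
bool service_tick(void (*control_step)(void))
{
    if (!tick_pending)
        return false;
    tick_pending = false;
    control_step();
    return true;
}

/* Stand-in for the PID routine, only here so the sketch can be tried on a PC. */
uint16_t demo_steps_run = 0;
void demo_control_step(void) { demo_steps_run++; }
```

The ISR stays a few instructions long, so it can no longer "interfere with itself", and a non-zero `missed_ticks` is a cheap indication that the control step does not fit in the chosen period.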

Posted: Jun 02, 2006 - 02:29 PM
Joined: Feb 22, 2002
Posts: 220

Avoiding floats on a PC is a recipe for poorer performance. The FP units on modern X86 variants are as fast as the integer units, and they're idle unless you use them. Unless you are using SSE integer vector math, you should see an improvement doing your math in FP, as that frees the integer units to run your control code.
Code development goes more quickly, and with fewer errors, when you don't have to keep track of where your decimal point belongs.
Given that the output of kaushikj's routine is "bang-bang", I can see why he wants it to run faster. If it's possible, making the output proportional via PWM might yield better performance than speeding it up.

Posted: Jun 02, 2006 - 02:51 PM
Joined: Oct 08, 2005
Posts: 36

Bob, the only reason I am running the loop every ms is that I cannot go faster: the interrupt interferes with itself.
Scott, can you please explain exactly what you meant by the following?
Quote:
making the output proportional via PWM
Thanks all.
Kaushik

Posted: Jun 02, 2006 - 03:36 PM
Joined: Feb 22, 2002
Posts: 220

Ah, I failed to read the comment about changing direction, and presumed the error signal was being used only to drive the direction control line. (It was the only line in the routine that talked to hardware.) That's known as bang-bang control, and it works if you can run the control algorithm fast enough.
Your second listing clears it up, and I should have read it before commenting. My apologies.
I do think you'd get significantly better performance from carefully crafting a fixed-point version of the routine. As someone gleaned from an Atmel app note, you should be able to get through the PID routine in about 55 microseconds at 16 MHz (877 cycles). I haven't looked at that app note, but even that seems like a long time.

Posted: Jun 02, 2006 - 04:28 PM
Joined: Sep 04, 2002
Posts: 21396
Location: Orlando Florida

Put the PID calcs in a subroutine and call it from main. The calcs are taking longer than your timer interrupt period. That's why your delta-T computations are strange!

Posted: Jun 03, 2006 - 01:07 AM
Joined: May 18, 2006
Posts: 142

ScottKroeger wrote:
Avoiding floats on a PC is a recipe for poorer performance. The FP units on modern X86 variants are as fast as the integer units, and they're idle unless you use them. Unless you are using SSE integer vector math, you should see an improvement doing your math in FP, as that frees the integer units to run your control code.
I disagree, as you are making 2 assumptions:
1) The OS is smart enough to "parallelize" the operations, which Windows does not do on a single-CPU machine.
2) You still typically need to perform a FLOAT <-> LONG conversion step. So even if the FPU runs as fast as the CPU, the float still incurs additional overhead.
Now that said, with dual cores starting to become standard, and the OS actually having a shot at throwing both CPUs at the problem, I think you're right that I may need to consider getting a little more orthodox.
Also, when the Pentium came out, Intel stopped publishing actual clocks per instruction, claiming they were meaningless numbers due to the speculative pipeline they were now implementing. So if you know where I can see some hard data sheets that actually give clock counts for the FPU and INT math operations again, I'd *LOVE* to see them.

Posted: Jun 03, 2006 - 02:12 AM
Joined: Feb 22, 2002
Posts: 220

"The OS is smart enough to "parallelize" the operations, which Windows does not do on a single-CPU machine."
Neither the OS nor the compiler needs to know much, if anything, about the X86's ability to simultaneously issue FP and integer instructions. If you have a mix of FP and integer instructions, the instruction scheduling hardware will execute them in parallel for you. It will also speculatively execute instructions on both sides of a branch, and discard those that get bypassed. Intel's compilers do try to keep the execution pipelines full by avoiding data dependencies that cause stalls.
The P4's SSE2 unit can do four FADDs or two FMULs per clock. While the integer units are calculating source and destination pointers and testing condition codes, the FP unit can be doing the math. All this parallelism is in the hardware. So if you were going to compute the average of an array of 32-bit numbers, you'd find it going faster if the numbers were floating point.
Of course you can write a routine that runs poorly in FP, but the idea that FP is always slower is untrue, and as more emphasis is placed on FP performance, it will only get better. The PowerPC's AltiVec unit is a good example of FP being faster than integer.

Posted: Jun 03, 2006 - 04:03 AM
Joined: May 18, 2006
Posts: 142

ScottKroeger wrote:
The P4's SSE2 unit can do four FADDs or two FMULs per clock.
Wow! Point taken. I've obviously been away *FAR* too long from where Intel's at. Last I knew they were nowhere near parity (about 50% as I recall), and my programming habits obviously show it.
Seriously, can you refer me to some white papers so I can get caught up?
Man, this place is *AWESOME*!

Posted: Jun 03, 2006 - 05:28 AM
Joined: Feb 22, 2002
Posts: 220

I don't have any white papers to recommend, but Ars Technica has been a good source of CPU architecture comparisons over the years. They've been keeping track of the relative merits of PowerPC and the various spins of the X86.
The rising importance of gaming is putting pressure on Intel to improve FP performance. AMD has the edge in gaming and Intel wants it back. It's nice to know that soon the easiest way to slog through numerical algorithms (FP) will also be the fastest way, even if you don't code well.

Posted: Jun 03, 2006 - 06:11 AM
Joined: Sep 25, 2003
Posts: 2189
Location: Los Angeles, USA

Quote:
I am using AT90CAN128 and using AvrStudio.
Long ago I lost the connection between this thread and the OP's request.