I'm not posting this code as a solution to the stated problem; I'm just experimenting to get a feel for the problem and possible solutions. Better to try first on a simple 8-bit version. Already I've found several problems that may come up, mostly rounding errors and limitations of the AVR instruction set.
You can't load immediates into r0-r15, the destination of multiplies is fixed at r1:r0, fmul can't take all registers as arguments, an adci instruction doesn't exist, etc. All of these cost cycles...
Well, for now I have solved the rounding errors: this code takes an 8-bit unsigned value in r19 and returns the correct decimal digits in r18:r17:r1 (hundreds:tens:units) for all values 0-255.
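;multiply
ldi r17, 164      ; 164 = round(2^14 / 100)
fmul r19, r17     ; r1:r0 = (r19 * 164) << 1, C = bit shifted out
adc r18, r18      ; shift that carry into r18
;shift
add r0, r0        ; these three instructions shift r18:r1:r0 left by one,
adc r1, r1        ; so the total factor is now 164 * 4 = 656
adc r18, r18
;rounding correction
ldi r16, 0xFF
sub r1, r16       ; subtracting 0xFFFF from r18:r1 adds 1 (mod 2^16)
sbc r18, r16
;in case r18 was not cleared on entry
andi r18, 3       ; r18 = hundreds digit
;final multiplies for tens and units
ldi r16, 10
mul r16, r1       ; r1 = tens digit, r0 = remaining fraction
mov r17, r1       ; r17 = tens digit
mul r16, r0       ; r1 = units digit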
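The algorithm is:
multiply by 1/100 * 2^14 * 2 with fmul
multiply by 2 with shift, final result is value * 1/100 * 2^16, whose top byte is the hundreds digit
do rounding correction
multiply fractional remainder by 10 = tens digit
multiply fractional remainder by 10 = units digit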
There are 17 architectures, and I sort of assume that the lib code must be written for the lowest common denominator. Having said that, I'm not sure what the implications are for "brain dead" tiny chips. It would seem to imply you could never use anything beneath R16. I suppose some of the LibC source may have some kind of "#ifdef BRAIN_DEAD .. do something really stupid ... #else ..."? Clearly it seems unlikely that all AVRs would be forced to do multiplication (say) without the use of MUL, though.
Also relevant to the discussion: there have been talks about improving these functions on the mailing lists.
For example:
optimization for ltoa/printf (2011) mentions optimization for certain radices only. It also used lookup tables. It seems it was eventually refused because of the code size vs. speed trade-off. No one seemed to argue on principle against optimized code for certain radices (10 in particular).
Speeding [u]ltoa by 35% (2012) introduced radix checks in stdlib.h (I believe it has been committed).
It seems to me you're the right man for this job. :-)
Let me not make any assumptions about what is wrong and just give the facts (as I see them).
I have an ATmega with a pot giving out a pulse that ranges from approx. 1 to 40 Hz. I have another ATmega reading the time between cycles every second.
When at 40 Hz, I get a count of 91516 (my clock's prescaler is 4).
When I start decreasing the frequency, the timer count goes up, and then around 2 Hz the value just keeps increasing each second. It is as if both cntH and cntL are never reset.
Every now and then, I get a count much larger than it should be.
91516 40Hz
91516
91514
163443
146897
155983
117972
215110
215112
215114
215112
116696
170956
170959
170957
170957 20Hz
290749 <------- out of sequence
192329
192331
193355
128721 <--------
180942
180940
180940
180941
333496
426928 <------- out of sequence
426664
426669
189002
264769
579356
609680
692722
775762
858804
1007380 2 Hz; from now on the count just keeps increasing even though the input is still 2 Hz.
1090422
1173464
1256505
1405082
1488122
1571164
1654204
1737245
1820286
1903328
2051905
2134947
2217988
2301029
2449606
2532647
2615689
2698729
2847306
2930347
3013389
3096430
3179471
3262513
3345553
3494131
3577173
3660213
3743255
3891832
3974873
4057914
4140955
4289532
4372573
4455615
4538655
/*
Read input pulse length using PD0
*/
#define F_CPU 16E6
#define USART_BAUDRATE 57600
#define F_BAUD_PRESCALE (((F_CPU/(USART_BAUDRATE*16UL)))-1) // from manual
#include <avr/io.h>
#include <util/delay.h>
#include <string.h>
#include <stdlib.h>
#define lcd_SetCursor 0x80 // set cursor position
#define lcd_Clear 0x01
// Functions
void lcd_write_nibble(uint8_t);
void lcd_write_inst(uint8_t);
void lcd_write_char(uint8_t);
void lcd_write_string(uint8_t *);
void lcd_write_long(long);
void lcd_init_4d(void);
void serial_init(unsigned int BAUD_PRESCALE);
void serial_send(char* data);
void timer1_init(void);
int tov_check(int);
unsigned char sndtime[20];
long cntL;
long cntH;
int edge; // first occurrence
int i; // pointer
/******************************* Main Program Code *************************/
int main(void)
{
    lcd_init_4d(); // initialize the LCD display for a 4-bit interface
    timer1_init();
    serial_init(F_BAUD_PRESCALE);
    while(1)
    {
        edge=1;
        while (!(PIND & (1<<PD0))) // stay in loop while positive signal
        {
            if (edge == 1)
            {
                TCNT1=0; // Reset Timer
                cntH=0;
                edge=0;
            }
            cntH=tov_check(cntH); // Count overflows
        }
        while ((PIND & (1<<PD0))) // stay in loop while negative signal
        {
            cntH=tov_check(cntH);
        }
        if (!(PIND & (1<<PD0))) // wait for rising edge
        {
            cntL=TCNT1; // store timer
            lcd_write_inst(0x01); // clear display
            lcd_write_inst(0x80);
            _delay_ms(2);
            lcd_write_inst(lcd_SetCursor | 0x00); // Line 1
            lcd_write_long(cntH);
            lcd_write_inst(lcd_SetCursor | 0x40); // Line 2
            lcd_write_long(cntL);
            cntL=cntL+(cntH<<16); // Turn into Long
            ltoa(cntL,sndtime,10);
            serial_send("SS");
            serial_send(sndtime);
            serial_send("EE");
        }
        _delay_ms(1000);
    }
}
void timer1_init(void)
{
    TCCR1A = 0x00; // Normal count Mode and waveform
    TCCR1B = (1 << CS11); // only CS11 set: on ATmega Timer1 this selects clk/8, not 4
    TCNT1 = 0; // Initialize
}
int tov_check(int data)
{
    if (TIFR1 & (1 << TOV1)) // Timer1 overflowed since the last check
    {
        data++;
        TIFR1 = (1 << TOV1); // writing 1 clears the flag
    }
    return(data);
}
// LCD Setup
void lcd_init_4d(void)
{
    DDRD = 0xF0; // 4 data lines - output
    DDRB = 0x03; // Control Lines
    _delay_ms(100); // initial 40 mSec delay
    lcd_write_nibble(0x30); // reset sequence
    _delay_ms(10);
    lcd_write_nibble(0x30);
    _delay_us(200);
    lcd_write_nibble(0x30);
    _delay_us(200);
    lcd_write_inst(0x28); // Set interface length = 4, Display Lines, Font 5x7
    _delay_us(80);
    lcd_write_inst(0x08); // turn display OFF
    _delay_us(80);
    lcd_write_inst(0x01); // clear display RAM
    _delay_ms(4);
    lcd_write_inst(0x06); // shift cursor from left to right on read/write
    _delay_us(80);
    lcd_write_inst(0x0C); // turn the display ON
    _delay_us(80);
}
void lcd_write_inst(uint8_t inst)
{
    PORTB &= ~(1<<PORTB0); // select the inst Register (RS low)
    lcd_write_nibble(inst); // write the upper 4-bits of the data
    lcd_write_nibble(inst << 4); // write the lower 4-bits of the data
    _delay_us(80);
}
void lcd_write_char(uint8_t Data)
{
    PORTB |= (1<<PORTB0); // select the Data Register (RS high)
    lcd_write_nibble(Data); // write the upper 4-bits of the data
    lcd_write_nibble(Data << 4); // write the lower 4-bits of the data
    _delay_us(80);
}
void lcd_write_nibble(uint8_t theByte)
{
    if (theByte & 1<<7)
        PORTD |= (1<<PORTD7); //Write 1 which is the actual data
    else
        PORTD &= ~(1<<PORTD7); //Write 0
    if (theByte & 1<<6)
        PORTD |= (1<<PORTD6);
    else
        PORTD &= ~(1<<PORTD6);
    if (theByte & 1<<5)
        PORTD |= (1<<PORTD5);
    else
        PORTD &= ~(1<<PORTD5);
    if (theByte & 1<<4)
        PORTD |= (1<<PORTD4);
    else
        PORTD &= ~(1<<PORTD4);
    PORTB &= ~(1<<PORTB1); // make sure E is initially low
    _delay_us(1);
    PORTB |= (1<<PORTB1); // Enable pin high
    _delay_us(1);
    PORTB &= ~(1<<PORTB1); // Enable pin low
    _delay_us(1);
}
void lcd_write_string(uint8_t String[])
{
    int i = 0; // character counter
    while (String[i] != 0)
    {
        lcd_write_char(String[i]);
        i++;
    }
}
void lcd_write_long(long data)
{
    char st[50];
    ltoa(data,st,10);
    lcd_write_string(st);
}
// Serial setup
void serial_init(unsigned int BAUD_PRESCALE)
{
    UBRR0H = (unsigned char)(BAUD_PRESCALE >> 8); //Upper 8-bits baud rate, Shift Right 8 times
    UBRR0L = (unsigned char)(BAUD_PRESCALE); //lower 8-bits.
    UCSR0B = (1 << RXEN0) | (1 << TXEN0); // Turn on the transmission and reception circuitry
    UCSR0C = (0 << USBS0) | (3 << UCSZ00); // Stop Bits =1 8-bit
}
void serial_send(char *data) // send a string
{
    while(*data != 0)
    {
        while( !(UCSR0A & (1<<UDRE0))); // wait for empty transmit buffer.
        UDR0 = *data; // send the current character (was data[i], relying on the global i being 0)
        data++;
    }
}
After a break, I found the problem. If the input pulse is long, the code misses the positive edge and goes straight into the negative section, which just increases the counter.
Adding this after the 1 sec delay fixed one problem.
while ((PIND & (1<<PD0))) // stay in loop while negative signal
{
    cntH=tov_check(cntH);
}
I have another part of the project which will be using an ISR, and that timing is critical.
The input is a square wave, so the edges are well defined. However, as I rotate the pot (with a scope on the input, the wave appears to increase nice and linearly), I still get these large numbers appearing between valid data. I'm not sure how to look for where they are coming from.
Weird! This is an unsigned conversion, yet we have:
And:
ɴᴇᴛɪᴢᴇᴎ
My question was just about one part of the 16-bit solution (the last 2 digits); that is why I only needed it to be correct up to 99.
I guess I should warn you ASAP, El Tangas: I've run a quick check and it seems fmul isn't used once in the whole libC source code.
I suppose instructions must be compatible with the whole range of AVRs to make it into libC…?
ɴᴇᴛɪᴢᴇᴎ
Or maybe not: there are mechanisms in place in asmdef.h to mimic an enhanced instruction that is not available on every chip.
For example:
ɴᴇᴛɪᴢᴇᴎ
Not sure if there is some kind of coding rules written down for it somewhere? Perhaps people "in the know" just know what these restrictions are, and they are passed on by word of mouth?
Actually I'm intrigued now. I just picked a file at random:
http://svn.savannah.nongnu.org/v...
It seems the whole code is:
So do Tiny chips simply not get this stuff at all?
Oh, c'mon, I've always wanted to use fmul for something. So is mul used anywhere (I don't have the source here, could you check)? It's also not available in every AVR. Also adiw and sbiw, probably others.
edit: nvm, just saw the file clawson linked has mul in it. Every AVR with mul also has fmul, so...
edit 2: You know, just remembered, I've done this before, several years ago, for x86 assembly: https://board.flatassembler.net/topic.php?t=3924
There was even some guy claiming he invented this algorithm in 1999...
@Cliff, we cross-posted. I believe my last post answers one of your questions. I have been intrigued by __AVR_TINY__ too, but haven't looked into it yet… :-p
@El Tangas, it seems fmul is going to be ok if you can provide a "proxy" macro with alternate code for chips that don't have it, as in the movw case. I kind of found it suspicious however that it did not exist yet…
As indicated above, this was from libc-1.8.1. I guess I should run it again on 2.0.0.
Also relevant to this discussion: official benchmarks for libc functions (among which those of interest to us). Excerpt (function names and column headings are missing from this copy; each line gives the three column values for one function):
Stack bytes: 2, 2, 2; MCU clocks: 155, 149, 149
Stack bytes: 2, 2, 2; MCU clocks: 221, 219, 219
Stack bytes: 2, 2, 2; MCU clocks: 879, 875, 875
Stack bytes: 2, 2, 2; MCU clocks: 1597, 1593, 1593
Stack bytes: 53, 53, 53; MCU clocks: 1841, 1694, 1689
Stack bytes: 58, 58, 58; MCU clocks: 1647, 1552, 1547
Stack bytes: 67, 67, 67; MCU clocks: 2573, 2311, 2311
ɴᴇᴛɪᴢᴇᴎ
I would use an ISR for this kind of problem!
In your code you have this:
int tov_check(int);
but you call it with a long!
Sorry, we've hijacked your thread with this discussion about optimizing libc. It's a bit of a mess now that you need to get back to your issue… :-/
ɴᴇᴛɪᴢᴇᴎ
I can't see how your edge detection works.
Basically, you want to sample the input and store the previous value.
curr = PIND & (1<<PD0);
if (prev == 0 && curr != 0)
{
    //rising edge
}
prev = curr;
Also, tov_check accepts an int and returns an int, but you feed it longs. With this function you want to pass by reference.
I've suggested how you should fix your edge detection.