Optimizing libc integer conversion routines


Quite a lot of it already is ;-) 

 

Do bear in mind though the requirement to make best use of 17 different architectures (MUL when you can etc). Handling those variants may be easier in plain Asm with conditional sections. Trying to handle that too in the already fraught inline syntax could be "fun". ;-) 


Wouldn't it be easier to just learn to count in hex and then re-define ASCII so that the codes for ABCDEF came right after 0123456789? Ah, wait, no, someone would complain that they want lowercase, forget that.


People who use lower case in hex give me the willies! OK, I suppose 0xDEADBEEF and 0xdeadbeef both work but when you have something like 0xACE81ADE ("Ace Blade") it just doesn't look right as 0xace81ade !!

 

(oh and in my world the 'x' is always 'x' and never 'X'. Euugh! This does of course mean "0x%08X" though! ;-)


Agreed, lowercase hex is nearly as bad as using spaces instead of tabs, or tab sizes other than 4.


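OK, I had some time, and here is an ASM version that can run on all AVRs (with RAM).

It is 78 bytes and takes at most 201 clocks (76 bytes and 200 clocks if you have a defined zero register).

But the best part is actually the small number of registers used: only 2 other than the input data and the pointer.

It doesn't print leading zeros, so 123 is printed as 123\0.

It's based on alternating count down, count up, count down, ...: subtract 10000 until the value goes negative, then add 1000 until it is non-negative again, and so on for each digit.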

;input:   r17:r16 = 16-bit value to convert
;input:   Z points to the start of the string buffer
;changes: r16, r17, r24, r25, Z


        mov     r25,ZL      ;remember org pointer start
        ldi     r24, '0'-1
ML1:	inc     r24
        subi    r16, low(10000)       
        sbci    r17, high(10000)
        brcc    ML1
        cpi     r24,'0'
        breq    PL1         ;never print a 0 on first digit
        st      Z+,r24

PL1:
        ldi     r24, '0'+10
ML2:	dec     r24
        subi    r16, low(-1000)       
        sbci    r17, high(-1000)
        brcs    ML2
        cpi     r24,'0'
        brne    PL2         ;!= '0' print
        cpse	ZL,r25      ;if nothing printed don't print '0'
PL2:	st      Z+,r24


        ldi     r24, '0'-1
ML3:	inc     r24
        subi    r16, low(100)           
        sbci    r17, high(100)
        brcc    ML3
        cpi     r24,'0'
        brne    PL3         ;!= '0' print
        cpse    ZL,r25      ;if nothing printed don't print '0'
PL3:	st      Z+,r24


        ldi     r24, '0'+10
ML4:	dec     r24
        subi    r16, -10 
        brcs    ML4
        cpi     r24,'0'
        brne    PL4         ;!= '0' print
        cpse    ZL,r25      ;if nothing printed don't print '0'
PL4:	st      Z+,r24


        subi    r16, -'0'
        st      Z+,r16      ;always print last digit
        ldi     r16, 0
        st      Z+,r16      ;0 terminator

I will look at a faster version, but the code will be bigger; something like 150 bytes and 120 clocks should be possible.

Another way is something like a 40 byte, 300 clock version.
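For anyone who wants to check the count down / count up idea on a PC first, here is a minimal C sketch of the same digit logic. It is only a model of the arithmetic, not the AVR code itself: the function name `u16_to_dec_updown` is made up for illustration, and a signed 32-bit remainder stands in for the carry flag that the assembly tests.

```c
#include <stdint.h>

/* Hypothetical C model of the alternating count-down / count-up conversion.
 * Writes v as decimal without leading zeros and returns the buffer start.
 * out must have room for 6 bytes (5 digits plus the terminator). */
char *u16_to_dec_updown(uint16_t v, char *out)
{
    char *start = out;
    int32_t r = v;              /* signed, so "borrow" shows up as r < 0 */
    char d;

    d = '0' - 1;                /* ten-thousands: subtract until negative */
    do { d++; r -= 10000; } while (r >= 0);
    if (d != '0') *out++ = d;   /* never print a 0 as the first digit */

    d = '0' + 10;               /* thousands: add until non-negative */
    do { d--; r += 1000; } while (r < 0);
    if (d != '0' || out != start) *out++ = d;

    d = '0' - 1;                /* hundreds: subtract until negative */
    do { d++; r -= 100; } while (r >= 0);
    if (d != '0' || out != start) *out++ = d;

    d = '0' + 10;               /* tens: add until non-negative */
    do { d--; r += 10; } while (r < 0);
    if (d != '0' || out != start) *out++ = d;

    *out++ = (char)('0' + r);   /* units: always printed */
    *out = '\0';
    return start;
}
```

The point of alternating directions is that the remainder never has to be restored after a loop overshoots; the overshoot simply becomes the starting value for the next digit.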


This was better than expected:

54 bytes and max 249 clocks, and only r18 and r19 added as changed registers.

_digit:
		ldi     r24,'0'-1
_digit1:
		inc     r24
		sub     r16,r18
		sbc     r17,r19
		brcc    _digit1
		cpi	r24,'0'
		brne	_digit2			;!= '0' print
		cpse	ZL,r25		;if nothing printed don't print '0'
_digit2:        st      Z+,r24
		add     r16,r18
		adc     r17,r19
		ret
convert:
		mov     r25,ZL		;remember where the string starts
		ldi     r18,low(10000)
		ldi     r19,high(10000)
		rcall   _digit
		ldi     r18,low(1000)
		ldi     r19,high(1000)
		rcall   _digit
		ldi     r18,low(100)
		ldi     r19,high(100)
		rcall   _digit
		ldi     r18,low(10)
;		ldi     r19,high(10)	;not needed: r19 is already 0 from high(100)
		rcall   _digit
                subi    r16, -'0'
		st	Z+,r16	;always print last digit
		st	Z+,r19	;terminator
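
A rough C equivalent of the subroutine approach, in case it helps to see the structure: one helper handles a digit for a given weight, and the caller just feeds it 10000, 1000, 100 and 10. The names `emit_digit_sub` and `u16_to_dec_sub` are invented for this sketch, and `while (*v >= weight)` is my shorthand for the "subtract until borrow, then add the weight back" sequence in the assembly.

```c
#include <stdint.h>

/* Hypothetical helper: extract one decimal digit for the given weight and
 * store it unless it would be a leading zero. Returns the new write pointer. */
static char *emit_digit_sub(uint16_t *v, uint16_t weight,
                            char *out, const char *start)
{
    char d = '0';
    while (*v >= weight) {      /* same result as subtract-until-borrow + restore */
        *v -= weight;
        d++;
    }
    if (d != '0' || out != start)
        *out++ = d;
    return out;
}

/* out must have room for 6 bytes (5 digits plus the terminator). */
char *u16_to_dec_sub(uint16_t v, char *out)
{
    static const uint16_t weights[] = { 10000, 1000, 100, 10 };
    char *p = out;
    for (int i = 0; i < 4; i++)
        p = emit_digit_sub(&v, weights[i], p, out);
    *p++ = (char)('0' + v);     /* units: always printed */
    *p = '\0';
    return out;
}
```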
 

 


Last version before bedtime: it counts both down and up on each digit, but the count down uses a step twice as big.

It is 68 bytes and takes at most 188 clocks.

_digit:
		ldi r24,'0'
_digit1:
		subi r24,-2
		sub r16,r18
		sbc r17,r19
		brcc _digit1
		lsr r19
		ror r18
		add r16,r18
		adc r17,r19
		brcs _digit3
		dec r24
		add r16,r18
		adc r17,r19
//		rjmp _digit4
_digit3:dec r24
_digit4:
		cpi		r24,'0'
		brne	_digit2			;!= '0' print
		cpse	ZL,r25		;if nothing printed don't print '0'
_digit2:st		Z+,r24
		ret

convert:
		mov		r25,ZL		;remember where the string starts
		ldi     r18,low(20000)
		ldi     r19,high(20000)
		rcall    _digit
		ldi     r18,low(2000)
		ldi     r19,high(2000)
		rcall    _digit
		ldi     r18,low(200)
		ldi     r19,high(200)
		rcall    _digit
		ldi     r18,low(20)
;		ldi     r19,high(20)	;not needed: r19 is already 0 after the previous digit
		rcall    _digit
        subi    r16, -'0'
		st		Z+,r16	;always print last digit
		st		Z+,r19	;terminator
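
Here is a small C sketch of the double-step trick, assuming I have described it correctly: subtract twice the digit weight while counting up by 2, then correct with one or two additions of the single weight. The names `digit_step2` and `u16_to_dec_step2` are hypothetical, and the signed 32-bit remainder again replaces the carry flag.

```c
#include <stdint.h>

/* Hypothetical model of one digit: two_w is twice the digit weight. */
static char digit_step2(int32_t *r, int32_t two_w)
{
    char d = '0';
    do { d += 2; *r -= two_w; } while (*r >= 0);   /* overshoot in steps of 2 */
    int32_t w = two_w / 2;
    *r += w;                    /* first correction with the single weight */
    if (*r < 0) {               /* still negative: the digit was even */
        d--;
        *r += w;
    }
    d--;
    return d;
}

/* out must have room for 6 bytes (5 digits plus the terminator). */
char *u16_to_dec_step2(uint16_t v, char *out)
{
    static const int32_t two_w[] = { 20000, 2000, 200, 20 };
    char *p = out;
    int32_t r = v;
    for (int i = 0; i < 4; i++) {
        char d = digit_step2(&r, two_w[i]);
        if (d != '0' || p != out)   /* suppress leading zeros */
            *p++ = d;
    }
    *p++ = (char)('0' + r);     /* units: always printed */
    *p = '\0';
    return out;
}
```

In the worst case (digit 8 or 9) the loop runs 5 times instead of 10, which is where the speed comes from.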

 


skeeve wrote:
avr-gcc inline assembly can use R0..R31 in any way one wants. The compiler has to be informed. gcc inline assembly has the syntax and semantics to connect registers with C variables. That is what makes gcc inline assembly so much more powerful than others.

 

FYI, you can't use just any register in inline asm, for example the following code won't compile:

void f (int*);

void g (int a)
{
    f (&a);
    __asm (" " ::: "28");
}

Similar code is frequently used  — not taking the address of a parameter, but passing down the address of a local, non-static buffer array to some receiver or transmitter routine.

 

avrfreaks does not support Opera. Profile inactive.

Last Edited: Fri. Oct 21, 2016 - 06:08 PM

I'm not an expert on inline asm, but isn't that just failing because you're telling avr-gcc you're clobbering r28, and avr-gcc requires r28 to be saved and restored? I.e. you can use it however you want if you think you know better, but telling the compiler what you're doing will make it angry.


SprinterSB wrote:
FYI, you can't use just any register in inline asm, for example the following code won't compile:

void f (int*);

void g (int a)
{
    f (&a);
    __asm (" " ::: "28");
}

Similar code is frequently used  — not taking the address of a parameter, but passing down the address of a local, non-static buffer array to some receiver or transmitter routine.

What is the error message?

Does it help to put in the r?

Moderation in all things. -- ancient proverb


Adding "r" won't help, that's just some sugar.

 

foo.c: In function 'g':
foo.c:7:1: error: r28 cannot be used in asm here

That's because r28 is part of the frame pointer.
 

avrfreaks does not support Opera. Profile inactive.


SprinterSB wrote:
Adding "r" won't help, that's just some sugar.

 

foo.c: In function 'g':
foo.c:7:1: error: r28 cannot be used in asm here

That's because r28 is part of the frame pointer.

According to #5 here, it can.

Moderation in all things. -- ancient proverb


I had a look at how small it can be, and code like this is about the smallest I can think of at the moment.

The string is zero terminated, but leading zeros aren't removed, so 123 prints as 00123\0.

It uses a 16/8 divide routine, and because the digits come out in reverse order (least significant first), the leading zeros are a pain to remove!

It takes r17:r16 as input, and Z is the pointer.

It takes about 760 clocks.

But it's only 38 bytes in size.

Any smaller code?

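convert:
        ldi     r24,10          ;divisor: divide by 10
        adiw    Z,5             ;point past the 5 digits; they are stored backwards
        ldi     r23,5           ;digit counter
con0:   clr     r18             ;remainder
        ldi     r25,0x10        ;loop over 16 bits
con1:   lsl     r16             ;shift the next dividend bit ...
        rol     r17
        rol     r18             ;... into the remainder
        cp      r18,r24
        brcs    con3            ;remainder < 10: quotient bit stays 0
con2:   sub     r18,r24
        inc     r16             ;set quotient bit
con3:   dec     r25
        brne    con1
        subi    r18,-'0'        ;convert remainder to ASCII
        st      -Z,r18
        dec     r23
        brne    con0
        std     Z+5,r16         ;terminating 0 (the quotient is 0 after 5 divisions)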

Edit: forgot an init.
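
For reference, the divide-by-10 routine can be sketched in C like this. It is only a model, assuming a plain shift-and-subtract 16/8 division where the quotient is built up in the same variable as the dividend; `divmod10_u16` and `u16_to_dec5` are names I made up.

```c
#include <stdint.h>

/* Hypothetical model of the 16/8 shift-and-subtract divide:
 * replaces *v with *v / 10 and returns *v % 10. */
static uint8_t divmod10_u16(uint16_t *v)
{
    uint8_t rem = 0;
    for (int i = 0; i < 16; i++) {
        rem = (uint8_t)((rem << 1) | (*v >> 15));  /* top bit into remainder */
        *v = (uint16_t)(*v << 1);
        if (rem >= 10) {        /* divisor fits: subtract, set quotient bit */
            rem -= 10;
            *v |= 1;
        }
    }
    return rem;
}

/* Always writes 5 digits (leading zeros kept) plus the terminator. */
void u16_to_dec5(uint16_t v, char out[6])
{
    for (int i = 4; i >= 0; i--)        /* digits arrive least significant first */
        out[i] = (char)('0' + divmod10_u16(&v));
    out[5] = '\0';
}
```

Because the digits are produced from the right, the code cannot know how many leading zeros there are until the last division is done, which is why removing them costs extra.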

Last Edited: Mon. Oct 24, 2016 - 10:07 PM

Nothing to do with libc, but when dumping data, say to a 9600 baud serial port, transmission speed can indeed be the most expensive factor in data acquisition time.  It's trivial to write a routine that dumps a byte as two hex digits, but when you paste that data dump into a spreadsheet program (e.g. MS Excel), horrible things happen.

 

(0x omitted)

 

1234 is interpreted as 1234 decimal, when it should be 4660 decimal

12A4 is interpreted as hex, or 4772 decimal (as it should).

12E4 is interpreted as 120,000 decimal when it should be 4836.

 

The horrible solution I came up with is to prefix all values with 'F', which entirely defeats the transmission speed benefit, but does force all the data to be interpreted as hexadecimal.

 

F1234 mod F0000 gives 4660 decimal, always.  &c.
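
A minimal C sketch of the 'F' prefix idea, for illustration only: `hex16_with_guard` and `hex16_from_guarded` are hypothetical names, and the decoding side just mimics what the spreadsheet formula does (parse as hex, then take the value mod F0000).

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sender side: write v as 'F' plus four uppercase hex digits. */
void hex16_with_guard(uint16_t v, char out[6])
{
    static const char digits[] = "0123456789ABCDEF";
    out[0] = 'F';                       /* guard digit forces a hex reading */
    out[1] = digits[(v >> 12) & 0xF];
    out[2] = digits[(v >> 8) & 0xF];
    out[3] = digits[(v >> 4) & 0xF];
    out[4] = digits[v & 0xF];
    out[5] = '\0';
}

/* Hypothetical receiver side: parse as hex and strip the guard with mod F0000. */
uint16_t hex16_from_guarded(const char *s)
{
    return (uint16_t)(strtoul(s, NULL, 16) % 0xF0000UL);
}
```

With this, 0x12E4 goes out as F12E4, which can no longer be mistaken for a number in exponent notation, and comes back as 4836.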

 

Dunno if that helps any.  S.

 


But that is a good reason to use decimal numbers in your log.


I had an AVR system in production a while ago that used base-62 for communication.  [0-9],[A-Z],[a-z] were the allowed ASCII characters.  I wanted base-64, but the barcode system we were using for input and output wouldn't do punctuation.  Ah well...  S.
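
A base-62 encoder along those lines might look roughly like the sketch below. This is not the production code: the digit order (0-9, then A-Z, then a-z), the 32-bit value size and the names `base62_encode` / `base62_decode` are all assumptions made for the example.

```c
#include <stdint.h>
#include <string.h>

/* Assumed alphabet order: digits, then uppercase, then lowercase. */
static const char B62[] =
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

/* Encode v without leading zeros; out needs 7 bytes (62^6 > 2^32). */
char *base62_encode(uint32_t v, char out[7])
{
    char tmp[6];
    int n = 0;
    do {                        /* digits come out least significant first */
        tmp[n++] = B62[v % 62];
        v /= 62;
    } while (v != 0);
    int i = 0;
    while (n > 0)               /* reverse into the output buffer */
        out[i++] = tmp[--n];
    out[i] = '\0';
    return out;
}

/* Decode s into *v; returns 0 on success, -1 on an invalid character.
 * Overflow of the 32-bit result is not checked in this sketch. */
int base62_decode(const char *s, uint32_t *v)
{
    uint32_t acc = 0;
    for (; *s != '\0'; s++) {
        const char *p = strchr(B62, *s);
        if (p == NULL)
            return -1;
        acc = acc * 62 + (uint32_t)(p - B62);
    }
    *v = acc;
    return 0;
}
```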

 

Edited for typo.  S.

Last Edited: Tue. Oct 25, 2016 - 01:12 PM

Scroungre wrote:
I wanted base-64
Ah ha - a reinvention of UUcode ? Those of us old enough to remember email before MIME existed will remember passing binary files back and forth UUencoded!


Pretty much!!  Although ours was a homegrown incompatible version.  I am indeed old - uudecode was our pal!  And to be honest, I still have systems out there that are transmission-speed dependent, and I'm always tempted to try.  Few of the Powers-With-Money are inclined to concur.  S.

 
