Binary-to-Decimal Conversion--a reference for all

There have been many, many threads on the Forum on binary-to-decimal conversion, usually for purposes of creating ASCII character representations of numbers for display.

Some of the common themes include the surprise at the size of programs containing printf(); complaints about size and/or speed of alternatives such as itoa(); and just lack of basic knowledge on the subject.

With different types of displays, and differing display requirements (sometimes I want to inject an implied decimal point; sometimes I need utoa() for full 16 bits instead of 15; etc.) I'm always searching for the "ultimate method".

Well, I came across a reference that is detailed enough for the purists and also straightforward enough for the beginners:
http://www.cs.uiowa.edu/~jones/b...

Quote:

Binary to Decimal Conversion in Limited Precision
Part of the Arithmetic Tutorial Collection
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Copyright © 1999, Douglas W. Jones, with major revisions made in 2002. This work may be transmitted or stored in electronic form on any computer attached to the Internet or World Wide Web so long as this notice is included in the copy. Individuals may make single copies for their own use. All other rights are reserved.

I >>knew<< there had to be better methods than the straightforward subtraction of powers of 10, and faster methods than the elegant recursive solution (which HAS to be the smallest flash consumer). This article will give all of us something to think about. I'm looking forward to exploring more items in the "collection" now that I've found it.
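
For reference, the recursive approach mentioned above can be sketched in C roughly as follows. This is only an illustration: the name `utoa_recursive` and its interface are made up for this example and are not taken from the Jones article, and the `/` and `%` operators will pull in a division routine on an AVR.

```c
#include <stdint.h>

/* Hypothetical helper: writes the decimal digits of n to out, most
   significant digit first, appends a NUL, and returns a pointer to it. */
char *utoa_recursive(uint16_t n, char *out)
{
    if (n >= 10)
        out = utoa_recursive(n / 10, out);  /* emit the higher digits first */
    *out = (char)('0' + n % 10);            /* then this digit */
    out++;
    *out = '\0';
    return out;
}
```

The buffer needs room for up to five digits plus the terminator.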

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Quote:
...exploring more items in the "collection"...

Just on the off-chance to save someone *some* work while further exploring this collection - my "DIV16_XX" library (#131 in the academy) implements division by constants based on Mr. Jones' nice examples (though I've to admit that it is implemented in assembly - I just couldn't resist "36 cycles for a div. by 23" ...).

:wink:

Andreas

Hi,

I don't suppose anyone has the assembly code that results from the C code presented on the site that Lee references above? If you do, would you be willing to share it with a non-C person (me :D )?

I am interested in seeing if I can figure out how it works. I can sort of figure out the math from the site that Lee gives but just not good enough to write the assembler code myself. Of course the C code presented is pure ______ (any foreign language to an English speaking American inserted here...don't want to offend anyone 8) ) to me. I am also interested in seeing how it compares to a subtraction type of BIN2BCD routine I just finished (and also to the Atmel app. note type of routine for BIN2BCD conversion).

[edit] BTW, thanks for sharing that site with us Lee !!!

TIA
Steve

Last Edited: Wed. Jul 14, 2004 - 04:28 PM

Interesting article! I'm working on the asm version :twisted:

ldi r16, (1<<MB_BEER)
out SREG, r16

see you in an hour.

Christoph

I tend to post off-topic replies when I've noticed some interesting detail.
Feel free to stop me.

Great! 8)

Here is my code to convert a 1-4 byte binary value to packed BCD (conv_bin2bcd) and packed BCD to binary (conv_bcd2bin).
First argument: input data. Second argument: number of output bytes (for values 0-99 size=1, 0-9999 size=2, 0-999999 size=3, 0-99999999 size=4).

unsigned long conv_bin2bcd(unsigned long data,unsigned char size)
{register unsigned long result asm("r16");
 asm ("mov __tmp_reg__,%A2 \n"
      "conv_bin2bcd00: \n"           /*move the low 'size' input bytes to the top*/
        "mov r2,%A1 \n"
        "mov %A1,%B1 \n"
        "mov %B1,%C1 \n"
        "mov %C1,%D1 \n"
        "mov %D1,r2 \n"
        "dec __tmp_reg__ \n"
        "brne conv_bin2bcd00 \n"

      "eor %A0,%A0 \n"  /*clear result*/
      "eor %B0,%B0 \n"
      "eor %C0,%C0 \n"
      "eor %D0,%D0 \n"
      "mov __tmp_reg__,%A2 \n"
      "lsl __tmp_reg__\nlsl __tmp_reg__\nlsl __tmp_reg__\n"  /*__tmp_reg__=size*8*/

      "conv_bin2bcd01: \n"           /*shift loop*/
        "subi %A0,-0x33 \n"         /*add 3 to both BCD digits*/
        "sbrs %A0, 3 \n"            /*skip if low digit was >= 5,*/
        "subi %A0, 3 \n"            /*else undo its +3*/
        "sbrs %A0, 7 \n"            /*skip if high digit was >= 5,*/
        "subi %A0, 0x30\n"          /*else undo its +3*/
        "subi %B0,-0x33 \n"         /*add 3 to both BCD digits*/
        "sbrs %B0, 3 \n"            /*skip if low digit was >= 5,*/
        "subi %B0, 3 \n"            /*else undo its +3*/
        "sbrs %B0, 7 \n"            /*skip if high digit was >= 5,*/
        "subi %B0, 0x30\n"          /*else undo its +3*/
        "subi %C0,-0x33 \n"         /*add 3 to both BCD digits*/
        "sbrs %C0, 3 \n"            /*skip if low digit was >= 5,*/
        "subi %C0, 3 \n"            /*else undo its +3*/
        "sbrs %C0, 7 \n"            /*skip if high digit was >= 5,*/
        "subi %C0, 0x30\n"          /*else undo its +3*/
        "subi %D0,-0x33 \n"         /*add 3 to both BCD digits*/
        "sbrs %D0, 3 \n"            /*skip if low digit was >= 5,*/
        "subi %D0, 3 \n"            /*else undo its +3*/
        "sbrs %D0, 7 \n"            /*skip if high digit was >= 5,*/
        "subi %D0, 0x30\n"          /*else undo its +3*/
        "lsl %A0\nrol %B0\nrol %C0\nrol %D0\n" /*shift out buffer*/

        "sbrc %D1, 7 \n"            /*skip if msbit of input =0*/
        "sbr %A0,1 \n"
        "lsl %A1\nrol %B1\nrol %C1\nrol %D1\n" /*shift in buffer*/

        "dec __tmp_reg__ \n"        /*repeat for all bits*/
        "brne conv_bin2bcd01 \n"
              
  /* data is modified by the asm, so it is declared as an in/out operand;
     result needs an upper register (subi/sbr) and must not overlap inputs */
  : "=&d" (result), "+r" (data) : "r" (size) : "r2"
  );
 return(result);
}
unsigned long conv_bcd2bin(unsigned long data,unsigned char size)
{register unsigned long result asm("r16");
 asm ("eor %A0,%A0 \n"  /*clear result*/
      "eor %B0,%B0 \n"
      "eor %C0,%C0 \n"
      "eor %D0,%D0 \n"
      "mov __tmp_reg__,%A2 \n"
      "lsl __tmp_reg__\nlsl __tmp_reg__\nlsl __tmp_reg__\n"  /*__tmp_reg__=size*8*/

      "conv_bcd2bin00: \n"          /*shift loop*/

        "lsr %D0\nror %C0\nror %B0\nror %A0\n" /*shift out buffer*/

        "sbrc %A1,0 \n"             /*copy lsbit of input to msbit of result*/
        "sbr %D0,0x80 \n"

        "lsr %D1\nror %C1\nror %B1\nror %A1\n" /*shift in buffer*/

        "sbrc %D1, 7 \n"            /*if high digit is now >= 8,*/
        "subi %D1, 0x30 \n"         /*subtract 3 from it*/
        "sbrc %D1, 3 \n"            /*if low digit is now >= 8,*/
        "subi %D1, 3\n"             /*subtract 3 from it*/
        "sbrc %C1, 7 \n"            /*if high digit is now >= 8,*/
        "subi %C1, 0x30 \n"         /*subtract 3 from it*/
        "sbrc %C1, 3 \n"            /*if low digit is now >= 8,*/
        "subi %C1, 3\n"             /*subtract 3 from it*/
        "sbrc %B1, 7 \n"            /*if high digit is now >= 8,*/
        "subi %B1, 0x30 \n"         /*subtract 3 from it*/
        "sbrc %B1, 3 \n"            /*if low digit is now >= 8,*/
        "subi %B1, 3\n"             /*subtract 3 from it*/
        "sbrc %A1, 7 \n"            /*if high digit is now >= 8,*/
        "subi %A1, 0x30 \n"         /*subtract 3 from it*/
        "sbrc %A1, 3 \n"            /*if low digit is now >= 8,*/
        "subi %A1, 3\n"             /*subtract 3 from it*/

        "dec __tmp_reg__ \n"        /*repeat for all bits*/
        "brne conv_bcd2bin00 \n"
              
      "conv_bcd2bin01: \n"          /*move the result bytes back down*/
        "mov __tmp_reg__,%D0 \n"
        "mov %D0,%C0 \n"
        "mov %C0,%B0 \n"
        "mov %B0,%A0 \n"
        "mov %A0,__tmp_reg__ \n"
        "dec %A2 \n"
        "brne conv_bcd2bin01 \n"

  /* data and size are both modified by the asm, so they are in/out operands;
     result and data need upper registers (sbr/subi) */
  : "=&d" (result), "+d" (data), "+r" (size) : : "r2"
  );
 return(result);
}
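
The shift-and-correct idea behind these two routines (often called "double dabble") may be easier to follow in plain C. The sketch below is not a translation of the assembly above: it handles a 16-bit input only, and the function name is invented for illustration.

```c
#include <stdint.h>

/* Converts a 16-bit binary value to packed BCD (one digit per nibble),
   e.g. 65535 -> 0x65535. Before each left shift, every BCD digit that is
   5 or more gets 3 added so that doubling it carries into the next digit. */
uint32_t bin16_to_packed_bcd(uint16_t value)
{
    uint32_t bcd = 0;
    int bit, shift;

    for (bit = 0; bit < 16; bit++) {
        for (shift = 0; shift < 20; shift += 4) {
            if (((bcd >> shift) & 0xF) >= 5)
                bcd += (uint32_t)3 << shift;
        }
        /* shift the next input bit (msb first) into the BCD register */
        bcd = (bcd << 1) | ((uint32_t)(value >> 15) & 1u);
        value = (uint16_t)(value << 1);
    }
    return bcd;
}
```

The assembly performs the same correction by adding 0x33 to a whole byte and then undoing the +3 on each nibble that turned out to be below 5.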

I've started coding it in asm and ran into some weird problem:

putdec:
	mov	d1, number_L	;d1 = (n>>4) & 0xF
	swap	d1
	andi	d1, 0x0F
	
	mov	d2, number_H	;d2 = (n>>8) & 0xF
	andi	d2, 0x0F
	
	mov	d3, number_H	;d3 = (n>>12) & 0xF
	swap	d3
	andi	d3, 0x0F

	mov	d0, d1			;d0 = 6*(d3 + d2 + d1) + (n & 0xF)
	add	d0, d2
	add	d0, d3
	ldi	r16, 6
	mul	d0, r16
	mov	d0, r0
	mov	r16, number_L
	andi	r16, 0x0F
	add	d0, r16
	
	ldi	r16, 0x9A	;q = (d0 * 0x19A) >> 12
	mul	d0, r16
	add	r1, d0
	swap	r1
	mov	q, r1
	andi	q, 0x0F
	
	ldi	r16, 10		;d0 = d0 - 10 * q
	mul	q, r16
	sub	d0, r0

ret

I tested it with number_H:L = 0xFFFF and d0 is now 0x09. But as 0xFFFF = 65535 I expected d0 to be 0x05 now :shock:
All registers are in the high block (d0..d4, q, number_L:H).
I've checked the multiply d0 * 0x19A, but it gives the same results in the sim as in the calculator.

Christoph

Hi,

ldi   r16, 0x9A   ;q = (d0 * 0x19A) >> 12 
   mul   d0, r16 
   add   r1, d0 
   swap   r1 
   mov   q, r1 
   andi   q, 0x0F 

I don't have the intelligence or experience to figure out how this code works, but I notice your comment mentions 0x19A while you are loading r16 with 0x9A. Sorry if this is taken care of in some way that I don't see. (Please note that I am not knocking your code, I just don't have the knowledge and experience yet to figure out what you are doing.)

[on soapbox]
Man, I REALLY regret not going on to a 4 year college. (You young people out there hear this, I hope.)
[off soapbox]

Regards,
Steve

I admit this one is a little tricky.

I don't have the intelligence or experience to figure out how this code works 

yes of course you do :D
Assuming A is a 16 bit number then A = AL + 256 * AH.
A * B = (AH*256 + AL)*(BH*256 + BL) = AH*BH*256*256 + AH*256*BL + AL*BH*256 + AL*BL.
In this case: A = d0 and B = 0x19A, so AH = 0 and BH = 1 which will make our above term look like this:
A*B = AL*0x01*256 + AL * 0x9A
The second part of the sum is done first: d0 * 0x9A.
Then d0 is added to the high register of the result. That's the first part of the sum and should be the same as the original code. The higher order result regs are not taken care of as

(d0 * 0x19A) >> 12 

only uses bits 15...12.
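
In C terms, the split described above looks something like this. The function names are invented for the illustration, and the 16-bit result only holds the full product while d0 stays at or below 159 (159 * 0x19A = 65190).

```c
#include <stdint.h>

/* d0 * 0x19A computed as d0 * 0x9A plus d0 * 0x100, the way the assembly
   does it: one 8x8 multiply, then d0 added to the high byte of the result. */
uint16_t mul_0x19A_split(uint8_t d0)
{
    uint16_t low_part  = (uint16_t)((uint16_t)d0 * 0x9A);  /* mul d0, r16 */
    uint16_t high_part = (uint16_t)((uint16_t)d0 << 8);    /* add r1, d0  */
    return (uint16_t)(low_part + high_part);
}

/* q = (d0 * 0x19A) >> 12, i.e. the top four bits of the 16-bit product. */
uint8_t div10_approx(uint8_t d0)
{
    return (uint8_t)(mul_0x19A_split(d0) >> 12);
}
```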

Christoph

I think I found it. With n = 0xFFFF (and similar high numbers) q is greater than 0xF, and my code doesn't take care of the carry, which seems to be important in this case.
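
One way to see the problem is to redo the last-digit calculation in C with an 8-bit and a 16-bit d0. This is just a model of the arithmetic, not the posted assembly, and the helper names are hypothetical. For n = 0xFFFF the sum 6*(d3 + d2 + d1) + (n & 0xF) comes to 285, which no longer fits in one byte.

```c
#include <stdint.h>

/* Last decimal digit with d0 and q squeezed into 8 and 4 bits,
   mirroring what happens when d0 lives in a single register. */
uint8_t last_digit_8bit(uint16_t n)
{
    uint8_t d1 = (n >> 4) & 0xF, d2 = (n >> 8) & 0xF, d3 = (n >> 12) & 0xF;
    uint8_t d0 = (uint8_t)(6 * (d3 + d2 + d1) + (n & 0xF)); /* truncated */
    uint8_t q  = (uint8_t)((((uint32_t)d0 * 0x19A) >> 12) & 0x0F);
    return (uint8_t)(d0 - 10 * q);
}

/* Same calculation with 16-bit d0 and q. */
uint8_t last_digit_16bit(uint16_t n)
{
    uint16_t d1 = (n >> 4) & 0xF, d2 = (n >> 8) & 0xF, d3 = (n >> 12) & 0xF;
    uint16_t d0 = (uint16_t)(6 * (d3 + d2 + d1) + (n & 0xF));
    uint16_t q  = (uint16_t)(((uint32_t)d0 * 0x19A) >> 12);
    return (uint8_t)(d0 - 10 * q);
}
```

With n = 0xFFFF the 8-bit model returns 9, the value seen in the simulator, while the 16-bit model returns the expected 5.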

SteveN wrote:
Of course the C code presented is pure ______ (any foreign language to an English speaking American inserted here...don't want to offend anyone 8) ) to me.

I think the word "gibberish" is sufficiently non-locale-specific to be inoffensive and can be used in this context. Unless of course you live in Gibber. :)

Interesting approach, but I fear the divisions and multiplications it needs will slow it down dramatically.

In my view the subtraction method should be the fastest way on the AVRs.
Here is my optimized version. It swings around zero, so subtraction and comparison are done simultaneously:

;*************************************************************************
;*                                                                       *
;*                      Convert unsigned 32bit to ASCII                  *
;*                                                                       *
;*              Author: Peter Dannegger                                  *
;*                      danni@specs.de                                   *
;*                                                                       *
;*************************************************************************
;
;input: R31, R30, R29, R28 = 32 bit value 0 ... 4294967295
;output: R25, R24, R23, R22, R21, R20, R19, R18, R17, R16 = 10 digits (ASCII)
;
bin32_ascii:
        ldi     r25, -1 + '0'
_bcd1:  inc     r25
        subi    r29, byte2(1000000000)  ;-1000,000,000 until overflow
        sbci    r30, byte3(1000000000)
        sbci    r31, byte4(1000000000)
        brcc    _bcd1

        ldi     r24, 10 + '0'
_bcd2:  dec     r24
        subi    r29, byte2(-100000000)  ;+100,000,000 until no overflow
        sbci    r30, byte3(-100000000)
        sbci    r31, byte4(-100000000)
        brcs    _bcd2

        ldi     r23, -1 + '0'
_bcd3:  inc     r23
        subi    r28, byte1(10000000)    ;-10,000,000
        sbci    r29, byte2(10000000)
        sbci    r30, byte3(10000000)
        sbci    r31, 0
        brcc    _bcd3

        ldi     r22, 10 + '0'
_bcd4:  dec     r22
        subi    r28, byte1(-1000000)    ;+1,000,000
        sbci    r29, byte2(-1000000)
        sbci    r30, byte3(-1000000)
        brcs    _bcd4

        ldi     r21, -1 + '0'
_bcd5:  inc     r21
        subi    r28, byte1(100000)      ;-100,000
        sbci    r29, byte2(100000)
        sbci    r30, byte3(100000)
        brcc    _bcd5

        ldi     r20, 10 + '0'
_bcd6:  dec     r20
        subi    r28, byte1(-10000)        ;+10,000
        sbci    r29, byte2(-10000)
        sbci    r30, byte3(-10000)
        brcs    _bcd6

        ldi     r19, -1 + '0'
_bcd7:  inc     r19
        subi    r28, byte1(1000)          ;-1000 (remainder is in r29:r28)
        sbci    r29, byte2(1000)
        brcc    _bcd7

        ldi     r18, 10 + '0'
_bcd8:  dec     r18
        subi    r28, byte1(-100)          ;+100
        sbci    r29, byte2(-100)
        brcs    _bcd8

        ldi     r17, -1 + '0'
_bcd9:  inc     r17
        subi    r28, 10                 ;-10
        brcc    _bcd9

        subi    r28, -10 - '0'          ;+10, then convert to ASCII
        mov     r16, r28
        ret
;-------------------------------------------------------------------------

And now the 16 bit version and in C:

char digit[5];


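/* Note: the loops detect wrap-around through bit 15, so this version is
   only reliable for val below 32768 (with a 16-bit unsigned int). */
void bin2bcd( unsigned int val )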
{
  char i;

  i = '0' - 1;
  do
    i++;
  while( !((val -= 10000) & 0x8000) );
  digit[4] = i;

  i = '0' + 10;
  do
    i--;
  while( (val += 1000) & 0x8000 );
  digit[3] = i;

  i = '0' - 1;
  do
    i++;
  while( !((val -= 100) & 0x8000) );
  digit[2] = i;


  i = '0' + 10;
  do
    i--;
  while( (val += 10) & 0x8000 );
  digit[1] = i;

  digit[0] = val | '0';
}
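
As posted, the loops above test bit 15 to notice when the value has gone "negative", which appears to limit the input to values below 32768 (65535 - 10000 already has bit 15 set). A possible full-range variant of the same swing-around-zero idea is sketched below. It is not Peter's code: the function name is invented, and it uses a signed 32-bit working value, which costs more on an AVR than the 16-bit original.

```c
#include <stdint.h>

/* Full-range 16-bit to ASCII conversion, alternately overshooting below
   zero and coming back up. digit[4] is the most significant digit, as in
   the routine above. */
void bin2ascii_swing(uint16_t value, char digit[5])
{
    int32_t v = value;
    char c;

    c = '0' - 1;
    do { c++; v -= 10000; } while (v >= 0);   /* overshoot below zero */
    digit[4] = c;

    c = '0' + 10;
    do { c--; v += 1000; } while (v < 0);     /* come back above zero */
    digit[3] = c;

    c = '0' - 1;
    do { c++; v -= 100; } while (v >= 0);
    digit[2] = c;

    c = '0' + 10;
    do { c--; v += 10; } while (v < 0);
    digit[1] = c;

    digit[0] = (char)('0' + v);               /* v is now 0..9 */
}
```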

Peter

Another 16-bit version in asm:

;**************************************************************************
;*
;* "Convert16" - 16-bit unsigned Binary to ASCII conversion
;*
;* This subroutine converts an unsigned 16-bit number (XH:XL)
;* to a 5-digit BCD number represented by 5 bytes (r7:r6:r5:r4:r3).
;*
;* MSD of the 5-digit number is placed in r7,
;*
;* The ASCII-coded digits are stored in "ASCII_digs" array
;*
;* Note: array structure (offsets from start address):
;* .DSEG
;* .org [last_equ]+1
;* ASCII_digs:   .BYTE 11     ; reserve 11d bytes for ASCII digits:
;*				; offset 0 for sign character ("+" or "-")
;*				; offset 1 for msc (ASCII code, digit 9)
;*				; offset 2 for character (ASCII code, digit 8)
;*				; 	"
;*				; 	"
;*				; 	"
;*				; 	"
;*				; 	"
;*				; 	"
;*				; 	"
;*				; offset 10 for lsc (ASCII code, digit 0)
;*
;* Register usage:
;*	r3             BCD value digit 0 (ones)
;*	r4             BCD value digit 1 (tens)
;*	r5             BCD value digit 2 (hundreds)
;*	r6             BCD value digit 3 (thousands)
;*	r7             BCD value digit 4 (tenthousands)
;*	XL             binary value LSB (16bit: Low byte)
;*	XH             binary value (16bit: High byte)
;*	r24            temporary value & output char.
;*
;* Number of words      :83
;* Number of cycles     :242-272 (incl. push/pop etc.)
;* Low registers used   :5 (r3,r4,r5,r6,r7)
;* High registers used  :5 (X,Z,r24)
;* Pointers used        :Z,X
;*
;* All registers saved
;*
;* Optimized conversion code by John Payson, port to AVR by A.L
;*
;* Note: the basic algorithm computes the BCD digits from the "binary digits"
;* (input) and represents them as negative numbers to allow a very
;* efficient "conversion by subtraction" method (~180 cycles total for
;* the 16-bit-binary to 5-digit-bcd conversion).
;*
;**************************************************************************

Convert16:
	push	r24
	push	ZH
	push	ZL
	push	XH
	push	XL
	push	r3
	push	r4
	push	r5
	push	r6
	push	r7

	; implement equations, make BCD values negative
	mov	r24,XH
	swap	r24
	andi	r24,$0F
	subi	r24,-$F0
	mov	r6,r24
	add	r6,r24
	subi	r24,-$E2
	mov	r5,r24
	subi	r24,-$32
	mov	r3,r24

	mov	r24,XH
	andi	r24,$0F
	add	r5,r24
	add	r5,r24
	add	r3,r24
	subi	r24,-$E9
	mov	r4,r24
	add	r4,r24
	add	r4,r24

	mov	r24,XL
	swap	r24
	andi	r24,$0F
	add	r4,r24

	add	r3,r24
	rol	r4
	rol	r3
	com	r3
	clc			; compensate unwanted "+1" (A.L.)
	rol	r3

	mov	r24,XL
	andi	r24,$0F
	add	r3,r24
	rol	r6

	ldi	r24,$07
	mov	r7,r24

	; BCD digits are in 2's complement form now and made
	; negative numbers (except for the "10K" digit in
	; r7, which is regarded as a positive number)

	ldi	r24,$0A		; load "10" for "normalizing"

Lb1:	; "normalize" BCD digits - "/10" & "mod 10" simultaneously
	add	r3,r24
	dec	r4
	brcs	Lb2
	rjmp	Lb1

Lb2:
	add	r4,r24
	dec	r5
	brcs	Lb3
	rjmp	Lb2

Lb3:
	add	r5,r24
	dec	r6
	brcs	Lb4
	rjmp	Lb3

Lb4:
	add	r6,r24
	dec	r7
	brcs	Lb5
	rjmp	Lb4

Lb5:				; convert and store BCD digits to array
	ldi XL,low(ASCII_digs)  ; Load pointer
	ldi XH,high(ASCII_digs)
	adiw	XL,6		; set it to ASCII array offset 6 (digit 4)

        clr     ZH		; (5 unpacked BCD digits = 5 regs)
        ldi     ZL,8 		; 16 bit: address+1 of last BCD data register (r7)
Lb6:
        ld      r24,-Z		; pre-decrement
	subi    r24,-'0'        ; convert to ASCII
	st	X+,r24		; store ASCII digits to SRAM array
        cpi     ZL,4		; address +1 of first BCD data register (r3)
        brsh    Lb6		; loop until all 5 digits are stored

	pop	r7		; EXIT module
	pop	r6
	pop	r5
	pop	r4
	pop	r3
	pop	XL
	pop	XH
	pop	ZL
	pop	ZH
	pop	r24

	ret

;**** End of Convert16 Function ---------------------------------------****

Andreas

Hmmm...why didn't all these algorithms pop up in the uncounted discussions about itoa before? :o

Christoph

Hi.

If you are looking for speed and have enough program memory to spend, here's a routine from me.
Loops make your code slower and the cycles aren't always the same.
In this routine, there's not a single loop and the cycles are always the same.

If you are interested in other fast routines in assembly, feel free to contact me

dpagrafio@the.forthnet.gr

Attachment(s): 

oops! I just noticed that the cycles aren't exactly the same but anyway, it's still faster than using loops :D

bye

Hi,

Here is my lowly "subtraction method" of Binary to BCD (not ASCII) conversion. This is my first attempt at this type of code so don't laugh so hard that you hurt yourself :D .

I have not exhaustively tested it (as in every number between 0 and 65535... I did figure out a way to test it in assembly by starting at 0, doing a conversion, send the result via RS232, incrementing, doing a conversion, send result, blah, blah). That could be a cool project (for a newcomer like me). It seemed to work for the 10 or so numbers I put in.

As the opening comments say, this is NOT a callable routine right now. Just coded it up to work in the simulator.

; The purpose of this code is to convert a 16 bit binary value into
; its equivalent 5 digit BCD code (0xFFFF = 6 5 5 3 5 for example).
; I will use the subtraction method to perform this conversion. This
; involves subtracting 10,000; 1,000; 100; 10 and 1 from the binary
; number to be converted until the result is < zero. The subtractions
; are counted separately for each decade category. The resulting 
; counts are the BCD codes for that binary number.

; The BCD results will be available in registers r0, r1 and r2. 

; r0 will contain the 5th (10,000's) digit (low nibble, bits 0-3)

; r1 will contain the 4th (1,000's) and 3rd (100's) BCD digits
; (high nibble = 4th digit, low nibble = 3rd digit). 

; r2 will contain the 2nd (10's) and 1st (1's) BCD digits (high 
; nibble = 2nd digit, low nibble = 1st digit)

; subt = name of register containing amount to be subtracted
;
; This code includes the binary number to be converted. It would
; certainly be possible to have a routine that calls this routine.
; The calling routine would be responsible for providing the binary
; number to be calculated...obviously.
;
; Registers used = 10 (r0, r1, r2, r16, r20, r21, r22, r23, r26, r27)
; If this routine is made into a callable routine then these
; registers will need to be "pushed" and "popped" to save and 
; restore them. But, remember that r0 - r2 contain the resulting
; BCD results so "popping" them right before exiting the routine
; would destroy these BCD values. I will need to either assign
; permanent registers or assign SRAM addresses and perform the
; necessary movement to/from SRAM and registers.
;
; According to the AVR Studio v4.09 simulator the below code took
; 65.13uS to execute (from the clr r0 instruction, based on a 
; 8.00MHz clock and using a binary number = 64,999...based on a 
; very quick assumption that having as many 9's as possible would
; result in the longest elapsed time - I could be very wrong about 
; that assumption though)
;
.nolist
.include "m169def.inc"
.list


.def	subtlo	=r20			; subtlo and subthi will be used to 
.def	subthi	=r21			; store one of 5 values:
								; 10,000 (0x2710); 1,000 (0x03E8);
								; 100 (0x0064); 10 (0x000A) and 
								; 1 (0x0001)
;
.def	binlo	=r22			; binlo and binhi contain the 16 bit
.def	binhi	=r23			; number to be converted to BCD
;
.def	temp	=r16

;***** Code
;
.cseg

.org	0x0000
	jmp 	RESET 				; Reset Handler

.org	0x002E					; m169 code space starts here

RESET:							; Main program start
	ldi		temp, high(RAMEND)	; Stack pointer = top of internal SRAM
	out 	SPH,temp  
	ldi 	temp, low(RAMEND)  
	out 	SPL,temp  
;
; Setup binary number to be converted
	ldi		binlo,low(0xfde7)	; decimal 64999
	ldi		binhi,high(0xfde7)
;
	clr		r0					; clear BCD result registers
	clr		r1
	clr		r2
	clr		XH					; high byte of X register=0
;
	ldi		subtlo,low(0x2710)	; set subt regs = 10,000
	ldi		subthi,high(0x2710)
	ldi		XL,0x00				; low byte of X register=0 (points
								; (to r0, the 10,000's digit)
	call	Compare				; see how many 10,000's digits
;
	ldi		subtlo,low(0x03E8)	; set subt regs = 1,000
	ldi		subthi,high(0x03E8)
	inc		XL					; set X register to point to r1
;
	call	Compare				; see how many 1,000's digits
;
	ldi		subtlo,low(0x0064)	; set subt regs = 100
	clr		subthi				; (remains cleared for remainder of
								;  this routine)
	swap	r1					; put 1,000's digit in high nibble
								; of r1
;
	call	Compare				; see how many 100's digits (100's
								; digit is counted in lower nibble
								; of r1
	ldi		subtlo,0x0A			; set subt regs = 10
								; subthi already = 0
	inc		XL					; set X register to point to r2
;
	call	Compare				; see how many 10's digits
;
	ldi		subtlo,0x01			; set subt regs = 1
	swap	r2					; put tens digit in upper nibble of
								; r2
	call	Compare				; see how many 1's digits (ones
								; digit is counted in lower nibble
								; of r2)
Loop:
	rjmp	Loop
;
;
Compare:
	cp		binlo,subtlo			; 16 bit compare code
	cpc		binhi,subthi			;  (courtesy Atmel)
	brlo	Return
	sub		binlo,subtlo			; 16 bit subtraction code
	sbc		binhi,subthi			;  (courtesy Atmel)
;
	ld		temp,X				; temp = contents of location
								; pointed to by X register (r0-r2)
	inc		temp
	st		X,temp				; update location pointed to by
								; by X register w/ contents of temp
	rjmp	Compare
Return:
	ret
.exit

[edit] Sorry, the tabs don't line up like I would like...it was much "prettier" in Studio.

Regards,
Steve

Conversion of binary numbers of any size. Uses shifts and bcd correction. Numbers stored in memory from lsb to msb.

;Convert binary number in memory  to packed BCD
;input : XH:XL=address of lowest byte of input number;
;           YH:YL=address of lowest byte of output number;
;           R24-input number size ,bytes;R25-output number size,bytes.
;uses  : R16,R17,R18,R19,R30,R31

conv_bin2bcd:
 movw r30,r28      ;Z->output buffer
 mov r16,r25         ;r16=output buffer size
 eor r17,r17
conv_bin2bcd05:
 st Z+,r17               ;Clear output buffer
 dec r16
 brne conv_bin2bcd05

 mov r16,r24
 lsl r16
 lsl r16
 lsl r16                     ;R16=input bits counter

conv_bin2bcd10: ;input bits shift loop
  mov r17,r24          ;R17=input buffer size
  movw r30,r26      ;Z->input buffer
 
 conv_bin2bcd20:  ;Shift input buffer loop
  ld r18,Z
  rol r18
  st Z+,r18
  dec r17
  brne conv_bin2bcd20

 mov r17,r25           ;r17=output buffer size
 movw r30,r28        ;Z->output buffer

 conv_bin2bcd30:  ;Shift output buffer loop
  in r19,SREG         ;remember carry flag
  ld r18,Z                ;read byte
  subi r18,-3           ;BCD correction
  sbrs r18,3
  subi r18,3
  subi r18,-0x30
  sbrs r18,7
  subi r18,0x30
  out SREG,r19      ;restore carry flag
  rol r18                 ;Shift bit from input buffer to output buffer
  st Z+,r18             ;store byte back
  dec r17
  brne conv_bin2bcd30;Repeat for all output buffer

 dec r16
 brne conv_bin2bcd10  ;Repeat for every input bit

ret

Last Edited: Wed. Jul 21, 2004 - 06:17 AM

Wow, my month-old post has gotten a lot of play in the past few days.

My powers-of-10-with-subtract routine has served me fairly well, and it is small enough and fast enough for 16-bit work. I've modified it from a "standard" itoa()-type routine for display purposes: typically right-justify; leading-0 suppression or not; handle full 16-bit unsigned; insert an implied decimal point where needed; etc.
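
To make those display features concrete, here is a rough C sketch of a right-justified, zero-suppressed conversion with an implied decimal point. It is not the routine described above: the name `format_u16` and the fixed 6-character field are assumptions for the example, and it uses `/` and `%` for brevity where a subtraction loop would be the cheaper choice on an AVR.

```c
#include <stdint.h>

/* Hypothetical helper: right-justifies value in a 6-character field with
   leading blanks and a decimal point `decimals` digits from the right
   (0 = no point; values above 4 are clamped so the field cannot overflow). */
void format_u16(uint16_t value, uint8_t decimals, char out[7])
{
    int8_t pos = 5;
    uint8_t emitted = 0;

    if (decimals > 4)
        decimals = 4;
    out[6] = '\0';
    do {
        if (decimals != 0 && emitted == decimals)
            out[pos--] = '.';                   /* insert the implied point */
        out[pos--] = (char)('0' + value % 10);  /* digits, right to left */
        value /= 10;
        emitted++;
    } while (value != 0 || emitted <= decimals); /* keep one digit before the point */
    while (pos >= 0)
        out[pos--] = ' ';                       /* leading-zero suppression */
}
```

For example, 1234 with two decimals is rendered as " 12.34", and 5 with two decimals as "  0.05".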

As luck would have it, a new app needs similar features for 32-bit--kind of an ltoa() mod. I hope to be able to take many of these postings & do some testing on size & speed.

Lee

Hi all,

here's one for the real freaks to chew on :twisted:
I've tried to implement that algorithm and tried to write very readable code, so you should be able to understand it. I've tried it with 0xFFFF as the number to be displayed. The display routine just writes the resulting digits to sram.

First problem: I used Studio 4, as I currently don't have version 3.5 installed. This might be a source for errors - maybe my code works, but the sim doesn't know that :?

The code for the very last digit seems to work, but the rest gives rubbish. Maybe some fresh eyes can see some error; I didn't. All C lines are in there as comments, so what the code *should* do can be seen in the first line of every code block. At the side I've added comments as well, mainly for the multiply/add operations.

It seems that the algorithm as it is written down in the final version (in the document by Mr. Jones) doesn't only need 8-bit variables, but 16 bits from time to time. d0 and q for example need 16 bits each to give correct results.

The code is VERY register hungry, but I didn't optimize it at all (a working version could be optimized, but as it doesn't work, well there's no need for optimization).

Christoph

Attachment(s): 

Hello buffi,

I just tried your code and noticed that some of the C code comments do not match your assembly code. Even with my corrections, the answer comes out to be 63125. This is my first attempt at trying to implement this neat algorithm so I do not know where the other problems are.

The first one is at ;d1 = q + 9*d3 + 5*d2 + d1. Your first line of assembly has add d1,ql instead of add d1,d1

The next one has to do with the ; d2 = q + 2 * d2. I believe the add d2,r0 should be mov d2,r0. This also happens in the section for ; d3 = q + 4 * d3.

Ok, I am an idiot. The line add d1,ql that I thought :oops: was a mistake and should have been replaced by add d1,d1 needs to be removed altogether. The register d1 already contains the value, so adding to it is a mistake. I just made the corrections (this one and the two others stated earlier) and the answer is 65535.

OK, my code now works as well. The errors I made were a bit dumb, I must admit that :shock:
I am though quite happy about the multiply operations, they caused the greatest headache, but thinking a bit more about them was worth it as it seems.

I've attached the new (now working version of the code) for others to download. It's 124 words including init code.
Now we can start optimizing it :D

Christoph

EDIT:
Conversion timing: 0xABCD needs 187 cycles, 0xFFFF needs 197 cycles. I don't know which kind of values needs the longest time, but the variations are due to the d2 = d2 % 10 operation which is done in a loop.

EDITEDIT:
Replaced the file by new one with prettier formatting! NO code changes.

Attachment(s): 

buffi wrote:

It's 124 words including init code.
Conversion timing: 0xABCD needs 187 cycles, 0xFFFF needs 197 cycles.

Only for comparison:

subtraction method: 20 words, 20...170 cycles (without call, return)

Peter

That's about 8 cycles per digit increment, which is OK for a subtraction algorithm. My question is: Why does this "new" (I don't know how old it is) algorithm perform so badly on AVRs with my version of the code? Has anyone seen big performance brakes in there?

Christoph

I tried to optimize the algorithm and got down to 52 bytes of code, 8 registers and 70 clocks (including call and return) but there is a problem with the value of 159. What I did was to simplify all five of the "divide by 10" with a multiply by a factor of 26 / 256. I guess there is a rounding error so the result I am getting is 1, 5, 255. I do not know if I can simply say if the remainder is 255 then use 9 instead since I have only checked 0 - 160, 12345, 32768 and 65535.
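
The optimized code itself was not posted, so the following is only a guess at the kind of fix-up that might apply. For 8-bit operands, (x * 26) >> 8 is never too low and at most one too high: for 159 it gives 4134 >> 8 = 16, so the remainder 159 - 160 wraps to 255, and the same overshoot shows up at other values such as 69. Simply replacing a remainder of 255 with 9 would leave the quotient one too high, so the sketch corrects the quotient instead. The function names are hypothetical.

```c
#include <stdint.h>

/* x / 10 for 8-bit x, using a multiply by 26/256 plus a one-step fix-up. */
uint8_t div10_u8(uint8_t x)
{
    uint8_t q = (uint8_t)(((uint16_t)x * 26u) >> 8);  /* may be 1 too high */
    if ((uint8_t)(q * 10u) > x)   /* remainder would wrap to 246..255 */
        q--;
    return q;
}

/* x % 10 derived from the corrected quotient. */
uint8_t mod10_u8(uint8_t x)
{
    return (uint8_t)(x - 10u * div10_u8(x));
}
```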

:shock: WOW!
Was that optimization done on my code or did you optimize the original algorithm? Can I have a look at it? I'd like to see if we can find the error together as the usage statistics you gave are quite promising.

Christoph

This method for Bin2Bcd is not new, I saw it in an electronics mag some time ago written for the PIC micro.

This multiply and shift method of approximating a divide by 10 is really only much use on those advanced processors which have a barrel shifter (I think it's called that). They can apply a shift of a specified number of bits to a source register in one instruction. Therefore your "q = (d0 * 0x19A) >> 12" can be done in one instruction.

Here is my GCC version of this method:
It is not scalable to 32-bits however.

//Bin2Bcd2 16-Bit binary to unpacked BCD output
//This uses a method I saw in an electronics mag.
//It is highly sneaky; converting hex digits to decimals by assuming that 4096 is really (4100-4)
//and 256 is (260-4) and 16 is (20-4). This common -4 appears in the maths below.
//Despite the weirdness it is the fastest method by a huge margin. (Ave=167cy) (96 Bytes)
//---------------------------------------------------------------
void Bin2Bcd2 (unsigned int c, char *dest)
{
char tmp, tenThou, thou, hund, tens, units;

//Using hex digits: c=0xABCD
//
//      tenThou thou hund tens units
//Initial:      4096  256  16    1
//               A     B   C     D

units=(char)c; hund=(char)(c/256);
tens=units; thou=hund;
units&=0x0F;  hund&=0x0F;
__asm__ ( " swap %0 \n" : "=r" (tens) : "0" (tens) ); tens&=0x0F;
__asm__ ( " swap %0 \n" : "=r" (thou) : "0" (thou) ); thou&=0x0F;

//Init done. Now perform the maths.
tmp=(thou+hund+tens)<<2; tmp+=20;
units-=tmp;							//units= D-4(A+B+C)-20
hund*=2; tens*=2;
tens+=hund; tens+=hund; tens+=hund;
tens-=138;							//tens = 6B+2C-138
hund+=thou-46;						//hund = A+2B-46
thou = 4*thou-64;					//thou = 4A-64
tenThou=7;							//tenThou = 7 (constant init)

//Maths done, all -VE except tenThou. Now Normalise each digit by adding 10 until +ve

__asm__ (
	"1:	dec %1		\n"
	"	subi %0,-10	\n"
	"	brcs 1b		\n"
	: "=d" (units),  "=r" (tens) : "0"  (units),  "1"  (tens)	//"d": subi needs r16..r31
	);

__asm__ (
	"1:	dec %1		\n"
	"	subi %0,-10	\n"
	"	brcs 1b		\n"
	: "=d" (tens),  "=r" (hund) : "0"  (tens),  "1"  (hund)	//"d": subi needs r16..r31
	);

__asm__ (
	"1:	dec %1		\n"
	"	subi %0,-10	\n"
	"	brcs 1b		\n"
	: "=d" (hund),  "=r" (thou) : "0"  (hund),  "1"  (thou)	//"d": subi needs r16..r31
	);

__asm__ (
	"1:	dec %1		\n"
	"	subi %0,-10	\n"
	"	brcs 1b		\n"
	: "=d" (thou),  "=r" (tenThou) : "0"  (thou),  "1"  (tenThou)	//"d": subi needs r16..r31
	);

*dest++=tenThou;
*dest++=thou;
*dest++=hund;
*dest++=tens;
*dest=units;
}

I would like to see danni's 20 word, 20 cycle subtraction code

Nigel


I worked on it last night and I think it is working. I had to replace the first two divide by 10s with the 410/4096 factor due to horrible rounding errors. I wrote some Visual C++ code to test the algorithm and there are around 20,000 errors when using 26/256 for both divisions and 10,000 when using 410/4096 for d0 and 26/256 for d1.

I think it added approximately 15 words of code, another register and the cycle count was somewhere around 90. After going to bed, I think I can remove another 10 cycles from it.

This code uses the MUL instruction and will only work for 16 bit unsigned numbers. It is the caller's responsibility to do the two's complement for negative numbers.

I started with Buffi's file and optimized the math. I removed the loop for the mod 10 and replaced it with the d1 (or d2, can't remember) = d1 - q*10 code. For the multiplication, I multiplied the LOW(d0) with LOW(410) and kept the total in R1:R0. Then instead of multiplying HIGH(410) * LOW(d0), I added the LOW(d0) to R1 since the HIGH(410) is always 1. I set the T flag if a carry happened. The maximum value for d0 = 6*(d1 + d2 + d3) + d0 is 285 so the HIGH(d0) will either be 0 or 1. So if the HIGH(d0) is non-zero, I added the LOW(410) to R1. Since the HIGH(410) is always 1, 256 also needs to be added to R1 if the HIGH(d0) is non-zero. That means always setting the T flag if the HIGH(d0) is non-zero. The multiplication result is T:R1:R0 which needs to be divided by 4096 or right shifted by 12. That means keeping only the upper nibble of R1, swapping it to the lower nibble and loading bit 4 with T.
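
For readers who want to follow the T:R1:R0 bookkeeping off-target, the steps described above can be sketched in plain C. This is only an illustration under the stated assumption that d0 never exceeds 285; it is not the posted assembler, and the function and variable names are hypothetical.

```c
#include <stdint.h>

/* Sketch: floor(d0 / 10) for d0 <= 285, computed as (d0 * 410) >> 12
   using one 8x8 multiply and 8-bit adds. */
static uint8_t div10_mul8(uint16_t d0)
{
    uint8_t lo = (uint8_t)d0;             /* LOW(d0) */
    uint8_t hi = (uint8_t)(d0 >> 8);      /* HIGH(d0), 0 or 1 */
    uint16_t prod = (uint16_t)lo * 154u;  /* MUL: R1:R0 = LOW(d0) * LOW(410) */
    uint8_t r1 = (uint8_t)(prod >> 8);
    uint8_t t = 0;                        /* stands in for the T flag (bit 16) */
    uint16_t sum;

    sum = (uint16_t)r1 + lo;              /* HIGH(410) == 1: add LOW(d0) to R1 */
    if (sum > 0xFF)
        t = 1;                            /* carry out of R1 */
    r1 = (uint8_t)sum;

    if (hi) {                             /* HIGH(d0) == 1: add 256 * 410 */
        r1 = (uint8_t)(r1 + 154u);        /* LOW(410) into R1; no carry for d0 <= 285 */
        t = 1;                            /* HIGH(410) lands in bit 16 */
    }
    return (uint8_t)((r1 >> 4) | (t << 4)); /* T:R1:R0 shifted right by 12 */
}
```

R0 never needs to be touched after the multiply because the final shift by 12 discards it entirely, which is presumably why the scheme fits in so few cycles.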

I did the same thing for d1 but I believe the maximum value for d1 = q + 9*d3 +5*d2 +d1 is 253 so the HIGH(d1) will always be 0. This is where the ten cycles can be removed.

I will post the code later today. I am also going to try to skip the d1, d2, d3, d4 code if their digits are zero. This will make the cycle count vary. I believe it will take more cycles for testing than actually doing it but that remains to be determined.


This version of the binary to BCD takes 88 cycles, 71 words of code, and 10 registers. I think some more cycles and maybe a register could be removed but I am tired of looking at it. This passes my POGE test. There is test code included that calls the routine with every possible value (0 .. 65535) and outputs the result in ASCII out UART0 at 38400 bps assuming a 12.288 MHz crystal is being used.

This routine uses the MUL instruction so it will not work on AVRs that do not have that instruction. I do not think I have any other ATmega specific code though. It only works for a 16 bit unsigned value. If you want a signed value then do a two's complement before calling it.

I am planning on doing a 32 bit version but I do not know when I will be able to work on it.


1/6 incorrect values sounds like a lot, but in many apps the low digit could be dropped anyway, like calculating percentages to the hundredth but only displaying to the tenth. Are all the errors only +/-1? Or even better, all +1 or all -1?

Lee


No, the errors were not all by -1, the one's digit would be off by -10. Those could be corrected by adding ten to the one's digit if bit 7 was set. But the error would cascade into the ten's digit and hundred's digit so that is why I had to use the 410/4096 for the one's digit divide by ten operation.

The code I posted does not suffer from these problems.

For example when using 26 / 256 for all divide by ten operations, the number to convert is in the left column followed by each of the BCD digits.
7587, 0, 7, 5, 8, 7
7588, 0, 7, 6, 255, 254
7589, 0, 7, 6, 255, 255
7590, 0, 7, 5, 9, 0

10588, 1, 0, 5, 8, 8
10589, 1, 0, 6, 255, 255
10590, 1, 0, 6, 255, 0
10591, 1, 0, 5, 9, 1

Using 410/4096 for the one's digit divide by ten and using 26/256 for the ten's digit resulted in the ten's digit being 255 sometimes. It looked like when that happened, the hundred's digit was off by +1. That probably could have been corrected, but I decided to just go ahead and use 410/4096 there as well.

2289, 0, 2, 2, 8, 9
2290, 0, 2, 3, 255, 0
2291, 0, 2, 3, 255, 1
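
The rows above can be reproduced off-target. The sketch below is plain C, not the posted AVR code. It follows the digit recurrences quoted earlier in this thread (d0 = 6*(d1 + d2 + d3) + d0 and d1 = q + 9*d3 + 5*d2 + d1) and assumes the remaining steps are d2 = q + 2*d2 and d3 = q + 4*d3, which is what the place values of 256 and 4096 imply. The function and parameter names are hypothetical; `mul01` and `shift01` select the divide-by-ten factor used for the first two digits, while the last two steps always use 26/256.

```c
#include <stdint.h>

/* Approximate x / 10 as (x * mul) >> shift. */
static uint8_t div10_approx(uint16_t x, uint16_t mul, uint8_t shift)
{
    return (uint8_t)(((uint32_t)x * mul) >> shift);
}

/* out[0] = ten-thousands digit ... out[4] = units digit.
   Digits are stored as 8-bit values, so an overestimated quotient
   shows up as 254 or 255, as in the listings above. */
static void jones16(uint16_t n, uint16_t mul01, uint8_t shift01, uint8_t out[5])
{
    uint16_t d0 = n & 0xF, d1 = (n >> 4) & 0xF;
    uint16_t d2 = (n >> 8) & 0xF, d3 = n >> 12;
    uint8_t q;

    d0 = 6 * (d3 + d2 + d1) + d0;          /* at most 285 */
    q = div10_approx(d0, mul01, shift01);
    out[4] = (uint8_t)(d0 - 10 * q);

    d1 = q + 9 * d3 + 5 * d2 + d1;         /* at most 253 */
    q = div10_approx(d1, mul01, shift01);
    out[3] = (uint8_t)(d1 - 10 * q);

    d2 = q + 2 * d2;                       /* small enough for 26/256 */
    q = div10_approx(d2, 26, 8);
    out[2] = (uint8_t)(d2 - 10 * q);

    d3 = q + 4 * d3;                       /* small enough for 26/256 */
    q = div10_approx(d3, 26, 8);
    out[1] = (uint8_t)(d3 - 10 * q);
    out[0] = q;
}
```

Calling `jones16(7588, 26, 8, out)` yields 0, 7, 6, 255, 254, matching the listing, while `jones16(7588, 410, 12, out)` yields 0, 7, 5, 8, 8. The last two steps stay within the range where 26/256 is exact, which would explain why only the first two divisions needed the larger factor.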


N.Winterbottom wrote:
I would like to see danni's 20 word, 20 cycle subtraction code

It's not 20 cycles, it's 20...170 cycles.

It's my C example above, written in assembler.

Peter


I like the C version (easy for me to incorporate), but for me it had problems with unsigned values larger than 32768 (which look negative). I modified the routine to check the Carry flag in SREG, and this seems to work.

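#include <avr/io.h>	/* SREG, _BV */

char digit[5];

/* NOTE: reading the carry flag through SREG from C relies on the compiler
   emitting the add/subtract directly before the test. That is not
   guaranteed, so check the generated code for your compiler settings. */
void bin2bcd( unsigned int val )
{
  char i;

  i = '0' - 1;
  do
  {
    i++;
    val -= 10000;
  }
  while(!(SREG & (_BV(0))));
  digit[4] = i;

  i = '0' + 10;
  do
  {
    i--;
    val += 1000;
  }
  while((SREG & (_BV(0))));
  digit[3] = i;

  i = '0' - 1;
  do
  {
    i++;
    val -= 100;
  }
  while(!(SREG & (_BV(0))));
  digit[2] = i;

  i = '0' + 10;
  do
  {
    i--;
    val += 10;
  }
  while(!(SREG & (_BV(0))));
  digit[1] = i;

  digit[0] = val | '0';
}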
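
If depending on the flag register from C feels too fragile, one portable alternative is to compare before subtracting. The following is a minimal sketch of that idea rather than a drop-in replacement: the function name is hypothetical, it writes a NUL-terminated string with the most significant digit first (the reverse of the `digit[]` order above), and it will likely cost a few more cycles than the carry-based loops.

```c
#include <stdint.h>

/* Portable sketch: subtract powers of ten, comparing first so the
   loop never depends on the carry flag. Works for 0..65535.
   out[0] is the most significant digit; out[5] is the terminator. */
static void u16_to_ascii(uint16_t val, char out[6])
{
    static const uint16_t powers[4] = { 10000, 1000, 100, 10 };

    for (uint8_t i = 0; i < 4; i++) {
        char d = '0';
        while (val >= powers[i]) {
            val -= powers[i];
            d++;
        }
        out[i] = d;
    }
    out[4] = (char)('0' + val);
    out[5] = '\0';
}
```

Because `val` stays unsigned and never wraps, values above 32768 need no special handling.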


My version of another algorithm

//HEXTOBCD-HEXTOBCD-HEXTOBCD-HEXTOBCD-HEXTOBCD-HEXTOBCD-HEX

void hex_to_bcd(unsigned long information)
{
 res0=0,res1=0,res2=0,res3=0;
 res4=0,res5=0;
//	test value of tbfr_h
//	tbfr_h = 0x2710;	//10000
//	tbfr_h = 0x03e8;	//1000
//	tbfr_h = 0xea60;	//60000

//information=tbfr_h;

unsigned int s=0;		
//	s - shift counter (divide by shifting)

	while(s != 32)   // 32 for long type 16 -int
	{
		//Cyclic shift of the data
		res5 = (res5<<1)+((res4>>3)&1);
		res4 = (res4<<1)+((res3>>3)&1);
		res3 = (res3<<1)+((res2>>3)&1);
		res2 = (res2<<1)+((res1>>3)&1);
		res1 = (res1<<1)+((res0>>3)&1);
		res0 = (res0<<1)+(information>>31);  //15 for int
		//// Multiply by 2
		information = (information<<1);
		//mask each digit to one nibble
		res0 = res0&0x0f;
		res1 = res1&0x0f;
		res2 = res2&0x0f;
		res3 = res3&0x0f;
		res4 = res4&0x0f;
		res5 = res5&0x0f;
		if(s != 31)
		{//Decimal correction
			if(res5 > 4)
				res5 = res5+3;
			if(res4 > 4)
				res4 = res4+3;
			if(res3 > 4)
				res3 = res3 + 3;
			if(res2 > 4)
				res2 = res2+3;
			if(res1 > 4)
				res1 = res1+3;
			if(res0 > 4)
				res0 = res0+3;
		}
		s++;
	}
	buf1 = digits[res0];//least significant
	buf2 = digits[res1];
	buf3 = digits[res2];
	buf4 = digits[res3];//most significant
}
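
The routine above relies on globals that are not shown (`res0`..`res5`, `buf1`..`buf4`, `digits`). For anyone who wants to try the same shift-and-add-3 scheme (often called "double dabble") in isolation, here is a self-contained 16-bit sketch. It is not the poster's code: the function name and the array layout are assumptions made for illustration, and the decimal correction is applied before each shift instead of after it, which is equivalent.

```c
#include <stdint.h>

/* Shift-and-add-3 sketch for a 16-bit input.
   d[0] is the least significant decimal digit, d[4] the most. */
static void double_dabble16(uint16_t v, uint8_t d[5])
{
    for (uint8_t i = 0; i < 5; i++)
        d[i] = 0;

    for (uint8_t bit = 0; bit < 16; bit++) {
        /* Decimal correction: a digit of 5..9 would exceed 9 once
           doubled, so add 3 to make it carry into the next digit. */
        for (uint8_t i = 0; i < 5; i++)
            if (d[i] > 4)
                d[i] += 3;

        /* Shift the whole digit chain left by one, MSB of v coming in. */
        for (uint8_t i = 4; i > 0; i--)
            d[i] = (uint8_t)(((d[i] << 1) | (d[i - 1] >> 3)) & 0x0F);
        d[0] = (uint8_t)(((d[0] << 1) | (v >> 15)) & 0x0F);
        v = (uint16_t)(v << 1);
    }
}
```

The loop count is fixed at 16 regardless of the value, so the run time is constant, at the price of touching every digit on every bit.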

I assume packed BCD is outdated today.

It was only useful in earlier days, when computers were extremely short of RAM.

Today a whole byte per digit (ASCII or 7-segment code) is much more convenient.

Especially since packed BCD needs more words and cycles on the AVR.

Peter


danni wrote:
I assume packed BCD is outdated today.

Maybe "today", but not necessarily "yesterday".

"Yesterday" we had several production designs based on AT90S4433. There weren't any Mega8's. When Mega8's did appear, they were US$1+ more costly than '4433.

With 128 bytes of SRAM yet a sizable amount of flash, large '4433 apps can get real tight on SRAM. Also, packed BCD allowed me to use registers for building a 6x 7-segment output; I wouldn't have had enough working registers for unpacked; the unpacking only was needed in the display routine itself.

"Today" a Mega8/88 with 8x SRAM of '4433 (actually since about 2 years ago when the Mega8 price dropped to ~US$2/100 qty) I might agree with you. :) But, as always, it all depends on the particular app.

Lee


Here's the most efficient binary to BCD routine I've seen, except that it's for the PIC.

;********************************************************************
;                Binary To BCD Conversion Routine (16 Bit)
;                       (LOOPED Version)
;
;      This routine converts a 16 Bit binary Number to a 5 Digit
; BCD Number.
;
;       The 16 bit binary number is input in locations Hbyte and
; Lbyte with the high byte in Hbyte.
;       The 5 digit BCD number is returned in R0, R1 and R2 with R0
; containing the MSD in its right most nibble.
;
;   Performance :
;               Program Memory  :  32
;               Clock Cycles    :  750
;
;*******************************************************************;
;
B2_BCD_Looped
	bsf      ALUSTA,FS0
	bsf      ALUSTA,FS1            ; set FSR0 for no auto increment
;
	bcf      ALUSTA,C
	clrf     count, F
	bsf      count,4         ; set count = 16
	clrf     R0, F
	clrf     R1, F
	clrf     R2, F
loop16a
	rlcf     Lbyte, F
	rlcf     Hbyte, F
	rlcf     R2, F
	rlcf     R1, F
	rlcf     R0, F
;
	dcfsnz   count, F
	return
adjDEC
	movlw    R2              ; load R2 as indirect address ptr
	movwf     FSR0
	call    adjBCD
;
	incf     FSR0, F
	call    adjBCD
;
	incf     FSR0, F
	call    adjBCD
;
	goto    loop16a
;
adjBCD
	movfp    INDF0,WREG
	addlw    0x03
	btfsc      WREG,3          ; test if result > 7
	movwf     INDF0
	movfp    INDF0,WREG
	addlw    0x30
	btfsc      WREG,7          ; test if result > 7
	movwf     INDF0           ; save as MSD
	return

Quote:

Heres the most efficient ...

Efficient for what? Processor cycles? Code space?

I saw the "efficient", and then I saw not only a loop but calls & returns!?!

Lee


It's efficient in that it uses algebra instead of division by powers, and also in that it converts directly to packed BCD. Wow, I forgot how annoying the FSR is.


outer_space wrote:
Its efficient in that it uses algebra instead of division by powers and also that it converts directly to packed bcd.

But packed BCD was mostly not the goal, so you need further conversion steps to get ASCII or 7-segment.

And thus methods with direct ASCII output are much more efficient, e.g. look at my example code above (32 bit to ASCII). It uses the optimized method of subtracting powers of 10.

Peter


Packed BCD is great if you're working with a segment display and want to store all the digits in low registers; also, if you're using a PIC you only get 64? bytes of data memory.


Hi Lee,
I'll let you and the 'gang' decide how efficient this code is, but it is still found in the tools section. It performs 8, 16, 24 and 32 bit, signed and unsigned ASCII conversion. Let us know how it rates (> 2000 downloads).
"ASCII printing routines"
http://www.avrfreaks.net/index.p...

Kind Regards,
Jack Tidwell


outer_space wrote:
packed bcd is great if youre working with a segment display

???

Sorry, but I'm confused totally now :?:

You need a 7-segment code to drive 7-segment displays.

In my applications I use e.g. 5 bytes of SRAM for 5 digits and put the 7-segment patterns into them (with leading zeros blanked), and then the multiplex timer interrupt drives digit after digit.

Thus the binary to 7-segment conversion need not be done inside the interrupt handler, and so the smallest code size is the most efficient.

Peter


Here follows example code for 7-segment output, extremely efficient (only 36 words).

;*************************************************************************
;*                                                                       *
;*                      Convert unsigned 16bit to 7-Segment              *
;*                                                                       *
;*              Author: Peter Dannegger                                  *
;*                      danni@specs.de                                   *
;*                                                                       *
;*************************************************************************
;
.nolist
.include "2313def.inc"
.list
;
.equ    _0A     = 0x02                          ;segment order
.equ    _0B     = 0x04
.equ    _0C     = 0x40
.equ    _0D     = 0x10
.equ    _0E     = 0x08
.equ    _0F     = 0x01
.equ    _0G     = 0x20
.equ    _0DP    = 0x80                          ;decimal point

.equ    _00     = ~( _0A+_0B+_0C+_0D+_0E+_0F     )      ;number pattern, low active
.equ    _01     = ~(     _0B+_0C                 )
.equ    _02     = ~( _0A+_0B+    _0D+_0E+    _0G )
.equ    _03     = ~( _0A+_0B+_0C+_0D+        _0G )
.equ    _04     = ~(     _0B+_0C+        _0F+_0G )
.equ    _05     = ~( _0A+    _0C+_0D+    _0F+_0G )
.equ    _06     = ~( _0A+    _0C+_0D+_0E+_0F+_0G )
.equ    _07     = ~( _0A+_0B+_0C                 )
.equ    _08     = ~( _0A+_0B+_0C+_0D+_0E+_0F+_0G )
.equ    _09     = ~( _0A+_0B+_0C+_0D    +_0F+_0G )
;
	.dseg
	.org	0x60
digits:
	.byte	5			;digit data for multiplex interrupt
	.cseg
;-------------------------------------------------------------------------
;input: R17, R16= 16 bit value 0 ... 65535
;output: digits = 5 digits (7-segment code)
;
;words: 36 (40)
;
bin16_ascii:
	ldi	yl, digits	
	ldi	zh, high(2 * segment_tab)
	ldi	zl, low( -1 + 2 * segment_tab)
_bcd1:
	inc	zl
	subi	r16, low(10000)
	sbci	r17, high(10000)
	brcc	_bcd1
	rcall	_bcd5
	ldi	zl, low(10 + 2 * segment_tab)
_bcd2:
	dec	zl
	subi	r16, low(-1000)
	sbci	r17, high(-1000)
	brcs	_bcd2
	rcall	_bcd5
	st	y+, r0
	ldi	zl, low(-1 + 2 * segment_tab)
_bcd3:
	inc	zl
	subi	r16, low(100)
	sbci	r17, 0
	brcc	_bcd3
	rcall	_bcd5
	ldi	zl, low(10 + 2 * segment_tab)
_bcd4:
	dec	zl
	subi	r16, -10
	brcs	_bcd4
	rcall	_bcd5
	ldi	zl, low(2 * segment_tab)
	add	zl, r16
_bcd5:
	lpm				;number to 7-segment
	st	y+, r0			;store in multiplex SRAM
	ret
;-------------------------------------------------------------------------
	.if	((pc + 4) ^ pc) & 0x80	;table inside the same 256 byte ? 
		.org	(pc & 0xFF80) + 0x80	;otherwise next 256 byte
	.endif
segment_tab:
	.db	_00, _01, _02, _03, _04, _05, _06, _07, _08, _09
;-------------------------------------------------------------------------

Peter


Hi Guys!

 

I DO REALIZE this thread is over 12 years old.....but I really needed a 32-bit hex to BCD ASM routine and to be frank I just wasn't up to the task of writing it, so I decided to do some searching....

 

In POST#12 danni (Peter Dannegger) posted what looked like a perfect fit for my needs.....but it didn't work.... While the very cool tricks he used are way beyond my little wheelhouse, I really needed it to work, so I set to it in the debugger.....the problem starts @ _bcd7: and follows through to the end .... here it is:

 

        ldi     r19, -1 + '0'
_bcd7:  inc     r19
        subi    r30, byte1(1000)          ;-1000
        sbci    r31, byte2(1000)
        brcc    _bcd7

        ldi     r18, 10 + '0'
_bcd8:  dec     r18
        subi    r30, byte1(-100)          ;+100
        sbci    r31, byte2(-100)
        brcs    _bcd8

        ldi     r17, -1 + '0'
_bcd9:  inc     r17
        subi    r30, 10                 ;-10
        brcc    _bcd9

        subi    r30, -10 - '0'
        mov     r16, r30
        ret

 

It is a very simple slip-up.....in the section above, to fix it simply replace r31 with r29 and replace r30 with r28..... after the replacements are made the code works perfectly.  What amazes me is that no one has pointed out this typo in the intervening 12 years....

 

@danni,  THANKS for the great code!

 

Fish


Fish4Fun wrote:
@danni, THANKS for the great code!

Indeed, sentiments echoed...

 

And thanks for finding the bug.

Even though it's an old thread, it's still giving.

 

I'd just found that routine, and am now using it.

I would probably have tried one of the others if it didn't work, so lucky I saw your patch.

Works great now.

 

I wanted to limit the use of the precious high set registers, so I made some simple changes:-

* Changed the output registers to R06-R15

* Added a prologue that stores the two constants in memory (-1+'0' & 10+'0')

* Changed the LDI's for the output registers to LDS's from the two constants

 

Cheers,

Rob

 


intabits wrote:

I wanted to limit the use of the precious high set registers, so I made some simple changes:-

* Changed the output registers to R06-R15

* Added a prologue that stores the two constants in memory (-1+'0' & 10+'0')

* Changed the LDI's for the output registers to LDS's from the two constants

 

Just realized I should have pushed two high registers and kept the constants in those, making the LDS's into MOV's.

Will do that instead...

 


I think the main reason it was not found is that not many people use 32 bit in ASM (most people then use C and the libs that come with the compiler).

Second, it can be rather slow with the wrong numbers, but if speed doesn't matter it's an easy way (and then it can be done with a LUT and one big loop).

 

To make the worst case a bit faster you can first subtract a larger step such as 4000, and once the result goes negative add 1000 at a time (by subtracting the negated number).

That way you avoid the long loop for 9 (and for 0 in the other direction). (As I remember, 3000 and 1000 work as well.)
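
A minimal sketch of this coarse-then-fine idea in plain C, for a 16-bit value. The function name and the signed 32-bit working variable are choices made here to keep the illustration simple; hand-written AVR code would presumably stay in 16 bits and branch on the carry flag instead.

```c
#include <stdint.h>

/* Step down by 4 * unit until the value goes negative, then step
   back up by unit. out[0] is the most significant digit and out[5]
   is the string terminator. */
static void u16_to_ascii_coarse(uint16_t v, char out[6])
{
    static const int32_t unit[4] = { 10000, 1000, 100, 10 };
    int32_t val = v;

    for (uint8_t i = 0; i < 4; i++) {
        int8_t digit = 0;
        do {                      /* at most 3 coarse steps per digit */
            val -= 4 * unit[i];
            digit += 4;
        } while (val >= 0);
        do {                      /* at most 4 fine steps per digit */
            val += unit[i];
            digit--;
        } while (val < 0);
        out[i] = (char)('0' + digit);
    }
    out[4] = (char)('0' + val);
    out[5] = '\0';
}
```

With these step sizes a digit never needs more than 7 loop passes in total (the worst case is an 8), compared with up to 10 for a loop that moves one unit at a time.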

 

If you use an AVR with MUL there is a faster way: divide by 10000 and deal with the result as two 16 bit numbers (and a loop for the first digit).

Then it can be done in less than 200 clocks for all numbers.
 
