Fast conversion of Integer to BCD; assembly atmega328p.


Latest posts:

Four_Bytes_To_Unpacked-BCD_V1.asm     AVG 164.08 cycles, 146 words length

Three_Bytes_To_Unpacked-BCD_V5.asm    AVG 102.45 cycles,  92 words length

Two_Bytes_To_Unpacked-BCD_v4.asm      AVG  47.86 cycles,  43 words length

 

Dividing by 10 is time-consuming.

I divide by 1000 instead. That is a divide by 4,

followed by a divide by 250. (a)

The divide by 250 starts with a divide by 256.

The quotient of the divide by 256 is multiplied by 6, since (256-250)=6.

Next, we add the remainder of that division (a), and divide again the same way.
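
A minimal sketch of how the divide by 4 and the divide by 250 recombine into a divide by 1000. It is written in Python only to check the arithmetic; the function name is made up for this example, and the built-in divmod stands in for the 256-based divide by 250 described above.

```python
def divmod_1000(n):
    """Return (n // 1000, n % 1000) using a divide by 4, then a divide by 250."""
    low2 = n & 3               # remainder of the divide by 4 (two right shifts)
    m = n >> 2                 # quotient of the divide by 4
    q, r = divmod(m, 250)      # stand-in for the 256-based divide by 250
    # n = 4*m + low2 and m = 250*q + r, so n = 1000*q + (4*r + low2)
    return q, 4 * r + low2


print(divmod_1000(16_777_215))
```

Since r is at most 249 and low2 at most 3, the recombined remainder never exceeds 999, so no correction step is needed afterwards.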

 

The main.asm attachment is a three-byte conversion to BCD, AVG=146 clks.  <-- this was first posted

It is a long program, but it may be shortened (with a loss of speed).

Four bytes (or more) can be converted similarly, but I will not upload untested, cumbersome work.

 

Added Two_Bytes_BCD.asm, with better comments and, I hope, a better way to understand the algorithm; AVG=72.3 cycles

Added Two_Bytes_BCD_v2.asm; 52 to 66 clks as a MACRO, not counting the input; AVG from 0 to 65535 = 54.53 cycles

Added Two_Bytes_BCD_v3.asm; tested; min 49 to max 52 clks; AVG=49.86 cycles

Added Two_Bytes_BCD_v4.asm; tested; min 47 to max 50 clks; AVG=48 cycles

Added Three_Bytes_To_Unpacked-BCD_V2.asm; tested; min 110 to max 135 clks; AVG=119 cycles; 135 words (cseg)

Added Three_Bytes_To_Unpacked-BCD_V4.asm; tested; min 121 to max 127 clks; AVG=122.5 cycles; 110 words (cseg)

Added Three_Bytes_To_Unpacked-BCD_V5.asm; tested; min 102 to max 105 clks; AVG=102.5 cycles; 92 words (cseg)  <-- last modified on 02/24/21

Added Four_Bytes_To_Unpacked-BCD_V1.asm; tested; min 164 to max 167 clks; AVG=164.08 cycles; 146 words (cseg)


This topic has a solution.
Last Edited: Thu. Feb 25, 2021 - 07:33 AM

We have been there before (but not with 3 bytes).

Do you know how many clks it takes (fastest, slowest, AVG)?

On AVRs with a HW mul, the fastest way is some form of mul by 1/x.

Somewhere in the forum there is code with a worst case of something like 45 clks (for 16-bit numbers).

I also made a 16-bit version, but it's 68 clks worst case.  (It's also here somewhere.)

And that worked like this:

a hack to find the first digit (so 0..9999 is left, which often is all you need for a 4-digit display)

div by 100

split the result and the remainder into digits.

 

Add: make sure that you can compete with the simple code that does this:

subtract 10000000 until you can't do it any more

then do the same with 1000000

then 100000

10000

...

...

With 24 bits I guess you can avoid the first loop and just compare.
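
The simple repeated-subtraction scheme listed above can be sketched as follows. This is a Python model for checking the logic only, not AVR code; the names POWERS and to_digits are made up for this example.

```python
POWERS = (10_000_000, 1_000_000, 100_000, 10_000, 1_000, 100, 10)


def to_digits(n):
    """Unpacked decimal digits of a 24-bit value, most significant first."""
    digits = []
    for p in POWERS:
        d = 0
        while n >= p:          # subtract until you can't do it any more
            n -= p
            d += 1
        digits.append(d)
    digits.append(n)           # what is left is the ones digit
    return digits


print(to_digits(16_777_215))
```

For a 24-bit input the first digit can only be 0 or 1, which is why a single compare can replace the first loop.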

 

It can be done a tad faster if you subtract until the number goes negative;

for the next digit you ADD until it becomes positive.
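
A sketch of that alternating variant, again as a Python model rather than AVR code. The function name is hypothetical, and Python's signed integers stand in for the carry/sign tests the assembly would use.

```python
POWERS = (10_000_000, 1_000_000, 100_000, 10_000, 1_000, 100, 10)


def to_digits_alternating(n):
    """Decimal digits of a 24-bit value, alternating subtract and add passes."""
    digits = []
    subtracting = True
    for p in POWERS:
        if subtracting:
            d = -1
            while n >= 0:      # subtract until the value goes negative
                n -= p
                d += 1
        else:
            d = 10
            while n < 0:       # add back until it is non-negative again
                n += p
                d -= 1
        digits.append(d)
        subtracting = not subtracting
    # After a subtracting pass n is negative, so the ones digit is n + 10.
    digits.append(n + 10 if n < 0 else n)
    return digits


print(to_digits_alternating(16_777_215))
```

The saving is that no pass has to undo its last step: the overshoot of one pass is the starting point of the next.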

 

If the worst case matters then this is better:

subtract 4000 until negative (adding 4 to the digit each time),

then ADD 1000 until positive.

The AVG is slower but the worst case is better.
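
A sketch of the 4000/1000 variant for one digit, assuming an input of 0..9999. It is a Python model only, and the function name is made up for this example.

```python
def thousands_digit(n):
    """Return (thousands digit, remainder) for 0 <= n <= 9999."""
    d = 0
    while n >= 0:              # coarse steps: subtract 4000, digit += 4
        n -= 4000
        d += 4
    while n < 0:               # fine steps: add 1000 back, digit -= 1
        n += 1000
        d -= 1
    return d, n


print(thousands_digit(9999))
```

Under these assumptions there are at most 3 coarse and 4 fine steps, against up to 9 successful subtractions (plus the failing one) for the plain subtract-1000 loop.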

Last Edited: Sat. Jan 30, 2021 - 12:42 PM

I don't know how to measure the worst case.

I think I'm under 160 clks.

Is there a way to find the worst case with Atmel Studio?

Also, are you interested in the average for numbers above two bytes?

You can convert two bytes with another program, like your version.

So, fastest above 65535?

 

My atmega328p runs at 16 MHz.

With three bytes you can display your real frequency, in real time.

The three-byte maximum is 16.777215 M.

I'm also displaying the temperature above the quartz crystal, using the same uC.

The measuring is done using a UBLOX GPS signal.


 

With three bytes you can display your real frequency, in real time.

 

Well that goes without saying...if you run at 20MHz, and assume real time is 10 updates/sec...you could use roughly 20e6/10=  2 million cycles per display update (for everything)...don't think 160 clk or even 1600 clk  is gonna be too bad!

 

here is a source for a lot of good general ideas:

http://www.piclist.com/techref/m...

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

Last Edited: Sat. Jan 30, 2021 - 03:48 PM

I am counting my own uC clocks using Timer/Counter1 Capture Event,

250 times per second, on an Event raised by a GPS receiver.

 

At the same time I have a TWI connection with a thermometer,

and a 2Mbit full duplex UART connection ... and this is just the beginning.

 

I have indeed a "too fast" conversion for hex to ASCII,

but the relatively large length of my "routine" is not a problem;

and that is because I'm not using libraries.

 


and this is just the beginning.

Soon it will also play three concurrent games of chess, while controlling a pizza oven.

 

and that is because I'm not using libraries.

Doing it yourself can be customized to your needs, and without fear of library fines. 


I'm thinking of storing frequencies indexed by temperature;

 that is on TWI also. Some kind of auto-calibration.

Thank You.

 

 


?

5 numbers for a polynomial should cover all you will ever need, so 10 (perhaps 15) bytes of EEPROM should be enough. (Remember that the chip ages, so the internal clk will change over time; a 0.1% calibration on a new chip is overkill.)

 

Add: a crystal also ages; it just has a much smaller drift.

Last Edited: Sat. Jan 30, 2021 - 06:12 PM

I meant only the crystal. As far as I've seen, the large drift is with temperature.

I guess there's no problem with "linking" your device to the UBLOX only once a year,

to correct your frequency-versus-temperature table for the aging of the quartz.

 

The adt7422 is sensitive enough for a 0.1 degree step, so I'll have 40 bytes/degree;

meaning temperature + frequency. I would prefer to store this outside the uC.


Buy an external osc; a $2 one would be better than any crystal with the AVR oscillator (the two caps also contribute to the error).

(On some external oscillators there is a pin that lets you control the speed with a DAC, only +-2kHz or so.)

 

And how accurate do you expect the clk to be? And do you just need to know the error, or do you actively want to change the speed?

 

On some old VHF radios the crystal was in a box heated up to 70 deg, controlled by an NTC/PTC resistor.

Last Edited: Sat. Jan 30, 2021 - 07:29 PM

The code is a bit hard to follow.  The only register names I see being used are ZERO and SIX.  Naming more registers like REMAINDER would make the code easier to follow.

 

It's also hard to follow because it is not simplified.  When I see code that takes more instructions to do something than the simple way, I then have to figure out if there is a reason for doing it the complicated way, or if the writer just didn't know the simple way.

 

For example, your code takes 6 instructions to convert a byte in the range of 0 to 16 from binary to BCD.  It only takes 3 instructions to do that:

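cpi  r18, 10          ; is the value 10 or more?
brlo 1f               ; no: a single digit is already valid BCD
subi r18, lo8(-6)     ; yes: add 6, so 10..16 become $10..$16
1: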
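
To see why adding 6 is enough, here is a small Python model of those three instructions. The function name is made up for this example, and the check covers 0..19, the widest range for which a single add of 6 gives valid packed BCD.

```python
def small_to_bcd(x):
    """Model of cpi/brlo/subi: add 6 when the value is 10 or more."""
    return x + 6 if x >= 10 else x


for x in (9, 10, 16, 19):
    print(x, hex(small_to_bcd(x)))
```

Adding 6 skips the six hex codes $A..$F, so the low nibble wraps into the ones digit and the carry lands in the tens nibble.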

 

I have no special talents.  I am only passionately curious. - Albert Einstein

 


Naming more registers like REMAINDER would make the code easier to follow.

It also GREATLY reduces the risk of an error ...much too easy to mix up  r12, r13,r2, r23, r21, at least just glancing around the code.   On the other hand, maybe no names forces you to work very extra carefully to not make a mistake.

When we used pencil cards, we had to work hours to submit 30 lines of code....so there was a LOT of care used to make sure it was correct the first time...no second chance.  


My code:

clr r18           ; max. 16 byte conversion to BCD
cpi r16,10
brlo PC+3
subi r16,10
ldi r18,$10
or r16,r18

First instruction should be erased; my mistake.

 

Your code is faster. I was just copying and pasting (from my code above); not thinking.

Thank you.

I know my code is hard to follow, because I had to debug it. As I've already said, it is cumbersome. I'm sorry.

There are very many remainders, so I've tried to give a general idea of what the algorithm does.

If you'll let me know what you did understand, it might be a good start.


I don't think the quartz needs to be accurate; and I do have an external oscillator (not in use for the moment).

Very good advice. Thank You.

 

I change the Counter value while counting, after the reading, but not on this project.

So, as long as I know my Fclk is 15999432 at 22.2 degrees, this should be "reasonably fine".

 

A few years ago I made a clock. Very accurate; off by just 1-2 seconds over three months.

But it was working so well just because the temperature in my kitchen was very stable. :-)


1. I've tested my code from 0 (zero) to 256^3-1, and it is working.

2. As you probably suspect, I'm working with interrupts, so I do know what confusing a register might do to you.

3. I've worked with perforated cards and perforated tapes;

I've seen pencil cards, but they didn't let us run the program;

they (the teachers) were doing only a visual inspection of the program.

4. If you want to see a program REALLY hard to understand, please see att.

Again, I'm sorry for the inconvenience.


Your code really needs a comment block header explaining what is going on. I'm still struggling to read through and understand it after several passes.

 

Anyway this topic has; since the dawn of time; been a rite-of-passage that every programmer must undertake. You might find this "reference thread" interesting. https://www.avrfreaks.net/forum/binary-decimal-conversion-reference-all


I'm working on a two-byte version, hoping that it will make the algorithm easier to understand.

I do believe it is the algorithm that causes you difficulties.

It's similar to doing the division by hand, with a pencil.

int(32735/256)=127    mod(32735,256)=223  ; you don't have to do this because it's just BYTE1 and BYTE2

on the other hand we are looking for:

int(32735/250)=130    mod(32735,250)=235

Based on 256=250+6: this means that 127*6+223=985

now int(985/256)=3  mod(985,256)=217   ; we are keeping score of the quotient!

and again 3*6+217=235,   BUT THIS TIME the remainder is smaller than 250.

So the result will be 127+3=130 and the remainder is 235

Does this make sense?
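
The worked example above can be replayed with a short Python model. This is only a sketch of the arithmetic, not the attached assembly; the function name is made up, and the final correction for a remainder of 250..255 is an assumption about how the leftover case would be handled.

```python
def divmod_250(n):
    """Divide by 250 using only divides by 256 (byte splits) and a multiply by 6."""
    quotient = 0
    while n >= 256:
        high, low = n >> 8, n & 0xFF   # int(n/256) and mod(n,256): just the bytes
        quotient += high               # keep score of the quotient
        n = high * 6 + low             # 256 = 250 + 6, so each 256 leaves 6 over
    if n >= 250:                       # 250..255 still holds one more 250
        quotient += 1
        n -= 250
    return quotient, n


print(divmod_250(32735))
```

Each pass keeps the total 250*quotient + n unchanged while shrinking n, which is why the loop always ends.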


I totally forgot this way to do it :

 

https://www.avrfreaks.net/forum/...

 

and the last version can be 24-bit without code changes. (The code relies on the registers being memory mapped on the original AVRs, so with more digits the start needs to be lower than 20.)

 

The 16-bit version takes 400 clks and the general one should be about 1200 clks.

 

But the last code is 27 instructions regardless of the size, even if the numbers are 32-bit.


I was trying to avoid the usage of registers above R16. I need those for other purposes.

Also my intention was to use as few registers as possible.

PUSH and POP on interrupts cost 4 cycles, but you have to do LDS and STS also.

So 8 clks is a price hard to bear when you are dealing with more than 250 ints/second.

 

Also, I was trying to find something ready to fit on my purpose, and I didn't.

When I have no luck, I'm getting to work on my own.

 

It was my intention to share; maybe others will have more luck than me. ;-)

Last Edited: Sat. Jan 30, 2021 - 11:59 PM

I have not checked your code, but from this comment

; max. 99 byte conversion to BCD

I take it that the next 17 instructions do that.

 

This can be done faster and smaller.

If you need to do this more than once, then have a register that holds 10 all the time and one that holds 26.

The algorithm uses the fact that 26/256 is close to 1/10:

if the number is less than 64, then the high byte of the mul holds the correct tens digit;

else dec the number before the mul.

Then the last digit is: number - 10 * the tens digit.

 

(This code is just quickly from my head, but about correct ;) )

; nummer in r16 only 0..99 are legal
; 10 i r21
; 1  i r20

ldi r17,26
mov r20,r16
sbrc r18,6
dec r20
mul r20,r17
mov r21,r1
ldi r17,10
mul r21,r17
sub r20,r0

This is wrong; the correct code is in #43.
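
The 26/256 idea itself, as described in the prose of this post, can be checked with a short Python model. This is a sketch of the arithmetic only, with a made-up function name; it is not a transcription of the assembly listing.

```python
def split_0_99(x):
    """Return (tens, ones) for 0 <= x <= 99 using a multiply by 26/256."""
    if not 0 <= x <= 99:
        raise ValueError("only 0..99 are legal")
    t = x - 1 if x & 0x40 else x   # bit 6 set means 64 or more: use one less
    tens = (t * 26) >> 8           # high byte of the 8x8 multiply
    ones = x - 10 * tens           # subtract from the ORIGINAL number
    return tens, ones


print(split_0_99(99))
```

The subtraction has to use the original value, not the decremented one, otherwise the ones digit comes out one too small for 64..99.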

Last Edited: Sun. Jan 31, 2021 - 08:49 PM

I was trying to avoid the usage of registers above R16. I need those for other purposes.

the link showed comparing immediate to 5 & 7 (high registers)---you can just load those into some registers before the entire code & have them ready when needed (low registers)

 

ldi temp, 5

mov myfive, temp

------------------------------------

so instead of: cpi r18, 5

                     brlo L3                     

 

you do:   cp r7,  myfive 

              brlo L3         

Last Edited: Sun. Jan 31, 2021 - 02:38 AM

Catalin Ioan Stanciu wrote:

If you'll let me know what you did understand, it might be a good start.

 

I understood the divide by 4 (2 shift right), and the code to convert the last byte to BCD.  I read the macro that I think is supposed to convert 1 byte to 2 BCD digits, but could not make sense of it.

When I write asm, I use GNU as format.  Most of my asm code is used by C/C++ code, so GNU as format is the only practical option in that case.

 

I'd like to second the suggestion of repeated subtraction.  The code will work on AVRs without a mul, it's still relatively fast, and it will give you the smallest code for converting a 16-bit or 8-bit value.

Here's a version I wrote last year:

https://github.com/nerdralph/deb...

 


I promised to upload a two-byte conversion, to make the algorithm simpler to understand; see att.

This time I've commented a lot; I hope not too much.

This version is also tested (0 to 65535).

I'm just guessing that dividing by 1000 is less efficient on two bytes than on three bytes.

Anyway, in my tests I never saw more than 72 clks used; so I'm assuming fewer than 80.

 

Dividing by 4 first, and by 250 later, is advantageous, because the 250 division will be done on a smaller number.

This time there is no MACRO, and the code is less obfuscated - I hope.

On the contrary, I think MUL is a very powerful tool, even if I can't argue for it at the moment, because of the algorithm I use.

I will end this post, and look at the link afterwards; I hope you don't mind.

Thank you.


That is correct.

Sometimes I'm comparing (for example) with 150, 75 and 25, and later on with 60, 30 and 25.

So, many rereads of the written code will reveal gains you can make; or errors.

Also, the more people look at your code, the better.

Thank you.

 


So what's the minimum for converting $63 to two decimal digits, on atmega328p?

This code is 12.5clk AVG, min. 9clk and max. 15clk. Not bad at all, on first writing. :-)

 

Number range   Cycles
 0 to  9          9
10 to 19         13
20 to 29         14
30 to 39         10
40 to 49         14
50 to 59         15
60 to 69         10
70 to 79         14
80 to 89         15
90 to 99         11


The code I wrote in #20 always takes 11 clks as it is.

 

With constant registers for 10 and 26 it takes 9 clks (plus 2 for the init).


Mul takes two cycles:

"MUL Rd, Rr Multiply unsigned R1:R0 ← Rd × Rr Z,C  2"

from 7810D–AVR–01/15  ATmega328P [DATASHEET] page 281

 

so the total is 9+2=11

If I do the initialization it will be 13, and I will lose the benefit of not using two more registers.

I am very, very sorry; in this case I'd say AVG 12.5 is better.

 

I don't know if you looked at the conversion of the two bytes to BCD, att. at #23.

I've measured AVG=72.3, from 256 to 65535; that is, excluding the one-byte conversion.


Given how cheap on-board flash is these days, why don't you just do it with a LUT? 1 byte to 3 BCD digits is only 512 bytes, or 384 with a small amount of smarts.
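
A sketch of where the 512 comes from, assuming each of the 256 entries is stored as two bytes of packed BCD. The layout and the names pack_bcd, LUT and LUT_BYTES are assumptions made for this example, not something stated in the post.

```python
def pack_bcd(n):
    """Pack 0..255 as three BCD digits in a 16-bit value, e.g. 255 -> 0x0255."""
    hundreds, rest = divmod(n, 100)
    tens, ones = divmod(rest, 10)
    return (hundreds << 8) | (tens << 4) | ones


# One 16-bit entry per input byte: 256 entries * 2 bytes = 512 bytes of flash.
LUT = [pack_bcd(n) for n in range(256)]
LUT_BYTES = b"".join(v.to_bytes(2, "big") for v in LUT)

print(len(LUT_BYTES), hex(LUT[255]))
```

Three BCD digits only need 12 bits, so packing two entries into three bytes would give the 384-byte figure, at the price of a little extra unpacking.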

#1 Hardware Problem? https://www.avrfreaks.net/forum/...

#2 Hardware Problem? Read AVR042.

#3 All grounds are not created equal

#4 Have you proved your chip is running at xxMHz?

#5 "If you think you need floating point to solve the problem then you don't understand the problem. If you really do need floating point then you have a problem you do not understand."


"Just saw I was wrong, and you are right. Please excuse.

Your code is indeed 11 cycles and I have to reconsider my code.

Thank you."  - my original comment.

 

Now, that I've tested the code in Question,

I can say that It Is Wrong Code.

It just seems to be good.

So this code:

; nummer in r16 only 0..99 are legal
; 10 i r21
; 1  i r20

ldi r17,26
mov r20,r16
sbrc r18,6
dec r20
mul r20,r17
mov r21,r1
ldi r17,10
mul r21,r17
sub r20,r0

Gives this conversion:

Hex. Dec. Conversion Hex. Dec. Conversion Hex. Dec. Conversion Hex. Dec. Conversion
00 0 0000 19 25 0205 32 50 0500 4B 75 0705
01 1 0001 1A 26 0206 33 51 0501 4C 76 0706
02 2 0002 1B 27 0207 34 52 0502 4D 77 0707
03 3 0003 1C 28 0208 35 53 0503 4E 78 0708
04 4 0004 1D 29 0209 36 54 0504 4F 79 08FF
05 5 0005 1E 30 0300 37 55 0505 50 80 0800
06 6 0006 1F 31 0301 38 56 0506 51 81 0801
07 7 0007 20 32 0302 39 57 0507 52 82 0802
08 8 0008 21 33 0303 3A 58 0508 53 83 0803
09 9 0009 22 34 0304 3B 59 0509 54 84 0804
0A 10 0100 23 35 0305 3C 60 0600 55 85 0805
0B 11 0101 24 36 0306 3D 61 0601 56 86 0806
0C 12 0102 25 37 0307 3E 62 0602 57 87 0807
0D 13 0103 26 38 0308 3F 63 0603 58 88 0808
0E 14 0104 27 39 0309 40 64 0604 59 89 09FF
0F 15 0105 28 40 0400 41 65 0605 5A 90 0900
10 16 0106 29 41 0401 42 66 0606 5B 91 0901
11 17 0107 2A 42 0402 43 67 0607 5C 92 0902
12 18 0108 2B 43 0403 44 68 0608 5D 93 0903
13 19 0109 2C 44 0404 45 69 07FF 5E 94 0904
14 20 0200 2D 45 0405 46 70 0700 5F 95 0905
15 21 0201 2E 46 0406 47 71 0701 60 96 0906
16 22 0202 2F 47 0407 48 72 0702 61 97 0907
17 23 0203 30 48 0408 49 73 0703 62 98 0908
18 24 0204 31 49 0409 4A 74 0704 63 99 0AFF
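
The four bad rows in the table (69, 79, 89 and 99) are exactly the inputs where a multiply by 26/256 without the decrement overshoots the tens digit. A short Python check of that claim follows; it models the arithmetic only and does not emulate the registers.

```python
# Inputs where the undecremented multiply gives the wrong tens digit.
bad = [x for x in range(100) if (x * 26) >> 8 != x // 10]

# For those inputs the ones digit underflows to -1, i.e. $FF in an 8-bit register.
ones = [(x - 10 * ((x * 26) >> 8)) & 0xFF for x in bad]

print(bad, [hex(v) for v in ones])
```

This matches the 07FF, 08FF, 09FF and 0AFF entries, which suggests the decrement never took effect in the tested code.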
Last Edited: Sun. Jan 31, 2021 - 10:15 AM

Is this just a "fun" exercise to see how fast you can do it?  What if it took 123 clocks to convert?--if run at 20MHz, that is only 6 microseconds ...doubt you will have time to complain it is too slow.  If you don't have 6us to spare every 100 ms or so to update your display, there is other trouble brewing.


It's your code; I just give input.

 

I found my old code, but it's 12 years old and today I could do better, and as you see I use my old code to split 0..99

https://www.avrfreaks.net/commen...

 

But as I said, there is some code here in the forum that does the 16-bit conversion in less than 50 clks (worst case).

It is based on a more precise mul by 1/x, so the next digits pop up from the remainder just by a mul by 10.

(I do it a fast way to get the correct result, but my remainder is useless, so I have to sub 10*result.)

 

In general, don't optimize too much at the cost of how you use the routine; you can easily waste the savings by having to store registers, do remapping, etc.

(For example, my code leaves a number 0..9 in each register, not $30..$39 as your print routine wants, and if the left-side digits are 0 they should be $20.)
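
The remapping mentioned here (digits 0..9 to $30..$39, leading zeros to $20) could look like the sketch below. It is a Python model with a made-up function name, and keeping the last digit even when it is zero is an assumption about what the print routine expects.

```python
def digits_to_ascii(digits):
    """Map unpacked digits 0..9 to ASCII, blanking leading zeros with spaces."""
    out = bytearray()
    leading = True
    for i, d in enumerate(digits):
        is_last = i == len(digits) - 1
        if leading and d == 0 and not is_last:
            out.append(0x20)           # leading zero becomes a space ($20)
        else:
            leading = False
            out.append(0x30 + d)       # 0..9 becomes $30..$39
    return bytes(out)


print(digits_to_ascii([0, 0, 1, 2, 0]))
```

Whatever cycles the conversion saves, this extra pass has to be counted too if the caller needs printable characters.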

 

 

 

 

      

 


Thank You.


I just reread the thread and in #5 I found:

250 times per second, on an Event raised by a GPS receiver.

Doesn't your GPS give you the time? (So you don't need a precise crystal.)

(Else get a GPS with a 1 pps clk output,)

(or check how precise the timing of the first char in a position message is; there could be jitter but no drift.)

 

And as said in #30, why can't it take a bit longer? If you run at 16 MHz and do this 250 times a second (about 25 times more than needed!) and the routine takes 200 clks (with print), it's still only 0.3% of the time.
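
The 0.3% figure is easy to re-derive. A tiny Python helper (the name is made up for this example) makes the same estimate for other clock rates or call rates:

```python
def cpu_load_percent(f_cpu_hz, calls_per_second, cycles_per_call):
    """Share of CPU time spent in a routine, in percent."""
    return 100.0 * calls_per_second * cycles_per_call / f_cpu_hz


print(cpu_load_percent(16_000_000, 250, 200))
```

With the numbers from the post, 250 calls of 200 cycles each use 50000 of the 16 million cycles available per second.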


Given how cheap on-board flash is these days, why don't you just do it with a LUT? 1 byte to 3 BCD digits is only 512 bytes, or 384 with a small amount of smarts.

If the most extreme speed is needed, and your AVR has spare flash space going to waste, you may include the BCD to 7 segment conversion in the LUT...otherwise that will take as many (or more) cycles to figure out which segments to light from the BCD.   What is the end goal, is this some sort of contest?   

Last Edited: Sun. Jan 31, 2021 - 10:57 AM

I am using a 0.5 s or 1 second update, and I'm not using the "oscilloscope" type of display; so I don't know.

I've done some programs displaying on a Hitachi HD44780; this uses 4 lines of data and 2 lines for control.

Now I'm using only UART or TWI. I haven't had issues with these, but I'm using short Interrupt Routines (~60 cycles).

I have two TWI to Hitachi HD44780 adapters, not used.

I'm using the "USART Rx Complete" Interrupt, but on the transmit side I use:

.MACRO SChar2UART
    lds        @1,UCSR0A
    sbrs    @1,UDRE0    ; Wait for empty transmit buffer
     rjmp    PC-3
    sts        UDR0,@0        ; Send CHAR
.ENDMACRO

.MACRO SendUART2ASC
    mov r16,@0
    swap r16
    ASC r16
    SChar2UART r16,r17
    mov r16,@0
    ASC r16
    SChar2UART r16,r17
.ENDMACRO

 


I am using 0.5s or 1 second update, and I'm not using the "oscilloscope" type display; so I don't know.

So why worry whether it takes 10us or 10ms to do the conversion?   The conversion should have no effect at all on your interrupts...the IRQs will just interrupt your conversion whenever they need to, even if the conversion took 5000 cycles.

As you convert each digit, it will be sent into the TX buffer along with whatever other messages and tidbits you are sending out to be displayed.

 


 

I. GPS

II. I'm using Counter1 to count the pulses of the signal received from the UBLOX.

I can't count more than 65535 on a two-byte counter (or it's difficult),

so I've chosen 250x64000; that is, every 64000 pulses from the U-BLOX I get an INT,

and based on the GPS precision, I have a very good approximation of what I said:

250 INTs/second. I'm using only 60 cycles on every INT generated by the UBLOX,

but I also have the USART Rx Complete INT, and the TWI transmission.

III. I've said "too fast":

I am counting my own uC clocks using Timer/Counter1 Capture Event,

250 times per second, on an Event raised by a GPS receiver.

 

At the same time I have a TWI connection with a thermometer,

and a 2Mbit full duplex UART connection ... and this is just the beginning.

 

I have indeed a "too fast" conversion for hex to ASCII,

but the relatively large length of my "routine" is not a problem;

and that is because I'm not using libraries.

 

Added on 03/03/2021:

Someone found something rather useful here, so attached are my present-day source files.


I'm not worrying ("So why worry whether it takes 10us or 10ms to do the conversion?").

I've said "Fast conversion of Integer to BCD; assembly atmega328p",

and I am open to making it faster, if suggested.

In #19 I said "It was my intention to share"; that is all.


This was wrong; the correct code is in #43.

Last Edited: Sun. Jan 31, 2021 - 08:48 PM

wrong: (the conversion table quoted from the post above, where 69, 79, 89 and 99 come out as 07FF, 08FF, 09FF and 0AFF)


OK, I don't have time to look at it now, and can't get to the code I have tested (and I remember it was 1 clk faster).

 

This is what I did yesterday in Excel: the first result column is with x (used for 0..63) and the other is with x-1 (used for 64..99).

0   0   0
1   0   0
2   0   0
3   0   0
4   0   0
5   0   0
6   0   0
7   0   0
8   0   0
9   0   0
10   1   0
11   1   1
12   1   1
13   1   1
14   1   1
15   1   1
16   1   1
17   1   1
18   1   1
19   1   1
20   2   1
21   2   2
22   2   2
23   2   2
24   2   2
25   2   2
26   2   2
27   2   2
28   2   2
29   2   2
30   3   2
31   3   3
32   3   3
33   3   3
34   3   3
35   3   3
36   3   3
37   3   3
38   3   3
39   3   3
40   4   3
41   4   4
42   4   4
43   4   4
44   4   4
45   4   4
46   4   4
47   4   4
48   4   4
49   4   4
50   5   4
51   5   5
52   5   5
53   5   5
54   5   5
55   5   5
56   5   5
57   5   5
58   5   5
59   5   5
60   6   5
61   6   6
62   6   6
63   6   6
64   6   6
65   6   6
66   6   6
67   6   6
68   6   6
69   7   6
70   7   7
71   7   7
72   7   7
73   7   7
74   7   7
75   7   7
76   7   7
77   7   7
78   7   7
79   8   7
80   8   8
81   8   8
82   8   8
83   8   8
84   8   8
85   8   8
86   8   8
87   8   8
88   8   8
89   9   8
90   9   9
91   9   9
92   9   9
93   9   9
94   9   9
95   9   9
96   9   9
97   9   9
98   9   9
99   10   9

 


Oops, also wrong.

 

Add:

The error is that the last sub is done from the number that has already had 1 subtracted (to save a move). There is a way around it but I can't see it now. (I have to run now but will be back.)

Last Edited: Sun. Jan 31, 2021 - 12:39 PM

Ok now it should work I hope: 

; nummer in r16  [0..99]
; 10 in r21
; 1  in r20

; if used more than once, load 10 and 26 into two registers

ldi r17,26      ; load value for mul with 1/10 (26/256)
mov r20,r16     ; make a copy in the remainder place
sbrc r16,6      ; if 64 or more
dec r16         ; use one less
mul r16,r17     ; do the 1/10 mul
mov r21,r1      ; tens digit is in the high byte of the mul result
ldi r17,10      ; load value for mul with 10
mul r1,r17      ; r0 = 10 * tens digit
sub r20,r0      ; low digit is the remainder


sparrow2 wrote:

Ok now it should work I hope: 

; nummer in r16  [0..99]
; 10 in r21
; 1  in r20

; if used more then once load 10 and 26 to two registers

ldi r17,26      ; load value for mul with 1/10 (26/256)
mov r20,r16  ;make a copy to the remainder place
sbrc r16,6      ;if more than 64 
dec r16          ; use one less
mul r16,r17    ; do the 1/10 mul
mov r21,r1     ; high result in high byte of mul
ldi r17,10       ; load value for mul with 10
mul r1,r17    
sub r20,r0      ;low digit is the remainder

 

You clobber r20 with the "mov r20, r16" instruction, so why does r20 need to be initialized with the value 1?

 


?

1 is low digit of result

10 is high digit of result


maybe say:

 

r21:r20 will contain the 10's:1's  digit conversion result

 


Yes, this is a freak piece of code. It works.

This:

; nummer in r16  [0..99]
; 10 in r21
; 1  in r20

; if used more then once load 10 and 26 to two registers

ldi r17,26      ; load value for mul with 1/10 (26/256)
mov r20,r16  ;make a copy to the remainder place
sbrc r16,6      ;if more than 64
dec r16          ; use one less
mul r16,r17    ; do the 1/10 mul
mov r21,r1     ; high result in high byte of mul
ldi r17,10       ; load value for mul with 10
mul r1,r17    
sub r20,r0      ;low digit is the remainder

 

gives these results:

Hex. Dec. Conversion Hex. Dec. Conversion Hex. Dec. Conversion Hex. Dec. Conversion
00 0 00:00 19 25 02:05 32 50 05:00 4B 75 07:05
01 1 00:01 1A 26 02:06 33 51 05:01 4C 76 07:06
02 2 00:02 1B 27 02:07 34 52 05:02 4D 77 07:07
03 3 00:03 1C 28 02:08 35 53 05:03 4E 78 07:08
04 4 00:04 1D 29 02:09 36 54 05:04 4F 79 07:09
05 5 00:05 1E 30 03:00 37 55 05:05 50 80 08:00
06 6 00:06 1F 31 03:01 38 56 05:06 51 81 08:01
07 7 00:07 20 32 03:02 39 57 05:07 52 82 08:02
08 8 00:08 21 33 03:03 3A 58 05:08 53 83 08:03
09 9 00:09 22 34 03:04 3B 59 05:09 54 84 08:04
0A 10 01:00 23 35 03:05 3C 60 06:00 55 85 08:05
0B 11 01:01 24 36 03:06 3D 61 06:01 56 86 08:06
0C 12 01:02 25 37 03:07 3E 62 06:02 57 87 08:07
0D 13 01:03 26 38 03:08 3F 63 06:03 58 88 08:08
0E 14 01:04 27 39 03:09 40 64 06:04 59 89 08:09
0F 15 01:05 28 40 04:00 41 65 06:05 5A 90 09:00
10 16 01:06 29 41 04:01 42 66 06:06 5B 91 09:01
11 17 01:07 2A 42 04:02 43 67 06:07 5C 92 09:02
12 18 01:08 2B 43 04:03 44 68 06:08 5D 93 09:03
13 19 01:09 2C 44 04:04 45 69 06:09 5E 94 09:04
14 20 02:00 2D 45 04:05 46 70 07:00 5F 95 09:05
15 21 02:01 2E 46 04:06 47 71 07:01 60 96 09:06
16 22 02:02 2F 47 04:07 48 72 07:02 61 97 09:07
17 23 02:03 30 48 04:08 49 73 07:03 62 98 09:08
18 24 02:04 31 49 04:09 4A 74 07:04 63 99 09:09
Last Edited: Mon. Feb 1, 2021 - 06:33 AM

That is pretty wicked!!  Note for clarity

4B 75 0705

 

note--should be shown as:  4B   75  07:05   , since it is 2 results, much different than the value seven hundred five.

 

Now can you make one to do 16 bits?   result from 0 to 65535

Last Edited: Sun. Jan 31, 2021 - 05:55 PM

Slim chances but, some things can't get out of my mind.


Slim chances but, some things can't get out of my mind.

A 16 bit to 5 digit, would prove that you can do it greatly, like its never been done before---are you going to take the challenge? Maybe it requires a lot of coffee


The way the OP does it, he will never get under 50 clks. He will get close to my code, around 70 clks.

(I can never find the code that does it in less than 50 clks; someone help!)

And I think to get under 40 clks you will need one or more LUTs.
