Ok, there are many different ways to convert binary to decimal.
And now I have found a new way, new for me at least. It is actually a very old way, and it seems to have mostly been used in hardware solutions.
It is this: the shift-and-add-3 method (also known as double dabble).
I made some test code that solves this using one byte per digit.
It converts 16 bits, with the input in r17:r16
and the output in r24...r20 (r24 is the most significant digit, r20 the ones digit).
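        movw r18, r16        ; work on a copy of the input
        ldi  r30, 16         ; 16 bits to shift
        ldi  r20, 0          ; clear the five digit bytes
        ldi  r21, 0
        ldi  r22, 0
        ldi  r23, 0
        ldi  r24, 0
L0:     cpi  r20, 0x05       ; digit >= 5?
        brcs L1
        subi r20, -0x7b      ; yes: 5..9 becomes 0x80..0x84, so the doubling below carries out
L1:     cpi  r21, 0x05
        brcs L2
        subi r21, -0x7b
L2:     cpi  r22, 0x05
        brcs L3
        subi r22, -0x7b
L3:     cpi  r23, 0x05
        brcs L4
        subi r23, -0x7b
L4:     cpi  r24, 0x05
        brcs L5
        subi r24, -0x7b
L5:     adc  r18, r18        ; carry is always set here; the 1s shifted in never reach r20
        adc  r19, r19
        adc  r20, r20        ; each digit doubles and takes the carry from below
        adc  r21, r21
        adc  r22, r22
        adc  r23, r23
        adc  r24, r24
        dec  r30
        brne L0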
That is not very small or fast, but it is very simple!
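For anyone wondering why adding 0x7b works, a small C model of the loop above may help. This is only a sketch of the same idea, not the AVR code itself; the function name `bin16_to_digits` and the array layout (`digits[0]` standing in for r20) are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of the byte-per-digit loop.
   digits[0] is the ones digit (r20), digits[4] the top digit (r24). */
static void bin16_to_digits(uint16_t value, uint8_t digits[5])
{
    for (int i = 0; i < 5; i++)
        digits[i] = 0;

    for (int bit = 0; bit < 16; bit++) {
        /* Correction step: a digit d of 5..9 becomes 0x80..0x84, so that
           doubling it overflows the byte and leaves 2*(d-5) plus a carry. */
        for (int i = 0; i < 5; i++)
            if (digits[i] >= 5)
                digits[i] = (uint8_t)(digits[i] + 0x7b);

        /* Shift step: move everything left one bit, carry rippling upward. */
        unsigned carry = (value >> 15) & 1u;
        value = (uint16_t)(value << 1);
        for (int i = 0; i < 5; i++) {
            unsigned doubled = 2u * digits[i] + carry;
            digits[i] = (uint8_t)doubled;   /* keep the low 8 bits */
            carry = doubled >> 8;           /* bit 8 is the carry out */
        }
    }

    /* Sanity check: every result byte is a plain decimal digit. */
    for (int i = 0; i < 5; i++)
        assert(digits[i] <= 9);
}
```

Calling `bin16_to_digits(12345, d)` should leave d[4]..d[0] as 1, 2, 3, 4, 5. The point of 0x7b is that 5 + 0x7b = 0x80, so the doubling both subtracts 10 and produces the carry into the next digit in a single step.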
So here is a general version of the same code:
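        movw r18, r16        ; work on a copy of the input
        ldi  r30, 16         ; number of input bits
        eor  r27, r27        ; XH = 0, so X points into the register file
        eor  r31, r31        ; r31 = 0, used to clear the digits
        ldi  r26, 20         ; XL = address of the first digit (r20)
CLR1:   st   x+, r31         ; clear the digits r20..r24
        cpi  r26, 25
        brne CLR1
        ldi  r31, 25         ; end address for the shift loop
L0:     ldi  r26, 20         ; correction pass over the digits
D0:     ld   r25, x
        cpi  r25, 0x05       ; digit >= 5?
        brcs D1
        subi r25, -0x7b      ; yes: add 0x7b so the doubling carries out
        st   x, r25
D1:     inc  xl
        cpi  xl, 25          ; leaves the carry clear when the loop ends
        brne D0
        ldi  r26, 18         ; shift pass: input bytes first, then the digits
D2:     ld   r25, x
        adc  r25, r25        ; rotate left through carry
        st   x+, r25
        cpse xl, r31         ; cpse and rjmp leave the carry untouched
        rjmp D2
        dec  r30
        brne L0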
Again, it is not very fast or small, but it has one VERY good property: the same code, just with other limits, can handle all kinds of input sizes: 24, 32, 64, 80, 128 ... bits.
The AVR code above uses memory-mapped registers because that is easy for testing, but the input and output can just as well be placed in RAM.
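To illustrate the "other limits" idea, here is a rough C sketch of the general loop with the sizes passed in as parameters. It is an assumption-laden model, not a translation of the AVR code: the name `bin_to_digits`, the little-endian input buffer and the caller-supplied digit count are all chosen just for this example. The caller has to pick enough digits for the input width (for instance 10 digits for 32 bits, 20 for 64 bits); if there are too few, the top carry is simply lost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: buf holds the binary input, least significant byte
   first, and is used as the shift register (it ends up zeroed).
   digits[0] receives the ones digit. */
static void bin_to_digits(uint8_t *buf, size_t nbytes,
                          uint8_t *digits, size_t ndigits)
{
    for (size_t i = 0; i < ndigits; i++)
        digits[i] = 0;

    for (size_t bit = 0; bit < 8 * nbytes; bit++) {
        /* Correction pass: same +0x7b trick as the 16-bit version. */
        for (size_t i = 0; i < ndigits; i++)
            if (digits[i] >= 5)
                digits[i] = (uint8_t)(digits[i] + 0x7b);

        /* Shift pass: input bytes first, then the digits, one carry chain. */
        unsigned carry = 0;
        for (size_t i = 0; i < nbytes; i++) {
            unsigned doubled = 2u * buf[i] + carry;
            buf[i] = (uint8_t)doubled;
            carry = doubled >> 8;
        }
        for (size_t i = 0; i < ndigits; i++) {
            unsigned doubled = 2u * digits[i] + carry;
            digits[i] = (uint8_t)doubled;
            carry = doubled >> 8;
        }
    }

    /* Sanity check: every result byte is a plain decimal digit. */
    for (size_t i = 0; i < ndigits; i++)
        assert(digits[i] <= 9);
}
```

Only the two sizes change between a 24-bit and a 64-bit conversion, which mirrors how the assembly version only needs different start and end addresses and a different bit count.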