Author Message
 RetroDan
 Posted: Mar 31, 2006 - 04:30 AM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am
 TITLE: DIRTY DAN's DIRTY MATH TRICKS PURPOSE: On Going Tutorial on Math Tricks in AVR Native Machine Language for Beginners. PREAMBLE: While trying to help someone in another FORUM with a Division-by-60 routine to converting seconds to minutes, someone made the remark: Quote: What RetroDan has is a quick and dirty approximation! I think it was meant as an insult, but it gave me a great name for the type of binary routines I love: Dirty Math! Quick & Dirty Routines that are small and super fast. CAVEAT LECTOR: If you don't like taking chances and experimenting and don't really care about how big your programs are, or how fast they excecute then this is not a place for you. May I recommend the Atmel Appnotes #200, 201 and 204 that are chock full great standard math routines for the AVR microcontrollers. Last edited by RetroDan on Mar 31, 2006 - 10:40 AM; edited 4 times in total

 RetroDan
 Posted: Mar 31, 2006 - 04:38 AM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am

 RetroDan
 Posted: Mar 31, 2006 - 05:50 AM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am
 TITLE: DIRTY DAN"s MATH TRICKS: DIVISION-BY-TEN PREAMBLE: I've been working on my own driver routines for the LCD Display on the Butterfly (same as STK502) that are super small and fast. When it came time to write a little routine that displays a number in decimal, rather than Hex, I ran into my old nemesis DIVISION! To be more precises, Division-by-Ten. Since we humans are so fond of Division-by-Ten, I thought it would make an excellent starting point for our examination of my Dirty Math Tricks. For this routine, all we want is to take a single byte value and display it on the LCD Display in Decimal Number Form. So we need to take a number with a maximum value of \$FF and break it into three: 2 / 5 / 5. The way most of us humans would do this is to first divide by it by 100 and then by 10 and then keep the remainder; Code: 255 / 100 = 2     (remainder of 55) 55  / 10  = 5 remainder = 5 Oh no, that means we need to have two division routines! One for division by 100 and another by ten. Mybe we should just use a general division routine and change the divisor from 100 to 10 rather than have two routines. So I go and check the Atmel Appnote 200 looking for a general 8x8 unsigned division routine and here is what I found: Code: ;*************************************************************************** ;* "div8u" - 8/8 Bit Unsigned Division ;* ;* This subroutine divides the two register variables "dd8u" (dividend) and ;* "dv8u" (divisor). The result is placed in "dres8u" and the remainder in ;* "drem8u". ;*  ;* Number of words   :14 ;* Number of cycles   :97 ;* Low registers used   :1 (drem8u) ;* High registers used  :3 (dres8u/dd8u,dv8u,dcnt8u) ;*************************************************************************** .def   drem8u   =r15      ;remainder .def   dres8u   =r16      ;result .def   dd8u   =r16      ;dividend .def   dv8u   =r17      ;divisor .def   dcnt8u   =r18      ;loop counter div8u:   sub   drem8u,drem8u   ;clear remainder and carry    ldi   dcnt8u,9   ;init loop counter d8u_1:   rol   dd8u      ;shift left dividend    dec   dcnt8u      ;decrement counter    brne   d8u_2      ;if done    ret         ;    return d8u_2:   rol   drem8u      ;shift dividend into remainder    sub   drem8u,dv8u   ;remainder = remainder - divisor    brcc   d8u_3      ;if result negative    add   drem8u,dv8u   ;    restore remainder    clc         ;    clear carry to be shifted into result    rjmp   d8u_1      ;else d8u_3:   sec         ;    set carry to be shifted into result    rjmp   d8u_1 Only 14 program steps is not bad. Almost 100 clock cycles per call and we need to call it twice fort a single byte conversion. That's about 200 clock cycles each. I'm sure we can do better than that... Last edited by RetroDan on Apr 13, 2006 - 09:13 AM; edited 3 times in total

 RetroDan
 Posted: Mar 31, 2006 - 06:24 AM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am
 TITLE: DIRTY DAN"s MATH TRICKS: DIVISION-BY-TEN (Continued) Previously we divided our 1 byte number by 100 then by 10 and when it comes time to expand our routine to handle 16 bit numbers we'll need to divide by 10,000, 1,000, 100 and finally by 10. At approx. 100 cycles per call that's 400 clock cycles per number. Maybe we should look at the problem again from the "other side" the right-hand-side instead of left. Code: 255 / 10 = 25   Remainder = 5 25  / 10 = 2    Remainder = 5 2   / 10 = 0    Remainder = 2 Well we've managed to extract the 255 again, but now we need Three calls to the division routine and for a 16 byte number we'll need Five. The only thing we have going for us using this method is that if we can find a "special" division routine for Division-by-Ten that is smaller and/or faster than the Atmel routine, we might have something we can work with... Last edited by RetroDan on Mar 31, 2006 - 06:50 AM; edited 2 times in total

 RetroDan
 Posted: Mar 31, 2006 - 06:46 AM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am
 TITLE: DIRTY DAN"s MATH TRICKS: DIVISION-BY-TEN (Continued) I remember someone posting a file of Division-by-a-Constant Routines, so why don't we see if we can get luck there. Here is what I found, exactly what we're looking for, a specialized routine for Division-by-Ten: Code: ;*************************************************************************** ;* Function "Div16_10" ;* Divides an unsigned 16 bit word (XH:XL) by 10 ;* Returns quotient in YH:YL and remainder in XL ;* ;* Author: Andreas Lenze (andreas.lenze@t-online.de) ;* Equations by D: W. Jones: ;* ;*   Reciprocal mul w. extra precision: ;*   unsigned int A; ;*   unsigned int scaled_reciprocal = 0xCCCD; ;*   unsigned int Q; /* the quotient */ ;* ;*   Q = ((A * 0xCCCD) >> 19) ;*    ;* Uses: high regs: 7 (r17, r18, r19, X, Y) ;*    low regs:  3 (r0, r1, r2) ;* ;*    words:     36 (w. push/pop = 8 words) ;*    cycles:    46 (w. push/pop = 16 cycles) ;* ;* Note: Hardware multiplier required ("mul" instruction) ;* ;*************************************************************************** Div16_10:    push   r2    push   r19    push   r18    push   r17    ldi   YH,0xCC      ; scaled reciprocal for /10    ldi   YL,0xCD    ; Q = A * 0xCCCD    ; (r19:r18:r17[:rXX] = XH:XL * YH:YL)         clr     r2         mul     XH, YH      ; ah * bh         movw    r19:r18, r1:r0         mul     XL, YL      ; al * bl    mov   r17,r1      ; r0 to [rXX] is superfluous         mul     XH, YL      ; ah * bl         add     r17, r0         adc     r18, r1         adc     r19, r2         mul     YH, XL      ; bh * al         add     r17, r0         adc     r18, r1         adc     r19, r2    ; Q = Q >> 16: use r19:r18 as word    ; Q = Q >> 3    lsr   r19      ; do the last 3 shifts    ror   r18    lsr   r19    ror   r18    lsr   r19    ror   r18    ; r19:r18 now "Q" (= result >> 19)    ; R = A - 10*Q;    ldi   r17,10      ; multiply r19:r18 by 10    mul     r18, r17   ; al * bl    sub   XL,r0    clr   XH    movw   YL,r18      ; make copy of Q    ; XL holds "R"    ; YH:YL holds "Q"    pop   r17    pop   r18    pop   r19    pop   r2    ret Wow, I looks huge! Let's see 46 program steps, that's a little on the heavy side, and 10 registers used ouch!. However, it is a 16 bit Divide-by-Ten Routine and it excecutes in about 1/2 the time that the Atmel 8 bit routine did, so we're moving in the right direction. Two or Three calls to this would be 100 to 150 cycles versus the 200 to 300 using the Atmel Routine. If we can't come up with something on our own, perhaps we could strip this down to an 8 bit routine, or perhaps just bite the bullet because sooner-or-later we're gonna' need to convert 16 bit values anyway... Last edited by RetroDan on Apr 13, 2006 - 09:18 AM; edited 2 times in total

 RetroDan
 Posted: Mar 31, 2006 - 07:04 AM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am
 TITLE: DIRTY DAN"s MATH TRICKS: DIVISION-BY-TEN (Continued) A while back, while literally sitting-on-the-throne, I happened to realized that 255 is evenly divisable by 5 and that it is only one away from 256 which is a Power-of-Two. After a while, it becomes a habit to always be on-the-lookout for Powers-of-Two. Either that or I'm slowly drifting into the world of padded-rooms and nice doctors in white smocks. Of all the things I've lost, I miss my mind the most. To divide by Ten we might be able to first Divide-by-Five then Right-Shift to Divide-by-Two and the result would be a Division-by-Ten. I stored that fact away and wondered if the 255-256 link could be put to use sometime. Pehaps today is a good day to drag it out of the bottom drawer and dust it off... Let's see: 255 divided by 5 is 51 and 255 is awlful close to 256 and furthermore division by 256 is super-super easy and requires ZERO program steps. The "Dirty Trick" to that is to simply ignore the least significant Byte and we've essentially Divided our number by 256. Since 255 is almost equal to 256, then 255/256 is almost equal to 1 Since multiplication by one does not change a value Then 1/5 times 255/256 shoud be pretty close to 1/5 Since division by 256 can be accomplished with the Dirty Math Trick of ignoring the lowest byte, we'll leave that to last and our equation becomes: 1/5 time 255 and later we'll truncate the lower byte. Code: 1/5 x 255 = 255 / 5 = 51 So mathematically speaking if we multiply our number by 51 and later ignore the lower byte, we should get a pretty good approximation of Division-by-Five. That's where our first routine will start...

 RetroDan
 Posted: Mar 31, 2006 - 07:59 AM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am
 TITLE: DIRTY DAN"s MATH TRICKS: DIVISION-BY-TEN (Continued) Let's start by writing a routine that muliplies by 51 and we'll look at the high byte and see if our mathematical assumptions were correct. Here is the code: Code: DIV5: LDI B,51       MUL A,B  ;ANSWER LEFT IN R1       RET Great it's only 3 lines long!!! Let's test it's accuracy. We're interested in a single byte so values can go from zero to 255 so lets check a few numbers in that range and see how it performs: Code:  10 =>  1 (off by -1)  50 =>  9 (off by -1) 100 => 19 (off by -1) 150 => 29 (off by -1) 200 => 39 (off by -1) 250 => 49 (off by -1) Hmm, we're alway off by minus one, so if we add one, we should be right on the money! So the new Division-by-Five Routine becomes: Code: DIV5: LDI B,51       MUL A,B       INC R1 ; ANSWER IN R1       RET So we're half-way-home, all we need to do now is divide our answer by two and we should have a "Quick and Dirty" Divide-by-Ten Routine...

 RetroDan
 Posted: Mar 31, 2006 - 08:00 AM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am
 TITLE: DIRTY DAN"s MATH TRICKS: DIVISION-BY-TEN (Continued) There's two ways we can use our Division-by-Five Routine for a Division-by-Ten Routine. We can modify the DIV5 routine. Earlier I stated that a "Quick and Dirty" way to Divide-by-Two is to just Shift your bits to the right. This is how we modify our DIV5 Routine to get a DIV10: Code: DIV10: LDI B,51        MUL A,B        INC R1   ;R1=A/5        LSR A    ;R1=(A/5)/2 = A/10        RET The other way we could have accomplished the same thing would be to have the DIV10 routine call the DIV5 routine and that way we "kill-two-birds-with-one-stone" because we get two usable routines from it: Code: DIV10: RCALL DIV5  ;R1=A / 5        LSR R1      ;R1=(A/5) / 2 = A/10        RET DIV5:  LDI B,51        MUL A,B        INC R1      ;ANSWER IN R1        RET Just to be on-the-safe-side let's double check the accuracy of these routines: Code:  10 =>  1 Correct!  50 =>  5 Correct! 100 => 10 Correct! 150 => 15 Correct! 200 => 20 Correct! 250 => 25 Correct! So we did it. We have our first "Dirty Math" Routine that is 100% accurate over the range we need. Stay tuned however, there's still more to this story... Last edited by RetroDan on Mar 31, 2006 - 08:26 AM; edited 2 times in total

 RetroDan
 Posted: Mar 31, 2006 - 08:18 AM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am
 TITLE: DIRTY DAN"s MATH TRICKS: DIVISION-BY-TEN (Continued) PRE-SUMMARY: Not bad for a days work eh? We took a rather large routine that we needed and reduced it to a "Quick and Dirty" little routine using a few "Dirty Math Tricks". Let's compare and see how well we did. First let's compare to the Atmel 8x8 Unsigned Division Routine: Code:                  DIV10a   DIV10b   ATMEL REGISTERS:         2        2        4  DIFFERENCE:      -2       -2        -   PERCENT:       -50%     -50%       - PROGRAM WORDS:    5         7       15(incl RET)  DIFFERENCE:    -10        -8        -   PERCENT:      -66%      -53%       - SPEED/CYCLES:     5         8       97  DIFFERENCE:    -92       -89   PERCENT:      -95%      -92% So we've used 1/2 the register space, about 60% less program space and we're running at about 20 times the speed that the Atmel Routine does. Let compare to the Lenze/Jones Routine: Code:                  DIV10a   DIV10b   LENZE REGISTERS:         2        2       10  DIFFERENCE:      -8       -8        -   PERCENT:       -80%     -80%       - PROGRAM WORDS:    5         7       36  DIFFERENCE:    -31        -29       -   PERCENT:      -86%       -80%      - SPEED/CYCLES:     5         8       46  DIFFERENCE:    -41       -38        -   PERCENT:      -89%      -83%       - Against the Lenze/Jones routine we're using 80% less Register Space, about 85% less Program Space and w'e're running at between 6 to 10 times the speed. Many might think that this is a great accomplishment and that I should stop at this point. Especially those that probably would have just used the larger routines and not had a second thought about it. Well, obviously some of you don't know me very well. I think I can pull a few more "Dirty Tricks" from out of my hat and turn this Dirty Littlle Routine into something totally scandalous. I'm not happy until the routine is so small that it pulls numbers from thin air, like magick... Last edited by RetroDan on Apr 13, 2006 - 09:24 AM; edited 1 time in total

 RetroDan
 Posted: Mar 31, 2006 - 09:07 AM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am
 TITLE: DIRTY DAN"s MATH TRICKS: DIVISION-BY-TEN (Continued) Let's re-examine what it is we're doing mathematically with our Dirty Little DIV10 Routine: We're multiplying by 51, adding one, then dividing by two, and then truncating to get a final division of 256: Code: A / 5 ~= [ [ ( A x 51 ) + 1 ] / 2 ] / 256 Let's ignore that final division by 256 because it's accomplished with ZERO program steps by ignoring the lower byte and there's no way we can improve on that unless there are opcodes that can move us backwards in time. So we want to look closely at the right side of this equation: Code: 256 x [A / 5] ~= ( 51A + 1) / 2 256 x [A / 5] ~= 51A/2 + 1/2 Well 1/2 doesn't really mean much to a microprocessor, you either have a bit that is on or its off. We'll ignore it for now and if need be, we'll increment something in the end like we did with our DIV5 routine. So now we have: Code: 256 x [A / 5] ~= 51A/2 256 x [A / 5] ~= 25.5 x A 256 x [A / 5] ~= 25A Since we dropped a 1/2 earlier and now we're stuck with another 0.5 perhaps we should round the 25.5 upto 26 rather than truncate it down to 25. Code: 256 x [A / 5] ~= 25.5 x A 256 x [A / 5] ~= 26A So to divide by 10 all we need to do is multiply by 26 and ignore lower byte: Code: MUL10: LDI B,26        MUL A,B   ;ANSWER IN R1 It sounds incredable, but let's check the results and see what our error ratio is: Code:  10 =>  1  Correct!  50 =>  5  Correct! 100 => 10  Correct! 150 => 15  Correct! 200 => 20  Correct! 250 => 25  Correct! So it's 100% accurate over the range we need and it's only two program lines. We started with rather large routines requiring 50 to 100 clock cycles per call and have something that runs in just three. Can we reduce it even more? Not really, not until Atmel comes out with a MULI command that allows us to multiply a register with an immedaite value rather than loading it into another register first. Maybe next year... It would probably look like this: Code: DIV10: MULI A,26  ;ANSWER IN R1 Last edited by RetroDan on Mar 31, 2006 - 10:14 AM; edited 1 time in total

 RetroDan
 Posted: Mar 31, 2006 - 10:08 AM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am
 TITLE: DIRTY DAN"s MATH TRICKS: DIVISION-BY-TEN (Conclusion) IN CONCLUSION: By systematically applying a few Dirty Tricks and following our instincts combined with a little high-school math, we took some rather large routines and reduced them down into just a couple program steps. Something that many would say is impossible., but never say never my friend. If I can do it, so can you! REQUEST FOR FEED BACK: I hope you found this posting entertaining as well as educational. If you would like to see more like it, please be sure to let the moderators know. Thank you for your time and consideration. Last edited by RetroDan on Apr 13, 2006 - 09:27 AM; edited 1 time in total

 alenze
 Posted: Mar 31, 2006 - 03:31 PM
 Joined: Mar 28, 2001 Posts: 416 Location: Munich, Germany
 Code: MUL10: LDI B,26        MUL A,B   ;ANSWER IN R1 Quote: So it's 100% accurate over the range we need and it's only two program lines. This is not true. For a 'div by 10' this is an approximation which will not yield correct results for all integers between 1 and 255 decimal (try for example 169, 159 ... for 'A' to see it fail). The 'scaled reciprocal' for a correct 'division by 10' of an 8 bit integer would have to be at least 9 bits to reach the required precision. Please check out the mathematical basics for this technique at http://www.cs.uiowa.edu/~jones/bcd/divide.html The claims in your comparison Code:                  DIV10a   DIV10b   LENZE REGISTERS:         2        2       10  DIFFERENCE:      -8       -8        -   PERCENT:       -80%     -80%       - PROGRAM WORDS:    5         7       36  DIFFERENCE:    -31        -29       -   PERCENT:      -86%       -80%      - SPEED/CYCLES:     5         8       46  DIFFERENCE:    -41       -38        -   PERCENT:      -89%      -83%       - are also incorrect: the '46 cycles' for 'Lenze' is including register saves (push/pop), the naked division routine w/o reg saves and getting the remainder is: Code:    ldi   YH,0xCC      ; scaled reciprocal for /10    ldi   YL,0xCD    ; Q = A * 0xCCCD    ; (r19:r18:r17[:rXX] = XH:XL * YH:YL)         clr     r2         mul     XH, YH      ; ah * bh         movw    r19:r18, r1:r0         mul     XL, YL      ; al * bl    mov   r17,r1      ; r0 to [rXX] is superfluous         mul     XH, YL      ; ah * bl         add     r17, r0         adc     r18, r1         adc     r19, r2         mul     YH, XL      ; bh * al         add     r17, r0         adc     r18, r1         adc     r19, r2    ; Q = Q >> 16: use r19:r18 as word    ; Q = Q >> 3    lsr   r19      ; do the last 3 shifts    ror   r18    lsr   r19    ror   r18    lsr   r19    ror   r18    ; r19:r18 now "Q" (= result >> 19) That's 25 cycles for a 'divide 16 bit number by 10' (result BTW is always correct over the full range 1 - 65535). _________________Andreas

 lfmorrison
 Posted: Mar 31, 2006 - 04:07 PM
 Joined: Dec 08, 2004 Posts: 4719 Location: Nova Scotia, Canada
 Well now, Andreas... The argument could be made that RetroDan's solution is *more correct* than the traditional integer approach. 169/10 = 16 if we follow traditional definitions. Of course, the "true" result is 16.9. But we cannot express 16.9 in an integer. So we round. The traditional approach for computer integer arithmetic is to always round down. Mathemeticians would tell us that we should round anything from 0.0 up to 0.49999999... as 0, and anything from 0.500000000...01 up to 0.9999999999999999... as 1. There is a gray area for 0.5 exactly. I was taught a convention that in that case, you round to the even-numbered integer. From that prespective, RetroDan's solution comes much closer to the "mathematically" correct solution. 169/10 = 17. As for the amount of pushing and popping required... well, that's relative to the register usage conventions at hand. In the worst case (with all destoryed registers saved and restored), RetroDan's solution is still smaller and faster than the best-case (with no destroyed registers saved and restored) "Lenze" solution. Last edited by lfmorrison on Mar 31, 2006 - 04:18 PM; edited 1 time in total

 clawson
 Posted: Mar 31, 2006 - 04:17 PM
 Joined: Jul 18, 2005 Posts: 62954 Location: (using avr-gcc in) Finchingfield, Essex, England
 Yup, actually I thought the *26 then ditch 8 bit thing was really clever. Obviously it works because 26/256 is 0.1015625 which for small numbers is OK as a way of multiplying by roughly 0.1. If you tried a larger value like 800 then I guess it'd be 800 * 26 = 20,800 and divided by 256 this is then 81 - so the extra 0.0015625 has come into play to create the error. But on small numbers it is a clever technique for an "integer divide" In fact just checked and it's good for all values up to 630 in fact, the first one where it goes wrong is 631 Cliff

 alenze
 Posted: Mar 31, 2006 - 06:23 PM
 Joined: Mar 28, 2001 Posts: 416 Location: Munich, Germany
 Quote: From that prespective, RetroDan's solution comes much closer to the "mathematically" correct solution. 169/10 = 17. Ah well, Luke - for '169', yes. But even if we assume an 'implicit rounding' scheme, this is not handled correctly, for ex. in the range 0-9 we get '0' as result up to dividend 9, ditto e.g. in the 50-59 range ('5' up to 59). Mathematically correct - as to the mathematicans - would be the 'increment result' from '5' or '55' upwards. OP claims '100% correct' for the algorithm which it isn't. It would be correct if it either were returning the remainder together with the result or providing the mathematically correct rounding you described. Quote: In the worst case (with all destroyed registers saved and restored), RetroDan's solution is still smaller and faster than the best-case (with no destroyed registers saved and restored) "Lenze" solution. True, except for the fact that the OP is manipulatively comparing a 'sort of 8 bit integers division' (nice as long as you don't need correct results) with a 16 bit integers division (add another 6 cycles to get the remainder - blows it up to 31 ... geeeze ...) which, thanks to Mr Jones, does return correct results. BTW: As this is the 'tutorial' forum (contents aimed at beginners) I honestly think that it would be more helpful if an approximation technique were explecitely labeled as such instead of trying to sell it off as real binary mathematics. _________________Andreas

 RetroDan
 Posted: Mar 31, 2006 - 06:35 PM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am
 alenze wrote: BTW: As this is the 'tutorial' forum (contents aimed at beginners) I honestly think that it would be more helpful if an approximation technique were explecitely labeled as such instead of trying to sell it off as real binary mathematics. I apologoize Alenze, as the comparison was not meant to in any way diminsh the excellent routines you have in your Div_xx Package. It is full of "tried and true" routine, and are backed by the work of Jones. If anyone is looking for "correct" routine I recommend them highly However, I don't think you read the CAVEAT LECTOR warning in my first post. These postings are to discuss "Dirty Math Tricks" and hopefully have some fun at the same time. If anyone is uneasy about what I'm doing, I suggest they stick to the proven routines like the ones you and Atmel provide and forget my threads even exist.

 lfmorrison
 Posted: Mar 31, 2006 - 07:16 PM
 Joined: Dec 08, 2004 Posts: 4719 Location: Nova Scotia, Canada
 "Closer to mathematically correct". I wasn't claiming that it was a miracle cure-all solution. In fact, it conveniently works out that way by pure coincidence of the range of numbers involved.. Anyway... The "traditional" approach to integer division (ignoring remainders for the moment), and Retro's "radical approach" to integer division are both wrong when compared to result using real numbers. When compared to the "real number" result, the traditional results will always be less than or equal to the true result, with a maximum absolute error error of -1 LSB, and this maximum error is repeated consistently with a periodicity of 10 for all possible inputs. RetroDan's results may differ from the "real number" result as well. The result may be greater than or less than the "real number" result. But it, too, will have a maximum absolute error of +/-1 LSB for the range of inputs we're interested in. Note that the original intent of this thread was only to do 8-bit division by 10. And we actually only approach an absolute error of -1 LSB at the lower extreme with inputs of around 0, and +1 LSB at the upper extreme with inputs of around 630. In the middle of the range, the aboslute error will be limited to a band of +/- 0.5 LSB In light of these results, I don't think it's fair to label his as "more wrong" than the traditional results. Of course, if you want to have access to the remainders so that your application can fine-tune its interpretation of the result, then that's a strong argument in favour of some other approach.

 lfmorrison
 Posted: Mar 31, 2006 - 07:38 PM
 Joined: Dec 08, 2004 Posts: 4719 Location: Nova Scotia, Canada
 Finding remainders: - Multiply the result by 10. Label as 'x'. - Subtract 'x' from the original input. - This result is equal the the "Remainder". Using RetroDan's code, the remainder can either "positive" or "negative", but it will always have a magnitude of 10 or less. If negative, then feel free to subtract 1 from the quotient and add 10 to the remainder. And presto! You have a result that is totally identical to the "Lenze" result, for all possible 8-bit inputs. Code: ; RetroDan's original code DIV10: LDI B,51        MUL A,B        INC R1   ;R1=A/5        LSR R1   ;R1=(A/5)/2 = A/10 ; Luke's insertion        MOV C, R1        LDI B, 10        MUL C, B        SUB A, R0        ; 'A' contains the (signed) remainder.        ; sign of the remainder indicates whether it is an:        ; - under-estimate (positive remainder), or        ; - over-esitmate (negative remainder)        ; 'C' contains the quotient        ; Now a 100% _mathematically_ correct result for all 8-bit inputs.        ; (Destroys the original input operand 'A'.        ;  one could return the remainder in register        ;  'B' at the cost of one extra MOV instruction.)        RET

 Koshchi
 Posted: Mar 31, 2006 - 09:47 PM
 Joined: Nov 17, 2004 Posts: 13961 Location: Vancouver, BC
 Regarding the division by 5 routine you said: Quote: Hmm, we're alway off by minus one, so if we add one, we should be right on the money! But the only numbers that you checked were ones that were actually divisible by five. In fact, these are the only numbers where this routine returns the correct answer. If someone wants to use this, then some correction would definitely be necessary. What I would suggest is to not add the one, then find the remainder as lfmorrison points out. In this case the remainder will always be in the range of [0, 5]. If the remainder is 5, add one to the quotient and make the remainder 0. Otherwise, the quotient and the remainder are correct. Applying this to the divide by 10 (the '* 51' version, not the '* 26' version), again if you drop the 'add one', the remainder is always in the range of [0, 10]. So again the only adjustment is for a remainder of 10. Keep in mind that these remainder ranges are only valid if the original number is 8 bit. I quickly goes awry for larger numbers. Even 261 produces an out of range remainder. You could subtract 10 (or 5 for the div by 5) from the remainder instead of just making it 0 to extend the range. To be more specific, it will be correct as long as the answer fit in 8 bits. _________________Regards, Steve A. The Board helps those that help themselves.

 alenze
 Posted: Mar 31, 2006 - 11:52 PM
 Joined: Mar 28, 2001 Posts: 416 Location: Munich, Germany
 One way to do this (sorry, lfmorrison's signed remainder isn't always mathematically correct - "we should round anything from 0.0 up to 0.49999999... as 0, and anything from 0.500000000...01 up to 0.9999999999999999... as 1": Code: ; Division by 10, useing reciprocal multiplication ; ; Call with: ;   8 bit dividend in r16 ; ; Returns: ;   Result in r16 ;   Remainder in r18    mov   r18, r16   ; save original dividend    ldi   r17,26      ; reciprocal, scaled *256, off a bit    mul   r16,r17 ; result of multiplication now in r1:r0 ; use only r1, thereby effectively divide by 256    dec   r1      ; if imprecise scaling value influences result,             ; result will be '+1'. Decrement to avoid neg.             ; value in later subtraction    ldi   r17,10      ; re-use r17 for divisor    mov   r16,r1      ; save result/256    mul   r16,r17      ; find remainder by multiplication of result by 10d ; result again in r1:r0 ; get remainder and correct result of 'by 10'    sub   r18,r0    cpi   r18,10         brlo   done    subi   r18,10    inc   r16 done: ; result in r16, remainder in r18 ;optional rounding:    cpi   r18,5    brlo   done_r    inc   r16 done_r: (Quick hack, up to two or three cycles could be shaved off) _________________Andreas

 Display posts from previous:  All Posts1 Day7 Days2 Weeks1 Month3 Months6 Months1 Year Oldest FirstNewest First
 Jump to: Select a forum Forum index|--[AVR (8-bit) Technical Forums]|   |-- AVR forum|   |-- XMEGA forum|   |-- AVR Wireless forum|   |-- AVR GCC forum|   |-- AVR Studio 5 and Atmel Studio 6 forum|   |-- AVR studio 4 forum|   |-- AVRfreaks Academy forum|   |-- AVR Tutorials|--[AVR Software Framework]|   |-- AVR Software Framework|--[AVR32 (32-bit) Technical Forums]|   |-- AVR32 Linux Forum|   |-- AVR32 General (standalone)|   |-- AVR32 Software Tools|   |-- AVR32 Hardware|--[General Electronics Technical Forums]|   |-- General Electronics|   |-- Atmel Security Products|--[Non-technical forums]|   |-- AVRfreaks.net Housekeeping|--[Non-topical forums]|   |-- Off-topic forum|   |-- AVRfreaks Trading Post
All times are GMT + 1 Hour