Author Message
 Broxbourne
 Posted: Apr 01, 2006 - 03:34 AM
 Joined: Nov 16, 2005 Posts: 341 Location: Calgary, Alberta, Canada
 I'm going to start with an apology here, because I might be thinking differently, and appear to be stiffling a very simulating and valuable discussion. However as a tutorial, I feel this thread is misplaced. It should be in the main forum. My thoughts of a tutorial are as an introduction, a quickstart guide, a pointer to get going and excited. It is a place for Newbies to look and gain confidence, and make things. Get involved and learn. Good examples to simply use (and with luck considered later), just standard routines. If my manager came up and said, "Eric, we stock a 4k device, the next part up is \$5 and that will take us under", then I would be fantastically interested in this thread. But If I had just found this site, looking for information, I might be scared away by the academic discourse of the moment. I'm looking for gentle introductions to do this, do that. If the device cannot divide by 10, and I cannot comprehend the details that experienced people are debating, then I'm off. Another Newbie scared away. Engineering is about compromises."Gerald!, if you can do it in a day just get on with it.". Versus, "A new part in our inventory, can't you make it smaller". Such pragmatic stuff is key to success, but we have to be careful about introducing it here (apart from sowing thoughts). I hope my opinion is understood in a positive sense. It is only one view among so many good people. Cheers All.

 lfmorrison
 Posted: Apr 01, 2006 - 04:15 AM
 Joined: Dec 08, 2004 Posts: 4719 Location: Nova Scotia, Canada
 Frankly, I don't buy that "negative remainder isn't correct" argument. It's a non-standard representation. So are improper fractions. But just like improper fractions, the result is numerically dead-on. AFAIC, 25 divided by 4 can be perfectly legally expressed as 6 remainder 1. And that is exactly identical to 5 remainder 5, or 7 remainder -3. The only possible result from RetroDan's expression would still be guaranteed to be within 1 LSB of the true result. And the "traditional approach" cannot offer any better than getting within 1 LSB either. But if it really bothers you, then you could always test the sign bit of the remainder... if it is set, then subtract 1 from the quotient and add 10 to the remainder. Code: DIV10: LDI B,51    MUL A,B    INC R1   ;R1=A/5    LSR R1   ;R1=(A/5)/2 = A/10 ; Luke's insertion    MOV C, R1    LDI B, 10    MUL C, B    SUB A, R0    ; 'A' contains the (signed) remainder.    ; sign of the remainder indicates whether it is an:    ; - under-estimate (positive remainder), or    ; - over-esitmate (negative remainder)    ; 'C' contains the quotient    ; We can return here with a totally valid and numerically correct result.    ; But for the sticklers among us, let's convert it    ; to a "cannonical form".    SBRS A, 7 ; test the sign of result.    RJMP PositiveRemainder ; We have a negative remainder; subtract 1 from ; the quotient and add 10 to the remainder.    DEC C    ADD A, B PositiveRemainder ; The result is now in perfect "cannonical" form.    RET And given the limited range of values we're concerned with here, that result (after a single check of the sign on the remainder, and possibly a single subtraction of 1 / addition of 10) will be in the "cannonical" form. (Ie. a guaranteed underesitmate in the result, and a remainder that is guaranteed to be positive, and bound between 0 and 9.)

 RickB
 Posted: Apr 01, 2006 - 04:49 AM
 Joined: Jan 30, 2005 Posts: 857 Location: Junction City, OR USA
 Broxbourne: I agree 100%. The give and take on this subject should be hashed out in another forum. Then and ONLY then should it become a tutorial. After all, what is a beginner to think when seeing the differing opinions. Most beginners are in no position to judge who is right or wrong on the details. I would like to see it here once that is done. Rick

 RetroDan
 Posted: Apr 01, 2006 - 05:05 AM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am
 Broxbourne wrote: I'm going to start with an apology here, because I might be thinking differently, and appear to be stiffling a very simulating and valuable discussion. However as a tutorial, I feel this thread is misplaced. It should be in the main forum. I hope my opinion is understood in a positive sense. It is only one view among so many good people. Cheers All. Thanks for the feed-back. You may not know this, but actual tutorials are labeled [TUT] by moderator to distiguish them from other threads. The moderator has labeled this thread [DIS] for discussion. Another label is [CODE] for code segments that are not tutorials. I think this information can be found in the "Sticky-the-Bear" threads at the top of the list.

 Broxbourne
 Posted: Apr 01, 2006 - 05:40 AM
 Joined: Nov 16, 2005 Posts: 341 Location: Calgary, Alberta, Canada
 Dan, Thank you for the information. I did not realize the moderator was adjusting the titles of the thread according to the intended audience. I just thought this was the place to find tutorials and good examples of code and AVR applications. Thanks for clarifying that. I'll make sure I only read [TUT] postings from now on to avoid getting confused by the more heady academic stuff, although I am still surprised to see such specialised discussion in a beginner forum. Practically Yours.

 alenze
 Posted: Apr 01, 2006 - 12:47 PM
 Joined: Mar 28, 2001 Posts: 416 Location: Munich, Germany
 Hi Luke, Quote: sorry, lfmorrison's signed remainder isn't always mathematically correct - "we should round anything from 0.0 up to 0.49999999... as 0, and anything from 0.500000000...01 up to 0.9999999999999999... as 1" definitely didn't come through quite as intended (sorry about that): Quote: Frankly, I don't buy that "negative remainder isn't correct" argument You are right and I phrased it badly: of course the signed remainder is correctly calculated (cleaner than in my example, too - 3 or four cycles saved). It's the 'inbuild rounding' whose threshold is off by one, mathematically speaking ( ): The quotient 'n.5' (6.5 for example if dviding 65/10) should already be rounded up to n+1 ('7'), here it is still left at '6'. 66/10 (6.6) is correctly rounded to '7' as 64/10 (6.4) becomes a rounded '6'. It's just the '.5' rounding which doesn't quite work (aren't those mathematicians a fascinating breed?). Sorry again - apologies to Broxbourne, too, guess we got carried away a little here ... _________________Andreas

 RetroDan
 Posted: Apr 06, 2006 - 05:07 AM
 Joined: Feb 24, 2006 Posts: 794 Location: http://avr.x.am
 Thank you all for the wonderful feed-back. The information was invaluable and probably saved me a lot of head-scratching as you'll soon find out. I was pleased-as-punch that my "quirky" little division by 10 routine worked. I was even more delighted to find out that it did a pretty good job of rounding to boot. However, when I got around to using it for my original purpose, I realized that rounding was the last thing I wanted. The original use for the routine was to break-down an unsigned single byte for display on the Butterfly LCD. Well I was getting results like: 74, 75, 76, 77, 80, 80, 80, 81, 82, 83. Obviously the little DIV10 routine was working over-time and rounding, so a few fine adjustments were needed to "UN-ROUND" the results. I know that you guys have already worked out your own versions, here is mine: Code: ;-----------------------------------; ; RETRO (SYNTHETIC) DIVISION BY 10  ; ; ANSWER IN R1, R0=REM, A:PRESERVED ; ;-----------------------------------; DIV10:  PUSH B         LDI  B,26   ;MUL BY 26         MUL  A,B    ;R1=A/10         PUSH R1     ;BRUTE-FORCE CALC OF REMAINDER              LDI  B,10   ;CALC REM         MUL  R1,B   ;R0=10xR1(QUOT)         POP  R1     ;RESTORE QUOT         SUB  R0,A   ;SUBTRACT REMx10 NODJST: NEG  R0     ;MAKE POSITIVE          BRPL NONEG ;STILL NEG?         ADD  R0,B   ;OOPS MAKE         DEC  R1     ;ADJUSTMENTS NONEG:   RET If you're wondering why I calculate the remainder "backwards" then negate it, it's to preserve the value of A. If there's no need to preserve A then the remainder could be calculated the other-way-around and eliminate the need for the NEG R0 line. However, you should be forewarned that "letting go" a NEG R0 without very good cause could lead to a charge of racism! Other than removing that one statement I don't know if it can be optimized any further, but it sure would be nice.

 brum103
 Posted: Oct 08, 2006 - 12:58 AM
 Joined: Nov 15, 2004 Posts: 11 Location: Netherlands

 JohnD
 Posted: Oct 21, 2006 - 04:26 PM
 Joined: Oct 12, 2006 Posts: 12
 I tried running your 8 bit division code: [quote];*************************************************************************** ;* "div8u" - 8/8 Bit Unsigned Division ;* ;* This subroutine divides the two register variables "dd8u" (dividend) and ;* "dv8u" (divisor). The result is placed in "dres8u" and the remainder in ;* "drem8u". ;* ;* Number of words :14 ;* Number of cycles :97 ;* Low registers used :1 (drem8u) ;* High registers used :3 (dres8u/dd8u,dv8u,dcnt8u) ;*************************************************************************** .def drem8u =r15 ;remainder .def dres8u =r16 ;result .def dd8u =r16 ;dividend .def dv8u =r17 ;divisor .def dcnt8u =r18 ;loop counter div8u: sub drem8u,drem8u ;clear remainder and carry ldi dcnt8u,9 ;init loop counter d8u_1: rol dd8u ;shift left dividend dec dcnt8u ;decrement counter brne d8u_2 ;if done ret ; return d8u_2: rol drem8u ;shift dividend into remainder sub drem8u,dv8u ;remainder = remainder - divisor brcc d8u_3 ;if result negative add drem8u,dv8u ; restore remainder clc ; clear carry to be shifted into result rjmp d8u_1 ;else d8u_3: sec ; set carry to be shifted into result rjmp d8u_1 When I build this code, it gives me an error: "Register already defined by the .def directive". The above code refers to R16 with two different labels which should be okay. Is there a reason why it gives me this error? Thanks, John

 Koshchi
 Posted: Oct 21, 2006 - 05:13 PM
 Joined: Nov 17, 2004 Posts: 13956 Location: Vancouver, BC
 I've gotten this error recently as well. I think that it used to work, but it has changed in recent versions of AVR Studio. Not sure if this is a bug or a deliberate change. _________________Regards, Steve A. The Board helps those that help themselves.

 froody
 Posted: Oct 21, 2006 - 11:07 PM
 Joined: Oct 20, 2006 Posts: 4
 First, let me say that I have no clue about AVR coding. An Arduino board is in the mail, but I haven't gotten it yet. I love digital math, though. So here's a little different take on the OP's trick. Dividing is the same thing as multiplying by the inverse, so dividing by 10 is the same as multiplying by 0.1. The problem is that there isn't a neat binary representation of 0.1. But it turns out that you can use a good enough 16-bit representation of 0.1 to get accurate results (meaning the same number as division by 10, rounded down as expected). With some simple fixed-point math you can then do your multiply, discard the mantissa (the bit after the . (or , for Europeans)) and there you go. What fixed-point format to use? It'll be obvious in a minute. Let's just assume 8 bits of mantissa for now and start finding the binary representation of 0.1d (decimal). Really what you're looking for is which powers of -2 you need to add up to get as close to 0.1 as possible. If you do that, you find: 1/10 =~ 0/2 + 0/4 + 0/8 + 1/16 + 1/32 + 0/64 + 0/128 + 1/256 the binary representation for this is: 00000000.00011001 So that's the 16-bit number you should multiply the original number by to divide by 10. But you'll notice we can use extra mantissa bits, as long as the top byte is 0. (If the top byte isn't 0 we might get overflow when we multiply.) So let's use 11 mantissa bits: 1/10 =~ 0/2 + 0/4 + 0/8 + 1/16 + 1/32 + 0/64 + 0/128 + 1/256 + 1/512 + 0/1024 + 1/2048 or 00000.00011001101 Here it is all nicely coded up in C. I've tested it for 0-255 and it gives the same result as C's /10 operation for every one. gcc did some awful things to the assembly, but I'll leave that part up to you guys because you're much better at it than I am. Code: uint8_t div10_fp(uint8_t number) {     // 1/10th with 11 mantissa bits     // 1/10th = 0/2 + 0/4 + 0/8 + 1/16 + 1/32 + 0/64 + 0/128 + 1/256 + 1/512 +     // 0/1024 + 1/2048     // 0.00011001101     uint16_t tenth = 0xcd;     // result with 11 mantissa bits     uint16_t result = number * tenth;   9a:   99 27           eor     r25, r25   9c:   2d ec           ldi     r18, 0xCD       ; 205   9e:   30 e0           ldi     r19, 0x00       ; 0   a0:   82 9f           mul     r24, r18   a2:   a0 01           movw    r20, r0   a4:   83 9f           mul     r24, r19   a6:   50 0d           add     r21, r0   a8:   92 9f           mul     r25, r18   aa:   50 0d           add     r21, r0   ac:   11 24           eor     r1, r1   ae:   ca 01           movw    r24, r20     // Drop mantissa to round down.     return result >> 11;   b0:   89 2f           mov     r24, r25   b2:   99 27           eor     r25, r25   b4:   86 95           lsr     r24   b6:   86 95           lsr     r24   b8:   86 95           lsr     r24 }   ba:   99 27           eor     r25, r25   bc:   08 95           ret You can do the same thing for dividing by any constant, and I'm actually kind of surprised gcc doesn't just do this for me. Tim

 brum103
 Posted: Oct 22, 2006 - 03:35 PM
 Joined: Nov 15, 2004 Posts: 11 Location: Netherlands
 That is a great idea froody! I had a look at your code, and optimized it a bit. Code: ; the input number must be in register "input" ; value/10 is in register "result" ; register temp1 and R1:R0 are clobbered div10:    ldi temp1, 205    mul temp1, input    lsr R1    lsr R1    lsr R1    mov result, R1    ret edit... i have written a little test program in ti basic. Code: :ClrHome :0->G :0->F :For(N,0,1200) :If (iPart(N/10)=iPart((N*205)/2048)) :Then :G+1->G :Else :F+1->F :If F<5 :Output(2+F,1,N :End :Output(1,1,"G=" :Output(1,3,G :Output(2,1,"F=" :Output(2,3,F :End It shows that it is correct up to 1028 and it fails at 1029, more than enough for 8 bit

 froody
 Posted: Oct 22, 2006 - 11:13 PM
 Joined: Oct 20, 2006 Posts: 4
 brum, wow. gcc really sucks, huh? Nice job on optimizing the code. I'm going to install gcc 4.1 soon. Hopefully it won't suck as much. Meanwhile, I clarified my original post a bit, and wrote some more, and now it's turned into a little article: How to divide an unsigned 8-bit integer by a constant. (c) 2006 by Tim Newsome Thanks to RetroDan and several other on avrfreaks.net for the inspiration. Dividing is slow, so let's avoid that. Instead, multiply by the inverse. You can write the inverse as a fixed-point binary number. Just like each digit in a normal binary number represents a power of 2, each digit after the decimal point in a binary number represents a power of -2. So: 10.11 = 1 * 2^1 + 0 * 2^0 + 1 * 2^-1 + 1 * 2^-2 = 2 + 1/2 + 1/4 = 2.75 Example: 0.5 = 1/2 = 0.1 (binary, 1-bit mantissa) Example: 0.25 = 0/2 + 1/4 = 0.01 (binary, 2-bit mantissa) Unfortunately there isn't a precise binary representation of most fractions, so you have to use an approximation. The same problem happens in decimal numbers, just not as often. For instance, 1/3 can be approximated as 0.3, which has an error of 0.3-1/3=-1/30. A better approximation is 0.33, which has an error of -1/300. The more digits you add, the smaller the error becomes. The same thing is true in binary: 1/3 =~ 1/2 = 0.1, error = 0.167 1/3 =~ 0/2 + 1/4 = 0.01, error = -0.0833 1/3 =~ 0/2 + 1/4 + 1/8 = 0.011, error = 0.0417 1/3 =~ 0/2 + 1/4 + 0/8 + 1/16 = 0.0101, error = -0.0208 Because we can't always find the precise inverse, we need to find one that's close enough, and gives us the right answer by rounding down always. (Rounding down just means ignoring the mantissa bits, which is easy to do.) That means we want to find an approximation that is at least as big as the real inverse. In other words, the error (computed as approximation minus real value) has to be positive. Furthermore, the error multiplied by 255 (our largest input value) must be smaller (less negative) than 1/divisor. For dividing by 3, the error must lie between 0 and 1/(3*255). Let's rework 1/3 keeping the error positive, and seeing how many bits we need before the error is small enough that we can use the approximation: 1/3 =~ 1/2 = 0.1, error = .167 1/3 =~ 1/2 + 0/4 = 0.10, error = .167 1/3 =~ 0/2 + 1/4 + 1/8 = 0.011, error = .0417 1/3 =~ 0/2 + 1/4 + 1/8 + 0/16 = 0.0110, error = .0417 1/3 =~ 0/2 + 1/4 + 0/8 + 1/16 + 1/32 = 0.01011, error = .0104 1/3 =~ 0/2 + 1/4 + 0/8 + 1/16 + 1/32 + 0/64 = 0.010110, error = .0104 1/3 =~ 0/2 + 1/4 + 0/8 + 1/16 + 0/32 + 1/64 + 1/128 = 0.0101011, error = .00260 1/3 =~ 0/2 + 1/4 + 0/8 + 1/16 + 0/32 + 1/64 + 1/128 + 0/256 = 0.01010110, error = .00260 1/3 =~ 0/2 + 1/4 + 0/8 + 1/16 + 0/32 + 1/64 + 0/128 + 1/256 + 1/512 = 0.010101011, error = .000651 So 9 bits of mantissa is enough, and our inverse is 0.010101011. Here is some C code that takes advantage of this: Code: uint8_t div3(uint8_t number) {     // Compute the intermediate value, with 9 mantissa bits.     uint16_t intermediate = (uint16_t) number * 0xab;     // Return the result, chopping off the mantissa bits.     return intermediate >> 9; } A good compiler should compile that down to a single multiply instruction, a shift, and maybe a few extra instructions. This is drastically faster than using the generic divide code. Let's do one more example, for division by 10. First figure out the inverse using the fewest number of mantissa bits: 1/10 =~ 1/2, error = 0.400 1/10 =~ 0/2 + 1/4, error = 0.150 1/10 =~ 0/2 + 0/4 + 1/8, error = 0.025 1/10 =~ 0/2 + 0/4 + 1/8 + 0/16, error = 0.025 1/10 =~ 0/2 + 0/4 + 1/8 + 0/16 + 0/32, error = 0.025 1/10 =~ 0/2 + 0/4 + 0/8 + 1/16 + 1/32 + 1/64, error = .00938 1/10 =~ 0/2 + 0/4 + 0/8 + 1/16 + 1/32 + 0/64 + 1/128, error = .00156 1/10 =~ 0/2 + 0/4 + 0/8 + 1/16 + 1/32 + 0/64 + 1/128 + 0/256, error = .00156 1/10 =~ 0/2 + 0/4 + 0/8 + 1/16 + 1/32 + 0/64 + 1/128 + 0/256 + 0/512, error = .00156 1/10 =~ 0/2 + 0/4 + 0/8 + 1/16 + 1/32 + 0/64 + 0/128 + 1/256 + 1/512 + 1/1024, error = .000586 1/10 =~ 0/2 + 0/4 + 0/8 + 1/16 + 1/32 + 0/64 + 0/128 + 1/256 + 1/512 + 0/1024 + 1/2048, error = .0000977 Finally we've reached the point where 255 * error is less than 1/10. So we need 11 mantissa bits, and our inverse is 0.00011001101. Here's the C code: Code: uint8_t div10(uint8_t number) {     // Compute the intermediate value, with 11 mantissa bits.     uint16_t intermediate = (uint16_t) number * 0xcd;     // Return the result, chopping off the mantissa bits.     return intermediate >> 11; } So there you have it. Fast division by a constant of an unsigned 8-bit integer. If you followed what I've done here, you should be able to extend this to 16-bit or any other sized number. Last edited by froody on Oct 22, 2006 - 11:50 PM; edited 1 time in total

 abcminiuser
 Posted: Oct 22, 2006 - 11:29 PM
 Joined: Jan 23, 2004 Posts: 9878 Location: Trondheim, Norway
 Froody, Good article. However, in your code examples don't you need an explicit cast to the "number" parameter to force GCC to use 16-bit operations? I would expect GCC to do the multiplication as 8-bit first, then implicit cast to a 16-bit value when storing. - Dean _________________Atmel Studio 6.1 is now released, grab it here. Report AS6/ASF bugs here.

 froody
 Posted: Oct 22, 2006 - 11:49 PM
 Joined: Oct 20, 2006 Posts: 4
 abcminiuser wrote: Good article. However, in your code examples don't you need an explicit cast to the "number" parameter to force GCC to use 16-bit operations? I would expect GCC to do the multiplication as 8-bit first, then implicit cast to a 16-bit value when storing. Thanks. Good point about the typecast. I think it's OK, because the hex constant is implicitly of type int, which is 16 bits, on AVR (right?). So it all works out. But it certainly is safer to add the typecast, so I'll edit my post and stick it in there. Tim

 Koshchi
 Posted: Oct 23, 2006 - 01:54 AM
 Joined: Nov 17, 2004 Posts: 13956 Location: Vancouver, BC
 Quote: You can do the same thing for dividing by any constant, and I'm actually kind of surprised gcc doesn't just do this for me. ... gcc really sucks, huh? I don't know how you can come to this conclusion from this thread. Up until your post this entire thread has been about assembly language and therefore has absolutely nothing to do with gcc. Also, the C language has absolutely no knowledge about fixed point numbers, so how could you possibly expect it to optimize your code in this manner? _________________Regards, Steve A. The Board helps those that help themselves.

 froody
 Posted: Oct 23, 2006 - 03:51 AM
 Joined: Oct 20, 2006 Posts: 4
 Koshchi wrote: Quote: gcc really sucks, huh? I don't know how you can come to this conclusion from this thread. Up until your post this entire thread has been about assembly language and therefore has absolutely nothing to do with gcc. Also, the C language has absolutely no knowledge about fixed point numbers, so how could you possibly expect it to optimize your code in this manner? In the first post you quoted, I showed some C code among the assembly code. You can see that gcc's generated code is really inefficient. For instance it does a mul by a register known to be 0. (Note that I've since installed gcc 4.1.1 and it does a lot better, although there are still some obvious optimizations possible.) Look at this C code: Code: uint8_t function(void) {     volatile uint8_t val = 123;     return val / 10; } When I compile it, I get a call to a general divide routine. But the compiler knows I'm dividing by a constant, so it could do the exact optimization I outlined. Effectively it would rewrite the code to look like: Code: uint8_t function(void) {     volatile uint8_t val = 123;     return ((uint16_t) val * 0xcd) >> 11; } That is actually slightly larger than the original code, but a lot faster. If you really care about code size you could just use the div10() function. Then function() would be 1 instruction shorter (because you don't have to pass 10 to the divide function) but you do take the size penalty of div10() (10 instructions in my gcc 4.1.1 build). Tim

 daqq
 Posted: Jun 14, 2007 - 09:44 AM
 Joined: Dec 15, 2003 Posts: 4412 Location: Slovakia, Bratislava
 Try removing the "voltatile" keyword. You'll get a far better output. _________________There are pointy haired bald people. Time flies when you have a bad prescaler selected.

 safari
 Posted: Jun 14, 2007 - 11:27 AM
 Joined: Mar 10, 2006 Posts: 31 Location: Denmark
 daqq it's an old thread, nearly 3/4 of an year old /Safari

 futrtrubl
 Posted: Jun 14, 2007 - 11:41 AM
 Joined: Aug 29, 2006 Posts: 229 Location: Kingston, Jamaica
 Still, I'm glad he brought it back since this will help me. Edward

 Display posts from previous:  All Posts1 Day7 Days2 Weeks1 Month3 Months6 Months1 Year Oldest FirstNewest First
 Jump to: Select a forum Forum index|--[AVR (8-bit) Technical Forums]|   |-- AVR forum|   |-- XMEGA forum|   |-- AVR Wireless forum|   |-- AVR GCC forum|   |-- AVR Studio 5 and Atmel Studio 6 forum|   |-- AVR studio 4 forum|   |-- AVRfreaks Academy forum|   |-- AVR Tutorials|--[AVR Software Framework]|   |-- AVR Software Framework|--[AVR32 (32-bit) Technical Forums]|   |-- AVR32 Linux Forum|   |-- AVR32 General (standalone)|   |-- AVR32 Software Tools|   |-- AVR32 Hardware|--[General Electronics Technical Forums]|   |-- General Electronics|   |-- Atmel Security Products|--[Non-technical forums]|   |-- AVRfreaks.net Housekeeping|--[Non-topical forums]|   |-- Off-topic forum|   |-- AVRfreaks Trading Post
All times are GMT + 1 Hour