Integer n-dimensional vector length without overflow

Go To Last Post
13 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I'm writing routine that depends on calculating a (potentially large) vector length where the components are stored in an array of 32-bit signed integer format.

 

I'm essentially trying to calculate

sqrt(x*x + y*y + z*z + ...)

 

for any vector component that can fit in 32 bits. The problem is that the intermediate value can quickly grow beyond even a 64-bit int.

Is there even a way to mathematically "break up" the calculation into an iterative (and possibly slower) process, while compounding some smaller terms like sqrt(x*x) + sqrt(y*y)? This obviously produces a different result, but the storage required never grows beyond 64 bits for 32 bit operands, and I'd like to transform the above expression into something similar while still producing the correct result.

 

My goal is robustness without resorting to floats.

Last Edited: Tue. Dec 12, 2017 - 05:08 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

<> doesn't like me at the moment.

xlo= x & 0xFFFF;

xhi= (x-xlo)/0x10000;

// x**2 = xlo**2 + 0x20000*xlo*xhi + 0x10000000*xhi**2

The rest is left as an exercise for the reader.
Edit:

uint32_t xu=labs(x);

uint16_t xlo=xu & 0x7F; // 11 bits

uint16_t xmd=(xu>>11) & 0x7F; // 11 more bits

uint16_t xhi=xu>>22;

// xu**2 = xlo**2 + (2**12)*xlo*xmd + (2**23)*xlo*xhi + (2**22)*xmd**2 + (2**34)*xmd*xhi + (2**44)*xhi**2

This exercise should lead to six sums that each fit into 32 bits.

International Theophysical Year seems to have been forgotten..
Anyone remember the song Jukebox Band?

Last Edited: Wed. Dec 13, 2017 - 07:42 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

a simple way would be this:

sqrt(x*x/n + y*y/n + z*z/n + ...)*sqrt(n)

 

Add

And if n=256 it become very simple (move a byte right), perhaps round up if low byte >127 

Last Edited: Tue. Dec 12, 2017 - 08:36 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

he problem is that the intermediate value can quickly grow beyond even a 64-bit int.

Surely not, if you expect the result to fit in 32bits?

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

sparrow2 wrote:

sqrt(x*x/n + y*y/n + z*z/n + ...)*sqrt(n)

 

Isn't there a trade-off with that approach in that any rounding/truncation errors will happen on every addend, and also on the final sqrt(), rather than just once on the final sqrt()?

"This forum helps those that help themselves."

"How have you proved that your chip is running at xxMHz?" - Me

"If you think you need floating point to solve the problem then you don't understand the problem. If you really do need floating point then you have a problem you do not understand." - Heater's ex-boss

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

if you don't want to do it correct there will be a tradeoff :)
even #2 will have a problem (with the last sqrt).

 

And OP should look into FP

When the AVR get's into 64 bit FP would be faster (but again there is a tradeoff FP is only 24bit).

 

What is best depends a lot of what OP really need (what is max x, how many numbers etc). 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Are your 32 bit numbers using the full 32 bit range?  For example, perhaps they only use up 24 bits, then squaring would be 48bits & a 64 bit accumulator would be able to sum up about 65000 sums before any chance of overflow.

How many are you summing50? 500? 5000?

If the  values ARE using the full 32 bit range, then after squaring you could probably keep only the 24 MSB (do you really need a 64 bit final result?)

 

Also, kill the signs right away (convert neg to pos) , they go away due to squaring---you just gained a bit

 

here are some tips/tricks that might be worth looking at:

https://webcache.googleusercontent.com/search?q=cache:VqJ0QPKBJo8J:https://www.embedded.com/design/configurable-systems/4006520/Improve-your-root-mean-calculations+&cd=3&hl=en&ct=clnk&gl=us

 

When in the dark remember-the future looks brighter than ever.

Last Edited: Tue. Dec 12, 2017 - 09:35 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

cteffects wrote:

sqrt(x*x + y*y + z*z + ...)

 

for any vector component that can fit in 32 bits. The problem is that the intermediate value can quickly grow beyond even a 64-bit int.

 

Sometimes this function is called hypot, and you want to compute hypot(x, y, z, ...).

 

One way for computation is to find a maximal argument x (by absolute value) and then write

 

hypot(x, y, z, ...) = |x|·hypot (1, y/x, z/x, ...)

 

All the new arguments then satisfy |·| <= 1 and hypot can be approximated by a power series expansion.

 

Moreover, hypot satisfies some funtional equations.  The most obvious is that arguments commute (in theory.  In a concrete implementation this may be quite different and can be exploited to improve conditioning).  Apart from this, we have:

 

hypot(x, y, z, ...) = hypot(hypot(x,y), z...)

 

Maybe this can be expoited, too, for example to trade simplicity against conditioning.  And the functional equation we exploited above reads

 

|a|·hypot(x, y, z, ...) = hypot(a·x, a·y, a·z, ...)

 

avrfreaks does not support Opera. Profile inactive.

Last Edited: Sat. Dec 23, 2017 - 02:03 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Sometimes this function is called hypot,

hypot(x, y, z, ...) = |x|·hypot (1, y/x, z/x, ...)

You equation appears wrong...should be  (1, y*y/(x^2), z*z/(x^2)), if you want to factor the X^2 and bring it out front through the square root. ...or  (x, y*y/x, z*Z/x) and bring out a square root x.

  

 

     double hypot  (double x     , double y);
      float hypotf (float x      , float y);
long double hypotl (long double x, long double y);
     double hypot (double x     , double y);
      float hypot (float x      , float y);
long double hypot (long double x, long double y);
     double hypot (Type1 x      , Type2 y);       // additional overloads

Compute hypotenuse

Returns the hypotenuse of a right-angled triangle whose legs are x and y.

The function returns what would be the square root of the sum of the squares of x and y (as per the Pythagorean theorem), but without incurring in undue overflow or underflow of intermediate values.

 

When in the dark remember-the future looks brighter than ever.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

here is some help, try starting with 64 or  96 (12 byte) squared integers...these should be easy & fast to create if your chip has the mul function (tiny's do not, so extra steps, & slower)

 

here are a bunch of math routines you can build off of to get up to 128 bits

https://sites.google.com/site/avrasmintro/home/2b-basic-math

 

 

Digit-by-digit algorithm[edit]

The traditional pen-and-paper algorithm for computing the square root n {\displaystyle {\sqrt {n}}} {\sqrt {n}} is based on working from higher digit places to lower, and as each new digit pick the largest that will still yield a square ≤

n {\displaystyle \leq n} \leq n. If stopping after the one's place, the result computed will be the integer square root.

Using bitwise operations[edit]

If working in base 2, the choice of digit is simplified to that between 0 (the "small candidate") and 1 (the "large candidate"), and digit manipulations can be expressed in terms of binary shift operations. With * being multiplication, << being left shift, and >> being logical right shift, a recursive algorithm to find the integer square root of any natural number is:

function integerSqrt(n):
    if n < 0:
        error "integerSqrt works for only nonnegative inputs"
    else if n < 2:
        return n
    else:
        # Recursive call:
        smallCandidate = integerSqrt(n >> 2) << 1
        largeCandidate = smallCandidate + 1
        if largeCandidate*largeCandidate > n:
            return smallCandidate
        else:
            return largeCandidate

Or, iteratively instead of recursively:

function integerSqrt(n):
    if n < 0:
        error "integerSqrt works for only nonnegative inputs"

    # Find greatest shift.
    shift = 2
    nShifted = n >> shift
    # We check for nShifted being n, since some implementations of logical right shifting shift modulo the word size.
    while nShifted ≠ 0 and nShifted ≠ n:
        shift = shift + 2
        nShifted = n >> shift
    shift = shift - 2

    # Find digits of result.
    result = 0
    while shift ≥ 0:
        result = result << 1
        candidateResult = result + 1
        if candidateResult*candidateResult ≤ n >> shift:
            result = candidateResult
        shift = shift - 2

    return result

Traditional pen-and-paper presentations of the digit-by-digit algorithm include various optimisations not present in the code above, in particular the trick of presubtracting the square of the previous digits which makes a general multiplication step unnecessary.

When in the dark remember-the future looks brighter than ever.

Last Edited: Sat. Dec 23, 2017 - 04:16 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

avrcandies wrote:

 

Sometimes this function is called hypot,

hypot(x, y, z, ...) = |x|·hypot (1, y/x, z/x, ...)

 

You equation appears wrong...

The 2ary hypot is:

 

hypot(x,y) = sqrt(x²+y²) = |x|·sqrt(1+(x/y)²) = |x|·hypot(1,y/x)

 

avrfreaks does not support Opera. Profile inactive.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

your middle equation appears incorrect, let's take it step by step:

hypot(x,y) = sqrt(x²+y²)

OK, divide by y^2  

=sqrt(Y^2(1+X^2/Y^2))

bring the Y^2 outside the square root

hypot(x,y)=Ysqrt(1+X^2/Y^2)     ....taking sqrt root of "1" & "x/y"  squared 

looking at the first line, this appears to be also recursive 

hypot(x,y)=Yhypot(1,X/Y)

likewise

hypot(x,y)=Xhypot(1,Y/X)  ...since neither X nor Y have any preference.

 

 

 

When in the dark remember-the future looks brighter than ever.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Yes that's a typo.  That typo is not in the original where you say that's wrong. And you need the absolute value of the quanity before hypot resp. sqrt or assume the value is positive.

 

..whatever, the idea should be pretty much clear:

 

Power series expansion of hypot(1+x) ≈ 1 + x²/2 + ... i.e. expand sqrt around 1 and with |x| ≤ 1.

 

avrfreaks does not support Opera. Profile inactive.

Last Edited: Sat. Dec 23, 2017 - 08:54 PM