## Looking for efficient (ie v.small) algorithm for LOG2(n)

11 posts / 0 new
Author
Message

Hi Folks,

I suspect I know the answer but if anybody can chip in I'd be grateful.

I need to manipulate some binary numbers in an FPGA co-pro design I'm doing (div, square root, and possibly other roots and powers) and I got to thinking about converting the numbers to their log equivalent (base 2) doing the business and then convert back.

After spending all morning researching this at Queens university librray, and on line, I have come up with zillions of algorithms which use floating point, or require multiply or division or both, and therefore are useless to me.

I need the integer and fractional part of the result, not just integer.

If any of you have ideas about such algorithms, or where I can look for ideas (maybe obscure old textbooks etc), then let me know.

Many thanks,
Murdo.

PS: Application is for a MIMO RF decoder where I am comparing QR and Cholesky decompositions.

There are already a million monkeys in front of a million keyboards, and the internet is nothing like Shakespeare!

Analog Devices has a fixed-point solution for their 16-bit DSPs that you could use. It's in Chapter 4 of Vol. 1 of DIGITAL SIGNAL PROCESSING APPLICATIONS USING THE ADSP-2100 FAMILY. You can download it from the ADI web site. It only has logs to base 10 and e, but you should be able to work out how to do logs to base 2. It uses multiplication, I don't think you will be able to avoid it whatever you do.

Leon Heller G1HSM

Its just the bit number of the highest order bit thats set isnt it? log2 of 0x40 is 6 isnt it?

Imagecraft compiler user

bobgardner wrote:
Its just the bit number of the highest order bit thats set isnt it? log2 of 0x40 is 6 isnt it?
assuming positive numbers.

Suppose you have a 32 bit value A. We want log2(A)/log2(2^31). (If we really want log2(A), then we just need to add 31 afterwards.)

The result starts at 0.
While bit 31 of A is not set, left shift A and subtract 256 from the result.
Grab bit 30..23 of A. (On AVR+GCC the fastest way is to left shift one, and then right shift 24.) Add this to the result.
The result is now a 5.8 fixed point number.

This approximation is usually good enough for most readouts, and can be compensated to be *good* by correcting the error using a polynomium.

Here is an implementation for 10 bit use:

```int log2ff(unsigned w)
{   //Log2_fixedpoint_10bit by Kasper Pedersen 2011
int k;
k=0;
while (!(w&1024)) {
k-=1024;
w<<=1;
}
w&=1023;
k+=w;
w>>=1;
if (w) {
w-=256;
w=65025-w*w;
w>>=8;
w*=90;
w>>=8;
k+=w;
}
return k;
}```

KKP wrote:
(On AVR+GCC the fastest way is to left shift one, and then right shift 24.)

Surely with GCC, the fastest way is to find a library that already has the function? If you can't find one, just use the regular log or ln function with the relationship:

`log2(X) = log(X)/log(2) = ln(X)/ln(2)`

C means never having to understand what you're doing.

bobgardner wrote:
Its just the bit number of the highest order bit thats set isnt it? log2 of 0x40 is 6 isnt it?

Kinda, for simple applications, but not really useable for DSP type number crunching.

I need accuracy here (not just the magnitude) for the manipulations (sqr roots etc) so I need the fractional part eg Lb(0x41) is 0x6.05B9E... (6.0223... in decimal) - 0x06 isn't good enough.

Leon, Thank you - I'll have a look tomorrow when I get back to the uni and see if I think the ADI algorithms can be converted to VHDL easily. We've a very powerful Xilinx Virtex chip design running an array of vector processors so however I do these operations doesn't have to be single cycle but needs to be small, deterministic and low latency.

KKP, Thanks for that - I'm looking though your alorithm right now.

It's probably a silly (and very arrogant) idea of mine, but I cant help think the algorithms that are presented everywhere I look, as the "de facto way to do xyz..." all look horribly complex (for things I've been doing since primary school).

Murdo.

There are already a million monkeys in front of a million keyboards, and the internet is nothing like Shakespeare!

I've got the book on my HD, and have uploaded the relevant chapter here:

http://www.leonheller.com/Docs/C...

Leon Heller G1HSM

Last Edited: Wed. Jun 1, 2011 - 06:22 PM

peret wrote:
KKP wrote:
(On AVR+GCC the fastest way is to left shift one, and then right shift 24.)

Surely with GCC, the fastest way is to find a library that already has the function? If you can't find one, just use the regular log or ln function with the relationship:

`log2(X) = log(X)/log(2) = ln(X)/ln(2)`

C means never having to understand what you're doing.

Alas, VHDL is not quite so simple: the trade-offs between latency and no of LUTs can make or break this whole design. I'm now starting to run out of DSP blocks as well which doesn't help when I'm running 108 processing stages in parallel...

It's projects like this that make me want to go back to the good old Z80 of my youth...

There are already a million monkeys in front of a million keyboards, and the internet is nothing like Shakespeare!

leon_heller wrote:
You could implement a small, fast processor in the FPGA.

If I put any more processors in here it's going to evolve sentience and take over the world - 108 vector processors is enough; hopefully some simple blocks to act as co-pros will reduce the pipeline stalling and allow me to actually shrink the total number of processors.

It's mind bogling when you read the reports from the design software - 100s of thousands of this logic block and tens of thousands of that, plus millions of bits of memory storage plus yada yada yada. I'm a 2 transistor kinda guy - these numbers scare me.

There are already a million monkeys in front of a million keyboards, and the internet is nothing like Shakespeare!