Fast "is-nth-bit-set-in-long" code

3 posts / 0 new
Author
Message

I thought I'd share this since it seemed like quite a novel optimization.

Let's say you have a uint32_t bitfield and need to determine whether an arbitrary bit (*not* a constant) in that field is set. The naive way of doing this is just

static uint32_t bitfield = 0xFEDCBA90;

uint8_t nthbitset = (uint32_t)(bitfield32 & (1UL << n)) != 0UL;

However, since 8-bit AVRs can only shift one byte by one place at a time, this is quite inefficient. GCC compiles this to a loop of right shifts. Even with -Os, this will take over 200 cycles with n = 31!

So I came up with a solution that operates only on 8-bit integers and requires no loops:

static uint8_t bitfield = { 0x90, 0xBA, 0xDC, 0xFE };
static uint8_t bitmasks = { 1, 2, 4, 8, 16, 32, 64, 128 };

uint8_t nthbitset = bitfield[n >> 3] & bitmasks[n & 0x7];

The bitfield is stored explicitly as 4 bytes. The upper 2 bits of n indicate which byte to test. The "AND with shifted 1" is thus performed on one byte, instead of 4, and the value of (1 << n) is stored in a precomputed table a mere 8 bytes in size.

This runs in constant time and takes only 24 cycles!

Hello,
great you got it. That's the way doing it since ... I don't really know because it's to long ago.

Here are some macros that will allow using the original 32 bit definition of the bitfield variable.

typedef struct
{
uint8_t byte;
} u32bfield_t;

#define u32bfieldbyte(value, bit) \
(*((u32bfield_t*) (&value))).byte[bit >> 3]

#define u32bfieldbitval(bit) \

#define u32bfieldtest(value, bit) \
((u32bfieldbyte(value, bit) & u32bfieldbitval(bit)) ? 1: 0)

static uint32_t bitfield;
static uint8_t bitmasks[] = { 1, 2, 4, 8, 16, 32, 64, 128};

To use the macro with the 32 bit variable "bitfield":

uint8_t nthbitset = u32bfieldtest(bitfield, n);

If you want to eliminate the bitmasks[] storage, you can do that by redefining the u32bfieldbitval() macro to:

#define u32bfieldbitval(bit) \
( 1 << (bit & 7))

It will increase the code as a loop of up to 7 iterations will be inserted to shift bitfield down to match the bit postion of the bit value. I was surprised to see this as I was expecting to see it shift a 1 to the proper position instead, but it is fewer cycles by shifting the bitfield to the right rather than the 1 to the left.

--- bill