implementing bit arrays


Hello, using GCC on an xmega, I need to have an array of about 240 bit variables (to indicate on/off states)...while I certainly can just set up a normal array, it will quickly & wastefully gobble up all sorts of RAM space. I've seen several discussions on setting up 8 completely unrelated bit variables to squeeze down into a single byte, but here I'm looking to do something like:

MyBitArray[233]=1;
MyBitArray[sensornum]=0;

   if (MyBitArray[abcxyz]==1)  motorfwd=1;

this need not be super speedy; I'm looking for a clean and simple method of setting this up, with whatever GCC offers as a reasonable implementation. Any thoughts on what has worked?

When in the dark remember-the future looks brighter than ever.

Last Edited: Thu. Jun 9, 2016 - 04:52 AM

Did you search on this subject?

It has been discussed multiple times before.

 

You can make a structure of bits to pack a byte full of bits.

Then you can define bit masks and compare against them.

 

It has been a while since I last did some coding, but it looks something like

 

/* struct definition */

struct bits {

    uint8_t bit1:1;

    uint8_t bit2:1;

};

/* variable definition */

struct bits var_struct;

/* masks */

#define maskbit1 0x01

#define maskbit2 0x02

#define maskbit3 0x04

if (var_struct.bit1) /* then bit1 is set, else bit1 is cleared */

 

Do a search on "bit arrays" and you should find a number of examples that look more or less like the above.

 

 


Yes I did a search, and didn't find anything relating to GCC bit ARRAYS.

 

for example:

  MyArray[251]=1;

  MyArray[165]=0;  where these are single bit values, stored in a compact manner & accessed moderately quickly.  Additionally, all defined without using 200+ separate bitmask defines.

 

 

When in the dark remember-the future looks brighter than ever.

Last Edited: Thu. Jun 9, 2016 - 06:29 AM

avrcandies wrote:

Yes I did a search, and didn't find anything relating to GCC bit ARRAYS.

 

for example:

  MyArray[251]=1;

  MyArray[165]=0;  where these are single bit values, stored in a compact manner & accessed moderately quickly.  Additionally, all defined without using 200+ separate bitmask defines.

 

You need to think about what the compiler actually has to do, if you want to emulate BIT arrays on a device with no BIT pointers.

 

You could create some core  functions, for example that allow

 

  SetBit(BitAddress);

  ClrBit(BitAddress);

  RdBit(BitAddress,RdBitV);

  WrBit(BitAddress,WrBitV);

  if (GetBit(BitAddress)) ...
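A minimal C sketch of such helpers, assuming the 240 flags live in a 30-byte packed backing array (all names here are illustrative, not from the post):

```c
#include <stdint.h>

static uint8_t bitstore[30];   /* 240 bits of packed backing storage */

static inline void SetBit(uint16_t n)    { bitstore[n >> 3] |=  (uint8_t)(1 << (n & 7)); }
static inline void ClrBit(uint16_t n)    { bitstore[n >> 3] &= (uint8_t)~(1 << (n & 7)); }
static inline uint8_t GetBit(uint16_t n) { return (bitstore[n >> 3] >> (n & 7)) & 1; }

static inline void WrBit(uint16_t n, uint8_t v) {
    if (v) SetBit(n); else ClrBit(n);   /* write a 0/1 value to bit n */
}
```

With these, `MyBitArray[233] = 1;` becomes `SetBit(233);` and `if (MyBitArray[abcxyz] == 1)` becomes `if (GetBit(abcxyz))`.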


Think about it. A KS0108 or other monochrome display often has a mirror in AVR SRAM. You set or clear a pixel.
It is using the SRAM bytes as a bit array. This is handled by a macro or inline function.

OK, it is not using C array syntax. But it achieves the same thing.

The compiler will generate efficient code.


I'm looking for a clean and simple method of setting this up, with whatever GCC offers as a reasonable implementation.

Is there one method already documented?  I need to immediately put this to use, rather than experiment with how to create them (and mostly ensure they will work without strange effects).

How do you efficiently get from "BitAddress"  to working a specific bit in one of many different bytes?

 

 

I suppose I might be able to create a function to figure out which "real array" byte the desired bit is located in, along with hacking out/manipulating the desired specific bit of that byte.  Just looking for any elegant solutions, that are less of a function approach & more of a compiler approach (if any exist).

When in the dark remember-the future looks brighter than ever.

Last Edited: Thu. Jun 9, 2016 - 07:06 AM

avrcandies wrote:

How do you efficiently get from "BitAddress"  to working a specific bit in one of many different bytes?

 

The byte is BitAddress >> 3, and to extract/insert the bit there are some choices...

You can shift and count using the 3 LSBs, which is compact but varies in speed, or you can use an 8-way table and index that using the 3 LSBs.

The table is slightly faster, but uses more resources.
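In C the two approaches look something like this (a sketch; the names are mine, not from the thread):

```c
#include <stdint.h>

/* byte index is the bit address divided by 8 */
static inline uint8_t byte_of(uint16_t bitaddr) { return (uint8_t)(bitaddr >> 3); }

/* variant 1: shift by the 3 LSBs - compact, but speed varies with the bit number */
static inline uint8_t mask_shift(uint16_t bitaddr) { return (uint8_t)(1 << (bitaddr & 7)); }

/* variant 2: 8-way table indexed by the 3 LSBs - slightly faster, costs 8 bytes */
static const uint8_t mask_tab[8] = {0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80};
static inline uint8_t mask_lut(uint16_t bitaddr) { return mask_tab[bitaddr & 7]; }
```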


One main question:

Is the value always a constant? (or do you have MyBitArray[233]=MyBitArray[2] ).

 

I guess that the C++ people need to show a smart way :)

 

add:

In ASM there are compact and fast ways to do this, but none of them can easily be transformed to C.

Last Edited: Thu. Jun 9, 2016 - 07:37 AM

A bit variable is not standard C.

 

However Codevision supports this extension:

#include <stdio.h>
extern void initstdio(void);

#define SZ 240
char bitarray[SZ];

void main(void)
{
    int i;
    initstdio();
    printf("Hello avrcandles\r\n");
    for (i = 0; i < SZ; i += 3) bitarray[i] = 1;
    for (i = 0; i < SZ; i++)
        printf("bitarray[%d] = %d\r\n", i, bitarray[i]);
    while (1);
}

I have run this on a Uno.   Works fine.

 

If you are anally retentive,  this is what the compiler produces:

// 0000 000C     for (i = 0; i < SZ; i += 3) bitarray[i] = 1;
                +
000096 e000     +LDI R16 , LOW ( 0 )
000097 e010     +LDI R17 , HIGH ( 0 )
                 	__GETWRN 16,17,0
                 _0x4:
                +
000098 3104     +CPI R16 , LOW ( 20 )
000099 e0e0     +LDI R30 , HIGH ( 20 )
00009a 071e     +CPC R17 , R30
                 	__CPWRN 16,17,20
00009b f44c      	BRGE _0x5
00009c e0a0      	LDI  R26,LOW(_bitarray)
00009d e0b5      	LDI  R27,HIGH(_bitarray)
00009e 0fa0      	ADD  R26,R16
00009f 1fb1      	ADC  R27,R17
0000a0 e0e1      	LDI  R30,LOW(1)
0000a1 93ec      	ST   X,R30
                +
0000a2 5f0d     +SUBI R16 , LOW ( - 3 )
0000a3 4f1f     +SBCI R17 , HIGH ( - 3 )
                 	__ADDWRN 16,17,3
0000a4 cff3      	RJMP _0x4
                 _0x5:

Quite honestly,   I would stick with standard C macros/functions as suggested by Who-me.   Then you are not tied to a specific Compiler.  e.g.

    for (i = 0; i < SZ; i += 3) SetBit(i);

It actually looks simpler than the array syntax.

 

David.


sparrow2 wrote:
Is the value always a constant? (or do you have MyBitArray[233]=MyBitArray[2] ).

You would just say:

if (GetBit(2)) SetBit(223);
else ClrBit(223);

Yes,  it looks untidy.   But you could write a suitable macro AssignBit(x, value)
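A sketch of what such a macro could look like, assuming SetBit()/ClrBit()-style helpers exist (minimal illustrative versions are included here to make it self-contained):

```c
#include <stdint.h>

static uint8_t bits[30];   /* 240 bits of backing storage */

static inline void SetBit(uint16_t n)    { bits[n >> 3] |=  (uint8_t)(1 << (n & 7)); }
static inline void ClrBit(uint16_t n)    { bits[n >> 3] &= (uint8_t)~(1 << (n & 7)); }
static inline uint8_t GetBit(uint16_t n) { return (bits[n >> 3] >> (n & 7)) & 1; }

/* the tidier one-liner: AssignBit(223, GetBit(2)); */
#define AssignBit(n, v)  ((v) ? SetBit(n) : ClrBit(n))
```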

 

David.


avrcandies wrote:

Yes I did a search, and didn't find anything relating to GCC bit ARRAYS.

 

 

interesting.

first search result:

http://www.avrfreaks.net/forum/h...

 

Scroll down to the post made by 'clawson'; he has given some example code on how to do it in GCC.

 


Have you looked at sbit.h from Peter Dannegger? The core of it is:

struct bits {
  uint8_t b0:1, b1:1, b2:1, b3:1, b4:1, b5:1, b6:1, b7:1;
} __attribute__((__packed__));

It then applies this to storage locations to make the bits individually addressable. However one could just as easily (in any version of C, not just GCC as this uses standard C) create an array:

struct bits bitarray[20];

You then have:

bitarray[7].b3 = 1;

and so on. Of course while '7' in this can be a variable index, the 3 in "b3" cannot easily be done that way. But if these are unrelated bits in the array and are just general bit variables then you might have something like:

#define RED_LED bitarray[5].b4
#define PUMP_ACTIVE bitarray[3].b6

then later:

RED_LED = 1;
PUMP_ACTIVE = 0;

if you really want to be able to index any bit in your 240 then I agree with who_me that you need generic Bit writing and reading routines. So something like:

#include <stdint.h>
#include <stdlib.h>

uint8_t bitarray[30]; // 240 bits

typedef enum {
    LOW = 0,
    HIGH = 1
} bitstate;

void WriteBit(int offset, bitstate state) {
    div_t res;
    uint8_t bytenum;
    uint8_t bitoffset;
    uint8_t mask;

    res = div(offset, 8);
    bytenum = res.quot;
    bitoffset = res.rem;
    mask = (1 << bitoffset);
    if (state == LOW) {
        bitarray[bytenum] &= ~mask;
    }
    else {
        bitarray[bytenum] |= mask;
    }
}

uint8_t ReadBit(int offset) {
    // same calculation as in WriteBit - should be factored into a
    // shared function or static inline for expediency
    div_t res = div(offset, 8);
    uint8_t mask = (1 << res.rem);
    return !!(bitarray[res.quot] & mask);
}

I've written that with far too many variables (though the optimiser should spot this anyway and discard them) but did it purely to show the process here. The fact is that the bit you want to access is in the array at [offset/8] and the bit within that byte is at offset%8. You can do a combined calculation of quotient and remainder using div(). So if you called:

WriteBit(217, HIGH);

then 217/8 is 27 so it will access [27] in the bitarray and 217 % 8 is 1 so it will be bit 1. So the mask will be (1 << 1) which is 0x02 and because it is "HIGH" it will OR location [27] with 0x02 to set bit 1. On reading:

if (ReadBit(217)) {
    ...
}

then it just works all that backwards and ANDs location [27] with 0x02. That actually gives the result 0x02, but I'm assuming that you really just want a 0/1 state (though I suppose if() in this case doesn't really care), so I use the old !!() trick to convert 0/non-0 into 0/1. (Apply two !s to any non-0 value and you will see how that works.)
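The !!() trick in isolation, as a sketch:

```c
#include <stdint.h>

/* !v maps 0 -> 1 and non-0 -> 0; a second ! flips that back, so the
   net effect is: 0 stays 0, anything non-0 becomes exactly 1 */
static inline uint8_t to_flag(uint8_t v) { return !!v; }
```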

 

PS thanks for making me think about this, by a complete coincidence today is the day I also need such a bitarray too - so I just came up with my own solution :-)

 

EDIT:

I guess that the C++ people need to show a smart way :)

Oh and I'm doing mine in C++ so, yes I can overload the = operator (and ==).

 

Last Edited: Thu. Jun 9, 2016 - 08:28 AM

It's not like OP can't get it to work.

 

This is about the balance between readable code and the generated code (size and/or speed). 

 

And I guess a big part is whether it generates a function call or will be inlined.

 

 

Which brings me to ask if it can be:

MyBitArray[x] = MyBitArray[y];

Last Edited: Thu. Jun 9, 2016 - 09:33 AM

I have shown you that Codevision supports the non-standard bit variable (and arrays of them too).

 

Whether you choose to use = or a function is up to you.

 

I would always worry about clarity first.   Efficiency second.

 

There is certainly no point in worrying about how the Compiler does it.   Just that it does it correctly.

 

David.


sparrow2 wrote:
Which brings me to ask if it can be: MyBitArray[x] = MyBitArray[y]

As you said earlier this is certainly possible in C++ as the '[]' indexing operation can be implemented by the programmer.

 

[x] and [y] would just be broken down into actual_bitarray[x/8] and actual_bitarray[y/8], with the mask for the bit access created using (1 << (x % 8)) or (1 << (y % 8)) just as I showed above. However this will involve a run-time calculation of (1 << n) which could be costly, though the obvious { 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80 } lookup table solution could mitigate some of the overhead of doing run-time shifts.
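In plain C (a sketch; names are mine, not from the thread) that bit-to-bit assignment could look like:

```c
#include <stdint.h>

static uint8_t demo[30];   /* a demo bit array for the usage example below */

/* copy the bit at index src to index dst within a packed bit array */
static void CopyBit(uint8_t *arr, uint16_t dst, uint16_t src) {
    uint8_t v = (arr[src >> 3] >> (src & 7)) & 1;   /* read the source bit */
    uint8_t mask = (uint8_t)(1 << (dst & 7));       /* run-time (1 << n), or a LUT */
    if (v) arr[dst >> 3] |= mask;
    else   arr[dst >> 3] &= (uint8_t)~mask;
}
```

So `MyBitArray[x] = MyBitArray[y]` becomes `CopyBit(demo, x, y);`.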

 

I guess it depends what is important to the implementor: speed, size or ease of maintenance. I would tend to favour the latter myself unless these bit-vars were being used in a particularly speed-sensitive area.

 

Of course one solution to all this is just a "char array[240];" in which you simply store 0's and 1's, or perhaps even '0's and '1's (for easier debugging), but that trades RAM usage for ease of implementation and speed of access. I know some people's heads might explode at the prospect of wasting 7 of every 8 bits though!


{ 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80 } lookup table solution could mitigate some of the over-head of doing run-time shifts.

But there is a relatively big overhead in the code the compiler makes for a LUT.

 

It could almost take the same time as the whole thing takes in ASM (in the ballpark of 15 clocks).


sparrow2 wrote:
but there is a relative big overhead with the code the compiler make for a LUT.

I guess it depends what one means by "big"? ...

$ cat avr.c
#include <avr/io.h>

uint8_t masks[] = { 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80 };


int main(void) {
    while(1)
        PORTD = masks[PINB];
}
00000082 <main>:
uint8_t masks[] = { 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80 };


int main(void) {
    while(1)
        PORTD = masks[PINB];
  82:	e6 b3       	in	r30, 0x16	; 22
  84:	f0 e0       	ldi	r31, 0x00	; 0
  86:	e0 5a       	subi	r30, 0xA0	; 160
  88:	ff 4f       	sbci	r31, 0xFF	; 255
  8a:	80 81       	ld	r24, Z
  8c:	82 bb       	out	0x12, r24	; 18
  8e:	f9 cf       	rjmp	.-14     	; 0x82 <main>

 


OK, I was thinking of the mask placed in flash,

and using a variable index, not a constant.

Last Edited: Thu. Jun 9, 2016 - 11:14 AM

sparrow2 wrote:

ok I was thinking of the mask placed in flash.

Doesn't seem to make much difference...

$ cat avr.c
#include <avr/io.h>
#include <avr/pgmspace.h>

uint8_t masks[] PROGMEM = { 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80 };


int main(void) {
    while(1)
        PORTD = pgm_read_byte(&masks[PINB]);
}
00000074 <main>:
  74:	e6 b3       	in	r30, 0x16	; 22
  76:	f0 e0       	ldi	r31, 0x00	; 0
  78:	ec 5a       	subi	r30, 0xAC	; 172
  7a:	ff 4f       	sbci	r31, 0xFF	; 255
  7c:	e4 91       	lpm	r30, Z+
  7e:	e2 bb       	out	0x12, r30	; 18
  80:	f9 cf       	rjmp	.-14     	; 0x74 <main>

Virtually identical apart from LPM versus LD.

 

(wonder why one is Z and the other Z+ ??)


But that is illegal code.

You cannot combine Z+ with the result in R30 or R31.


Yup, odd isn't it - but the compiler really did generate this - however I have to admit it was an old 4.5.3. It's possible that if this is an error it has been fixed.

 

EDIT: yup, this from 4.9.2:

00000074 <main>:
  74:	e6 b3       	in	r30, 0x16	; 22
  76:	f0 e0       	ldi	r31, 0x00	; 0
  78:	ec 5a       	subi	r30, 0xAC	; 172
  7a:	ff 4f       	sbci	r31, 0xFF	; 255
  7c:	e4 91       	lpm	r30, Z
  7e:	e2 bb       	out	0x12, r30	; 18
  80:	f9 cf       	rjmp	.-14     	; 0x74 <main>

No "+"

Last Edited: Thu. Jun 9, 2016 - 12:03 PM

Thanks for the tips... I will actually use variables as the indices (MyArray[iii]=1;), but the value will probably be an assigned constant (0 or 1) or compared to a constant... As noted, an assignment can be converted to a multi-statement form (not as "fast" as saying MyArray[iii]=MyArray[jjj], but in reality it may execute just as fast).

 

interesting.

first search result:

http://www.avrfreaks.net/forum/h...

 

Scroll down to the post made by 'clawson'; he has given some example code on how to do it in gcc ... I did see the clawson post originally, but he was tilting towards CV; I am using GCC.

 

I was just looking at what is available compiler-wise rather than code-wise (creating my own functions)... Of course using a function call can execute much slower than if the type was supported directly or as a close cousin. I'm not in a super hurry, so 50x slower than fast assembler methods is probably "ok".

 

 

 

When in the dark remember-the future looks brighter than ever.


But if they are constants only, one would think that the compiler would use the SBRC and SBRS instructions (and SBIC and SBIS).

 

Add:

I mean load into R24 from RAM and then use SBRC etc on R24 with the correct number.

Last Edited: Thu. Jun 9, 2016 - 12:58 PM

Modern Compilers have "inline"

 

All Compilers have macros.

 

You really should have faith in the Tools you use.    They should translate valid C code 100% accurately.   i.e. do as asked.

Of course they may implement differently.

 

I would concentrate on clear code that is easy to maintain.

By all means post your code.   I suspect that you might get advice that keeps the clarity but improves the efficiency.

 

I believe in factoring the code to simple steps.   Then concentrating on a single macro / inline function that is efficient and can be re-used.

Cliff does not like macros.    Inline functions have better "syntax checking".    Most importantly,   write once, use many means you fix one code sequence.

 

David.


You really should have faith in the Tools you use. 

 

I will just ask you to look at #19 and #20; there you can see where my faith is.

 

  Most importantly,   write once, use many means you fix one code sequence.

Yes, that is the main reason for using ASM and not C: you never know whether it makes smart code or not, and the real problem is that if you just add one line (or even delete one), the worst-case timing could go up 10 times.


avrcandies wrote:
I was just looking at what is available compiler-wise rather then code-wise

Well, for most compilers that is dictated by the C Standard. As you'll read there, the only notion of "bit variables" is bitfields in structs. Which is exactly why I mentioned:

struct bits {
  uint8_t b0:1, b1:1, b2:1, b3:1, b4:1, b5:1, b6:1, b7:1;
} __attribute__((__packed__));

Some of that is GCC specific (the __attr__) but the core of it:

struct bits {
  uint8_t b0:1, b1:1, b2:1, b3:1, b4:1, b5:1, b6:1, b7:1;
};

is standard C (and AVR compilers generally don't pad but have inherent packing, so the __attr__ is almost certainly unnecessary in this context anyway). Once you have such a structure defined you can use it to make individual bit accesses to any 8-bit location.
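For example (a sketch in sbit.h style; note that bitfield layout is implementation-defined, and the pointer cast leans on the type-punning latitude compilers like avr-gcc allow - a union would be the strictly portable route - but avr-gcc allocates b0 as the least significant bit):

```c
#include <stdint.h>

struct bits {
    uint8_t b0:1, b1:1, b2:1, b3:1, b4:1, b5:1, b6:1, b7:1;
};

uint8_t flags;   /* any ordinary 8-bit location */

/* view that byte through the bitfield struct */
#define FLAGS (*(struct bits *)&flags)
```

Then `FLAGS.b3 = 1;` sets bit 3 of flags (0x08) on compilers that allocate b0 at the least significant bit.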

 

Of course, as others have hinted at, all the time you will want to be wary of the actual code generated, because it could be bloated/inefficient if you are not careful.

 

As the discussion has already touched on, any kind of run-time determination of bit masks from bit numbers involves either a (1<<n) calculation or a { 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80 } lookup.

 

As I say, if you just need the 240 variables but will access them individually rather than by some indexed method, then the foo.b5 kind of access could work nicely, and if you want to hide the "ugly syntax" you just do this:

#define EASY_NAME bitarray[n].bm

and from then on write 0/1 to the var and test it in if()s and so on.


sparrow2 wrote:
Yes that is the main reason for using ASM, and not C. (you never know if it make smart code or not, and the real problem is if you just add one line (or even delete one) the worst case timing could go 10 times up).

While I enjoy your posts, don't get too evangelical.  If you change one line of your ASM, your worst case timing could go 10 times up.  You never know whether your ASM is smart code or not.

 

[Indeed, I've worked with a handful of gurus that think in machine op codes.  You may well be one of them.  Us mere mortals can never hope to match your efforts, no matter what language and toolchain.  But the majority of programmers, ASM or not, are not such gurus.  And so the old argument goes...the ASM program has many more lines thus many more chances for error.  But putting that aside for now...]

 

OP has expressed the desire for ~240 bit variables.  sparrow2, you've taken shots at proposed C implementations, and hinted at "better" ways.  Show a code fragment of your "better" approach for 60 to 64 bytes of bit variables on an AVR8.

 

[LOL -- back in a prior life in CNC/PLC world, my company did it with SRAM "dual port" for lack of a better term.  It could be looked at via the traditional byte view.  And extra decoding hardware gave it a bit view.  IIRC it was patented.  It made for the [at the time] fastest ladder logic scan time in the industry.]

 

 

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Last Edited: Thu. Jun 9, 2016 - 02:59 PM

OP has expressed the desire for ~240 bit variables.  sparrow2, you've taken shots at proposed C implementations, and hinted at "better" ways.  Show a code fragment of your "better" approach for 60 to 64 bytes of bit variables on an AVR8.

That's the nail: OP wants fewer than 256 bits; you want more.

 

But if every index is a constant you don't care.

A check on a constant bit would be:

LDS  r24, addr
SBRS r24, 2
rjmp next
; code for true
next:

and you would have a list of all the 240 or 500 bits by name

 

If the index is a variable and it's an array of 256 bits, I have done this in the past:

First, I let the low 5 bits be the byte address and the high 3 bits be the bit number.

It's a bit odd, but if you use the bit functions for everything it doesn't matter.

The benefit is that the address is found by a simple AND with 0x1F (done after the bit mask is found) and an add of the array start.

for the top bits the mask is found by this: (the bit number in R24)

LDI  r16, 1
SBRC r24, 6
LDI  r16, 4
SBRC r24, 5
ADD  r16, r16
SBRC r24, 7
SWAP r16

Now r16 holds a mask with one bit set, placed depending on the 3 top bits of r24.

Add a COM if you want the inverted (clear) mask.

If you want speed you make this inline, or make a simple (r)call with ret.
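Functionally (ignoring the cycle-level SBRC/SWAP tricks above), that addressing scheme is, as a C sketch:

```c
#include <stdint.h>

static uint8_t bitarr[32];   /* 256 bits */

/* low 5 bits of n select the byte, top 3 bits select the bit within it */
static void SetBitSplit(uint8_t n)    { bitarr[n & 0x1F] |= (uint8_t)(1 << (n >> 5)); }
static uint8_t GetBitSplit(uint8_t n) { return (bitarr[n & 0x1F] >> (n >> 5)) & 1; }
```

The ASM version avoids the run-time (1 << n) shift entirely; this C rendering only shows the index/mask split, not the speed trick.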

 

 

 


sparrow2 wrote:
That's the nail OP want less than 256 bit, you want more.

Ooops -- can't divide by 8 in my head.  Must have used C.  In ASM, it would be

Show a code fragment of your "better" approach for 30 to 32 bytes of bit variables on an AVR8.

sparrow2 wrote:
But if every is const you don't care

Agreed.  But OP implied not constant...

avrcandies wrote:
MyBitArray[sensornum]=0;

 

sparrow2 wrote:
if variable and in a array of 256 bits I have done this in the past:

 

Interesting.  I'll have to digest that a bit. [pun intended]  Using the high 3 bits for the bit instead of the low is akin to doing any array row/col or col/row -- doesn't really matter as long as you are consistent.  A quick peek indicates constant 7 cycles (plus register save/restore if needed and the "array index" creation/load into the working register).

 

 

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Last Edited: Thu. Jun 9, 2016 - 03:18 PM

...I have done this in the past...

At least since 2007. ;)

http://www.avrfreaks.net/forum/f...

...where glitch said it is near-optimal.  ;)

 

...and you said you'd been doing it since 2000:

But for normal things I will still use the old rutine. I made a bit array lib back in 2000 that use it.(but that was on a org. 8515 without MOVW).

 Lots of interesting approaches in the linked thread

http://www.avrfreaks.net/comment...

 

I guess your criterion is "least cycles", yet you seemed to put aside the lookup table solution, which was four cycles?

 

 

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


The trick is to get balanced code, and because of the 16-bit nature of X, Y, and Z on an AVR it's a pain to use them unless you really need to (it's not like the 8051 with @R0).

And if you use the lookup, the mask has to be made from the lower 3 bits, and then the address takes extra time.

But I guess it can be fast with two LUTs of 256 bytes each in RAM (page aligned) ;)

 

I like the row/col col/row comment.

 

I have used it to store an array of 5-bit numbers (normally a pain to find the n'th element). But if you place the first number as bit 0 in the first 5 bytes, then the next as bit 1, etc., you deal with more even numbers (and yes, it takes longer; but a multiply by 5 is easy).

 

My first job with AVR was also my first job in the US (2000). There I came from the 8051 and Samsung's 4-bit micro, which had some really smart things. (Now that we talk about bit arrays: it had a swapping array, so if you stored a byte at one address it showed up at 8 addresses as bits, and the other way round.) It also had a thing like SRET that would have been nice on the AVR (it means "skip return", so you can return true or false just by your ret instruction, and then you place a jump after the call in the main code).

Redundant loads were ignored, so:

LD EX, 10
label:
LD EX, 20

EX would hold 10 if we come from the top and 20 if we come from label; together with skip instructions it can make some very compact code.

And you could define your own opcodes for things you often use: two 1-byte instructions would become 1 byte, or a 2-byte instruction could become 1 byte (there were some limitations I can't remember).

And that gave me some odd ideas for some funny AVR code (I made an SRET that changed the stack, but it got slow because you needed to check whether the instruction after the call is 2 words).

And a bit lib with ideas from both 8051 and the samsung.

 

Add: the 4-bit Samsung had an LCD controller and did everything in this handset:

http://www.simrad-yachting.com/e...

 

 

 

 

 

Last Edited: Thu. Jun 9, 2016 - 06:20 PM

Bottom line:
What you want cannot be done in C with array syntax,
not even with common extensions.
Allowing parentheses instead of square brackets does not change that.
putBit(index, value) and getBit(index) are not that hard to write in C.
In C, the compiler can generate case by case optimizations.
avr-gcc has sufficiently powerful inline assembly
as well as some builtin functions that can help
you perform some of the optimizations by hand.
gcc's ({ statements; expression; }) might also be useful.
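For instance, a hedged sketch of putBit/getBit using gcc's statement-expression extension so the index argument is only evaluated once (names and storage are illustrative, not from the post):

```c
#include <stdint.h>

static uint8_t store[30];   /* 240 bits */

#define getBit(n) ({ uint16_t n_ = (n);                      \
                     (store[n_ >> 3] >> (n_ & 7)) & 1; })

#define putBit(n, v) ({ uint16_t n_ = (n);                       \
                        uint8_t  m_ = (uint8_t)(1 << (n_ & 7));  \
                        if (v) store[n_ >> 3] |= m_;             \
                        else   store[n_ >> 3] &= (uint8_t)~m_;   \
                        (void)0; })
```

Being a GNU extension, this is gcc-only; with other compilers the do { } while (0) macro shape or a static inline function does the same job.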

 

With other compilers, assembly would likely require a function call.

 

International Theophysical Year seems to have been forgotten..
Anyone remember the song Jukebox Band?


Thanks everyone:  This gave some good insights that were shared by all.  While I supposed that I would end up just writing my own functions, I thought there might be some rarely-used feature or "trick" of  GCC to get the job done.

Too many times you get something oddball written up, debugged, & 5 minutes later someone comes along & says, "why didn't you just use the glgglkghlk  feature--it 's been available since 198x, but not yet documented".

 

By the way...regarding shifting, I wonder if the GCC compiler is smart enough to keep the shift-5 result (see below) for reuse in the shift-6 conditional of line 3, or does it wastefully re-shift everything?  Note, in this example aaa itself is not actually (permanently) shifted.  Maybe deleting line 2 makes a difference in the compiler "forgetting/lookahead" strategy.    In assembler you tend not to forget since you are so close to the code details & what is going on.

 

if (aaa>>5)  bbb++;  //line 1

ccc=aaa+bbb;           //line 2 ...some nonsense calculation (but must compile after line 1 and before line 3)

if (aaa>>6)  bbb--;   //line 3  ...hopefully reuses >>5 from line 1, rather than all new shifting

 

 or at least this might even work (requires less compiler smarts)

if (aaa>>5)  bbb++;  //line 1

if (aaa>>6)  bbb--;   //line 2  ...hopefully reuses >>5 from line 1, rather than all new shifting

 

 

I'll fire up the compiler to take a look, but was wondering if there are ways to force this optimization (I've wondered about such things in the past).

 

 

When in the dark remember-the future looks brighter than ever.


Now, I don't know the size of aaa, but I don't really think the compiler is so stupid that it actually does the shift when it's used with a constant.

 

So I hope/expect that it makes an AND on the top bits and checks for non-zero.
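That is, for a 16-bit unsigned aaa, something like this (a sketch):

```c
#include <stdint.h>

/* (aaa >> 5) != 0 holds exactly when any bit above bit 4 is set,
   so the test can be done with a mask instead of a shift */
static uint8_t any_high_bits(uint16_t aaa) {
    return (aaa & 0xFFE0u) != 0;
}
```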

 

add:

I hope that aaa is unsigned ;) (else all negative numbers will be true)

 

add:

If it does the shifts it shows that it's optimized for ARM etc. and not AVR

Last Edited: Sat. Jun 11, 2016 - 07:49 AM

That's a good point, that it could do an AND operation. My question was really: would it be smart enough to carry the result from line 1 to line 3, assuming the problem required actual shifting?

 

 

if ((aaa>>5)>bbb)  bbb++;  //line 1

ccc=aaa+bbb;           //line 2 ...some nonsense calculation (but must compile after line 1 and before line 3)

if ((aaa>>6)>bbb)  bbb--;   //line 3  ...hopefully reuses >>5 from line 1, rather than all new shifting

 

 

When in the dark remember-the future looks brighter than ever.

Last Edited: Sat. Jun 11, 2016 - 08:36 AM

I just compiled this (on a default setup of 6.2; that is all I have here):

unsigned int	aaa,bbb,ccc;
int main(void)
{
    while(1)
    {
		aaa=ADCW;
		if (aaa>>5)
		{
			bbb++;
		}
		if (aaa>>6)
		{
			bbb++;
		}
		PORTC=bbb;
        //TODO:: Please write your application code
    }
}

 

And it makes very stupid code for both -O2 and -Os (and yes, it does the >> twice).

The only real difference between the two versions is that the bbb++ gets moved around and bbb is only loaded once.

 

but if you give a hint like this:

unsigned int	aaa,bbb,ccc;
int main(void)
{
    while(1)
    {
		aaa=ADCW;
		unsigned int temp;
		if (temp=(aaa>>5))
		{
			bbb++;
		}
		if (temp>>1)
		{
			bbb++;
		}
		PORTC=bbb;
        //TODO:: Please write your application code
    }
}

it will make some OK code, but still do the shifts.

 

 

Last Edited: Sat. Jun 11, 2016 - 10:15 AM

I just compiled this (on a default setup on a 6.2 that is all I have here).....And it make very stupid code for both -O2 and -Os (and yes it make >> twice )

 

That is what I was afraid of: the compiler should save the value from line 1 and reuse it for line 3, rather than recreate it from scratch. Many times it seems like a half-witted chess program. Easier said than done, certainly.

When in the dark remember-the future looks brighter than ever.


Apparently gcc has no AVR-specific optimization code for shifting.

ARMs and 486s have barrel shifters.

 

International Theophysical Year seems to have been forgotten..
Anyone remember the song Jukebox Band?


Apparently gcc has no AVR-specific optimization code for shifting.

Not until someone has enough desire to develop it and then donate it back to the GCC project ;-) 


It optimizes fine in the small: the >>6 is implemented as a <<2 and then some logic, and in most cases the >>5 includes a SWAP instruction (and sometimes it makes a loop).

But it has no memory across the two shifts.


When you spot the wastefulness of the compiler, it's rather aggravating...best not to look under the hood.   Take the extra microseconds to drink a beer :)

 

Was wondering if there is a comparison review between AVR C compilers as to how "smart"  they are (optimization-wise).  GCC is free, so really can't complain (more beer money).

 

 

When in the dark remember-the future looks brighter than ever.

Last Edited: Sat. Jun 11, 2016 - 04:11 PM

gcc has to have *some* avr-specific code for >> or it would not be able to do AVRs at all.

In principle, it would not have to special-case constant shifts or particular constants,

but it does avoid some embarrassingly inefficient code.

 

Whether to treat a>>5 as a subexpression of a>>6 is a trickier issue.

 

International Theophysical Year seems to have been forgotten..
Anyone remember the song Jukebox Band?


GCC does suffer a little from the fact that it is a multi-language, multi-architecture compiler system, so it doesn't compile directly from C to AVR (or any specific target) asm but breaks whatever language (C, C++, Ada, Fortran, Objective C, whatever) into an intermediate, generic representation called GIMPLE; then it uses a target-specific code generator to produce AVR asm from that. So it's possible this process is sub-optimal for the specific combination of C and AVR. What's more, the system tends to be biased towards the important targets (ARM, x86), which may actually be detrimental to a trivial target like AVR.

 

A compiler specifically for AVR like Codevision could well have advantages because of this. 

 

While it's not a great idea to start a compiler war I think pretty much everyone who uses AVR knows that when size (speed?) really matters it is worth paying the $3000+ for a copy of IAR. But how desperately do you need to save a few bytes/cycles to justify that kind of cost? 

 

I think all the compilers have Eval versions so the real test is to download all of them and try them with the specific style of code YOU write (on the whole forget published "benchmarks", it's rare for them to reflect YOUR code style). 


Unless much has changed over the last 5 years, I will say:

CV in general makes better AVR code than IAR.

Like GCC, IAR makes a general (and good) compiler but doesn't make optimal AVR code.

CV doesn't have that problem; it's an AVR-only compiler. For overall general optimization I think GCC and IAR are best, but for small "stupid" I/O code and register optimization (which is what most code is) CV is the best.

 

To compare with the past I will say that CV is like what Keil was for the 8051.

 

Note: I don't use either.

I use GCC and/or ASM.