Best practice number rotation technique for C?

Go To Last Post
30 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I searched to Hell and Gone for this and found as is often the case the best suggestion was from Dean Camera in the thread at:
https://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&p=186954#186954

Dean's suggest was for a right shift macro. I amplified this by adding (and testing) a left shift macro. The following code works:

#include 
// Rotate macros based on a suggestion by Dean Camera on AVRFreaks at:
// https://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&p=186954#186954

#define bit_rotate_right(a) a = ((a & 0x01)? ((a >> 1)| (1ULL << ((sizeof(a) * 8) - 1))) : (a >> 1)) 
#define bit_rotate_left(a) a = ((a & (1ULL << ((sizeof(a) * 8) -1))) ? ((a << 1) | 1) : (a << 1)) 

volatile uint8_t myByte1 = 1;
volatile uint8_t myByte2 = 0x80;
volatile uint16_t myInt1 = 1;
volatile uint16_t myInt2 = 0x8000;
volatile uint32_t myLong1 = 1;
volatile uint32_t myLong2 = 0x80000000;

int main()
{
	while(1)
	{
		bit_rotate_right(myByte1);
		bit_rotate_left(myByte2);
		bit_rotate_right(myInt1);
		bit_rotate_left(myInt2);
		bit_rotate_right(myLong1);
		bit_rotate_left(myLong2);
	}
}

I have two questions:
1. Is the the best way to do this?
2. Cliff said that nobody would need to do a rotate in C, but I want to use this in a Chaser Lights project where I rotate a pattern on LEDs and I think this is better than using look up tables since this can do 8, 16, and 32 bit shifts all with the same macros. So can Cliff or someone tell me a better way?

Smiley

Last Edited: Sun. Sep 18, 2011 - 01:17 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

smileymicros wrote:
1. Is the the best way to do this?
Define "best".

If "best" means "as ugly as only C allows", then your macros will do.

If "best" means "grab on the opportunity to show how easily such simple operations can be performed through inline assembler in a simple and elegant way", then invest to the inline assembler version.

I might be a bit biased.

Jan The C Hater

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

1) Hasn't this been beaten to death recently?
https://www.avrfreaks.net/index.p...

2)

Quote:

If "best" means "grab on the opportunity to show how easily such simple operations can be performed through inline assembler in a simple and elegant way", then invest to the inline assembler version.

https://www.avrfreaks.net/index.p...

Quote:
but when using C you almost never need a rotate. It tends to be frustrated Asm programmers who pointlessly look for this.

I'll leave it to the gentle readers tto decide what type of frustration is ailing Jan.

3) This is the perfect opportunity for Jan to take shots, as indeed C dosn't have the concept of rotate. (Or carry. Or 24-bit arithmetic.)

3a) Before we go any further, tell me where the critical use is. That includes you, too, Jan--give some quick examples on where you used rotate in AVR apps, and why it was so useful.

4) And also before we go too far, in the link I gave OP was bemoaning the "no rate in C" situation. Repeatedly, I asked what the optimal sequence of ASM instructions he was trying to generate/duplicate. No answer. So I need to have it here. After all we are after the "best". As the AVR has no inherent rotate instruction(s) then what sequences is optimal for you? Can you tooolchain then be coerced int generating that sequence? If so, then we've found the "best", right?

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

wek wrote:
smileymicros wrote:
1. Is the the best way to do this?
Define "best".
Change to 'best practice'.

wek wrote:
If "best" means "as ugly as only C allows", then your macros will do.
Beauty is in the eye of the beholder... Anyway, deep down you really do want to be a nice guy.

theusch wrote:
1) Hasn't this been beaten to death recently?
https://www.avrfreaks.net/index.p...
Yes, BUT...
I couldn't pull out any reasonable best practice from all that 'discussion'. Wouldn't it be nice to have a bit of working code to point to the next time this question gets asked?

If someone can come up with a working example for this that uses in-line assembler AND is smaller-quicker I'm happy to go that route.

Smiley

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:

Wouldn't it be nice to have a bit of working code to point to the next time this question gets asked?

Wouldn't it be nice if someone would finally answer the question on what this "optimal" series of AVR instructions is? There seems to be a >>great<< reluctance. Without that any compiler-generated sequence is just speculation and subject to ridicule.

Joe, I made the "challenge" to Jan 'cause I can't remember any apps where I did a critical rotate. For your chasing-lights scenario, I generally use a sequence of some sort and then move though it. I.e. changing the starting index rather than the pattern. (Isn't that kind of like Jesper's miniDDS? No rotates there...)

Once we know what the gold standard is, THEN we can exercise toolchains to try to match it. Without that it is a tempest in a teapot.

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Tempest in a teapot?

I've done Chaser Lights as you suggest and I've done them with look up tables, but I thought that for rotating a pattern that a rotate procedure would be optimal.

When I say best practice I'm not expecting the GOD OF C to drop down and tell me the TRUTH. I'm just wondering what to tell folks who might want to do this.

Take for instance a challenge where you can have 8, 16, 32, or 64 bits in any pattern on the like number of LEDs and you want to rotate them like on a marquee, what is a good simple way to do this. Say the pattern isn't known at run time but can be anything. Then it seems to me that the rotate makes the most sense. Picture a Chaser Light Marquee around a frame for a movie poster and you want 512 LEDs that you gang in connected units of 32 LEDs each. Then this rotate would make the pattern run around the mirror in 16 blocks but appear to go all the way around the frame. Now you see marquees like this all the time in movie theaters usually displaying some simple pattern, but with this technique you could get a huge variety of patterns going. I'm not going to make a product like this, but I think that it would be a good thinking exercise when you want to convince someone why they might want to learn bitwise operators. And of course if I was going to do this I'd use SPI which ironically enough rotates the bits...

And anyway I'm mainly waiting for Cliff to wake up and tell me why rotate isn't the way to do this and what is.

Smiley

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:
If someone can come up with a working example for this that uses in-line assembler AND is smaller-quicker I'm happy to go that route.
Assembly could easily be smaller and quicker for this:

BST num, 0
LSL num
BLD num, 7

or

LSL num
BRCC wasZero
ORI num, 0x80
wasZero:

The problem, of course, is that this only works for fixed sized numbers. If you want 16 bit or 32 bit numbers, you would need separate routines. However I would think that the cases where you would need multiple sizes of rotates in a single program would be even rarer than the need for rotate at all. But for comparison, the C code for 8 bit values compiles down to 6 clocks compared to the 3 clocks of the assembly (7 vs 4 for 16 bit). Even when the compiler gets a little dumb with 32 bit numbers (it wastes 2 cycles by ANDing 0x0001 with the lower 2 bytes instead of just 0x01 with the lowest byte), it still only gets 10 clocks vs 6 for assembly.

Regards,
Steve A.

The Board helps those that help themselves.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I assume that you will like something general rather that fast!
I would make a function that take :

pointer to source (perhaps flash!)
pointer to destination
int number of bit's to copy
int offset of the copy

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

smileymicros wrote:
Best practice number rotation technique for C?
The macros you mentioned in the OP are no goot practice because
• They need the sizeof operator in the input
• Are incorrect if the input has side effects (volatile)
• Need an lvalue as input
• Are hard to read
• cannot be used function-like as b = rot (a+1)

If these macros do what you want, then you are fine with them an need not bother about myriads of different ways it can be written down in C.

One canonical form to write the rotate left N of an unsigned A is

(A << N) | (A >> (sizeof(A)-N))

For rotate offset of One, there's also a way to write it similar to your OP, i.e. shifting by 1 and inserting the shifted-out bit by patching it with OR.

Writing it function-like in C then could look like

#include 

#define F(N)                                        \
static inline uint##N##_t rora##N (uint##N##_t a)   \
{                                                   \
    uint ## N ## _t b = a >> 1;                     \
    if (a & 1) b |= 1l << (N-1);                    \
    return b;                                       \
}                                                   \
static inline uint ##N##_t rola##N (uint##N##_t a)  \
{                                                   \
    uint##N##_t b = a << 1;                         \
    if (a & (1l << (N-1))) b |= 1;                  \
    return b;                                       \
}                                                   \
static inline uint##N##_t rorb##N (uint##N##_t a)   \
{                                                   \
    uint ##N##_t b = (a >> 1) | (a << (N-1));       \
    return b;                                       \
}                                                   \
static inline uint##N##_t rolb##N (uint##N##_t a)   \
{                                                   \
    uint ## N ## _t b = (a << 1) | (a >> (N-1));    \
    return b;                                       \
}                                                   \
static inline uint ##N## _t roth##N (uint##N##_t a) \
{                                                   \
    uint ##N##_t b = (a >> (N/2)) | (a << (N/2));   \
    return b;                                       \
}                                                   \
extern volatile uint##N##_t u##N;                   \
void do_rot##N (void)                               \
{                                                   \
    asm (";rora"#N);                                \
    u##N = rora##N (u##N);                          \
    asm (";rorb"#N);                                \
    u##N = rorb##N (u##N);                          \
    asm (";rola"#N);                                \
    u##N = rola##N (u##N);                          \
    asm (";rolb"#N);                                \
    u##N = rolb##N (u##N);                          \
    asm (";roth"#N);                                \
    u##N = roth##N (u##N);                          \
    asm (";2*roth"#N);                              \
    u##N = roth##N (roth##N (u##N));                \
}

F(32)
F(16)
F(8)

ROLA/B and RORA/B are these two way of writing it down and the ROTH is swapping low and high part.

Attached what avr-gcc-4.6.1 produces out of that; there's some room for improvement...

avr-gcc does not implement rotate patterns except for byte-wise rotate and 4-byte rotate of char. As it detects that a 4-byte rotate is SWAP and two of them are identity, teaching avr-gcc rotates by 1 might be worth a try. In particular form B is well suited to be detected as rotate by target-independent analyzers.

Edit: avr-gcc actually tries to find left-rotate patterns by 1 and size-1 but doesn't find them and thus expands into lengthy code.

Attachment(s): 

avrfreaks does not support Opera. Profile inactive.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:
They need the sizeof operator in the input
Why is that not good practice?
Quote:
Need an lvalue as input
Easily correctable by removing the "a =" and calling it as:
c = bit_rotate_right(c);

Quote:
cannot be used function-like as b = rot (a+1)
Mostly correctable by adding extra parentheses around all the "a"s in the macros (in addition to the change above). I say mostly because it won't work for chars. The addition automatically promotes to int. This is correctable by using this:

c = bit_rotate_right((char)(c + 1));

Quote:
Writing it function-like in C then could look like
And this is somehow more readable than the OP? And it is less usable since now you need to call different functions for each datatype.

But I would do it in C++:

template T bitRotateRight(T a)
{
   return  (a & 0x01)? ((a >> 1) | (1ULL << ((sizeof(a) * 8) - 1))) : (a >> 1);
}

Regards,
Steve A.

The Board helps those that help themselves.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I know nothing of C, but bear with me... In assembler the solution is trivial/logical.
But can't C do byte/int/long x 2 (or divide by 2)and know when there is an overflow/underflow to modify the new byte/int/long value correctly ?
@Smiley, best to me means can I understand what the hell is being done...and the outcome :(

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:

and know when there is an overflow/underflow

The C programmer cannot (easily) get access to the C bit in SREG.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

How about this modification of Dean's macro (got rid of the comparison)

#define ROR(x) (((x&1) << ((sizeof(x)*8) - 1)) | (x >> 1))
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Are you specifically looking for an operator/function that handles any integer type? Such macros are common, but I'm not sure that they're "best practice." An assembler function for each particular type ought to be much less ugly...

Note that the AVR doesn't really have a "rotate" instruction; it has a shift instruction, and a "shift including carry" instruction that it calls rotate. That makes the code that Dean produced pretty optimal, I think (does it get worse for left rotate?)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:
Note that the AVR doesn't really have a "rotate" instruction; it has a shift instruction, and a "shift including carry" instruction that it calls rotate

The MUL instruction can be use as barrel shifter (And yes only for 8 bit)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:

2. Cliff said that nobody would need to do a rotate in C, but I want to use this in a Chaser Lights project where I rotate a pattern on LEDs and I think this is better than using look up tables since this can do 8, 16, and 32 bit shifts all with the same macros. So can Cliff or someone tell me a better way?

Well since you mentioned me I'll just reiterate what several people have effectively said above (BTW I like Steve's C++ template which is good for all type widths). Depending on which way you are rotating you start by testing the bit at the top/bottom end - the one that's about to "drop off" and in an Asm rotate would be finding its way into the C flag. You then use a left/right shift to do the shuffle then, if the initial test was true (the original top/bottom bit were set), you OR in a bit at the other end. While you can try can work this into some complex one line macro it seems to me it's as simple as:

#include 

static inline uint8_t ROR8(uint8_t n) {
  uint8_t carry = 0;

  if (n & 1) {
    carry = 1;
  }
  n >>= 1;
  if (carry) {
    n |= 0x80;
  }
  return n;
}

static inline uint8_t ROL8(uint8_t n) {
  uint8_t carry = 0;

  if (n & 0x80) {
    carry = 1;
  }
  n <<= 1;
  if (carry) {
    n |= 0x01;
  }
  return n;
}

int main(void) {
  uint8_t i;

  DDRB = 0xFF;
  PORTB = 0x01;
  for (i=0; i<12; i++) {
    PORTB = ROL8(PORTB);
  }
  for (i=0; i<12; i++) {
    PORTB = ROR8(PORTB);
  }
}

That may look wordy and long winded but at the end of the day the C compiler is probably unlikely to generate (much ;-)) more code than some tightly packed complex function/macro. A single invocation looks like:

  PORTB = ROL8(PORTB);
  6c:	88 b3       	in	r24, 0x18	; 24
  6e:	98 2f       	mov	r25, r24
  70:	99 0f       	add	r25, r25
  72:	87 fd       	sbrc	r24, 7
  74:	91 60       	ori	r25, 0x01	; 1
  76:	98 bb       	out	0x18, r25	; 24

I prefer static inline functions to macros because of the type checking.

The 16 and 32bit versions are obviously:

static inline uint16_t ROR16(uint16_t n) {
  uint8_t carry = 0;

  if (n & 1) {
    carry = 1;
  }
  n >>= 1;
  if (carry) {
    n |= (uint16_t)0x8000;
  }
  return n;
}

static inline uint16_t ROL16(uint16_t n) {
  uint8_t carry = 0;

  if (n & (uint16_t)0x8000) {
    carry = 1;
  }
  n <<= 1;
  if (carry) {
    n |= 0x01;
  }
  return n;
}

and:

static inline uint32_t ROR32(uint32_t n) {
  uint8_t carry = 0;

  if (n & 1) {
    carry = 1;
  }
  n >>= 1;
  if (carry) {
    n |= (uint32_t)0x80000000;
  }
  return n;
}

static inline uint16_t ROL32(uint32_t n) {
  uint8_t carry = 0;

  if (n & (uint32_t)0x80000000) {
    carry = 1;
  }
  n <<= 1;
  if (carry) {
    n |= 0x01;
  }
  return n;
}

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Thanks Cliff.

I just tested your version and it generates much less code, especially for the 16 and 32 bit versions. Also it is clearer so I'm going to use your suggestion in a library with, or course, proper attribution.

Smiley

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Three times I started to write a lengthy answer and then deleted it.

Turns out, others said already all I wanted to say. Except perhaps mentioning the distinction between application and system software when it comes to "best practices".

And I would also not constrain a display designer with the bizarre habit of programmers to count things by powers of two.

JW

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:
That may look wordy and long winded but at the end of the day the C compiler is probably unlikely to generate (much Wink) more code than some tightly packed complex function/macro. A single invocation looks like:
Which I believe is the exact same result as the original macro for an 8 bit number.
Quote:
I just tested your version and it generates much less code, especially for the 16 and 32 bit versions.
Really? When I compiled the original macros, the 16 and 32 bit versions were not much worse than the hand-tuned assembly. They certainly did not have room for "much less" code.

Regards,
Steve A.

The Board helps those that help themselves.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Koshchi wrote:
Really? When I compiled the original macros, the 16 and 32 bit versions were not much worse than the hand-tuned assembly. They certainly did not have room for "much less" code.
Maybe I did something wrong? Did you compare to Cliff's code?

Smiley

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:
Maybe I did something wrong?
Did you perhaps make the variable sent into the macro volatile? If you do, then every access of "a" must read the whole value from memory. If you trick the compiler with this:

uint32_t a;
volatile uint32_t b; 
...
a = b;
a = ROR32(a);
b = a;

so that "a" itself is not volatile, it actually produces slightly better code than Cliff"s in the 32 bit case (there is one less MOVW). Here's the result for both:

//Your macro
MOVW      R18,R24
ANDI      R18,0x01
ANDI      R19,0x00
LSR       R27
ROR       R26
ROR       R25
ROR       R24
OR        R18,R19
BREQ      PC+0x02
ORI       R27,0x80

//Cliff's code
MOVW      R18,R24  
MOVW      R20,R26  
LSR       R21      
ROR       R20      
ROR       R19      
ROR       R18      
ANDI      R24,0x01 
ANDI      R25,0x00 
OR        R24,R25  
BREQ      PC+0x02  
ORI       R21,0x80 

Note that both do a double AND and an OR to test a single bit when a single AND would do.

Note that adding an explicit cast to uint8_t for the test of the bit in either case corrects this problem, however for the macro it causes unnecessary register thrashing that actually makes the code worse:

//Your Macro
MOVW      R18,R24   
MOVW      R20,R26   
LSR       R21       
ROR       R20       
ROR       R19       
ROR       R18       
SBRS      R24,0     
RJMP      PC+0x0005 
MOVW      R26,R20   
MOVW      R24,R18   
ORI       R27,0x80  
RJMP      PC+0x0003 
MOVW      R26,R20
MOVW      R24,R18

//Cliff's code
MOVW      R18,R24  
MOVW      R20,R26  
LSR       R21      
ROR       R20      
ROR       R19      
ROR       R18      
SBRC      R24,0    
ORI       R21,0x80 

Regards,
Steve A.

The Board helps those that help themselves.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0
#define Ror32(x)                           \
  asm (                                    \
    "mov   __tmp_reg__, %A[_X]    \n\t"    \
    "ror   __tmp_reg__            \n\t"    \
    "ror   %D[_X]                 \n\t"    \
    "ror   %C[_X]                 \n\t"    \
    "ror   %B[_X]                 \n\t"    \
    "ror   %A[_X]                 \n\t"    \
    : [_X]  "+r" (x)                       \
  )

#define Rol32(x)                           \
  asm (                                    \
    "mov   __tmp_reg__, %D[_X]    \n\t"    \
    "rol   __tmp_reg__            \n\t"    \
    "rol   %A[_X]                 \n\t"    \
    "rol   %B[_X]                 \n\t"    \
    "rol   %C[_X]                 \n\t"    \
    "rol   %D[_X]                 \n\t"    \
    : [_X]  "+r" (x)                       \
  )

etc.

#define Ror(x)                             \
  asm (                                    \
    "  mov   __tmp_reg__, %A[_X]  \n\t"    \
    "  ror   __tmp_reg__          \n\t"    \
    ".if (%[_N] >= 4)             \n\t"    \
    "  ror   %D[_X]               \n\t"    \
    ".endif                       \n\t"    \
    ".if (%[_N] >= 3)             \n\t"    \
    "  ror   %C[_X]               \n\t"    \
    ".endif                       \n\t"    \
    ".if (%[_N] >= 2)             \n\t"    \
    "  ror   %B[_X]               \n\t"    \
    ".endif                       \n\t"    \
    "  ror   %A[_X]               \n\t"    \
    : [_X]  "+r" (x)                       \
    : [_N]  "n"   (sizeof(x))              \
  )

#define Rol(x)                             \
  asm (                                    \
    ".if (%[_N] == 4)             \n\t"    \
    "  mov   __tmp_reg__, %D[_X]  \n\t"    \
    ".elseif (%[_N] == 3)         \n\t"    \
    "  mov   __tmp_reg__, %C[_X]  \n\t"    \
    ".elseif (%[_N] == 2)         \n\t"    \
    "  mov   __tmp_reg__, %B[_X]  \n\t"    \
    ".else                        \n\t"    \
    "  mov   __tmp_reg__, %A[_X]  \n\t"    \
    ".endif                       \n\t"    \
    "  rol   __tmp_reg__          \n\t"    \
    "  rol   %A[_X]               \n\t"    \
    ".if (%[_N] >= 2)             \n\t"    \
    "  rol   %B[_X]               \n\t"    \
    ".endif                       \n\t"    \
    ".if (%[_N] >= 3)             \n\t"    \
    "  rol   %C[_X]               \n\t"    \
    ".endif                       \n\t"    \
    ".if (%[_N] >= 4)             \n\t"    \
    "  rol   %D[_X]               \n\t"    \
    ".endif                       \n\t"    \
    : [_X]  "+r" (x)                       \
    : [_N]  "n"   (sizeof(x))              \
  )

Enjoy! ;-)

JW

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I would reconsider the problem that needs to be solved.

In other words, does the data really need to be rotated at all?
Or is it really that the data just needs to be viewed rotated or written somewhere else rotated?

There is very big distinction.

For example you may be able to simply extract the bits
from the original data in a different/shifted/rotated
order rather than actually rotate the data itself.

For something like an led chaser you will probably have
to examine each bit in the sequence data anyway,
so why not just rotate where you start looking into
the sequence rather than rotate the actual sequence data
to allow always pulling the data starting at the beginning.

Also consider that if you want the sequence data to
live in non volatile memory, you can't really rotate
the real data.

Say you were doing a scrolling sign that consisted
of a very wide bitmap that is larger than the display
device and perhaps even so large that it won't fit
in the RAM insdide an AVR. On something like, you
would definitely want to play with indexes rather
than rotate the actual data.

And consider that if you have a very large sequence
and the sequence data is acatually rotated, do
you need a way to get back to the original data
to restart things? If so, how do you do that?

In the simplest case such as the huge 512 leds
you mention, it could be as simple as blindly copying the bits
for each led from one location (output port/bit) to the next to perform the rotate.
Set the initial pattern, then just
copy each bit over as needed.

But I'm guessing in reality for something like that
a set of external shift registers would be used and the entire
problem would go away.

Just a few alternative ideas.

--- bill

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

clawson wrote:
Quote:

2. Cliff said that nobody would need to do a rotate in C, but I want to use this in a Chaser Lights project where I rotate a pattern on LEDs and I think this is better than using look up tables since this can do 8, 16, and 32 bit shifts all with the same macros. So can Cliff or someone tell me a better way?

Well since you mentioned me I'll just reiterate what several people have effectively said above (BTW I like Steve's C++ template which is good for all type widths). Depending on which way you are rotating you start by testing the bit at the top/bottom end - the one that's about to "drop off" and in an Asm rotate would be finding its way into the C flag. You then use a left/right shift to do the shuffle then, if the initial test was true (the original top/bottom bit were set), you OR in a bit at the other end. While you can try can work this into some complex one line macro it seems to me it's as simple as:

#include 

static inline uint8_t ROR8(uint8_t n) {
  uint8_t carry = 0;

  if (n & 1) {
    carry = 1;
  }
  n >>= 1;
  if (carry) {
    n |= 0x80;
  }
  return n;
}

static inline uint8_t ROL8(uint8_t n) {
  uint8_t carry = 0;

  if (n & 0x80) {
    carry = 1;
  }
  n <<= 1;
  if (carry) {
    n |= 0x01;
  }
  return n;
}

int main(void) {
  uint8_t i;

  DDRB = 0xFF;
  PORTB = 0x01;
  for (i=0; i<12; i++) {
    PORTB = ROL8(PORTB);
  }
  for (i=0; i<12; i++) {
    PORTB = ROR8(PORTB);
  }
}

That may look wordy and long winded but at the end of the day the C compiler is probably unlikely to generate (much ;-)) more code than some tightly packed complex function/macro. A single invocation looks like:

  PORTB = ROL8(PORTB);
  6c:	88 b3       	in	r24, 0x18	; 24
  6e:	98 2f       	mov	r25, r24
  70:	99 0f       	add	r25, r25
  72:	87 fd       	sbrc	r24, 7
  74:	91 60       	ori	r25, 0x01	; 1
  76:	98 bb       	out	0x18, r25	; 24

I prefer static inline functions to macros because of the type checking.

The 16 and 32bit versions are obviously:

static inline uint16_t ROR16(uint16_t n) {
  uint8_t carry = 0;

  if (n & 1) {
    carry = 1;
  }
  n >>= 1;
  if (carry) {
    n |= (uint16_t)0x8000;
  }
  return n;
}

static inline uint16_t ROL16(uint16_t n) {
  uint8_t carry = 0;

  if (n & (uint16_t)0x8000) {
    carry = 1;
  }
  n <<= 1;
  if (carry) {
    n |= 0x01;
  }
  return n;
}

and:

static inline uint32_t ROR32(uint32_t n) {
  uint8_t carry = 0;

  if (n & 1) {
    carry = 1;
  }
  n >>= 1;
  if (carry) {
    n |= (uint32_t)0x80000000;
  }
  return n;
}

static inline uint16_t ROL32(uint32_t n) {
  uint8_t carry = 0;

  if (n & (uint32_t)0x80000000) {
    carry = 1;
  }
  n <<= 1;
  if (carry) {
    n |= 0x01;
  }
  return n;
}

Cliff,

May I have your permission to use these in a library that has a BSD license?

Smiley

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Of course - I'm sure I can't be the first one to have written the exact same code so I think claiming any kind of copyright would be very tricky anyway.

(but I do prefer that anything I make public is BSD rather than GPL as I don't want it to be a barrier to commercial re-use).

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I don't know what else you have in the lib , but for a ASM programmer a ROR on 8 bit will repeat it self after 9 call's not 8 , etc for 16 32 64 repeat n+1, so if I was you I would find an other name, to avoid confusion.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

One could simulate the "9th bit" (that is the carry) if you like? But I think for the Cylon Eyes/Knightrider thing you actually want the rotate "within" the byte/word. So like Sparrow says - maybe ROR is a misnomer?

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Yeah, I realize the algorithm is well known, but I've been dealing with some very anal retentive types about licenses so I'm trying a bit harder to make sure I'm covered.

I named them rotate_right8 etc. so I doubt there will be confusion. (over the name anyway)

Smiley

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I made a 'wheel of fortune' with a circle of lamps with a mega8 with a bunch of 595s and my approach was to have a bunch of patterns of 1 of 8 on, 2 of 8 on, 4 on 4 off etc and you can use a for loop to fill the bytes of ram that are output to the 595s. Not a shift instruction in sight it seems.

Imagecraft compiler user

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I said earlier:

Quote:
... And of course if I was going to do this I'd use SPI which ironically enough rotates the bits...

I also wrote a couple of articles on the topic that use the 595 and 597:
http://smileymicros.com/blog/201...
http://smileymicros.com/blog/201...

The SPI does the shift for you.

Smiley