endian issue


1. Can someone tell me whether AVR microcontrollers are little- or big-endian?
2. What is the endianness of the WinAVR compiler?
3. Is the endianness of a microcontroller fixed purely by its hardware architecture, or does the programming language/compiler (Assembly, C or Basic) have any effect on it?

Thanks,
prabu


Well the GCC tools tend to err towards little endian. If you use:

uint16_t var16=0x1234;
uint8_t * p_var8 = (uint8_t *) &var16;
printf("%02X", *p_var8);

you will see 34 output, i.e. the low byte (0x34) is the one stored at the lowest address.

Cliff


I don't quite follow. Could you please explain a little more?


Cliff put 0x1234 in a 16-bit variable. He then output the 'first' byte of that variable (the one at the lowest address) using a pointer to it and printf. What came out was 0x34, i.e. the lowest byte value.
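
For anyone who wants to try this on their own target, here is a minimal sketch of the same experiment wrapped in functions. The helper names first_byte_in_memory, is_little_endian and report_endianness are made up for illustration and are not part of any library; the result depends on the target the code is compiled for.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: return the byte stored at the lowest address of a
   16-bit value, using the same pointer cast as Cliff's snippet. */
static uint8_t first_byte_in_memory(uint16_t value)
{
    const uint8_t *p = (const uint8_t *)&value;
    return p[0];
}

/* Hypothetical helper: nonzero when the low byte comes first in memory. */
static int is_little_endian(void)
{
    return first_byte_in_memory(0x0001) == 0x01;
}

/* Print what this particular target does with 0x1234. */
static void report_endianness(void)
{
    uint8_t first = first_byte_in_memory(0x1234);

    /* Only two outcomes are possible for a two-byte value. */
    assert(first == 0x34 || first == 0x12);
    printf("first byte: %02X (%s-endian)\n", (unsigned)first,
           is_little_endian() ? "little" : "big");
}
```

Calling report_endianness() from main should report little-endian on avr-gcc (assuming stdout is wired up to something) and on an x86 PC, in line with what Cliff saw.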


Where AVRs use addresses or other values which need two bytes, they would seem to use the little-endian method, i.e. the low byte of the address or value occupies the lower memory address. The GCC compiler, which is part of WinAVR, uses the same convention (at the moment).

Quebracho seems to be the hardest wood.


OK. What about the other questions, repeated below?

Are AVR microcontrollers little- or big-endian?

Is the endianness of a microcontroller fixed purely by its hardware architecture, or does the programming language/compiler (Assembly, C or Basic) have any effect on it?

thanks,
prabu


prabu wrote:
OK. What about the other questions, repeated below?

Are AVR microcontrollers little- or big-endian?


Neither.
Typically, endianness refers to the order in which bytes are laid out in memory when atomic arithmetic and logic is performed on chunks of data larger than 1 byte. With only a very few exceptions, AVRs are incapable of operating on data in chunks larger than 1 byte, so the concept of endianness barely applies to them.

However, there are some instances of multi-byte memory structures. And even here, things are not consistent. For example, the X, Y, and Z pointer registers are all interpreted as little-endian 16-bit values. Additionally, most (but not all) AVRs group their 16-bit timer registers as little-endian values. Some counter-examples also exist. For example, the return addresses on the AVR's call-return stack are stored as big-endian values. So, actually, the AVR core might be argued to be mixed-endian. But since the basic arithmetic and logic core of the AVR is limited to single-byte operations, I'd argue that neither definition is appropriate.

Quote:
Is the endianness of a microcontroller fixed purely by its hardware architecture, or does the programming language/compiler (Assembly, C or Basic) have any effect on it?

In a microprocessor like the 80x86, the instruction set is biased in favour of little-endian operations. It would be impossible to make it operate entirely on big-endian numbers (because of the need to use little-endian numbers to address memory), but even if you tried, you'd end up severely slowing down system performance because you could no longer perform arithmetic in chunks larger than 8 bits.

So, in some circumstances, the underlying hardware does dictate the endianness of the system no matter what programming language you choose to use.

In the AVR, since most logic and arithmetic is done in 8-bit chunks, either endianness could be used by the compiler with no significant degradation in performance. It's arguably more convenient to use little endian numbers to be more consistent with the layout of the pointer registers. But even then it's not necessary because the two halves of those registers can be loaded independently.

Quote:
...or does the programming language/compiler (Assembly, C or Basic) have any effect on it?

Probably not so much a question of C vs Basic vs Assembly. Rather, it would be more likely a question of which compiler vendor you're using (eg GCC vs IAR vs CodeVision vs ImageCraft etc.)

- Luke


Quote:
Are AVR microcontrollers little- or big-endian?

Well, the AVR8s are pretty much 8-bit micros, so endianness doesn't really come into it.

However, some of the peripheral blocks have 16-bit registers (take ADCW, which is ADCH and ADCL; UBRR, which is UBRRL and UBRRH; or TCNT1, which is TCNT1L and TCNT1H). In those, the convention is that the L register is at the lower I/O/memory address, so they too are little-endian.
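
To make the register-pair convention concrete, here is a small host-side sketch that models a 16-bit register as two adjacent bytes with the L half at the lower address. The fake_io array standing in for I/O memory and the helper read_pair_le are illustrative assumptions, not real AVR register definitions; on a real AVR, avr-gcc normally lets you read the 16-bit name (e.g. TCNT1) directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combine a register pair whose low byte sits at
   low_addr and whose high byte sits at low_addr + 1. The low byte is read
   first, which is the order AVR datasheets generally ask for when reading
   a 16-bit register. */
static uint16_t read_pair_le(const volatile uint8_t *low_addr)
{
    assert(low_addr != NULL);
    uint8_t low  = low_addr[0];
    uint8_t high = low_addr[1];
    return (uint16_t)(low | ((uint16_t)high << 8));
}

/* Stand-in for a slice of I/O memory: {TCNT1L, TCNT1H} holding 0x0123.
   The values are invented for the example. */
static volatile uint8_t fake_io[2] = { 0x23, 0x01 };
```

With this layout, read_pair_le(fake_io) yields 0x0123: the byte at the lower address supplies the low 8 bits.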

Cliff

PS Just out of interest, why is any of this important? It's usually only when you are transferring data between processors of different endianness that the whole question of byte ordering comes into play.

EDIT: Yup, what he said ;)


If you know which end comes first, you can create and use a union to help avr-gcc efficiently access bytes within words.
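
A minimal sketch of the union approach John mentions might look like the following. The type and function names (word_bytes_t, split_word, join_word) are invented for illustration; which array index holds the low byte depends on the target, and bytes[0] being the low byte is an assumption that holds for little-endian targets such as avr-gcc.

```c
#include <assert.h>
#include <stdint.h>

/* Overlay a 16-bit word with its two bytes. On a little-endian target
   bytes[0] is the low byte; on a big-endian target it is the high byte. */
typedef union {
    uint16_t word;
    uint8_t  bytes[2];
} word_bytes_t;

/* Split a word into its two bytes in memory order (lowest address first). */
static void split_word(uint16_t w, uint8_t out[2])
{
    word_bytes_t u;

    /* Sanity check: the overlay only works if there is no padding. */
    assert(sizeof(word_bytes_t) == sizeof(uint16_t));
    u.word = w;
    out[0] = u.bytes[0];
    out[1] = u.bytes[1];
}

/* Rebuild a word from two bytes given in memory order. */
static uint16_t join_word(const uint8_t in[2])
{
    word_bytes_t u;
    u.bytes[0] = in[0];
    u.bytes[1] = in[1];
    return u.word;
}
```

The round trip join_word(split_word(x)) returns x on any target; only the meaning of out[0] versus out[1] changes with endianness, which is exactly why you need to know which end comes first.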

- John


I think this should be a candidate for an FAQ/Tutorial...


Does it come into play when you use SDI or MMC flash cards? Or, is it set by the file access mechanism? Example: storing an int_32?

Jim

Jim Wagner Oregon Research Electronics, Consulting Div. Tangent, OR, USA http://www.orelectronics.net


It would be nice if this was in a FAQ and/or the help documentation, but I am glad I at least found it here (thanks!). I needed this info for sending data to/from a PC via a serial port. Copying a C struct directly to/from a stream of bytes is efficient, but you need to know and/or control the endianness and alignment/packing to make both sides agree on how that stream of bytes maps to the struct members.


chunderdog wrote:
It would be nice if this was in a FAQ and/or the help documentation, but I am glad I at least found it here (thanks!). I needed this info for sending data to/from a PC via a serial port. Copying a C struct directly to/from a stream of bytes is efficient, but you need to know and/or control the endianness and alignment/packing to make both sides agree on how that stream of bytes maps to the struct members.

Make sure you don't get burned by padding! If I have a structure:

struct SExample
{
    uint8_t byte;
    uint32_t longWord;
};

Then on an x86 machine, you may find that there are 3 "wasted" bytes between .byte and .longWord, because the compiler will generally align a uint32_t member to a 4-byte boundary. Other architectures (e.g. PowerPC, 68K) differ.

Therefore, if I have a serialised copy of that struct:

uint8_t array[] = { 0x11, 0x22, 0x33, 0x44, 0x55 };

The following is not guaranteed to work:

struct SExample example;
memcpy(&example, array, sizeof(struct SExample));

// This _may_ result in, depending on machine and compiler:
// example.byte = 0x11;
// example.longWord = 0x??????55;
// Where the ?? values are whatever was past the array in memory
// (the memcpy reads beyond the end of the 5-byte array).
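
If you want to see what your own compiler does with that struct, a sketch along these lines reports the layout using the standard offsetof macro. The helper names padding_before_longword and print_layout are invented for this example, and no particular output is promised since it depends on the target and compiler options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct SExample
{
    uint8_t  byte;
    uint32_t longWord;
};

/* Hypothetical helper: number of padding bytes the compiler inserted
   between .byte and .longWord. */
static size_t padding_before_longword(void)
{
    return offsetof(struct SExample, longWord) - sizeof(uint8_t);
}

/* Print the layout this compiler actually chose. */
static void print_layout(void)
{
    /* A struct can never be smaller than the sum of its members. */
    assert(sizeof(struct SExample) >= sizeof(uint8_t) + sizeof(uint32_t));
    printf("sizeof = %u, offset of longWord = %u, padding = %u\n",
           (unsigned)sizeof(struct SExample),
           (unsigned)offsetof(struct SExample, longWord),
           (unsigned)padding_before_longword());
}
```

On a typical x86 GCC build I would expect this to show 3 bytes of padding, and none on avr-gcc, but check on your own toolchain rather than taking that as given.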

damien_d wrote:
Make sure you don't get burned by padding!
...

Yes, that is very important. Nice example! That is a good explanation of the packing/alignment issue.

To make the byte streams map to the same struct members on both sides, the simplest solution is to tell the compilers on both sides to pack the structs.

For GCC, use __attribute__ ((__packed__)):

struct SExample
{
    uint8_t byte;
    uint32_t longWord;
} __attribute__ ((__packed__));

For Visual C++, the #pragma pack directive tells the compiler the packing alignment:

#pragma pack(push, 1)
struct SExample
{
    uint8_t byte;
    uint32_t longWord;
};
#pragma pack(pop)

I don't know about other compilers, but I would imagine that most compilers would have some way to specify that a struct should be packed.

Since both ends of the conversation are little-endian in my case, I don't have to worry about anything else. To communicate with a big-endian host, you could use htons(), htonl(), ntohs(), ntohl(), or similar functions or macros to swap the order of the bytes in multi-byte (larger than uint8_t) elements.
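
As a rough sketch of how the packed struct might be used on the receiving side with GCC (or Clang, which accepts the same attribute), something like the following checks the buffer length before copying. The struct name SPacked and the helper unpack_example are made up for this example, and no byte-order conversion is done, so it assumes both ends are little-endian as in my case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* GCC/Clang syntax; Visual C++ would need #pragma pack instead. */
struct SPacked
{
    uint8_t  byte;
    uint32_t longWord;
} __attribute__ ((__packed__));

/* Hypothetical helper: copy exactly sizeof(struct SPacked) bytes from a
   received buffer into the struct. Returns 0 on success, -1 if the buffer
   is too short. Byte order is NOT converted here. */
static int unpack_example(struct SPacked *out, const uint8_t *buf, size_t len)
{
    assert(out != NULL && buf != NULL);
    if (len < sizeof(struct SPacked)) {
        return -1;
    }
    memcpy(out, buf, sizeof(struct SPacked));
    return 0;
}
```

With packing in effect, sizeof(struct SPacked) is 5 and longWord sits at offset 1, so a 5-byte message maps onto the struct without reading past the end of the buffer.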

Last Edited: Tue. Nov 3, 2009 - 11:23 AM

Quote:

I don't know about other compilers

-fpack-struct for GCC

(users of avr-gcc have probably seen this in their build - though it's a bit pointless as everything is byte aligned in AVR8 land anyway)


clawson wrote:
Quote:

I don't know about other compilers

-fpack-struct for GCC

(users of avr-gcc have probably seen this in their build - though it's a bit pointless as everything is byte aligned in AVR8 land anyway)

Individual structs can be packed using __attribute__((__packed__)).

We made use of it when sharing data structures through common header files between an AVR application and a PC application that communicate with each other (while still letting the PC application lay out its non-AVR-related structs with their normal, more efficient alignment).

JW


I would say the AVR is pretty clearly little endian.

The index registers all use the order XL:XH YL:YH ZL:ZH

The results of a multiplication are put into r0:r1 in the order LSB:MSB

Cheers,

Joey


Joey,

I know it's a bit contrived but set a Data-Memory window in the simulator to watch your stack and then make a CALL. ;-)

Cliff


clawson wrote:
I know it's a bit contrived but set a Data-Memory window in the simulator to watch your stack and then make a CALL. ;-)
But even here you can consider the AVR as being little endian. First the low byte is put on the stack, then the high byte. It only appears to be big endian, because the stack grows downwards. ;-)

Stefan Ernst


Quote:

It only appears to be big endian, because the stack grows downwards

Hence "contrived" ;-)


clawson wrote:
Hence "contrived" ;-)
Perhaps I should look up an unknown word before answering next time. :oops:

Stefan Ernst


Here is a brief summary of info from this topic that is relevant to memcpy serialization of C structs for transferring data between AVR and non-AVR hosts (for example, a PC):

  1. avr-gcc is little-endian
  2. avr-gcc packs structs (aligns members to 1-byte boundaries) by default
  3. compilers for non-AVR platforms typically pad structs as needed to align members to the appropriate boundaries for efficient access (http://en.wikipedia.org/wiki/Data_structure_alignment)
  4. to make C structs on non-AVR platforms be memcpy-compatible (aside from endianness) with their AVR counterparts, the compilers can be told to pack the structs
    1. For GCC, use __attribute__ ((__packed__))
    2. For Visual C++, use #pragma pack(push, 1) ... #pragma pack(pop)
  5. to match endianness, multi-byte fields can be converted to/from network byte order (big endian) via htonl, ntohl, htons, and ntohs (http://linux.die.net/man/3/htonl) or similar functions or macros
  6. alternatively, a little-endian format could be used if data will not be shared with big-endian hosts or if such hosts convert the fields to/from little-endian format (note that htonl et al. will not work for this, so custom functions or macros may be needed)
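
For items 5 and 6, one way to avoid depending on the host's endianness at all is to build the byte stream with shifts instead of memcpy. The sketch below is only an illustration of that idea; the function names put_le32, get_le32, put_be32 and get_be32 are my own and do not come from any standard header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a 32-bit value low byte first (little-endian wire format). */
static void put_le32(uint8_t *buf, uint32_t v)
{
    assert(buf != NULL);
    buf[0] = (uint8_t)(v);
    buf[1] = (uint8_t)(v >> 8);
    buf[2] = (uint8_t)(v >> 16);
    buf[3] = (uint8_t)(v >> 24);
}

/* Read a 32-bit value that was written low byte first. */
static uint32_t get_le32(const uint8_t *buf)
{
    assert(buf != NULL);
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}

/* Same idea, high byte first (network byte order, like htonl). */
static void put_be32(uint8_t *buf, uint32_t v)
{
    assert(buf != NULL);
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)(v);
}

static uint32_t get_be32(const uint8_t *buf)
{
    assert(buf != NULL);
    return ((uint32_t)buf[0] << 24)
         | ((uint32_t)buf[1] << 16)
         | ((uint32_t)buf[2] << 8)
         | (uint32_t)buf[3];
}
```

Because the shifts work on values rather than on memory layout, the same source gives the same byte stream on the AVR and on the PC, at the cost of a few more instructions than a plain memcpy.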


chunderdog wrote:
I needed this info for sending data to/from a PC via a serial port.

Beware that Java, for example, always uses big-endian (Motorola format), unlike native compilers for the x86 PC (Intel format), which use little-endian. So if endianness matters and it is not the same on both sides, one side has to adapt.


Quote:

I know it's a bit contrived but set a Data-Memory window in the simulator to watch your stack and then make a CALL.

You could call that either-endian ;)

It puts them on the stack little end first, but fetches them big-end first.

Cheers,

Joey


One case where this is important is implementing a bootloader. I've been working on a CAN bootloader for the mega644p. Pages are filled a word at a time by the system rather than a byte at a time. In the avr-libc documentation, the example code given for avr/boot.h specifies that the word to write should be little-endian.
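
As a host-runnable sketch of that word assembly (not the avr-libc example itself), the helper below turns received bytes into little-endian 16-bit words. The name bytes_to_le_words is hypothetical; on the AVR each resulting word would be handed to boot_page_fill() from avr/boot.h rather than stored in an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: turn a byte buffer into the 16-bit words that a
   word-at-a-time page fill expects, low byte first. Here the words are
   only stored so the sketch can run on a PC. n_bytes must be even.
   Returns the number of words written. */
static size_t bytes_to_le_words(const uint8_t *buf, size_t n_bytes,
                                uint16_t *words)
{
    assert(buf != NULL && words != NULL);
    assert(n_bytes % 2 == 0);

    size_t n_words = n_bytes / 2;
    for (size_t i = 0; i < n_words; i++) {
        uint16_t low  = buf[2 * i];       /* first byte received  */
        uint16_t high = buf[2 * i + 1];   /* second byte received */
        words[i] = (uint16_t)(low | (uint16_t)(high << 8));
    }
    return n_words;
}
```

So the bytes 0x0C, 0x94 arriving in that order become the word 0x940C.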


"the return addresses on the AVR's call-return stack are stored as big-endian values"

 

Thank you for pointing that out. It has been screwing me up bad.

Who would have thunk they could do such a moronic thing!


From the standpoint of the stack, which grows downwards, it is little-endian, i.e. the LSB of a return address is pushed onto the stack first.  It is only when you try to directly manipulate a return address on the stack that you must remember that this means the LSB will be >>after<< the MSB in SRAM.

 

Facilitation of the direct manipulation of return addresses on the stack was likely not a design requirement when the architecture was designed.  The call/rcall/ret/reti instructions are generally the only ones used to place and retrieve return addresses to/from the stack, and they do it correctly.
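
The effect described above can be reproduced with a toy model on a PC. This is only a sketch: the 16-byte sram array, the sp variable and the two helper functions are invented for illustration and assume a 16-bit return address, so it is not a description of any particular AVR's memory map.

```c
#include <assert.h>
#include <stdint.h>

#define SRAM_SIZE 16u

/* Toy model of SRAM and a stack pointer starting at the top of memory. */
static uint8_t  sram[SRAM_SIZE];
static uint16_t sp = SRAM_SIZE - 1u;

/* PUSH-style store: write at SP, then decrement SP (stack grows down). */
static void push_byte(uint8_t v)
{
    assert(sp < SRAM_SIZE);
    sram[sp] = v;
    sp--;
}

/* Model of a call storing a 16-bit return address: low byte first. */
static void push_return_address(uint16_t addr)
{
    push_byte((uint8_t)(addr & 0xFFu)); /* LSB lands at the higher address */
    push_byte((uint8_t)(addr >> 8));    /* MSB lands at the lower address  */
}
```

After push_return_address(0x1234), reading memory upwards from sp + 1 gives 0x12 then 0x34, which is why the return address looks big-endian in a memory window even though the low byte was pushed first.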

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 

Last Edited: Thu. Oct 15, 2015 - 08:19 PM