split uint16_t into two uint8_t


I have an integer that contains 0xDEAD.  I need to split the high and low bytes into separate bytes.

 

I tried:

char c16[2];
uint16_t ui16 = 0xdead;
memcpy( c16, ui16, 2 );

But Studio spits up about a cast and memcpy looking for CONST VOID.

 

Any ideas on how to do this simply?

 

Jim

This topic has a solution.

I would rather attempt something great and fail, than attempt nothing and succeed - Fortune Cookie

 

"The critical shortage here is not stuff, but time." - Johan Ekdahl

 

"Step N is required before you can do step N+1!" - ka7ehk

 

"If you want a career with a known path - become an undertaker. Dead people don't sue!" - Kartman

"Why is there a "Highway to Hell" and only a "Stairway to Heaven"? A prediction of the expected traffic load?"  - Lee "theusch"

 

Speak sweetly. It makes your words easier to digest when at a later date you have to eat them ;-)  - Source Unknown

Please Read: Code-of-Conduct

Atmel Studio6.2/AS7, DipTrace, Quartus, MPLAB, RSLogix user

Last Edited: Mon. Dec 30, 2019 - 07:25 PM

jgmdesign wrote:
Any ideas on how to do this simply?

 

You are already doing it "simply". But `memcpy` expects pointers as its first and second arguments:

 

memcpy( c16, &ui16, 2 );

 

It is not clear to me why you were trying to pass `ui16` to it directly.

 

jgmdesign wrote:
memcpy looking for CONST VOID

 

What are you talking about? The second parameter of `memcpy` is `const void *`, not some "CONST VOID".

 

Of course, the question you left unanswered is whether you want an endian-dependent or an endian-independent result. The answer to this question will determine the best/proper way to do it.
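A minimal sketch of the corrected call, to show why the endianness question matters: `memcpy` copies the bytes in whatever order the platform stores them. The helper names `split_native` and `is_little_endian` are made up for illustration and are not from the original post.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the two bytes of value in native memory order. */
static void split_native(uint16_t value, uint8_t out[2])
{
    assert(out != NULL);
    memcpy(out, &value, sizeof value);   /* note the & : memcpy wants pointers */
}

/* Hypothetical helper: returns 1 when the low byte is stored first. */
static int is_little_endian(void)
{
    uint8_t b[2];
    split_native(0x0001, b);
    return b[0] == 0x01;
}
```

On a little-endian target such as AVR, `split_native(0xDEAD, b)` leaves 0xAD in `b[0]` and 0xDE in `b[1]`; a big-endian target gives the reverse.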

Last Edited: Fri. Dec 27, 2019 - 06:22 PM

unions are best IMO

 

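#include <stdint.h>
#include <stdio.h>

typedef union {
    struct {            // anonymous struct (C11); member order follows memory order
        char cL;        // low byte on a little-endian target such as AVR
        char cH;        // high byte on a little-endian target
    };
    uint16_t u;
} cu_t;

void foo(uint16_t uv) {
    cu_t v;             // same type name as the typedef above
    v.u = uv;
    putchar(v.cL);
    putchar(v.cH);
}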

 


c16[0] = (char) (ui16 >> 8);

c16[1] = (char) (ui16 & 0x00ff); //may not need the mask as the cast to char should do it

 

???

 

Jim

 


 

 

 

 

This reply has been marked as the solution. 

EDIT: beat me to it

 

Simply to me would be

    uint8_t msb = u16 >> 8;
    uint8_t lsb = u16;

then the endianness doesn't matter, msb will always be the most significant byte.

If you don't need to worry about using the same code on different platforms with possibly different endianness, then OK, the memcpy (once you pass the address of u16, as already pointed out) or union method can be used.
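As a rough sketch, the shift approach can be wrapped in small helpers, with a matching function to rebuild the value. The names `high_byte`, `low_byte`, `join_bytes` and `check_round_trip` are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: names are illustrative, not from the thread. */
static uint8_t high_byte(uint16_t v) { return (uint8_t)(v >> 8); }
static uint8_t low_byte(uint16_t v)  { return (uint8_t)(v & 0xFFu); }

/* Rebuild the 16-bit value; the casts keep the arithmetic unsigned. */
static uint16_t join_bytes(uint8_t msb, uint8_t lsb)
{
    return (uint16_t)(((uint16_t)msb << 8) | lsb);
}

/* Round-trip check: splitting and rejoining gives back the original value. */
static void check_round_trip(uint16_t v)
{
    assert(join_bytes(high_byte(v), low_byte(v)) == v);
}
```

Because only arithmetic on the value is involved, the result is the same on little-endian and big-endian targets.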

Last Edited: Fri. Dec 27, 2019 - 06:15 PM

MrKendo wrote:
then the endianness doesn't matter, msb will always be the most significant byte.

 

What if the OP actually wants the endianness to matter?


MattRW wrote:
unions are best IMO

 

... with the remark that using unions for type-punning is legal in C, but formally illegal in C++.


AndreyT wrote:

MrKendo wrote:

then the endianness doesn't matter, msb will always be the most significant byte.

 

 

What if the OP actually wants the endianness to matter?

Then use memcpy or union.

 

 

 

 


AndreyT wrote:
What if the OP actually wants the endianness to matter?

 

Ideally the MSB would be in the high byte, but I am not picky.

 

Gotta play some more.

 

Thanks all

 

Jim


For C and avr-gcc, the union method is probably best.

When doing it with arithmetic, I've seen some pretty awful code from avr-gcc.

 

With inline assembly, a single MOVW would suffice.

Iluvatar is the better part of Valar.


skeeve wrote:

With inline assembly, a single MOVW would suffice.

 

That's basically exactly what `memcpy` will translate into in any modern self-respecting compiler.  `memcpy` is a quintessential example of an intrinsic.

 

https://godbolt.org/z/tut4xm

 

memcpy( c16, (const void *) &ui16, 2 ):

 

  ldd r24,Y+1
  ldd r25,Y+2
  std Y+4,r25
  std Y+3,r24

 


One could simply cast the pointer and avoid the memcpy altogether.  Personally I'd just use the shift and mask technique.


Kartman wrote:
Personally I'd just use the shift and mask technique
+1


Kartman wrote:
One could simply cast the pointer and avoid the memcpy altogether.

 

In this particular case it is indeed a possibility, since we are type-punning the data as an array of `char`, specifically. But in the general case (when the target type is not necessarily `char`-based) that would be an aliasing violation, which is not permitted. In the general case, `memcpy` or a union (in C) are the most versatile and safe methods of type-punning.

 

Kartman wrote:
Personally I'd just use the shift and mask technique.

 

Again, it is a matter of what the objective is. Shift-and-mask is a good approach in arithmetic contexts, assuming it gives one what one needs. Again, in the OP's case the fact that the target type is `char` (as opposed to `unsigned char`) introduces some nuances (probably unintentional).
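One way to see the `char` nuance is the sketch below. It uses `signed char` explicitly, since whether plain `char` is signed depends on the compiler and its options; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* High byte stored in an unsigned 8-bit type: always in the range 0..255. */
static int high_as_unsigned(uint16_t v)
{
    uint8_t b = (uint8_t)(v >> 8);
    return b;                 /* 0xDE stays 222 */
}

/* High byte stored in a signed 8-bit type, as plain char may be.
   Converting an out-of-range value is implementation-defined; on common
   two's-complement compilers 0xDE typically comes back as a negative number. */
static int high_as_signed(uint16_t v)
{
    signed char b = (signed char)(v >> 8);
    return b;
}

/* Values below 0x80 fit in both types, so the two helpers agree on them. */
static void check_small_values_agree(void)
{
    assert(high_as_unsigned(0x7F00) == high_as_signed(0x7F00));
}
```

If the bytes are later compared against constants such as 0xDE or used in arithmetic, `uint8_t` or `unsigned char` avoids this surprise.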


skeeve wrote:

For C and avr-gcc, the union method is probably best.

When doing it with arithmetic, I've seen some pretty awful code from avr-gcc.

Indeed. Back in the days of WinAVR-20100110 and before, folks had to keep an eye on the generated assembly. I remember writing some dreadful constructs just to get sensible assembly from the compiler. (Looking back though I probably shouldn't have wasted my time; the inefficient code wouldn't have made any significant difference to the end product)

 

Fortunately now we're as good as in 2020; those days are long gone:

Just write the code that clearly demonstrates your intention, let the compiler/optimiser work its magic and have done with it.

 

So here: The 2-line function on the left compiles to the assembly on the right with -Os

#include <stdint.h>

extern uint16_t num;
extern uint8_t gh, gl;          // test:
                                //		lds r24,num
void test (void) {              //		lds r25,num+1
    gh = num >> 8;              //		sts gh,r25
    gl = num & 0xff;            //		sts gl,r24
}                               //		ret

Almost perfect!

 

Last Edited: Sat. Dec 28, 2019 - 04:54 PM

skeeve wrote:
For C and avr-gcc, the union method is probably best (sic).

Strictly speaking, this relies upon undefined behaviour - so is best avoided.

 

You are likely to get away with it on an 8-bitter - but it may fall apart on targets where alignment and/or padding come into play.

 

So I'm with the others on preferring the shift & mask.

 

EDIT

 

typo

Top Tips:

  1. How to properly post source code - see: https://www.avrfreaks.net/comment... - also how to properly include images/pictures
  2. "Garbage" characters on a serial terminal are (almost?) invariably due to wrong baud rate - see: https://learn.sparkfun.com/tutorials/serial-communication
  3. Wrong baud rate is usually due to not running at the speed you thought; check by blinking a LED to see if you get the speed you expected
  4. Difference between a crystal, and a crystal oscillator: https://www.avrfreaks.net/comment...
  5. When your question is resolved, mark the solution: https://www.avrfreaks.net/comment...
  6. Beginner's "Getting Started" tips: https://www.avrfreaks.net/comment...
Last Edited: Sat. Dec 28, 2019 - 07:19 PM

awneil wrote:

skeeve wrote:

For C and avr-gcc, the union method is probably best (sic).

 

Strictly speaking, this relies upon undefined behaviour - so is best avoided.

 

Actually, it doesn't. Union method for type-punning has been deliberately legalized in C since C99. There's no undefined behavior (as long as you are not running into trap representation of the target type).

 

This is undefined behavior in C++, but not in C.

 

awneil wrote:
You are likely to get away with it on an 8-bitter - but it may fall apart on targets where alignment and/or padding come into play.

 

Um... That is actually the reason unions are so popular in this role (and the reason such usage was formally made legal): they allow one to naturally avoid any alignment or padding issues. Unions are guaranteed to be properly aligned for all of their members.

 

awneil wrote:
So I'm with the others on preferring the shift & mask.

 

As long as you and others recognize the fact that first and foremost, this is a matter of intent. And only after that a matter of preference.
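A small C sketch of the union approach with unsigned bytes follows. The type `word_bytes_t` and the function `msb_via_union` are hypothetical names, and the size check reflects an assumption that holds on typical targets rather than a guarantee quoted from the standard.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for illustration. In C (C99 and later) reading a member
   other than the one last written reinterprets the stored bytes. */
typedef union {
    uint16_t word;
    uint8_t  bytes[2];   /* bytes[0] is the lowest-addressed byte */
} word_bytes_t;

/* Returns the most significant byte regardless of the platform's byte order. */
static uint8_t msb_via_union(uint16_t v)
{
    word_bytes_t probe = { .word = 0x0100 };           /* MSB of probe is 0x01 */
    int msb_index = (probe.bytes[0] == 0x01) ? 0 : 1;  /* where the MSB lives */
    word_bytes_t u = { .word = v };
    assert(sizeof u == sizeof(uint16_t));              /* assumed: no padding */
    return u.bytes[msb_index];
}
```

This is C only; as noted above, the same code is formally undefined in C++, where `memcpy` is the portable choice.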

Last Edited: Sat. Dec 28, 2019 - 08:20 PM

AndreyT wrote:
Again, in the OP's case the fact that the target type is `char` (as opposed to `unsigned char`) introduces some nuances (probably unintentional).

 

I never insisted on CHAR or any other requirement parameters other than I want to split up an unsigned integer into two bytes.  uint8_t is fine with me too, and I was looking at that originally.

 

The code I posted was something I pulled off of the interwebs while searching around.  it was an experiment that went funky and I was asking for some guidance.

 

Jim


AndreyT wrote:
Actually, it doesn't. Union method for type-punning has been deliberately legalized in C since C99. There's no undefined behavior (as long as you are not running into trap representation of the target type).

 

This is undefined behavior in C++, but not in C.

IIRC, even in C++, arrays of unsigned chars (jgmdesign used plain chars) are allowed to alias anything.

Unsigned chars do not have trap representations.

Also IIRC plain chars default to signed on avr-gcc. That might complicate plans for subsequent use.

Also, on a platform (not avr-gcc) whose plain chars are signed but not two's-complement, using plain char as a byte is a bad idea.

 


Just noticed: example uses char, but title uint8_t.


skeeve wrote:

Just noticed: example uses char, but title uint8_t.

Title is correct, the OP was an example I had tried.

Jim


AndreyT wrote:

Union method for type-punning has been deliberately legalized in C since C99. There's no undefined behavior (as long as you are not running into trap representation of the target type).

 

This is undefined behavior in C++, but not in C.

 

Well, the compiler can handle undefined behaviour as it pleases. Fortunately, gcc handles this particular undefined behaviour always in the same way: type-punning with unions works, even in C++.


El Tangas wrote:
the compiler can handle undefined behaviour as it pleases

That's true

 

El Tangas wrote:
Fortunately, gcc handles this particular undefined behaviour always in the same way

In other words, the code works by luck - not by design.

 

In general, relying upon undefined behaviour is not a sensible design approach.


awneil wrote:
In other words, the code works by luck - not by design.

 

It will work for as long as this is present in gcc documentation:

 

Pay special attention to code like this:

union a_union {
  int i;
  double d;
};

int f() {
  union a_union t;
  t.d = 3.0;
  return t.i;
}

The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above works as expected.

In other words, gcc handles this undefined behaviour in a well documented way. Of course they can change it whenever they want to.


To be fair, I think pretty much all compilers do the same in practice - even before C99 "legalised" it.

 

El Tangas wrote:
Of course they can change it whenever they want to

That's always the catch with undefined behaviour!

 

(although I guess there's a subtle difference between stuff which is undefined by the Standards, and stuff which remains undefined by the implementation ...? )

 


El Tangas wrote:
Well, the compiler can handle undefined behaviour as it pleases.

 

Yes, absolutely. The compiler is free to "define the undefined", as a compiler-specific extension. The compiler is free to assign well-defined behavior to a specific kind of formally undefined behavior, in which case it effectively turns "undefined" into "implementation-defined" (read: documented) behavior. Or, the compiler is free to leave the behavior undefined. In the latter case the code will easily behave unpredictably.

 

P.S. There's a surprising and inexplicable misguided belief out there that "undefined behavior" does not exist in real-life compilers. I.e. for some reason some people believe that compilers always somehow "define" undefined behavior. This is, of course, not true.

Last Edited: Mon. Dec 30, 2019 - 08:29 PM

AndreyT wrote:
I.e. for some reason some people believe that compilers always somehow "define" undefined behavior. This is, of course, not true.
Are you suggesting they generate non-deterministic code? For any scenario they must do something and it's pretty likely it'll be the same something every time you run that build of the compiler.


Just as a warning, and not specific to AVR in any more than a generic sense: on a work job I recently had to decode a message group emitted by an ARM, from code built with the EWARM desktop (IAR compiler). That message consisted of a standard header, and then a type-specific message. The messages - a couple of dozen - were all defined as packed structures of locally defined types - from bitfields up to 64 bit variables, 8/16/32 bit variables, and text. The overall message was a structure containing the header and a union of all the structures; just pack the appropriate structure and transmit it.

 

To decode, read the entire message into a similar struct/union and read out the appropriate structure from the union. No problem, on the ARM that was intended to receive it.

 

On a W64 build on a desktop though... meh. Bitfields the other way around, things not packed to the same size, alignment issues, not recognising typedeffed enum variables as the same size, even logic variables the wrong size. A complete nightmare, having to go through the whole of the original definitions and refit them to MS C's idea of what they should be.

 

Do what you like if (a) you're using the same compiler at both ends; (b) you're talking to the same processor, or at least one with compatible field sizes; and (c) if you don't care about talking to anything else. Otherwise, stick to the standards. After all, there are so many to choose from.
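One compiler-independent alternative to transmitting a packed struct is to write each field byte by byte in a fixed order. The sketch below assumes a made-up header (`msg_header_t` with `type` and `length`) and a big-endian wire order; the real message format from the job described above is not shown in the post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header; the actual message layout is not given in the thread. */
typedef struct {
    uint8_t  type;
    uint16_t length;
} msg_header_t;

enum { MSG_HEADER_WIRE_SIZE = 3 };

/* Write the header as type, length MSB, length LSB: no padding, fixed order. */
static size_t header_encode(const msg_header_t *h, uint8_t *buf)
{
    assert(h != NULL && buf != NULL);
    buf[0] = h->type;
    buf[1] = (uint8_t)(h->length >> 8);
    buf[2] = (uint8_t)(h->length & 0xFFu);
    return MSG_HEADER_WIRE_SIZE;
}

/* Read the same three bytes back, independent of struct packing or endianness. */
static void header_decode(const uint8_t *buf, msg_header_t *h)
{
    assert(h != NULL && buf != NULL);
    h->type   = buf[0];
    h->length = (uint16_t)(((uint16_t)buf[1] << 8) | buf[2]);
}
```

The in-memory struct can then be padded or ordered however each compiler likes, because only the byte buffer crosses the wire.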

 

The point about undefined behaviour is this: even if it's defined somewhere - and it must be, for the compiler to work - it is by no means guaranteed to work the same across different compilers, different versions of the same compiler, or different target processors.

 

Neil


barnacle wrote:

The point about undefined behaviour is this: even if it's defined somewhere - and it must be, for the compiler to work - it is by no means guaranteed to work the same across different compilers, different versions of the same compiler, or different target processors.

 

... or different optimization levels of the same compiler.

 


IIRC gcc does document its behavior regarding type-punning.

If not, I retract my recommendation.

memcpy might be good.

 

BTW behavior could be undefined by the compiler.

Undefined behavior could result in the compiler using an uninitialized variable or even a hardware random number generator.

I'm not recommending it, but compiler writers could be real mean if they wanted to.


clawson wrote:

AndreyT wrote:

I.e. for some reason some people believe that compilers always somehow "define" undefined behavior. This is, of course, not true.

 

Are you suggesting they generate non-deterministic code? For any scenario they must do something and it's pretty likely it'll be the same something every time you run that build of the compiler.

 

Firstly, no, the compiler is not really required to generate code. The compiler is free to flat out refuse to compile undefined code, aborting the compilation process with a diagnostic message.

 

Secondly, generating the same "something" every time does not amount to defined behavior. For one, defined behavior has to be consistent from one site where a language feature is used to another. With undefined code there's no such consistency.

 

Yes, the compiler will most of the time generate "something", but in 99 cases out of 100 that "something" is a bizarre combination of unrelated bits and pieces that somehow combined into some meaningless "Franken-something". And each instance of that "Franken-something" is generally different, even if the code that produced it appears the same. Changing something else in a seemingly unrelated portion of the code might easily affect what the compiler generates for undefined code in other places.

 

 

 

Last Edited: Thu. Jan 2, 2020 - 09:14 AM

clawson wrote:
Are you suggesting they generate non-deterministic code?

We're talking undefined behaviour - not (necessarily) undefined code.

 

E.g., the value of an uninitialised auto variable is "undefined" - usually, it's just whatever value happens to be left in the memory location from whatever used it before.

 

For any scenario they must do something and it's pretty likely it'll be the same something every time you run that build of the compiler.

So long as it's the exact same build of the compiler with the exact same source code and exact same options.

 

But that doesn't mean that one "snippet" of source code will always be translated identically wherever it occurs; context can matter - especially at higher optimisation levels.

 

Add to that modern linkers also bringing in their own optimisations.

 

I guess, strictly, one could analyse the complete system and work out what would happen - so it is "deterministic" in that sense - but it's not necessarily going to be clear from examining the source code.


Some compilers, e.g. gcc, sometimes just assume that undefined code is unreachable. IIRC it warns about the undefined code, but does not warn that the code has been optimized away.

for(int j=0; j>=0; ++j) { ... }  may get a warning and have its j>=0 test optimized away (the compiler may assume j never overflows), because int arithmetic overflow is undefined.
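One way to write such a loop without relying on overflow is to test the bound before incrementing. The function below is a hypothetical sketch; `max_steps` is there only so that the example terminates quickly when called.

```c
#include <assert.h>
#include <limits.h>

/* Counts how many values j in [start, INT_MAX] are visited, stopping after
   max_steps. The bound is tested before incrementing, so j never overflows. */
static unsigned count_up_to_int_max(int start, unsigned max_steps)
{
    unsigned steps = 0;
    int j = start;
    assert(start >= 0);
    while (steps < max_steps) {
        ++steps;                 /* visit j */
        if (j == INT_MAX) {
            break;               /* stop instead of computing INT_MAX + 1 */
        }
        ++j;
    }
    return steps;
}
```

Since `j` is never incremented past `INT_MAX`, no undefined behaviour occurs and the compiler has nothing to assume away.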


If you don't actually need to copy, you can do something like the code below. In all the mapped examples above (memcpy, the union, and the one below) the endianness is the natural one of the platform (or of the source of the data, if it came in from an external source).

char *c16;
uint16_t ui16 = 0xdead;
c16 = (char *)&ui16;
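A slightly more defensive variant of the same idea, as a sketch: it views the value through `const unsigned char *`, which avoids the signedness of plain `char`. The function name `byte_at` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns byte number index (0 or 1) of *value as it is laid out in memory.
   Accessing an object through unsigned char * is permitted by the C aliasing
   rules; the order of the bytes is the platform's own. */
static uint8_t byte_at(const uint16_t *value, unsigned index)
{
    const unsigned char *p = (const unsigned char *)value;
    assert(value != NULL && index < sizeof *value);
    return p[index];
}
```

No copy is made, so the bytes always reflect the current contents of the variable being viewed.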

Writing code is like having sex.... make one little mistake, and you're supporting it for life.

Last Edited: Fri. Jan 3, 2020 - 07:46 PM