C-language Cast & Sign Extension?


Greetings -

 

I have carefully scoured K&R and cannot find anything that explains what happens when one "down-casts" from an int16 or uint16 to a corresponding int8 or uint8. I was quite surprised not to find anything! Similarly for int32 to int16.

 

Here is the situation:

 

I have an int32_t on which I have done several operations (specifically ">>16") that guarantees that the result is a 16-bit integer. Now, I want to use it as an int16_t. 

 

The question:

 

Which half of the larger type is used for the smaller type? I have always assumed that it is the low half, but just realized that I have no basis for that assumption.

 

I've also looked at a list file but, at the moment, it seems pretty incomprehensible.

 

In a similar vein, where can I find the rules for sign-extension? K&R uses the term, but I cannot find a definition for it or what the conditions are under which it happens.

 

Thanks for your help,

Jim

Jim Wagner Oregon Research Electronics, Consulting Div. Tangent, OR, USA http://www.orelectronics.net

Last Edited: Sun. Aug 11, 2019 - 06:24 AM

It is your responsibility. C will silently assign a wide expression to a narrow variable, e.g. assigning a uint32_t expression to an int8_t variable is silent truncation.
The int8_t will have the sign that happened to be in bit 7 of the unsigned expression.
A compiler will tell you if you assign a char from a pointer.
.
The word to search for is truncate or truncation
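A minimal sketch of the silent truncation described here, for anyone who wants to try it. The helper names are made up for illustration. Converting to uint8_t is fully defined by the standard; converting an out-of-range value to int8_t is implementation-defined, and the comments assume GCC's documented behaviour (wrap modulo 2^8).

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 8 bits of a 32-bit unsigned value.
   Conversion to an unsigned type is fully defined (value modulo 256). */
uint8_t low_byte(uint32_t wide)
{
    return (uint8_t)wide;
}

/* Hypothetical helper: the same low 8 bits, read as a signed value.
   uint8_t -> int8_t is implementation-defined above 127; with GCC the
   value wraps, so bit 7 of the wide value ends up as the sign. */
int8_t low_byte_signed(uint32_t wide)
{
    return (int8_t)(uint8_t)wide;
}
```

With GCC, low_byte_signed(0x12345680u) comes out negative because bit 7 of the wide value is set, even though the wide value itself is unsigned.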
.
It is unwise (tm) to use a signed variable for individual bit fields.
You will need to manage bit31 i.e. the sign bit if you want to transfer it to bit15 of an int16_t
.
David.

Last Edited: Sun. Aug 11, 2019 - 06:57 AM

That does not quite answer the question. 

 

Certainly, there must be some rules about which bytes from the wider type are used for the narrower type. For a cast from a 4-byte type to a 2-byte type, I assume that the narrow one is not taken from the middle two bytes. It also seems logical that it is not taken from the highest and lowest bytes. That leaves the two high bytes as one possibility and the two low bytes as another. K&R does not tell me which.

 

On the question of sign extension, the question really applies to doing a right shift of a signed value. If, let's say, I do 8 right shifts on a 16-bit value, does the low byte of the new value have the same sign as the original 16-bit value? What happens if I specify a divide operator rather than a shift? I've been told that the compiler will substitute right shifts for divisions when it can; is the sign maintained in this case?

 

Thanks,

Jim

 



The rules are clearly defined.
Truncation
Promotion
Sign extension
.
It is UNWISE to truncate or shift a signed expression.
The rules for promotion are intuitive. C will promote silently and with no ill effects.
.
In answer to your question. Truncation.
.
David.


If you want a practical solution:

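int32_t signed_variable;                 // your unwise type
uint32_t x = (uint32_t)signed_variable;  // same bit pattern, now unsigned

uint16_t top = x >> 16;                  // bits 31..16 of x
uint16_t bot = x;                        // bits 15..0 of x (truncation)
int16_t signed_top = (int16_t)top;       // bit 15 of top becomes the sign bit
int16_t signed_bot = (int16_t)bot;       // bit 15 of bot becomes the sign bit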

The behaviour of signed_top is what you expected, i.e. bit 15 of signed_top is a copy of bit 31 (the sign bit) of signed_variable.

 

Note that bit15 of bot is the same as bit15 of x.

However bit15 becomes the sign bit when cast into signed_bot.
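Here is the same splitting wrapped into two functions so it can be tried with real numbers. The function names are only for illustration. The shift and the truncation to uint16_t are fully defined; the final cast to int16_t is implementation-defined when bit 15 is set, and this sketch assumes the usual two's complement wrap that GCC documents.

```c
#include <stdint.h>

/* Bits 31..16 of v, returned as a signed 16-bit value. */
int16_t high_half(int32_t v)
{
    uint32_t x = (uint32_t)v;          /* defined: same bit pattern */
    uint16_t top = (uint16_t)(x >> 16);
    return (int16_t)top;               /* bit 15 of top becomes the sign */
}

/* Bits 15..0 of v, returned as a signed 16-bit value. */
int16_t low_half(int32_t v)
{
    uint32_t x = (uint32_t)v;
    uint16_t bot = (uint16_t)x;        /* defined: value modulo 65536 */
    return (int16_t)bot;               /* bit 15 of bot becomes the sign */
}
```

For example, 98304 is 0x00018000: a positive 32-bit value whose low half has bit 15 set, so low_half() returns a negative number while high_half() returns 1.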

 

I imagine that your head is beginning to hurt by now.

 

It is WISE to use unsigned variables for any homegrown bitfield storage.

 

It is WISER to use regular C bit fields as described in K & R.

Note that individual fields must be unsigned.

 

Don't worry about my excess use of fresh variables in my example.    The Compiler will optimise intelligently.

The important point is: all shifts and truncations on unsigned values are known, defined and intuitive.

 

David.

Last Edited: Sun. Aug 11, 2019 - 08:30 AM

The C standard doesn't use the word truncation; it describes it like this (from the C99 draft), but the end result is truncation.

 

6.3.1.2 Boolean type
1 When any scalar value is converted to _Bool, the result is 0 if the value compares equal to 0; otherwise, the result is 1.

6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

 

Rule 2 (highlighted in the original post) is a slightly long-winded way of specifying truncation.

A uint16_t value of 1000 (0x03e8)  converted to uint8_t gives 0xe8 (232).

Assuming 2's complement, an int16_t of -1000 (0xfc18) converted to uint8_t gives 0x18 (24).

So, truncation.

You will notice the result is always well defined if the destination type is unsigned.

If the destination type " is signed and the value cannot be represented in it;"

the result according to the standard is implementation-defined.

In practice, however, for a 2's complement system, what you will (almost certainly) get in this case is again truncation, e.g.

a uint16_t of 1000 (0x03e8) converted to int8_t will give 0xe8 (-24).
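The three worked conversions can be checked with a few one-line functions. The names are invented for this sketch. The first two results follow directly from rule 2 of the standard text quoted above; the third is implementation-defined (rule 3), and the value given for it assumes GCC, which documents reduction modulo 2^8.

```c
#include <stdint.h>

/* Unsigned to narrower unsigned: defined, value modulo 256. */
uint8_t u16_to_u8(uint16_t v)
{
    return (uint8_t)v;
}

/* Signed to narrower unsigned: also defined, value modulo 256. */
uint8_t i16_to_u8(int16_t v)
{
    return (uint8_t)v;
}

/* Unsigned to narrower signed: implementation-defined when the value
   does not fit in int8_t. GCC keeps the low 8 bits. */
int8_t u16_to_i8(uint16_t v)
{
    return (int8_t)v;
}
```

u16_to_u8(1000) gives 232 and i16_to_u8(-1000) gives 24 on any conforming compiler; u16_to_i8(1000) gives -24 with GCC.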

 

ka7ehk wrote:
What happens if I specify a divide operator rather than a shift? I've been told that the compiler will substitute right shifts for divisions, when it can; is the sign maintained in this case?

The sign will be maintained for a divide operation, assuming the operation is carried out as a signed operation (this follows the usual rules) eg.

int n;

int d;

then n/d will be performed as type int, i.e. signed

unsigned int n;

int d;

then n/d will be performed as unsigned int
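A small sketch of those two cases, with made-up function names. In the mixed case the int operand is converted to unsigned int before the divide (the usual arithmetic conversions), so a negative divisor turns into a very large positive one.

```c
/* Both operands are int: the division is signed. */
int div_signed(int n, int d)
{
    return n / d;
}

/* unsigned int divided by int: d is converted to unsigned int first,
   so the division is carried out as unsigned. */
unsigned int div_mixed(unsigned int n, int d)
{
    return n / d;
}
```

div_signed(10, -2) is -5, but div_mixed(10u, -2) is 0, because -2 converted to unsigned int is close to UINT_MAX and 10 divided by that is 0.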

 

If the compiler optimises the divide, as it usually can if it is dividing by a power of 2, you are guaranteed that the end result will be the correct result for a divide, which includes rounding the correct way for a divide

e.g. -1 / 2 written as a divide, however the compiler optimises it, will give the result 0, not -1.
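To make the difference between a divide and a hand-written shift concrete, here is a short sketch with illustrative function names. In C99 and later, signed division truncates toward zero. Right-shifting a negative value is implementation-defined; the comments assume GCC, which documents an arithmetic (sign-extending) shift, and that rounds toward minus infinity instead.

```c
#include <stdint.h>

/* Signed divide: C99 and later round toward zero. */
int16_t halve_by_division(int16_t v)
{
    return (int16_t)(v / 2);
}

/* Signed right shift: implementation-defined for negative v.
   With GCC the sign bit is copied in, which rounds toward minus infinity. */
int16_t halve_by_shift(int16_t v)
{
    return (int16_t)(v >> 1);
}
```

So with GCC, halve_by_division(-7) is -3 while halve_by_shift(-7) is -4. The sign survives in both cases, but the results differ by one for negative odd values.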

 

EDIT: Just to be clear, these rules are conceptual. When it talks of "repeatedly adding or subtracting...", the compiler of course doesn't literally do this; it just has to give a result that is the same AS IF it had literally followed those steps.

 

 


To look at it another way - in Jim's original question, he is dealing with a guaranteed 16-bit signed value in a 32-bit signed variable. Thus, all the top 16 bits (of the 32) have the same value as bit 15 - 0 if the result is positive and 1 if it's negative. The result is in the range -32k to +32k. A cast from 32 to 16 bit *signed integers* will therefore always have the correct sign no matter how the cast is internally performed. A truncation will have the same correct result.
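That observation can be checked directly. The sketch below uses two invented helpers: one tests the range, the other tests that bits 31 down to 15 are all identical. It relies on int32_t being a two's complement type, which the standard requires for the exact-width types.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when v is representable in int16_t. */
bool fits_int16(int32_t v)
{
    return v >= INT16_MIN && v <= INT16_MAX;
}

/* True when bits 31..15 of v are all 0 or all 1, i.e. the upper
   half is nothing but copies of bit 15 (sign extension). */
bool upper_bits_match_bit15(int32_t v)
{
    uint32_t top17 = (uint32_t)v >> 15;   /* bits 31..15 as a 17-bit field */
    return top17 == 0u || top17 == 0x1FFFFu;
}
```

The two tests agree for every input. When they are true, the cast (int16_t)v falls under rule 1 of the standard's conversion rules (the value is representable, so it is unchanged) and nothing implementation-defined is involved.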

 

If however, Jim were to have a signed value greater than 127 or less than -128 in the 16 bit signed integer, bit 7 *does not necessarily* have the same value as the sign bit; a truncating cast may change sign. (If it's within the eight bit signed range, of course, he's fine).

 

So the short answer is - caveat emptor. If you can guarantee that the result fits in the target variable, and it and the uncast version are both signed or both unsigned, then you have no issues. If the initial variable is too big to fit into 8 bits, then it's too big and the answer is automatically wrong.
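One way to act on "guarantee that the result fits" is to check the range before narrowing. This is only a sketch; narrow_checked and narrow_unchecked are hypothetical names. The unchecked version shows the sign change mentioned for values outside -128..127, assuming GCC's wrap-around behaviour for the implementation-defined conversion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Narrow only if the value fits; report failure otherwise. */
bool narrow_checked(int16_t wide, int8_t *out)
{
    if (wide < INT8_MIN || wide > INT8_MAX) {
        return false;              /* too big: leave *out untouched */
    }
    *out = (int8_t)wide;           /* value is representable, so unchanged */
    return true;
}

/* Plain cast: implementation-defined when the value does not fit.
   With GCC the low 8 bits are kept, so the sign can flip. */
int8_t narrow_unchecked(int16_t wide)
{
    return (int8_t)wide;
}
```

With GCC, narrow_unchecked(200) returns -56: bit 7 of 200 (0xC8) is set, so the positive input becomes negative.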

 

If you want to change variable type as well as width, it may be safest to do a double convert: e.g. unsigned 8 from signed sixteen requires  = (uint8_t)(uint16_t)signed_16_variable
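A quick sketch of the double convert next to a single cast, using invented function names. Both steps of the double convert are conversions to unsigned types, so each one is covered by the standard's modulo rule and nothing here is implementation-defined.

```c
#include <stdint.h>

/* Two steps: first change signedness, then width. */
uint8_t via_uint16(int16_t v)
{
    return (uint8_t)(uint16_t)v;
}

/* One step: signed 16-bit straight to unsigned 8-bit. */
uint8_t direct(int16_t v)
{
    return (uint8_t)v;
}
```

For signed-to-unsigned narrowing the two forms give the same value (for -1000 both give 0x18), so the two-step version mostly serves to spell out the intent. The caution matters more when the destination is signed.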

 

I *hate* C's insistence on silent promotion and demotion; there are all sorts of potential nasties hidden, waiting to bite you. In particular, I dislike the construct

if (true == boolean_expression)

e.g. where a called function returns a boolean value, or as a direct observation of a boolean variable. Instead of

if (boolean_expression)

which matches the types correctly, in the first expression both true and the boolean expression are promoted to int, then compared, and the result of the comparison is itself an int. OK, most compilers I suspect will optimise all that out, but I'm suspicious that boolean_expression might have values other than 1 and 0. </rant>
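The suspicion is easy to demonstrate when the "boolean" is really an int. In this sketch is_bit_set is a hypothetical legacy-style function that returns nonzero for yes rather than strictly 1. A genuine bool (_Bool) can only hold 0 or 1, so the trap applies to int-typed flags.

```c
#include <stdbool.h>

/* Hypothetical legacy-style test: returns the masked bit, so the
   "true" result is nonzero but not necessarily 1. */
int is_bit_set(unsigned int value, unsigned int bit)
{
    return (int)(value & (1u << bit));
}

/* Compares against true, which is the integer 1. */
bool test_by_comparison(int flag)
{
    return flag == true;
}

/* Tests for nonzero, which is what C means by "true" in a condition. */
bool test_by_truth(int flag)
{
    return flag ? true : false;
}
```

is_bit_set(0x04u, 2u) returns 4. That is true as a condition, but 4 == true is false, so the comparison form silently takes the wrong branch.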

 

Neil


I don't understand the reference to "bitfields". How do they apply to this question?

 

I am averaging some 16-bit signed ints by summing a series of values into an int32_t. The average block size is always a "convenient" number such as 4 or 8 or 16. I was getting warnings from attempts to do

#define BASE 8
int32_t avg_sum;
int16_t avg;
....

avg = (int16_t) avg_sum/BASE;

The warning was something like "invalid mix of signed and unsigned" in the last line.  See: https://www.avrfreaks.net/commen...

 

The suggestion in that thread was to replace the division by shift to avoid the warning. The question at this point is whether the sign is being properly propagated through the steps in that last line when the shift is used instead of a division operation.

 

Thanks

Jim



You were getting those warnings because you had defined BASE with U on the end, making it unsigned.

So you had mixture of avg_sum (signed) and BASE (unsigned).

Without the U this is fine, assuming after dividing by BASE the result does indeed fit in int16_t.

The division will be signed.

#define BASE 8
int32_t avg_sum;
int16_t avg;
....

avg = avg_sum / BASE;
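To show why mixing signed and unsigned in the divide matters, here is a sketch with invented function names. Whether avg_sum / 8U is actually carried out as unsigned depends on how int32_t relates to unsigned int on the target, so the second function uses an explicit cast to force the unsigned case and make the effect visible everywhere.

```c
#include <stdint.h>

/* Signed divide: what you want for an average of signed samples. */
int32_t avg_signed(int32_t sum)
{
    return sum / 8;
}

/* Unsigned divide, forced with a cast for illustration: a negative
   sum is first converted modulo 2^32 and only then divided. */
uint32_t avg_unsigned(int32_t sum)
{
    return (uint32_t)sum / 8u;
}
```

For a sum of -80, avg_signed gives -10, while avg_unsigned gives 536870902, which is nothing like an average.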

If you have an overly zealous warning about assigning the result of the divide (which will be int32_t but you know it fits in the range of int16_t) to an int16_t, you can shut it up with cast (note where the brackets are)

avg = (int16_t)(avg_sum / BASE);

 


Thanks

 

Jim



Sorry, Jim.   I assumed that you were using an int32_t to store several items of data.   e.g. an int16_t in bits 0-15 and another int16_t in bits 16-31.

You can just say:

#define BASE 8
int32_t avg_sum;
int16_t avg;
....

avg = avg_sum / BASE;
avg = (int16_t) (avg_sum / BASE);

My two examples will work ok with avg_sum having up to 18 bits of data (+ sign)

Your statement casts avg_sum to int16_t before division.    Which means you are limited to 15 bits (+ sign) i.e. there was no point in ever using an int32_t in the first place.
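The effect of the cast position shows up as soon as avg_sum no longer fits in 16 bits. A sketch with illustrative function names follows; the out-of-range cast in the first function is implementation-defined, and the number quoted for it assumes GCC's wrap modulo 2^16.

```c
#include <stdint.h>

#define BASE 8

/* Cast binds tighter than division: avg_sum is narrowed FIRST. */
int16_t cast_then_divide(int32_t avg_sum)
{
    return (int16_t)avg_sum / BASE;
}

/* Parentheses force the 32-bit division first, then the narrowing. */
int16_t divide_then_cast(int32_t avg_sum)
{
    return (int16_t)(avg_sum / BASE);
}
```

With avg_sum = 80000, divide_then_cast gives the expected 10000. With GCC, cast_then_divide gives 1808, because 80000 wraps to 14464 before the divide. For small sums such as 800 both give 100, which is why the bug can hide during testing.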

 

God gave you parentheses.   Use them.    It makes expressions understood by mere humans.

Yes,  you can save wear on your typing finger by avoiding the space key, (, ) ...

But this means you have to understand every rule for operator precedence.

 

Now that you have shown what you want to do,  it is perfectly good to use signed variables.  The sign is preserved in division.
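Putting it together, a minimal sketch of the averaging described in this thread. The function name and the fixed block of BASE samples are assumptions for illustration, not Jim's actual code. The sum is kept in an int32_t, the divide is signed, and the cast is applied to the quotient.

```c
#include <stddef.h>
#include <stdint.h>

#define BASE 8

/* Average one block of BASE signed 16-bit samples. */
int16_t average_block(const int16_t samples[BASE])
{
    int32_t avg_sum = 0;

    for (size_t i = 0; i < BASE; ++i) {
        avg_sum += samples[i];     /* each sample is widened to 32 bits */
    }

    /* Signed divide first, then narrow. The average of int16_t values
       always fits in int16_t, so the cast does not change the value. */
    return (int16_t)(avg_sum / BASE);
}
```

Because the quotient is always within the int16_t range, the final conversion never hits the implementation-defined case, and negative averages keep their sign.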

 

David.

 

Edit.  You are dividing by 8.  I had mis-read as dividing by 256.   I have corrected the number of data bits.

Last Edited: Sun. Aug 11, 2019 - 04:47 PM

Thanks for that catch on the parentheses. And, the other comments.

 

I hope that all of you understand how much I appreciate your tutoring and critical evaluation of my question-posts. I've tried hard to learn and it does not flow easily from books. And, even being next door to a medium-power engineering university, live instruction is not easy to come by.

 

Best wishes

Jim



I always look at K & R.   i.e. the original paper book.  

I ought to buy the 2nd edition one day.

 

I am never sure about every rule of precedence.    But I can just use regular parentheses to make my intentions clear to the Compiler (and me)

 

As a general rule.    Simple statements are easy to follow.   The Compiler will optimise them into efficient code.   (without you needing to worry)

 

David.