## Detect Overflow In C


Hi Everyone,

In assembly when you add a number to another number, you have the carry flag to detect if the add resulted in an overflow.

In C, what is the best/most optimal way to do the same?

Let's say we have two:

uint8_t c1,c2;

And then we add them:

c1+=c2;

One way is to use a larger variable, or count on C taking these uint8_t up to a uint16_t:

if (c1+c2>=256)

overflow=1;

else overflow=0;

c1+=c2;

That seems less than optimal and what if c1/c2 are scaled to 16 or 32 bits...

What thoughts do you guys have?

This topic has a solution.
Last Edited: Wed. Aug 29, 2018 - 12:17 PM

if (c1 + c2 >= 256) won't work because the answer will be constrained to 8 bits and that can never be larger than 255.

Just cast it to 16 bits. As explained to me recently, it is all done fast in r-registers and the overhead is quite low.

Jim

Until Black Lives Matter, we do not have "All Lives Matter"!

Last Edited: Tue. Aug 28, 2018 - 11:51 PM

I could be wrong, but I think it will scale the c1+c2 to a "unsigned int" in this case for gcc (uint16_t).

Try it and see - examine the generated code. Personally, I'd be casting the variables to make the intention explicit.

Regarding Carry - C has no knowledge of this, so to emulate the Carry, you'd have to choose a larger variable size and test the required bit.
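A minimal sketch of the "larger variable, test the required bit" idea, in case it helps. The helper name `add_u8_widened` is made up for illustration; the point is only that bit 8 of a 16-bit sum behaves like the carry flag of an 8-bit add.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: add two bytes in a 16-bit temporary and
 * report bit 8 of the result as the carry. */
bool add_u8_widened(uint8_t a, uint8_t b, uint8_t *sum)
{
    uint16_t wide = (uint16_t)((uint16_t)a + (uint16_t)b); /* 0..510 fits in 16 bits */
    *sum = (uint8_t)wide;          /* low byte is the wrapped 8-bit sum */
    return (wide & 0x0100u) != 0;  /* bit 8 plays the role of the carry flag */
}
```

The explicit casts are there to state the intent; the generated code may well be the same without them.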

I think that if the result is smaller than the larger of c1 or c2, you know you have had an overflow (for unsigned values).

EDIT: I think the actual condition is smaller than either c1 or c2, but I'd have to think about that a bit. That makes the test easier since if the result is smaller than the first c, you don't have to test against the second c.
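For what it's worth, the reasoning seems to hold: without a wrap the sum is at least as large as each operand, and with a wrap the sum is a + b - 256, which is below each operand because the other one is at most 255. A brute-force check over all 8-bit pairs, as a sketch (the function name `compare_tests_agree` is invented here):

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if, for every pair of 8-bit operands, the three tests
 * "true sum exceeds 255", "wrapped sum < a" and "wrapped sum < b" agree. */
bool compare_tests_agree(void)
{
    for (unsigned a = 0; a <= UINT8_MAX; a++) {
        for (unsigned b = 0; b <= UINT8_MAX; b++) {
            uint8_t sum = (uint8_t)(a + b);       /* wraps modulo 256 */
            bool overflow = (a + b) > UINT8_MAX;  /* reference answer */
            if ((sum < a) != overflow || (sum < b) != overflow) {
                return false;
            }
        }
    }
    return true;
}
```

So comparing against either operand should be enough; there is no need to pick the larger one first.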

Last Edited: Wed. Aug 29, 2018 - 02:51 AM

I think that when you have C evaluate an expression, it is going to be dealt with as an "int" by default unless one or the other items in the expression push it larger to "long" (constant with an L on it, long variable, etc.).

alank2 wrote:
I think that when you have C evaluate an expression, it is going to be dealt with as an "int" by default unless one or the other items in the expression push it larger to "long" (constant with an L on it, long variable, etc.).
Correct for recent standards.

At one time, unsigned char would promote to unsigned int.

That was annoying enough that it got changed.
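A small sketch of what the promotion rule means for the original test, assuming a compiler that follows the current standard. The function names are made up; the first one shows that `c1 + c2 >= 256` is evaluated on the promoted `int` values, the second that truncation only happens when the result is converted back to `uint8_t`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Both operands are promoted to int before the addition, so the
 * comparison sees the full sum (0..510), not an 8-bit result. */
bool sum_exceeds_u8(uint8_t c1, uint8_t c2)
{
    return (c1 + c2) >= 256;
}

/* Truncation happens only when the int result is converted back to uint8_t. */
uint8_t wrapped_sum_u8(uint8_t c1, uint8_t c2)
{
    return (uint8_t)(c1 + c2);
}
```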

That said, using an int implies that the compiler cannot optimise away a register use.

Comparing the result with an operand is faster and requires fewer registers.

Moderation in all things. -- ancient proverb

Maybe this is helpful:

Too bad you can't simply look at the carry flag (but it is busy being clobbered anyhow by everything else that is happening before you could look at it)

http://www.fefe.de/intof.html

You could possibly reserve the msb for use as your own carry bit... then your variables only have half the normal working range (but that might be OK for many things).

So for an 8-bit value, a result >= 128 would be considered overflowed. Why not just design the program so overflow can never occur, given the range of the inputs?
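A rough sketch of the reserved-msb idea, assuming both inputs really are kept within 0..127 (the helper `add_u7` is hypothetical and does not check that):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: operands are assumed to be in 0..127.
 * The sum (0..254) always fits in a byte, so bit 7 acts as a carry-out. */
bool add_u7(uint8_t a, uint8_t b, uint8_t *sum)
{
    uint8_t raw = (uint8_t)(a + b);   /* cannot wrap for 7-bit inputs */
    *sum = raw & 0x7Fu;               /* keep the 7-bit result */
    return (raw & 0x80u) != 0;        /* bit 7 set: 7-bit range exceeded */
}
```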

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

This reply has been marked as the solution.
```
c1 += c2;
if (c1 < c2) {
    // overflow
}
```

Stefan Ernst
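Since the original question also asked about 16 and 32 bits, here is a sketch of the same comparison trick at those widths. The function names are invented. One assumption worth flagging: where `int` is wider than 16 bits, `a + b` on two `uint16_t` values is computed as `int` and does not wrap by itself, so the result has to be converted back to `uint16_t` before comparing.

```c
#include <stdbool.h>
#include <stdint.h>

/* The cast back to uint16_t matters where int is wider than 16 bits:
 * there a + b is computed as int and would not wrap on its own. */
bool add_overflows_u16(uint16_t a, uint16_t b, uint16_t *sum)
{
    *sum = (uint16_t)(a + b);
    return *sum < b;
}

bool add_overflows_u32(uint32_t a, uint32_t b, uint32_t *sum)
{
    *sum = a + b;   /* stored result is the sum modulo 2^32 */
    return *sum < b;
}
```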

sternst wrote:

```
c1 += c2;
if (c1 < c2) {
    // overflow
}
```

Neat. When compiled, as long as c1 and c2 are in registers, you get two instructions plus a conditional branch.

#1 Hardware Problem? https://www.avrfreaks.net/forum/...

#2 Hardware Problem? Read AVR042.

#3 All grounds are not created equal

#4 Have you proved your chip is running at xxMHz?

#5 "If you think you need floating point to solve the problem then you don't understand the problem. If you really do need floating point then you have a problem you do not understand."

@alank2: It would be helpful if you would change the subject to something that actually describes the topic of discussion; eg, "Detect Overflow In C" - this would make it easier to find again in future, as it's a topic that does come up from time-to-time.

EDIT

Done - see #17

Top Tips:

1. How to properly post source code - see: https://www.avrfreaks.net/comment... - also how to properly include images/pictures
2. "Garbage" characters on a serial terminal are (almost?) invariably due to wrong baud rate - see: https://learn.sparkfun.com/tutorials/serial-communication
3. Wrong baud rate is usually due to not running at the speed you thought; check by blinking a LED to see if you get the speed you expected
4. Difference between a crystal, and a crystal oscillator: https://www.avrfreaks.net/comment...
5. When your question is resolved, mark the solution: https://www.avrfreaks.net/comment...
6. Beginner's "Getting Started" tips: https://www.avrfreaks.net/comment...
Last Edited: Thu. Aug 30, 2018 - 07:46 AM

```
c1+=c2;
if (SREG.SREG_C) {
    //OVERFLOW
}
```

...compiles OK with Codevision.

In GCC that would be:

`if (SREG & (1 << SREG_C)) {`

HOWEVER the C language gives no guarantee about code sequencing. In the presented code it might suit the optimising code generator to actually do the SREG test before the addition !

Having said that a quick test with:

```
#include <avr/io.h>

uint8_t c1, c2;

int main(void) {
    while(1) {
        c1 += c2;
        if (SREG & (1 << SREG_C)) {
            PORTB = 0xFF;
        }
    }
}
```

generates the "right code" on this occasion:

```
uint8_t c1, c2;

int main(void) {
while(1) {
c1 += c2;
7e:   20 91 60 00     lds     r18, 0x0060     ; 0x800060 <_edata>
82:   80 91 61 00     lds     r24, 0x0061     ; 0x800061 <c1>
86:   82 0f           add     r24, r18
88:   80 93 61 00     sts     0x0061, r24     ; 0x800061 <c1>
if (SREG & (1 << SREG_C)) {
8c:   0f b6           in      r0, 0x3f        ; 63
8e:   00 fe           sbrs    r0, 0
90:   f8 cf           rjmp    .-16            ; 0x82 <main+0x6>
PORTB = 0xFF;
92:   98 bb           out     0x18, r25       ; 24
94:   f4 cf           rjmp    .-24            ; 0x7e <main+0x2>
```

But, it would be VERY unwise to rely on this kind of thing.
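If the toolchain is GCC 5 or later (or Clang), one alternative that does not depend on reading SREG at the right moment is the `__builtin_add_overflow` builtin. This is only a sketch under that assumption; the wrapper name `add_u8_builtin` is made up, and I have not checked what code avr-gcc generates for it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumes GCC 5+ or Clang: __builtin_add_overflow stores the wrapped
 * result in *sum and returns true if the exact sum did not fit in it. */
bool add_u8_builtin(uint8_t a, uint8_t b, uint8_t *sum)
{
    return __builtin_add_overflow(a, b, sum);
}
```

Here the compiler itself ties the overflow result to the addition, so reordering is not a concern.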

Last Edited: Wed. Aug 29, 2018 - 09:46 AM

clawson wrote:

HOWEVER the C language gives no guarantee about code sequencing. In the presented code it might suit the optimising code generator to actually do the SREG test before the addition !

There are a couple of posts on Stack Overflow that say that you can selectively turn off optimisation (on some GCC versions) like this...

```
#pragma GCC push_options
#pragma GCC optimize ("O0")

/* functions to be compiled without optimisation go here */

#pragma GCC pop_options
```

Not having GCC installed I can't try it.

But that is a terrible solution!

The true solution here is probably to start by picking variable widths that will not overflow in the first place.

Last Edited: Wed. Aug 29, 2018 - 10:08 AM

clawson wrote:

But that is a terrible solution!

I was wondering whether it would work if applied just around the addition and the carry test. The unoptimised code for those two lines should be pretty close to optimal anyway, so as long as it prevents the order changing, all should be good(?).

Last Edited: Wed. Aug 29, 2018 - 10:09 AM

awneil wrote:

@alank2: It would be helpful if you would change the subject to something that actually describes the topic of discussion; eg, "Detect Overflow In C" - this would make it easier to find again in future, as it's a topic that does come up from time-to-time.

Done!  I like sternst's solution, that was what I was looking for.  I've done it the way of using a larger variable to hold the result, but sometimes a larger variable isn't available and it is less efficient when you know in assembly that it is so simple!

I wonder why the creator of C didn't include or emulate a status register (such as one containing an overflow flag).  Your code would explicitly clear it, then you could later check for any overflow since your last check.  I suppose the issue of automatic code reordering would greatly blur its usefulness (or lead to a lot of false conclusions).
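Something along these lines can be emulated in plain C, at some cost per addition. A sketch with invented names (`overflow_seen`, `sticky_add_u8`, `take_overflow_flag`): the flag is set by the addition helper and only cleared when it is read.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software "status register": set on overflow, never
 * cleared by the additions themselves. */
static bool overflow_seen = false;

uint8_t sticky_add_u8(uint8_t a, uint8_t b)
{
    uint8_t sum = (uint8_t)(a + b);
    if (sum < b) {
        overflow_seen = true;   /* remember the wrap until explicitly cleared */
    }
    return sum;
}

/* Report whether any addition overflowed since the last call, then clear. */
bool take_overflow_flag(void)
{
    bool seen = overflow_seen;
    overflow_seen = false;
    return seen;
}
```

Because the flag is an ordinary C object, the compiler has to keep its updates consistent with the additions, which is exactly what a hardware flag cannot promise.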

Probably for hardware abstraction I would think.

But if you use a comma operator like this:

```
uint8_t c1, c2;
int main () {
    while(1) {
        if ( (c1 += c2) , (SREG & (1 << SREG_C)) ) {
            PORTB = 0xFF;
        }
    }
}
```

C is forced to respect the order in this case, right?

There is still no guarantee that the carry flag (or any other flag in SREG) will reflect the previous expression.  The comma creates a sequence point, wherein all side effects of the first expression are complete before the second expression is evaluated, and none of the side effects of the second expression will have occurred until after the first expression has been evaluated.  Since the hardware flags in SREG are not part of the C abstract machine, changes to them will not be counted as side effects, and the sequence point will exert no power over them.

 "Experience is what enables you to recognise a mistake the second time you make it." "Good judgement comes from experience.  Experience comes from bad judgement." "Wisdom is always wont to arrive late, and to be a little approximate on first possession." "When you hear hoofbeats, think horses, not unicorns." "Fast.  Cheap.  Good.  Pick two." "We see a lot of arses on handlebars around here." - [J Ekdahl]

alank2 wrote:
you know in assembly that it is so simple!

So write yourself an assembler module to do it, then!

eg,

`bool add_8_with_carry( uint8_t addend, uint8_t augend, uint8_t * sum );`

implement the body in assembler; the bool return is true iff a carry occurred.
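For testing on a PC, or as a fallback where no assembler module exists, a portable C body with the same signature might look like this. It is a sketch only; the `_c` suffix is a made-up name to keep it apart from the assembler version.

```c
#include <stdbool.h>
#include <stdint.h>

/* Portable stand-in with the same signature as the proposed assembler
 * routine; the _c suffix is a made-up name to keep the two apart. */
bool add_8_with_carry_c(uint8_t addend, uint8_t augend, uint8_t *sum)
{
    uint8_t result = (uint8_t)(addend + augend);
    *sum = result;              /* sum may point at one of the caller's operands */
    return result < addend;     /* true iff the addition carried out of bit 7 */
}
```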

Last Edited: Thu. Aug 30, 2018 - 07:45 AM

awneil wrote:
implement the body in assembler;
Something like this (untested) perhaps?

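```
; asm.S

#define __SFR_OFFSET 0
#include <avr/io.h>

; bool add_8_with_carry( uint8_t addend, uint8_t augend, uint8_t * sum );
; ABI says addend in R24, augend in R22, *sum in R21:R20
.global add_8_with_carry
add_8_with_carry:
    add R22, R24 ; doesn't really matter which way round this is done and R24 is then free to prepare bool return
    ldi R24, 1   ; assume carry; LDI leaves the flags untouched
    brcs 1f      ; carry set: keep the 1
    clr R24      ; no carry: return 0
1:
    mov R27, R21 ; copy the sum pointer into X (R27:R26)
    mov R26, R20
    st X, R22    ; store the 8-bit sum through the pointer
    ret
```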

and invoked as:

```
// avr.c

#include <stdbool.h>
#include <avr/io.h>

bool add_8_with_carry( uint8_t addend, uint8_t augend, uint8_t * sum );

uint8_t c1, c2;

int main(void) {
    c1 = 0xAA;
    c2 = 0x99;
    if (add_8_with_carry(c1, c2, &c1)) {
        PORTB = 0x55;
    }
    else {
        PORTB = 0xAA;
    }
    while(1) {
    }
}
```

This appears to build clean with:

```
D:\atmel_avr\avr8-gnu-toolchain-win32_x86\bin>avr-gcc -mmcu=atmega16 avr.c asm.S -Os -g -o avr.elf

D:\atmel_avr\avr8-gnu-toolchain-win32_x86\bin>
```

The accepted C-only solution is simpler and shorter than the assembly solutions.

Edit: shorted->shorter

Last Edited: Thu. Aug 30, 2018 - 10:17 PM

skeeve wrote:
The accepted C-only solution is simpler and shorted than the assembly solutions.

I  wouldn't quite go that far -- a comparison test on multi-bit-often-multi-byte values is going to be less expensive than a status register bit test?  I'd think not.  But

awneil wrote:
alank2 wrote: you know in assembly that it is so simple! So write yourself an assembler module to do it, then!

... I'd lean very far towards awneil.  How many lines of AVR8 code have I written in the past 20 years?  Well over a million source lines I'd wager.  How many times did I feel a bit irritated that my chosen C toolchain didn't have carry or rotate or arithmetic operand widths other than standard?  Probably "several".  How many times was it so important to the app that I needed to do a cycle-counted solution?  A "few".  So make the critical loop tight and get on with life. Whinging about what is included in the C >>language<< ain't gonna help get the app to work any quicker.  You are free to invent your own language.  Or add extensions to the infinite-value-open-source-toolchain.

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

theusch wrote:
skeeve wrote:
The accepted C-only solution is simpler and shorted than the assembly solutions.

I  wouldn't quite go that far -- a comparison test on multi-bit-often-multi-byte values is going to be less expensive than a status register bit test?  I'd think not.  But

The case at hand is single-byte arithmetic.  The C is shorter, simpler and faster.

For two-byte arithmetic, speed is a wash.

For three bytes, the assembly is probably a winner for speed:

an IN, a skip and an RJMP is going to be faster than three comparisons and a branch.

It does use one more register.

Maybe I'm missing something in your analysis, skeeve.  Am I looking at the wrong C implementation example?  How do you know that both operands are in registers after the arithmetic operation?  If it is just carry being tested then wouldn't one use BRCS?  But no matter in the end...as I mentioned I think it would be rare in the real world that this is of vital importance, and if so it will be worth the time to answer the questions.

Same here. When there is a clean C solution that adds only one instruction compared to "real" assembler, jumping in and out of C and ASM doesn't improve on that.

I tried several variants like this and they all generated the same code:

```
/* Variant 1 */
b+=a;
if (b<a)
    c++;

/* Variant 2 (GNU "?:" extension with omitted middle operand) */
((b+=a)>=a) ?:c++ ;

/* Variant 3 */
if ( (b+=a)<a)
    c++;
```

```
  86:   82 0f           add     r24, r18
  88:   82 17           cp      r24, r18
  8a:   08 f4           brcc    .+2             ; 0x8e <main+0xe>
    c++;
  8c:   9f 5f           subi    r25, 0xFF       ; 255
```

theusch wrote:
Maybe I'm missing something in your analysis, skeeve.  Am I looking at the wrong C implementation example?  How do you know that both operands are in registers after the arithmetic operation?
So far as I know, avr-gcc does not do partial loads or stores of integers.

c1 and c2 would be in registers before and after the +=.  An add and a compare per byte and a branch.

I was off a bit in my previous analysis.

The asm function itself requires 12 cycles.  Add in the RCALL, a skip and an RJMP and we're up to 17.

I could shave a couple of cycles off the asm without changing the API, so call it 15 cycles.

This assumes that c1 and c2 are in registers, that c2 is wanted in register(s) and that c1 is wanted in SRAM.

Add 3 cycles per additional byte in the operands.

The C-only solution will require an add and a compare per byte and a branch at the end:

One byte requires at most 4 cycles, 4 bytes 10 cycles.

An API change would improve the function greatly.

Do not access SRAM.

Make the return value a struct.

The last three instructions should be LDI, ADC and RET.
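On the C side, the struct-returning API could be sketched roughly like this. The type `struct sum8` and the function `add_8_struct` are hypothetical names, and whether the struct actually comes back in registers depends on the compiler and its ABI, so that part is an assumption to verify in the generated code.

```c
#include <stdint.h>

/* Hypothetical result type: sum and carry travel back together by value,
 * so the caller supplies no pointer. */
struct sum8 {
    uint8_t sum;
    uint8_t carry;   /* 1 if the addition wrapped, else 0 */
};

struct sum8 add_8_struct(uint8_t a, uint8_t b)
{
    struct sum8 r;
    r.sum = (uint8_t)(a + b);
    r.carry = (uint8_t)(r.sum < b);
    return r;
}
```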

Quote:
If it is just carry being tested then wouldn't one use BRCS?  But no matter in the end...as I mentioned I think it would be rare in the real world that this is of vital importance, and if so it will be worth the time to answer the questions.
True.

This discussion will not change the world.
