Penalty for using 32bit variables?


I'm thinking of using some uint32_t variables, but wanted to know what speed penalty I'd be getting.

Is it 4 times slower doing maths than 8 bit?


You pretty well got it!


That's going to depend entirely on your compiler's code generation model. Which compiler is it? What do you see if you study the generated assembler for some test cases (assuming they aren't discarded by the optimiser).


hi, i'm using avr-gcc on linux, latest stable I think

I will be using uint32_t for counting and some divisions. Nothing complicated.


A sweeping "yes" to that.

The integer arithmetic operations of an AVR are eight-bit by definition, so to add two 32-bit integers, for example, you need four of them. Multiplication might be even worse, as the result can potentially be 64 bits wide.

[edit]Add to that that there is probably a larger risk of variables ending up allocated in RAM instead of registers, which makes extra loads and stores necessary.[/edit]

I'd set up a few test cases that start out from volatile variables and end up in volatiles, run them in the simulator, and study the number of clock cycles consumed. Just to get a ballpark guesstimate.
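A minimal sketch of such test cases, assuming avr-gcc and made-up names (a8, bench_add32 and so on are hypothetical, not from any library). Reading from and writing to volatiles keeps the optimiser from folding the arithmetic away or discarding the result:

```c
#include <stdint.h>

/* volatile forces real loads and stores, so the compiler cannot
   precompute the results or drop the operations as unused. */
volatile uint8_t  a8 = 200, b8 = 100, r8;
volatile uint32_t a32 = 4000000000UL, b32 = 500000000UL, r32;

/* One operation per function, so each can be timed on its own. */
void bench_add8(void)  { r8  = a8  + b8;  }
void bench_add32(void) { r32 = a32 + b32; }
void bench_div8(void)  { r8  = a8  / b8;  }
void bench_div32(void) { r32 = a32 / b32; }
```

Call each function from main(), put a breakpoint before and after the call in the simulator, and take the difference of the cycle counter. The call and return overhead is the same for every case, so it cancels out when comparing.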

As of January 15, 2018, Site fix-up work has begun! Now do your part and report any bugs or deficiencies here

No guarantees, but if we don't report problems they won't get much of  a chance to be fixed! Details/discussions at link given just above.

 

"Some questions have no answers."[C Baird] "There comes a point where the spoon-feeding has to stop and the independent thinking has to start." [C Lawson] "There are always ways to disagree, without being disagreeable."[E Weddington] "Words represent concepts. Use the wrong words, communicate the wrong concept." [J Morin] "Persistence only goes so far if you set yourself up for failure." [Kartman]

Last Edited: Wed. May 11, 2011 - 11:00 AM

ok, i'll do some study in the case

thanks guys for fast replies :)


OK, here's a GCC example:

#include <stdint.h>

extern uint8_t b1,b2,b3; 
extern uint32_t w1,w2,w3; 

int main(void) {
	b1 = b2 + b3;
	w1 = w1 + w2;
	while(1) {
	}
}

this generates:

.LM2:
	lds r24,b3
	lds r25,b2 
	add r24,r25
	sts b1,r24
.LM3:
	lds r24,w1
	lds r25,(w1)+1
	lds r26,(w1)+2
	lds r27,(w1)+3
	lds r18,w2
	lds r19,(w2)+1
	lds r20,(w2)+2
	lds r21,(w2)+3
	add r24,r18
	adc r25,r19
	adc r26,r20
	adc r27,r21
	sts w1,r24
	sts (w1)+1,r25
	sts (w1)+2,r26
	sts (w1)+3,r27

The 8-bit one is 7 cycles. The 32-bit one is 28 cycles, so in this case it is x4. Replacing '+' with '*', the figures are 10 cycles and 72 cycles, so the 32-bit code takes more than 7 times as long.
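To make the add/adc/adc/adc chain in that listing concrete, here is a rough C model of it. This is only an illustration of what the four instructions do (the function name add32_bytewise is made up); in real code you would just write w1 + w2 and let the compiler emit the sequence:

```c
#include <stdint.h>

/* Add two 32-bit values one byte at a time, lowest byte first,
   passing the carry along the way add/adc/adc/adc does. */
uint32_t add32_bytewise(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    uint8_t carry = 0;

    for (uint8_t i = 0; i < 4; i++) {
        /* 8-bit add of byte i plus the carry from the previous byte */
        uint16_t sum = (uint16_t)((a >> (8 * i)) & 0xFF)
                     + (uint16_t)((b >> (8 * i)) & 0xFF)
                     + carry;
        result |= (uint32_t)(sum & 0xFF) << (8 * i);
        carry = (uint8_t)(sum >> 8);   /* 0 or 1 */
    }
    return result;                      /* final carry is dropped, as with uint32_t */
}
```

Four byte-wide additions for the arithmetic, plus four times the loads and stores, is where the x4 comes from.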


Quote:

and some divisions. Nothing complicated

Divisions are complex on AVRs. Since there is no machine instruction for division, it has to be done by a rather elaborate sequence of other arithmetic operations. Going from 8-bit to 32-bit variables makes it even worse.
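For a feel of what such a sequence looks like, here is a sketch of plain shift-and-subtract division. It is a textbook algorithm written for illustration, not the routine the compiler's library actually uses, and the function name is made up. The caller must make sure den is not zero:

```c
#include <stdint.h>

/* Unsigned 32-bit division by shift-and-subtract: one loop pass
   per result bit. Returns the quotient, stores the remainder in *rem. */
uint32_t udiv32_shift_subtract(uint32_t num, uint32_t den, uint32_t *rem)
{
    uint32_t quot = 0;
    uint32_t r = 0;

    for (uint8_t i = 0; i < 32; i++) {
        r = (r << 1) | (num >> 31);   /* bring down the next dividend bit */
        num <<= 1;
        quot <<= 1;
        if (r >= den) {               /* divisor fits: subtract, set result bit */
            r -= den;
            quot |= 1;
        }
    }
    *rem = r;
    return quot;
}
```

An 8-bit divide needs 8 passes over 1-byte values; a 32-bit divide needs 32 passes, and every shift, compare and subtract inside the loop works on 4 bytes. That is why the cost grows much faster than x4.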



Just for completeness I tried the divide example:

#include <stdint.h>

extern uint8_t b1,b2,b3; 
extern uint32_t w1,w2,w3; 

int main(void) {
	b1 = b2 / b3;
	w1 = w1 / w2;
	while(1) {
	}
}

Both divisions called library functions. The 8-bit code, including the loads and stores, was 86 cycles. The 32-bit code was 693 cycles. So about 8 times as long for a divide.


Yeah... those doofuses that use the other compilers probably aren't smart enough to comprehend the tradeoff involved in using 32 bit variables in their programs, so us GCC guys will have this nice hi level philosophical discussion over here in private. Egalitarianism. Sheesh.

Imagecraft compiler user


Quote:

us GCC guys

You switched trenches, Bob? Welcome! :wink:

Quote:
this nice hi level philosophical discussion

If comparing assembler-sequences generated by the avr-gcc toolchain is high level, then what is low-level in your world Bob?



Bob I specifically moved this to the GCC forum after OP identified GCC because it's YOU (no one else, just you) that is forever moaning about GCC threads in AVR Forum. Are you now saying this is wrong? I wish you'd make your bloody mind up! :evil:


Bob is a swell guy, but all us swell guys have something that we get stuck on. For Bob it's the presence of the avr-gcc forum and the absence of an Imagecraft forum.



I think there should be a tag system where someone could ask a 'software engineering' question that was compiler agnostic. That sort of universal advice crosses compiler boundaries. If the AVR forum was the object and the compilers were class members, the poor gcc compiler is sort of left out in the outer orbit. I LIKE the general questions, because I think that sort of Big Picture info is useful to the new guys. If they get a Good Example, they'll remember it. "A modem always transmits on pin 3". I've never forgotten.

Imagecraft compiler user


Quote:
Bob I specifically moved this to the GCC forum after OP identified GCC
Just because the OP is using GCC doesn't make this a GCC question.

Regards,
Steve A.

The Board helps those that help themselves.


OK, then two votes against - back it goes - this is a democracy after all.

But surely the question of how much more 32-bit math costs over 8-bit IS compiler dependent, and OP couldn't give a flying f**k through a rolling doughnut what the code generation models of IAR or CV or Rowley or ICC are? He has said he's using GCC - why would he care what the other compilers deliver?


I bet a benchmark with each compiler doing long adds and mults will come out within a couple of percent on each compiler. It's hard not to generate the four add and add-with-carry instructions in the right order.

Imagecraft compiler user


Quote:
But surely the question of how much more 32bit math costs over 8bit IS compiler dependent
But surely the OP is looking for an approximate answer, not something down to the cycle. That answer is going to be pretty much the same in any compiler.

Regards,
Steve A.

The Board helps those that help themselves.


tigrezno wrote:
I'm thinking of using some uint32_t variables, but wanted to know what speed penalty I'd be getting.

Is it 4 times slower doing maths than 8 bit?

You have little choice. Use the size of variable that is appropriate. So if you only need uint8_t use uint8_t.

If your counter needs uint32_t then that is what you have to use. Yes, the math will be slower but so what?

If your execution time is really critical, look at your inner loops. Code that gets called many times. One-off initialisations or outer loops make very little difference.

David.


david.prentice wrote:
tigrezno wrote:
I'm thinking of using some uint32_t variables, but wanted to know what speed penalty I'd be getting.

Is it 4 times slower doing maths than 8 bit?

You have little choice. Use the size of variable that is appropriate. So if you only need uint8_t use uint8_t.

If your counter needs uint32_t then that is what you have to use. Yes, the math will be slower but so what?

If your execution time is really critical, look at your inner loops. Code that gets called many times. One-off initialisations or outer loops make very little difference.


This pretty well covers it. For any N-bit processor, operating on data larger than N bits will involve a performance hit, and the bigger the difference the bigger the hit. That's why, when using an 8-bit micro in particular, you should be more disciplined about not declaring larger variables than you need (e.g. using an int to count to 10). Using 32 bits on an 8-bit micro especially is a decision that needs to be justified and not just made without thinking.


You can see it in code written by people without microprocessor training, e.g. those who have previously used x86 PCs: everything's declared as an int (defaulting to signed 16-bit on most 8-bit microprocessors).

A hardware-ish person will use 8 bit unsigned for most work, including for loops where the index is never negative and never greater than 255. Same for other aspects. This is for speed and RAM conservation.
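A small example of that habit, as a sketch. NUM_SAMPLES and sum_samples are made-up names, and the buffer size is assumed to be at most 255 so the index fits in 8 bits:

```c
#include <stdint.h>

#define NUM_SAMPLES 16u   /* hypothetical buffer size, must be <= 255 */

/* Sum a small buffer. The index never goes negative and never
   exceeds 255, so uint8_t is enough; the total can reach
   16 * 255 = 4080, so it gets 16 bits and no more. */
uint16_t sum_samples(const uint8_t *samples)
{
    uint16_t total = 0;

    for (uint8_t i = 0; i < NUM_SAMPLES; i++) {
        total += samples[i];
    }
    return total;
}
```

Each variable is sized to the range it actually has to hold, rather than everything defaulting to int.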


Let me generalize it to a recommendation to use processor-word-sized operands. As a counterexample, using 8-bit variables on an ARM Cortex-M0/M3 carries a small penalty, since the variables have to be extended each time they are used. Usually int follows this recommendation, except on AVRs and other 8-bit MCUs.
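One portable way to follow that recommendation is the uint_fast8_t type from <stdint.h>, which the standard defines as the fastest unsigned type with at least 8 bits. With avr-gcc it is an 8-bit type; on 32-bit ARM toolchains it is often, though not always, a 32-bit type. A sketch with a made-up function name:

```c
#include <stdint.h>

/* Count the 1 bits in a byte. The counters only need the range
   0..8, so "at least 8 bits, whatever is fastest here" fits. */
uint_fast8_t count_set_bits(uint8_t value)
{
    uint_fast8_t count = 0;

    for (uint_fast8_t bit = 0; bit < 8; bit++) {
        if (value & (1u << bit)) {
            count++;
        }
    }
    return count;
}
```

The same source then picks a narrow variable on the AVR and a word-sized one where narrow variables cost extra, without any #ifdefs.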


Quote:

That answer is going to be pretty much the same in any compiler.

My point entirely. If GCC's + is x4, * is x7 and / is x8 then it is THAT which will concern the OP, not whether IAR is x4, x5, x12 or whatever.
Quote:
But surely the OP is looking for an approximate answer, not something down to the cycle.

Without counting cycles how would you know the "cost"? Or do we all just wave a wet finger in the air and hope that a guess is correct? In part I was showing the OP that it is easy to do simple tests to determine a more quantitative than qualitative answer.

Just for interest I happen to have CV(eval) installed so did the tests there:

Op   8-bit        32-bit       Ratio
+    7 cycles     36 cycles    x5
*    9 cycles     73 cycles    x8
/    89 cycles    628 cycles   x7


Quote:
I will be using uint32_t for counting and some divisions. Nothing complicated.

In GCC. Then it is a wrong forum.
The same algorithm can be executed using different variable types, and in every case you get a correct answer. The only thing that really varies is the computational error involved. So the idea is to use the smallest-footprint variable that gives an acceptable error. The error can be 1e-14 or 200% - it all depends on the specific problem you are solving.

Hard to believe you divide a 32-bit integer by a 32-bit integer (to get a 64-bit fractional). What I think is that you use a counting variable which is divided by some value known at compile time. Is that right? Like:

Quote:
frequency=counter/19;

?

No RSTDISBL, no fun!


It's already solved guys, thanks!

Hosted my code on github:

https://github.com/jvalencia80/atmega-timers

take a look :)

(I forgot to add UL on some constants of the readme, but i'll fix it tonight)


Quote:
Hard to believe you use 32 bit integer divided by 32 bit integer (to get 64-bit fractional)..
First, why is that so hard to believe? Second, why do you think that such a division would result in a 64 bit fractional?

Regards,
Steve A.

The Board helps those that help themselves.


Koshchi wrote:
Quote:
Hard to believe you use 32 bit integer divided by 32 bit integer (to get 64-bit fractional)..
First, why is that so hard to believe? Second, why do you think that such a division would result in a 64 bit fractional?
(P+1)/P and P/(P-1) are near one and differ by 1/(P*(P-1)).
P > 2**30 makes that difference smaller than 2**-60.

International Theophysical Year seems to have been forgotten..
Anyone remember the song Jukebox Band?


Quote:

P> 2**30 makes the fraction smaller than 2**-60 .

Sorry but how does a uint32_t hold 2**-60 ?
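To put that in code: dividing one uint32_t by another in C gives a truncated 32-bit quotient, and the % operator gives a 32-bit remainder. No fractional part is kept anywhere. A sketch with made-up helper names:

```c
#include <stdint.h>

/* Integer division truncates: the quotient and the remainder are
   both plain 32-bit values, and the fraction is simply discarded. */
uint32_t quotient_u32(uint32_t n, uint32_t d)  { return n / d; }
uint32_t remainder_u32(uint32_t n, uint32_t d) { return n % d; }
```

With P = 2**30 + 1, both quotient_u32(P + 1, P) and quotient_u32(P, P - 1) return 1, and both remainders are 1. The tiny difference between the two exact ratios never shows up in the integer result.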


My advice is pretty simple: "go for it". Code it up and see how it does. For a recent project (the DDS project found on this site) I had to do a small amount of 64-bit integer math. It worked out very well.