Get rid of float -> replace it with (word / word)

Following a previous thread (Lee's idea, in fact), I made a tiny Win32 program that takes a float and an error level as input and outputs two integers: the first divided by the second gives approximately the input float.
The goal is to avoid float math in your embedded applications (where that's possible, of course).
Now, here's the proggy:

Attachment(s): 

Real men don't use backups, they post their stuff on a public ftp server and let the rest of the world make copies.
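
The attached program's source is not shown in the thread, so the following is only a minimal sketch of the idea, not the poster's code. It assumes a plain brute-force search over 16-bit denominators and an absolute error bound; the names `ratio16_t` and `float_to_ratio` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical result type: the input is approximated by num / den. */
typedef struct {
    uint16_t num;
    uint16_t den;
} ratio16_t;

/*
 * Try denominators 1..65535 in increasing order and stop at the first one
 * whose nearest numerator is within max_err (absolute) of value.
 * Returns 1 on success, 0 if nothing fits in two 16-bit words.
 * Assumption: value is non-negative.
 */
static int float_to_ratio(double value, double max_err, ratio16_t *out)
{
    if (value < 0.0) {
        return 0;
    }
    for (uint32_t den = 1; den <= 65535u; den++) {
        double scaled = value * (double)den + 0.5;
        if (scaled >= 65536.0) {
            break;  /* numerator no longer fits in 16 bits */
        }
        uint32_t num = (uint32_t)scaled;  /* truncating scaled rounds to nearest */
        double err = (double)num / (double)den - value;
        if (err < 0.0) {
            err = -err;
        }
        if (err <= max_err) {
            out->num = (uint16_t)num;
            out->den = (uint16_t)den;
            return 1;
        }
    }
    return 0;
}
```

With these assumptions, 4.6 with a tight error bound comes out as 23 / 5, and 3.14159265 with an error bound of 0.002 comes out as 22 / 7. This would run on the PC at design time; only the two resulting integers end up in the embedded code.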

Last Edited: Wed. Nov 30, 2005 - 07:12 PM

For simple conversion factors, arbitrary numbers are fine.

There are cases where the numbers will be used many, many times and speed is important. An example might be wire-frame rotation in graphics, where all the endpoints need to be re-processed every iteration. In that case, it is almost magic if you can find a close-enough ratio where the divisor (especially on AVRs with no hardware divide support) or the numerator are powers of two, so simple shifting can be used.

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.
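
Lee's power-of-two point can be sketched as follows. This is an illustration under assumptions, not code from the thread: it takes the factor 4.6 used elsewhere in the discussion and fixes the divisor at 256, so the division becomes a shift by 8. The names `SCALE_NUM`, `SCALE_SHIFT` and `scale_by_4_6` are made up for the example.

```c
#include <stdint.h>

/* 4.6 * 256 = 1177.6, rounded to 1178, so 4.6 is approximated by 1178 / 256. */
#define SCALE_NUM   1178u
#define SCALE_SHIFT 8u

/*
 * Multiply by roughly 4.6 using one integer multiply and one shift.
 * The product is widened to 32 bits so it cannot overflow before the shift.
 * Assumption: the caller keeps x small enough that the result fits in 16 bits.
 */
static uint16_t scale_by_4_6(uint16_t x)
{
    return (uint16_t)(((uint32_t)x * SCALE_NUM) >> SCALE_SHIFT);
}
```

A shift by exactly 8 is attractive on an 8-bit machine because it amounts to dropping the low byte of the product. The price is a small bias: 1178 / 256 is about 4.6016, so `scale_by_4_6(1000)` gives 4601 rather than 4600.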

Quote:
it is almost magic if you can find a close-enough ratio where the divisor (especially on AVRs with no hardware divide support) or the numerator are powers of two

BTW: someone here on AVRfreaks mentioned something about a C-compiler capable of doing such magic things... [can't remember exactly the details right now...]

Real men don't use backups, they post their stuff on a public ftp server and let the rest of the world make copies.

@Stanley: The download doesn't work for me - am I doing something wrong?
When I say "doesn't work" I mean that I end up with an empty zip file.

Four legs good, two legs bad, three legs stable.

Hmmm... I don't know what is happening... :?
Each time I upload the file and download it back, it appears to be 20 bytes shorter!!! :shock:
I think there's a problem on AVRfreaks side.

Real men don't use backups, they post their stuff on a public ftp server and let the rest of the world make copies.

Last Edited: Wed. Nov 30, 2005 - 07:29 PM

Hmmm. Now it's not empty... but apparently corrupted or damaged!

Four legs good, two legs bad, three legs stable.

C-compiler capable of doing such magic things
========================================
replacing mult and div by powers of 2 is a 'peephole optimization' done by many compilers. Imagecraft lists this one and others in its description of optimizations

Imagecraft compiler user

bobgardner wrote:
C-compiler capable of doing such magic things
========================================
replacing mult and div by powers of 2 is a 'peephole optimization' done by many compilers. Imagecraft lists this one and others in its description of optimizations

Bob, I was referring to something completely different… replacing float operations with integer operations!

Real men don't use backups, they post their stuff on a public ftp server and let the rest of the world make copies.

lemee see...

avr-gcc appears to be pretty smart about division by a power-of-two. It does creative byte/nibble swapping to do partial divisions that are powers of 256/16. Then, it moves on to LSR's for the remaining amount of power-of-2 dividing that is left to be done.
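
To make the nibble-swap idea concrete, here is a small portable C model. It is only a sketch of the description above, not avr-gcc's actual output, and the function names are hypothetical: a nibble swap plus a mask divides a byte by 16, and single-bit right shifts (LSR on the AVR) take care of the remaining power of two.

```c
#include <stdint.h>

/* Exchange the high and low nibbles of a byte, like the AVR SWAP instruction. */
static uint8_t swap_nibbles(uint8_t x)
{
    return (uint8_t)((x << 4) | (x >> 4));
}

/* Divide by 16: after the swap the old high nibble sits in the low nibble. */
static uint8_t div16_via_swap(uint8_t x)
{
    return (uint8_t)(swap_nibbles(x) & 0x0F);
}

/* Divide by 64: one swap-and-mask (/16), then two single-bit shifts (/4). */
static uint8_t div64_via_swap(uint8_t x)
{
    return (uint8_t)(div16_via_swap(x) >> 2);
}
```

On a machine without a barrel shifter, the swap-and-mask pair replaces four single-bit shifts with two instructions, which is presumably why the compiler bothers.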

But it's kinda weird with multiplication. Setting optimization for "size" (-Os):

For any pair of 8-bit numbers, on an AVR with an on-chip multiplier, you'll always be better off using the MUL instruction, right?

Well, if I do "8-bit times 63", it invokes MUL, using 5 words/cycles of code.

If I do "8-bit times 64", it creates a loop where it adds the result to itself 6 times, with 16-bit intermediate results, using 8 words of code, and requiring 33 cycles. It's actually slower and larger than the "non-optimized" multiplication by a non-power-of-two above.

It's somewhat clearer what's going on with 16-bit ints:
If I do "16-bit times 45" it will invoke MUL twice, using 10 words/cycles of code.

If I do "16-bit times 64" it will double the number 6 times by repeatedly adding the result to itself, using 5 words of code, and 30 cycles.

If I do "16-bit times 63", it performs multiplication-by-64 and then subtracts one copy of the original number, using 8 words of code and 33 cycles.
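
The two -Os strategies described above can be modelled in portable C. This is a sketch of the arithmetic only, with hypothetical function names; it makes no claim about the exact instruction sequence avr-gcc emits.

```c
#include <stdint.h>

/* Multiply by 64 by doubling six times (adding the value to itself). */
static uint16_t times64_by_doubling(uint16_t x)
{
    for (uint8_t i = 0; i < 6; i++) {
        x = (uint16_t)(x + x);  /* wraps modulo 65536, like a 16-bit register pair */
    }
    return x;
}

/* Multiply by 63 as (x * 64) - x, reusing the doubling loop. */
static uint16_t times63(uint16_t x)
{
    return (uint16_t)(times64_by_doubling(x) - x);
}
```

The loop is compact but costs a compare and branch per iteration, which matches the "smaller but slower" result reported for the 16-bit case.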

So 16-bit optimization is smaller but slower... exactly what you expect when optimizing for size. But the 8-bit optimization is actually bigger and slower...

Now, if I set it to optimize for speed (-O3):
8-bit * 64: It does something very tricksy with rotating right through the carry bit across three registers. No loops or branching. Total: 11 words/cycles.

8-bit *63: Uses MUL. Total: 5 words/cycles. Same as with -Os.

16-bit *45: Uses MUL. Total: 10 words/cycles. Same as with -Os.

16-bit *64: Does the same tricksy rotating-right through the carry bit across 3 registers business. Total: 9 words/cycles. It is slightly faster and slightly smaller than the arbitrary case above. Almost twice as big as the optimize-for-space method, but also much faster.

(Now I understand what was going on in the 8-bit version above... the compiler is internally promoting the 8-bit operation into a 16-bit operation before beginning, and then jumping through hoops beforehand and afterwards to make sure that an 8-bit input is used and an 8-bit result is returned... That doesn't sufficiently explain "why?"... but at least it starts to tackle "what?". The rotate-right-by-two-thru-carry technique ends up being functionally equivalent to the "shift-left-by-6" or the "add-self-to-self-6-times" techniques that could otherwise be used, but faster.)
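
One way to read the rotate-right trick, sketched in portable C: placing the 16-bit value one byte higher across three bytes is a free multiply by 256, and two right shifts across those three bytes bring that down to a multiply by 64. This is an interpretation of the description, not a transcription of the generated assembly, and the function name is hypothetical.

```c
#include <stdint.h>

/* Multiply by 64 as (x * 256) / 4: a byte move followed by two right shifts. */
static uint16_t times64_via_right_shift(uint16_t x)
{
    uint32_t wide = (uint32_t)x << 8;  /* value sits one byte higher; low byte is zero */
    wide >>= 2;                        /* two right shifts across the three bytes */
    return (uint16_t)wide;             /* keep the low two bytes as the result */
}
```

Two multi-byte right shifts replace six multi-byte left shifts, which is where the speed advantage over the doubling loop would come from.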

16-bit *63:
It does the same tricksy technique for multiplying by 64, then subtracting one copy of the original number. Total: 12 words/cycles. Actually larger/slower than the MUL option by 2 words/cycles.

So, I can conclude that avr-gcc tries very hard to optimize its power-of-two multiplication. But it tries too hard sometimes, and sometimes ends up using more code space and/or execution time than its own non-optimal versions of the same things!

It is a little hard to do direct work with a test program from your description. Are you doing

fred = ethel * 63; or
fred *= 63; ?

Are fred/ethel in registers or SRAM to start/end?

[Note to EW: Round 51 of Compiler Wars]

Anyway, CodeVision does similar with *63 & *64. Even with the "promote char to int" switch on, it recognizes that only 8 bits need to be fussed with. It skips the MUL for *64, but the resulting sequence is the same number of cycles though one word longer than the 5 word/6 cycle MUL for *63.

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Get rid of the float data type?

No way, man!!!

You can avoid reality, for a while.  But you can't avoid the consequences of reality! - C.W. Livingston

Sorry, Lee... Here's my test case: modify the data type and the multiplicand as necessary:

All variables are located in registers (or pairs).

#include <avr/io.h>
#include <stdint.h>

int main(void)
{

	uint8_t x = PINC;	// to force the compiler to use an unknown initial value.
	x *= 64;
	PORTD = x;	// to force the compiler to actually compute this result.
	x = PINC;	// ditto.
	x /= 64;
	PORTA = x;	// to force the compiler to actually compute this result.

	while(1);
	return 1;
}

[edit]Oops... In my analysis, I'd missed that MUL is 2 cycles... For every instance above where MUL was invoked, add 1 cycle to the count. That's 2 cycles for 16-bit MUL invocations...[/edit]

Last Edited: Wed. Nov 30, 2005 - 08:42 PM

Real programmers don't need no stankin' floats nor 8087 numeric coprocessors, Carl. They just use the 1's & 0's.

On days when the 1's are broken, all the coding is done using just the 0's.

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

I was referring to something completely different… replacing float operations with integer operations!
=========================================
Lee was talking about a 'tricky' integer ratio where the numerator or denominator was a power of two.... but that reminded me of a tricky FP mult by 2 by incrementing the exponent. That's 1 8 bit op instead of calling the fpmult subroutine. Cool huh? I thought of that.

Imagecraft compiler user
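
The exponent trick can be sketched in portable C. This assumes an IEEE-754 single-precision `float`, which is an assumption about the platform and not something stated in the thread. In that layout the exponent field straddles two bytes, so the sketch uses a 32-bit add; a floating-point format that keeps the exponent in its own byte would allow the single 8-bit increment described above. The function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Multiply a float by 2 by adding 1 to its exponent field.
 * Assumption: float is IEEE-754 binary32 (1 sign, 8 exponent, 23 mantissa bits).
 */
static float times_two(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);            /* reinterpret without aliasing issues */

    uint32_t exponent = (bits >> 23) & 0xFFu;
    if (exponent == 0u || exponent >= 0xFEu) {
        /* Zero, subnormal, about to overflow, Inf or NaN: the trick does not apply. */
        return f * 2.0f;
    }

    bits += (uint32_t)1 << 23;                 /* exponent + 1 means value * 2 */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The fallback multiply is there only to keep the sketch correct for every input; code that knows its values are normal and well inside the range could drop it.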

Well Lee, isn't there something about "Real Men", like:

"Real Men Don't Eat Quiche!" I eat Quiche! I like Quiche!!!

Quote:
...all the coding is done using just the 0's.

In other words, very little programming gets done in binary even when it's all actually working. So when the 1's are broken, nothing gets done?

You can avoid reality, for a while.  But you can't avoid the consequences of reality! - C.W. Livingston

microcarl wrote:
Get rid of the float data type?
No way, man!!!

Sigh... misunderstanding maybe...
I was NOT saying to get rid of float data type!
I was only suggesting that sometimes it is possible to REPLACE float operations with INTEGER operations. For example, if you have x * 4.6, you may replace it with x * 23 / 5. This brings the benefit of faster and smaller code! That's all...

Real men don't use backups, they post their stuff on a public ftp server and let the rest of the world make copies.
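
The x * 4.6 example can be written out as a small C function. This is a sketch under assumptions rather than the poster's code: the function name is hypothetical, the product is widened to 32 bits to avoid overflow, and half the divisor is added before dividing so the result is rounded instead of truncated.

```c
#include <stdint.h>

/*
 * Multiply by 4.6 using only integer math: 4.6 == 23 / 5 exactly.
 * Multiply first, divide last, so no precision is thrown away early.
 * Assumption: the caller keeps x small enough that the result fits in 16 bits.
 */
static uint16_t times_4_6(uint16_t x)
{
    uint32_t product = (uint32_t)x * 23u;      /* cannot overflow 32 bits */
    return (uint16_t)((product + 2u) / 5u);    /* +2 (about half of 5) rounds to nearest */
}
```

For instance, `times_4_6(10)` gives 46 and `times_4_6(3)` gives 14 (13.8 rounded). Unlike a ratio with a power-of-two divisor, this still needs a real division, which is a library call on an AVR.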

Last Edited: Wed. Nov 30, 2005 - 09:04 PM

Quote:

When the 1's are broken, nothing gets done?

Actually, Carl, one can put in a full day of "programming AVRs" and actually be quite productive on the days the 1's are broken--if the chips are brand-new or freshly erased. Since the erased state is 0xff for EEPROM & flash -- all bits 1 -- then the task is merely to decide which ones should be 0. Come to think of it, that's what us AVRFreaks do all day every day--decide which flash bits to turn to 0.

Groan. :)

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Quote:
Come to think of it, that's what us AVRFreaks do all day every day--decide which flash bits to turn to 0.

I prefer to think of myself as "choreographing the dance of the electron".

Four legs good, two legs bad, three legs stable.

Quote:
BTW: ... is my English that bad?

No! Your English is quite good. Better than many U.S. citizens', in fact.

As I stated in other posts in the past... I do very low volume product design. Usually less than 10 units. I always use a microcontroller with RAM/ROM larger than I could possibly need. While I realize that others really need to minimize the controller size based on volume cost, I don't. As such, I don't need to concern myself about wasted space as a result of using float data types in my projects. I do, however, need to be mindful of speed for certain functions. I haven't yet run into a situation where I haven't been able to resolve speed-related issues on the AVR. Though, I'm sure that day/project is nearing.

But comparing, say, the MC68HC11 timer functions to those of the AVR, there really isn't any comparison! I implemented an analog-controlled PWM/DAC on the AVR this past week that used almost no code space compared to what the same function took on the HC11. But for most projects, I couldn't practically implement code in C in an HC11E2 product; the EEPROM (its program space) was usually too small. So, I had to use assembly. But the assembly version of the ADC/PWM/DAC implementation on the AVR was surprisingly small, as was the C implementation.

I realize my needs are different. I also do understand the needs of the more moderate-volume developers.

I guess it's all for the sake of argument, really...

You can avoid reality, for a while.  But you can't avoid the consequences of reality! - C.W. Livingston

Last Edited: Wed. Nov 30, 2005 - 10:04 PM

theusch wrote:
Real programmers don't need no stankin' floats nor 8087 numeric coprocessors, Carl. They just use the 1's & 0's.

On days when the 1's are broken, all the coding is done using just the 0's.

Lee

OK, OT, but this reminds me of a Dilbert strip where he and Wally are talking, going something like:

- Back in the good ol days we didn't need those sissy icons and windows stuff. All we had was ones and zeroes.
- You had zeroes? We had to use the letter O...

As of January 15, 2018, Site fix-up work has begun! Now do your part and report any bugs or deficiencies here

No guarantees, but if we don't report problems they won't get much of  a chance to be fixed! Details/discussions at link given just above.

 

"Some questions have no answers."[C Baird] "There comes a point where the spoon-feeding has to stop and the independent thinking has to start." [C Lawson] "There are always ways to disagree, without being disagreeable."[E Weddington] "Words represent concepts. Use the wrong words, communicate the wrong concept." [J Morin] "Persistence only goes so far if you set yourself up for failure." [Kartman]

Quote:
...actually be quite productive on the days the 1's are broken--if the chips are brand-new or freshly erased.

Lee, that sounds like an Oxymoron to me...

If the 1's are broken, I'd think that would mean that you would not be able to set them to zero...

You can avoid reality, for a while.  But you can't avoid the consequences of reality! - C.W. Livingston

Quote:

Lee, that sounds like an Oxymoron to me...

If the 1's are broken, I'd think that would mean that you would not be able to set them to zero...

Well, the "moron" part seems to fit where I've led this thread. Good thing the moderators are busy with the tits, fits, quits, pits, zits thread.

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Carl wrote:
Lee, that sounds like an Oxymoron to me...

Bwahaha :lol: oxymoron!!!
OT: check this out: TOP 10 OXYMORONS

10. Exact estimate

9. Genuine imitation

8. Found missing

7. Butt Head

6. Military Intelligence

5. Women in programming

4. Computer security

3. Political science

2. Working vacation

And the number one top Oxymoron...

1. Micro$oft works

Real men don't use backups, they post their stuff on a public ftp server and let the rest of the world make copies.

JohanEkdahl wrote:
theusch wrote:
Real programmers don't need no stankin' floats nor 8087 numeric coprocessors, Carl. They just use the 1's & 0's.

On days when the 1's are broken, all the coding is done using just the 0's.

Lee

OK, OT, but this reminds me of a Dilbert strip where he and Wally are talking, going something like:

- Back in the good ol days we didn't need those sissy icons and windows stuff. All we had was ones and zeroes.
- You had zeroes? We had to use the letter O...


I cut that out and kept it - I have no idea where it is now, but it really tickled me!

Four legs good, two legs bad, three legs stable.