The shortcuts taken by optimized code may occasionally (sic?) be surprising

#1

From the GCC Documentation, but applies to any optimising compiler:

The GCC Manual wrote:
The shortcuts taken by optimized code may occasionally (sic) be surprising:
  • some variables you declared may not exist at all;
  • flow of control may briefly move where you did not expect it;
  • some statements may not be executed because they compute constant results or their values are already at hand;
  • some statements may execute in different places because they have been moved out of loops.

https://gcc.gnu.org/onlinedocs/g...

Judging by the many & frequent posts here, this is not just "occasionally" surprising; it is regularly, even routinely, surprising. Especially to beginners.

So, rather than keep repeating the explanation, I thought I'd capture it here for future reference.

 

The GCC Manual goes on to say:

Nevertheless it is possible to debug optimized output. This makes it reasonable to use the optimizer for programs that might have bugs.

and

If you are not using some other optimization option, consider using -Og (see Optimize Options) with -g. With no -O option at all, some compiler passes that collect information useful for debugging do not run at all, so that -Og may result in a better debugging experience.

This last bit is specific to GCC - for other compilers, see their Documentation. 

 

#DebugOptimisedCode

 

#2

Congratulations.

Well done archaeology: you found one of the parts of the documentation added 21 years, 2 months ago, when Michael Meissner originally committed that document:

http://gcc.gnu.org/viewcvs/gcc/t...

Maybe that text is even older and was previously hosted outside the repo, dunno.

 

 

> Nevertheless it is possible to debug optimized output. This makes it reasonable to use the optimizer

> for programs that might have bugs.

 

 

There are compilers that do, or did, *not* support debugging optimized code. With some other compiler(s), you could have either debug info or optimized code (or neither), but not both at the same time.

 

 

This means that when you encountered a bug in *optimized* code, it was *impossible* to debug it using debug info. You had to regenerate the code *without* optimization to be able to reasonably debug it with a debugger (i.e. switch on debug info, thereby turning off optimizations), and as we all know, the result may be quite different. So different, in fact, that the bug you were hunting is gone and no longer there...

 

 

  

This is different in GCC, which allows you to debug the code *without* regenerating it, or at least to regenerate it with a different -g setting. This is because GCC has a strict rule: whatever debug info level is chosen, generating the debug info must not trigger any change in the generated executable code (provided the non-debug options are unaltered). If a different debug info setting or internals dump setting changes the executable in any way, that is a GCC bug and not just an inconvenience.

 

This is also the rationale behind -Og because -Og is not a debugging option but an optimization option.

 

Other compilers might simply not need something like -Og. Other compilers may produce a completely different binary when debug info is switched on, off, or changed.

 

 

That part of the docs might need to be reworded by a native speaker experienced in GCC and other compiler brands, so that it better reflects state-of-the-art optimization capabilities.

 

avrfreaks does not support Opera. Profile inactive.

#3

SprinterSB wrote:
added 21 years, 2 months ago

And yet there is still a steady flow of people being "surprised" by the effects of optimisation!

 

You would have thought that teachers, textbooks, tutorials, etc would be explaining this to their students/readers by now ...

 

This is also the rationale behind -Og because -Og is not a debugging option but an optimization option.

Thanks for the background on that - I did wonder why the "g" seemed to appear as both a debug (on its own) and optimisation (with 'O') option.

 

#4

awneil wrote:

And yet there is still a steady flow of people being "surprised" by the effects of optimisation!

 

They are surprised by their understanding of C / C++.

 

 

Or better: They are surprised by their misunderstanding of C / C++.

 

 

They are surprised by the fact that C / C++ just describes an abstract machine, and that each one of the infinitely many possible implementations of that abstract machine has infinitely many ways to translate a specific function.

 

 

When you go from the concrete to the abstract, you throw away information. When you model a tennis ball as a perfect sphere with a specific radius, mass density and moment of inertia, you are forgetting many things about that concrete tennis ball. When you want to go back from the abstract to the concrete, you have unimaginably many degrees of freedom in designing your tennis ball, details which you MUST add to get a real tennis ball. Likewise, when a compiler transforms a source describing the actions of an abstract machine back into a concrete binary, the compiler will (and must) pull that "lost information" out of thin air. A compiler is guided by switches and by metrics like costs, but it still has to map the abstract to the concrete.

 

 

And it is the result of abusing tools like debuggers and simulators.

 

 

Such tools can help you better understand how the silicon works and what assembler is about, maybe even give a better understanding of how a compiler, as a concrete piece of software, works, yes. But after all, a debugger is a de-bugger, a tool to find bugs. If the code has a bug, then you know the source has a bug (or sometimes the tool has one). But if the code works as expected, that is no proof whatsoever that the source is correct.

 

 

Sometimes I have the impression people regard debuggers and simulators as design tools to develop software, not as a last resort to locate a bug (and often incorrectly infer from as-expected binary that the input is correct, after tweaking the source until everything looks as expected).

 

 

As Ian Lance Taylor put it:

Michelangelo reportedly said that he could see a statue in a block of marble, and that his goal as a sculptor was to carve away everything which wasn't part of the statue. Some programmers appear to similarly believe that programming means starting with a block of code, and carving away everything which isn't part of the program. This is also known as programming by debugging. It's a bad idea. Don't do it.

And people are permanently taught to use debuggers or simulators as design tools. If someone asks "what does this piece of C code do?", a frequent answer is NOT an explanation of the code but merely a recommendation to "pipe that stuff through a simulator and stare at the result".

 

People are taught to stare at binary code; they are taught to stare at testing. (A test runs binary code, after all.) Testing is inevitable in a development process, but focusing on testing is often a symptom that other means, like reviewing, good documentation comments, having well-trained staff, or avoiding a bad environment (stress, task switching, deadlines, etc.), are being neglected. Tests have the problem that they can easily be measured: "we have 100000 tests" and everybody is impressed. Tests can be generated and their number increased like mad. Something you cannot do with knowledgeable reviewers, where all too often their sheer number serves to compensate for their lack of deep understanding.

 

Quote:

 

You would have thought that teachers, textbooks, tutorials, etc would be explaining this to their students/readers by now ...

 


#5

SprinterSB wrote:
They are surprised by their misunderstanding of C / C++.

Indeed.

#6

If I were allowed to pick just one favourite sentence out of Georg-Johann's excellent post, it would be:

SprinterSB wrote:
They are surprised by the fact that C / C++ just describes an abstract machine, and that each one of the infinitely many possible implementations of that abstract machine has infinitely many ways to translate a specific function.

That summarizes the situation brilliantly.

 

Alas, telling that to a frustrated beginner at "step-n-trace debugging" will probably just get the answer: "Abstract machine?? I just want the code that executes to do exactly what I wrote in the C source. But of course I want optimization too...".

As of January 15, 2018, Site fix-up work has begun! Now do your part and report any bugs or deficiencies here

No guarantees, but if we don't report problems they won't get much of  a chance to be fixed! Details/discussions at link given just above.

 

"Some questions have no answers."[C Baird] "There comes a point where the spoon-feeding has to stop and the independent thinking has to start." [C Lawson] "There are always ways to disagree, without being disagreeable."[E Weddington] "Words represent concepts. Use the wrong words, communicate the wrong concept." [J Morin] "Persistence only goes so far if you set yourself up for failure." [Kartman]

#7

It's not impossible to have enough debug information to "reverse" the optimizations during debugging: the compiler has to prove that all its optimizations preserve the workings of the abstract machine, and it is exactly this proof that would be needed to perform that "reverse".

 

It's just very impractical to do so: e.g. a full runtime trace would be needed to "reverse" certain optimizations, the debug info would be excessive and thus hard to document/maintain, etc.

 

It's much more practical to rely on the processing powers of the real debugger, i.e. the one between the keyboard and chair.

 

JW

#8

JohanEkdahl wrote:

"... I just want the code that executes do exactly what I wrote in the source ..."

 

So write assembler. *laugh*

 

For optimizing, you just have to think a bit harder.

 

S.

 

PS - One thing I never figured out how to make an optimizer do was make multiple (fairly extensive) code branches execute in exactly the same number of cycles.  In assembler I just stuffed in NOPs wherever I needed them 'til they matched.  Now I just run the whole thing as a fast timer ISR and if it ain't all done by the next tick, I get to think a bit harder.  S.

#9

Scroungre wrote:
One thing I never figured out how to make an optimizer do was make multiple (fairly extensive) code branches execute in exactly the same number of cycles.

I cannot remember running into that requirement in many decades and [countless?] apps with many different target platforms.  I can envision some of them, I guess...maybe it is just that I haven't had those apps?

 

That would be an interesting type of optimization. Assuming that I write my compiler to optimize for speed, then [during the Compiler Wars in the '80s, say] my unit sales would be rewarded for the elapsed time of Towers of Hanoi. So I wouldn't care to "balance" the "yes, move" and "no, don't move" legs.

 

I've been told that the beauty of GCC is that since it is open source, one can tailor it to whatever one needs.  This sounds like one of those cases.  How would you propose the syntax?  As an extension to switch/case?  Surely you would not want a global setting; if one leg of an app decision takes 123ms then you would not want >>all<< legs to be affected?  Also, how do you handle the non-deterministic leg times?

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

#10

Scroungre wrote:
One thing I never figured out how to make an optimizer do was make multiple (fairly extensive) code branches execute in exactly the same number of cycles. 

That's because you are making the key mistake which Johan highlighted in #6.

 

A key feature of the abstract machine model is that it never guarantees anything about timing.

 

If you want precise timing of the code execution, then you need to control that by other means; eg, HW timers, or use assembler.

 

See: http://www.8052.com/forum/read/1...

#11

Industrial process control. 

 

Imagine a cannery where the cans come down the line all the same size and at the same speed, and the gizmo has to decide whether to put tomato soup or chicken soup into the can, but it must put exactly the same amount of soup in exactly the same time.  The real application (under NDA, sorry about that) ran an algorithm tens of thousands of times per "soup dispensing unit" so any difference between the 'tomato soup' and 'chicken soup' decision lines would have been much multiplied.  It was a customer requirement, too.

 

As far as rewriting the compiler? I haven't the faintest idea. I'm not the sort of software guy you want mucking around in your compiler, and even going through case/switch statements takes time - the result at the bottom of a 40-unit decision tree isn't going to get executed as quickly as one at the top...

 

S.

#12

Scroungre wrote:

PS - One thing I never figured out how to make an optimizer do was make multiple (fairly extensive) code branches execute in exactly the same number of cycles.  In assembler I just stuffed in NOPs wherever I needed them 'til they matched.  Now I just run the whole thing as a fast timer ISR and if it ain't all done by the next tick, I get to think a bit harder.  S.

 

awneil wrote:

If you want precise timing of the code execution, then you need to control that by other means; eg, HW timers, or use assembler.

 

Which is, of course, exactly what I did.  So there.  ;-P    S.

#13

theusch wrote:

Scroungre wrote:
One thing I never figured out how to make an optimizer do was make multiple (fairly extensive) code branches execute in exactly the same number of cycles.

I cannot remember running into that requirement in many decades....

I can, writing software UARTs, for an 1802 and an ATtiny24A, or about once every 3-4 decades. :)

- John

#14

jfiresto wrote:
I can, writing software UARTs,

I can understand a certain instance, such as that and other bitstream encoding. But that is a single case with simple decisions, right? Not "multiple fairly extensive code branches".


#15

The ATtiny24A code was like that. The 1802 code was less trivial. It serviced two switches and a full duplex serial line, and consisted of three finite-state machines, that were all called over 12 instruction cycles. The co-routine calls took three instructions, leaving nine to be split among the three machines. Every machine's state change + action took exactly the same number of instructions, let us say, three, so that everything, together, always ran in exactly 3 + 3 x 3 = 12 cycles. I did not try to use a compiler, but I can see why someone might want a modern compiler to equalize branch times.

 

These days you could easily throw a billion or more transistors at the problem, spend less money, execute the solution faster – in Python – and quite possibly use less power!

 

- John

#16

The examples I can think of, where I needed to match execution times along different branch paths, were all workarounds for issues with the MCU's peripherals (e.g., the not-quite-Universal Serial Interface) that would otherwise make the MCU insufficient for, or incompatible with, the application.

- John