Compiled binary needs more flash with newer avr-gcc

#1

I am not very skilled with the compiler options, so I have now run into the problem that the compiled output no longer fits into the flash of an ATmega8 when using a newer avr-gcc.

 

AVR-GCC Version 5.4.0 (in Debian 10)

$ make
colorgcc.pl avr-gcc -c -mmcu=atmega8 -I. -gstabs   -Os -Wall -Wstrict-prototypes -std=gnu99  -ffunction-sections main.c -o main.o
colorgcc: 24 preferences loaded
colorgcc: calling avr-gcc

colorgcc.pl avr-gcc -mmcu=atmega8 -I. -gstabs   -Os -Wall -Wstrict-prototypes -std=gnu99  -Wl,--gc-sections,-Map,main.map,--cref main.o   --output main.elf     -lm
colorgcc: 24 preferences loaded
colorgcc: calling avr-gcc

/usr/lib/gcc/avr/5.4.0/../../../avr/bin/ld: main.elf section `.data' will not fit in region `text'
/usr/lib/gcc/avr/5.4.0/../../../avr/bin/ld: region `text' overflowed by 8 bytes
collect2: error: ld returned 1 exit status
make: *** [Makefile:212: main.elf] Fehler 1

 

 

AVR-GCC Version 4.8.1 (in Debian 8)

 

$ make
colorgcc.pl avr-gcc -c -mmcu=atmega8 -I. -gstabs -Os -Wall -Wstrict-prototypes -std=gnu99 -ffunction-sections main.c -o main.o
colorgcc: 24 preferences loaded 
colorgcc: calling avr-gcc

colorgcc.pl avr-gcc -mmcu=atmega8 -I. -gstabs   -Os -Wall -Wstrict-prototypes -std=gnu99  -Wl,--gc-sections,-Map,main.map,--cref main.o   --output main.elf     -lm
colorgcc: 24 preferences loaded
colorgcc: calling avr-gcc

avr-objcopy -j .text -j .data -O binary main.elf main.bin
avr-objcopy -j .text -j .data -O ihex main.elf main.hex
avr-objdump -h -S main.elf > main.lst
avr-objcopy -j .eeprom --set-section-flags=.eeprom="alloc,load" \
--change-section-lma .eeprom=0 -O binary main.elf main.eep
avr-size -C --mcu=atmega8 main.elf
AVR Memory Usage
----------------
Device: atmega8

Program:    8152 bytes (99.5% Full)
(.text + .data + .bootloader)

Data:        523 bytes (51.1% Full)
(.data + .bss + .noinit)

EEPROM:      505 bytes (98.6% Full)
(.eeprom)

 

How is it possible that the result needs 48 bytes more now?

 

What must be changed when calling avr-gcc 5.4.0?

 

Any tip would be helpful.

Last Edited: Sun. Feb 14, 2021 - 09:45 AM

Compiling another code:

 

AVR-GCC Version 5.4.0 (in Debian 10) 8162 bytes

 

 

AVR-GCC Version 4.8.1 (in Debian 8) 8106 bytes

 

 

That's a big difference for the same source, Makefile and essentially the same compiler.


So compare the LSS


clawson wrote:
So compare the LSS

I guess I'd shart start with the map(s), to help narrow down the area(s) for further digging.  [I had to laugh at my own typo]

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


Good point. avr-nm --size-sort of the ELF is one of the best ways to find the 'big stuff'.


If you want small GCC code then perhaps try the last WinAVR; it normally makes the "best" code. As I remember it's built on GCC 4.3.x.


Thanks for your ideas so far.

 

Hmmm - of course it is possible to dig through the compiler output.

There must be differences - but this will not answer the question of why two versions of the same compiler produce such different results.

 

I can only analyze what is different, but I think I can't do anything to get the same result as before with a newer compiler.

Normally a few bytes don't interest anyone, but when you only have 8 KB it makes a difference.

 

The question is why the optimization gets worse with newer versions of the compiler.

Last Edited: Sun. Feb 14, 2021 - 05:28 PM

Going over the two .lss files with KDiff3 will make you go blind!

Nearly everything is different and the same.

 

Attachment(s): 


clawson wrote:
Good point. avr-nm --size-sort of the ELF is one of the best ways to find the 'big stuff"

 

Is there a way to extract how many bytes a function needs?

So that I can get a list of the functions with the number of bytes for each one.

Last Edited: Sun. Feb 14, 2021 - 05:46 PM

sparrow2 wrote:

If you want small GCC code then perhaps try the last WinAVR; it normally makes the "best" code. As I remember it's built on GCC 4.3.x.

 

Please - I have no Windows.

 

But if GCC 4.3 compiles with the "best code", then every version after it gets worse and worse.

Maybe I should ask the developers of avr-gcc why?


Perhaps a stupid question, but do they both default to the same behaviour regarding wrap around?

 

Try to specifically enable wrap around.  

 

Add

I'm not surprised that the code gets bigger; GCC is becoming more and more an ARM-optimized compiler, and AVR is just another backend.

But there could be some speed benefits in ISRs with the new compiler; sometimes it's a give and take.

 

 

 

 

 

Last Edited: Sun. Feb 14, 2021 - 06:01 PM

sparrow2 wrote:
Try to specifically enable wrap around.

 

What is the meaning of this?

 

 

When you are not surprised, then I will not be less surprised.

Somehow everything seems to go this way ...


lsmodavr wrote:

Hmmm - of course it is possible to dig through the compiler output.

There must be differences - but this will not answer the question of why two versions of the same compiler produce such different results.

I have no idea how you think you can resolve differences in the generated Asm without comparing the Asm.

 

But anyway, later compilers generally add features and fix bugs; in doing that, it could well be the case that corrected code is larger than buggy code.

 

Another thing to know is that GCC is not really developed for AVR. It is principally an Intel and ARM compiler, so the ongoing development makes those better and better. Sometimes this can have an adverse effect on the AVR version.


lsmodavr wrote:

clawson wrote:

Good point. avr-nm --size-sort of the ELF is one of the best ways to find the 'big stuff"

 

Is there a way to extract how many bytes a function needs?

So that I can get a list of the functions with the number of bytes for each one.

 

Er, that's what you'll get from avr-nm --size-sort main.elf > main.sym

Alternatively this will give the sizes as well, slightly differently

avr-nm -n -S main.elf > main.sym

 

Do this for the .elf produced by each compiler and diff the .sym files (maybe you need to comment out a small amount of code to get it to produce the .elf with the new compiler, but it should give some clues).


lsmodavr wrote:

Is there a way to extract how many bytes a function needs?

So that I can get a list of the functions with the number of bytes for each one.

 

This is doing the job.

avr-nm -C -S --size-sort main.o

 

At any rate there are many symbols/routines that differ by a few bytes, up to 24 bytes for the biggest function.

That's hard!


MrKendo wrote:

Er, that's what you'll get from avr-nm --size-sort main.elf > main.sym

Alternatively this will give the sizes as well, slightly differently

avr-nm -n -S main.elf > main.sym

 

Thanks - seeing it now - a little bit after finding the solution for myself. smiley


clawson wrote:

But anyway, later compilers generally add features and fix bugs; in doing that, it could well be the case that corrected code is larger than buggy code.

 

Hmmm - up to now the code has run fine without bugs for years.

This small ATmega8 only does simple things.

 

clawson wrote:

Another thing to know is that GCC is not really developed for AVR. It is principally an Intel and ARM compiler, so the ongoing development makes those better and better. Sometimes this can have an adverse effect on the AVR version.

 

O.K. Good argument.

But why does compiling simple C code get less efficient?

 

 

Here are the results of the comparison if you want to compare:

 

Attachment(s): 


lsmodavr wrote:
What must be changed when calling avr-gcc 5.4.0?
Link-time optimization? ... though IIRC it took a few compiler versions to correct the defects in LTO.

https://gcc.gnu.org/onlinedocs/gcc-5.5.0/gcc/Optimize-Options.html#index-flto

 

"Dare to be naïve." - Buckminster Fuller


exec_command has increased from 0x806 to 0x81e

 

but also some functions have got smaller eg.

uart_gets reduced from 0x5a to 0x46

WaitZs reduced from 0x34 to 0x26  (EDIT typo)

 

but at least you can go and look at the assembler for these functions in the .lss and see what it is doing differently

Last Edited: Sun. Feb 14, 2021 - 07:32 PM

A lot of the optimizations done by gcc are done WRT some "intermediate" representation of the code, which only has a vague idea of what the binary code will look like (for example, I think it assumes that all simple math operations at "int" size or below generate the same amount of code).

 

IIRC (experience with Optiboot, which MUST fit in 256 words), one of the things that happened post WinAVR is that the compiler got more aggressive WRT automatically inlining small functions, resulting in multiple copies of "getch()" and similar.  These were fixed by using noinline attributes on some functions, and changing the compile flags slightly...

https://github.com/Optiboot/opti...

 


-fno-inline-small-functions perhaps?


gchapman wrote:
Link-time optimization? ... though IIRC it took a few compiler versions to correct the defects in LTO.

 

https://gcc.gnu.org/onlinedocs/gcc-5.5.0/gcc/Optimize-Options.html#index-flto

 

That's it - that seems to be the missing option.

Thank You!

 

Test before:

 

AVR-GCC Version 5.4.0 (in Debian 10) 8162 bytes

AVR-GCC Version 4.8.1 (in Debian 8) 8106 bytes

 

Test now (with additional option -flto):

 

AVR-GCC Version 5.4.0 (in Debian 10) 8096 bytes
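For reference, a sketch of where -flto goes in the build from #1 - it must appear on both the compile and the link command. This is an untested build-configuration fragment with the other flags abbreviated; it assumes the same main.c/Makefile as in #1:

```shell
# compile step: -flto stores GCC's intermediate representation in main.o
avr-gcc -c -mmcu=atmega8 -Os -std=gnu99 -flto main.c -o main.o
# link step: -flto lets avr-gcc re-run optimization across the whole program
avr-gcc -mmcu=atmega8 -Os -flto -Wl,--gc-sections,-Map,main.map main.o -o main.elf -lm
```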

 

 

 

Last Edited: Mon. Feb 15, 2021 - 08:52 AM

Just checking, but you were using -ffunction-sections, -fdata-sections and --gc-sections anyway?

 

(as it happens -flto will have a similar effect anyway I believe).

 

EDIT: must learn to re-read #1. So "yes".

Last Edited: Mon. Feb 15, 2021 - 08:54 AM

lsmodavr wrote:
up to now the code runs fine without bugs for years


 

Ha ha - the "Proven-Product Syndrome"

 

https://www.avrfreaks.net/commen...

 

https://www.avrfreaks.net/commen...

 

Just because bugs have not manifested (or maybe just not been obvious) doesn't mean that there aren't bugs hiding in there!

 

A change of compiler (version) is often a great way to find such bugs. Changing optimisation settings is another.

 

 

 

 

Top Tips:

  1. How to properly post source code - see: https://www.avrfreaks.net/comment... - also how to properly include images/pictures
  2. "Garbage" characters on a serial terminal are (almost?) invariably due to wrong baud rate - see: https://learn.sparkfun.com/tutorials/serial-communication
  3. Wrong baud rate is usually due to not running at the speed you thought; check by blinking a LED to see if you get the speed you expected
  4. Difference between a crystal and a crystal oscillator: https://www.avrfreaks.net/comment...
  5. When your question is resolved, mark the solution: https://www.avrfreaks.net/comment...
  6. Beginner's "Getting Started" tips: https://www.avrfreaks.net/comment...

clawson wrote:
EDIT: must learn to re-read #1. So "yes".

 

Yes. smiley

 

I have now tested without -ffunction-sections and --gc-sections, and the result stays the same at 8096 bytes.

 

 

What must be done to use avr-nm now?

$ avr-nm -C -S --size-sort main.o
avr-nm: main.o: plugin needed to handle lto object
00000001 00000001 C __gnu_lto_slim
00000001 00000001 C __gnu_lto_v1

How can the "plugin" be added?

 

 

And is there a description somewhere that gives an overview of what all these optimizations do?

Last Edited: Mon. Feb 15, 2021 - 09:03 AM

awneil wrote:
Ha ha - the "Proven-Product Syndrome"

 

Yes - that's possible of course.

But I can say it is really much better than using Chinese-programmed products. smiley

 

 

I think I must search for a photo of my cats for the avatar now ...


Well when you link with -flto the linker will be using a plugin. So presumably it's that same plugin you need to provide to nm but how you find or specify that I have no idea (don't use -flto myself).

 

A quick google suggests "gcc -dumpspecs" may show info about LTO, but when I try that on a copy of avr-gcc here (the one in AS7) I simply see:

C:\Program Files (x86)\Atmel\Studio\7.0\toolchain\avr8\avr8-gnu-toolchain\bin>avr-gcc -dumpspecs | grep -A 2 plugin_file:
*linker_plugin_file:

IOW there is no linker plugin file defined in the specs.


OK, a bit more poking about shows that avr-nm has a "--plugin" parameter, and while I can't find evidence of an LTO plugin in my copy of avr-gcc with AS7 (which is 5.4.0), in a fairly recent Arduino installation there is:

 Directory of D:\arduino-1.8.13\hardware\tools\avr\libexec\gcc\avr\7.3.0

26/05/2020  15:09           170,651 liblto_plugin-0.dll
26/05/2020  15:09            56,094 liblto_plugin.dll.a
26/05/2020  15:09             1,017 liblto_plugin.la
               3 File(s)        227,762 bytes

So I'm guessing that when invoking avr-nm you need to somehow make it aware of one of these so that it can decode the LTO sections of the .o/.elf files.


Other random options that may or may not reduce code size:

 

1) -fno-gcse

2) -fno-jump-tables

3) using avr-gcc version 7.x.x (versions newer than 7 are almost certain to increase code size but they come with ISR optimization which is a plus)
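A quick, hypothetical way to compare such flags is to rebuild with each candidate and record the flash usage. This fragment assumes the AVR toolchain and the thread's main.c, so it is a sketch rather than something runnable as-is:

```shell
# For each flag set, rebuild and print text+data (the flash footprint).
# An empty entry gives the baseline build for comparison.
for f in "" "-fno-gcse" "-fno-jump-tables" "-flto"; do
  avr-gcc -mmcu=atmega8 -Os $f main.c -o main.elf 2>/dev/null &&
  printf '%-20s ' "${f:-baseline}" &&
  avr-size main.elf | awk 'NR==2 { print $1 + $2 " bytes" }'
done
```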

Last Edited: Mon. Feb 15, 2021 - 09:26 AM

lsmodavr wrote:

What must be done to use avr-nm now?

$ avr-nm -C -S --size-sort main.o
avr-nm: main.o: plugin needed to handle lto object
00000001 00000001 C __gnu_lto_slim
00000001 00000001 C __gnu_lto_v1

Don't know, but what if you run nm on the .elf (ie. after the whole thing is linked) rather than on the individual object file main.o?


clawson wrote:
Well when you link with -flto the linker will be using a plugin. So presumably it's that same plugin you need to provide to nm but how you find or specify that I have no idea (don't use -flto myself).

 

It's possible to use avr-nm -C -S --size-sort --plugin /usr/lib/gcc/avr/5.4.0/liblto_plugin.so main.o

but it returned with exit code 0 and no output.


MrKendo wrote:
Don't know, but what if you run nm on the .elf (ie. after the whole thing is linked) rather than on the individual object file main.o?

 

Yes - that's the solution - Thanks!

 

avr-nm -C -S --size-sort main.elf

 

The result is:

 

Attachment(s): 


See MrKendo's reply. I completely missed the fact you were using nm on .o files rather than the .elf

 

As the ELF is "after LTO" it should no longer contain LTO artifacts.


clawson wrote:
As the ELF is "after LTO" it should no longer contain LTO artifacts.

 

Yes - but then the plugin needed to handle the LTO object must exist somewhere.


I fear you are missing the point. You are using:

$ avr-nm -C -S --size-sort main.o
avr-nm: main.o: plugin needed to handle lto object

So you are trying to look at function sizes in the .o file (unlinked, so an intermediate format with LTO info, because you built with -flto). That is why it's saying it can't show you anything: it cannot read what is "inside" those LTO sections.

 

But later in your build you must do a link. Something like:

avr-gcc -mmcu=atmega8 -I. -gstabs   -Os -Wall -Wstrict-prototypes -std=gnu99  -Wl,--gc-sections,-Map,main.map,--cref main.o   --output main.elf     -lm

that you showed in #1. That is an invocation of avr-gcc but without "-c" (so it will link, not compile), and in this you are taking the intermediate object in main.o and making it into a whole program in main.elf. If LTO is used, the intermediate .o files will have an internal LTO format that nm cannot initially understand, but when you reach the ELF (Link Time is over!) the link-time objects will no longer be present and you should simply have a code image that nm can read OK. IOW simply use:

avr-nm -C -S --size-sort main.elf

 


Thanks for the explanation.
It is not easy to understand what gcc is doing in detail with all the files.

 

Was the generation done with the .elf file here: https://www.avrfreaks.net/commen... ?

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

lsmodavr wrote:
It is not easy to understand what gcc is doing in detail with all the files.
The way to do this is to run avr-nm on the ELF to find out which routines have grown. Then look at the generated Asm for those functions in the two cases and compare to see why the later compiler has chosen to generate different, "larger" code.

 

Let's say that the earlier compiler had some bug where it maybe wasn't push/popping all the registers that needed to be preserved (I seem to recall one of the many fixed issues was something like this). So maybe the older code looks like:

FOO:
    LDI R18, 0xAA
    OUT 0x05, R18
    RET

which may well look small and efficient but the later compiler perhaps generates:

FOO:
    PUSH R18
    LDI R18, 0xAA
    OUT 0x05, R18
    POP R18
    RET

Sure it's slower and bloatier but it is actually correct. But the only way you are going to see the reason for a difference like this (a 4 byte increase in this case) IS by studying the Asm to see what has happened.

 

You can't really hope to find the differences (and get some idea how you might work around them) without studying the detail. Even if avr-nm told you that old "FOO" was 6 bytes in one case and 10 bytes in the other, you would have no idea what is behind that without comparing Asm.

 

(of course comparing Asm might not be immediately clear anyway).

 

One thing I have noticed more of in later code is Z indexing so that instead of something like:

DDRB = 0x12;
PINB = 0x34;
PORTB = 0x56;

converting to something like:

LDI R24, 0x12
OUT 0x03, R24
LDI R24, 0x34
OUT 0x05, R24
LDI R24, 0x56
OUT 0x04, R24

you might find something like:

LDI R18, 0x12
LDI R19, 0x34
LDI R20, 0x56
LDI R30, 0x03
LDI R31, 0x00
ST Z+, R18
ST Z+, R19
ST Z, R20

The compiler has to do a cost/benefit analysis on whether the indexed access is "cheaper" - for large blocks of registers it will be, for a handful it might not. But it's this kind of thing that can make code very difficult to compare, because on the surface the two versions look quite different.

Last Edited: Mon. Feb 15, 2021 - 01:22 PM

And then perhaps some routines get faster and/or smaller at the cost of using more registers, but in some combinations it runs out of registers and needs to push/pop more.

 

Remember, as long as the output follows the rules the compiler is happy; it doesn't know anything about worst case.

 

If you have the source code for everything, then build everything as one big file (perhaps there is a GCC command for this; it will take longer, but who cares for an 8K program). This way the compiler knows where everything is at compile time and can therefore generate better code (it can't optimize between .o files), and then the linker's job is just to give everything an absolute address.


clawson wrote:
You can't really hope to find the differences (and get some idea how you might work around them) without studying the detail. Even if avr-nm told you that old "FOO" was 6 bytes in one case and 10 bytes in the other, you would have no idea what is behind that without comparing Asm.

 

This was clear from the beginning.

 

But now I have a couple of explanations for the reasons, a new method to analyze the differences, and ways to compensate with other optimizations.

 

Thank you all for your help.


sparrow2 wrote:
If you have the source code for everything, then build everything as one big file (perhaps there is a GCC command for this; it will take longer, but who cares for an 8K program). This way the compiler knows where everything is at compile time and can therefore generate better code (it can't optimize between .o files), and then the linker's job is just to give everything an absolute address.

 

Yes - for such small projects it does not make sense to compile in different parts, because the compilation needs only about 1 second.

All libraries are included directly in the main file, not via header files.

Last Edited: Mon. Feb 15, 2021 - 03:40 PM


lsmodavr wrote:

Please - I have no Windows.

 

 

 

lsmodavr wrote:

 

But if GCC 4.3 compiles with the "best code", then every version after it gets worse and worse.

Maybe I should ask the developers of avr-gcc why?

You've dug enough  now to actually examine the output of a piece.  Take one of the functions that you mention, 24 bytes bigger.  Now, instead of examining 4000 words we might be examining 40.  What does the annotated-with-the-corresponding-source .LSS say about the comparison of the generated code?



Yes,   I feel your pain.   8216 bytes versus 8152 bytes.

 

I would look at your source code first.    See if you can make some savings that way.

Otherwise you will suffer each time a new compiler is released.

 

When you have tweaked the source code as well as possible, you can progress to "special" build options.

 

David.


Hey - where did you find the picture of my house? smiley

 

theusch wrote:
What does the annotated-with-the-corresponding-source .LSS say about the comparison of the generated code?

 

You mean the difference between -flto and before?

The green parts are the routines before and the blue parts with -flto.

 

 

The function suart_init is a good example.

When you look at the file https://www.avrfreaks.net/sites/... you will see that this function is no longer included in the list after optimization.

Last Edited: Mon. Feb 15, 2021 - 04:20 PM

david.prentice wrote:
Yes,   I feel your pain.   8216 bytes versus 8152 bytes.

 

The pain has gone. The solution was found here: https://www.avrfreaks.net/commen...

It is possible to save an additional 10 bytes - so there is now an improvement with the new compiler.

 

O.K. the result is untested ...

Last Edited: Mon. Feb 15, 2021 - 04:26 PM

lsmodavr wrote:
The function suart_init is a good example.
Not a really good example I'm afraid.

 

On the left is an RCALL to something we can't see (but I suspect it is pretty similar to the stuff on the right), while the stuff on the right is presumably an example of the function code simply having been inlined. As mentioned above, later compilers got better at inlining where they can. In theory this is actually a "save": it removes an RCALL and a RET and so saves 4 bytes - so the right might look "bad" but is actually "good"!


clawson wrote:

Not a really good example I'm afraid.

 

I am so sorry ...

 

I can only say that the theory is really hard, with so many details, so much complexity and so many optimization strategies.

For me the results of avr-gcc are excellent, and I am happy that I can code my simple ideas with it.

 

With an AVR it is possible to have a look and understand the generated ASM output.

When I try the same with a Blue Pill STM32F103 I go crazy.

Last Edited: Mon. Feb 15, 2021 - 04:59 PM

lsmodavr wrote:
With an AVR it is possible to have a look and to understand the results of the generated ASM output.
Sure, AVR Asm is actually pretty simple:

 

http://ww1.microchip.com/downloa...

 

But when I follow your link in #44 then, sure, that is nm output, but from which of the two builds - and where is the same for the other build? To make a comparison you need to run nm on each build. Identify a function where there's been a dramatic change in size (which is likely to be a big function like main() or check_rs485()), then get the LSS files and compare the opcode body of each instance of the same thing, looking for obvious discrepancies.

 

Now just using Kdiff/meld/BeyondCompare/diff/whatever to compare the two LSS is not necessarily easy as the code will have addresses, opcodes (hex) and also labelled call/jump destinations all of which will likely be different. But it is the "core" of the code - the LDIs and OUTs and CMPs and ADDs and whatever else. You need to look for chunks where one is obviously larger than the other then try to determine why that may have come about.

 

But if the real "issue" here is that you have an 8K chip but 8K+a_bit to fit into it, then while examining navel fluff to understand the minutiae of what has changed might help, really I'd just be looking to see if anything in the code has been done "inefficiently", simply so you can save "a_bit" bytes and bring it back under 8K.

 

The "coward's way out", of course, is simply to trade up to a 16K version of the chip.

 

Now when one considers "old 8K designs" I have a sneaking suspicion it's a mega8 (am I right?). If so, the mega88 is a very close cousin that would be fairly easy to switch to, and if you did that you future-proof yourself for expansion, as you can later trade up to mega168 and then mega328 if things get bigger.


david.prentice wrote:
When you have tweaked the source code as well as it is possible ...
A linter is one way to find low hanging fruit.

https://rules.sonarsource.com/c/?search=redundant

 



For #47, I guess that this from #1 tells it all:

-mmcu=atmega8 

If you haven't already been squeezing the C program to fit 8K, my guess is that it would be possible to save 500-1000 bytes by rearranging things: place 1-5 commonly used variables in registers, find some int variables that actually only need to be byte-sized, move some common code into a function, etc.

 

But if you are done, I guess it's all over now.

And for an old chip like a mega8, I guess that a relatively old compiler makes fine code. It's with all the new stuff - bank switching, multilevel ISRs etc. - that I would fear using an old compiler.


sparrow2 wrote:
And for an old chip like a mega8, I guess that a relatively old compiler makes fine code. It's with all the new stuff - bank switching, multilevel ISRs etc. - that I would fear using an old compiler.

I was a CodeVision user, and kept stable versions of 1.x and 2.x loaded just to do [relatively] "safe" rebuilds of legacy projects for small tweaks.  As there were some syntax and other adjustments when jumping over that boundary, I thought it would be easier and not force changes in approach.

 

lsmodavr wrote:
Hmmm - up to now the code runs fine without bugs for years.

So why this need to rebuild?  I suppose it is the dreaded "one more feature"...



sparrow2 wrote:

If you want small GCC code then perhaps try the last WinAVR; it normally makes the "best" code. As I remember it's built on GCC 4.3.x.

Not according to my research it didn't. Some years ago I had modded some old WinAVR code and it no longer fitted into an ATmega169 (16K), and I wondered whether a newer compiler would help. So I spent the day performing the test.

 

In each of the following results the code essentially remained the same; I did fix some warnings because, as the compiler became newer, more warnings were issued, but those changes didn't change the code size.

 

As you can see, as newer compiler versions were released the generated code size fell.

avr-gcc (WinAVR 20100110) 4.3.3
_______________________________
section            size      addr
.text             16762         0
.bss                404   8388864

 

 

avr-gcc (AVR_8_bit_GNU_Toolchain_3.3.2_485) 4.5.1
_________________________________________________
section             size      addr
.text              16058         0
.bss                 404   8388864

 

avr-gcc (AVR_8_bit_GNU_Toolchain_3.4.0_663) 4.6.2
_________________________________________________
section           size      addr
.text            15768         0
.bss               404   8388864

 

avr-gcc (AVR_8_bit_GNU_Toolchain_3.4.1_798) 4.6.2
_________________________________________________
section           size      addr
.text            15768         0
.bss               404   8388864

 

avr-gcc (AVR_8_bit_GNU_Toolchain_3.4.2_939) 4.7.2
_________________________________________________
section            size      addr
.text             15502         0
.bss                404   8388864

 

avr-gcc (AVR_8_bit_GNU_Toolchain_3.4.5_1522) 4.8.1
__________________________________________________
section            size      addr
.text             15424         0
.bss                404   8388864

 

 

avr-gcc (AVR_8_bit_GNU_Toolchain_3.5.2_1680) 4.9.2
__________________________________________________
section                      size      addr
.text                       15422         0
.bss                          404   8388864

 

 

Unfortunately I no longer have that code, so I cannot test even newer compiler versions.

 


clawson wrote:
But if the real "issue" here is you have an 8K chip but 8K+a_bit to fit into it then while examining navel fluff to understand the minutiae of what has changed might help really I'd just be looking to see if there's anything been done in the code "inefficiently" simply so you can save "a_bit" bytes and bring it back under 8K.

 

In this case this was not the "real issue".
I simply discovered that the same source that fit in the flash when compiled under Debian 8 did not fit when compiled under Debian 10.

This point is cleared now.

 

When the source gets too big, of course I search for inefficient programming and/or features that are not so important.

 

 

clawson wrote:
Now just using Kdiff/meld/BeyondCompare/diff/whatever to compare the two LSS is not necessarily easy as the code will have addresses, opcodes (hex) and also labelled call/jump destinations all of which will likely be different. But it is the "core" of the code - the LDIs and OUTs and CMPs and ADDs and whatever else. You need to look for chunks where one is obviously larger than the other then try to determine why that may have come about.

 

Yes - this will be a good method.


N.Winterbottom wrote:
As you can see - As newer compiler versions were released the generated code size fell.

 

That's an extensive and interesting overview - thank you.


Diffing .lss files is likely a no-go,

but I suspect diffing the assembly files would work.

Moderation in all things. -- ancient proverb


skeeve wrote:
I suspect diffing the assembly files would work.

 

How can a plain assembly file be generated with the avr-gcc toolset?

 

Or is another tool like avrdisas needed?


Just use -save-temps. For each main.c, uart.c, adc.c or whatever that is built, the build directory (often "Debug" or "Release") will then contain main.i, main.s, uart.i, uart.s, adc.i, adc.s, and so on. The .i files are the source after pre-processing and the .s files are the Asm sources after code generation.

 

As I note in one of my github projects here:

 

https://github.com/wrightflyer/a...

 

The asm that is generated can be quite difficult to read. That's because (without -save-temps) it's usually a "private" thing between the compiler and the assembler. No one expects humans to read it so it's not necessarily formatted that well for human use. As my example on github shows:

int main(void) {
    DDRB= 0xFF;
    while(1) {
        PORTB ^= 0xFF;
        _delay_ms(100);
    }
}

becomes:

.global    main
    .type    main, @function
main:
.LFB6:
    .file 1 ".././ledblink.c"
    .loc 1 4 0
    .cfi_startproc
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
    .loc 1 5 0
    ldi r24,lo8(-1)     ;  tmp46,
    out 0x17,r24     ;  MEM(volatile uint8_t *)55B?, tmp46
.L2:
    .loc 1 7 0 discriminator 1
    in r24,0x18     ;  D.1487, MEM(volatile uint8_t *)56B?
    com r24     ;  D.1487
    out 0x18,r24     ;  MEM(volatile uint8_t *)56B?, D.1487
.LVL0:
.LBB4:
.LBB5:
    .file 2 "c:\\program files\\atmel\\atmel toolchain\\avr8 gcc\\native\\3.4.2.876\\avr8-gnu-toolchain\\bin\\../lib/gcc/avr/4.7.2/../../../../avr/include/util/delay.h"
    .loc 2 164 0 discriminator 1
    ldi r24,lo8(24999)     ; ,
    ldi r25,hi8(24999)     ; ,
    1: sbiw r24,1     ;
    brne 1b
    rjmp .
    nop
    rjmp .L2     ;
.LBE5:
.LBE4:
    .cfi_endproc
.LFE6:
    .size    main, .-main

where even something simple like:

    DDRB= 0xFF;

has become:

    ldi r24,lo8(-1)     ;  tmp46,
    out 0x17,r24     ;  MEM(volatile uint8_t *)55B?, tmp46

Now lo8(-1) might well be the same thing as 0xFF but some of this stuff is not obvious. My avr-source program tries to help a bit by parsing the debug info in the file and using it to put source code back in so at least this becomes:

//==>     DDRB= 0xFF;
    ldi r24,lo8(-1)     ;  tmp46,
    out 0x17,r24     ;  MEM(volatile uint8_t *)55B?, tmp46

but I don't try to "repair" things like a lo8(-1) to 0xFF translation. But anyway at least the two compilers are going to be generating code in a "similar" fashion which might make diffing the .s a bit easier.

 

BTW is your project "secret" or could you simply share the code so we can all take a look and see if we can help you understand why one code generation is different from the other?


My advice would be:   ZIP up the AS7.0 project and attach the ZIP.

If it is a Makefile project on Linux, you can still ZIP it up.

 

I suspect that there are opportunities to reduce the code size.    Just with regular C.  No tricks.

Let's face it.   An 8kB program is not very big.

 

I would only go down the ASM rabbit hole if performance or inventory costs are overwhelming.

 

David.


clawson wrote:
Just use -save-temps.

 

Thanks for the tip and your explanations.

It gives an overview and understanding what's happening in the background.

 

It seems that avrdisas gives a better opportunity to understand the result of the assembler than the "machine readable" output of the compiler.

At this time I don't want to diff the output and try to understand what's happening in the assembler output - that is really like acting as a machine.

Last Edited: Tue. Feb 16, 2021 - 05:46 PM

lsmodavr wrote:
this is really like acting as a machine.
Or a professional software engineer ;-)

 

(actually I guess those of us touching 60 or above who've been in this game 40/50/60 years probably did the first 20/30/40 years in Asm alone, so we do tend to look at things in terms of the generated Asm - it's a tricky habit to break!)


clawson wrote:

Or a professional software engineer ;-)

 

Well - it's better when you say this. ;-)

 

I assembled my first code by hand for a Z80 ...

(and the machine was a ZX81, designed in Essex if I remember correctly)

Last Edited: Tue. Feb 16, 2021 - 06:10 PM

Sometimes you have to do that, or operate a verified compiler, or do applied formal methods (theory learned by studying abstract machines or Turing machines).

 

Tracing requirements through to object-code verification - Embedded.com

CompCert - Main page

 

"Dare to be naïve." - Buckminster Fuller


You went further than me (I was a TRS-80 Model 1 operator, though via the assembler and BASIC)

The TRS-80 Model I

 

"Dare to be naïve." - Buckminster Fuller


gchapman wrote:
Sometimes you have to do that

 

Yes - but only as a computer-science guy.

I am only an electronics guy who programs a little bit.


gchapman wrote:
The TRS-80 Model I

 

Oh yes - we had such a model at school.

And we used such an HP calculator to learn what programming is - that was before BASIC: https://en.wikipedia.org/wiki/HP-65

 

The ZX81 was the first computer that was cheap enough to buy as a grown-up.

Now an ATmega8 is an equivalent, with RAM and flash and more performance, for about $1.

Of course I first used BASCOM for it. https://www.mcselec.com/index.ph...

It's a really bad compiler in comparison to avr-gcc.

 

 

Maybe you will have fun with this project: https://www.mikrocontroller.net/... ?

Here in english: https://github.com/abelykh0/stm3...

Last Edited: Tue. Feb 16, 2021 - 07:01 PM

clawson wrote:
(actually I guess those of us touching 60 or above who've been in this game 40/50/60 years probably did the first 20/30/40 years in Asm alone so we do tend to look at things in terms of the generated Asm - it's  a tricky habit to break!)

 

We are now officially old :(

 

A chap I used to mentor at work had his 28th birthday today. It'll be another five years before he's half my age!!!

 

Neil


So the conclusion is that assembler is for oldies. ;-)

 

Or, seen in a positive way - we are the last who understand what's really going on in a machine. :-)



 

lsmodavr wrote:
(and the machine was an ZX81 designed in Essex
One county over - Sinclair Research were just opposite King's in Cambridge high street, I remember. I think they later moved out to the Cambridge Science Park. In fact 6 King's Parade, where they started, is the cream-coloured shop in the middle here:

 

 

If you swing the view around to the other side of the street:

 

 

then that is a pretty inspiring view when you are sat designing your first Z80 computers!

 

The building to the right is King's College Chapel, where the famous carol concert that is broadcast at 3pm each Christmas Eve is recorded. This is the other side of that building viewed from the River Cam:

 

 

Last Edited: Wed. Feb 17, 2021 - 10:22 AM