Hi there,
Is there a way for avr-gcc to tell me (perhaps give me a warning) after compilation, that this-and-that function/variable is not used?
Thanks
J
-ffunction-sections at compile time, followed by -Wl,--gc-sections at link time (that is, --gc-sections passed through to the linker).
The first puts each function into its own named section at compile time; the latter tells the linker to "garbage collect" (aka "throw away") any sections that are not referenced.
There's also the -fwhole-program thing but that requires you to list all the .c files in a single invocation of the compiler, which makes the Makefile more complex than usual.
Thanks, I have tried it and it works (unused functions not in executable file).
But still, is there a way for the linker to tell me that this-and-that function is not used?
You can ask for an Xref to be added to the .map ;-)
(don't ask me how - I know I managed to find the magic runes once though!)
EDIT: Ah, apologies, looking at an Mfile generated Makefile it already has:
#---------------- Linker Options ----------------
#  -Wl,...:     tell GCC to pass this to linker.
#    -Map:      create map file
#    --cref:    add cross reference to map file
LDFLAGS = -Wl,-Map=$(TARGET).map,--cref
So there's a cref in the .map already. Maybe it's Studio that doesn't add --cref by default?
EDIT2: In this I put a function (urrgle) into empty.c and called it from main.c. The .map xref gave:
urrgle empty.o test.o
The fact that two .o's mentioned it sort of implies that it was defined in one and called by the other. But a function that was local to one file would presumably only appear once and you wouldn't know if it were referenced.
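As a rough illustration of reading that cross-reference table by machine, here is a minimal Python sketch. It assumes the usual GNU ld --cref layout (a "Symbol File" header, then the defining file on the symbol's own line and each referencing file on an indented continuation line); the function names parse_cref and single_file_symbols are made up for this example. As noted above, a symbol that is only used inside its own file also shows up with a single file, so the result is a list of candidates, not proof.

```python
def parse_cref(map_text):
    """Return {symbol: [files]} from the cross-reference part of a .map file.

    Assumed layout: a 'Symbol  File' header line, then for every symbol one
    line with the first file, followed by indented lines naming further files.
    """
    table = {}
    current = None
    in_table = False
    for line in map_text.splitlines():
        if not in_table:
            # Skip everything until the table header is seen.
            if line.split() == ["Symbol", "File"]:
                in_table = True
            continue
        if not line.strip():
            continue
        fields = line.split()
        if line[0].isspace():
            # Continuation line: another file that mentions the current symbol.
            if current is not None:
                table[current].append(fields[0])
        else:
            current = fields[0]
            table[current] = fields[1:2]
    return table


def single_file_symbols(table):
    """Symbols mentioned by exactly one object file (candidates only)."""
    return sorted(sym for sym, files in table.items() if len(files) == 1)
```

Feeding it the text of $(TARGET).map and printing single_file_symbols(parse_cref(text)) would give a short list to check by hand.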
Cliff!
Any chance splint could help here? (I know you like it, and I have not exercised it enough...)
(Inspiration came from looking at JavaCC just the other day, and specifically the C parser as I would like to collect a few metrics on code like number of statements, number of expressions etc.)
'Fraid I'm not enough of a splint user to be able to help - while I've "toyed" with it I couldn't really face the rigmarole of setting it up properly for AVR projects so that it's not worried about the system .h files so I never got much beyond the 10 line source test.
I have used lint more than splint in the past, but even that was just for pre-compile error checking (our makefiles wouldn't invoke the compiler until lint set ERRORLEVEL to 0); we weren't using it to generate coverage metrics or anything like that.
But you are on the right lines - even if splint doesn't offer this functionality someone on Linux is bound to have written a tool that will do such analysis. (and if it's on Linux it could presumably be cross compiled as a Win32 tool)
Cliff
EDIT: the splint manual:
http://www.splint.org/downloads/...
includes
13.2 Complete Programs
Splint can be used on both complete and partial programs. When checking complete programs, additional checks can be done to ensure that every identifier declared by the program is defined and used, and that functions that do not need to be exported are declared static.
Splint checks that all declared variables and functions are defined (controlled by compdef). Declarations of functions and variables that are defined in an external library, may be preceded by /*@external@*/ to suppress undefined declaration errors.
Splint reports external declarations that are unused (controlled by topuse). Which declarations are reported also depends on the declaration use flags (Section 13.1). The +partial flag sets flags for checking a partial system. Top-level unused declarations, undefined declarations, and unnecessary external names are not reported if +partial is set.
Thanks for your help guys, at least avr-gcc takes out the unused functions which is a great help!
-q
--emit-relocs
Leave relocation sections and contents in fully linked executables. Post link analysis and optimization tools may need this information in order to perform correct modifications of executables. This results in larger executables.
Not trivial but homebrewable.
JW
Not trivial
Relocation section '.rela.text' at offset 0x8e8 contains 36 entries:
 Offset   Info     Type            Sym.Value Sym. Name + Addend
00000078 00003312 R_AVR_CALL      00000000  __vector_default + 0
00000000 00002812 R_AVR_CALL      00000054  __init + 0
00000004 00001b12 R_AVR_CALL      00000078  __vector_1 + 0
00000008 00003b12 R_AVR_CALL      00000078  __vector_2 + 0
0000000c 00002312 R_AVR_CALL      00000078  __vector_3 + 0
00000010 00003812 R_AVR_CALL      00000078  __vector_4 + 0
00000014 00003412 R_AVR_CALL      00000078  __vector_5 + 0
00000018 00002112 R_AVR_CALL      00000078  __vector_6 + 0
0000001c 00002d12 R_AVR_CALL      00000078  __vector_7 + 0
00000020 00004212 R_AVR_CALL      00000078  __vector_8 + 0
00000024 00003a12 R_AVR_CALL      00000078  __vector_9 + 0
00000028 00004612 R_AVR_CALL      00000078  __vector_10 + 0
0000002c 00002712 R_AVR_CALL      00000078  __vector_11 + 0
00000030 00001e12 R_AVR_CALL      00000078  __vector_12 + 0
00000034 00002a12 R_AVR_CALL      00000078  __vector_13 + 0
00000038 00004512 R_AVR_CALL      00000078  __vector_14 + 0
0000003c 00003c12 R_AVR_CALL      00000078  __vector_15 + 0
00000040 00004712 R_AVR_CALL      00000078  __vector_16 + 0
00000044 00002b12 R_AVR_CALL      00000078  __vector_17 + 0
00000048 00004812 R_AVR_CALL      00000078  __vector_18 + 0
0000004c 00002c12 R_AVR_CALL      00000078  __vector_19 + 0
00000050 00004912 R_AVR_CALL      00000078  __vector_20 + 0
00000058 00003f06 R_AVR_LO8_LDI   0000045f  __stack + 0
0000005a 00003f07 R_AVR_HI8_LDI   0000045f  __stack + 0
00000070 00003712 R_AVR_CALL      0000007c  main + 0
00000074 00004312 R_AVR_CALL      00000092  exit + 0
00000060 00002607 R_AVR_HI8_LDI   00800061  __bss_end + 0
00000062 00003606 R_AVR_LO8_LDI   00800060  __bss_start + 0
00000064 00003607 R_AVR_HI8_LDI   00800060  __bss_start + 0
00000066 00000103 R_AVR_13_PCREL  00000000  .text + 6a
0000006a 00002606 R_AVR_LO8_LDI   00800061  __bss_end + 0
0000006e 00000102 R_AVR_7_PCREL   00000000  .text + 68
0000007c 00003112 R_AVR_CALL      00000088  urrgle + 0
00000080 00000103 R_AVR_13_PCREL  00000000  .text + 7c
0000008c 00000112 R_AVR_CALL      00000000  .text + 82
00000094 00000103 R_AVR_13_PCREL  00000000  .text + 94
Within that, notice the R_AVR_CALL entries. Now I modify the code so urrgle() doesn't call gurrgle()...
Relocation section '.rela.text' at offset 0x8dc contains 35 entries:
 Offset   Info     Type            Sym.Value Sym. Name + Addend
00000078 00003312 R_AVR_CALL      00000000  __vector_default + 0
00000000 00002812 R_AVR_CALL      00000054  __init + 0
00000004 00001b12 R_AVR_CALL      00000078  __vector_1 + 0
00000008 00003b12 R_AVR_CALL      00000078  __vector_2 + 0
0000000c 00002312 R_AVR_CALL      00000078  __vector_3 + 0
00000010 00003812 R_AVR_CALL      00000078  __vector_4 + 0
00000014 00003412 R_AVR_CALL      00000078  __vector_5 + 0
00000018 00002112 R_AVR_CALL      00000078  __vector_6 + 0
0000001c 00002d12 R_AVR_CALL      00000078  __vector_7 + 0
00000020 00004212 R_AVR_CALL      00000078  __vector_8 + 0
00000024 00003a12 R_AVR_CALL      00000078  __vector_9 + 0
00000028 00004612 R_AVR_CALL      00000078  __vector_10 + 0
0000002c 00002712 R_AVR_CALL      00000078  __vector_11 + 0
00000030 00001e12 R_AVR_CALL      00000078  __vector_12 + 0
00000034 00002a12 R_AVR_CALL      00000078  __vector_13 + 0
00000038 00004512 R_AVR_CALL      00000078  __vector_14 + 0
0000003c 00003c12 R_AVR_CALL      00000078  __vector_15 + 0
00000040 00004712 R_AVR_CALL      00000078  __vector_16 + 0
00000044 00002b12 R_AVR_CALL      00000078  __vector_17 + 0
00000048 00004812 R_AVR_CALL      00000078  __vector_18 + 0
0000004c 00002c12 R_AVR_CALL      00000078  __vector_19 + 0
00000050 00004912 R_AVR_CALL      00000078  __vector_20 + 0
00000058 00003f06 R_AVR_LO8_LDI   0000045f  __stack + 0
0000005a 00003f07 R_AVR_HI8_LDI   0000045f  __stack + 0
00000070 00003712 R_AVR_CALL      0000007c  main + 0
00000074 00004312 R_AVR_CALL      0000008e  exit + 0
00000060 00002607 R_AVR_HI8_LDI   00800061  __bss_end + 0
00000062 00003606 R_AVR_LO8_LDI   00800060  __bss_start + 0
00000064 00003607 R_AVR_HI8_LDI   00800060  __bss_start + 0
00000066 00000103 R_AVR_13_PCREL  00000000  .text + 6a
0000006a 00002606 R_AVR_LO8_LDI   00800061  __bss_end + 0
0000006e 00000102 R_AVR_7_PCREL   00000000  .text + 68
0000007c 00003112 R_AVR_CALL      00000082  urrgle + 0
00000080 00000103 R_AVR_13_PCREL  00000000  .text + 7c
00000090 00000103 R_AVR_13_PCREL  00000000  .text + 90
So the giveaway clue in that was the difference:
0000008c 00000112 R_AVR_CALL 00000000 .text + 82
Only by now analysing the .map do I actually find what is at 0x0082:
.text          0x00000082        0xc empty.o
               0x00000088                gurrgle
               0x00000082                urrgle
But it's "gurrgle" at 0x0088 that is the thing that is not called (the "dead function"). So I think this is going to be decidedly "not trivial" !
objdump appears to output a more civil version of the same information. Could you please give it a try?
Jan
Sadly no help - I used the readelf rather than objdump output above simply because one seems to be a superset of the other. The objdump output for the case with the called function is:
===============================================================================
test.elf: file format elf32-avr
RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE
00000078 R_AVR_CALL        __vector_default
00000000 R_AVR_CALL        __init
00000004 R_AVR_CALL        __vector_1
00000008 R_AVR_CALL        __vector_2
0000000c R_AVR_CALL        __vector_3
00000010 R_AVR_CALL        __vector_4
00000014 R_AVR_CALL        __vector_5
00000018 R_AVR_CALL        __vector_6
0000001c R_AVR_CALL        __vector_7
00000020 R_AVR_CALL        __vector_8
00000024 R_AVR_CALL        __vector_9
00000028 R_AVR_CALL        __vector_10
0000002c R_AVR_CALL        __vector_11
00000030 R_AVR_CALL        __vector_12
00000034 R_AVR_CALL        __vector_13
00000038 R_AVR_CALL        __vector_14
0000003c R_AVR_CALL        __vector_15
00000040 R_AVR_CALL        __vector_16
00000044 R_AVR_CALL        __vector_17
00000048 R_AVR_CALL        __vector_18
0000004c R_AVR_CALL        __vector_19
00000050 R_AVR_CALL        __vector_20
00000058 R_AVR_LO8_LDI     __stack
0000005a R_AVR_HI8_LDI     __stack
00000070 R_AVR_CALL        main
00000074 R_AVR_CALL        exit
00000060 R_AVR_HI8_LDI     __bss_end
00000062 R_AVR_LO8_LDI     __bss_start
00000064 R_AVR_HI8_LDI     __bss_start
00000066 R_AVR_13_PCREL    .text+0x0000006a
0000006a R_AVR_LO8_LDI     __bss_end
0000006e R_AVR_7_PCREL     .text+0x00000068
0000007c R_AVR_CALL        urrgle
00000080 R_AVR_13_PCREL    .text+0x0000007c
0000008c R_AVR_CALL        .text+0x00000082
00000094 R_AVR_13_PCREL    .text+0x00000094
(I padded the width with an === spacer so it's easier to compare this with the readelf output above)
I take that back. This is definitively nontrivial :-(
JW
avr-nm -f posix -l fred.elf
might be useful.
gurrgle T 00000072 00000006 D:\test/empty.c:11
main T 0000006c 00000006 D:\test/test.c:11
urrgle T 00000078 0000000a D:\test/empty.c:6
and when the call is not made:
gurrgle T 00000078 00000006 D:\test/empty.c:11
main T 0000006c 00000006 D:\test/test.c:11
urrgle T 00000072 00000006 D:\test/empty.c:6
So unfortunately that does not look like it helps to prove that gurrgle() isn't called.
Regardless of the technicalities, there is one more fundamental issue with the post-compile approach: from the binary alone it appears to be indistinguishable whether a function with no call to it was inlined (so it cannot be removed from the sources, but its stand-alone copy can be dropped from the binary by making it static), or is never called in the sources at all (so it can be removed from the sources altogether).
JW
CodeVision reports any uncalled functions and whinges.
They are automatically removed from the link.
From distant memory of linkers wot I wrote in the 1980's, you can easily determine unreferenced globals. However it depends on how the Compiler has generated the object file. For example 'gurrgle' may appear as a .globl but only a pc-relative reference is made in its own source module.
I occasionally use an IAR compiler for an obsolete Mitsubishi 740 series cpu. The IAR linker helpfully does not report undefined references correctly. So I post process the objects and the map file with some sed-scripts.
In an ideal world, the linker just removes all un-referenced blocks of code. Failing that, you need to post-process your object modules or map file. You #if the offending functions in your source code.
Jan's suggestion of doing it all with objdump seems a good one. Use sed to place the relocs in one file, the entries in another file. Sort and Uniq both files, and finally Diff them.
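The sort/uniq/diff idea amounts to a set difference, so here is a minimal Python sketch of it instead of sed. It assumes the "avr-nm -f posix" and "avr-objdump -r" text layouts shown earlier in the thread; the helper names are invented for the example. Section-relative targets such as .text+0x00000082 are deliberately skipped, which means a function reached only through such a relocation is still reported, so treat the output as "possibly unreferenced".

```python
import re

# One relocation record: 8 hex digits, relocation type, then the target value.
RELOC_LINE = re.compile(r"^[0-9a-fA-F]{8}\s+(R_\w+)\s+(\S+)")


def reloc_targets(objdump_text):
    """Named symbols that appear as relocation targets."""
    targets = set()
    for line in objdump_text.splitlines():
        m = RELOC_LINE.match(line.strip())
        if not m:
            continue
        # Strip an optional "+0x..." / "-0x..." addend from the target.
        name = re.split(r"[+-]0x", m.group(2))[0]
        if not name.startswith("."):  # ignore targets like .text+0x82
            targets.add(name)
    return targets


def defined_functions(nm_text):
    """Names of code symbols (type T or t) in 'nm -f posix' output."""
    names = set()
    for line in nm_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] in ("T", "t"):
            names.add(fields[0])
    return names


def possibly_unreferenced(nm_text, objdump_text):
    """Defined functions that no named relocation points at."""
    return sorted(defined_functions(nm_text) - reloc_targets(objdump_text))
```

Startup symbols and interrupt vectors would need to be filtered out by hand, since nothing "calls" them in the ordinary sense.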
But knowing GCC, there is probably a command-line switch for doing all of this in the first place.
David.
The CV behaviour is very like GCC's fairly recently added "-fwhole-program" option, which gives the compiler the freedom to throw away uncalled functions like CV does. It does this in the same way as CV, and that's by compiling ALL the .c files at the same time.
If the OP used -fwhole-program then he would get what he was looking for: no code generation for unaccessed items.
As I wrote above, now I think post-compiler/post-linker is not the right place to do this thing.
I went through the gcc switches and -fdump-ipa-cgraph produces interesting files (except when -O0 is used, unfortunately). Similarly -fdump-tree-xxx-yyy are very interesting switches; but all those dumps are not that easy to interpret.
I also tried to play with http://savannah.gnu.org/projects... , but that chokes on typedefs (anyway, using a different tool than the native compiler to re-parse code is IMHO a bad idea to start from; that's also one of my objections to the *lint family, although that's a slightly different case).
Jan
Jan,
Is -fwhole-program not the solution?
Cliff
EDIT: One of the prior threads on this subject:
Cliff,
CV has used linkable objects since v2.0 i.e. for the last year or two.
It really should be quite simple for the linker to reject unreferenced blocks. The downside is that things like version strings get removed too.
Even if it links blindly, reporting 'possibly' unreferenced entry points should be easy.
I may investigate either my will to live, or mow my lawn. (after I have had some lunch)
David.
I may investigate either my will to live, or mow my lawn
So, just start a new hobby project. :wink:
Get eg JavaCC. It's a compiler-compiler, akin to eg Flex&Bison, Lex&YACC etc. The big difference is that it seems to be easier to get going for people who don't spend 25 hrs a day writing compilers.
On the JavaCC site there is a link to a repository of language definitions, and there is one file with a definition for C. I know that in that file there is a production (language rule) for a function definition because I've seen it. There should be a production for a function call, although I have not stumbled over it.
Now, in a JavaCC language definition you can inject ordinary Java code to be executed when a production matches. So now you set up a simple symbol table (simple because this is C, not C++, and you don't need to handle function overloading, and name mangling/decoration).
In the function definition production you add functions to the symtab. Any function definition is uniquely identified by the source file name and the function name. You also need to detect if it is static. Slam this information into a symbol table together with a flag indicating if it is called (initially set to false, or zero).
When the function call production matches you go into the lookup table and try to locate the function by its name. There are details on handling of static, and there are details on detecting multiple matches, that we can talk about if anyone actually becomes interested enough in this rant of mine.
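To make the symbol-table idea concrete, here is a minimal sketch in Python (not JavaCC/Java, and with no parser attached). FunctionEntry, SymbolTable, define and mark_called are hypothetical names; the static rule used here (a definition in the caller's own file wins, and statics in other files are invisible) is one plausible reading of the "details on handling of static". Since a call can appear before the definition, all definitions are assumed to be collected in a first pass.

```python
from dataclasses import dataclass


@dataclass
class FunctionEntry:
    file: str
    name: str
    is_static: bool
    called: bool = False


class SymbolTable:
    def __init__(self):
        # Keyed by (source file, function name), as described above.
        self._entries = {}

    def define(self, file, name, is_static=False):
        """Record a function definition; 'called' starts out False."""
        self._entries[(file, name)] = FunctionEntry(file, name, is_static)

    def mark_called(self, caller_file, name):
        """Flag the function(s) a call in caller_file could refer to."""
        local = self._entries.get((caller_file, name))
        if local is not None:
            # A definition in the caller's own file (static or not) wins.
            local.called = True
            return [local]
        # Otherwise only non-static definitions elsewhere are visible.
        matches = [e for e in self._entries.values()
                   if e.name == name and not e.is_static]
        for entry in matches:
            entry.called = True
        return matches  # more than one entry means an ambiguous match

    def uncalled(self):
        """(file, name) pairs that were never flagged as called."""
        return sorted((e.file, e.name)
                      for e in self._entries.values() if not e.called)
```

Calls through function pointers would never hit the call production, so anything whose address is taken would need a separate "referenced" flag.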
I won't do the work - if I find a cohesive block of time to hobby-code I have more pressing obligations (read: "the C/C++ demonstrator") - but I am willing to toss in any advice I can give.
Is -fwhole-program not the solution?
Is there a way for avr-gcc to tell me (perhaps give me a warning) after compilation, that this-and-that function/variable is not used?
And I got curious (as David said, bad combination - a lawn to mow too...)
Jan
Solution to what?
Quote:
Solution to what?
Indirectly the OP's question. You build the project with -fwhole-program, the unused functions are dumped, you now use avr-nm, the .sym file (same thing) or the .map file to find out which functions ARE included. Any not listed are uncalled.
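A minimal Python sketch of that comparison follows. It assumes you already have a list of the function names defined in your sources (how you obtain it is up to you; ctags or a grep would do) and the text output of "avr-nm -f posix" on the linked .elf. The names linked_functions and dropped_functions are invented for the example.

```python
def linked_functions(nm_posix_text):
    """Map function name -> (address, size) for code symbols in the .elf.

    Expects 'nm -f posix' lines of the form: name type address [size] ...
    """
    found = {}
    for line in nm_posix_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] in ("T", "t"):
            addr = int(fields[2], 16)
            # The size column is optional; treat a non-hex field as absent.
            try:
                size = int(fields[3], 16)
            except (IndexError, ValueError):
                size = None
            found[fields[0]] = (addr, size)
    return found


def dropped_functions(source_functions, nm_posix_text):
    """Functions present in the sources but absent from the linked image."""
    return sorted(set(source_functions) - set(linked_functions(nm_posix_text)))
```

A function that was inlined everywhere is also absent from the symbol list, so "dropped" does not by itself mean "never called in the sources".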
Still, without trying, I suspect, this does not tell whether functions can be removed from the source or just made static.
Jan
I suppose one could always parse the .lss file.
There may be something in that. The functions that exist are (looking for lines with ">:"):
0000006c <main>:
..
00000072 <urrgle>:
..
00000078 <gurrgle>:
But the word "call" just occurs:
60: 0e 94 36 00 call 0x6c ; 0x6c <main>
..
6c: 0e 94 39 00 call 0x72 ; 0x72 <urrgle>
There's no "call" to gurrgle.
HOWEVER if I do invoke the function but remove the "__attribute__((noinline))" then I get:
0000006c <main>:
..
00000072 <urrgle>:
..
0000007a <gurrgle>:
and (looking for "call"):
60: 0e 94 36 00 call 0x6c ; 0x6c <main>
..
6c: 0e 94 39 00 call 0x72 ; 0x72 <urrgle>
So I'm afraid that this fails too because gurrgle() was called but, because it was inlined, it's not easy to spot that it was used.
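For what it's worth, the .lss scan described above can be sketched in a few lines of Python. It assumes the usual avr-objdump listing layout, with "address <name>:" label lines and the call target repeated as "<name>" in the trailing comment; never_called and the roots parameter are hypothetical. It shows exactly the weakness just mentioned: an inlined function has a label but no call pointing at it, so it is reported as if it were unused.

```python
import re

# A function label line such as: 0000006c <main>:
LABEL = re.compile(r"^([0-9a-fA-F]+) <([^>]+)>:")
# A call/rcall whose comment names a symbol exactly (no +offset).
CALL = re.compile(r"\br?call\b.*<([^>+]+)>")


def labels_and_calls(lss_text):
    """Return (set of function labels, set of call targets)."""
    labels, called = set(), set()
    for line in lss_text.splitlines():
        m = LABEL.match(line)
        if m:
            labels.add(m.group(2))
            continue
        m = CALL.search(line)
        if m:
            called.add(m.group(1))
    return labels, called


def never_called(lss_text, roots=("main",)):
    """Labels with no call instruction aimed at them, minus known entry points."""
    labels, called = labels_and_calls(lss_text)
    return sorted(labels - called - set(roots))
```

Interrupt vectors and startup labels would have to be added to roots, and indirect calls (icall) carry no symbol name at all.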
So I'm afraid that this fails too because gurrgle() was called but, because it was inlined, it's not easy to spot that it was used.
JW
maybe there is something in the dwarf info
00000072 <urrgle>:
#include "test.h"
void gurrgle(void);
void urrgle(void) {
    PORTB = 0xFF;
  72: 8f ef    ldi r24, 0xFF ; 255
  74: 88 bb    out 0x18, r24 ; 24
    gurrgle();
}
void gurrgle(void) {
    PORTD = 0xFF;
  76: 82 bb    out 0x12, r24 ; 18
void gurrgle(void);
void urrgle(void) {
    PORTB = 0xFF;
    gurrgle();
}
  78: 08 95    ret
It's annotated the OUT to 0x12 (which comes from the body of gurrgle() ) in the middle of urrgle() but I always find such .lss annotation to be "all over the place" so whether you can determine the "rules" as to what's going on and hence reverse engineer it is another question. Simplistically, as a human, when I first read that it starts by looking like separate urrgle() and gurrgle() and it's only the fact that there's no "RET" in the middle that gives the game away.
Okay, so now up to that 250+ pages dwarf specs.... :-(
JW
Jan,
Remember that the DWARF2 stuff is only there when the code is built with -g (and with the dwarf-2 option). OTOH I guess almost all avr-gcc programs are built with -g (except for the ones that aren't of course!)
Remember that the DWARF2 stuff is only there when the code is built with -g (and with the dwarf-2 option). OTOH I guess almost all avr-gcc programs are built with -g (except for the ones that aren't of course!)
The question is how to extract the "function is called/inlined" information.
JW