Finding value in .hex file


Hi, at a certain point in code I have the following lines:

 

#define FIRMWARE_VERSION_BYTE_0 6517
#define FIRMWARE_VERSION_BYTE_1 9108
#define FIRMWARE_VERSION_BYTE_2 4167

 

The numbers are meaningless and intentionally large so that I could locate them more easily using a hex editor (on Windows - I've tried HxD, HexEdit and FlexHex); however, I've had no luck yet. So the question is: how do I locate these bytes in the hex file? I also don't know whether I should search for them in the .hex or the .bin, which is confusing.


Just making #define's does not embed them in the code?!? Now if you had something like:

#include <stdint.h>

#define FIRMWARE_VERSION_BYTE_0 6517
#define FIRMWARE_VERSION_BYTE_1 9108
#define FIRMWARE_VERSION_BYTE_2 4167

const struct {
    uint16_t ver_0;
    uint16_t ver_1;
    uint16_t ver_2;
} verinfo = {
    FIRMWARE_VERSION_BYTE_0,
    FIRMWARE_VERSION_BYTE_1,
    FIRMWARE_VERSION_BYTE_2
};

int main(void) {

}

then, yes you would get this:

$ avr-gcc -mmcu=atmega16 -Os avr.c -o avr.elf
$ avr-objcopy -j .text -j .data -O ihex avr.elf avr.hex
$ cat avr.hex
:100000000C942A000C943F000C943F000C943F0089
:100010000C943F000C943F000C943F000C943F0064
:100020000C943F000C943F000C943F000C943F0054
:100030000C943F000C943F000C943F000C943F0044
:100040000C943F000C943F000C943F000C943F0034
:100050000C943F0011241FBECFE5D4E0DEBFCDBF1E
:1000600010E0A0E6B0E0E8E8F0E002C005900D92F4
:10007000A636B107D9F70E9441000C9442000C94B7
:0800800000000895F894FFCF81
:06008800751994234710D6
:00000001FF

Where 0x1975 = 6,517 = "byte 0", 0x2394 = 9,108 = "byte 1" and 0x1047 = 4,167 = "byte 2"

 

EDIT: forgot to make the point that what you actually see in the hex for something like 6517 (0x1975) is 75 followed by 19, not 19 then 75, because the encoding is "little endian" (low byte first).
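
To make the byte order concrete, here is a minimal sketch in plain C. The helper name put_le16 is made up for illustration; it simply splits a 16-bit value into two bytes in the order they appear in the data record.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: write v into out[] low byte first (little endian),
   the order in which multi-byte values show up in the .hex data. */
static void put_le16(uint16_t v, uint8_t out[2])
{
    assert(out != 0);
    out[0] = (uint8_t)(v & 0xFF);   /* low byte comes first   */
    out[1] = (uint8_t)(v >> 8);     /* high byte comes second */
}
```

For 6517 (0x1975) this gives 75 19, for 9108 (0x2394) it gives 94 23 and for 4167 (0x1047) it gives 47 10, which together form the payload 751994234710 of the data record at address 0x0088.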

 

Oh and you called your defines "BYTE_0", "BYTE_1" and so on. Byte means "8 bits" and you cannot get numbers above 255 into a byte. That is why I stored your values into uint16_t's in my example code as a 16 bit variable can hold 0..65535. But in this case "_BYTE_" is a misleading misnomer.

Last Edited: Thu. Jun 30, 2016 - 10:33 AM

clawson wrote:
Just making #define's does not embed them in the code?!?

 

Ah yes, that thought has occurred to me but I ignored it, foolishly. Thanks a lot!

 

EDIT:

 

clawson wrote:
Oh and you called your defines "BYTE_0", "BYTE_1" and so on. Byte means "8 bits" and you cannot get numbers above 255 into a byte. That is why I stored your values into uint16_t's in my example code as a 16 bit variable can hold 0..65535. But in this case "_BYTE_" is a misleading misnomer.

 

I am fully aware of that - those values are bytes, normally, but I intentionally put larger values here to make them easier to find.

Last Edited: Thu. Jun 30, 2016 - 10:40 AM

Shantea wrote:
but I intentionally put larger values here to make them easier to find.

For that kind of thing I tend to do this:

#include <stdint.h>

#define VER_MAJOR 1
#define VER_MINOR 3
#define VER_REV 67 //aka 0x43

const struct {
    uint32_t cookie;
    uint8_t ver_major;
    uint8_t ver_minor;
    uint8_t revision;
} verinfo = {
    0xBABEFACE,
    VER_MAJOR,
    VER_MINOR,
    VER_REV
};

Where 0xBABEFACE is easily recognisable. However, because of the endianness you would probably do:

} verinfo = {
    0xCEFABEBA,
    VER_MAJOR,
    VER_MINOR,
    VER_REV
};

which makes little sense in the source but will come out looking right in the binary...

:08008800BABEFACE01034300E9
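
Rather than reversing the cookie by hand, one option is to let the preprocessor do it. This is only a sketch: the names SWAP32 and cookie_byte are invented here, and it assumes a little-endian target such as the AVR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro: reverse the four bytes of a 32-bit constant so that a
   little-endian target stores them in the order written in the source. */
#define SWAP32(x) ( (((uint32_t)(x) & 0x000000FFUL) << 24) | \
                    (((uint32_t)(x) & 0x0000FF00UL) <<  8) | \
                    (((uint32_t)(x) & 0x00FF0000UL) >>  8) | \
                    (((uint32_t)(x) & 0xFF000000UL) >> 24) )

/* Hypothetical helper: the byte a little-endian target places at offset i
   (0..3) when it stores the 32-bit value `stored`. */
static uint8_t cookie_byte(uint32_t stored, unsigned i)
{
    assert(i < 4);
    return (uint8_t)(stored >> (8 * i));
}
```

With this, the initialiser can read SWAP32(0xBABEFACE) in the source, and the bytes at offsets 0 to 3 come out as BA BE FA CE.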

Of course another way to do this is:

#include <avr/io.h>

const char ver_string[] = __DATE__ " " __TIME__;

int main(void) {
}

which doesn't look too promising at first in the .hex:

:0800800000000895F894FFCF81
:100088004A756E20333020323031362031323A30E2
:06009800313A3035000092
:00000001FF

but if you convert to bin and dump it all is revealed...

$ avr-objcopy -I ihex -O binary avr.hex avr.bin
$ hexdump -C avr.bin
00000000  0c 94 2a 00 0c 94 3f 00  0c 94 3f 00 0c 94 3f 00  |..*...?...?...?.|
00000010  0c 94 3f 00 0c 94 3f 00  0c 94 3f 00 0c 94 3f 00  |..?...?...?...?.|
*
00000050  0c 94 3f 00 11 24 1f be  cf e5 d4 e0 de bf cd bf  |..?..$..........|
00000060  10 e0 a0 e6 b0 e0 e8 e8  f0 e0 02 c0 05 90 0d 92  |................|
00000070  a6 37 b1 07 d9 f7 0e 94  41 00 0c 94 42 00 0c 94  |.7......A...B...|
00000080  00 00 08 95 f8 94 ff cf  4a 75 6e 20 33 30 20 32  |........Jun 30 2|
00000090  30 31 36 20 31 32 3a 30  31 3a 30 35 00 00        |016 12:01:05..|
0000009e

To do this you would want the file that contains the __DATE__/__TIME__ string to be rebuilt every time. So perhaps put it in a separate file and use something like a "touch" rule in the Makefile to ensure that make sees it as out of date every time it builds.


Probably a safer way to find your variables would be to look up their addresses in the map file, and then find those addresses in the hex file ... 
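
As a rough sketch of that lookup, the C helper below checks whether one Intel HEX data record contains a given address and copies the bytes out. The names ihex_read_at, hex_byte and hex_nibble are hypothetical, and extended-address records are ignored, so it only suits devices with up to 64K of flash. One caveat: for initialised variables in .data the map file lists a RAM address, while the .hex holds the flash load address, so it is the load address you need to search for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one ASCII hex digit to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Read the two hex digits at p as one byte, or return -1 on bad input. */
static int hex_byte(const char *p)
{
    int hi = hex_nibble(p[0]);
    int lo = (hi < 0) ? -1 : hex_nibble(p[1]);
    return (hi < 0 || lo < 0) ? -1 : ((hi << 4) | lo);
}

/* Hypothetical helper: if `line` is a data record (type 00) with a valid
   checksum that contains the n bytes starting at load address `addr`,
   copy them to out[] and return 1. Return 0 otherwise. */
static int ihex_read_at(const char *line, uint16_t addr, uint8_t *out, size_t n)
{
    uint8_t rec[4 + 255 + 1];  /* count, addr hi, addr lo, type, data, checksum */
    size_t total, i;
    uint16_t start;
    uint8_t sum = 0;
    int b;

    assert(line != NULL && out != NULL);
    if (line[0] != ':' || (b = hex_byte(line + 1)) < 0) return 0;
    total = (size_t)b + 5;                      /* whole record, in bytes */
    for (i = 0; i < total; i++) {
        if ((b = hex_byte(line + 1 + 2 * i)) < 0) return 0;
        rec[i] = (uint8_t)b;
        sum = (uint8_t)(sum + rec[i]);
    }
    if (sum != 0 || rec[3] != 0x00) return 0;   /* bad checksum or not data */

    start = (uint16_t)((rec[1] << 8) | rec[2]);
    if (addr < start || (size_t)(addr - start) + n > rec[0]) return 0;
    for (i = 0; i < n; i++)
        out[i] = rec[4 + (addr - start) + i];
    return 1;
}
```

Called on the record ":06008800751994234710D6" with address 0x0088 and n = 6, it returns the six bytes 75 19 94 23 47 10. Calling it on every line of the .hex until it returns 1 finds the record that holds the address.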

Top Tips:

  1. How to properly post source code - see: https://www.avrfreaks.net/comment... - also how to properly include images/pictures
  2. "Garbage" characters on a serial terminal are (almost?) invariably due to wrong baud rate - see: https://learn.sparkfun.com/tutorials/serial-communication
  3. Wrong baud rate is usually due to not running at the speed you thought; check by blinking a LED to see if you get the speed you expected
  4. Difference between a crystal, and a crystal oscillator: https://www.avrfreaks.net/comment...
  5. When your question is resolved, mark the solution: https://www.avrfreaks.net/comment...
  6. Beginner's "Getting Started" tips: https://www.avrfreaks.net/comment...

clawson wrote:
Just making #define's does not embed them in the code?!?

Shantea wrote:
Ah yes, that thought has occurred to me 

You do understand why it is so - don't you ... ?


In the past I've used "what strings". That is a string that you embed into a binary with an "@(#)" prefix. There is then a Linux tool called "what", so you can say "what foo.bin" and it will report any @(#) text that is embedded, such as "Widget driver 3.7.213" or whatever. Rather sadly Google ignores @(#) completely, even if you quote it, so trying to google for "what strings "@(#)"" simply tells you about the best guitar strings money can buy :-(

 

(the idea of "what strings" originated with one of the earliest revision control systems but try as I might I can't google/remember which one it was! With hindsight calling something "what" probably wasn't a great idea as it now turns out to be almost impossible to google for it!)

 

EDIT: turns out it was worth mentioning this as it made me look for search engines that WILL search for things like @ and # (apparently Google treats these only as "Twitter characters"). It led to:

 

www.symbolhound.com

 

That then took me straight to:

 

http://stackoverflow.com/questio...

http://stackoverflow.com/questio...

 

which reminds me that the name of the ancient old revision control system I was looking for was "SCCS". That is what provided the "what" utility.

Last Edited: Thu. Jun 30, 2016 - 11:49 AM

awneil wrote:
clawson wrote:
Just making #define's does not embed them in the code?!?

Shantea wrote:
Ah yes, that thought has occurred to me

You do understand why it is so - don't you ... ?

 

Yes, because a #define is just a symbolic name for a value. If the value is never used, nothing shows up in the compiled binary.


clawson wrote:
To do this you would want the file that contains the __DATE__/__TIME__ string to be rebuilt every time. So perhaps put it in a separate file and use something like a "touch" rule in the Makefile to ensure that make sees it as out of date every time it builds.

 

How do I do this in Atmel Studio (6.2)?


Shantea wrote:
How do I do this in Atmel Studio (6.2)?

Well the embedding of the string is easy. It's exactly as I showed above:

#include <avr/io.h>

const char ver_string[] = __DATE__ " " __TIME__;

int main(void) {
}

In fact you probably don't want to waste RAM on that, so you should use:

#include <avr/io.h>
#include <avr/pgmspace.h>

const char ver_string[] PROGMEM = __DATE__ " " __TIME__;

int main(void) {
}

or, if your Studio/C compiler is recent enough, the new:

#include <avr/io.h>

const __flash char ver_string[] = __DATE__ " " __TIME__;

int main(void) {
}

That gets the date/time *into* the binary. As I showed above the "trick" is then getting it back out in visible form. In Studio you can setup a "post build step" that issues the command:

avr-objcopy -I ihex -O binary $(name of).hex $(name of).bin

which will create a .bin copy of the hex. "$(name of)" is not valid syntax in this but Atmel (really Microsoft) do provide some project specific metavariables that hold things like the name of the output directory and the name of the final, generated .hex file. So use those.

 

Of course that just gets it as far as .bin. To actually "see" it you need to load that .bin file into some kind of hex dumper/editor (I have the luxury of using Linux so for me things like hexdump and hexedit are simply endemic). This is where that "what" technology I mentioned above could come into play. One could actually make the string:

const char ver_string[] = "@(#)" __DATE__ " " __TIME__;

which prepends a recognisable sequence like "@(#)" onto the front. Then one could either take what/what.exe from SCCS and use that to display these though it's probably easier to just write it as a 10 line C program from scratch. Then have AS6/7 post build commands invoke that utility on the generated .bin file.
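
A minimal sketch of the core of such a utility follows. The function name what_next is made up, and it stops at a NUL or a newline only, which is an assumption of this sketch rather than a full reimplementation of the SCCS tool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: search buf[*pos..len) for the next "@(#)" marker.
   On success copy the text after it (up to a NUL, a newline or the end of
   the buffer, truncated to outsz-1 characters) into out, advance *pos past
   that text and return 1. Return 0 when no further marker exists. */
static int what_next(const unsigned char *buf, size_t len, size_t *pos,
                     char *out, size_t outsz)
{
    static const char marker[] = "@(#)";
    size_t i, n = 0;

    assert(buf != NULL && pos != NULL && out != NULL && outsz > 0);
    for (i = *pos; i + 4 <= len; i++) {
        if (memcmp(buf + i, marker, 4) != 0)
            continue;
        for (i += 4; i < len && buf[i] != '\0' && buf[i] != '\n'; i++)
            if (n + 1 < outsz)
                out[n++] = (char)buf[i];
        out[n] = '\0';
        *pos = i;
        return 1;
    }
    *pos = len;
    return 0;
}
```

To turn this into a command-line tool, a main() would read the .bin into a buffer with fread(), start with a position of 0 and print each string for as long as what_next() returns 1.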


clawson wrote:

Just making #define's does not embed them in the code?!? Now if you had something like:

#define FIRMWARE_VERSION_BYTE_0 6517
#define FIRMWARE_VERSION_BYTE_1 9108
#define FIRMWARE_VERSION_BYTE_2 4167

const struct {
    uint16_t ver_0;
    uint16_t ver_1;
    uint16_t ver_2;
} verinfo = {
    FIRMWARE_VERSION_BYTE_0,
    FIRMWARE_VERSION_BYTE_1,
    FIRMWARE_VERSION_BYTE_2
};

int main(void) {

}

then, yes you would get this:

$ avr-gcc -mmcu=atmega16 -Os avr.c -o avr.elf
$ avr-objcopy -j .text -j .data -O ihex avr.elf avr.hex
$ cat avr.hex
:100000000C942A000C943F000C943F000C943F0089
:100010000C943F000C943F000C943F000C943F0064
:100020000C943F000C943F000C943F000C943F0054
:100030000C943F000C943F000C943F000C943F0044
:100040000C943F000C943F000C943F000C943F0034
:100050000C943F0011241FBECFE5D4E0DEBFCDBF1E
:1000600010E0A0E6B0E0E8E8F0E002C005900D92F4
:10007000A636B107D9F70E9441000C9442000C94B7
:0800800000000895F894FFCF81
:06008800751994234710D6
:00000001FF

 

I wonder why the hex output from Atmel Studio is so different from yours. I've created a new project (C) for the mega16, just like you, built the solution, and this is what I get (the source is exactly the same as yours above):

 

:100000000C942A000C943F000C943F000C943F0089
:100010000C943F000C943F000C943F000C943F0064
:100020000C943F000C943F000C943F000C943F0054
:100030000C943F000C943F000C943F000C943F0044
:100040000C943F000C943F000C943F000C943F0034
:100050000C943F0011241FBECFE5D4E0DEBFCDBF1E
:1000600010E0A0E6B0E0ECE8F0E002C005900D92F0
:10007000A036B107D9F70E9441000C9444000C94BB
:0C008000000080E090E00895F894FFCFAD
:00000001FF

 


I'm (a) using Linux (though for exactly the same version of compiler and the same inputs to the build that shouldn't actually matter), (b) using gcc 4.5.3, where in Studio you likely have a 4.8 or a 4.9, and (c) invoking the compiler/linker and objcopy on the command line using the absolute barest essentials to make the code build. Studio passes all kinds of other build options. In particular it uses -fdata-sections and -gc-sections, which means that any "unaccessed" data will be expunged from the build output. I imagine that's the main reason your output is so different.

 

If I build like this (to mimic Studio):

~$ avr-gcc -mmcu=atmega16 -fdata-sections -Wl,-gc-sections -Wl,-print-gc-sections -Os avr.c -o avr.elf
/usr/lib/gcc/avr/4.5.3/../../../avr/bin/ld: Removing unused section '.rodata.ver_string' in file '/tmp/ccFvh4ef.o'
$ avr-objcopy -O binary -j .text -j .data avr.elf avr.bin
$ hexdump -C avr.bin
00000000  0c 94 2a 00 0c 94 3f 00  0c 94 3f 00 0c 94 3f 00  |..*...?...?...?.|
00000010  0c 94 3f 00 0c 94 3f 00  0c 94 3f 00 0c 94 3f 00  |..?...?...?...?.|
*
00000050  0c 94 3f 00 11 24 1f be  cf e5 d4 e0 de bf cd bf  |..?..$..........|
00000060  10 e0 a0 e6 b0 e0 e8 e8  f0 e0 02 c0 05 90 0d 92  |................|
00000070  a0 36 b1 07 d9 f7 0e 94  41 00 0c 94 42 00 0c 94  |.6......A...B...|
00000080  00 00 08 95 f8 94 ff cf                           |........|
00000088

then I don't get date/time in the output either. As you can see the linker discarded ver_string.

 

The simple solution may be to build the version string in its own .c file and don't apply -fdata-sections to that particular file.


I appreciate the answer a lot, you were correct - unchecking "Garbage collect unused sections (-Wl,--gc-sections)" did the trick.


Well, yes and no. That is a bit of a sledgehammer-to-crack-a-nut solution. It's well worth building code with -fdata-sections and -ffunction-sections (especially as it gets more complex), and they only have an effect when -gc-sections is passed to the linker. As I say, it's really just that one string that you don't want garbage collected. I did think you could achieve that using -Wl,-u,ver_string (or whatever it is called), but when I tried that here it still got discarded. I'm not sure why, because in theory that would make for the "easy solution" to keep unreferenced data in the binary.
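
One workaround that avoids touching the build options is to make the string referenced. The sketch below is an assumption of mine rather than something tried in this thread: the helper name touch_ver_string is invented, and it uses the plain (non-PROGMEM) form of the string. The idea is that a read through a volatile pointer cannot be optimised away, so the compiled code keeps a reference to ver_string and the linker should no longer see its section as unused.

```c
#include <assert.h>
#include <string.h>

/* The version string from the posts above, with the "@(#)" marker. */
const char ver_string[] = "@(#)" __DATE__ " " __TIME__;

/* Hypothetical helper: read the first byte of ver_string through a volatile
   pointer. The compiler has to emit that read, so the object file keeps a
   reference to ver_string from code that is itself in use. */
static char touch_ver_string(void)
{
    const volatile char *p = ver_string;
    return *p;
}
```

Calling touch_ver_string() once near the start of main() costs a few bytes of code. Building with -Wl,-print-gc-sections, as shown earlier, is the way to confirm whether the section is still being removed.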


Cliff -

 

This has been a most enlightening discussion. Where does one find out about "sections" and how they are used in gcc? Guess that would be the gcc manual but I was hoping more for a tutorial with a practical discussion on use and abuse, thereof. Nothing found in K&R!

 

Jim

Jim Wagner Oregon Research Electronics, Consulting Div. Tangent, OR, USA http://www.orelectronics.net


Well the operation of the GCC compiler is documented in its manual:

 

https://gcc.gnu.org/onlinedocs/g...

 

That is the 4.8.2 manual, which is probably the one you are using. If not, replace 4.8.2 in that URL with your own version. The stuff about -fdata-sections is here:

 

https://gcc.gnu.org/onlinedocs/g...

 

the stuff about ffunction-sections is here:

 

https://gcc.gnu.org/onlinedocs/g...

 

Effectively what they do is change:

int variable = 12345;

void func(void) {
}

int main(void) {
    func();
}

so that instead of "variable" just being placed into ".data" with all the other variables and func() into ".text" with all the other functions it places variable into .data.variable and the function into .text.func (in essence).

 

When you come to link the code the linker, whose manual is here:

 

https://sourceware.org/binutils/...

 

documents an option gc-sections here:

 

https://sourceware.org/binutils/...

 

When it's used it keeps count of the number of accesses to each of the subsections created above (.data.variable and .text.func). It will see that .text.func has one reference (it's called from main) but .data.variable has nothing that references it. When it gets to the end of the link it sees if there is anything left with ref=0. If there are any such sections it "garbage collects" them (that's what gc in -gc-sections means). That means it discards them. In the above I used another linker option "-print-gc-sections" and that gets it to print a list of any sections it discards. If you ask for a .map file there is, equally, a section in that where the discarded sections are listed.
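
The effect can be illustrated with a toy model in C. This is not how ld is implemented: the struct, the names and the fixed-size arrays are invented for illustration, and it models the process as "keep whatever is reachable from main" rather than literal reference counting.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SECTIONS 8

/* Toy model of an input section: a name plus the indices of the sections
   it references, terminated by -1. */
struct section {
    const char *name;
    int refs[MAX_SECTIONS];
    int keep;                   /* set to 1 once reached from a root */
};

/* Mark section i and everything reachable from it as kept. */
static void mark(struct section *s, int i)
{
    int k;

    assert(i >= 0 && i < MAX_SECTIONS);
    if (s[i].keep)
        return;
    s[i].keep = 1;
    for (k = 0; k < MAX_SECTIONS && s[i].refs[k] >= 0; k++)
        mark(s, s[i].refs[k]);
}

/* The three sections from the example above: main calls func, and nothing
   refers to variable. Returns 1 if only .data.variable is left unmarked. */
static int variable_is_discarded(void)
{
    struct section s[MAX_SECTIONS] = {
        { ".text.main",     {  1, -1 }, 0 },
        { ".text.func",     { -1 },     0 },
        { ".data.variable", { -1 },     0 },
    };

    mark(s, 0);                 /* root: main is always kept */
    return s[1].keep && !s[2].keep;
}
```

In this model .text.func is kept because main refers to it, while .data.variable is never reached and would be the one reported by -print-gc-sections.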

 

For a while Atmel have been creating GCC projects in Studio that pass -fdata-sections and -ffunction-sections to the compiler and -gc-sections to the linker. So any un-accessed data or functions will be discarded.

 

As for learning about this - I know it's a bit sad but from time to time I read the compiler and linker manuals and specifically the list of command line options and see if anything new has been added.

 

In recent years (well not that recent) there was -whole-program which was kind of interesting and more recently there is -flto (link time optimization) which can perform optimizations once all the .o files are being linked together (previously optimization was only on a per file basis as each file was compiled). This would allow (for example) common code sequences to be spotted being used in multiple .c/.o and arrange to keep just one copy that they all called but there are many other ways it might optimize things. It's perhaps early days for that one though.

 

If you are as sad as me (and I think several other readers here I'm thinking of) then have a read through these manuals from time to time.

 

There are whole swathes of things I've never explored even to this day - such as exactly what can be achieved with linker scripts. Well, I guess you have to keep something to play with and learn about! :-)


Thanks for that, Cliff. As you say, "you have to keep something to play with and learn about! :-)" I have spent far too little time with that document!

 

Jim
