avr-objdump SFR register names?

Hi all,

 

I was wondering if it's possible to get SFR register names instead of their addresses in an avr-objdump .lss file?

For example (ATtiny85):

0:	11 24       	eor	r1, r1
2:	1f be       	out	SREG, r1	; 63

instead of:

0:	11 24       	eor	r1, r1
2:	1f be       	out	0x3f, r1	; 63

And while we're at it, the GCC register names as well (like zero_reg instead of r1)?

 

I've googled but couldn't find anything that worked for me (objdump v2.24).

It seems the -M reg-names=avr25 option (resp. -M reg-names-gcc) should work but it does not.

 

Thanks for your help!

This topic has a solution.

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Sat. Mar 12, 2016 - 02:04 PM

To achieve what you ask you'd have to redefine all the SFRs as linker-assigned global symbols. At present they are preprocessor macros. Those don't even get as far as the C compiler, let alone the linker.

 

Without blowing my own trumpet can I suggest you explore "avr-source" and study the .s files? 

Thanks for your answer clawson.

I'm not sure I fully understand it though. Admittedly, I feel a bit out of my league here. :-p

 

I understand you're mentioning the preprocessor macros, as defined for example in common.h:

/* Status Register */
#ifndef SREG
#  if __AVR_ARCH__ >= 100
#    define SREG _SFR_MEM8(0x3F)
#  else
#    define SREG _SFR_IO8(0x3F)
#  endif
#endif

However I'm not sure which .s file you are suggesting I look into…

 

Let me add some details to make sure I explained myself correctly.

It seems everything I need is in the ELF file already:

$ avr-objdump -x -d test.elf
...
SYMBOL TABLE:
00000000 l    d  .text    00000000 .text
00800060 l    d  .bss    00000000 .bss
00000000 l    d  .comment    00000000 .comment
00000000 l    df *ABS*    00000000 ccv3ijzT.ltrans0.o
0000003e l       *ABS*    00000000 __SP_H__
0000003d l       *ABS*    00000000 __SP_L__
0000003f l       *ABS*    00000000 __SREG__
00000000 l       *ABS*    00000000 __tmp_reg__
00000001 l       *ABS*    00000000 __zero_reg__
...
00000000 <main>:
   0:    11 24           eor    r1, r1
   2:    1f be           out    0x3f, r1    ; 63

Why can't I see __SREG__ instead of 0x3f?

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Sun. Mar 6, 2016 - 03:50 PM

Observe:

$ cat avr.c
#include <avr/io.h>


int main(void) {
	DDRB = 0xFF;
	while(1) {
		PORTB ^= 0xFF;
	}
}
$ ~/windows/avr8-gnu-toolchain-linux_x86_64/bin/avr-gcc -c -mmcu=atmega16 -gdwarf-2 -Os -save-temps avr.c
$ head avr.s -n 35
	.file	"avr.c"
__SP_H__ = 0x3e
__SP_L__ = 0x3d
__SREG__ = 0x3f
__tmp_reg__ = 0
__zero_reg__ = 1
	.text
.Ltext0:
	.cfi_sections	.debug_frame
	.section	.text.startup,"ax",@progbits
.global	main
	.type	main, @function
main:
.LFB0:
	.file 1 "avr.c"
	.loc 1 4 0
	.cfi_startproc
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
	.loc 1 5 0
	ldi r24,lo8(-1)
	out 0x17,r24
.L2:
	.loc 1 7 0 discriminator 1
	in r24,0x18
	com r24
	out 0x18,r24
	.loc 1 8 0 discriminator 1
	rjmp .L2
	.cfi_endproc
.LFE0:
	.size	main, .-main
	.text

Now at first sight that does not look too promising as a way of reading the ASM code of the AVR. You are still just seeing:

	ldi r24,lo8(-1)
	out 0x17,r24

for things like:

	DDRB = 0xFF;

and that 0x17 doesn't tell you much. But now take a look at my utility at:

 

https://spaces.atmel.com/gf/proj...

 

and run that on the code:

$ ./avr-source avr.s
file 1 = avr.c
file 2 = /home/uid23021/windows/avr8-gnu-toolchain-linux_x86_64/avr/include/stdint.h
$ cat avr.source.s 
	.file	"avr.c"
__SP_H__ = 0x3e
__SP_L__ = 0x3d
__SREG__ = 0x3f
__tmp_reg__ = 0
__zero_reg__ = 1
	.text
.Ltext0:

	.section	.text.startup,"ax",@progbits
.global	main
	.type	main, @function
main:
//==> int main(void) {
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
//==> 	DDRB = 0xFF;
	ldi r24,lo8(-1)
	out 0x17,r24
.L2:
//==> 		PORTB ^= 0xFF;
	in r24,0x18
	com r24
	out 0x18,r24
//==> 	}
	rjmp .L2
	.size	main, .-main
	.text

And now you have GCC assembler source code from the C compiler with source annotation added to it, so it's clear what a line like:

//==> 	DDRB = 0xFF;
	ldi r24,lo8(-1)
	out 0x17,r24

is doing. Obviously the 0x17 in this is "DDRB". Similarly in:

//==> 		PORTB ^= 0xFF;
	in r24,0x18
	com r24
	out 0x18,r24

it's clear the 0x18 must be PORTB etc.

 

Of course you should be seeing something similar in the .lss too:

 

$ avr-objdump -S avr.o

avr.o:     file format elf32-avr

Disassembly of section .text.startup:

00000000 <main>:
#include <avr/io.h>


int main(void) {
	DDRB = 0xFF;
   0:	8f ef       	ldi	r24, 0xFF	; 255
   2:	87 bb       	out	0x17, r24	; 23
	while(1) {
		PORTB ^= 0xFF;
   4:	88 b3       	in	r24, 0x18	; 24
   6:	80 95       	com	r24
   8:	88 bb       	out	0x18, r24	; 24
	}
   a:	00 c0       	rjmp	.+0      	; 0xc <__zero_reg__+0xb>

Thanks again. I did not know your avr-source tool: I thought you were referring to sources of AVR GNU tools… :P

I just grabbed a copy. If I understand you correctly, you're suggesting looking into how it works and adapting it to my ends?

 

Let me add some more details: I'm working from an ELF file (because I use LTO), so it's difficult/impossible to get back to the C source. It does not matter to me though: what I want to look into is the final assembly.

But I'd like to get all the sugar I can, and HW register address translation especially (I don't know them by heart, so that would save time). I just don't understand why I can't, as all the information seems to be there (in the symbol table) and was wondering if I had missed some objdump option that would have triggered it…

ɴᴇᴛɪᴢᴇᴎ

I just don't understand why I can't, as all the information seems to be there (in the symbol table)

>>Not<< all of it.  Yes, there is:

0000003e l       *ABS*    00000000 __SP_H__
0000003d l       *ABS*    00000000 __SP_L__
0000003f l       *ABS*    00000000 __SREG__
00000000 l       *ABS*    00000000 __tmp_reg__
00000001 l       *ABS*    00000000 __zero_reg__

... but that's it.  Other SFR have no symbols, because they are never defined.

 

Note also that the size of those symbols is zero, and they are not associated with any section (*ABS*).

 

You can probably write a custom tool or script to parse the device header file to extract all of the SFR addresses, generate an .elf (or other format) file containing appropriately named symbols with the extracted address values, then merge that .elf with your project .elf, then use avr-objdump on the merged .elf to achieve the results you seek.  However, I think you'll have some difficulty reconciling the difference between I/O-space SFR addresses and SRAM-mapped addresses, not to mention the different syntax used by the instructions which employ each addressing scheme.
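As a rough illustration of the first step only (extracting the SFR addresses from a header), here is a minimal Python sketch. It assumes the header uses the `_SFR_IO8(addr)` / `_SFR_MEM8(addr)` macro forms quoted earlier in this thread, and that the part is a classic (non-XMEGA) AVR where an I/O address maps to data address I/O + 0x20. The names `parse_sfrs`, `data_address` and the `SAMPLE_HEADER` text are made up for the example; the sample is not a real device header.

```python
import re

# Matches lines such as: #define PORTB _SFR_IO8(0x18)
SFR_DEFINE = re.compile(
    r"^\s*#\s*define\s+(\w+)\s+_SFR_(IO|MEM)(8|16)\s*\(\s*(0[xX][0-9A-Fa-f]+)\s*\)"
)

# Assumption: classic AVR, where I/O address 0 sits at data address 0x20.
IO_OFFSET = 0x20


def parse_sfrs(header_text):
    """Return {name: (space, address)} where space is 'io' or 'mem'."""
    sfrs = {}
    for line in header_text.splitlines():
        m = SFR_DEFINE.match(line)
        if m:
            name, space, _width, addr = m.groups()
            sfrs[name] = (space.lower(), int(addr, 16))
    return sfrs


def data_address(space, address):
    """Data-space address, i.e. the form lds/sts would use."""
    return address + IO_OFFSET if space == "io" else address


# Hypothetical sample text, not copied from any particular device header.
SAMPLE_HEADER = """
#define DDRB  _SFR_IO8(0x17)
#define PORTB _SFR_IO8(0x18)
#  define SREG _SFR_IO8(0x3F)
#define UDR0  _SFR_MEM8(0xC6)
"""

if __name__ == "__main__":
    for name, (space, addr) in parse_sfrs(SAMPLE_HEADER).items():
        print(f"{name:6s} {space:3s} 0x{addr:02x} data=0x{data_address(space, addr):02x}")
```

Keeping the address space next to each address is the point of the sketch: it is what lets a later step tell an `out 0x18` operand apart from an `sts 0x0038` operand that names the same register.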

 

I think it likely you'll have to re-implement what AVR Libc has done (as macros) in a different way.  Something like:

$ cat netizen.c
#include <stdint.h>

volatile uint8_t PORTB __attribute__ ((__section__ (".portb")));

int main(void) {

  uint8_t i = 0;

  while (1) {
    PORTB = i++;
  }

}

$ avr-gcc -g -Wall -O1 -Wl,--section-start=.portb=0x800025 netizen.c -o netizen.elf
$ avr-objdump -x -d netizen.elf

netizen.elf:     file format elf32-avr
netizen.elf
architecture: avr, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x00000000

Program Header:
    LOAD off    0x00000074 vaddr 0x00000000 paddr 0x00000000 align 2**1
         filesz 0x0000000a memsz 0x0000000a flags r-x
    LOAD off    0x0000007e vaddr 0x00800025 paddr 0x00800025 align 2**0
         filesz 0x00000001 memsz 0x00000001 flags rw-

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .portb        00000001  00800025  00800025  0000007e  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  1 .text         0000000a  00000000  00000000  00000074  2**1
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .comment      00000030  00000000  00000000  0000007f  2**0
                  CONTENTS, READONLY
  3 .debug_aranges 00000020  00000000  00000000  000000af  2**0
                  CONTENTS, READONLY, DEBUGGING
  4 .debug_info   000000ac  00000000  00000000  000000cf  2**0
                  CONTENTS, READONLY, DEBUGGING
  5 .debug_abbrev 0000007a  00000000  00000000  0000017b  2**0
                  CONTENTS, READONLY, DEBUGGING
  6 .debug_line   0000009f  00000000  00000000  000001f5  2**0
                  CONTENTS, READONLY, DEBUGGING
  7 .debug_frame  00000024  00000000  00000000  00000294  2**2
                  CONTENTS, READONLY, DEBUGGING
  8 .debug_str    0000008b  00000000  00000000  000002b8  2**0
                  CONTENTS, READONLY, DEBUGGING
  9 .debug_loc    0000002c  00000000  00000000  00000343  2**0
                  CONTENTS, READONLY, DEBUGGING
SYMBOL TABLE:
00800025 l    d  .portb	00000000 .portb
00000000 l    d  .text	00000000 .text
00000000 l    d  .comment	00000000 .comment
00000000 l    d  .debug_aranges	00000000 .debug_aranges
00000000 l    d  .debug_info	00000000 .debug_info
00000000 l    d  .debug_abbrev	00000000 .debug_abbrev
00000000 l    d  .debug_line	00000000 .debug_line
00000000 l    d  .debug_frame	00000000 .debug_frame
00000000 l    d  .debug_str	00000000 .debug_str
00000000 l    d  .debug_loc	00000000 .debug_loc
00000000 l    df *ABS*	00000000 netizen.c
0000003e l       *ABS*	00000000 __SP_H__
0000003d l       *ABS*	00000000 __SP_L__
0000003f l       *ABS*	00000000 __SREG__
00000000 l       *ABS*	00000000 __tmp_reg__
00000001 l       *ABS*	00000000 __zero_reg__
00000000 l    df *ABS*	00000000 
00000000 g       .text	00000000 __trampolines_start
0000000a g       .text	00000000 _etext
0000000a g       *ABS*	00000000 __data_load_end
00000000 g       .text	00000000 __trampolines_end
0000000a g       *ABS*	00000000 __data_load_start
00000000 g       .text	00000000 __dtors_end
00810000 g       .text	00000000 __eeprom_end
00800025 g     O .portb	00000001 PORTB
00000000 g       .text	00000000 __ctors_start
00000000 g     F .text	0000000a main
00000000 g       .text	00000000 __dtors_start
00000000 g       .text	00000000 __ctors_end
00800060 g       .text	00000000 _edata
00800060 g       .text	00000000 _end



This would be extremely tedious

 

And the payoff is small, and not without its drawbacks:

Disassembly of section .text:

00000000 <main>:
   0:	80 e0       	ldi	r24, 0x00	; 0
   2:	80 93 25 00 	sts	0x0025, r24
   6:	8f 5f       	subi	r24, 0xFF	; 255
   8:	fc cf       	rjmp	.-8      	; 0x2 <__zero_reg__+0x1>

Despite the existence of a proper symbol for PORTB, the symbol name is not emitted by avr-objdump.  What's more, the compiler did not transform the 2-cycle sts instruction into a 1-cycle out instruction as it normally does.
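For reference, the `sts 0x0025` above uses a data-space address. Assuming a classic (non-XMEGA) AVR, where `in`/`out` reach I/O addresses 0x00 to 0x3F and those sit at data addresses 0x20 to 0x5F, the relationship can be sketched in a few lines of Python. The helper names `io_address` and `describe` are invented for this example.

```python
IO_OFFSET = 0x20  # assumption: classic AVR, I/O address 0 is data address 0x20
IO_MAX = 0x3F     # highest I/O address reachable by in/out


def io_address(data_addr):
    """Return the I/O-space address for a data-space address, or None
    when the location is outside the range reachable by in/out."""
    io = data_addr - IO_OFFSET
    return io if 0 <= io <= IO_MAX else None


def describe(data_addr):
    """Summarise which addressing forms can reach a data-space address."""
    io = io_address(data_addr)
    if io is None:
        return f"0x{data_addr:04x}: lds/sts only"
    return f"0x{data_addr:04x}: lds/sts, or in/out at I/O address 0x{io:02x}"


print(describe(0x0025))
print(describe(0x00C6))
```

So the same register can show up under two different numbers in a listing, depending on which instruction the compiler picked.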

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 

joeymorin wrote:
>>Not<< all of it. Yes, there is:

0000003e l *ABS* 00000000 __SP_H__
0000003d l *ABS* 00000000 __SP_L__
0000003f l *ABS* 00000000 __SREG__
00000000 l *ABS* 00000000 __tmp_reg__
00000001 l *ABS* 00000000 __zero_reg__

... but that's it. Other SFR have no symbols, because they are never defined.

You're right: I was using a minimalistic blinkish app to work on this, so I felt like every register I used was there. But it wasn't the case indeed: there should have been at least PORTB. Thanks!

 

joeymorin wrote:
Note also that the size of those symbols is zero, and they are not associated with any section (*ABS*).

I did notice they had zero size, but did not know that was a problem. Are you sure it indeed is? Because __zero_reg__ also has, yet it is later referenced:

00000000 <main>:
   ...
   8:	fc cf       	rjmp	.-8      	; 0x2 <__zero_reg__+0x1>

… as a jump address!  :]
I thought *ABS* meant absolute, as in "not relative to one section" or "valid everywhere"?

 

joeymorin wrote:
I think it likely you'll have to re-implement what AVR Libc has done (as macros) in a different way.

Alright, I get it now. So that's what clawson was mentioning at the beginning, right?  :p

That'd be tedious indeed. I was just looking for an easy switch to turn on, as seems to be the case with other architectures according to the man page (-M reg-names=...). I never meant to implement it myself (nor implement a workaround), I can do without! It's a pity we AVR users do not benefit from it too, but if someone was to implement it, the above posts would make clear it'd rather not be me!

 

Yet, now I'm getting curious… ^^

I see how your tactic brought PORTB all the way to the symbol table, but it wasn't used as planned on the objdump side. And, as you mentioned, it seemed to have messed with the compiler's ability to use it properly.

Do you have any idea how this feature is implemented for other architectures?

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Sun. Mar 6, 2016 - 08:49 PM

as seems to be the case with other architectures according to the man page (-M reg-names=...)

That option applies to machine (cpu) registers (r0-r31), not peripheral registers (PORTB, UDR0, etc).

 

Do you have any idea how this feature is implemented for other architectures?

Nope.  But others around here surely do.  Stick around.

netizen wrote:
So that's what clawson was mentioning at the beginning, right?
Yup Joey just did pretty much the exact thing I was thinking.

 

but I'm still a bit confused as to why:

DDRB = 0xFF;
   0:	8f ef       	ldi	r24, 0xFF	; 255
   2:	87 bb       	out	0x17, r24	; 23

in the LSS or:

//==> 	DDRB = 0xFF;
	ldi r24,lo8(-1)
	out 0x17,r24

in my own utility is not sufficient for you. What's the real difference between that and:

   0:	8f ef       	ldi	r24, 0xFF	; 255
   2:	87 bb       	out	DDRB, r24	; 23

? In all cases you can see that 0xFF is being written to DDRB=0x17 can't you?

Both would satisfy me in your use case, actually.

Although (obviously) not as much as explicitly mentioning the actual register worked on: I'm not looking for C code, I'm just looking for register names instead of their addresses. If you noticed, the assembly I presented as an example belongs to __ctors_end, for which I believe there is no C source code.

This aside, the goal seems rather obvious to me: if I'm setting SREG to 0, I'd like to see that in the disassembled output — whether or not I have some C source code. What particular address is SREG on that particular MCU is just a distraction in this regard (unneeded layer of indirection). If you feel this reasoning is wrong, please elaborate.

 

This being said, avr-objdump -S does output some C source code, but avr-source fails with: Input has no .file debug statements. Not sure why. There are a few intermediary .o files in the middle that perhaps get in the way; or are my linker options wrong?

 

ɴᴇᴛɪᴢᴇᴎ

If you noticed, the assembly I presented as an example belongs to __ctors_end, for which I believe there is no C source code.

Sort of:

https://www.avrfreaks.net/comment/1791296#comment-1791296

 

In short that symbol is part of the C runtime.  It is in fact for C++ code, but since you aren't working with C++ it seems out of place.  In reality, there are a number of symbols included in the .elf file which come from the C runtime but which are 'unused' by your code.  Since they are 'unused', they are all symbols of zero length.  Since they are all zero length, and since they also describe sections of code which are normally contiguous, they in this case all have the same value.  When avr-objdump reads a file with multiple symbols all sharing the same value, it only emits the last one found in the .elf.  In this case, it is __ctors_end.
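To check which symbols share a value in your own .elf, you can group the rows of the symbol table by address. Below is a minimal Python sketch, assuming the row layout shown in the `avr-objdump -x` dumps earlier in this thread (address, flags, section, size, name). The function name `symbols_by_address` is made up, and the sample rows are copied from the netizen.elf dump above. It only lists the collisions; it does not try to reproduce which name objdump picks.

```python
import re
from collections import defaultdict

# One symbol-table row: address, optional flag letters, section, size, name.
# Whitespace is matched loosely because pasted dumps often lose exact spacing.
SYMBOL_ROW = re.compile(
    r"^([0-9a-f]{8})\s+(?:\S+\s+)*?(\S+)\s+([0-9a-f]{8})\s(.*)$"
)


def symbols_by_address(table_text, section):
    """Group the named symbols of one section by their value."""
    groups = defaultdict(list)
    for line in table_text.splitlines():
        m = SYMBOL_ROW.match(line)
        if m and m.group(2) == section and m.group(4):
            groups[int(m.group(1), 16)].append(m.group(4))
    return dict(groups)


SAMPLE = """\
00000000 g       .text\t00000000 __trampolines_start
0000000a g       .text\t00000000 _etext
00000000 g       .text\t00000000 __dtors_end
00000000 g       .text\t00000000 __ctors_start
00000000 g     F .text\t0000000a main
00000000 g       .text\t00000000 __ctors_end
"""

if __name__ == "__main__":
    for addr, names in sorted(symbols_by_address(SAMPLE, ".text").items()):
        if len(names) > 1:
            print(f"0x{addr:08x}: {', '.join(names)}")
```

With the sample rows, address 0 carries five names, which is why a disassembly label at that address can look unrelated to your own code.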

 

This aside, the goal seems rather obvious to me: if I'm setting SREG to 0, I'd like to see that in the disassembled output — whether or not I have some C source code. What particular address is SREG on that particular MCU is just a distraction in this regard (unneeded layer of indirection). If you feel this reasoning is wrong, please elaborate.

It doesn't do that for normal data object symbols, so it is unlikely you'll coax it into doing so for SFR object symbols, even if you manage to define them all.

Last Edited: Mon. Mar 7, 2016 - 04:25 AM

Thanks for the link. The subject of __init() and the .initX sections really is interesting. Too bad the discussion stopped as soon as the OP's problem was solved…

ɴᴇᴛɪᴢᴇᴎ

None of this stuff should be a mystery as avr-gcc is an open source compiler.

 

Some of the stuff about init and the CRT are covered in the manual here:

 

http://www.nongnu.org/avr-libc/u...

 

But more to the point, the source of the CRT is openly available here:

 

http://svn.savannah.nongnu.org/v...

netizen wrote:
but avr-source fails with: Input has no .file debug statements. Not sure why.

That message means exactly what it says. My utility depends on you building with -g which will embed debug info into the Asm which links it to specific lines in the source. That's why my example above has "-gdwarf-2".

clawson wrote:
But more to the point, the source of the CRT is openly available here:

http://svn.savannah.nongnu.org/v...

Nice. I'd seen a couple crt1.S files (from micronucleus and FemtoOs) before, but never this one.

clawson wrote:
netizen wrote:
but avr-source fails with: Input has no .file debug statements. Not sure why.

That message means exactly what it says. My utility depends on you building with -g which will embed debug info into the Asm which links it to specific lines in the source. That's why my example above has "-gdwarf-2".

I saw that: I'm compiling with -gdwarf-2 -save-temps. Doesn't objdump -S also use the same information?

 

EDIT: Don't worry, I'll find out eventually what I'm doing wrong. ^^

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Mon. Mar 7, 2016 - 03:18 PM

Ah it could be the version of the compiler you are using. I don't think it'll work with much less than 4.8

 

As for objdump -S versus -save-temps and the .s file: you are right that ultimately they are the same AVR opcodes. However the .s is different from the .lss in two principal ways:

 

1) the .s is the assembler source generated by the C compiler. It retains some knowledge of what the C compiler was "thinking" when it generated the code (more so if you include -fverbose-asm). The .lss on the other hand is the built (and usually linked) code put back through a disassembler - it can lose some of the intention of the C compiler by simply being a disassembly of the binary

 

2) the .s file, even when annotated by avr-source, is immediately usable as a source file to avr-as. You cannot do that with "avr-objdump -S" output directly. You would need to post-process it to remove the address info and the hex of the opcodes.
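A minimal Python sketch of that post-processing, assuming the tab-separated column layout seen in the avr-objdump listings earlier in this thread. The name `strip_listing` is invented for the example, and the result is not guaranteed to assemble: interleaved source lines are simply dropped, and operands such as the `rjmp .+0` in the avr.o listing above no longer name the `.L2` label that the .s file has.

```python
import re

# "   2:\t87 bb       \tout\t0x17, r24\t; 23" -> keep only the instruction part
INSN = re.compile(r"^\s*[0-9a-f]+:\t(?:[0-9a-f]{2} )+\s*\t(.*)$")
# "00000000 <main>:" -> keep only the label name
LABEL = re.compile(r"^[0-9a-f]+ <([^>]+)>:$")


def strip_listing(listing):
    """Drop the address and opcode columns; keep labels and instructions."""
    out = []
    for line in listing.splitlines():
        m = INSN.match(line)
        if m:
            out.append("\t" + m.group(1))
            continue
        m = LABEL.match(line)
        if m:
            out.append(m.group(1) + ":")
        # Anything else (headers, interleaved C source) is skipped.
    return "\n".join(out)


SAMPLE = (
    "00000000 <main>:\n"
    "\tDDRB = 0xFF;\n"
    "   0:\t8f ef       \tldi\tr24, 0xFF\t; 255\n"
    "   2:\t87 bb       \tout\t0x17, r24\t; 23\n"
)

if __name__ == "__main__":
    print(strip_listing(SAMPLE))
```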

I might have a bit more interest if I understood the purpose of this exercise.  Examining pertinent aspects of build results with .LSS or similar is a "usual" method, deciphering a fragment of interest.

 

It appears to me that OP is on some type of disassembly mission.  Reverse engineering?  But then there is talk of "compiling with ...".  So why not use the human-readable annotated build results provided by the toolchain, rather than taking the end-result Hobbit creatures and attempting to go backwards?

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Lee,

 

I suspect he was intrigued by the mysteries of the CRT which comes pre-built (and where objdump -S or similar would be pointless as links to source would point to the machine the toolchain was built on)

netizen wrote:
The subject of __init() and the .initX sections really is interesting. Too bad the discussion stopped as soon as the OP's problem was solved…

Let me see if I can find a couple recent threads that might be of interest about the init sections.  One had to do with whether #define or const was "better" -- and we found that even though the compiler/linker optimized references in the code to the const, a null init value copy was included.

https://www.avrfreaks.net/forum/c...

 

I forget the aim of the other thread, but there were examples posted of making your own init section "add ons" and when they were invoked.

 

[I can't find that one; stupid search won't look for "init1"...]

Found the one I was thinking of:  https://www.avrfreaks.net/comment...

 

 

Last Edited: Mon. Mar 7, 2016 - 04:03 PM
This reply has been marked as the solution. 

I suspect that the simplest way would be to post-process the avr-objdump output.

Whatever method is chosen will require a mechanism for going from an address in a namespace to a name.

This method will also require going from an instruction name to an operand position and a namespace.

Given a usable database for the symbols, I'm pretty sure I could do it in Python.
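A minimal Python sketch of that idea, under stated assumptions: the `NAMES` database is hand-written from the addresses that appear in this thread (a real one would have to be generated per device), the data-space values assume a classic AVR with the I/O space at offset 0x20, and the `OPERANDS` table only covers a handful of instructions. The names `NAMES`, `OPERANDS` and `annotate` are invented for the example, and the line format is the tab-separated one from the avr-objdump listings above.

```python
import re

# namespace -> {address: name}; hand-written sample, not a real device database.
NAMES = {
    "io": {0x17: "DDRB", 0x18: "PORTB", 0x3F: "SREG"},
    "data": {0x37: "DDRB", 0x38: "PORTB", 0x5F: "SREG"},
}

# instruction -> (index of the address operand, namespace)
OPERANDS = {
    "in": (1, "io"), "out": (0, "io"),
    "sbi": (0, "io"), "cbi": (0, "io"), "sbis": (0, "io"), "sbic": (0, "io"),
    "lds": (1, "data"), "sts": (0, "data"),
}

# address and opcode bytes, mnemonic, operands, optional trailing "; comment"
LINE = re.compile(r"^(\s*[0-9a-f]+:\t[0-9a-f ]+\t)(\w+)\t([^;\t]*)(.*)$")


def annotate(line):
    """Replace a known SFR address operand with its name; else return line."""
    m = LINE.match(line)
    if not m or m.group(2) not in OPERANDS:
        return line
    prefix, mnemonic, operands, rest = m.groups()
    index, space = OPERANDS[mnemonic]
    parts = [p.strip() for p in operands.split(",")]
    try:
        name = NAMES[space].get(int(parts[index], 16))
    except (IndexError, ValueError):
        name = None
    if name is None:
        return line
    parts[index] = name
    return f"{prefix}{mnemonic}\t{', '.join(parts)}{rest}"


if __name__ == "__main__":
    for sample in (
        "   2:\t1f be       \tout\t0x3f, r1\t; 63",
        "   4:\t88 b3       \tin\tr24, 0x18\t; 24",
        "   6:\t80 95       \tcom\tr24",
    ):
        print(annotate(sample))
```

The two lookup tables mirror the two mechanisms named above: `OPERANDS` goes from an instruction name to an operand position and a namespace, and `NAMES` goes from an address in that namespace to a name. Lines that do not match are passed through unchanged, so the script can be run over a whole listing.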
 

Moderation in all things. -- ancient proverb

Thanks for the links, theusch (FWIW joeymorin linked to the second one in #11).

 

Originally, I was checking that my C code was properly optimized, because I noticed some of my libs' functions were not correctly inlined — and that annoyed me.

For example, if a lib function was publicly available (declared in .h) it would never get inlined, even when the only caller ended up being another local function (i.e. within the same lib). So, if I wanted to make low-granularity functions publicly available, my binary would grow in size (intermediary rjmp, IIRC), even when those functions were only "privately" used.

In other words, what I considered as good design actually resulted in junk — frustrating. I wanted them to be both publicly available, and inlined as if they'd been static functions when never (directly) called from outside. Hope that makes sense…

For the record, my solution was to use LTO to inline such functions at link time. I started disassembling the final ELF file to diagnose my problem and later make sure the suckers were indeed inlined as I wanted.

 

Looking into this, I noticed there was other stuff in there that I had never really looked into: like the bss code for example.

I noticed there was unnecessarily wasted space in my vector table: I know which ones I'm using, I don't need the other ones to link to __bad_interrupt, especially when all the latter does is jump back to __vectors (reset)! I mean, I understand how that could be used to implement a default ISR e.g. (your second link shows a use-case for this), so a hook of some sort definitely is a good idea. But when this hook's not used, -Os should at least make the unused vectors point straight to reset. Or am I missing something?

I also noticed there were _exit and __stop_program instructions that, for the life of me, I couldn't get rid of using __attributes__ only. And that main is actually called (as opposed to simply jumped to); thus, on top of it, my stack is unnecessarily polluted with a return address that will never be used.

At some point I stopped trying to clean up that mess (relative to my use-case) with attributes, and started looking into how it got there in the first place and how to prevent it. That's pretty much where I am now.

 

EDIT: Indeed, skeeve, I was thinking of a solution like this (although I thought it existed already and that I was just unaware of the right avr-objdump switch to turn on). That's why I was a bit puzzled at first when clawson mentioned carrying these symbols all along the compilation process.

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Mon. Mar 7, 2016 - 06:11 PM

netizen wrote:
For example, if a lib function was publicly available (declared in .h) it would never get inlined,...

I cannot see anything in that paragraph related to disassembly of any sort.  If you want stuff "always inline", then IIRC that is another topic.

 

netizen wrote:
Looking into this, I noticed there was other stuff in there that I had never really looked into: like the bss code for example.

The links at least touch on this.  If >>you<< choose to assign initial values...

 

And is that related to:

netizen wrote:
I noticed there was unnecessarily wasted space in my vector table:...

 

I don't see the relationship.  And a full vector table isn't a burden except in the tightest of apps.  Indeed, if you are poking about for the absolute last word of flash, then you need to understand all the "hooks".  Personally, I'd rather have some kind of hook/trap in a production app, for robust operation.  But sure, if you care to, you can place a small constant table or whatever in vector area.

 

netizen wrote:
I also noticed there were _exit and __stop_program instructions that, for the life of me, I couldn't get rid of using __attributes__ only. And that main is actually called (as opposed to simply jumped to); thus, on top of it, my stack is unnecessarily polluted with a return address that will never be used.

Aaah, >>now<< we get into a good one.  >>You<< have chosen to use the infinite-value toolchain. [where are those devil's horns?]

 

 

 

 

theusch wrote:
I cannot see anything in that paragraph related to disassembly of any sort.  If you want stuff "always inline", then IIRC that is another topic.

Yet it's there: how can you tell what's got inlined and what has not unless you disassemble the final binary?

If you read my last post again, you'll see that I do >>not<< want stuff "always inline". If something isn't clear enough, please tell me what it is and I'll provide more details.

 

theusch wrote:
netizen wrote:
Looking into this, I noticed there was other stuff in there that I had never really looked into: like the bss code for example.

The links at least touch on this.  If >>you<< choose to assign initial values...

The first link actually shows the contrary: you'll get bss code even when your only const (rightly) gets optimized away.

 

theusch wrote:
netizen wrote:
I noticed there was unnecessarily wasted space in my vector table:...

I don't see the relationship.  And a full vector table isn't a burden except in the tightest of apps.

The relationship to what? Disassembly? If so, I suppose you know of a way to look at your actual vector table without disassembly that I do not.

Indeed, anyone complaining about wasted space in a vector table probably is looking to claim as many bytes as possible… Your reasoning skills are quite impressive. ;-]

 

theusch wrote:
netizen wrote:
I also noticed there were _exit and __stop_program instructions that, for the life of me, I couldn't get rid of using __attributes__ only. And that main is actually called (as opposed to simply jumped to); thus, on top of it, my stack is unnecessarily polluted with a return address that will never be used.

Aaah, >>now<< we get into a good one.  >>You<< have chosen to use the infinite-value toolchain. [where are those devil's horns?]

I don't know what the "infinite-value toolchain" is. But I'm looking forward to hearing why systematically wasting stack space (however small) is a good thing. :-)

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Mon. Mar 7, 2016 - 07:38 PM

The first link actually shows the contrary: you'll get bss code even when your only const (rightly) gets optimized away.

Known issue.  Much ado, since I can hardly think of a practical app which would have >>no<< static variables in .bss or .data.

 

Indeed, anyone complaining about wasted space in a vector table probably is looking to claim as many bytes as possible…

If you're after the pittance of bytes 'wasted' by a vector table, it's time to consider moving to the next larger member of the family, or move to pure ASM.

 

I don't know what the "infinite-value toolchain" is.

Lee's cheeky way of referring to AVR GCC.  He's typically a CV (CodeVision) man.  Although his expertise on AVR GCC is increasing! ;-)

Last Edited: Mon. Mar 7, 2016 - 07:37 PM

joeymorin wrote:
Lee's cheeky way of referring to AVR GCC.

 

Value = Utility/Cost, right?  If the chosen toolchain has 0 Cost, but some Utility, then it is infinite Value, right?  Now if Utility is also 0 ...

 

Every toolchain has its own "pattern" of code generation.  Given the Harvard nature of the AVR8, along with EEPROM address space, toolchains might approach things differently.

 

There is a lot of bantering here about a few words one way or the other.  GCC makes a smaller "null program" than CodeVision.  But I claim that the standard CV prologue does stuff like turn off the watchdog that GCC >>should<< do.

 

Surely you can find a per-cent or two to quibble about with any of the mature AVR8 toolchains.  Things that might be truly important to a particular app can usually be "pencil whipped" using hooks and alternative techniques.  Yes, size is important in packed apps--but rarely the last 10 words.  And if deemed that important, then the dev time to examine each painful item will be deemed worth it.  Usually the last 10 words don't matter.

 

IAR probably has the best AVR compiler all around.  So spend a few grand and get your 10 words.

 

 

joeymorin wrote:

The first link actually shows the contrary: you'll get bss code even when your only const (rightly) gets optimized away.

Known issue.  Much ado, since I can hardly think of a practical app which would have >>no<< static variables in .bss or .data.

Indeed, but that's not the point: however unlikely, the toolchain should ultimately behave properly in those cases. Don't get me wrong: I am not complaining that it does not, I'm just pointing out that it should, and I'm ready to bet someday it will. :-)

 

joeymorin wrote:
If you're after the pittance of bytes 'wasted' by a vector table, it's time to consider moving to the next larger member of the family, or move to pure ASM.

That's a diversion: I'm not willing to move to another chip, nor to ASM, I'm willing to get what's possible to get out of C.

 

joeymorin wrote:

I don't know what the "infinite-value toolchain" is.

Lee's cheeky way of referring to AVR GCC.  He's typically a CV (CodeVision) man.  Although his expertise on AVR GCC is increasing! ;-)

Alright. :-)

ɴᴇᴛɪᴢᴇᴎ


Just for fun, CV null program generation:

 

                 	.CSEG
                 	.ORG 0x00
                 
                 ;START OF CODE MARKER
                 __START_OF_CODE:
                 
                 ;INTERRUPT VECTORS
000000 940c 0034 	JMP  __RESET
000002 940c 0000 	JMP  0x00
000004 940c 0000 	JMP  0x00
000006 940c 0000 	JMP  0x00
000008 940c 0000 	JMP  0x00
00000a 940c 0000 	JMP  0x00
00000c 940c 0000 	JMP  0x00
00000e 940c 0000 	JMP  0x00
000010 940c 0000 	JMP  0x00
000012 940c 0000 	JMP  0x00
000014 940c 0000 	JMP  0x00
000016 940c 0000 	JMP  0x00
000018 940c 0000 	JMP  0x00
00001a 940c 0000 	JMP  0x00
00001c 940c 0000 	JMP  0x00
00001e 940c 0000 	JMP  0x00
000020 940c 0000 	JMP  0x00
000022 940c 0000 	JMP  0x00
000024 940c 0000 	JMP  0x00
000026 940c 0000 	JMP  0x00
000028 940c 0000 	JMP  0x00
00002a 940c 0000 	JMP  0x00
00002c 940c 0000 	JMP  0x00
00002e 940c 0000 	JMP  0x00
000030 940c 0000 	JMP  0x00
000032 940c 0000 	JMP  0x00
                 
                 __RESET:
000034 94f8      	CLI
000035 27ee      	CLR  R30
000036 bbef      	OUT  EECR,R30
                 
                 ;INTERRUPT VECTORS ARE PLACED
                 ;AT THE START OF FLASH
000037 e0f1      	LDI  R31,1
000038 bff5      	OUT  MCUCR,R31
000039 bfe5      	OUT  MCUCR,R30
                 
                 ;CLEAR R2-R14
00003a e08d      	LDI  R24,(14-2)+1
00003b e0a2      	LDI  R26,2
00003c 27bb      	CLR  R27
                 __CLEAR_REG:
00003d 93ed      	ST   X+,R30
00003e 958a      	DEC  R24
00003f f7e9      	BRNE __CLEAR_REG
                 
                 ;CLEAR SRAM
000040 e080      	LDI  R24,LOW(__CLEAR_SRAM_SIZE)
000041 e098      	LDI  R25,HIGH(__CLEAR_SRAM_SIZE)
000042 e0a0      	LDI  R26,LOW(__SRAM_START)
000043 e0b1      	LDI  R27,HIGH(__SRAM_START)
                 __CLEAR_SRAM:
000044 93ed      	ST   X+,R30
000045 9701      	SBIW R24,1
000046 f7e9      	BRNE __CLEAR_SRAM
                 
                 ;GPIOR0 INITIALIZATION
000047 e0e0      	LDI  R30,__GPIOR0_INIT
000048 bbee      	OUT  GPIOR0,R30
                 
                 ;HARDWARE STACK POINTER INITIALIZATION
000049 efef      	LDI  R30,LOW(__SRAM_END-__HEAP_SIZE)
00004a bfed      	OUT  SPL,R30
00004b e0e8      	LDI  R30,HIGH(__SRAM_END-__HEAP_SIZE)
00004c bfee      	OUT  SPH,R30
                 
                 ;DATA STACK POINTER INITIALIZATION
00004d e0c0      	LDI  R28,LOW(__SRAM_START+__DSTACK_SIZE)
00004e e0d3      	LDI  R29,HIGH(__SRAM_START+__DSTACK_SIZE)
                 
00004f 940c 0051 	JMP  _main

JMP to main(), not CALL.  (Built for a '328.)  Should it be an RJMP?  Probably, to save a word for you, but this is the standard prologue for a '328, so JMP reaches everywhere.

 

Default ISR action is JMP to 0. 

 

Now, you probably won't like the data-clearing loops on the other hand...

 

As with GCC (unless you trick it), no initial data copy if no initial data.

 

 

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


Indeed, but that's not the point: however unlikely, the toolchain should ultimately behave properly in those cases. Don't get me wrong: I am not complaining that it does not, I'm just pointing out that it should, and I'm ready to bet someday it will. :-)

It is the point.  Because the toolchain is behaving properly.  The only metric that matters for evaluating 'properly' is 'Does the generated code correctly express the source code?'  Arguments about whether or not 20 cycles and 10 words of code should be wasted in the one-time initialisation routine are quite simply not worth anyone's time.  If you feel differently, GCC and AVR GCC are open source projects open to contribution by anyone.  Someday could be tomorrow if you do the work.

 

That's a diversion: I'm not willing to move to another chip, nor to ASM, I'm willing to get what's possible to get out of C.

Again, it's up to you to provide a replacement mechanism for the C runtime which is responsible for the vector table, init code, call to main, exit, etc.  Remember the pedigree of GCC.  It is built primarily for 'big iron' (i.e. x86) and more recently ARM.  AVR is in many respects a sideshow.  And having a call/return/exit structure is not bad.  It is actually good.  It means that if for any reason you decide to allow main to return, the AVR will not run amok.  It will instead go into an infinite empty loop with interrupts disabled.  That is correct behaviour.  Note also that the toolchain has provisions for post-main code in the form of the .finiN sections.

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 


theusch wrote:
Value = Utility/Cost, right? If the chosen toolchain has 0 Cost, but some Utility, then it is infinite Value, right?

I suppose GCC wouldn't even exist if its developers were individually abiding by your economics logic. :-)

Let's stick to it: what's the value of your answering me and freely sharing your expertise, when the cost of you doing so is non-null and the overall utility is speculative at best? I guess we're not that different after all. ^^

We're talking IT here: please leave economics considerations to an economics forum.

 

theusch wrote:
IAR probably has the best AVR compiler all around.  So spend a few grand and get your 10 words.

I won't: I don't ultimately care which compiler is the best. I'd keep using FOSS ones even if they're not. There would be no decent FOSS without such behavior. Pointing out how a FOSS project could do a better job is not ranting, it's increasing the likelihood that it'll get better even sooner. :-)

ɴᴇᴛɪᴢᴇᴎ


joeymorin wrote:
It is the point.  Because the toolchain is behaving properly.  The only metric that matters for evaluating 'properly' is 'Does the generated code correctly express the source code?'

So, by your logic, if my home-made compiler needed 32KB to blink a LED, it would "behave properly". Unless of course I need to blink two LEDs, in which case I just need to upgrade my chip to a more potent one in the same family. Come on… :-)

 

joeymorin wrote:
Again, it's up to you to provide a replacement mechanism for the C runtime which is responsible for the vector table, init code, call to main, exit, etc.

That's exactly what I'm trying to do. :-)

 

Overall, and this is a general announcement: please don't bother arguing that I shouldn't complain because it's free, or that I should be using another chip, or ASM, or whatever. These >>are<< irrelevant distractions.

If you're not ready to consider what's the best output our beloved toolchain should throw out, and debate it, this thread is definitely not for you.

FWIW, I *have* contributed to FOSS projects like the Linux kernel in the past, I might do it again in the future, or someone else might do it. That definitely is not the point here.

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Mon. Mar 7, 2016 - 08:31 PM

joeymorin wrote:
And having a call/return/exit structure is not bad. It is actually good. It means that if for any reason you decide to allow main to return, the AVR will not run amok. It will instead go into an infinite empty loop with interrupts disabled.

Correct me if I'm wrong, but having a call structure is (perhaps) only useful if you're using destructors. If you're not, then returning from main definitely is a bug. And bitching about the MCU running amok when your code is buggy, >>that<< is ranting.

Don't get me wrong: I do understand the value of protecting yourself (as far as possible) from your own bugs. But when we use -Os we're explicitly asking for the smallest code the toolchain can produce; asking for all the cushions we could get is not part of the deal.

ɴᴇᴛɪᴢᴇᴎ


theusch wrote:
Just for fun, CV null program generation:

JMP to main(), not CALL.

I guess joeymorin should explain to them this is incorrect behavior. ;-)

And yes: it should be an RJMP unless a JMP actually is needed. There's nothing wrong in mentioning that.

 

PS: sorry for double/triple/quadruple posting.

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Mon. Mar 7, 2016 - 09:07 PM

I feel that I can get "close enough" with the mainstream C compilers.  And, as mentioned, there are generally enough hooks and options to address particular situations.

 

A number of points have been raised.  Indeed, the GCC gurus will enjoy working at the "puzzle" if presented in an interesting manner.  I'd suggest you organize your issues in e.g. a numbered or lettered list.  Then each can be responded to in a separate and organized fashion.  For example, there are three ways to address "inline" according to the docs.  https://gcc.gnu.org/onlinedocs/g... I'm no guru, but "whole program" and "relax" might help.  Same with your other points.



[wait till you try to do a skinny ISR with the infinite-value toolchain... ;) ]

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Last Edited: Mon. Mar 7, 2016 - 09:07 PM

There's no misunderstanding of inline: the inlining I needed is only possible at link time, that is all. And GCC provides it if you use -flto (in which case -fwhole-program becomes useless); -fwhole-program by itself isn't able to do the necessary optimizations in my toolchain version (4.8.1), even with -mrelax -Wl,--gc-sections.

But then it seems you run into other problems, especially when you're trying to use __init() or .initX sections.



wait till you try to do a skinny ISR with the infinite-value toolchain... ;)

I *am* using a naked ISR and it compiles just fine. :-)

Or did I get you wrong?

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Mon. Mar 7, 2016 - 09:21 PM

netizen wrote:
I *am* using a naked ISR and it compiles just fine. :-) Or did I get you wrong?

Going naked introduces its own problems; you might be left in the cold [if you mess up save/restore etc.].

 

CV's "smart" ISR does a great job of keeping save/restore to the minimum needed.  For example, GCC will never have a non-naked one-instruction ISR:

 

                 ;//
                 ;// **************************************************************************
                 ;// *
                 ;// *		T I M E R 2 _ C O M P _ I S R
                 ;// *
                 ;// **************************************************************************
                 ;//
                 ;// Timer 2 output compare interrupt service routine
                 ;//
                 ;// Main timer "tick", occurs every 10.0ms
                 ;//
                 ;interrupt [TIM2_COMPA] void timer2_comp_isr(void)
                 ;0000 0411 {
                 _timer2_comp_isr:
                 ;0000 0412 
                 ;0000 0413 // Flag to main loop that a tick has occured
                 ;0000 0414 	tick = 1;
00013c 9af0      	SBI  0x1E,0
                 ;0000 0415 
                 ;0000 0416 
                 ;0000 0417 }
00013d 9518      	RETI
                 ;

In short, I cannot remember fussing with a critical ISR in CV for 10 years -- don't need to.

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


I don't know about CV's smart ISR, but that's pretty close to my code:

volatile register u08 ticks asm("r3");
ISR(TIM0_COMPA_vect, ISR_NAKED) {
	ticks++;
	reti();
}

00000098 <__vector_10>:
  98:	33 94       	inc	r3
  9a:	18 95       	reti

Of course I'm not just setting a bit, I'm incrementing: so an IO register <0x1F isn't enough to get atomicity.

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Mon. Mar 7, 2016 - 10:05 PM

netizen wrote:

I don't know about CV's smart ISR, but that's pretty close to my code:

volatile register u08 ticks asm("r3");
ISR(TIM0_COMPA_vect, ISR_NAKED) {
	ticks++;
	reti();
}

00000098 <__vector_10>:
  98:	33 94       	inc	r3
  9a:	18 95       	reti

 

...and that's why going naked can leave you in the cold...

 

INC affects flags.  And if ticks is indeed hard-assigned to R3, then doesn't that limit you in using GCC library functions? [In CV, I have a number of low registers that are "mine".]

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


It doesn't limit you as long as your libs are all aware of it. Of course you can't compile a lib as if r3 was free, and then link to it while claiming r3 as your own… This has nothing to do with GCC: I doubt CV would do any better.

 

You're right about the flags. But if you get to the root of the problem, your code assumes that setting a bit is enough: if two ticks ever happen before this bit is checked/cleared, one tick will be lost. Mine won't miss it.

In other words: in normal circumstances, ticks should never be >1. If it does, your code misses it, mine gets lost only if it gets >255 (in which case the app definitely has gone rogue anyway…). At least my code has a chance to detect something was wrong…
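
To make the flag-versus-counter difference concrete, here is a minimal host-side sketch in plain C. It is only a model, not AVR code: `tick_flag` and `tick_count` are hypothetical stand-ins for the I/O bit set by SBI and for the r3 register variable, and `timer_tick()` plays the role of both ISRs at once.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the two styles of tick bookkeeping. */
uint8_t tick_flag;   /* flag style: every tick just sets it to 1   */
uint8_t tick_count;  /* counter style: every tick increments it    */

/* What the two ISRs would do on a single timer interrupt. */
void timer_tick(void) {
    tick_flag = 1;
    tick_count++;
}

/* Main loop, flag style: reports how many ticks it noticed (0 or 1). */
unsigned consume_flag(void) {
    unsigned seen = tick_flag ? 1u : 0u;
    tick_flag = 0;
    return seen;
}

/* Main loop, counter style: reports every tick since the last check. */
unsigned consume_count(void) {
    unsigned seen = tick_count;
    tick_count = 0;
    return seen;
}
```

If `timer_tick()` runs twice before the main loop checks, `consume_flag()` reports one tick while `consume_count()` reports two, which is the "at least my code has a chance to detect something was wrong" argument. The counter only wraps once more than 255 ticks have been missed.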

 

Notice we're getting away from your point: the naked ISR did compile fine, using r3 (or not) is another issue.

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Mon. Mar 7, 2016 - 10:31 PM

As a complement: using registers as global variables indeed is tricky with GCC. For example, read operations must be handled in a special way, or they could be optimized away (as such a register content, even when it's tagged as volatile, is not expected to change within a code block). For example:

#define reg(_r)  ({ __asm__ __volatile__ ("" : "+r"(_r)); _r; })

Also, compiling with -flto will fail unless you also link with -flto-partition=max on my toolchain version. I guess that's the price to pay to get infinite value. ;-)

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Mon. Mar 7, 2016 - 10:59 PM

So, by your logic, if my home-made compiler needed 32KB to blink a LED, it would "behave properly".

Hyperbole doesn't help your case.

 

There is a clear distinction between 'correct' and 'optimal'.  'Correct' is unambiguous.  The generated code expresses the source code as defined by the language standard.  'Optimal' is a matter of perspective.  It is also a game of compromise and trade-offs.

 

But when we use -Os we're explicitly asking for the smallest code the toolchain can produce

You have misunderstood what -Os does:

       -Os Optimize for size.  -Os enables all -O2 optimizations that do not
           typically increase code size.  It also performs further
           optimizations designed to reduce code size.

           -Os disables the following optimization flags: -falign-functions
           -falign-jumps  -falign-loops -falign-labels  -freorder-blocks
           -freorder-blocks-and-partition -fprefetch-loop-arrays

 

While the casual reader might stop after 'Optimise for size', the careful developer will go on to examine the specific optimisation strategies which are enabled by -O2.  There are dozens.  Almost all optimisation strategies have space/speed tradeoffs, and all of the -O* options are merely 'presets', predefined collections of dozens of specific optimisations which have been selected to achieve a general objective.  By no means is -Os 'the smallest code the toolchain can produce', nor is it anywhere claimed that it is.

 

And nowhere is it mentioned that the CRT will be omitted by using -Os.  Indeed, since the CRT is a precompiled module, there is nothing the compiler can do to further reduce its size.  Indeed, there is nothing the compiler could do about it anyway, since it is a link-time issue.  The mechanism is clearly documented, and if you are part of the 0.1% of users who are offended at the notion of a robust CRT, you are free to implement your own.  Anyone who has written a bootloader is already familiar with these issues, so have a look at any of the zillion bootloader projects out there.  Start with -nostartfiles and -nostdlib.

 

I stand by my statement that this is hardly worth more effort than has already been expended on the subject.  A solution (in the form of the documented mechanism and a means to override it) exists.  Effort and resources towards the development effort for AVR GCC could be far more effectively deployed against a number of other deficiencies w.r.t. efficient code generation, loop optimisation, ISR prologues/epilogues, etc.  Fighting to carve off a few words of flash in code which runs once after reset is not an efficient use of a developer's time.  But by all means, knock yourself out.

 

It doesn't limit you as long as your libs are all aware of it. Of course you can't compile a lib as if r3 was free, and then link to it while claiming r3 as your own… This has nothing to do with GCC: I doubt CV would do any better.

Reserving a register for a global has no effect on prebuilt library code.  You must ensure that any prebuilt libraries (like stdio, floating point) are used atomically in an app which uses global register variables in an ISR.  For example, vfprintf uses nearly all of the general purpose registers.  Some floating point routines do use all of them.  They preserve and restore the contents of all of the call-saved registers, of course.  However, if an interrupt were to break in while the main thread were executing any of that library code, the ISR would operate on data it doesn't own.  Upon ISR return, the interrupted library code would have corrupted data, and the operation performed by the ISR on the register variable would be lost once the library restored that register.  The only solution in that context is to make all prebuilt library code atomic, which rather defeats the purpose of using a register variable in the first place, since that was likely done to increase ISR performance.
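
The failure mode described above can be modelled on a host machine. The following is a rough sketch under stated assumptions, not real AVR code: the global `r3` stands in for the hardware register, and `library_routine()` is a hypothetical prebuilt function that was compiled without knowing r3 is reserved, so it saves the register, uses it as scratch, and restores it.

```c
#include <stdint.h>

uint8_t r3;  /* models hardware register r3, which holds the tick counter */

/* The ISR from the thread: it simply increments the register variable. */
void isr_tick(void) {
    r3++;
}

/* Hypothetical prebuilt library routine that treats r3 as call-saved scratch. */
void library_routine(int interrupt_fires_inside) {
    uint8_t saved = r3;          /* push r3                                  */
    r3 = 0xAA;                   /* library's own temporary data             */
    if (interrupt_fires_inside)
        isr_tick();              /* ISR corrupts the library's scratch value */
    r3 = saved;                  /* pop r3: the ISR's increment is discarded */
}
```

Calling `library_routine(1)` leaves `r3` exactly as it was before the call, so the tick that arrived in the middle is silently lost, and the library worked on a scratch value the ISR had modified.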

 

As for CV, I can't say.  But other toolchains like ICC do allow real global reserved register variables.  No other code will ever touch them.

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 

Last Edited: Tue. Mar 8, 2016 - 04:37 AM

joeymorin wrote:
netizen wrote:
So, by your logic, if my home-made compiler needed 32KB to blink a LED, it would "behave properly".
Hyperbole doesn't help your case. There is a clear distinction between 'correct' and 'optimal'. 'Correct' is unambiguous. The generated code expresses the source code as defined by the language standard. 'Optimal' is a matter of perspective. It is also a game of compromise and trade-offs.

Notice I did not explicitly talk about "correctly" or "optimally" (you did). I talked about "properly", which implies both correctness as well as a reasonable level of optimality (what my hyperbole was hinting at).

For example, leaving unused bss code behind is correct, not optimal, and not proper behavior either (especially while optimizing for size); removing it would be correct, more optimal and proper behavior. Notice there is no compromise or trade-off to blur such a case: it's bigger and it's not faster.



I believe the same applies to expecting main to return on an embedded system (especially while explicitly optimizing for size). It seems I'm not alone in this belief since CV implements it, as theusch has shown. In my view, avr-gcc should also assume by default that main won't return; if you need it to return, you should have to explicitly ask for it.

Practically speaking, what I find annoying is that I cannot seem to get rid of it: if I add the noreturn attribute to main(), for example, the clutter is still there. Does anyone know how to achieve this?

 

joeymorin wrote:
You have misunderstood what -Os does:

Quote:
-Os Optimize for size. -Os enables all -O2 optimizations that do not typically increase code size. It also performs further optimizations designed to reduce code size.

I don't think I do: -Os does its best to reduce code size. Exactly my point.

I fail to see how optimization being a collection of strategies has anything to do with our discussion.

 

joeymorin wrote:
And nowhere is it mentioned that the CRT will be omitted by using -Os.

Nowhere has anyone claimed otherwise…?

 

joeymorin wrote:
I stand by my statement that this is hardly worth more effort than has already been expended on the subject. (…) Fighting to carve off a few words of flash in code which runs once after reset is not an efficient use of a developer's time.

Back to the economics argument. Nothing like it to disguise better as worse. ;-)

Yes, it's "just a few words of flash". Notice they would make the code both smaller and faster: no speed/size trade-off here either. Thus it's not even worth an optimization flag: it should always be done.

You make it sound like I'm the bad guy bitching about the project and its developers (and conveniently you're the good guy defending them). Nothing like this here: it's often the case in such a situation that what appears at first glance as an enhancement ends up not being one, once the bigger picture is known, that's why we're debating. The result of the discussion so far is: it still is an enhancement. That's the very first step in any implementation process.

You can brush it off on the basis it's just a few words of flash (I'd be happy with a single one!), you can justify it apparently not being implemented yet because of developers time management, you can put forward other features/enhancements that are more important to you. To me, these are all beside the point; What I hear however is that you're implicitly admitting it would indeed be an enhancement. Thanks. ^^

 

joeymorin wrote:
As for CV, I can't say. But other toolchains like ICC do allow real global reserved register variables. No other code will ever touch them.

Interesting. I'd be curious to know how they do it. I can only think of two strategies: either recompile everything so that every piece of code is aware it can't touch these registers (what I'm doing), or unconditionally reserve some registers for users beforehand (which means all prebuilt libraries are under-optimized when the user does not use these registers).

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Tue. Mar 8, 2016 - 09:06 AM

netizen wrote:
Hope that makes sense…

Not to me. How could you implement a function that is both callable and inline-able? If I wanted to achieve this I would do:

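// head.h
// GCC rejects an attribute placed after the declarator of a function
// definition, so the attribute goes before the return type.
static inline __attribute__((always_inline)) void _foo(void) {
    // stuff
}
// callable.c
#include "head.h"

// Out-of-line, callable wrapper around the always-inlined body.
void foo(void) {
    _foo();
}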

Now foo() is a callable version and _foo() is an inlined version that will be inlined wherever it is invoked in any source file that includes head.h.
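
For anyone who wants to try the pattern on a desktop GCC or Clang first, here is a small self-contained variant with a concrete body. The names `scale_inline` and `scale` are made up for illustration; the two halves would normally live in a header and a .c file respectively.

```c
/* Would live in the header: always inlined at every call site. */
static inline __attribute__((always_inline)) int scale_inline(int x) {
    return 3 * x + 1;
}

/* Would live in one .c file: an ordinary callable function whose body
 * is the inlined code, so its address can be taken and it can be linked
 * against from other translation units. */
int scale(int x) {
    return scale_inline(x);
}
```

Hot paths call `scale_inline()` directly and pay no call overhead, while code that needs a real function (a function pointer, or a caller that does not include the header) uses `scale()`.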

netizen wrote:
I noticed there was unnecessarily wasted space in my vector table: I know which ones I'm using, I don't need the other ones to link to __bad_interrupt, especially when all the latter does is jump back to __vectors (reset)! I mean, I understand how that could be used to implement a default ISR e.g. (your second link shows a use-case for this), so a hook of some sort definitely is a good idea. But when this hook's not used, -Os should at least make the unused vectors point straight to reset. Or am I missing something?

This is why -nostartfiles exists. If you don't like the compiler/library provided solution for the vector table (which for most people is perfectly acceptable) then you use -nostartfiles so that crt<model>.o is not linked and then you choose whether to make your own modified gcrt1.S or implement it in some other way.

netizen wrote:
I also noticed there were _exit and __stop_program instructions that, for the life of me, I couldn't get rid off

They are part of the CRT too, so they will also disappear if you use -nostartfiles. In that case it becomes your responsibility to provide equivalents if you need them.

 

In short the CRT basically does something like:

/* Pseudo-C for illustration only; the real CRT is assembly (gcrt1.S). */
void vectors(void) __attribute__((naked, section(".vectors"))); /* @ 0x000 */
void vectors(void) {
    asm("jmp reset\n\t"
        "jmp __bad_interrupt\n\t"
        "jmp __bad_interrupt\n\t"
        "jmp __bad_interrupt\n\t"
        /* ...and so on, one entry per vector... */
        );
}

register uint8_t reg1 asm("r1");

void reset(void) __attribute__((naked, section(".init2")));
void reset(void) {
    reg1 = 0x00;    // compiler relies a lot on 0x00 in R1 to save setting up 0x00
    SREG = reg1;    // all flags cleared, including I
    SP = RAMEND;
    main();
_exit:
    cli();
    while (1);
}

void __bad_interrupt(void) {
    asm("jmp 0");
}

If you don't want all that use -nostartfiles but then you need to provide everything from the jmp at location 0 onwards. You have to be particularly careful about ensuring that R1 will contain 0x00 as soon as you enter "real" C code.

netizen wrote:
however unlikely, the toolchain should ultimately behave properly in those cases.

GCC is an open source compiler - patches always welcome from those who think they can improve it!

theusch wrote:
JMP to main(), not CALL.

What happens when main() returns? (as many beginners often make the mistake of doing)

netizen wrote:

I don't know about CV's smart ISR, but that's pretty close to my code:

volatile register u08 ticks asm("r3");
ISR(TIM0_COMPA_vect, ISR_NAKED) {
	ticks++;
	reti();
}

00000098 <__vector_10>:
  98:	33 94       	inc	r3
  9a:	18 95       	reti

Of course I'm not just setting a bit, I'm incrementing: so an IO register <0x1F isn't enough to get atomicity.

You can only use this code *IF* the INC opcode does not change any bits in SREG. But it can change all of S, V, N and Z so this code will cause really pernicious bugs at some point!!

EDIT: sorry missed the fact that Lee pointed out how dangerous this use of INC in the ISR without SREG preservation is in the very next post - I must learn to read! (but I don't think it hurts to mention it again - your ISR() is DANGEROUS!).
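
To show why the missing SREG save/restore matters, here is a small host-side model in C. It is a sketch under simplifying assumptions: only the flags that INC updates are modelled (S, V, N, Z; INC leaves C alone), the hypothetical `avr_cp()` tracks Z only, and `branch_taken()` imitates a `CP` / `BREQ` pair in the main code with the naked ISR optionally firing between the two instructions.

```c
#include <stdbool.h>
#include <stdint.h>

/* The status flags that INC updates. */
typedef struct { bool z, n, v, s; } flags_t;

/* INC Rd: result in Rd, updates Z, N, V and S; C is left untouched. */
uint8_t avr_inc(uint8_t rd, flags_t *f) {
    uint8_t r = (uint8_t)(rd + 1);
    f->z = (r == 0);
    f->n = (r & 0x80) != 0;
    f->v = (r == 0x80);          /* overflow: Rd was 0x7F before the INC */
    f->s = (f->n != f->v);       /* S = N xor V */
    return r;
}

/* CP Rd,Rr: simplified, only the Z flag is modelled here. */
void avr_cp(uint8_t rd, uint8_t rr, flags_t *f) {
    f->z = (rd == rr);
}

/* Main code executes CP a,b then BREQ. Returns true if the branch is
 * taken. With interrupt_between set, the naked ISR (INC on *ticks, no
 * SREG save/restore) runs between the two instructions. */
bool branch_taken(uint8_t a, uint8_t b, bool interrupt_between, uint8_t *ticks) {
    flags_t sreg = { false, false, false, false };
    avr_cp(a, b, &sreg);
    if (interrupt_between)
        *ticks = avr_inc(*ticks, &sreg);
    return sreg.z;               /* BREQ branches when Z is set */
}
```

In this model, comparing 7 with 7 takes the branch normally, but not if the ISR fires in between and its INC clears Z. Conversely, comparing 1 with 2 wrongly takes the branch when the INC happens to wrap the counter from 255 to 0. Which of these happens depends on interrupt timing, which is what makes such bugs so hard to find.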

Last Edited: Tue. Mar 8, 2016 - 11:27 AM

clawson wrote:
How could you implement a function that is both callable and inline-able?

Of course not at the same time: depends on the context. Check this library code for example:

// lib.h
void sub_feature();
void feature();
// lib.c
#include "lib.h"
inline void sub_feature() { ... }
void feature() { sub_feature(); ... }

sub_feature() sure cannot be inlined when compiling lib.o, because it could be called by external code.

Suppose our app only calls feature(): sub_feature() can now be inlined within feature() at link time. Of course, if our app called both, there would be no such inlining.

Makes sense?

 

clawson wrote:
This is why -nostartfiles exists.

Sure, using -nostartfiles and a custom CRT should do the trick for me (I'll give it a go and report). But I'm arguing the behaviors I've criticized should be "fixed" for everyone. We're debating about whether or not that is the case.

For example, I believe not returning from main() is by far the most common use-case of avr-gcc, thus main should be jumped to, _exit and __stop_program disappearing; At least when main() does not return (i.e. has no ret instruction). Or __bad_interrupt should not be there when the hook isn't used (unused vectors should directly point to reset). Etc. What do you think?

To be honest I suspect some of these would be the default, as they are in other toolchains, if they could easily be implemented (by modifying the CRT e.g.).

 

clawson wrote:
GCC is an open source compiler - patches always welcome from those who think they can improve it!

Sure. But this starts with agreeing on what would be an improvement. :-)

 

clawson wrote:
You can only use this code *IF* the INC opcode does not change any bits in SREG. But it can change all of S, V, N and Z so this code will cause really pernicious bugs at some point!!

Originally, this was an answer to theusch's remark (and code) that GCC did not handle naked ISR well enough to his taste (or so). I haven't had any issue with them. Have you?

But if you want to talk about inc r3: this ISR is part of a time-triggered architecture and it ticks while the MCU is in idle mode. In other words, it always triggers after the same SLEEP instruction (and before the same instruction — which we do not care about). Besides this, you're right: there is no safety cushion here.

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Tue. Mar 8, 2016 - 01:28 PM

netizen wrote:
But I'm arguing the behaviors I've criticized should be "fixed" for everyone.

What behaviours? I've been a user of avr-gcc since 2005 and I cannot think of anything in the standard CRT that I would object to or want removed. I like the fact that there's a fully populated IVT with unused entries pointed to __bad_interrupt() because if you then ever make the mistake of enabling an IE bit in some control register without having provided the ISR you cannot miss the fact that it causes a "JMP 0". I like the fact that it clears I in SREG for me so when control arrives at 0 then interrupts are immediately turned off. I like the fact that main() is called so that if any user exits main() it can be caught and what's more I like the _exit() code that catches that because it cli()s before it deadloops. So who do you claim to be speaking on behalf of if you think there are behaviours to be criticized?
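
The "fully populated table with a catch-all" idea can be sketched on a host with a table of function pointers. This is only a model with made-up names (`NUM_VECTORS`, `timer_isr`, `raise_interrupt`); on the AVR the table holds jump instructions, and the catch-all is a jump to address 0 rather than a counter.

```c
#include <stddef.h>

#define NUM_VECTORS 4            /* hypothetical, tiny table for illustration */

typedef void (*handler_t)(void);

int reset_count;                 /* how often the modelled "jmp 0" was taken */
int timer_count;                 /* how often the real ISR ran               */

void bad_interrupt(void) { reset_count++; }   /* models "jmp 0" */
void timer_isr(void)     { timer_count++; }   /* the one ISR the app provides */

/* Every slot is populated: slots without a real ISR hold the catch-all. */
handler_t vectors[NUM_VECTORS] = {
    bad_interrupt,               /* slot 0: reset vector, unused in this model */
    timer_isr,
    bad_interrupt,
    bad_interrupt,
};

/* Models the hardware dispatching interrupt number n. */
void raise_interrupt(size_t n) {
    if (n < NUM_VECTORS)
        vectors[n]();
}
```

Raising interrupt 1 runs the real handler; raising interrupt 2, which was enabled by mistake without an ISR, lands in the catch-all instead of running whatever code happens to sit at that address.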

 

I also like the fact that GCC has -nostartfiles so if (as is the case here: https://spaces.atmel.com/gf/proj... ) I ever have reason not to want the IVT+CRT I can simply switch it off and provide my own.

 

BTW there was once an issue of avr-gcc where the CRT was changed so that there was no "catch all" for code that returned from main(). For about the following two years (until that version was totally purged from usage) these message boards lit up with people being caught by it!

netizen wrote:
I haven't had any issue with them. Have you?

I haven't used "naked" heavily but if I did I know enough to preserve SREG if I ever write code that changes a flag.

netizen wrote:
In other words, it always triggers after the same SLEEP instruction

Wow that's a dangerous assumption to make! I just hope you aren't involved in medical or automotive electronics as I wouldn't want my life left in your hands! cheeky


clawson wrote:
What behaviours? I've been a user of avr-gcc since 2005 and I cannot think of anything in the standard CRT that I would object to or want removed.

See why I'm discussing it before I bother looking into a patch? :-)

Thanks for your contribution!

 

clawson wrote:
I like the fact that there's a fully populated IVT with unused entries pointed to __bad_interrupt() because if you then ever make the mistake of enabling an IE bit in some control register without having provided the ISR you cannot miss the fact that it causes a "JMP 0".

I suppose you mean when using a debugger? Otherwise, a reset is in itself quite noticeable…

 

clawson wrote:
I like the fact that it clears I in SREG for me so when control arrives at 0 then interrupts are immediately turned off.

Are you talking about .init2 code (or non-naked ISR handling)?

 

clawson wrote:
I like the fact that main() is called so that if any user exits main() it can be caught and what's more I like the _exit() code that catches that because it cli()s before it deadloops.

Correct me if I'm wrong, but it is known at compile time whether main contains a ret instruction. When it does not, it is still called and it is still followed by a jump to _exit. Call me a maniac, but I don't like that. ^^

I'm afraid relying on this kind of cushion can be dangerous (and if the cushion is there it will be relied on): you're assuming a dead-loop is a safe state for every system. Planes have gone to the ground with such implicit assumptions.

 

clawson wrote:
BTW there was once an issue of avr-gcc where the CRT was changed so that there was no "catch all" for code that returned from main(). For about the following two years (until that version was totally purged from usage) these message boards lit up with people being caught by it!

Interesting. So that's why this catch-all code is always included: to protect the message boards! :-)

 

clawson wrote:
Wow that's a dangerous assumption to make! I just hope you aren't involved in medical or automotive electronics as I wouldn't want my life left in your hands! cheeky

In the context of a TTA this is an assumption that can be proven right, contrary to the "real time" designs you seem to be exclusively working with. That's why it's mainly popular in critical applications like aeronautics and nuclear plants.

This being said, you assume too much: if your life depended on my code, I would preserve SREG. Well, on the other hand, now that I know you're opposing my proposals… ;-)

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Tue. Mar 8, 2016 - 03:09 PM

netizen wrote:
I suppose you mean when using a debugger? Otherwise, a reset is in itself quite noticeable…

You clearly have not read Freaks for long enough! ;-)

 

We get endless threads about "my AVR keeps resetting". When it is established that avr-gcc is being used, our first suggestion to such users is "you are probably accidentally enabling an interrupt without providing a handler". About 90..95% of the time we turn out to be right. So it is a VERY useful diagnostic for determining when interrupts have been enabled in error. Often the user has enabled TOIE1 but provided ISR(TIMER0_OVF_vect), or something like that.
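The mismatch described above (one interrupt enabled, a handler supplied for a different one) can be modeled on a host PC with a plain table of function pointers. This is only a sketch of the idea, not the real gcrt1.S mechanism: the vector numbers, the counters and every function name below are hypothetical, and the "reset" is reduced to incrementing a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vector numbers for this host-side model (not a real device). */
enum { VEC_TIMER0_OVF, VEC_TIMER1_OVF, VEC_COUNT };

typedef void (*handler_t)(void);

static int bad_interrupt_hits; /* stands in for the "JMP 0" restart */
static int timer0_hits;        /* how often the user's handler ran  */

static handler_t vectors[VEC_COUNT];

/* Catch-all: on a real AVR this would jump to address 0. */
void bad_interrupt(void) { bad_interrupt_hits++; }

/* The one handler the user actually wrote. */
void timer0_ovf_handler(void) { timer0_hits++; }

/* Point every slot at the catch-all, like a fully populated table. */
void vectors_init(void) {
    for (size_t i = 0; i < VEC_COUNT; i++) {
        vectors[i] = bad_interrupt;
    }
}

/* Providing an ISR replaces the catch-all in exactly one slot. */
void vector_set(size_t n, handler_t h) {
    assert(n < VEC_COUNT);
    vectors[n] = h;
}

/* Model an interrupt firing: dispatch through the table. */
void fire(size_t n) {
    assert(n < VEC_COUNT);
    vectors[n]();
}
```

Calling vectors_init(), registering timer0_ovf_handler in the VEC_TIMER0_OVF slot and then firing VEC_TIMER1_OVF ends up in bad_interrupt() rather than in the user's handler, which is the "keeps resetting" symptom in miniature.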

netizen wrote:
Are you talking about .init2 code (or non-naked ISR handling)?

I'm talking about:

	.section .init2,"ax",@progbits
	clr	__zero_reg__
	out	AVR_STATUS_ADDR, __zero_reg__
	ldi	r28,lo8(__stack)
#ifdef __AVR_XMEGA__
	out	AVR_STACK_POINTER_LO_ADDR, r28
#ifdef _HAVE_AVR_STACK_POINTER_HI
	ldi	r29,hi8(__stack)
	out	AVR_STACK_POINTER_HI_ADDR, r29
#endif	/* _HAVE_AVR_STACK_POINTER_HI */
#else
#ifdef _HAVE_AVR_STACK_POINTER_HI
	ldi	r29,hi8(__stack)
	out	AVR_STACK_POINTER_HI_ADDR, r29
#endif	/* _HAVE_AVR_STACK_POINTER_HI */
	out	AVR_STACK_POINTER_LO_ADDR, r28
#endif  /* __AVR_XMEGA__ */

That is the code in gcrt1.S that goes into .init2. It clears R1, writes the cleared R1 to SREG (here referred to as AVR_STATUS_ADDR), and sets the stack pointer.

netizen wrote:
Correct me if I'm wrong, but it is known at compile time whether main contains a ret instruction. When it does not, it is still called and it is still followed by a jump to _exit. Call me a maniac, but I don't like that. ^^

You are undoubtedly a maniac! One thing you have to understand about avr-gcc is that it's simply one of many variants of GCC. Some of what the C compilers do is "per target" but some of it is generic. I think you'll find that a change like this may require changes in the generic C compiler, and the ARM and i386/AMD64 boys simply aren't going to wear such a change. In fact part of the reason that avr-gcc can never be as good as some of the commercial compilers like IAR and CodeVision is that it is forever battling against its ARM/i386/AMD64 heritage. It was enough of a stretch to get "__flash" added recently, as ARM (well, usually)/i386/AMD64 are von Neumann not Harvard and adding Harvard-supporting stuff is outside of their remit.

 


Alright. I guess I'll see what I can do hacking crt1 first, and then if I get something I deem satisfying I'll open a thread to describe what I've done and see if anyone else is interested. I feel like you're not gonna be my easiest client. :-)

It is understood we're not talking about anything revolutionary, just a few words saved here and there, no unused init code left-overs (ideally), tiny bit cleaner code in corner cases…

ɴᴇᴛɪᴢᴇᴎ


Why this press to save a few words anyway? The only place you normally "program tight" (like not wanting a full IVT) is in bootloaders, where Atmel give you a fixed 0.5K/1K/2K/4K/8K up the top end of a chip and you have to squeeze everything you can into such a small space. In such instances you are bound to use -nostartfiles and handle any standard CRT stuff yourself, if only to remove the IVT. Otherwise (as Lee says above) some might claim the avr-gcc standard CRT is actually "too small", as it does not handle important things like early disabling of the watchdog, for example. (My counter-argument to him is that this is why .init1/3/5/7 are provided for the end user's use ;-).


Something similar: sort of a tiny OS (a glorified scheduler really), where any byte saved is a byte available for actual application code.

But I'd be lying if I said it was the only reason. Aesthetics is another reason: I find minimalistic and efficient code beautiful, while dead code and left-overs are confusing and tickle horribly. ^^

I know it's just a few bytes, but that'd be a few bytes on every avr-gcc compiled binary or so… That means something to me.

Also, even if it's just 4 words (rjmp to _exit, __bad_interrupt, _exit and __stop_program), in relative terms that's about 36% of the 11 words gcrt1 spits out. :-D

ɴᴇᴛɪᴢᴇᴎ


As you now know, this is easily achieved. You pull a copy of gcrt1.S that matches your installed AVR-LibC. You use -nostartfiles. You add gcrt1.S to your own sources and modify away to your heart's content :-)


Thanks, I did just that and it works beautifully. :-)

I kept the vector table so far, but got rid of the stuff I mentioned earlier. Much nicer and 8 precious bytes saved: a win/win scenario as far as I'm concerned!

 

While researching this, I came across an old post from ralphd saying he wanted to introduce a switch to remove the __bad_interrupt code on demand, but never found the time to do it. Perhaps that'd be the way to go?

Say a -minimal-startup option that would get rid of anything unnecessary, so that people like me are happy. Yet that stuff would still be here by default, so message boards and people like you would be happy too. :-)

	.macro	vector name
	.if (. - __vectors < _VECTORS_SIZE)
	.weak	\name
+#ifndef MINIMAL_STARTUP
	.set	\name, __bad_interrupt
+#else
+	.set	\name, __vectors
+#endif
	XJMP	\name
	.endif
	.endm
(...)
+#ifndef MINIMAL_STARTUP
	XCALL	main
	XJMP	exit
+#else
+	XJMP	main
+#endif

This does the job (optimization gets rid of left-overs).

Although hacking gcrt1.S works, it's dependent on the libc version used, thus it's not very practical. And once again, I would definitely use that in every app I work on (not limited to bootloader/OS stuff). I have a hard time imagining I'm the only one.

 

On another front, I noticed interrupt vectors are numbered 1 to N in AVR datasheets, but 0 to N-1 in gcc. This seems unnecessarily confusing. Any idea why?
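For what it's worth, the off-by-one between the two numberings can be written down as a tiny host-side sketch. The function names are made up for illustration, and this only shows the arithmetic; it does not claim to explain why gcc numbers the vectors the way it does. One thing it makes visible is that on parts whose table entries are one word long, the zero-based index is also the flash word address of the entry.

```c
#include <assert.h>

/* Zero-based index as used in avr-gcc's __vector_N names, derived from
 * the one-based vector number printed in the datasheet table. */
unsigned gcc_vector_index(unsigned datasheet_no) {
    assert(datasheet_no >= 1); /* datasheets start counting at 1 (RESET) */
    return datasheet_no - 1;
}

/* Flash word address of a table entry. words_per_vector is assumed to be
 * 1 for tables built from rjmp and 2 for tables built from jmp. */
unsigned vector_word_address(unsigned datasheet_no, unsigned words_per_vector) {
    return gcc_vector_index(datasheet_no) * words_per_vector;
}
```

With one-word entries, datasheet vector 6 maps to index 5 and to word address 5; with two-word entries the same vector would sit at word address 10.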

 

EDIT: Removed references to SRAM (need sleep…).

ɴᴇᴛɪᴢᴇᴎ

Last Edited: Tue. Mar 8, 2016 - 11:46 PM

It seems to me the simplest way to do at least some of the desired optimization is to split gcrt1.S into pieces.

Use -nostartfiles and link in the desired pieces.

Moderation in all things. -- ancient proverb

Last Edited: Wed. Mar 9, 2016 - 04:04 AM
