How useful is shrinking the vector table?


Would it be useful if the linker shrank the vector table down to the last "used" vector table entry?

Say, for the following piece of code for an XMEGA

#include <avr/io.h>
#include <avr/interrupt.h>

ISR (__vector_1)
{
    return;
}
int main()
{
    return 0;
}

the toolchain generates

00000000 <__vectors>:
   0:   03 c0            rjmp    .+6              ; 0x8 <__ctors_end>
   2:   00 00            nop
   4:   10 c0            rjmp    .+32             ; 0x26 <__vector_1>
        ...

00000008 <__ctors_end>:
   8:   11 24            eor     r1, r1
   a:   1f be            out     0x3f, r1        ; 63

instead of (ignore the jmp/rjmp difference - it is 4 bytes per entry either way)

00000000 <__vectors>:
   0:   0c 94 fa 00     jmp 0x1f4   ; 0x1f4 <__ctors_end>
   4:   0c 94 0c 01     jmp 0x218   ; 0x218 <__vector_1>
   ...
   1f0:    0c 94 0a 01     jmp 0x214   ; 0x214 <__bad_interrupt>

Of course, shrinking down to __vector_1 is the best case, saving 490-odd bytes. If the only entry used is the last one in the table, nothing gets shrunk.
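The arithmetic behind that best case can be sketched in a few lines of host-side C. This is only an illustration, not toolchain code: the function name ivt_bytes_saved and its parameters are made up for this example, and the 125-entry table size is inferred from the listing above (last entry at 0x1f0, four bytes per entry).

```c
#include <assert.h>
#include <stddef.h>

/* Bytes saved when the table is cut off right after the last handled entry.
 * total_entries: table length including the reset entry (entry 0)
 * last_used:     index of the highest entry that has a real handler
 * entry_bytes:   size of one entry (4 in the listing above) */
size_t ivt_bytes_saved(size_t total_entries, size_t last_used, size_t entry_bytes)
{
    assert(total_entries > 0 && last_used < total_entries);
    return (total_entries - (last_used + 1)) * entry_bytes;
}
```

With the numbers above, ivt_bytes_saved(125, 1, 4) gives 492 bytes, and ivt_bytes_saved(125, 124, 4) gives 0 when only the last entry is used.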

Is this useful at all? Is it critical that interrupts that are enabled but not handled should have some defined behavior (reset/run user defined code)?

Regards

Senthil

 

blog | website


VERY useful for bootloader authors and those using small micros.


I've been coveting those bytes.

So here I am, a relatively clueless Studio 6 user. My program uses interrupts 0 and 1, and of course Reset, so most of the table isn't used for anything. How do I tell Studio 6 to use vectors 4 through 19 for program code?

The largest known prime number: 2^82589933-1

In my humble opinion, I'm always right. 


You cannot at present; the best you can do is hand-code a vector table in Asm based on gcrt1.S in AVR-Libc. OP is, I believe, one of the Atmel developers and is presumably polling us to ask whether it would be worthwhile for Atmel to expend the effort to add such a feature in a future version. I guess this makes it two yes votes so far.


Or maybe they'll decide not to so I'll start buying more expensive chips :D

The largest known prime number: 2^82589933-1

In my humble opinion, I'm always right. 


If the toolsmiths could see their way clear to regain the feature of early (1980s-style) linkers whereby a "best fit" packing of scads of separate, small segments into a given memory space after any absolutely-located segments were placed as constraints could be achieved, the interrupt vector tables would always be minimal. The ISR() macros would simply emit a one-instruction-long segment, absolutely addressed at the appropriate vector table slot, containing a jump to the supplied routine.


Levenkay wrote:
The ISR() macros would simply emit a one-instruction-long segment, absolutely addressed at the appropriate vector table slot, containing a jump to the supplied routine.
That would be the easy part... but can you please demonstrate how that would work?

I can't quite envisage how the linker would know which of the interrupt vectors are in use. The absolute location of those segments is just a secondary problem then.

Please prove me wrong.

JW


wek wrote:
I can't quite envisage how the linker would know which of the interrupt vectors are in use.
JW

The standard avr-libc way of setting up the interrupt vector table is to set up weak symbols for __vector_n, with the default value being __bad_interrupt. The table is then filled with jumps to the corresponding __vector_n symbols. All this goes into the .vectors section, which the linker scripts place first in the .text section of the ELF file.

When user code defines an interrupt, that definition overrides the (weak) one in avr-libc. This fact (weak vs non-weak definition) can be used by the linker to figure out which interrupts have handlers in user code.
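The weak-default idea can be tried out on a PC. The following is a minimal host-side sketch, assuming GCC or Clang on an ELF target (the alias attribute is not available everywhere). The names bad_interrupt, vector_1, vector_2 and vector_is_handled are stand-ins invented for this example; avr-libc does the equivalent in assembly, and there the weak default and the user's strong definition live in different object files, which a single-file demo cannot show for the same symbol.

```c
#include <assert.h>

static int bad_count, user_count;

/* Catch-all, standing in for avr-libc's __bad_interrupt. */
void bad_interrupt(void) { bad_count++; }

/* "User code": a strong (non-weak) definition for vector 1. */
void vector_1(void) { user_count++; }

/* "Library default": vector 2 has no user handler, so the weak alias
 * makes it resolve to the catch-all. */
void vector_2(void) __attribute__((weak, alias("bad_interrupt")));

/* A vector counts as handled if it does not resolve to the catch-all. */
int vector_is_handled(void (*vec)(void))
{
    assert(vec != 0);
    return vec != bad_interrupt;
}
```

Calling vector_is_handled(vector_1) reports the vector as handled, while vector_is_handled(vector_2) does not, which is the same weak versus non-weak distinction a linker could use.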

I actually have a working version of the linker that does this:
- Walk through all symbols in the .vectors section
- Look for those that start with __vector_ and extract the n part of __vector_n
- Find the max (non-weak) defined n, and the max available n - this corresponds to the last interrupt that has a handler, and the last interrupt for that device.
- Find the addresses for the vector table entries corresponding to the last handled interrupt and the last interrupt, and delete everything between.

Of course, this only works if all the above conventions are followed: interrupt handlers are named __vector_n, the interrupt table is from avr-libc, linker scripts don't do anything funny, etc.
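The four steps listed above can be sketched on a mock symbol table. This is a host-side illustration only, not the actual binutils patch: struct sym, scan_vectors and deletable_bytes are hypothetical names, and a real linker would read the weak flag and the addresses from the ELF symbol table instead of a hand-written array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified view of one symbol in .vectors. */
struct sym {
    const char *name;   /* e.g. "__vector_12" */
    int is_weak;        /* 1 = still the weak avr-libc default */
};

/* Returns n for "__vector_n", or -1 if the name does not match. */
static long vector_number(const char *name)
{
    static const char prefix[] = "__vector_";
    size_t plen = sizeof prefix - 1;
    char *end;
    long n;

    if (strncmp(name, prefix, plen) != 0 || name[plen] < '0' || name[plen] > '9')
        return -1;
    n = strtol(name + plen, &end, 10);
    return (*end == '\0') ? n : -1;
}

/* Steps 1-3: find the highest vector number overall (*max_avail) and the
 * highest one with a non-weak definition (*max_used); -1 means none. */
void scan_vectors(const struct sym *syms, size_t count,
                  long *max_used, long *max_avail)
{
    size_t i;

    assert(syms != NULL && max_used != NULL && max_avail != NULL);
    *max_used = *max_avail = -1;
    for (i = 0; i < count; i++) {
        long n = vector_number(syms[i].name);
        if (n < 0)
            continue;               /* e.g. __vector_default, __vectors */
        if (n > *max_avail)
            *max_avail = n;
        if (!syms[i].is_weak && n > *max_used)
            *max_used = n;
    }
}

/* Step 4: bytes between the end of the last handled entry and the end of
 * the table. Entry 0 (reset) is always kept. */
long deletable_bytes(long max_used, long max_avail, long entry_bytes)
{
    long last_kept = (max_used < 0) ? 0 : max_used;

    if (max_avail < last_kept)
        return 0;
    return (max_avail - last_kept) * entry_bytes;
}
```

For a table where only __vector_1 is non-weak and the highest entry is __vector_124, scan_vectors yields 1 and 124, and deletable_bytes(1, 124, 4) comes to 492 bytes.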

Regards

Senthil

 

blog | website


clawson wrote:
OP is, I believe, one of the Atmel developers and is presumably polling us to ask whether it would be worthwhile for Atmel to expend the effort to add such a feature in a future version. I guess this makes it two yes votes so far.

Yes, I'm an Atmel developer, but this was more of an on-the-side thing than official work :) Anyway, I actually have a rough patch for the binutils linker that does this as part of linker relaxation - I just wanted to know if it was worth doing it properly/submitting it to the community.

Regards

Senthil

 

blog | website


You can save us up to one k of flash by removing stuff no sensible developer would ever need. That's a great idea, please do it and keep up the good work.

To satisfy the other 99%, make it a compiler/linker/whatever option.

Sid

Life... is a state of mind


saaadhu wrote:
I actually have a working version of the linker that does this
That's CHEATING!!!

;-)

saaadhu wrote:

- Walk through all symbols in the .vectors section
- Look for those that start with __vector_ and extract the n part of __vector_n
- Find the max (non-weak) defined n, and the max available n - this corresponds to the last interrupt that has a handler, and the last interrupt for that device.
- Find the addresses for the vector table entries corresponding to the last handled interrupt and the last interrupt, and delete everything between.
I personally wouldn't make this dependent on the *name* of the symbol. Since in the last step you find the addresses of the entries anyway, you can "simply" look up the addresses in the first step and search for the highest address targeting a non-weak symbol.

And I second Sid's opinion - it should be an option.

Nice work, Senthil. You might want to discuss it on the avr-gcc list, too.

JW


ChaunceyGardiner wrote:
You can save us up to one k of flash by removing stuff no sensible developer would ever need.
One could argue that a sensible (or prudent) developer would want a default interrupt handler for the otherwise-unused interrupt vectors as a means of detecting errant code. You can add your own default interrupt handler to replace the one provided (jump to zero) which does whatever is reasonable for your application.

Don Kinzer
ZBasic Microcontrollers
http://www.zbasic.net


dkinzer wrote:
ChaunceyGardiner wrote:
You can save us up to one k of flash by removing stuff no sensible developer would ever need.
One could argue that a sensible (or prudent) developer would want a default interrupt handler for the otherwise-unused interrupt vectors as a means of detecting errant code. You can add your own default interrupt handler to replace the one provided (jump to zero) which does whatever is reasonable for your application.

Having an option to do that would be a good idea - for debugging.

Being forced to waste up to 1 KB of flash on every chip that ends up in a finished product is not a good idea.

Sid

Life... is a state of mind


What is the linker supposed to do with option
--section-start=.isr10=0x14
if there is no section .isr10?
Would it prevent a four-byte section starting at 0x12?
To me, the documentation is not clear on that.

"SCSI is NOT magic. There are *fundamental technical
reasons* why it is necessary to sacrifice a young
goat to your SCSI chain now and then." -- John Woods


wek wrote:
Levenkay wrote:
The ISR() macros would simply emit a one-instruction-long segment, absolutely addressed at the appropriate vector table slot, containing a jump to the supplied routine.
That would be the easy part... but can you please demonstrate how that would work?

I can't quite envisage how the linker would know which of the interrupt vectors are in use. The absolute location of those segments is just a secondary problem then.

Please prove me wrong.

JW

Any module wishing to define an interrupt service routine would use an ISR() macro, pretty much the same as they do now. The macro would:
  1. Declare a void handlerName(void); function prototype, whose name is derived from the given macro argument. The prototype should also be adorned with whatever other arguments mark it as an ISR (needing preservation of SREG, R1, etc).
  2. Emit a specially named (also derived from the ISR() macro argument) segment, consisting of a single jmp-to-handlerName instruction, and adorned with an attribute that gets passed to the linker as an instruction to set the load address of the segment to a specific value (the interrupt vector address, of course).
  3. Output another copy of the handler's signature, so that the brace-enclosed code block that presumably follows the ISR() macro will finish the definition of the handler function.
In my fantasy world, it would be possible to apply various sorts of address constraints to a given segment, the most basic of which would be to demand a specific load address. This feature would be exploited to locate the "jmp handlerName" instruction(s) at the appropriate place(s) in the vector table. So any module that used one or more ISR() macros would set up one or more single-instruction-long segments at fixed places in the ultimate load image. If a given project's modules define only two ISRs, there would be at least three absolutely-addressed one-instruction-long segments: one for the Reset vector, and one for each of the two ISRs' jump vectors. For example, if the target machine was an ATmega8 and the only interrupts defined were INT1 (vector 3) and USART_RXC (vector 12), there would be two bytes' worth of available space between the Reset and the INT1 vectors, and 16 bytes between the INT1 and the USART_RXC vectors. Finding something that could usefully fit in the first pair of bytes wouldn't be too likely, but it's quite possible that the linker might find a segment that would fit into the 16-byte gap. Nothing in the hypothetical situation has any further claim on specific load addresses after the USART_RXC vector, so the linker would fit the remaining segments into the remaining flash space, beginning at word address 0x00C.

If other load-address constraints could be associated with specific segments, many other benefits could result, like assigning a "must fit within a 256-byte page, but I don't care WHICH page" attribute to a memory section so that simple 8-bit memory pointer operations would be sufficient to access the segment contents.
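The gap sizes in the ATmega8 example can be checked with a small host-side helper. This is only a sketch of the arithmetic, not linker code: vector_gaps is a made-up name, vectors are numbered from 1 (Reset) as in the example, and each entry is assumed to be a single 2-byte rjmp.

```c
#include <assert.h>
#include <stddef.h>

/* used[] holds 1-based vector numbers in ascending order (vector 1 = Reset).
 * gaps[i] receives the free bytes between used[i] and used[i + 1].
 * Returns the number of gaps written (count - 1). */
size_t vector_gaps(const unsigned *used, size_t count,
                   unsigned entry_bytes, unsigned *gaps)
{
    size_t i;

    assert(count >= 1);
    for (i = 0; i + 1 < count; i++) {
        assert(used[i] < used[i + 1]);
        gaps[i] = (used[i + 1] - used[i] - 1) * entry_bytes;
    }
    return count - 1;
}
```

For used vectors {1, 3, 12} and 2-byte entries this yields gaps of 2 and 16 bytes, matching the figures in the example.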


Levenkay,

I meant, with the existing linker. What you describe implies a substantial modification of the linker.

skeeve wrote:
What is the linker supposed to do with option
--section-start=.isr10=0x14
if there is no section .isr10?

The linker has the notion of "orphan sections", both for sections located from the command line and sections not located at all (the linker "guesses" where to place the latter). Neither is handled ideally (in principle they can't be); moreover, the sections located from the command line may (and in this particular case, they surely will) simply clash with sections located through the linker script. AFAIK, they are located last, so the lowermost addresses in flash would already be taken.

JW


wek wrote:
skeeve wrote:
What is the linker supposed to do with option
--section-start=.isr10=0x14
if there is no section .isr10?

The linker has the notion of "orphan sections", both for sections located from the command line and sections not located at all (the linker "guesses" where to place the latter). Neither is handled ideally (in principle they can't be); moreover, the sections located from the command line may (and in this particular case, they surely will) simply clash with sections located through the linker script. AFAIK, they are located last, so the lowermost addresses in flash would already be taken.
To be an orphan section, a section has to exist.
I'm referring to a "section" mentioned only on the command line.
In my question, there would be no existing section to place,
but that would not prevent the linker from creating a zero-length section that served only to get in the way of other sections.

"SCSI is NOT magic. There are *fundamental technical
reasons* why it is necessary to sacrifice a young
goat to your SCSI chain now and then." -- John Woods


I posted a patch that does this, and there's a discussion going on in the AVR GCC mailing list about the usefulness of this feature. Eric, the maintainer, feels that this is too dangerous an option - unhandled (but enabled) interrupts would now execute random code instead of looping forever, as is now the case.

Please voice your opinion in the discussion, whether you disagree/agree :)

Regards

Senthil

 

blog | website


I think it's a good idea - as long as the user has thoroughly debugged their application. I can certainly see monkeys writing up "awesome AVR tutorials" telling new users to always do this, but I'd argue that a) there are other dangerous things you can do if you don't read the documentation and b) unhandled interrupts executing is a critical bug anyway.

Perhaps we should *not* define __bad_isr_vect when the option is enabled, so that catch-alls in the user application would not compile when it is unavailable.

- Dean :twisted:

Make Atmel Studio better with my free extensions. Open source and feedback welcome!


Quote:
Perhaps we should *not* define __bad_isr_vect when the option is enabled, so that catch-alls in the user application would not compile when it is unavailable.

The program could enable undefined interrupts and it would still compile, because the compiler would not know which bit at which address enables which interrupt.

/Martin.


saaadhu wrote:
Would it be useful if the linker shrank the vector table down to the last "used" vector table entry?

Say, for the following piece of code for an XMEGA

How much flash does your XMEGA have?
How much flash is used by the vector table?

Divide the second value by the first.
If the result is less than 10%, forget it.

It makes no sense to fight for every single byte of flash.
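That rule of thumb is easy to put into numbers. A minimal host-side sketch follows; the function names ivt_permille and worth_shrinking are invented for this example, and the flash sizes mentioned afterwards are illustrative values, not tied to a particular device.

```c
#include <assert.h>

/* Share of flash taken by the vector table, in tenths of a percent,
 * rounded down (integer math to avoid floating point). */
unsigned ivt_permille(unsigned long table_bytes, unsigned long flash_bytes)
{
    assert(flash_bytes > 0 && table_bytes <= flash_bytes);
    return (unsigned)(table_bytes * 1000UL / flash_bytes);
}

/* The 10% threshold from the rule of thumb above. */
int worth_shrinking(unsigned long table_bytes, unsigned long flash_bytes)
{
    return ivt_permille(table_bytes, flash_bytes) >= 100;
}
```

A 500-byte table in an 8192-byte region comes out at 61 (about 6.1%), so the rule says to forget it; in a 4096-byte region it comes out at 122 (about 12.2%), which crosses the threshold.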

Peter


Senthil in the avr-gcc-list thread wrote:
Like the others said, the patch does not change the default behavior for unhandled interrupts - they still branch to __bad_interrupt. The user would have to explicitly use the --shrink-ivt option to change that - at which point, I guess he should be knowing what he's doing.
IMO it makes perfect sense.

One vote for here.

JW


abcminiuser wrote:

Perhaps we should *not* define __bad_isr_vect when the option is enabled, so that catch-alls in the user application would not compile when it is unavailable.

- Dean :twisted:

I didn't get that. Right now, all the __vector_n symbols are weakly defined to __bad_interrupt. If there is no definition of __vector_n (i.e. a function by that name), the linker resolves it to __bad_interrupt, and therefore unhandled interrupts end up in __bad_interrupt.

If the shrink option is turned on, the entries that have jumps to __bad_interrupt wouldn't exist (ideally, but with my patch there could be holes in the table, if say only the first and last interrupts have handlers).

__bad_interrupt jumps to __vector_default, and __vector_default is weakly defined to entry 0 (reset vector). I guess user code overrides __vector_default to customize this behavior, but that merely is an "overriding" definition.

Regards

Senthil

 

blog | website


just a comment from the sideline:
We must remember that it's NOT a vector table but code that runs from that address, so if anything gets optimized, the code of the last ISR (often only one ISR is used when speed is needed) should be placed at its vector address to avoid the (R)JMP.


Quote:

unhandled (but enabled) interrupts would now execute random code instead of looping forever, as is now the case.

That's been true for every Asm program ever written for an AVR - they seem to cope!

I see a huge potential benefit for bootloader writers. For those who might be caught out by enabling interrupts without ISRs (ie idiots!) then they probably won't have the smarts to read a manual either, so won't know how to turn it on ;-)

+1 vote.


sparrow2 wrote:
just a comment from the sideline:
We must remember that it's NOT a vector table but code that runs from that address, so if anything gets optimized, the code of the last ISR (often only one ISR is used when speed is needed) should be placed at its vector address to avoid the (R)JMP.

That's actually harder to do right now without tweaking the linker script. The table resides in the .vectors section, and the interrupt handlers are in .text, along with startup code and a bunch of user code.

For this to work, I'd think that the linker script should group/sort interrupt handlers and place them at the end of .vectors - then the linker can delete the jump for the last handled interrupt as well, and rely on the script to place the correct handler at that position.

Regards

Senthil

 

blog | website


I want this feature, even without a bootloader. For regular programs, that is.

Make it an option you have to enable to use it.

Anybody that enables it without providing handlers for every interrupt they enable deserves what is coming to them.

It could be argued that even that is way too lenient. If you enable interrupts and don't provide handlers for them, you really deserve a crash. If the crash proved expensive enough, maybe you would start doing your job instead of relying on safeguards that impose a cost on every properly designed program.

Sid

Life... is a state of mind


saaadhu wrote:
Please voice your opinion in the discussion, whether you disagree/agree :)
+1
It's a very useful feature!


After a bit of a discussion, the patch was turned down by the maintainer because he felt the cost of unhandled interrupts isn't worth the benefit, so sorry folks (see http://lists.nongnu.org/archive/...).

There were some alternative implementations suggested as well - placing the IVT entry in conditional assembly in the startup code, and including it in your project in source form (with a preprocessor define specifying the max interrupt number), for example.
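The "preprocessor define for the max interrupt number" idea can be illustrated in plain C. This is a host-side sketch only: the real startup code is assembly, and IVT_MAX_VECTOR, handler_t, ivt and ivt_entries are hypothetical names chosen for this example. The table here holds function pointers instead of jump instructions, but the conditional inclusion works the same way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical project-wide setting: highest interrupt number in use. */
#ifndef IVT_MAX_VECTOR
#define IVT_MAX_VECTOR 2
#endif

#if IVT_MAX_VECTOR > 3
#error "this sketch only lists vectors 1 to 3"
#endif

typedef void (*handler_t)(void);

void reset_entry(void) { }
void vector_1(void) { }
void vector_2(void) { }
void vector_3(void) { }

/* Entries above IVT_MAX_VECTOR are never emitted. */
static const handler_t ivt[] = {
    reset_entry,
#if IVT_MAX_VECTOR >= 1
    vector_1,
#endif
#if IVT_MAX_VECTOR >= 2
    vector_2,
#endif
#if IVT_MAX_VECTOR >= 3
    vector_3,
#endif
};

/* Number of entries actually present in the table. */
size_t ivt_entries(void)
{
    assert(ivt[0] == reset_entry);
    return sizeof ivt / sizeof ivt[0];
}
```

With the default of 2, the table has three entries (reset plus two vectors); building with -DIVT_MAX_VECTOR=1 drops it to two.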

Regards

Senthil

 

blog | website


Quote:

he felt the cost of unhandled interrupts isn't worth the benefit,

Why does one person get to decide these things? Shouldn't it be put to a vote (as witnessed in this thread)? The suggestion was that it would not be enabled by default, so it would not affect ignorant users, and grown-ups have been handling limited vector tables (asm programs) for decades.


I wouldn't worry about it, the gains are tiny and the micros with large memories are cheap. Spend time fixing other stuff.


I agree. Just have a look at the known avr-gcc problems!

There is every kind of problem you might want to solve: better optimizations, internal compiler errors, wrong-code bugs, extensions, better diagnostics, fixing test cases, tweaking libgcc implementations, better documentation, ...

And if that's not enough, switch to the bug trackers of AVR-LibC and Binutils :-)

avrfreaks does not support Opera. Profile inactive.


Quote:

I wouldn't worry about it, the gains are tiny and the micros with large memories are cheap.

The XMEGA vector table is half a kilobyte, and the bootloader section is 8KB. Shrinking it there would be a good savings.

- Dean :twisted:

Make Atmel Studio better with my free extensions. Open source and feedback welcome!


I think a better approach, if you need a bigger bootloader (e.g. Ethernet, decryption),
is to place subroutines below the bootloader reset vector.
Only the flash write code (some bytes) must be inside the upper section.
The bootloader must also know its lowest address to prevent overwriting itself.

Then you are almost unlimited.
E.g. on the ATmega2560 you are able to use 128kB application and 128kB bootloader.

Peter


abcminiuser wrote:
The XMEGA vector table is half a kilobyte, and the bootloader section is 8KB. Shrinking it there would be a good savings.
Adjusting startup-code is common practice, at least in the non-AVR world.

Why not make use of this?

avrfreaks does not support Opera. Profile inactive.


This is awkward!!!
Imagine that I have 10 types of hardware in one project, so I'll have 10 variants of the interrupt table.
Then I add one more interrupt, and I'll have to read the documentation again to look up its number and make 10 edits...


1. Helping the user to shrink the IVT via a compiler switch is a solution to what problem? How many users/applications are out there that are *so* tight with code space that this is necessary?

2. A solution already exists that allows users to customize the IVT to their heart's content. Is it through a compiler switch? No. But this is an advanced technique for advanced users. Most AVR applications should not need to have the IVT shrunken down to squeeze the last byte of code space anyway.

3. Why is the compiler team working on a feature that hasn't been requested, when it has been pointed out that there is a list of AVR GCC bugs to be fixed:
known avr-gcc problems.
The compiler team should also be focusing on new device support and perhaps better C++ support, since that is being used more and more via Arduino.

4. Just saying "that would be good savings", is ignoring the fact that "premature optimization is the root of all evil". If you want more code space then help on the back-end of AVR GCC to generate more optimal code. Or help out at the avr-llvm project to make a better compiler for the AVR. Or help with fixing avr-libc bugs.


Nitpicking really, but it was a linker switch (binutils), not a compiler one.

Anyway, the "feature" was written by me mostly in my spare time, so there is no "compiler team" involved. And I'm sure you'd agree that no one gets to tell me what I should work on in my spare time.

I did ask in the forum here if it would be useful before I took the trouble of refining my prototype and submitting the patch, and discussions in the mailing list such as these (http://lists.nongnu.org/archive/...) led me to believe that code size is a very important factor.

Regards

Senthil

 

blog | website


Quote:

1. Helping the user to shrink the IVT via a compiler switch is a solution to what problem? How many users/applications are out there that are *so* tight with code space that this is necessary?

Me, now, with a bootloader that is tight enough to fail when compiled on some versions of AVR-GCC.

- Dean :twisted:

Make Atmel Studio better with my free extensions. Open source and feedback welcome!


@EW

1) bootloaders

You almost always end up using -nostartfiles then putting back code to set SP, clear R1, do the data loops and so on all to remove the IVT. Senthil's work seems great but someone's vetoed it.
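For readers who have not done this, the "data loops" part of that put-back code amounts to the following, shown here as a host-side C sketch. It is an illustration under assumptions, not the avr-libc startup code: init_memory and its parameters are made-up names, and on a real AVR the addresses come from linker-provided symbols while setting SP and clearing r1 need assembly, which plain C cannot express portably.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy initialised variables from their load image to RAM (.data loop)
 * and zero the uninitialised ones (.bss loop). */
void init_memory(unsigned char *data_start, const unsigned char *data_load,
                 size_t data_len, unsigned char *bss_start, size_t bss_len)
{
    assert(data_len == 0 || (data_start != NULL && data_load != NULL));
    assert(bss_len == 0 || bss_start != NULL);
    if (data_len > 0)
        memcpy(data_start, data_load, data_len);
    if (bss_len > 0)
        memset(bss_start, 0, bss_len);
}
```

After the call, the RAM copy holds the initial values from the load image and the .bss region reads as all zeros, which is what main() expects to find.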


IMnot-so-HO, I have a better idea.

Design the cpu with the ability to define the vector number of each source. Then crt0 and the linker can build a minimal vector table. The most I've ever implemented is about 10 ISRs, leaving 400+ bytes of useless flash on the table.

One of the reasons I like working with FPGA based cpus...

DOG is the Anagrammaton


sfnative wrote:
IMnot-so-HO, I have a better idea.

Design the cpu with the ability to define the vector number of each source. Then crt0 and the linker can build a minimal vector table. The most I've ever implemented is about 10 ISRs, leaving 400+ bytes of useless flash on the table.

You realise this is a one-year-old thread, don't you? ;)

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]