Making the linker convert lds/sts instructions to in/out

How useful would it be to have the linker rewrite LDS/STS instructions to IN/OUT (given that the address falls in the I/O addressable range)? According to the datasheet, the IN/OUT instructions are one word (two bytes) shorter and take one fewer cycle to execute.
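
As a rough illustration of what "in the I/O addressable range" means: on classic AVR devices the 64 I/O registers reachable by IN/OUT sit at data-space addresses 0x20 to 0x5F, and the I/O address is the data address minus 0x20. A minimal sketch of that check in C follows. The function and macro names are made up for illustration, and the 0x20 offset is an assumption that holds for classic AVR cores but not for XMEGA-style devices, where the offset is 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: classic AVR memory map, where I/O address 0x00 is
   data-space address 0x20. XMEGA-style cores use an offset of 0. */
#define SFR_OFFSET   0x20u
#define IO_REG_COUNT 64u    /* IN/OUT encode a 6-bit I/O address */

/* True if an LDS/STS to this data-space address could be an IN/OUT. */
bool in_out_reachable(uint16_t data_addr)
{
    return data_addr >= SFR_OFFSET
        && data_addr < SFR_OFFSET + IO_REG_COUNT;
}

/* Convert a reachable data-space address to its 6-bit I/O address. */
uint8_t io_addr_from_data_addr(uint16_t data_addr)
{
    assert(in_out_reachable(data_addr));
    return (uint8_t)(data_addr - SFR_OFFSET);
}
```

For example, SREG at data address 0x5F maps to I/O address 0x3F, while anything at 0x60 or above still needs LDS/STS.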

The compiler already does this for addresses known to be in the IN/OUT range at compile time - it falls back to lds/sts otherwise (I'm excluding things LTO can do). The final addresses will obviously be known at link time, so if this is done, the linker can catch the missed cases too.
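
The rewrite itself amounts to swapping one opcode for another once the final address is known. The following is only a sketch of that idea, not how binutils actually implements it. It uses the opcode layouts from the AVR instruction set manual (LDS/STS are a 16-bit opcode followed by a 16-bit address, IN/OUT are a single 16-bit opcode with a 6-bit I/O address), assumes the classic 0x20 offset between I/O and data addresses, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: classic AVR map, I/O address = data address - 0x20. */
#define SFR_OFFSET 0x20u

/*
 * Try to turn the first word of an LDS/STS into an IN/OUT.
 *   LDS Rd, k : 1001 000d dddd 0000  + 16-bit k
 *   STS k, Rr : 1001 001r rrrr 0000  + 16-bit k
 *   IN  Rd, A : 1011 0AAd dddd AAAA
 *   OUT A, Rr : 1011 1AAr rrrr AAAA
 * Returns true and stores the one-word replacement in *out when the
 * instruction is an LDS/STS whose address is IN/OUT reachable.
 */
bool lds_sts_to_in_out(uint16_t opcode, uint16_t data_addr, uint16_t *out)
{
    assert(out != NULL);

    bool is_lds = (opcode & 0xFE0Fu) == 0x9000u;
    bool is_sts = (opcode & 0xFE0Fu) == 0x9200u;

    if (!is_lds && !is_sts)
        return false;
    if (data_addr < SFR_OFFSET || data_addr >= SFR_OFFSET + 64u)
        return false;

    uint16_t io   = (uint16_t)(data_addr - SFR_OFFSET); /* 6-bit A */
    uint16_t reg  = opcode & 0x01F0u;   /* register field keeps its position */
    uint16_t base = is_lds ? 0xB000u : 0xB800u;         /* IN or OUT */

    /* A5:A4 go to bits 10:9, A3:A0 go to bits 3:0. */
    *out = (uint16_t)(base | ((io & 0x30u) << 5) | reg | (io & 0x0Fu));
    return true;
}
```

With these encodings, lds r24, 0x005F (0x9180 0x005F) becomes in r24, 0x3f (0xB78F). A real linker would also have to drop the now-unused address word and fix up any branches or offsets that span it, which is the job of linker relaxation.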

I'm just curious if it will be useful though - how common is having I/O code that doesn't really know the actual addresses until link time? In most cases, the device header is included when compiling, so the compiler can see the actual addresses. Library code will benefit though.

What do you guys think?

Regards

Senthil

 

Quote:

(I'm excluding things LTO can do)

Why? Isn't this the kind of thing that Link Time Optimisation should be doing anyway?
Quote:

In most cases, the device header is included when compiling, so the compiler can see the actual addresses.

Indeed - the only things in the IN/OUT range are going to be SFRs, and assuming they are used symbolically, the compiler is bound to emit IN/OUT (with optimisation) anyway. I suppose there are those people who pass around PORTB addresses for "flexibility", but they are basically using the "wrong solution" in the first place, so I'm not sure there's a huge demand for it.

LTO cannot see inside library archives unless the libraries themselves were built with LTO turned on (and a linker plugin and a compatible linker are available). LTO also cannot access symbols defined on the command line (with --defsym).

I was thinking people would have lots of libraries (like avr-libc, maybe one per device arch) which they would then link against a specific device. It doesn't seem a lot of work to get the linker to do this though, so I'll give it a shot and try benchmarking some ASF code to see if it helps.

Regards

Senthil

 

Quote:

I was thinking people would have lots of libraries (like avr-libc, maybe one per device arch)

Nope, almost no one in the AVR world ever uses .a files (the prebuilt .a's in AVR-LibC are the exception rather than the rule). The reason is that if you have a UART or LCD library or whatever, you would need either lots of different prebuilt .a files, one for each SFR layout, or a runtime check of the architecture followed by adaptation of the SFR addresses. Otherwise you do what almost everyone working with AVRs does and provide the "library" in source form, as .c and .h files, so it can do stuff like:

void uart_putchar(uint8_t data) {
#ifdef __AVR_ATmega16__
  UDR = data;
#elif defined(__AVR_ATmega168__)
  UDR0 = data;
#elif ...
#endif
}

To do this as a .a you would have to build that code for mega16 and mega168 separately and then ensure the right one gets linked depending on the target, or you would have to do something like:

void uart_putchar(uint8_t data) {
  if (read_signature() == atmega16) {
    *(volatile uint8_t *)mega16_UDR_address = data;
  }
  else if (read_signature() == atmega168) {
    *(volatile uint8_t *)mega168_UDR_address = data;
  }
  ...
}

or something equally horrendous!

Ah, OK. I was thinking there'd be code like

extern volatile uint8_t uart_data_port;
void uart_putchar(uint8_t data) {
   ...
   uart_data_port = data;
}

in a library (say uartlib.a), and then in the app, do something like (or its equivalent)

avr-gcc  uartlib.a -Wl,--defsym=uart_data_port=

But I guess things are not done this way. Thanks for the feedback.

Regards

Senthil

 

IMO, such an optimization does not make much sense:

1) Applications / libraries are not written in that way.

2) Even when you try to factor out hardware in that way, the approach is quite limited. Suppose, for example, that you also want to factor out bit positions and specify them at link time. Then you get odd code that must use a shift loop:

#include <stdint.h>

uint8_t set_bit (uint8_t val)
{
    extern char sym;
    uint8_t bit = (uint8_t)(uintptr_t) &sym;
    
    return val | (1 << bit);
}

The val | (1 << bit) pattern is a common idiom for setting / clearing SFR bits.
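
The shift loop comes from the AVR having no instruction that shifts by a variable amount, so when the bit number is only known at link time the compiler cannot fold 1 << bit into a constant. A rough C equivalent of what the generated code has to do, next to the compile-time case, might look like this (the function names and the example bit number are purely illustrative):

```c
#include <stdint.h>

/* Bit number known at compile time: the mask folds to a constant,
   so setting the bit is a single OR with an immediate. */
#define EXAMPLE_BIT 3  /* hypothetical bit position */

uint8_t set_bit_const(uint8_t val)
{
    return (uint8_t)(val | (1u << EXAMPLE_BIT));
}

/* Bit number only known later (e.g. taken from a symbol's address):
   the mask has to be built one shift at a time. */
uint8_t set_bit_variable(uint8_t val, uint8_t bit)
{
    uint8_t mask = 1;
    while (bit--)          /* one single-bit shift per iteration */
        mask = (uint8_t)(mask << 1);
    return (uint8_t)(val | mask);
}
```

Both functions return the same value for the same bit number; the difference is only in how much work is left for run time.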

In order to factor out hardware, there are better and more general approaches.

avrfreaks does not support Opera. Profile inactive.

Hmm ok. There's one other benefit - assembly programmers needn't remember to write in/out; the linker would do it for them. But I guess if someone is programming in assembly, it is safe to assume they won't miss that?

Regards

Senthil

 

Sounds like you are desperately looking for something to optimize ;-)

As Georg-Johan says elsewhere :

https://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&p=115604...

there are surely more important things to be done by those who are lucky enough to understand how the toolchain works internally?

clawson wrote:
As Georg-Johan says elsewhere :

https://www.avrfreaks.net/index.p...

there are surely more important things to be done by those who are lucky enough to understand how the toolchain works internally?

:) Register allocation related code is darn difficult to understand - I gotta admit I don't know enough to confidently debug and fix reload related problems. Another factor is that a decent chunk of that code is going away, thanks to LRA (http://gcc.gnu.org/wiki/cauldron...).

I've been messing around with LRA recently to make it work for the AVR target. Hopefully that'll fix these problems, or at least make them easier to debug :)

Regards

Senthil

 

SprinterSB wrote:
Sounds like you are desperately looking for something to optimize ;-)

:D I got this idea from an internal discussion, and the implementation itself was very straightforward. Not desperate, just trying to see if it is useful for someone :)

Regards

Senthil

 
