static inline does not inline, but just inline does in AS7


I've been writing my first emulator code, which has been a learning adventure.  I have an execute function that uses a switch with a case for each opcode.  Then each opcode calls an inline static function which may call one or more inline static functions.  With the other compilers I've tried, all of this boils down to code that runs directly in the switch, so no time is wasted calling and returning from functions, at the expense of program size.  When I moved this over to AS7 in an atxmega384c3 project, it does not inline the code.  If I put "#define static" at the top to remove all the static keywords leaving only inline, then it does inline them all.  Any ideas why?


Firstly, what language are you using: C or C++?

 

Secondly, an example of your code might be useful.

Dessine-moi un mouton


__attribute__((always_inline)) perhaps?

 

Remember that "inline" on its own is simply a "hint" to the compiler, saying "it could be more efficient if you inline this to drop the CALL/RET overhead and ABI setup".


With C99 & C11, inline/extern inline controls where the function symbol gets emitted.  Whether it actually gets inlined or not is dependent on the compiler flags.

https://gustedt.wordpress.com/20...

 

For C++11, the function gets emitted in every translation unit as weak.

 

Here's a detailed comparison of inline in both languages.

http://gudok.xyz/inline/

 

In order to get deterministic results in both C and C++ I recently started doing the following:

// make these functions work like type-safe macros
#define MUST_INLINE __attribute__((always_inline)) inline
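
As a rough sketch of what "type-safe macro" can mean here, the following compares a classic function-like macro with a MUST_INLINE function. The names MAX_MACRO, max_u8, next_value and call_count are made up for illustration, and the attribute spelling assumes GCC or Clang.

```c
#include <stdint.h>

/* The MUST_INLINE idea from the post, using the documented
   __attribute__ spelling (assumes GCC or Clang). */
#define MUST_INLINE __attribute__((always_inline)) inline

/* Classic macro: no type checking, and an argument may be evaluated twice. */
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper: arguments are converted to uint8_t and
   evaluated exactly once, like any ordinary function call. */
static MUST_INLINE uint8_t max_u8(uint8_t a, uint8_t b)
{
    return (a > b) ? a : b;
}

/* Hypothetical side-effecting function, used to count evaluations. */
static int call_count = 0;

static uint8_t next_value(void)
{
    call_count++;
    return 7;
}
```

With these definitions, MAX_MACRO(next_value(), 1) calls next_value() twice (once for the comparison, once for the result), while max_u8(next_value(), 1) calls it once.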

For anything that could compile to more than a few lines of code, I haven't found a satisfactory technique that works for both C and C++.  Instead, when I want control over performance and code size, I often write the code in assembler along with a small inline wrapper so it can be safely called from C and C++.

 

In case it is not clear, for most modern compilers including gcc, the "inline" keyword on its own guarantees nothing about actually including the code inline where it is called.  That is decided by the optimization flags, whether or not the inline keyword is specified.  Inlining can even happen between different .o files when using link-time optimization.

 

I have no special talents.  I am only passionately curious. - Albert Einstein

 


It is C.  The functions are only in the single module, not in any header files.


alank2 wrote:
Then each opcode calls an inline static function which may call one or more inline static functions

Perhaps that's the reason. There's little point inlining a top-level function without also inlining the lower-level functions, and the call/return overhead may be insignificant compared to the otherwise inlined code.

 

Also I would have expected static declarations to have improved things because then the compiler need not follow the ABI.

 


alank2 wrote:
When I moved this over to AS7 in an atxmega384c3 project, it does not inline the code.  If I put "#define static" at the top to remove all the static keywords leaving only inline, then it does inline them all.  Any ideas why?

In a word, no.

Assuming the same optimisation settings, I don't know why it would choose to inline in one case and not in the other.

 

In the non-static case, if it didn't inline, you would then get a link error unless you provide an external definition somewhere, which you do (assuming C99 onwards) by adding, in one .c file only:

extern inline void my_function(void);

(Or whatever the signature of the function is.)

An external (i.e. not inlined) definition for the function will then be emitted in the object file generated from that .c file.

But since it has inlined in the non static case, you haven't needed that external definition.
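
To make the C99 rule above concrete, here is a minimal single-file sketch. The function add_one and the pointer add_one_ptr are hypothetical names; in a real project the inline definition would normally sit in a header and the extern inline line in exactly one .c file. It assumes the compiler is in C99 (or later) mode, not the older gnu89 inline semantics.

```c
/* Normally this inline definition would live in a shared header. */
inline int add_one(int x)
{
    return x + 1;
}

/* In exactly one .c file: this declaration asks the compiler to emit
   the external (callable, non-inlined) definition of add_one here. */
extern inline int add_one(int x);

/* Taking the address needs a real symbol. Without the extern inline
   line above, this would typically fail at link time. */
int (*const add_one_ptr)(int) = add_one;
```

Calls such as add_one(41) may still be inlined wherever the optimizer decides to; the extern inline line only guarantees that a real symbol exists for the cases where it does not.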


N.Winterbottom wrote:
Perhaps that's the reason. There's little point inlining a top-level function without also inlining the lower-level functions, and the call/return overhead may be insignificant compared to the otherwise inlined code.

 

From the first inlined function that is called, it only calls other inlined functions.  They are created in such a way as to prevent duplicating logic in more than one function.  Here is an example:

 

Here are some opcodes calling the initial inline function:

 

            //single register instructions
            case 0x04: inr(&CPUREGS_B); break;
            case 0x0C: inr(&CPUREGS_C); break;
            case 0x14: inr(&CPUREGS_D); break;
            case 0x1C: inr(&CPUREGS_E); break;
            case 0x24: inr(&CPUREGS_H); break;
            case 0x2C: inr(&CPUREGS_L); break;

Here are the inline functions:


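// Leaf helpers come first so each function is defined before it is used.
static inline void add_cycles(uint8_t ACycles)
  {
  #if (CPU_CYCLE_COUNTER>0)
    cpu.cycles+=ACycles;
  #endif
  }

// Update the zero, sign and parity flags from a result byte.
static inline void zero_sign_parity_flags(uint8_t AValue)
  {
    CPUFLAG_ZERO=(uint8_t)(AValue==0);
    CPUFLAG_SIGN=(uint8_t)(AValue & _BV(7));
    CPUFLAG_PARITY=p_table[AValue];
  }

// Increment a value and update the flags that depend on the result.
static inline uint8_t inrx(uint8_t AValue)
  {
    AValue++;
    CPUFLAG_AUXCARRY=(uint8_t)((AValue & 0xf)==0);
    zero_sign_parity_flags(AValue);
    return AValue;
  }

// INR: increment a register in place, charging 5 cycles.
static inline void inr(uint8_t *AValue)
  {
    add_cycles(5);
    *AValue=inrx(*AValue);
  }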

If the compiler inlines these, it does an amazing job of optimizing them down into a minimal set of unwound instructions.  Even an old DOS compiler from the 90's does!
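
For reference, a cut-down and self-contained sketch of the same chain with the always_inline attribute suggested earlier in the thread might look like the following. It is only an illustration: reg_b, the flag_* variables, cycles, zero_sign_flags and the ALWAYS_INLINE macro are stand-ins invented here (the real CPUREGS_x / CPUFLAG_x definitions were not posted), the parity table is left out, and the attribute assumes GCC or Clang.

```c
#include <stdint.h>

/* Assumption: GCC or Clang, where always_inline requests inlining
   even when the optimizer's own heuristics would decline. */
#define ALWAYS_INLINE static inline __attribute__((always_inline))

/* Stand-ins for the emulator state (hypothetical names). */
static uint8_t reg_b;
static uint8_t flag_zero, flag_sign, flag_auxcarry;
static uint32_t cycles;

ALWAYS_INLINE void add_cycles(uint8_t n)
{
    cycles += n;
}

ALWAYS_INLINE void zero_sign_flags(uint8_t v)
{
    flag_zero = (uint8_t)(v == 0);
    flag_sign = (uint8_t)(v & 0x80);
}

ALWAYS_INLINE uint8_t inrx(uint8_t v)
{
    v++;
    flag_auxcarry = (uint8_t)((v & 0x0f) == 0);
    zero_sign_flags(v);
    return v;
}

ALWAYS_INLINE void inr(uint8_t *reg)
{
    add_cycles(5);
    *reg = inrx(*reg);
}

/* Cut-down dispatcher: with the attribute, the whole inr() chain
   should be expanded inside the case rather than called. */
void execute(uint8_t opcode)
{
    switch (opcode) {
    case 0x04: inr(&reg_b); break;
    default: break;
    }
}
```

Whether the calls really disappeared is best confirmed by looking at the disassembly of execute() in the build output rather than by trusting the keywords.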

 

N.Winterbottom wrote:
Also I would have expected static declarations to have improved things because then the compiler need not follow the ABI.

 

I'm not sure what you mean by ABI.  I would have expected static to improve things because the compiler knows it doesn't have to make them externally available.


ABI = Application Binary Interface.

Those are the rules that say which registers must be used to pass values into functions and which will be used to return results.

Anyway, as above, use the always_inline attribute with GCC (it's easy to do a preprocessor check to see whether the compiler is GCC).


Thanks clawson - is __GNUC__ what to check for?


Yup. BTW this page looks to have lots of useful info in case you need to do something similar for other compilers...

https://blog.kowalczyk.info/article/j/guide-to-predefined-macros-in-c-compilers-gcc-clang-msvc-etc..html
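
A minimal sketch of the __GNUC__ check discussed above could look like this. FORCE_INLINE and low_nibble are hypothetical names chosen for the example, and the fallback branch simply degrades to a plain hint instead of guessing at another compiler's syntax.

```c
#include <stdint.h>

#if defined(__GNUC__)
  /* GCC defines __GNUC__ (Clang does too) and accepts this attribute. */
  #define FORCE_INLINE static inline __attribute__((always_inline))
#else
  /* Unknown compiler: fall back to an ordinary inline hint. */
  #define FORCE_INLINE static inline
#endif

/* Hypothetical helper, only here to show the macro in use. */
FORCE_INLINE uint8_t low_nibble(uint8_t v)
{
    return (uint8_t)(v & 0x0f);
}
```

Keeping the compiler test in one macro means the emulator source itself stays free of #if blocks, and the old DOS compiler mentioned earlier would still see a plain static inline.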


I'm no pro... but here are some ideas:

 

* try constexpr functions

* try tweaking the optimization level

* try declaring your static inlined class methods in the header file

* try always_inline

* try passing method params as template constants

* if all else fails, #define foo() as a macro.

#pragma GCC optimize("-Os")

#pragma GCC optimize("-O2")

static inline void __attribute__((always_inline)) foo(void) {...}

constexpr uint8_t small(constexprParams) { return constexprExpression; }
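
Since the original question is about C, where constexpr functions and templates are not available, the last-resort macro idea from the list might be sketched like this. ADD_CYCLES, cycle_count and step are invented for the example and are not from the posted emulator.

```c
#include <stdint.h>

/* Hypothetical stand-in for the emulator's cycle counter. */
static uint32_t cycle_count;

/* Function-like macro: always expanded at the call site, but the
   argument is not type-checked. The do { } while (0) wrapper makes it
   behave like a single statement, so it is safe after if/else. */
#define ADD_CYCLES(n) do { cycle_count += (uint8_t)(n); } while (0)

/* Hypothetical caller showing the macro used in both branches. */
void step(int taken)
{
    if (taken)
        ADD_CYCLES(10);
    else
        ADD_CYCLES(5);
}
```

The trade-off is the usual one for macros: expansion is guaranteed, but arguments with side effects and mismatched types are no longer caught by the compiler.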

 

 

 

 

My GitHub (FAB_LED bitbang for ws2812, DigitalIO, SerialMenu...).
I make portable, code optimized Arduino libraries as a hobby and use AtTiny85 the most for my builds.
I am a burner and an OS and systems programmer by trade.

Last Edited: Wed. May 13, 2020 - 09:40 AM