Inline vs Macro with -Os


My personal preference is to use inline functions instead of macros for small, commonly used functions. There is a significant difference between the two at GCC optimization level -Os, though: once a function of sufficient length is called from more than one place, GCC emits it out of line and uses RCALL instead of inlining it. For example, the following code calls a static inline function twice:

#include <avr/io.h>
#include <avr/interrupt.h>

static inline void SetPB0High(void)
{
    if (PINB & (1 << PINB0))
        DDRB &= ~ (1 << PINB0);
    PORTB |= (1 << PINB0);
}

int main(void)
{
    while (1)
    {
        SetPB0High();
    }
    return 0;
}

ISR(TIMER0_OVF_vect)
{
    SetPB0High();
}

Which generates the following .lss output:

00000030 <SetPB0High>:
  30:	b0 99       	sbic	0x16, 0	; 22
  32:	b8 98       	cbi	0x17, 0	; 23
  34:	c0 9a       	sbi	0x18, 0	; 24
  36:	08 95       	ret

00000038 <main>:
  38:	fb df       	rcall	.-10     	; 0x30 <SetPB0High>
  3a:	fe cf       	rjmp	.-4      	; 0x38 <main>

0000003c <__vector_5>:
  3c:	1f 92       	push	r1
  3e:	0f 92       	push	r0
  40:	0f b6       	in	r0, 0x3f	; 63
  42:	0f 92       	push	r0
  44:	11 24       	eor	r1, r1
  46:	2f 93       	push	r18
  48:	3f 93       	push	r19
  4a:	4f 93       	push	r20
  4c:	5f 93       	push	r21
  4e:	6f 93       	push	r22
  50:	7f 93       	push	r23
  52:	8f 93       	push	r24
  54:	9f 93       	push	r25
  56:	af 93       	push	r26
  58:	bf 93       	push	r27
  5a:	ef 93       	push	r30
  5c:	ff 93       	push	r31
  5e:	e8 df       	rcall	.-48     	; 0x30 <SetPB0High>
  60:	ff 91       	pop	r31
  62:	ef 91       	pop	r30
  64:	bf 91       	pop	r27
  66:	af 91       	pop	r26
  68:	9f 91       	pop	r25
  6a:	8f 91       	pop	r24
  6c:	7f 91       	pop	r23
  6e:	6f 91       	pop	r22
  70:	5f 91       	pop	r21
  72:	4f 91       	pop	r20
  74:	3f 91       	pop	r19
  76:	2f 91       	pop	r18
  78:	0f 90       	pop	r0
  7a:	0f be       	out	0x3f, r0	; 63
  7c:	0f 90       	pop	r0
  7e:	1f 90       	pop	r1
  80:	18 95       	reti

I can understand this choice, but I would like to be able to control this behavior where ISRs are concerned. It seems I have only two options:

  1. Use macros for any such functions called from ISRs
  2. Manually modify the ISR code
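
For reference, option 1 might look something like the sketch below. The macro name SET_PB0_HIGH and the helper timer_tick are made up for illustration, and the #else branch only provides stand-in registers so the fragment also builds off-target; on an AVR the real definitions come from <avr/io.h>.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#else
/* Host stand-ins (assumption) so the sketch builds off-target. */
static volatile uint8_t PINB, DDRB, PORTB;
#define PINB0 0
#endif

/* Macro version of SetPB0High(); the body is pasted in at every use. */
#define SET_PB0_HIGH()                  \
    do {                                \
        if (PINB & (1 << PINB0))        \
            DDRB &= ~(1 << PINB0);      \
        PORTB |= (1 << PINB0);          \
    } while (0)

/* Hypothetical helper standing in for the ISR body. */
static void timer_tick(void)
{
    SET_PB0_HIGH();
}
```

The do { } while (0) wrapper keeps the macro safe inside an unbraced if/else. Because it is a macro, the compiler never gets the chance to turn it into a call, whatever -Os decides.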

 

Is there another way to avoid the RCALL and the associated PUSH/POP overhead in the ISR for inline functions at -Os?

 

This topic has a solution.
Last Edited: Tue. Oct 27, 2015 - 09:32 PM
This reply has been marked as the solution. 

I guess my first thought/question is how many of these fragments do you really have, and how many in ISRs?  If the above is painful to you, then just do the coding in the ISR to make it skinnier.

 

I realize this may be only an example, but whenever I bit-bang push-pull it is with an external pull-up (as in I2C), and then I never fuss with PORT.  So it is just a single SBI/CBI.

 

[I'm not a GCC guru...] Do you get the same results when you exercise always_inline ?

https://gcc.gnu.org/onlinedocs/g...

https://gcc.gnu.org/ml/gcc-help/...

https://www.avrfreaks.net/forum/h...

 

oecben wrote:
It seems I have only two options

<devil's grin> You didn't mention selecting a different toolchain as an option.

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

#include <avr/io.h>
#include <avr/interrupt.h>

static void SetPB0High(void) __attribute__ ((always_inline));
static void SetPB0High(void)
{
	if (PINB & (1 << PINB0))
		DDRB &= ~ (1 << PINB0);
	PORTB |= (1 << PINB0);
}

int main(void)
{
	while (1)
	{
		SetPB0High();
	}
	return 0;
}

ISR(TIMER0_OVF_vect)
{
	SetPB0High();
}

00000038 <main>:
#include <avr/interrupt.h>

static void SetPB0High(void) __attribute__ ((always_inline));
static void SetPB0High(void)
{
	if (PINB & (1 << PINB0))
  38:	b0 99       	sbic	0x16, 0	; 22
	DDRB &= ~ (1 << PINB0);
  3a:	b8 98       	cbi	0x17, 0	; 23
	PORTB |= (1 << PINB0);
  3c:	c0 9a       	sbi	0x18, 0	; 24
  3e:	fc cf       	rjmp	.-8      	; 0x38 <main>

00000040 <__vector_9>:
	}
	return 0;
}

ISR(TIMER0_OVF_vect)
{
  40:	1f 92       	push	r1
  42:	0f 92       	push	r0
  44:	0f b6       	in	r0, 0x3f	; 63
  46:	0f 92       	push	r0
  48:	11 24       	eor	r1, r1
#include <avr/interrupt.h>

static void SetPB0High(void) __attribute__ ((always_inline));
static void SetPB0High(void)
{
	if (PINB & (1 << PINB0))
  4a:	b0 99       	sbic	0x16, 0	; 22
	DDRB &= ~ (1 << PINB0);
  4c:	b8 98       	cbi	0x17, 0	; 23
	PORTB |= (1 << PINB0);
  4e:	c0 9a       	sbi	0x18, 0	; 24
}

ISR(TIMER0_OVF_vect)
{
	SetPB0High();
}
  50:	0f 90       	pop	r0
  52:	0f be       	out	0x3f, r0	; 63
  54:	0f 90       	pop	r0
  56:	1f 90       	pop	r1
  58:	18 95       	reti

0000005a <_exit>:
  5a:	f8 94       	cli

0000005c <__stop_program>:
  5c:	ff cf       	rjmp	.-2      	; 0x5c <__stop_program>

Beyond that, if you really know that only those three instructions are used, you could try going naked.  Or find some other method to make a skinnier ISR.
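
A minimal sketch of the naked variant, assuming avr-libc's ISR_NAKED and reti() from <avr/interrupt.h>. With a naked ISR the compiler emits no prologue or epilogue at all, so this is only safe if the body really compiles to instructions that leave SREG and every register alone (SBIC/CBI/SBI here, which needs the port registers to be in the low I/O range). Check the .lss after every compiler or flag change. The #else branch is just host stand-ins so the fragment can be built off-target.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#include <avr/interrupt.h>
#else
/* Host stand-ins (assumption) so the sketch builds off-target. */
static volatile uint8_t PINB, DDRB, PORTB;
#define PINB0 0
#define ISR_NAKED
#define ISR(vector, ...) void vector(void)
#define reti() ((void)0)
#endif

/* Naked ISR: no register or SREG save/restore is generated,
 * so the explicit reti() at the end is mandatory on the AVR. */
ISR(TIMER0_OVF_vect, ISR_NAKED)
{
    if (PINB & (1 << PINB0))
        DDRB &= ~(1 << PINB0);
    PORTB |= (1 << PINB0);
    reti();
}
```

If the body ever grows to something that needs a scratch register or touches the flags, the naked version silently corrupts the interrupted code, so treat it as a last resort.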

 

[hmmm--if you indeed have this construct in mainline and ISR, don't you have a race condition if the ISR fires in the middle of the mainline sequence?  those are a bugger to find.]

 

 

 

 


Had to try it in CodeVision after I smarted off above.  Luckily for me...

 

#include <io.h>

static inline void SetPB0High(void)
{
    if (PINB & (1 << PINB0))
        DDRB &= ~ (1 << PINB0);
    PORTB |= (1 << PINB0);
}

void main(void)
{
    while (1)
    {
        SetPB0High();
    }
}

interrupt [TIM0_OVF] void timer0_ovf_isr(void)
{
    SetPB0High();
}

 

                ;static inline void SetPB0High(void)
                 ;0000 0004 {
                 ;0000 0005     if (PINB & (1 << PINB0))
                 ;0000 0006     DDRB &= ~ (1 << PINB0);
                 ;0000 0007     PORTB |= (1 << PINB0);
                 ;0000 0008 }
                 ;
                 ;void main(void)
                 ;0000 000B {
                 
                 	.CSEG
                 _main:
                 ;.FSTART _main
                 ;0000 000C     while (1)
                 _0x5:
                 ;0000 000D     {
                 ;0000 000E         SetPB0High();
00005a 9918      	SBIC 0x3,0
00005b 9820      	CBI  0x4,0
00005c 9a28      	SBI  0x5,0
                 ;0000 000F     }
00005d cffc      	RJMP _0x5
                 ;0000 0010 }
                 _0x8:
00005e cfff      	RJMP _0x8
                 ;.FEND
                 ;
                 ;interrupt [TIM0_OVF] void timer0_ovf_isr(void)
                 ;0000 0013 {
                 _timer0_ovf_isr:
                 ;.FSTART _timer0_ovf_isr
                 ;0000 0014 	SetPB0High();
00005f 9918      	SBIC 0x3,0
000060 9820      	CBI  0x4,0
000061 9a28      	SBI  0x5,0
                 ;0000 0015 }
000062 9518      	RETI
                 ;.FEND
                 ;

 

 


theusch wrote:

I guess my first thought/question is how many of these fragments do you really have, and how many in ISRs?  If the above is painful to you, then just do the coding in the ISR to make it skinnier.

Probably not enough to say this isn't an option, but the OCD side of me would have a breakdown.

theusch wrote:

[I'm not a GCC guru...] Do you get the same results when you exercise always_inline ?

https://gcc.gnu.org/onlinedocs/g...

https://gcc.gnu.org/ml/gcc-help/...

https://www.avrfreaks.net/forum/h...

This does the trick. It also creates inner turmoil. One of my pet peeves with GCC is all of its magic attributes. I remember my head almost exploding the first time I wandered through Dean Camera's LUFA with an eye for porting it to another compiler.
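
One way to keep the attribute from leaking all over the code is to hide it behind a single macro. This is only a sketch: FORCE_INLINE is a made-up name, and the non-GCC branch simply falls back to plain static inline, which other compilers are free to treat as a hint. The #else register stand-ins are there only so the fragment builds off-target.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#else
/* Host stand-ins (assumption) so the sketch builds off-target. */
static volatile uint8_t PINB, DDRB, PORTB;
#define PINB0 0
#endif

/* Keep the compiler-specific spelling in one place. */
#if defined(__GNUC__)
#define FORCE_INLINE static inline __attribute__((always_inline))
#else
#define FORCE_INLINE static inline
#endif

FORCE_INLINE void SetPB0High(void)
{
    if (PINB & (1 << PINB0))
        DDRB &= ~(1 << PINB0);
    PORTB |= (1 << PINB0);
}
```

Porting to another compiler then means touching one #if block rather than every function declaration.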

 

theusch wrote:

oecben wrote:
It seems I have only two options

<devil's grin> You didn't mention selecting a different toolchain as an option.

I do use other compilers for some things. The unfortunate truth is that for work that will be shared with "the community" at large, GCC seems to be the lowest common denominator. 

 

 


theusch wrote:

[hmmm--if you indeed have this construct in mainline and ISR, don't you have a race condition if the ISR fires in the middle of the mainline sequence?  those are a bugger to find.]

 

This was just a bogus example.


Duplicate post...deleted

Last Edited: Tue. Oct 27, 2015 - 09:40 PM

Your general problem is:

Avoid making a call inside an ISR. If you need it skinny, set a flag and do the rest outside.
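
A rough sketch of the flag approach, assuming the work can tolerate the latency of the main loop. The names tick_pending and poll_tick are made up for illustration, and the #else branch is only host stand-ins so the fragment builds off-target.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#include <avr/interrupt.h>
#else
/* Host stand-ins (assumption) so the sketch builds off-target. */
static volatile uint8_t PINB, DDRB, PORTB;
#define PINB0 0
#define ISR(vector) void vector(void)
#endif

/* Set by the ISR, cleared by the main loop. */
static volatile bool tick_pending;

ISR(TIMER0_OVF_vect)
{
    tick_pending = true;    /* nothing else happens in interrupt context */
}

static inline void SetPB0High(void)
{
    if (PINB & (1 << PINB0))
        DDRB &= ~(1 << PINB0);
    PORTB |= (1 << PINB0);
}

/* Hypothetical helper: call this from the while (1) loop in main(). */
static void poll_tick(void)
{
    if (tick_pending) {
        tick_pending = false;
        SetPB0High();
    }
}
```

The ISR still has to save SREG and a scratch register to store the flag, so it is not free, but it no longer contains a call and so avoids the full register save seen in the first listing.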


You might also want to explore the effect of -finline-small-functions

 

Having said that, the user manual:

 

https://gcc.gnu.org/onlinedocs/g...

 

tells us that -finline-small-functions is included in -O2 and then -Os is basically -O2 with some options removed - but not this one.


theusch wrote:
[hmmm--if you indeed have this construct in mainline and ISR, don't you have a race condition if the ISR fires in the middle of the mainline sequence?  those are a bugger to find.]
In the case at hand, I'd say not.

It does not matter whether the ISR or the mainline code clears the DDRB.0 bit or sets the PORTB.0 bit.

The order will also be acceptable.

DDR and PORT registers are volatile only in the sense that their use causes side-effects.

They will not change while the code is not looking.

If the test fails in the ISR or the mainline,

inserting the lonely PORT assignment between the other assignments has the same effect as placing it after.

This is true even if the assignments are done with read-modify-write.

 

That said, it might be nice to avoid the need for the analysis.
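
If you would rather not do the analysis at all, the mainline copy can be wrapped in avr-libc's ATOMIC_BLOCK from <util/atomic.h>, at the cost of a few extra instructions to save SREG, disable interrupts and restore it. A sketch, with SetPB0HighAtomic as a made-up name and host stand-ins in the #else branch so it builds off-target:

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#include <util/atomic.h>
#else
/* Host stand-ins (assumption) so the sketch builds off-target. */
static volatile uint8_t PINB, DDRB, PORTB;
#define PINB0 0
#define ATOMIC_RESTORESTATE
#define ATOMIC_BLOCK(mode)
#endif

/* Mainline version: the whole test-and-set sequence runs with
 * interrupts disabled, then the previous SREG state is restored. */
static inline void SetPB0HighAtomic(void)
{
    ATOMIC_BLOCK(ATOMIC_RESTORESTATE)
    {
        if (PINB & (1 << PINB0))
            DDRB &= ~(1 << PINB0);
        PORTB |= (1 << PINB0);
    }
}
```

The copy inside the ISR does not need this, since interrupts are already disabled there unless the ISR re-enables them itself.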

 

Iluvatar is the better part of Valar.