return value vs pointer

Good day!

I have a function that is called frequently and returns a single value. Which is more efficient with AVR GCC: returning the value with 'return', or passing a pointer to the function and updating the value through that pointer?

Moderator note:

If you have questions specifically about using GCC can you post them in the GCC forum please?

(as to your question - it takes about 2 minutes to build a test program and check the .lss)

Thanks for the quick replies. It seems I post all my topics in the wrong place; thank you for your assistance and for kindly letting me know!

About my question: I'm not so good in assembly, so I want to check my assumptions:

#include <avr/io.h>

int main() {

	return(0);
}

int func1(void)
{
	return 0;
}

void func2(int * a)
{
	*a = 0;
}

.lss output:

int func1(void)
{
	return 0;
}
  3e:	80 e0       	ldi	r24, 0x00	; 0
  40:	90 e0       	ldi	r25, 0x00	; 0
  42:	08 95       	ret

00000044 <func2>:

int func2(int * a)
{
  44:	fc 01       	movw	r30, r24
	*a = 0;
  46:	11 82       	std	Z+1, r1	; 0x01
  48:	10 82       	st	Z, r1
  4a:	08 95       	ret

As far as I understand, assembly instructions (macros aside) are translated directly to machine code. On the AVR most instructions are processed in one clock cycle, so more assembly instructions = slower. So returning the value seems to be faster, but I don't know whether the difference is significant.

Inline functions should be even faster, because there is no call, no stack handling and no return.

Right?

artvolk wrote:
I'm not so good in assembly [...]
It's time to hit the books, then. It is a very, very useful skill to have.
artvolk wrote:
[...]most instructions are processed in one clock cycle, so more assembly instructions = slower.
Generally speaking, the fewer assembly instructions the shorter the execution time. Note, however, that there are quite a few 2-cycle instructions and even some 3, 4, and 5-cycle instructions. Five one-cycle instructions are clearly faster than two four-cycle instructions. You'll have to familiarize yourself with the instruction set (download the document at the Atmel website) to know for sure.
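To make the cycle counting concrete, here is a minimal sketch in portable C that tallies the two listings from the original post. The cycle values in the table are an assumption based on the classic AVR instruction set summary for devices with a 16-bit program counter (ldi and movw 1 cycle, st and std 2, ret 4); check the document for your own part. The names cost_table, cycles_for and total_cycles are hypothetical helpers, not part of any toolchain.

```c
#include <stddef.h>
#include <string.h>

/* Assumed cycle counts (classic AVR core, 16-bit program counter). */
struct insn_cost {
    const char *mnemonic;
    unsigned cycles;
};

static const struct insn_cost cost_table[] = {
    { "ldi", 1 }, { "movw", 1 }, { "st", 2 }, { "std", 2 }, { "ret", 4 },
};

/* Look up one mnemonic; unknown mnemonics count as 0 cycles. */
unsigned cycles_for(const char *mnemonic)
{
    size_t i;
    for (i = 0; i < sizeof cost_table / sizeof cost_table[0]; i++) {
        if (strcmp(cost_table[i].mnemonic, mnemonic) == 0)
            return cost_table[i].cycles;
    }
    return 0;
}

/* Sum the cycles of a straight-line instruction sequence. */
unsigned total_cycles(const char *const *seq, size_t count)
{
    unsigned sum = 0;
    size_t i;
    for (i = 0; i < count; i++)
        sum += cycles_for(seq[i]);
    return sum;
}
```

With these assumed values the body of func1 (ldi, ldi, ret) comes to 6 cycles and the body of func2 (movw, std, st, ret) to 9, before counting anything the caller has to do.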

Don Kinzer
ZBasic Microcontrollers
http://www.zbasic.net

I doubt that you will find the answer using such trivial functions.

Why is it exactly that you are concerned with a few clock cycles?

Regards,
Steve A.

The Board helps those that help themselves.

artvolk wrote:
Good day!

I have a function that is called frequently and returns a single value. Which is more efficient with AVR GCC: returning the value with 'return', or passing a pointer to the function and updating the value through that pointer?

Forget about the minutiae of code generation.
Think about what makes your code easy to write, and easy to read.

(1) result = function(parameters);

(2) function(parameters, &result);

Get the call and its use nice and straightforward. Then write the function to suit.

IMHO, method (1) wins every time. You are passing parameters by value. You are returning a value. You are never corrupting your main code with the operation of the function. You choose what to do with the 'result'.

There are occasions when you want to return more than one result. In which case you can use pointers to structures or pointers to individual variables as a last resort.
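As a rough illustration of the return-value form, the pointer form, and the pointer-to-structure fallback for more than one result, here is a small sketch in portable C. The function and type names (average, average_into, min_max, find_min_max) are made up for the example and do not come from the thread.

```c
#include <stdint.h>

/* Return-value form: parameters in by value, result out by value. */
uint8_t average(uint8_t a, uint8_t b)
{
    return (uint8_t)(((uint16_t)a + b) / 2);
}

/* Pointer form: the same result written through a pointer. */
void average_into(uint8_t a, uint8_t b, uint8_t *result)
{
    *result = (uint8_t)(((uint16_t)a + b) / 2);
}

/* More than one result: fill a structure owned by the caller. */
struct min_max {
    uint8_t min;
    uint8_t max;
};

void find_min_max(uint8_t a, uint8_t b, struct min_max *out)
{
    out->min = (a < b) ? a : b;
    out->max = (a < b) ? b : a;
}
```

At the call site the first form reads as a plain expression, for example `level = average(lo, hi);`, while the pointer form needs a variable declared beforehand just to receive the result.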

Which is more efficient? I do not really care.
Which is easier to read or maintain? No contest.

David.

Generally, any indirect operation (i.e. using pointers) is more costly than a direct one, as it means memory access for two things: loading the pointer's value and then loading/storing the pointed-to value.

Moreover, compilers tend to place return values into registers, whereas pointer access goes into memory.

Some compilers/optimizers may be smart enough to guess your intention, but don't count on that. Also, there is the case where the caller stores the function's result through a pointer anyway; that could perhaps be optimized a bit by passing the pointer to the function, but it is a special case and does not apply generally.
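The special case mentioned above, where the caller was going to store the result through a pointer anyway, could be sketched as follows. This is portable C with hypothetical names (read_level stands in for whatever produces the value); whether the second variant is actually cheaper depends on the compiler and can only be judged from the generated listing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical data source standing in for a port or sensor read. */
uint8_t read_level(uint8_t channel)
{
    return (uint8_t)(channel * 10u);
}

/* Variant A: return by value; the caller performs the store. */
void fill_by_value(uint8_t *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] = read_level((uint8_t)i);
}

/* Variant B: the callee stores through the pointer it is handed. */
void read_level_into(uint8_t channel, uint8_t *dest)
{
    *dest = (uint8_t)(channel * 10u);
}

void fill_by_pointer(uint8_t *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        read_level_into((uint8_t)i, &buf[i]);
}
```

Both variants leave the same data in the buffer; the only difference is which side of the call executes the store.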

JW

Thank you all for the input. MCUs are a new environment for me, so sometimes I doubt trivial things that I have been doing for years in other languages (like this).

I've checked the instruction table in the datasheet; valuable reading...

A more valid test of the mechanism would be something like:

#include <avr/io.h>

uint8_t func1(void) __attribute__((noinline));
uint8_t func1(void) 
{ 
   return PINA; 
} 

void func2(uint8_t * a) __attribute__((noinline));
void func2(uint8_t * a) 
{ 
   *a = PIND; 
}

int main(void) { 
   uint8_t data;

   while(1) {
     PORTB = func1();
     func2(&data);
     PORTC = data;
   }
   return(0); 
}

The implementation of func1() is:

00000066 <func1>:
#include <avr/io.h>

uint8_t func1(void) __attribute__((noinline));
uint8_t func1(void) 
{ 
   return PINA; 
  66:	89 b3       	in	r24, 0x19	; 25
} 
  68:	08 95       	ret

While the implementation of func2() is:

0000006a <func2>:

void func2(uint8_t * a) __attribute__((noinline));
void func2(uint8_t * a) 
{ 
  6a:	fc 01       	movw	r30, r24
   *a = PIND; 
  6c:	80 b3       	in	r24, 0x10	; 16
  6e:	80 83       	st	Z, r24
}
  70:	08 95       	ret

and main() invokes these with:

00000072 <main>:

int main(void) { 
  72:	0f 93       	push	r16
  74:	1f 93       	push	r17
  76:	df 93       	push	r29
  78:	cf 93       	push	r28
  7a:	0f 92       	push	r0
  7c:	cd b7       	in	r28, 0x3d	; 61
  7e:	de b7       	in	r29, 0x3e	; 62
   uint8_t data;
  80:	8e 01       	movw	r16, r28
  82:	0f 5f       	subi	r16, 0xFF	; 255
  84:	1f 4f       	sbci	r17, 0xFF	; 255
     PORTB = func1();
  86:	ef df       	rcall	.-34     	; 0x66
  88:	88 bb       	out	0x18, r24	; 24
     func2(&data);
  8a:	c8 01       	movw	r24, r16
  8c:	ee df       	rcall	.-36     	; 0x6a
     PORTC = data;
  8e:	89 81       	ldd	r24, Y+1	; 0x01
  90:	85 bb       	out	0x15, r24	; 21
  92:	f9 cf       	rjmp	.-14     	; 0x86

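So func1() is far more efficient both in implementation (2 instructions against 4) and in invocation (rcall and out, against movw, rcall, ldd and out, plus the stack frame main() has to set up for data).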

Cliff

Cliff, thank you for taking the time to create this example for me! One quick question:

"__attribute__((noinline))": it seems that even in my example the compiler doesn't inline the functions. Is the attribute just there to keep this example safe?

In your example you never called the functions anyway - I was just showing how they would be invoked and how much more cumbersome (both in C source and generated object) the pointer variant was.

To keep the code clean I didn't want the compiler to simply inline the functions but to make (r)calls so that you could clearly see the setup overhead in each case.
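For reference, a minimal sketch of the two GCC function attributes involved: noinline, as used in the test program, and always_inline, which asks for the opposite. Both are GCC extensions rather than standard C, and the function names here are invented for the illustration. The results are identical either way; the difference only shows up in the generated listing, where the first function is reached by a call and the second should have its body merged into the caller.

```c
#include <stdint.h>

/* GCC extension: keep this as a real call even when optimizing. */
uint8_t twice_called(uint8_t x) __attribute__((noinline));
uint8_t twice_called(uint8_t x)
{
    return (uint8_t)(x * 2u);
}

/* GCC extension: ask for the body to be inlined into each caller. */
static inline uint8_t twice_inlined(uint8_t x) __attribute__((always_inline));
static inline uint8_t twice_inlined(uint8_t x)
{
    return (uint8_t)(x * 2u);
}

/* Caller that uses both, so the two can be compared in the .lss file. */
uint8_t use_both(uint8_t x)
{
    return (uint8_t)(twice_called(x) + twice_inlined(x));
}
```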

Thank you, now I got it!

I will just add this:
If the return value is 32 (or 64) bits wide, it could be better to use a pointer (because the pointer is only 16 bits).

david.prentice wrote:
There are occasions when you want to return more than one result. In which case you can use pointers to structures or pointers to individual variables as a last resort.
One can also return a structure.
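A minimal sketch of returning a structure by value, in portable C with invented names. The struct here is 4 bytes, which is within the eight bytes that, according to a FAQ cited elsewhere in this thread, avr-gcc returns in registers; that is an assumption worth confirming in the listing for your compiler version.

```c
#include <stdint.h>

/* Two results bundled into one small structure. */
struct point {
    int16_t x;
    int16_t y;
};

/* Both values travel back in a single by-value return. */
struct point make_point(int16_t x, int16_t y)
{
    struct point p;
    p.x = x;
    p.y = y;
    return p;
}

/* Structures can be passed in and returned in the same way. */
struct point offset_point(struct point p, int16_t dx, int16_t dy)
{
    p.x = (int16_t)(p.x + dx);
    p.y = (int16_t)(p.y + dy);
    return p;
}
```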

Moderation in all things. -- ancient proverb

If one wanted to risk burning at the stake after being convicted of being a heretic, one could operate on global data directly. And/or pass a parameter being the pointer to the structure being affected, instead of passing it back (and forth).

Even in subroutines/functions I tend to do a lot of operations on global data. Especially on smaller AVR models; e.g. Mega48/88 class.

Quote:

I have a function that is called frequently and returns a single value.

My decision would be based on the nature of that function, and what it operates on. If a single-line (or few-line) function I might be tempted to "always inline" if there is code space, and my primary aim is speed. Might even do a macro.

Also, the lifetime of the returned value may be pertinent. If e.g. the X-value of a coordinate pair that is used and discarded, I might have it operate on a global register variable.

As nearly always, "it depends".
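A small sketch of the global-data approach and the macro alternative, in portable C. The variable g_x and the names step_x and STEP_X are hypothetical. A GCC global register variable, as mentioned above, is target-specific and is not shown here.

```c
#include <stdint.h>

/* Hypothetical X coordinate shared by the whole program. */
uint8_t g_x;

/* No parameter and no return value: the function updates the global. */
void step_x(void)
{
    g_x = (uint8_t)(g_x + 1u);
}

/* Macro alternative for a one-line operation; no call is emitted. */
#define STEP_X() (g_x = (uint8_t)(g_x + 1u))
```

Neither form passes or returns anything, so the cost of moving the value across the call disappears, at the price of every function being able to modify g_x.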

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Maybe in the application note "AVR035: Efficient C Coding for AVR" you can find some answers:
http://www.atmel.com/dyn/resourc...

Quote:

Maybe in the application note "AVR035: Efficient C Coding for AVR" you can find some answers

Except that it is written to describe the code generation model in IAR V2, not GCC so what's written there does not necessarily apply to avr-gcc, subject of this forum/thread.

If you want to return a pointer, it must not point to an automatic variable. On the other hand, returning large amounts of data through the stack..

It depends on what "efficient" means to you. If you have plenty of flash and are short of SRAM, use push/pop to reuse the stack. If you have plenty of SRAM, use statics and pointers.

And of course AVR GCC uses 16-bit pointers, so every time you return something shorter - return it by value.
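The point about not returning a pointer to an automatic variable, and the two usual ways around it, can be sketched like this in portable C. The function names and the buffer size are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Wrong (shown only as a comment): the array lives on the stack and
   is gone once the function returns, so the pointer dangles.

   uint8_t *bad(void) { uint8_t tmp[4]; return tmp; }
*/

/* Option 1: static storage outlives the call, at the cost of SRAM
   that stays reserved for the whole program. */
uint8_t *fill_static(uint8_t value)
{
    static uint8_t buf[4];
    size_t i;
    for (i = 0; i < sizeof buf; i++)
        buf[i] = value;
    return buf;
}

/* Option 2: the caller owns the buffer and passes a pointer in. */
void fill_caller(uint8_t *buf, size_t len, uint8_t value)
{
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] = value;
}
```

Option 2 lets the caller put the buffer on its own stack and reuse that space afterwards, which matches the "short of SRAM" case; option 1 matches the "plenty of SRAM" case.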

No RSTDISBL, no fun!

Or simply follow Lee's advice and share globals. These are 1K/2K/4K controller apps written by one author we discuss here (in the main) so the usual worries about data hiding don't really need to apply.

I guess the downside of globals is that if the functions were destined to be inlined the compiler loses the possibility of simply caching variables into machine registers.

According to a FAQ, avr-gcc will return up to eight bytes in registers.
C allows returning arbitrarily large structs, but I'm not sure about avr-gcc.
I'd expect register return to usually be more efficient.
An obvious exception can occur when the goal is to put data into SRAM.
If the target's address is not known by link-time,
the pointer will have to be generated at run time.
It might as well be passed to the function,
which might be able to get by with one "return" register instead of eight.
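As a sketch of that last idea, here is an 8-byte result produced both ways in portable C: returned by value, and written to a destination supplied by the caller with only a one-byte status coming back. The names make_stamp and make_stamp_into are hypothetical, and which variant avr-gcc handles better is an assumption to verify in the listing.

```c
#include <stdint.h>

/* Variant A: the whole 8-byte value comes back by value. */
uint64_t make_stamp(uint32_t seconds, uint32_t ticks)
{
    return ((uint64_t)seconds << 32) | ticks;
}

/* Variant B: the payload goes straight to the caller's destination
   and only a one-byte status is returned. */
uint8_t make_stamp_into(uint32_t seconds, uint32_t ticks, uint64_t *dest)
{
    if (dest == 0)
        return 0;               /* failure: nowhere to store */
    *dest = ((uint64_t)seconds << 32) | ticks;
    return 1;                   /* success */
}
```

Variant B fits the situation described above, for example `make_stamp_into(s, t, &log_buf[i]);` where the slot address is only known at run time.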

Moderation in all things. -- ancient proverb