C/C++ -> asm function call registers


 

Hi

 

I really apologize for asking such a dumb question, but I can't find a precise answer about which registers are used when calling a function in C/C++.

 

I found this

 

 

Yet this Arduino code seems to use the X register, which is R27:R26, not R25.

 

I don't get it.

 

 

here is the definition of the structure in c++

 

typedef struct {
	volatile IO_REG_TYPE * pin1_register;
	volatile IO_REG_TYPE * pin2_register;
	IO_REG_TYPE            pin1_bitmask;
	IO_REG_TYPE            pin2_bitmask;
	uint8_t                state;
	int32_t                position;
} Encoder_internal_state_t;

 

my goal is to understand how the structure pointer is accessed from assembly language

 

thanks for your time

 

Phil

 

Last Edited: Thu. May 6, 2021 - 07:28 PM

Your function argument needs 2 bytes as it is a pointer, so it uses r25:r24 as the list suggests. These registers are not pointer registers, so the value will be moved to a register pair that is, which will be X or Z, since Y is usually kept as the frame pointer for stack access.
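To make the "r25:r24" rule concrete, here is a rough host-side model of how I read the fixed-argument allocation rule on the avr-gcc ABI page: start at register 26, round each argument size up to an even number of bytes, and count downwards; once the next argument would fall below r8, it and everything after it goes on the stack. The names `ArgLocation` and `assign_arg_registers` are made up for this sketch, and it ignores variadic functions and other corner cases.

```cpp
#include <assert.h>
#include <stddef.h>
#include <vector>

// Where one argument ends up (hypothetical helper type for this sketch).
struct ArgLocation {
    bool in_registers;  // false: passed on the stack
    int low_reg;        // lowest register number, valid only if in_registers
    int high_reg;       // highest register number, valid only if in_registers
};

// Model of the avr-gcc rule for a fixed argument list. Sizes are in bytes.
std::vector<ArgLocation> assign_arg_registers(const std::vector<size_t>& sizes) {
    std::vector<ArgLocation> result;
    int next = 26;          // allocation counts downwards from r26
    bool on_stack = false;  // once one argument spills, the rest follow
    for (size_t size : sizes) {
        assert(size > 0);
        int rounded = static_cast<int>(size + (size & 1u));  // round up to even
        if (!on_stack && next - rounded >= 8) {
            next -= rounded;
            result.push_back({true, next, next + static_cast<int>(size) - 1});
        } else {
            on_stack = true;
            result.push_back({false, -1, -1});
        }
    }
    return result;
}
```

For a single 2-byte pointer argument this model gives r24 (low byte) and r25 (high byte), which matches the r25:r24 mentioned above. A second, 4-byte argument would land in r20..r23.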

 

You can watch what the compiler does-

https://godbolt.org/z/984n98516

 

Not sure if you are trying to decode existing asm, or create it. If creating, I would say- why.


The code you put in your post does not include the full ASM statement.

I would expect that it has an "input operand" after the asm string that specifies "pass the value of arg in the X register."

Probably something like:

"clr r1  \n\t" 		         \		
	: "=&d" (prod)               \
	: "x" (arg), "z" (dest)    );

 


Are you planning to do this in inline asm or in .S files? For the latter you simply need to know the ABI:

 

https://gcc.gnu.org/wiki/avr-gcc

 

The piece you quote about "fixed argument list" is basically just a precis of that ABI.

 

But then you show inline Asm where the input/output allocation is not fixed in the same way but is controlled by the input and output lists as described in the manual:

 

https://www.nongnu.org/avr-libc/...


curtvm wrote:
If creating, I would say- why.

Creating an ASM function callable from C or C++ ?

 

If that's the case, the easiest way is to create a "skeleton" in C or C++ and look at the ASM that the compiler generates for the call and return - because the compiler certainly does know its own ABI.

 

You then take that ASM, and fill-in your own ASM code.

 

Same principle as: http://www.8052mcu.com/forum/rea...

 

Top Tips:

  1. How to properly post source code - see: https://www.avrfreaks.net/comment... - also how to properly include images/pictures
  2. "Garbage" characters on a serial terminal are (almost?) invariably due to wrong baud rate - see: https://learn.sparkfun.com/tutorials/serial-communication
  3. Wrong baud rate is usually due to not running at the speed you thought; check by blinking a LED to see if you get the speed you expected
  4. Difference between a crystal, and a crystal oscillator: https://www.avrfreaks.net/comment...
  5. When your question is resolved, mark the solution: https://www.avrfreaks.net/comment...
  6. Beginner's "Getting Started" tips: https://www.avrfreaks.net/comment...
Last Edited: Fri. May 7, 2021 - 08:21 AM

>Creating an ASM function callable from C or C++ ?

>You then take that ASM, and fill-in your own ASM code.

 

Like I said. Why.

 

What can possibly be going on in that update function that the compiler cannot do, and most likely do better? I'll assume that this function is from some Arduino library, and phil1-7 is just trying to understand what is going on. Someone thought they needed asm, and they didn't. Happens all the time.

 

Nothing wrong with knowing how to read asm, and one should keep an eye on what the compiler produces simply because it will point out where you are wrong in your understanding of the c/c++ language, and lets you resolve problems quicker because you know what 'looks right'. 

 

edit-

Looks like I found something that appears to be what we are looking at-

http://docs.ros.org/en/hydro/api...

 

Why on earth is there all that asm.

 

How about something 10x easier to read, and looks like it probably produces better code-

https://godbolt.org/z/j934bcra5

 

 

 

Here is what I came up with for an encoder on a recent thread-

https://www.avrfreaks.net/commen...

 

and this is what the code looks like (cannot compile online, so just showing the code)-

https://godbolt.org/z/KTMxx7bo6

Was looking for simple, and no asm needed. I don't think I have needed to create asm for an mcu in a long time. Compilers do a good job.

 

Time is better spent getting good at c/c++ instead of asm.

 

 

 

 

Last Edited: Fri. May 7, 2021 - 09:54 AM

curtvm wrote:
Like I said. Why.

https://www.avrfreaks.net/commen...

 

A classic example would be a precise delay routine: http://www.8052mcu.com/forum/rea...

 

Time is better spent getting good at c/c++ instead of asm

You know that I am firmly in the "why on earth would you do this in assembler?" camp - but there will always be a few things that need to be done in ASM.

 


curtvm wrote:
Why on earth is there all that asm.
There doesn't have to be! That encoder support was written by Paul Stoffregen (behind "Teensy" and a lot of the core Arduino stuff). The file even has a #else:

	static void update(Encoder_internal_state_t *arg) {
#if defined(__AVR__)
		// The compiler believes this is just 1 line of code, so
		// it will inline this function into each interrupt
		// handler.  That's a tiny bit faster, but grows the code.
		// Especially when used with ENCODER_OPTIMIZE_INTERRUPTS,
		// the inline nature allows the ISR prologue and epilogue
		// to only save/restore necessary registers, for very nice
		// speed increase.
		asm volatile (
			"ld	r30, X+"		"\n\t"
			"ld	r31, X+"		"\n\t"
			"ld	r24, Z"			"\n\t"	// r24 = pin1 input
			"ld	r30, X+"		"\n\t"
			"ld	r31, X+"		"\n\t"
			"ld	r25, Z"			"\n\t"  // r25 = pin2 input
			"ld	r30, X+"		"\n\t"  // r30 = pin1 mask
			"ld	r31, X+"		"\n\t"	// r31 = pin2 mask
			"ld	r22, X"			"\n\t"	// r22 = state
			"andi	r22, 3"			"\n\t"
			"and	r24, r30"		"\n\t"
			"breq	L%=1"			"\n\t"	// if (pin1)
			"ori	r22, 4"			"\n\t"	//	state |= 4
		"L%=1:"	"and	r25, r31"		"\n\t"
			"breq	L%=2"			"\n\t"	// if (pin2)
			"ori	r22, 8"			"\n\t"	//	state |= 8
		"L%=2:" "ldi	r30, lo8(pm(L%=table))"	"\n\t"
			"ldi	r31, hi8(pm(L%=table))"	"\n\t"
			"add	r30, r22"		"\n\t"
			"adc	r31, __zero_reg__"	"\n\t"
			"asr	r22"			"\n\t"
			"asr	r22"			"\n\t"
			"st	X+, r22"		"\n\t"  // store new state
			"ld	r22, X+"		"\n\t"
			"ld	r23, X+"		"\n\t"
			"ld	r24, X+"		"\n\t"
			"ld	r25, X+"		"\n\t"
			"ijmp"				"\n\t"	// jumps to update_finishup()
			// TODO move this table to another static function,
			// so it doesn't get needlessly duplicated.  Easier
			// said than done, due to linker issues and inlining
		"L%=table:"				"\n\t"
			"rjmp	L%=end"			"\n\t"	// 0
			"rjmp	L%=plus1"		"\n\t"	// 1
			"rjmp	L%=minus1"		"\n\t"	// 2
			"rjmp	L%=plus2"		"\n\t"	// 3
			"rjmp	L%=minus1"		"\n\t"	// 4
			"rjmp	L%=end"			"\n\t"	// 5
			"rjmp	L%=minus2"		"\n\t"	// 6
			"rjmp	L%=plus1"		"\n\t"	// 7
			"rjmp	L%=plus1"		"\n\t"	// 8
			"rjmp	L%=minus2"		"\n\t"	// 9
			"rjmp	L%=end"			"\n\t"	// 10
			"rjmp	L%=minus1"		"\n\t"	// 11
			"rjmp	L%=plus2"		"\n\t"	// 12
			"rjmp	L%=minus1"		"\n\t"	// 13
			"rjmp	L%=plus1"		"\n\t"	// 14
			"rjmp	L%=end"			"\n\t"	// 15
		"L%=minus2:"				"\n\t"
			"subi	r22, 2"			"\n\t"
			"sbci	r23, 0"			"\n\t"
			"sbci	r24, 0"			"\n\t"
			"sbci	r25, 0"			"\n\t"
			"rjmp	L%=store"		"\n\t"
		"L%=minus1:"				"\n\t"
			"subi	r22, 1"			"\n\t"
			"sbci	r23, 0"			"\n\t"
			"sbci	r24, 0"			"\n\t"
			"sbci	r25, 0"			"\n\t"
			"rjmp	L%=store"		"\n\t"
		"L%=plus2:"				"\n\t"
			"subi	r22, 254"		"\n\t"
			"rjmp	L%=z"			"\n\t"
		"L%=plus1:"				"\n\t"
			"subi	r22, 255"		"\n\t"
		"L%=z:"	"sbci	r23, 255"		"\n\t"
			"sbci	r24, 255"		"\n\t"
			"sbci	r25, 255"		"\n\t"
		"L%=store:"				"\n\t"
			"st	-X, r25"		"\n\t"
			"st	-X, r24"		"\n\t"
			"st	-X, r23"		"\n\t"
			"st	-X, r22"		"\n\t"
		"L%=end:"				"\n"
		: : "x" (arg) : "r22", "r23", "r24", "r25", "r30", "r31");
#else
		uint8_t p1val = DIRECT_PIN_READ(arg->pin1_register, arg->pin1_bitmask);
		uint8_t p2val = DIRECT_PIN_READ(arg->pin2_register, arg->pin2_bitmask);
		uint8_t state = arg->state & 3;
		if (p1val) state |= 4;
		if (p2val) state |= 8;
		arg->state = (state >> 2);
		switch (state) {
			case 1: case 7: case 8: case 14:
				arg->position++;
				return;
			case 2: case 4: case 11: case 13:
				arg->position--;
				return;
			case 3: case 12:
				arg->position += 2;
				return;
			case 6: case 9:
				arg->position -= 2;
				return;
		}
#endif
	}

so if you simply pretend it is not an AVR then you are left with:

	static void update(Encoder_internal_state_t *arg) {
		uint8_t p1val = DIRECT_PIN_READ(arg->pin1_register, arg->pin1_bitmask);
		uint8_t p2val = DIRECT_PIN_READ(arg->pin2_register, arg->pin2_bitmask);
		uint8_t state = arg->state & 3;
		if (p1val) state |= 4;
		if (p2val) state |= 8;
		arg->state = (state >> 2);
		switch (state) {
			case 1: case 7: case 8: case 14:
				arg->position++;
				return;
			case 2: case 4: case 11: case 13:
				arg->position--;
				return;
			case 3: case 12:
				arg->position += 2;
				return;
			case 6: case 9:
				arg->position -= 2;
				return;
		}
	}

which is simply putting into C the states in:

//                           _______         _______
//               Pin1 ______|       |_______|       |______ Pin1
// negative <---         _______         _______         __      --> positive
//               Pin2 __|       |_______|       |_______|   Pin2

//      new     new     old     old
//      pin2    pin1    pin2    pin1    Result
//      ----    ----    ----    ----    ------
//      0       0       0       0       no movement
//      0       0       0       1       +1
//      0       0       1       0       -1
//      0       0       1       1       +2  (assume pin1 edges only)
//      0       1       0       0       -1
//      0       1       0       1       no movement
//      0       1       1       0       -2  (assume pin1 edges only)
//      0       1       1       1       +1
//      1       0       0       0       +1
//      1       0       0       1       -2  (assume pin1 edges only)
//      1       0       1       0       no movement
//      1       0       1       1       -1
//      1       1       0       0       +2  (assume pin1 edges only)
//      1       1       0       1       -1
//      1       1       1       0       +1
//      1       1       1       1       no movement
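The table above can also be written as a 16-entry lookup array instead of a switch. This is only a sketch to show the indexing, not the library's code: `EncoderModel`, `kDelta` and `encoder_step` are names invented here, and the pin values are passed in as plain booleans rather than read with DIRECT_PIN_READ.

```cpp
#include <assert.h>
#include <stdint.h>

// Movement per transition, indexed by
// (new_pin2 << 3) | (new_pin1 << 2) | (old_pin2 << 1) | old_pin1,
// in the same row order as the table.
static const int8_t kDelta[16] = {
     0, +1, -1, +2,
    -1,  0, -2, +1,
    +1, -2,  0, -1,
    +2, -1, +1,  0,
};

// Hypothetical stand-in for the state and position fields of the real struct.
struct EncoderModel {
    uint8_t state;     // bit 1 = old pin2, bit 0 = old pin1
    int32_t position;
};

void encoder_step(EncoderModel& enc, bool pin1, bool pin2) {
    uint8_t index = enc.state & 3u;   // old pin2, old pin1
    if (pin1) index |= 4u;            // new pin1
    if (pin2) index |= 8u;            // new pin2
    assert(index < 16);
    enc.position += kDelta[index];
    enc.state = index >> 2;           // new pins become the old pins
}
```

Stepping the pins through (pin1, pin2) = (0,1), (1,1), (1,0), (0,0) from an all-zero start hits rows 8, 14, 7 and 1 of the table, each of which is +1.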

I know from experience that the C version works very well - both on an ARM based teensy and also if you simply use the C version for AVR as I did in:

 

https://github.com/wrightflyer/s...

 

I previously made a small video of that in action (mega32):

 

https://www.youtube.com/watch?v=...

 

While it's pretty tricky to operate a joystick or encoder in one hand while holding a phone in the other to take the pictures, this kind of proves that the "simple C" solution works very nicely.

 

For the update() I simply do:

Timer0 tim(Timer0::TIM0_CTC, 100);
tim.start(64);
tim.attachInterrupt(Timer0::TIM0_COMP_ISR, timerUpdate);

and then:

void timerUpdate(void) {
    enc1.intUpdate();
    enc2.intUpdate();
    enc3.intUpdate();
    enc4.intUpdate();
    enc5.intUpdate();
    enc6.intUpdate();
    enc7.intUpdate();
    enc8.intUpdate();
    but1.update();
    but2.update();
    but3.update();
    but4.update();
    but5.update();
    but6.update();
    but7.update();
    but8.update();
}

which is handling 8 encoders (and 8 buttons with debounce).

 

The 100 passed to the c'tor sets the OCR, and the 64 passed to tim.start() selects the /64 prescaler. As this was a 4MHz crystal (what I happened to have lying around at the time), the timer clock is 4MHz/64 = 62.5kHz, and 100 ticks of that is 1.6ms, but the value is pretty arbitrary and was picked almost at random.
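For anyone who wants to redo that arithmetic for a different crystal or prescaler, here is a tiny helper. `timer_period_us` is a made-up name, not part of the Timer0 class above. One caveat: on AVR timers in CTC mode the counter runs from 0 up to OCR inclusive, so if the class writes 100 straight into OCR the real period is 101 ticks; I don't know which way the class does it.

```cpp
#include <assert.h>
#include <stdint.h>

// Duration of `ticks` timer ticks in microseconds, for a CPU clock of
// f_cpu_hz and the given prescaler. 64-bit intermediate avoids overflow.
uint32_t timer_period_us(uint32_t f_cpu_hz, uint32_t prescale, uint32_t ticks) {
    assert(f_cpu_hz > 0);
    uint64_t cpu_cycles = static_cast<uint64_t>(ticks) * prescale;
    return static_cast<uint32_t>(cpu_cycles * 1000000u / f_cpu_hz);
}
```

With 4000000, 64 and 100 this comes out at 1600us, i.e. the 1.6ms quoted above; 101 ticks would be 1616us.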

Last Edited: Fri. May 7, 2021 - 10:33 AM

westfw wrote:

The code you put in your post does not include the full ASM statement.

I would expect that it has an "input operand" after the asm string that specifies "pass the value of arg in the X register."

Probably something like:

"clr r1  \n\t" 		         \
	: "=&d" (prod)               \
	: "x" (arg), "z" (dest)    );

 

 

"L%=end:"				"\n"
		: : "x" (arg) : "r22", "r23", "r24", "r25", "r30", "r31");
	}

if X is used, being 26:27, then I guess r25 might hold the number of arguments

 

as for the arguments passed, they start with register r22, which makes even less sense

 

there is only one argument in the function, yet registers r22 to r31 are used

Last Edited: Fri. May 7, 2021 - 11:20 AM

it's the code for a rotary encoder, maybe the writer needed precision and performance

 

I just wrote a ws2812 lib, and there is no way to have it run fast enough with high-level functions, especially if one is using Arduino functions like digitalWrite

Last Edited: Fri. May 7, 2021 - 11:26 AM

>A classic example would be a precise delay routine

 

The avr already has that built in so not a great example, but even for a cortex-m you do not have to go into asm to get some non-hardware delay-

https://godbolt.org/z/TqTMcqbEG

which, like all other non-hardware delays, is subject to interrupts. But this code, which in this version uses runtime cpu speeds, is still within 400ns from 1us on up (at sane speeds), and is no better/worse than any asm version, except you can stay out of asm. So "getting good at c/c++" is probably an incomplete statement, which should also include being able to use the toolchain to get what you want.

 

 

> my goal is to understand how the structure pointer is accessed from assembly language

 

Back to original post- your pointer X came from arg and, without you seeing it, the value for arg (r25:r24) was copied to X by the compiler because of the "x" input operand-

: : "x" (arg) : "r22", "r23", "r24", "r25", "r30", "r31");

so looking at asm source inside a c function is not necessarily showing you everything.
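Once X holds arg, each `ld ..., X+` in the asm simply walks the struct one byte at a time. Here is a host-side model of the layout the asm appears to rely on, assuming 2-byte pointers, IO_REG_TYPE being uint8_t on AVR, and no padding (all three are my assumptions). `EncoderStateAvrModel` and `read_position_like_asm` are names made up for the sketch, and the pack pragma is only there so a PC compiler reproduces the byte-aligned layout.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
struct EncoderStateAvrModel {
    uint16_t pin1_register;  // stands in for a 2-byte AVR pointer, offset 0
    uint16_t pin2_register;  // offset 2
    uint8_t  pin1_bitmask;   // offset 4
    uint8_t  pin2_bitmask;   // offset 5
    uint8_t  state;          // offset 6
    int32_t  position;       // offset 7, little-endian
};
#pragma pack(pop)

static_assert(sizeof(EncoderStateAvrModel) == 11, "model should be 11 bytes");

// Read position the way the asm does: skip to offset 7, then take four
// bytes, lowest first (r22, r23, r24, r25 in the original code).
int32_t read_position_like_asm(const uint8_t* x) {
    assert(x != nullptr);
    x += offsetof(EncoderStateAvrModel, position);
    uint32_t value = 0;
    for (int i = 3; i >= 0; --i) {
        value = (value << 8) | x[i];
    }
    return static_cast<int32_t>(value);
}
```

That accounts for the seven X+ loads/stores before the four `ld r22..r25, X+` instructions, and for the four `st -X` instructions that walk back down to offset 7.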

 


as for people being against ASM, well I am not, hence the question for which I still don't have an answer yet :-)

 


 

"so looking at asm source inside a c function is not necessarily showing you everything. "

 

ok but why these r22 r23 r24.... ?

 

and x is 26:27, hence my question

 

 

Last Edited: Fri. May 7, 2021 - 11:24 AM

Read the ABI link. It not only dictates which registers are used for passing function parameters, in the given order, when you make a call across the boundary from C/C++ to Asm in a .S file, but it also tells you about "call-used" and "call-saved" registers. That influences the compiler's own internal choices about which registers it prefers to use for day-to-day jobs, until the "register pressure" gets so great that it has to use a wider range of registers.
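As a quick reference, this is how I read the register classes on the avr-gcc ABI page, written as a small lookup function. `RegClass` and `classify_avr_register` are names invented for this sketch; check the ABI page itself before relying on it.

```cpp
#include <assert.h>

enum class RegClass { Fixed, CallUsed, CallSaved };

// Classify r0..r31 according to the avr-gcc ABI as summarised above.
RegClass classify_avr_register(int r) {
    assert(r >= 0 && r <= 31);
    if (r == 0 || r == 1) {
        return RegClass::Fixed;      // r0 = __tmp_reg__, r1 = __zero_reg__
    }
    if ((r >= 18 && r <= 27) || r == 30 || r == 31) {
        return RegClass::CallUsed;   // a callee may overwrite these freely
    }
    return RegClass::CallSaved;      // r2..r17 and r28, r29 (Y)
}
```

Note that every register named in the asm statement in this thread (r22 to r25, X in r26/r27, Z in r30/r31) falls in the call-used group.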


> ok but why these r22 r23 r24.... ?

 

Because that is the clobber list. The compiler is still in charge of register allocation around the asm, and the asm code uses registers that the compiler does not know about, so the clobber list informs the compiler which registers may have changed.
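A much smaller example than the encoder code, showing the same three ingredients: an operand the compiler places in X for you, a scratch register picked by hand, and the clobber list that owns up to it. `sum_two_bytes` is a made-up function, the asm is an untested sketch of the avr-gcc syntax, and the `#else` branch is only there so the same file builds on a PC.

```cpp
#include <assert.h>
#include <stdint.h>

// Returns p[0] + p[1] (modulo 256).
uint8_t sum_two_bytes(const volatile uint8_t *p) {
    assert(p != nullptr);
#if defined(__AVR__)
    uint8_t sum;
    asm volatile(
        "ld   r22, X+"  "\n\t"   // r22 = p[0]; r22 is our own scratch choice
        "ld   %0, X"    "\n\t"   // sum = p[1]
        "add  %0, r22"  "\n\t"   // sum += p[0]
        : "=&r" (sum),           // output: any register the compiler likes
          "+x" (p)               // p is placed in X (r27:r26) and modified by X+
        :
        : "r22", "memory");      // clobbers: tell the compiler r22 changed
    return sum;
#else
    return static_cast<uint8_t>(p[0] + p[1]);
#endif
}
```

Nowhere in the asm text is there an instruction that moves the argument from r25:r24 into X; the compiler emits that itself because of the "x" constraint, which is why it never shows up in the source.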

 

Read-

https://www.nongnu.org/avr-libc/...

Looks fun, doesn't it.

 

You can also try yourself to see what is going on with the asm-inside-a-c-function-

https://godbolt.org/z/a7Go1rdx1

Last Edited: Fri. May 7, 2021 - 11:35 AM

OMG!

 

I have seen many ways to implement an encoder, but never a more clumsy way. 


The ABI for using inline assembly is not the same as the ABI for passing arguments in function calls.

Conflating the two seems to be causing the confusion.

Moderation in all things. -- ancient proverb