zero register


Hi,

I noticed that r1 is the zero register which means it is always zero.

1. When I was writing my fixed-point division routines, I noticed that r1 is used as a loop counter for 32-bit division (the comment says it's OK since it's zero at the end of the loop). Is this really safe? What if an interrupt occurs? Can you not depend on it being zero in interrupts?

2. Isn't r1 a poor choice for the zero register since it is used by multiplication? If the zero register were say r2 instead of r1, it would be nice because then r1 wouldn't have to be cleared after every multiplication operation which would increase performance and reduce code size. I would prefer having r0 and r1 both be scratch registers, and r2 be the zero register.

Any comments?


1) I think you'll find that the standard code generated for ISRs involves R1 being pushed, then a "CLR R1" to guarantee it's back to 0. So if there are times in the library (and there are) when it doesn't contain 0, that's OK:

.global	__vector_1
	.type	__vector_1, @function
__vector_1:
.LFB6:
.LM9:
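	push __zero_reg__	 ; save r1: it may be non-zero at the point of interruption
	push r0	 ; save r0 so it can be used as a temporary
	in r0,__SREG__	 ; read the status register into r0
	push r0	 ; save SREG on the stack
	clr __zero_reg__	 ; guarantee r1 == 0 inside the ISR

	pop r0	 ; fetch the saved SREG
	out __SREG__,r0	 ; restore the status register
	pop r0	 ; restore r0
	pop __zero_reg__	 ; restore whatever r1 held before the interrupt
	reti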
.LFE6:
	.size	__vector_1, .-__vector_1

2) It was chosen before there were AVRs that supported MUL. One thing I've never understood, though, is why it's R1 and not the possibly more obvious R0 (which even has "zero" in the name)?


clawson wrote:

2) It was chosen before there were AVRs that supported MUL. One thing I've never understood, though, is why it's R1 and not the possibly more obvious R0 (which even has "zero" in the name)?

Even the non-MUL AVRs had an LPM instruction, whose target was always R0. So, R0 was kind of "taken".

I think that the "Zero register" should be changed, though (Before I warmed to the notion of coding in C, my assembler practice was to use R15 for that; inspired, I think, by the way the stack grows downward from the highest-available location). So the library code generated with a "use something besides R0::R1 for zero" policy wouldn't be useable with preexisting code; big deal. Rebuild any libraries you may have using the newer compiler, and press on.


Quote:

(Before I warmed to the notion of coding in C, my assembler practice was to use R15 for that; ...

I had a couple of tight ASM apps a number of years ago, and experimented with "zero register". My outcome was that it didn't help. ;) YMMV.

TI MSP430 has a "constant generator" in R3; 6 values IIRC.

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


I have hardly seen GCC(AVR) actually making use of R1 as a zero register. Especially for ISRs there is the possibility for another small optimization by checking if it is actually used inside the ISR. If not one could leave out saving and clearing R1.


I went looking through the .lss file for my application, and it does seem like R1 isn't used everywhere it might be, but it IS used whenever an I/O register is being cleared. Here's an excerpt that turned up as the first "hit" of searching for r1 in my .lss:

//     ################
//     # Initializers #
//     ################
//
Initialize(SPI)
     174:	bb 9a       	sbi	0x17, 3	; 23
{
    // Make the MISO pin (Bit 3 of Port B) into an output:
    //
    DDRB |= _BV(3);

    SPDR = 0;
     176:	1f b8       	out	0x0f, r1	; 15
    SPCR = 0x40;  // Enable SPI system in SLAVE mode default polarity & phasing
     178:	80 e4       	ldi	r24, 0x40	; 64
     17a:	8d b9       	out	0x0d, r24	; 13

Also when "bools" are being set. Here's another excerpt, although the interspersed sourcecode doesn't correlate too well with the assembler:

    // We're done with the TWI transceiver now...
    //
    twiReserved = false;
     37e:	10 92 0c 01 	sts	0x010C, r1
     382:	81 2f       	mov	r24, r17
     384:	11 11       	cpse	r17, r1
     386:	81 e0       	ldi	r24, 0x01	; 1

Checking back against the source, the manipulation of r24 is generating the return value of the containing function, which is typed as "bool", and is the translation of a sourcecode line:

    return success;

.


Kleinstein wrote:
I have hardly seen GCC(AVR) actually making use of R1 as a zero register. Especially for ISRs there is the possibility for another small optimization by checking if it is actually used inside the ISR. If not one could leave out saving and clearing R1.
The prologue/epilogue generation in GCC isn't that smart yet. If the fastest response is needed, I write the ISR in a separate ASM file. It is really rather simple!

I think it was already stated in the thread but I'll point it out here again - GCC can't assume that R1 is zero in an interrupt because the AVR could have been interrupted after a MUL instruction which clobbers R0:R1. It has been suggested before on the avr-gcc mailing list to change the register allocation to use a different register for the zero register and other improvements but I don't think that it ever got very far (library and hand-written assembly backward compatibility reasons were stated).

Math is cool.
jevinskie.com
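
To make the point about MUL and interrupts concrete, here is a minimal host-side sketch in plain C (a model, not AVR code). The names `r`, `model_mul`, `model_isr` and `r1_seen_by_isr` are made up for illustration. The only things assumed are what the thread already states: MUL leaves its 16-bit product in R1:R0, and the generated ISR prologue pushes and clears R1.

```c
#include <stdint.h>

uint8_t r[32];            /* hypothetical model of the 32 working registers */
uint8_t r1_seen_by_isr;   /* value of r1 as observed inside the ISR body */

/* Model of "mul Rd,Rr": the unsigned product is written to r1:r0. */
void model_mul(uint8_t rd, uint8_t rr)
{
    uint16_t product = (uint16_t)((uint16_t)r[rd] * r[rr]);
    r[0] = (uint8_t)(product & 0xFF);   /* low byte  -> r0 */
    r[1] = (uint8_t)(product >> 8);     /* high byte -> r1, the "zero" register */
}

/* Model of the compiler-generated ISR wrapper around an ISR body. */
void model_isr(void)
{
    uint8_t saved_r1 = r[1];   /* push __zero_reg__ */
    r[1] = 0;                  /* clr  __zero_reg__ */
    r1_seen_by_isr = r[1];     /* ISR body: may rely on r1 == 0 here */
    r[1] = saved_r1;           /* pop  __zero_reg__ */
}
```

If an interrupt lands right after `model_mul` and before the main code has cleared r1 again, the ISR body still sees zero, and the interrupted code gets its product high byte back afterwards.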


theusch wrote:
Quote:

(Before I warmed to the notion of coding in C, my assembler practice was to use R15 for that; ...

I had a couple of tight ASM apps a number of years ago, and experimented with "zero register". My outcome was that it didn't help. ;) YMMV.

TI MSP430 has a "constant generator" in R3; 6 values IIRC.

Lee

Really? Interesting. I estimate that a zero register saves me from using several 'clr' instructions (or similar) over the course of a(n assembly language) project, especially when doing arithmetic or comparisons between a 16-bit number and an 8-bit number.

I wish the gcc zero register would be changed as well. I find that (in a C project) whatever instructions might not be needed by using the zero register are more than made up for by gcc having to have push r1/pop r1 in ISR prologues and epilogues. Sure I can (and do at times) code ISRs in assembly, but gas doesn't exactly make assembly language programming easy compared to Atmel's assembler, and I thought the idea of programming in C is to make life easier. ;-)

Mark


maalper wrote:
theusch wrote:
Quote:

(Before I warmed to the notion of coding in C, my assembler practice was to use R15 for that; ...

I had a couple of tight ASM apps a number of years ago, and experimented with "zero register". My outcome was that it didn't help. ;) YMMV.

TI MSP430 has a "constant generator" in R3; 6 values IIRC.

Lee

Really? Interesting. I estimate that a zero register saves me from using several 'clr' instructions (or similar) over the course of a(n assembly language) project, especially when doing arithmetic or comparisons between a 16-bit number and an 8-bit number.

Now what is the cost of a CLR instruction? and what is the cost of the MOV that you are replacing it with? Are you really saving? I think not.

Where you save is in operations like adding with carry of 0, since there is no add immediate with carry instruction. In reality this need probably doesn't come up too often, so you aren't really saving by having a zero register. In fact, as pointed out, it is probably costing more than it's saving. Better to set up the zero register in those few code scenarios where it is actually needed, otherwise leave it open for general use.

Writing code is like having sex.... make one little mistake, and you're supporting it for life.


You do save here-and-there. My statement was based on if the savings outweigh the value of "losing" the register for bit flags, high-use "register variables", working storage for 32-bit arithmetic, and the like. Those were '4433 apps with no MUL or GPIOR; with (say) a Mega48 the answer may be different--GPIOR0 is good for flags, and GPIOR1/2 might be nice for constants...

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


I also had the impression that GCC is not very efficient in using the zero register. In the example of adding 16-bit and 32-bit numbers, it first sets 2 registers to 0 to expand the 16-bit number to 32 bits instead of using the zero reg. But this is probably because there is no special mixed-resolution arithmetic code; it just promotes the 16-bit value to 32 bits first, and then does a 32-bit add.

With relatively few uses of the zero reg left, I wonder whether it actually saves more than it loses by blocking a register.

However, it's always easier to find weak spots than to improve on them. In other cases gcc does a good job of optimization, too good for some delay loops.


Why doesn't an option exist in the compiler to choose to have a different register (such as 15) be the zero register?


Unless everyone always used __zero_reg__ as recommended in asm code, there would be problems for too many people if r1 was switched to something else. Even if it was changed, libraries would (should) still need to clear r1 in case the library was used with an older version of the compiler, so no gain with libraries. I guess one could count the 'eor r1,r1' in a project (probably not too many in a 'typical' project), and also count how many times r2 is used (probably not too many either), and see if it is worth the switch to lose r2 but eliminate the clearing of r1. If it is, gcc is open source, so make the change for your own benefit. I happen to think backward compatibility wins in this case, as the cost is not very high (for most projects, I suspect).

But what do I know (don't answer, please).


Quote:

Why doesn't an option exist in the compiler to choose to have a different register (such as 15) be the zero register?

And when you link with a pre-built lib that had the zero-register set to R1, how well will that work?

As of January 15, 2018, Site fix-up work has begun! Now do your part and report any bugs or deficiencies here

No guarantees, but if we don't report problems they won't get much of  a chance to be fixed! Details/discussions at link given just above.

 

"Some questions have no answers."[C Baird] "There comes a point where the spoon-feeding has to stop and the independent thinking has to start." [C Lawson] "There are always ways to disagree, without being disagreeable."[E Weddington] "Words represent concepts. Use the wrong words, communicate the wrong concept." [J Morin] "Persistence only goes so far if you set yourself up for failure." [Kartman]


Quote:

Why doesn't an option exist in the compiler to choose to have a different register (such as 15) be the zero register?

Like Curt says, as soon as you link with a prebuilt library providing printf() or strcpy() or sin() or whatever and it was built previously to use R1 as __zero_reg__ then you'd hit problems. The only way a compiler switch could work is if all the library calls were built from source at compile time too.

Cliff


Kleinstein wrote:
I also had the impression that GCC is not very efficient in using the zero register. In the example of adding 16-bit and 32-bit numbers, it first sets 2 registers to 0 to expand the 16-bit number to 32 bits instead of using the zero reg. But this is probably because there is no special mixed-resolution arithmetic code; it just promotes the 16-bit value to 32 bits first, and then does a 32-bit add.

I think I could implement mixed arithmetics fairly easily.. this would help a lot in multiplication cases as well.

I realize the zero register is good to have, I just don't think it should be r1.. maybe r2 would be better. I would even be in favor of having r2 and r3 both be zeros, which would allow movw to clear two registers in 1 instruction.. this occurs frequently, and currently it is always two instructions.


I also think that using r0, r1 for zero and stack saving was the most disadvantageous choice.
It always costs two unneeded PUSH/POP on every interrupt.

r0, r1 should be reserved for LPM, SPM and MUL.

Peter


Quote:
It always costs two unneeded PUSH/POP on every interrupt.
Maybe I'm wrong, but one of those is most likely necessary anyway to save sreg (have to put it somewhere before pushing it, so R0 is as good as any). So you save 1 push/pop of r1, which also 'could' be handled by compiler optimization (assuming it did).
Quote:
I would even be in favor of having r2 and r3 both be zeros, which would allow movw to clear two registers in 1 instruction.. this occurs frequently
If it's to your advantage, save/restore r2/r3 and use them for your zero regs (6 instructions to do it, so you would need to save at least 7 to be of any benefit).

I think the bottom line is you are not going to get the gcc developers to change it very quickly (just guessing).


glitch wrote:
maalper wrote:

Really? Interesting. I estimate that a zero register saves me from using several 'clr' instructions (or similar) over the course of a(n assembly language) project, especially when doing arithmetic or comparisons between a 16-bit number and an 8-bit number.

Now what is the cost of a CLR instruction? and what is the cost of the MOV that you are replacing it with? Are you really saving? I think not.

Where you save is in operations like adding with carry of 0, since there is no add immediate with carry instruction. In reality this need probably doesn't come up too often, so you aren't really saving by having a zero register. In fact, as pointed out, it is probably costing more than it's saving. Better to set up the zero register in those few code scenarios where it is actually needed, otherwise leave it open for general use.

I was thinking of 16-bit with 8-bit comparisons like:

     cp   num1L, num2
     cpc  num1H, __zero_reg__

instead of something like:

     clr  tmpReg
     cp   num1L,num2
     cpc  num1H,tmpReg

or 16-bit with 8-bit additions like:

     add  num1L, num2
     adc  num1H, __zero_reg__

instead of something like:

     clr  tmpReg
     add  num1L,num2
     adc  num1H,tmpReg

If I read you correctly it sounds like you were thinking along similar lines. And yes, these cases only occur once or twice per project, typically.

Mark
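
For anyone who wants to see what the add/adc pair above computes, here is a small host-side C model (a sketch, not AVR code). The function name `add16_8` and the variable `zero_reg` are hypothetical. It assumes only that `adc` adds its two operands plus the carry left by the preceding `add`.

```c
#include <stdint.h>

/* Host-side model of:
 *     add  num1L, num2
 *     adc  num1H, __zero_reg__
 */
uint16_t add16_8(uint16_t num1, uint8_t num2)
{
    uint8_t lo = (uint8_t)(num1 & 0xFF);
    uint8_t hi = (uint8_t)(num1 >> 8);
    const uint8_t zero_reg = 0;              /* stands in for r1 */

    uint16_t sum_lo = (uint16_t)(lo + num2); /* add: 9-bit result, bit 8 is the carry */
    uint8_t carry = (uint8_t)(sum_lo >> 8);
    lo = (uint8_t)sum_lo;

    hi = (uint8_t)(hi + zero_reg + carry);   /* adc with zero: only propagates the carry */

    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

The zero register contributes nothing to the sum. It is only there because `adc` needs a register operand, which is why the alternative costs an extra `clr tmpReg`.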


clawson wrote:
Quote:

Why doesn't an option exist in the compiler to choose to have a different register (such as 15) be the zero register?

Like Curt says, as soon as you link with a prebuilt library providing printf() or strcpy() or sin() or whatever and it was built previously to use R1 as __zero_reg__ then you'd hit problems. The only way a compiler switch could work is if all the library calls were built from source at compile time too.

Cliff

Why can't they be built from source at compile time? Perhaps an option for the compiler that would use a more convenient register and also recompile automatically. I have no idea how much work it would take though to implement this.


curtvm wrote:
Quote:
It always costs two unneeded PUSH/POP on every interrupt.
Maybe I'm wrong, but one of those is most likely necessary anyway to save sreg (have to put it somewhere before pushing it, so R0 is as good as any). So you save 1 push/pop of r1, which also 'could' be handled by compiler optimization (assuming it did).

You can save at least two PUSH/POP, since at present the interrupt pushes and pops r0 twice, because it cannot be reserved for SREG only.

Maybe for interrupts without SEI all three PUSH/POP can be removed.

E.g. using r2 as zero reg and r3 for SREG saving would be very nice (would save 11 or 15 cycles).

Peter


danni wrote:
Maybe for interrupts without SEI all three PUSH/POP can be removed.
I assume that you're referring to an interrupt handler. Whether or not the ISR re-enables interrupts is immaterial since no interrupt handler should assume any particular register content. The determining factors are 1) does the code generated for the ISR rely on the zero register being zero, or 2) does the ISR call one or more functions that might. If neither of those factors exist, there is no need for the ISR to push/initialize/pop the zero register.

Don Kinzer
ZBasic Microcontrollers
http://www.zbasic.net


Quote:

Why can't they be built from source at compile time?

3rd party binary lib..



JohanEkdahl wrote:
Quote:

Why can't they be built from source at compile time?

3rd party binary lib..

My assumption is that if you choose this compiler option that you have the source for everything.


Salgat wrote:

Why can't they be built from source at compile time? Perhaps an option for the compiler that would use a more convenient register and also recompile automatically. I have no idea how much work it would take though to implement this.

I suggest that you try to build avr-libc on your own sometime. It would be quite educational.


Well I'm quite curious about compilers in general, probably going to take a compiler class as one of my electives once I transfer.


dkinzer wrote:
If neither of those factors exist, there is no need for the ISR to push/initialize/pop the zero register.

That's exactly the reason why using r0, r1 was a very bad choice, whereas e.g. r2, r3 would have been many times better.

You can never rely on registers that are changed by instructions you need.

Peter


You still have to worry about any instructions like 'mul' (or using __temp_reg__) which may be in the isr or a function called by the isr.

If a 'mul' is in the isr, or a called function of the isr, r0/r1 still has to be saved/restored either in the isr or the called function. So now you add r0/r1 to the 'call-saved' register list, and any function using a 'mul' (or using __temp_reg__) will have to save/restore r0/r1 as it could be called from an isr. You now moved the save/restore of r0/r1 from isr's to possibly a bunch of functions (functions now do keep r1 clear, but at least they don't have to save/restore it).

The current scheme (in my humble opinion) is actually pretty good. Maybe someone should come up with a 'typical app' (even though there really is no such thing) and see what would happen to code size/speed of isr's/etc. if change was made as suggested.


The main problem with r1 as the zero reg. is that one has to restore it from time to time and save it in the ISR. If the zero reg. were R2, not even the ISR would need to save it. I would guess most ISRs don't use a mul and would not need to save r0/r1 if they don't need them.

The registers r0/r1 would be better reserved for local use only. So they should be caller-saved, but only if actually used. This would make the best use of registers that are sometimes needed by special instructions.


Quote:

If the zero reg. were R2, not even the ISR would need to save it.

What if the programmer happens to use R2 in the ISR? Surely whatever register is guaranteed to hold 0 needs to be protected in the ISR or it may come back with non-0 in it?


Quote:
The main problem with r1 as the zero reg. is that one has to restore it from time to time and save it in the ISR
I would look at one of your own 'typical' projects and count the number of 'eor r1,r1' in it. I suspect it's not that many.

If you moved the zero reg to r2, you eliminate the save/clear/restore of r1 in the isr, but then you have functions (and libraries and other asm pieces) needing to save/restore r1 instead of just keeping it clear as they do now.

So if you currently have 5 isr's, and there are 15 eor r1,r1's in a project, you break even on code space (-5x3, +15x2, -15x1). So ultimately I don't think any other method is going to be a big code space saver. (You also lose r2, which could cause code size increase anyway).

If you want speed of isr's, asm is usually the answer anyway.

I think we need proof that there is a better way. A real project that all can see.
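
The break-even estimate above (-5x3, +15x2, -15x1) can be written out as a tiny C helper so other ISR and 'eor r1,r1' counts are easy to plug in. The function name `net_instruction_change` is hypothetical, and the per-site costs are simply the ones quoted in this post, not measured figures.

```c
/* Net change in instruction count if the zero register moved from r1 to r2,
 * using the per-site costs quoted in the post:
 *   each ISR drops push/clr/pop of r1              -> -3 per ISR
 *   each site that clobbers r1 gains push/pop r1   -> +2 per site
 *   each such site drops its "eor r1,r1"           -> -1 per site
 * A negative result means the change would shrink the code.
 */
int net_instruction_change(int isr_count, int clear_sites)
{
    return (-3 * isr_count) + (2 * clear_sites) - (1 * clear_sites);
}
```

With 5 ISRs and 15 clearing sites the result is zero, which is the break-even point described. Under these costs the change only pays off when there are more than three ISRs per 'eor r1,r1' site.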


clawson wrote:
What if the programmer happens to use R2 in the ISR?

I don't understand the problem.

If the compiler reserves R2 as the zero reg and R3 for SREG saving, then it would be very stupid for the compiler to generate code that uses them for anything else.

And naturally the programmer must avoid any other usage inside assembler code.

I see no problem in having only 30 registers available for general use.

Peter


danni wrote:
If the compiler reserves R2 as the zero reg and R3 for SREG saving
My sense of it is that it would be better to expend effort to improve the code generation for ISRs (not saving/restoring r0/r1 unless they are used or an external function is called) as compared to expending the effort to change the zero register assignment which would require having to inspect/rework all existing assembly language code.

Don Kinzer
ZBasic Microcontrollers
http://www.zbasic.net


I agree with "dkinzer" that we probably have to live with R1 being used as a zero register. Unless there is another good reason for a new register assignment, it probably won't change. R2 would clearly be a better choice for the zero register, but the difference is just not large enough to justify the trouble of changing the register allocation. Although one first step would be to have an option to change it if you want, even if this causes a lot of problems with binary libraries and ASM code.

For the ISRs there seems to be a relatively simple possibility for optimization: check if R0/R1 are actually used and only save/restore them in this case. The next step after that would be to additionally avoid using them in the ISR. In a similar way it may be possible to find a few cases where R1 is restored to 0 without need in the main program.


The AVR port was created many years ago, long before the multiply instructions were introduced.

Yes, R1 seems like a bad choice for a zero constant. Hindsight is always 20/20.
Yes, it will be difficult to change it now. It will cause backwards compatibility problems (nightmares), especially in avr-libc.
No, you cannot change the optimizations of ISRs. An ISR can happen at any time. How can the compiler check if R0/R1 is being used, or not, so it won't save/restore those registers? The compiler certainly does not know about functions in other compilation units.


EW wrote:
How can the compiler check if R0/R1 is being used, or not, so it won't save/restore those registers?
Two tests:
1) If the generated ISR code performs any subroutine call, then r0, r1, r18-27, r30, r31 must be saved/restored and r1 must be initialized to zero prior to the first call.

2) If no subroutine calls are generated, then save/restore only those registers actually used in the ISR code.

The logic of test #1 is already being performed. The missing piece is test #2.

Don Kinzer
ZBasic Microcontrollers
http://www.zbasic.net
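
The two tests above can be sketched as a decision function. This is a host-side C illustration only, not how GCC's prologue/epilogue generation is actually written. The names `REG`, `call_clobbered_mask`, `isr_save_mask` and `isr_needs_zero_init` are hypothetical, and the register list is the one given in the post (r0, r1, r18-27, r30, r31).

```c
#include <stdbool.h>
#include <stdint.h>

/* One bit per working register r0..r31. */
#define REG(n) (UINT32_C(1) << (n))

/* Registers a called function may clobber, as listed in the post. */
uint32_t call_clobbered_mask(void)
{
    uint32_t mask = REG(0) | REG(1) | REG(30) | REG(31);
    for (int n = 18; n <= 27; ++n)
        mask |= REG(n);
    return mask;
}

/* Which registers must the ISR prologue/epilogue save and restore? */
uint32_t isr_save_mask(uint32_t used_in_isr, bool makes_call)
{
    if (makes_call)                      /* test 1: any subroutine call */
        return used_in_isr | call_clobbered_mask();
    return used_in_isr;                  /* test 2: only what the ISR touches */
}

/* Must the prologue clear r1? Only if something may rely on it being zero. */
bool isr_needs_zero_init(bool makes_call, bool isr_reads_zero_reg)
{
    return makes_call || isr_reads_zero_reg;
}
```

An ISR that only touches r24 and calls nothing would then save r24 alone and skip the push/clr/pop of r1 entirely.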


Oh, sorry, my bad. I had the sense of things inverted. Of course the compiler can do that (#1).


Even if subroutines are called, it should be possible to track down which registers are used, as long as the subroutines (and all the ones called from there) are available. Things get a little more time-consuming, however.
It's only if indirect calls are used that the compiler is lost.