[SOLVED] How to make volatile register variable

Total votes: 0

Hi,

I would like to make a 16-bit counter by incrementing a variable on every overflow of an 8-bit counter (on an ATtiny13). To save cycles I wanted to use a register variable and increment it in the ISR, so I declared it both volatile and register. Atmel Studio 7.0 gave me this warning:

Warning optimization may eliminate reads and/or writes to register variables [-Wvolatile-register-var]

Searching the internet, I found this is a known "feature". I also found a "workaround": using

__asm__ __volatile__("":: "r" (foo));

should force the compiler to consider foo updated, so it will not optimize it away. If I understand correctly, I need to place this statement before each access to foo wherever there is a risk that optimization will break it. Is that right? On the other hand, can I use this to make a non-register, non-volatile variable behave as volatile for the next access?

 

EDIT: the line above is wrong; look at the end of post #36 for the solution.

EDIT 2: On second thought, the line above is not necessarily "wrong". If I understand it right, it tells the compiler "don't optimize away previous writes to foo", while the line in post #36 tells it "foo may contain anything; you cannot guess it from the previous code". To make the register variable truly volatile,

__asm__ __volatile__("":: "r" (foo));

should be called after every write and

asm volatile("":"=r" (foo));

before every read of foo. Since neither is translated into any code (they just suppress optimizations), both can be issued wherever either is needed, via a single macro or function, hopefully without any harm.
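
For illustration, a single macro along those lines could look like the sketch below. It is an assumption, not something tested in this thread: FOO_SYNC is a made-up name, and it folds the "r" and "=r" forms into one "+r" (read-write) operand. The non-AVR branch is only there so the snippet also builds on a PC. Check the .lss to confirm nothing extra is generated on your build.

```c
#include <stdint.h>

#ifdef __AVR__
/* Counter bound to r5, as in this thread. */
register uint8_t foo __asm__("r5");
#else
/* Host stand-in so the sketch builds off-target (assumption, not AVR behaviour). */
static uint8_t foo;
#endif

/* Hypothetical helper: empty asm body with a read-write operand.
 * The compiler must have foo's current value ready before the statement
 * and may not assume anything about foo's value after it. */
#define FOO_SYNC() __asm__ __volatile__("" : "+r"(foo))

static inline void foo_write(uint8_t v)
{
    foo = v;
    FOO_SYNC();          /* the write counts as used, so it is not dropped */
}

static inline uint8_t foo_read(void)
{
    FOO_SYNC();          /* forget any cached or assumed value */
    return foo;
}
```

The value itself is left untouched by the empty asm; only the compiler's assumptions about it are reset.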

Last Edited: Fri. Aug 11, 2017 - 01:22 PM
Total votes: 0

Smajdalf wrote:
To save cycles

Did you actually check how many cycles the compiler uses when left to its own devices?

 

If this level of detail is really critical to you, then you should probably be writing it in assembler anyhow ...

Top Tips:

  1. How to properly post source code - see: https://www.avrfreaks.net/comment... - also how to properly include images/pictures
  2. "Garbage" characters on a serial terminal are (almost?) invariably due to wrong baud rate - see: https://learn.sparkfun.com/tutorials/serial-communication
  3. Wrong baud rate is usually due to not running at the speed you thought; check by blinking a LED to see if you get the speed you expected
  4. Difference between a crystal, and a crystal oscillator: https://www.avrfreaks.net/comment...
  5. When your question is resolved, mark the solution: https://www.avrfreaks.net/comment...
  6. Beginner's "Getting Started" tips: https://www.avrfreaks.net/comment...
Total votes: 0

Note also that if you bind a register you have to tell the compiler to build all the code in the project so it avoids using that register elsewhere, and the one thing you cannot change is prebuilt library code in libc (strlen, printf, etc.). That code is already built and may have used your chosen register, so if you use anything from the C library, grep an asm listing to ensure that the chosen register is not corrupted by anything happening in linked library code.

 

As to using both volatile and register binding: I understand the intent, but I have a feeling the compiler may treat it as implied volatile because of the binding anyway. AFAIK such variable binding can only be done at global scope (certainly file scope), so the fact that it's non-local presumably means the compiler is duty bound to access it anyway? It may be updated by an extern.

Last Edited: Tue. Aug 8, 2017 - 02:28 PM
Total votes: 0

Yes, I am writing the ISR in inline assembly.

Total votes: 1

Why not use an "unused" I/O register to hold your count, such as OCR0B or EEDR? Then you don't have to worry about reserving a CPU register, and it would be just as fast.

 

Jim

 

 

(Possum Lodge oath) Quando omni flunkus, moritati.

"I thought growing old would take longer"

 

Total votes: 0

clawson wrote:
As to using both volatile and register binding, ...

An interesting dilemma.

 

At the C source level, one can get tied up in knots at times with both "volatile" and "register" keywords.

 

OP implies a particular toolchain.  It might be different in other toolchains.  In particular, CodeVision will really respect "register", up to the limits of available registers for such variables.

 

IME with CV, an 8-bit "register" variable is inherently volatile.  The C standard says that what constitutes access is implementation defined...

J.3.10 Qualifiers
1 — What constitutes an access to an object that has volatile-qualified type (6.7.3)
 

Given this quick research, I'd agree that OP needs to understand how the particular toolchain handles the situation, and "make it work" accordingly--which OP did.

 

===============

Like the other respondents, the utility of this exercise is ... interesting.  I guess engineering is the art of making what you need from what you have.  [isn't there another very recent thread about "extending" timer(s)?]  There are a number of 'Freaks that love their Tiny13s; I've never felt the attraction.  The price has jumped around over the years, but the last time we considered it for a highish-volume cheap "mote" it was still close to a buck and wasn't cheap enough.  For smaller volumes where last penny doesn't matter, we choose a model with needed facilities.

 

OP's "save cycles" implies that the timer is running fast -- say, at /1.  Servicing an empty ISR takes about 12 cycles minimum, so about 5% of CPU.  Add something to the ISR and you might be up to 10%.  Pretty deterministic, probably, if no other interrupt sources are enabled.
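
Rough arithmetic behind those percentages, as a sketch. The 256-cycle period assumes the 8-bit timer really runs at /1, and the 24-cycle figure is just an assumed "ISR with something in it"; isr_load_per_mille is a made-up helper name.

```c
#include <stdint.h>

/* CPU share, in tenths of a percent (rounded down), spent in an ISR that
 * fires once per timer overflow.
 * period_cycles: CPU cycles between overflows (256 for an 8-bit timer
 * clocked at /1); must be non-zero. */
static uint16_t isr_load_per_mille(uint16_t isr_cycles, uint16_t period_cycles)
{
    return (uint16_t)(((uint32_t)isr_cycles * 1000u) / period_cycles);
}
```

With 12 cycles per 256 this gives 46 (about 4.6%), and an assumed 24 cycles gives 93 (about 9.3%).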

 

If the main loop is fast enough under all paths the overflow bit could be polled.

 

Pretty sparse I/O map to invent a "GPIOR".  Still would need to save GP register, pull it from I/O, increment, return to I/O, and restore.

 

 

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Total votes: 0

Smajdalf wrote:
Yes, I am writing the ISR in inline assembly.

inline assembly is always a bit of a kludge - why not just write it as a proper, separate assembler module?

Total votes: 0

awneil wrote:
inline assembly is always a bit of a kludge - why not just write it as a proper, separate assembler module?

If it is a good workaround, sometimes a bit of inline ASM isn't a bad thing.

 

And y'all know that I weigh in on the "C vs. ASM" topics, and indeed I'll write my tiny Tiny apps in C, generally.  But as the OP has chosen a model with only 500 words of program space, if trying to do something tricky/sophisticated that requires contortions in C, I'd be tempted to just make it an ASM app.  Perhaps use C to make the skeleton, or "mostly working" program.  Then lift the generated code and make the ASM app.

 

Any increment method will affect SREG flags, right?  So for speed, another register is needed to save-restore SREG.

Total votes: 1

 

If it "requires contortions", that's usually a pretty good indication that you're using the wrong tool...

Total votes: 0

Counting overflows in an ISR is not necessarily the best way to go about it:

https://www.avrfreaks.net/comment/583010#comment-583010

The method is not re-entrant.

That said, at most two copies of the function should be necessary,

one for ISRs and one for the main line.

Of course, nested ISRs could mess that up.

Note that ISR prologues often take more time than other function calls.

 

Note that using an IO register could be faster than SRAM (IN vs. LDS),

but slower than using R2 (IN vs. nothing).

In each case, the difference is two cycles.

Iluvatar is the better part of Valar.

Last Edited: Tue. Aug 8, 2017 - 03:41 PM
Total votes: 0

Voted for awneil & ki0bk.

Micro managing stuff like this is far too much effort for too little performance gain.

(If there is any gain at all. It will prevent GCC from doing its own optimisations.)

 

And if you are really this close to the limits of the uC used, you should probably choose another uC.

Doing magic with a USD 7 Logic Analyser: https://www.avrfreaks.net/comment/2421756#comment-2421756

Bunch of old projects with AVR's: http://www.hoevendesign.com

Total votes: 0

Just a general info:

If you want ASM (and if speed matters on a small chip like the tiny13, I would write everything in ASM!):

If you want to avoid atomicity problems, then in "main" use movw to read/write the 16-bit counter.
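
For C code that cannot rely on movw, one commonly used alternative (not the movw approach above, and not specific to register variables) is to read the high byte twice and retry if it changed. A minimal sketch, with read16_consistent as a made-up name; it takes pointers, so it would apply to counter bytes kept in SRAM or I/O space rather than to register-bound variables:

```c
#include <stdint.h>

/* Read a 16-bit value kept as two bytes that an ISR may update at any time.
 * The high byte is read before and after the low byte; if it changed in
 * between, a carry happened mid-read and the read is repeated. */
static uint16_t read16_consistent(const volatile uint8_t *lo,
                                  const volatile uint8_t *hi)
{
    uint8_t h1, l, h2;
    do {
        h1 = *hi;
        l  = *lo;
        h2 = *hi;
    } while (h1 != h2);
    return (uint16_t)(((uint16_t)h1 << 8) | l);
}
```

The other usual option is to disable interrupts around the two reads, at the cost of delaying the ISR by a few cycles.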

 

Total votes: 0

Seems you are using avr-gcc so use a pair of low registers, for example r6 and r7 (come on, who ever uses r6?), because gcc doesn't "like" to use them; they are less likely to be used by the libraries, even the ASM parts.

 

edit: sure, it costs 1 extra instruction to use a low register, but you are almost sure to have no conflicts.

    inc r6
    brne skip
    inc r7
skip:
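
In C terms the three instructions amount to the following (a host-side model for illustration only; struct counter16 and counter16_tick are made-up names, and on the AVR the two bytes would live in r6 and r7):

```c
#include <stdint.h>

struct counter16 {
    uint8_t lo;   /* plays the role of r6 */
    uint8_t hi;   /* plays the role of r7 */
};

/* Same carry logic as inc/brne/inc: bump the low byte, and bump the high
 * byte only when the low byte wrapped around to zero. */
static void counter16_tick(struct counter16 *c)
{
    if (++c->lo == 0) {
        ++c->hi;
    }
}
```

Keep in mind that INC changes SREG flags, so inside an ISR SREG still has to be saved and restored around it.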

 

Last Edited: Wed. Aug 9, 2017 - 11:23 AM
Total votes: 0

El Tangas wrote:
they are less likely to be used by the libraries
Really? ;-)

C:\SysGCC\avr\avr\lib\avr5>avr-objdump -S libc.a | grep r6
  d0:   34 01           movw    r6, r8
  d2:   06 15           cp      r16, r6
  e2:   53 01           movw    r10, r6
  ec:   b3 01           movw    r22, r6
 100:   c3 01           movw    r24, r6
 106:   35 01           movw    r6, r10
 140:   30 01           movw    r6, r0
 14c:   60 0e           add     r6, r16
 1aa:   b3 01           movw    r22, r6
 1b2:   a3 01           movw    r20, r6
 1b4:   c3 01           movw    r24, r6
 1be:   3c 01           movw    r6, r24
 1c6:   a3 01           movw    r20, r6
 1fc:   39 01           movw    r6, r18
 1fe:   60 0e           add     r6, r16
 20a:   23 01           movw    r4, r6
 21e:   6a 14           cp      r6, r10
 26c:   c3 01           movw    r24, r6
 276:   a3 01           movw    r20, r6
 296:   c3 01           movw    r24, r6
 2a8:   6b 80           ldd     r6, Y+3 ; 0x03
 2ac:   6a 14           cp      r6, r10
 2b8:   b3 01           movw    r22, r6
 2c0:   6b 80           ldd     r6, Y+3 ; 0x03
 30a:   86 18           sub     r8, r6
 396:   34 01           movw    r6, r8
 398:   06 15           cp      r16, r6
 3a8:   53 01           movw    r10, r6
 3b2:   b3 01           movw    r22, r6
 3c6:   c3 01           movw    r24, r6
 3cc:   35 01           movw    r6, r10
  7c:   61 2c           mov     r6, r1
 102:   68 16           cp      r6, r24
 108:   3c 01           movw    r6, r24
 124:   66 16           cp      r6, r22
 122:   66 08           sbc     r6, r6
 168:   a3 01           movw    r20, r6
 108:   66 08           sbc     r6, r6
 14e:   a3 01           movw    r20, r6
   a:   6d 92           st      X+, r6
  52:   6d 90           ld      r6, X+
   4:   6f 92           push    r6
  76:   3c 01           movw    r6, r24
  84:   c3 01           movw    r24, r6
  e0:   6f 90           pop     r6
  92:   61 2c           mov     r6, r1
  b0:   64 16           cp      r6, r20
 516:   93 01           movw    r18, r6
 548:   39 01           movw    r6, r18
 554:   6f 1a           sub     r6, r31
 568:   c3 01           movw    r24, r6
 112:   38 01           movw    r6, r16
 116:   6f 0e           add     r6, r31
 138:   83 01           movw    r16, r6
 142:   38 01           movw    r6, r16
 146:   6f 0e           add     r6, r31
 16e:   83 01           movw    r16, r6
 53c:   60 82           st      Z, r6
 550:   3c 01           movw    r6, r24
 560:   c3 01           movw    r24, r6

and I tried the same on all from R2 to R23 and they are all used (above/below that they  are DEFINITELY used!).

 

So if you are going to bind to a register and you use libc you need to grep a final listing to see if anything you actually pull in from libc is using a register you "reserved".

 

(NB CodeVision avoids this by rebuilding library code taking into account register reservations)

Total votes: 0

Ah well, guess it needs to be written in assembly, then.

 

edit: but wait, do you have statistics? I wonder which one is the less used.

Last Edited: Wed. Aug 9, 2017 - 11:33 AM
Total votes: 0

Actually I did that on "avr5" but tiny13 is really avr25 architecture. But I just repeated the exercise and there's extensive usage of R6 there too.

Total votes: 0

but there you have an atomicity problem, because the update is in the ISR and "main" can read/write r7:r6 directly.

Total votes: 0

Isn't that what we are talking about here? I was trying to see which registers are less likely to be used. My tests using the method in #14 show that it's r4 and r5.

Total votes: 0

check the GCC doc., there is a way to tell the compiler not to use some registers.

 

add:

And after that you place a register variable there.

Last Edited: Wed. Aug 9, 2017 - 12:30 PM
Total votes: 0

clawson wrote:
So if you are going to bind to a register and you use libc you need to grep a final listing to see if anything you actually pull in from libc is using a register you "reserved".
Besides libc there are libm and libgcc. I would suggest always doing that final grep (independent of libc usage).

Stefan Ernst

Total votes: 0

sparrow2 wrote:
check the GCC doc., there is a way to tell the compiler not to use some registers.
That would be the command line option -ffixed-N where N is the register it should not use in generated code. So -ffixed-7 would tell it to not use R7 etc.

 

That's one of the many useful gems on Georg-Johan's page at: https://gcc.gnu.org/wiki/avr-gcc...

Last Edited: Wed. Aug 9, 2017 - 12:47 PM
Total votes: 0

But all of these suggestions basically amount to fighting the tools.

 

Again: when you find yourself fighting the tools, that's usually a pretty good indication that you're using the wrong tool...

 

Total votes: 0

clawson wrote:
(NB CodeVision avoids this by rebuilding library code taking into account register reservations)

 

Hmmm--what is "library"?

 

How many real "libraries" would a Tiny13 app ever invoke?  From the standard C library, >>perhaps<< one of the math functions?  I'd think string family, and stdio, and probably stdlib as unlikely.  memset()?  Perhaps, but the memory spaces are so small...

 

Now, what about this CV rebuilding libraries to take into account register reservations?  I've never been aware of such; behind the curtains?  Global register variables take no special care, IME.  If you don't use them, the listing will show no uses for those registers.

 

Any function, whether intrinsic (EEPROM operations, math routines) or library, will always preserve and restore what is used, as necessary, AFAIK.

Total votes: 0

But fast C and a tiny13 are not a good combination. (As before, I would write it all in ASM.)

Total votes: 0

Lee I was pointing that out as a reason to consider using CV for this kind of thing. I thought the whole reason it builds lib code was for this very thing of not using a register the user has elected to reserve. Perhaps I got the details of that feature wrong? Seems like a good thing to me if that's what it does.

 

(less "tool battling").

Total votes: 0

I had to graph it. Here is the register usage of libc.a, avr25 version (only call-saved registers):

 

 

I used this windows batch file, pretty trivial except for r2 and r3:

avr-objdump -S libc.a | grep "r2[^^0-9]\|r2$" > r2.txt
avr-objdump -S libc.a | grep "r3[^^0-9]\|r3$" > r3.txt
avr-objdump -S libc.a | grep r4 > r4.txt
avr-objdump -S libc.a | grep r5 > r5.txt
avr-objdump -S libc.a | grep r6 > r6.txt
avr-objdump -S libc.a | grep r7 > r7.txt
avr-objdump -S libc.a | grep r8 > r8.txt
avr-objdump -S libc.a | grep r9 > r9.txt
avr-objdump -S libc.a | grep r10 > r10.txt
avr-objdump -S libc.a | grep r11 > r11.txt
avr-objdump -S libc.a | grep r12 > r12.txt
avr-objdump -S libc.a | grep r13 > r13.txt
avr-objdump -S libc.a | grep r14 > r14.txt
avr-objdump -S libc.a | grep r15 > r15.txt
avr-objdump -S libc.a | grep r16 > r16.txt
avr-objdump -S libc.a | grep r17 > r17.txt
avr-objdump -S libc.a | grep r28 > r28.txt
avr-objdump -S libc.a | grep r29 > r29.txt

 

Total votes: 0

Would a "grep -w r2" not have worked and been simpler then?
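
To illustrate what whole-word matching buys over a plain substring search, here is a small C model of the check (a sketch of the idea, not how grep is implemented; mentions_register is a made-up name):

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A "word" character in the grep -w sense: letter, digit or underscore. */
static bool is_word_char(char ch)
{
    return isalnum((unsigned char)ch) || ch == '_';
}

/* True if `line` contains `reg` (e.g. "r2") as a whole word, so that
 * "r2" does not match inside "r20" or "r25". */
static bool mentions_register(const char *line, const char *reg)
{
    size_t n = strlen(reg);
    if (n == 0) {
        return false;
    }
    for (const char *p = strstr(line, reg); p != NULL; p = strstr(p + 1, reg)) {
        bool left_ok  = (p == line) || !is_word_char(p[-1]);
        bool right_ok = !is_word_char(p[n]);   /* p[n] may be the '\0' terminator */
        if (left_ok && right_ok) {
            return true;
        }
    }
    return false;
}
```

So a line such as "movw r20, r6" counts for r6 and r20 but not for r2, which is the case the special patterns for r2 and r3 were working around.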

Total votes: 0

Probably. I don't use grep much.

Total votes: 0

Thank you for all your suggestions. I cannot say I understood everything. From my own experiments, the compiler does not care about the "volatile" keyword: it optimizes away conditions that depend on the register variable. Sadly, the "workaround" from my OP does not help:

#include <avr/wdt.h>
volatile register uint8_t foo asm("r5");
int main() {
	while (true) {
		foo=0; //comment out?
		PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;
		__asm__ __volatile__("":: "r" (foo));
		if (foo){
			PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;
		}
	}
}

This compiles to 60 bytes. If I comment out the foo=0; line, it compiles to 90 bytes. Looking into the .lss confirms that the "if" part is optimized away when foo is set to 0, even though there is plenty of time for an interrupt to update foo before it is checked. (I know no interrupts are enabled, but the compiler does not know that. I tried with interrupts enabled too, but I wanted to make a "minimal" example.) Does anyone know how I can force the compiler to treat foo as volatile?

Although I know the C compiler is horribly inefficient, I would like to keep it for the non-critical parts and only help it with ASM where needed. (Example: binding this variable to r5 saves 40 bytes in my program, yet without the binding r5 is never used.) But the compiler seems quite resistant.

Total votes: 0

Smajdalf wrote:
Does anyone know how can I force the compiler to work with foo as volatile?
Not directly.

Though register volatile does make sense, their intended purposes conflict,

so most compilers, including gcc, reject the combination by design.

You might access foo exclusively through in line assembly.

Total votes: 0

Image may contain: text

Total votes: 0

Probably you will have to write asm functions to load/store the register variable (basically move the register to/from r24).
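
A minimal sketch of such accessors, assuming the variable lives in r5 as in the example above and that the compiler is kept away from r5 by other means (for instance the -ffixed option mentioned earlier). foo_load and foo_store are made-up names, and the non-AVR branch is only a stand-in so the snippet builds on a PC:

```c
#include <stdint.h>

#ifdef __AVR__
/* Copy r5 into whatever register the compiler picks for the result. */
static inline uint8_t foo_load(void)
{
    uint8_t v;
    __asm__ __volatile__("mov %0, r5" : "=r"(v));
    return v;
}

/* Copy a value from a compiler-chosen register into r5. */
static inline void foo_store(uint8_t v)
{
    __asm__ __volatile__("mov r5, %0" : : "r"(v));
}
#else
/* Host stand-in (assumption): a volatile byte plays the role of r5. */
static volatile uint8_t fake_r5;

static inline uint8_t foo_load(void)
{
    return fake_r5;
}

static inline void foo_store(uint8_t v)
{
    fake_r5 = v;
}
#endif
```

Because each asm statement is volatile and names r5 explicitly, the compiler should not be able to substitute a remembered value for the read. The cost is one mov per access.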

Total votes: 0

El Tangas wrote:
Probably you will have to write asm functions to load/store the register variable

LOL - so back to #2: just do it in assembler!!

 

 

Total votes: 0

@awneil: What is the "easy solution"? Writing everything in ASM? Or migrating to more powerful hardware, only because I don't know how to force C to do what I want? I like ASM and I think I can use it, but it is very time consuming. I would really like to learn how to use inline ASM in C(++).

skeeve wrote:
their intended purposes conflict

What are their intended purposes? As I understand it, "volatile" marks a variable that can change at any time (e.g. via an interrupt, or because it is a register manipulated by hardware outside the CPU), and "register" means "don't save the variable to RAM, keep it in a register all the time" because it is used a lot. In fact it makes perfect sense to bind a volatile variable to a register: the price of "reloading" it is not high, and I would expect the compiler to do it on its own. But no, in small programs the compiler often uses RAM and leaves about 20 registers that are never used...

Total votes: 1

Exactly how tight is your timing constraint here anyway? Assuming you could achieve the goal, are the 2 cycles (or whatever it is) saved by not doing an LDS really going to change the whole thing from not achievable to finally doable?

 

Anyway, as others have said, it's a 1K micro: 512 opcodes. You can't really be doing anything so complex in 512 opcodes that it demands the use of C or C++ anyway? If it truly is so timing constrained that a couple of cycles matter, then you will give yourself a whole lot more headroom by doing the entire thing in Asm.

 

But if it is the case that 2 cycles matter would you not consider a faster running CPU? Oh and if speed is of the essence I assume you've already wound it up to 20MHz (or perhaps even more if you want to risk over-clocking for a one-off?)? If not, why not, is that not simpler than frigging about in the C code? Or is this battery powered and speed/energy matter?

Total votes: 0

@clawson: Sorry, I don't understand this. You say it is OK to write poor code because I can always get a faster CPU. Or more hardware. Or both. But that is NOT how I want to write my programs. I realize it is how programs are generally written (from my point of view as a user, Win 10 on contemporary hardware is about as fast as Win 95 used to be on hardware ten times weaker; "upgrades" of software like browsers make a fixed computer slower and slower without any added functionality, and so on). It may or may not be critical to save the cycles in this application. But if it is possible and I learn it today, then every time I encounter this in future I will save a few CK "for free". It is a small benefit on its own, but a few CK here and there and ...

 

EDIT:

It looks like my source (I cannot find it now) or my copying was "nearly accurate". This

asm volatile("":"=r" (foo));

seems to work. After reading up on what it means, it actually makes much more sense, and now I see that the statement in my OP could never work.

Last Edited: Wed. Aug 9, 2017 - 06:56 PM
Total votes: 0

clawson wrote:

Lee I was pointing that out as a reason to consider using CV for this kind of thing.

 

CV only uses a subset of the 32 registers in generated code. It also has a tick-box option called Automatic Global Register Allocation...

 

I've just run a compile on a fairly large (16k of binary) program and here is the register usage...

ATmega328P register use summary:
r0 :  62 r1 :  20 r2 :   0 r3 :   0 r4 :   0 r5 :   0 r6 :   0 r7 :   0
r8 :   0 r9 :   0 r10:   0 r11:   0 r12:   0 r13:   0 r14:   0 r15:  11
r16: 175 r17: 150 r18: 107 r19: 192 r20: 132 r21: 101 r22: 104 r23:  85
r24: 162 r25:  33 r26: 603 r27: 301 r28: 137 r29:   1 r30:1617 r31: 672
x  :  50 y  : 727 z  :  15

 

Quote:

The registers in the range R2 to R14, not used for bit variables, can be automatically allocated to char and int global variables and global pointers by checking the Automatic Global Register Allocation check box.

The ability to force a variable into a register(s), even when writing straight C, can offer some optimisation benefits.

 

For example, this code...

 

//clock the shift register
shift_register = (shift_register << 1);
// put the bit into the shift register
if (dcc_bit) {shift_register = (shift_register | 0x0001);}
//check for preamble in shift register
if ((shift_register & PREAMBLE_MASK) == PREAMBLE)
{

 

...normally generates this...

 

;0000 0052 //clock the shift register
;0000 0053 shift_register = (shift_register << 1);
LDS  R30,_shift_register
LDS  R31,_shift_register+1
LSL  R30
ROL  R31
STS  _shift_register,R30
STS  _shift_register+1,R31
;0000 0054
;0000 0055 // put the bit into the shift register
;0000 0056 if (dcc_bit) {shift_register = (shift_register | 0x0001);}
SBRS R15,0
RJMP _0x3
LDS  R30,_shift_register
LDS  R31,_shift_register+1
ORI  R30,1
STS  _shift_register,R30
STS  _shift_register+1,R31
;0000 0057
;0000 0058 //check for preamble in shift register
;0000 0059 if ((shift_register & PREAMBLE_MASK) == PREAMBLE)
_0x3:
LDS  R30,_shift_register
LDS  R31,_shift_register+1
ANDI R31,HIGH(0x7FF)
CPI  R30,LOW(0x7FE)
LDI  R26,HIGH(0x7FE)
CPC  R31,R26
BRNE _0x4

 

...but if I do this...

 

register unsigned int shift_register @2;

 

...then I get...

 

;0000 0050 //clock the shift register
;0000 0051 shift_register = (shift_register << 1);
LSL  R2
ROL  R3
;0000 0052
;0000 0053 // put the bit into the shift register
;0000 0054 if (dcc_bit) {shift_register = (shift_register | 0x0001);}
SBRS R15,0
RJMP _0x3
LDI  R30,LOW(1)
OR   R2,R30
;0000 0055
;0000 0056 //check for preamble in shift register
;0000 0057 if ((shift_register & PREAMBLE_MASK) == PREAMBLE)
_0x3:
MOVW R30,R2
ANDI R31,HIGH(0x7FF)
CPI  R30,LOW(0x7FE)
LDI  R26,HIGH(0x7FE)
CPC  R31,R26
BRNE _0x4

 

...which is well worthwhile when that code executes inside an ISR with a fast repetition rate.

#1 Hardware Problem? https://www.avrfreaks.net/forum/...

#2 Hardware Problem? Read AVR042.

#3 All grounds are not created equal

#4 Have you proved your chip is running at xxMHz?

#5 "If you think you need floating point to solve the problem then you don't understand the problem. If you really do need floating point then you have a problem you do not understand."

Total votes: 0

Brian Fairchild wrote:
The ability to force a variable into a register(s), even when writing straight C, can offer some optimisation benefits.

I've seen code size reduction over 10% in a Mega48-class app.  I'll make a guess as to most-used variables -- "state"; "inputs_state"; ... -- and/or make a pass after most of the code is in place.  Might as well make good use of

r2 :   0 r3 :   0 r4 :   0 r5 :   0 r6 :   0 r7 :   0
r8 :   0 r9 :   0 r10:   0 r11:   0 r12:   0 r13:   0 r14:   0

In addition, knowing that R16-R22 (or is it R23) are used for function locals, in declaration order and "as suitable", means that main() 8-bit "scratch" and 16-bit "worknum" can be used hard as temporaries in operations, and know that they are living in high registers making for good AVR code.

 

But kind of beside the point for this thread, perhaps.  Sometimes one must try to find the "best" way to do an important piece of code.  Lots of threads here where that has been done.  And IME it was important on PDP8 and '6809 and x86 and ...

 

 

Total votes: 0

Well, you have a kind of solution, at least. To make it prettier, you could do this:

 


#include <avr/wdt.h>
register volatile uint8_t foo asm ("r5");

inline void volatize(){
    asm volatile ("": "=r" (foo));
}

#define FOO volatize(), foo

int main() {
    while (true) {
        FOO=0; //comment out?
        PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;
        if (FOO){
            PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;PINB=11;
        }
    }
}

 

The volatize() function does nothing, hopefully, except tell the compiler that foo may just have been written. So, using the comma operator, we can define a FOO that acts almost like a volatile foo.
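
One thing to watch with a comma-operator macro is precedence: unparenthesized, a statement like x = FOO; parses as (x = volatize()), foo. Wrapping the expansion in parentheses fixes reads, but in C (unlike C++) a comma expression is not an lvalue, so the parenthesized form cannot be assigned to. A host-side sketch of the read-only variant, with made-up names (touch, FOO_VALUE) and a plain variable standing in for the register:

```c
#include <stdint.h>

static uint8_t foo;          /* stand-in for the register variable */
static unsigned touches;     /* counts how often the hook ran */

/* Stand-in for volatize(): here it only records that it was called. */
static inline void touch(void)
{
    ++touches;
}

/* Parenthesized so that `x = FOO_VALUE;` parses as x = (touch(), foo). */
#define FOO_VALUE (touch(), foo)

static uint8_t read_twice_sum(void)
{
    /* The hook runs once before each of the two reads. */
    return (uint8_t)(FOO_VALUE + FOO_VALUE);
}
```

Writes would then go through a separate macro or function rather than through FOO_VALUE.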

Total votes: 0

This thread quickly helped me solve the same type of optimization "problem" as described by the OP.

 

As the OP mentioned, the GCC compiler optimizes-away (I tried -O1, -O2, -O3, -Os) register variable reads. I ended up using a macro to declare both the register variable and a read function (no additional generated code) to force a read of that variable:

 

#include <stdint.h>
#include <avr/io.h>

// Work-around to force GCC compiler to consider register variable as volatile for reads.
// The compiler seems to preserve writes to a register variable as if it were volatile,
// but optimizes-away (-O1, -O2, -O3, -Os) reads unless this work-around is used.

#define DECLARE_VOLATILE_REGISTER( varType, varName, regName )\
    register varType varName asm(#regName);\
    static inline varType ReadVolatileReg_##varName( void )\
    {\
        asm volatile( "" : "=r" (varName) : : );    /* fake write to register variable */\
        return varName;\
    }

DECLARE_VOLATILE_REGISTER( int8_t, foo, r6 );
DECLARE_VOLATILE_REGISTER( uint8_t, bar, r7 );

int main( void )
{
    foo = 12;	// write does not get optimized-away
    bar = 34;	// write does not get optimized-away

    while ( ReadVolatileReg_foo() < 100 )
    {
        // foo and bar are updated in ISR in external file

        // PORTA = bar;  // normal read: compiler does not read bar; instead writes 34 to PORTA
        PORTA = ReadVolatileReg_bar();  // force "volatile": compiler reads bar first
    }
}

 

 

I observed that the GCC compiler seems to preserve writes to a register variable as if it were volatile. Therefore I did not find it necessary to use the "fake read" described by the OP:

asm volatile ( "" : : "r" (foo) : );      /* fake read from register variable */

 

Has anyone seen the compiler optimize-away register variable writes?

 

Last Edited: Wed. Jul 8, 2020 - 08:28 AM