wek
PostPosted: Jun 09, 2010 - 08:18 PM
Raving lunatic


Joined: Dec 16, 2005
Posts: 3086
Location: Bratislava, Slovakia

Programs contain sequences of statements, and a naive compiler would generate code that executes them exactly in the order in which they are written. But an optimizing compiler is free to reorder the statements - or even parts of them - if the resulting "net effect" is the same. The "measure" of the "net effect" is what the standard calls "side effects", and is accomplished exclusively through accesses (reads and writes) to variables tagged as volatile. So, as long as all volatile reads and writes are to the same addresses and in the same order (and writes write the same values), the program is correct, regardless of other operations in it. (One important point to note here is that the time elapsed between consecutive volatile accesses is not considered at all.)

Unfortunately, there are also operations which are not covered by volatile accesses. An example of this in avr-gcc/avr-libc are the cli()/sei() macros defined in <avr/interrupt.h>, which translate directly to the respective assembler mnemonics through the __asm__() statement. These don't constitute a variable access at all, not even a volatile one, so the compiler is free to move them around. Although there is a "volatile" qualifier which can be attached to the __asm__() statement, its effect on (re)ordering is not clear from the documentation (it more likely only prevents complete removal by the optimiser), which states, among other things:
Note that even a volatile asm instruction can be moved relative to other code, including across jump instructions. [...] Similarly, you can't expect a sequence of volatile asm instructions to remain perfectly consecutive.

There is another mechanism which can be used to achieve something similar: memory barriers. A barrier is created by adding the special "memory" clobber to the assembler statement; it ensures that all variables are flushed from registers to memory before the statement and re-read after it. The purpose of memory barriers is slightly different from enforcing code ordering: they are supposed to ensure that no variables are "cached" in registers, so that it is safe to change the content of registers, e.g. when switching context in a multitasking OS. (On "big" processors with out-of-order execution, barriers also imply the use of special instructions which force the processor into an "in-order" state; this is not the case on AVRs.)
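A minimal sketch of such a barrier, assuming GCC-style inline assembly. The names memory_barrier(), shared_flag and set_then_reload() are made up for illustration and are not part of avr-libc. The empty asm statement emits no instruction; only the "memory" clobber matters, and it is expected to force the store before it and the reload after it:

```c
#include <stdint.h>

/* Compiler-level barrier: empty asm statement with a "memory" clobber.
   Hypothetical name, chosen for this sketch only. */
#define memory_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical shared variable, deliberately NOT volatile. */
uint8_t shared_flag;

uint8_t set_then_reload(void)
{
    shared_flag = 1;      /* expected to be written to memory before the barrier */
    memory_barrier();
    return shared_flag;   /* expected to be re-read from memory after the barrier */
}
```

Note that this only concerns what is kept in memory versus registers around the barrier; it says nothing about where unrelated computations end up.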

A memory barrier works well in ensuring that all volatile accesses before and after the barrier occur in the given order with respect to the barrier. However, it does not prevent the compiler from moving non-volatile-related statements across the barrier. Peter Dannegger provided a nice example of this effect:
Code:
#define cli() __asm volatile( "cli" ::: "memory" )
#define sei() __asm volatile( "sei" ::: "memory" )

unsigned int ivar;

void test2( unsigned int val )
{
  val = 65535U / val;

  cli();

  ivar = val;

  sei();
}
compiles with optimisations switched on (-Os) to
Code:
00000112 <test2>:
 112:   bc 01          movw   r22, r24
 114:   f8 94          cli
 116:   8f ef          ldi   r24, 0xFF   ; 255
 118:   9f ef          ldi   r25, 0xFF   ; 255
 11a:   0e 94 96 00    call   0x12c   ; 0x12c <__udivmodhi4>
 11e:   70 93 01 02    sts   0x0201, r23
 122:   60 93 00 02    sts   0x0200, r22
 126:   78 94          sei
 128:   08 95          ret
where the potentially slow division is moved across cli(), resulting in interrupts being disabled longer than intended. Note that the write to ivar still occurs in order with respect to cli()/sei(), so the "net effect" required by the standard is achieved as intended; it is "only" the timing which is off. However, for most embedded applications, timing is an important, sometimes critical factor.

Unfortunately, at the moment, neither avr-gcc nor the C standard provides a mechanism to enforce a complete match between written and executed code ordering - except maybe switching optimization off completely (-O0), or writing all the critical code in assembly.

To sum it up:
  • memory barriers ensure proper ordering of volatile accesses
  • memory barriers don't prevent statements with no volatile accesses from being reordered across the barrier



[This article was written as supporting documentation for related items in avr-libc - the sei()/cli() macros in <avr/interrupt.h>, the ATOMIC_BLOCK mechanism in <util/atomic.h> and the newly introduced _MemoryBarrier() in <avr/cpufunc.h>. It also drew from http://www.avrfreaks.net/index.php?name ... ighlight=.]

Comments please. Thanks.

Jan Waclawek

[edit] fixed the last link


Last edited by wek on Jun 09, 2010 - 09:54 PM; edited 1 time in total
 
theusch
PostPosted: Jun 09, 2010 - 08:34 PM
10k+ Postman


Joined: Feb 19, 2001
Posts: 25889
Location: Wisconsin USA

Quote:

The "measure" of the "net effect" is what the standard calls "side effects", and is accomplished exclusively through accesses (reads and writes) to variables tagged as volatile.

I cannot agree with this, especially the "exclusively" part. The optimization situations that you allude to existed long before microcontrollers and "volatile". You start with basic blocks first, mix well with a portion of sequence points, and add a dash of register lifetime. Volatile is like the sprinkles on the top of the cupcake icing. (generalizations based on past live(s) doing compilers and the like)

However, I can certainly see your point as you expand on the particular situation(s) being addressed.
 
Bingo600
PostPosted: Jun 09, 2010 - 09:20 PM
Raving lunatic


Joined: Apr 25, 2004
Posts: 3808
Location: Denmark

@Jan

Tell the poor compiler that you insist on using val "NOW ....."

Code:

#define cli() __asm volatile( "cli" ::: "memory" )
#define sei() __asm volatile( "sei" ::: "memory" )


unsigned int ivar;

void test2( unsigned int val )
{

  val = 65535U / val;

  asm volatile ("" : : "b"(val): "memory");


  cli();

  ivar = val;

  sei();
}


Code:

test2:
/* prologue: function */
/* frame size = 0 */
   movw r22,r24
   ldi r24,lo8(-1)
   ldi r25,hi8(-1)
   rcall __udivmodhi4
   movw r30,r22
/* #APP */
 ;  23 "testw.c" 1
   cli
 ;  0 "" 2
/* #NOAPP */
   sts (ivar)+1,r23
   sts ivar,r22
/* #APP */
 ;  27 "testw.c" 1
   sei
 ;  0 "" 2
/* epilogue start */
/* #NOAPP */
   ret


Some hints here

http://blog.regehr.org/archives/28

http://www.cs.utah.edu/~regehr/papers/e ... eprint.pdf

And I think Dean or Danni also mentioned this trick.


Btw: I'm not 100% sure about the "b" in
Code:
: "b"(val):


I used b to get it to shut up :)

And you don't access the var at all, you just tell the compiler that you want to use/access it.

/Bingo

Ps: If you complain about the
Code:
 movw r30,r22


Then "Put the beast out of its misery" and go buy an IAR compiler :)

Wonder where it came from ... :)
Was it the "b" access ... ?

.
 
wek
PostPosted: Jun 09, 2010 - 09:50 PM
Raving lunatic


Joined: Dec 16, 2005
Posts: 3086
Location: Bratislava, Slovakia

Bingo600 wrote:

asm volatile ("" : : "b"(val): "memory");

Well, some form of "volatilisation" should help (btw. "b" is not a particularly good choice in this case, standing for "Base pointer register (r28–r31)"; the register allocator might feel pressed in a more convoluted situation and move things around unnecessarily... or not, as it's a pretty unpredictable beast... :) ). Danni's solution is also a form of "volatilisation", except he did it on the other side of the barrier, involving C-ish cast magic, http://www.avrfreaks.net/index.php?name ... 85#672085.

But my point is slightly different: the article is NOT intended to provide a solution, it is intended to WARN about the effect. Once you KNOW this may happen, you will be aware of it when hunting down the subtle bugs it may cause.

Jan
 
Bingo600
PostPosted: Jun 09, 2010 - 09:55 PM
Raving lunatic


Joined: Apr 25, 2004
Posts: 3808
Location: Denmark

@Jan

Well, I'm not an assembler guru, but I take it that you get the hint ...
Tell gcc that you want to use the variable as in the above, but maybe with the right "magic" instead of "b".
Then it will deliver it at that point.

But you are right that it would be optimal if the plain ::: "memory" could have done it.
But I do think a volatile would help, see the link below.

There was a previous issue here
http://www.avrfreaks.net/index.php?name ... mp;t=89378


/Bingo
 
Bingo600
PostPosted: Jun 09, 2010 - 10:01 PM
Raving lunatic


Joined: Apr 25, 2004
Posts: 3808
Location: Denmark

Seems like the "r" aka. any register will get rid of the extra code.

Code:
void test2( unsigned int val )
{

  val = 65535U / val;

  asm volatile ("" : : "r"(val): "memory");


  cli();

  ivar = val;

  sei();
}


Code:

test2:
/* prologue: function */
/* frame size = 0 */
   movw r22,r24
   ldi r24,lo8(-1)
   ldi r25,hi8(-1)
   rcall __udivmodhi4
/* #APP */
 ;  23 "testw.c" 1
   cli
 ;  0 "" 2
/* #NOAPP */
   sts (ivar)+1,r23
   sts ivar,r22
/* #APP */
 ;  27 "testw.c" 1
   sei
 ;  0 "" 2
/* epilogue start */
/* #NOAPP */
   ret




Could any of the "C" gurus make a TOUCH(val) macro out of this one : asm volatile ("" : : "r"(val): "memory");

Could come in handy ....
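One possible shape for such a macro, as a sketch only. TOUCH(), irq_off() and irq_on() are hypothetical names, not avr-libc API. On non-AVR hosts the cli/sei instructions are replaced by empty barriers so the snippet still compiles. The "r" constraint assumes the value fits in registers, and, as discussed earlier in the thread, nothing in the GCC documentation promises that this placement holds in every situation, so the generated code still has to be checked:

```c
/* Tell the compiler the value of x is "used" at this point.
   Hypothetical helper based on the asm statement shown above. */
#define TOUCH(x) __asm__ __volatile__("" : : "r"(x) : "memory")

#ifdef __AVR__
#define irq_off() __asm__ __volatile__("cli" ::: "memory")
#define irq_on()  __asm__ __volatile__("sei" ::: "memory")
#else
/* Host stand-ins: barriers only, no interrupt control. */
#define irq_off() __asm__ __volatile__("" ::: "memory")
#define irq_on()  __asm__ __volatile__("" ::: "memory")
#endif

unsigned int ivar;

void test2(unsigned int val)
{
    val = 65535U / val;   /* slow division, intended to run with interrupts enabled */
    TOUCH(val);           /* value must exist in a register here */
    irq_off();
    ivar = val;
    irq_on();
}
```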

/Bingo
 
wek
PostPosted: Jun 09, 2010 - 10:05 PM
Raving lunatic


Joined: Dec 16, 2005
Posts: 3086
Location: Bratislava, Slovakia

theusch wrote:
Quote:

The "measure" of the "net effect" is what the standard calls "side effects", and is accomplished exclusively through accesses (reads and writes) to variables tagged as volatile.

I cannot agree with this, especially the "exclusively" part.

That's just a fancy word... ;)

Lee wrote:
The optimization situations that you allude to existed long before microcontrollers and "volatile". You start with basic blocks first, mix well with a portion of sequence points, and add a dash of register lifetime. Volatile is like the sprinkles on the top of the cupcake icing. (generalizations based on past live(s) doing compilers and the like)

This all comes from C99 5.1.2.3. Sure, I've committed an (over)simplification, but I don't think the description of the problem needs to go into further detail (which I think is the same as what you said in the following:)

Lee wrote:
However, I can certainly see your point as you expand on the particular situation(s) being addressed.


You are free to suggest different wording, of course.

Jan
 
Bingo600
PostPosted: Jun 09, 2010 - 10:11 PM
Raving lunatic


Joined: Apr 25, 2004
Posts: 3808
Location: Denmark

wek wrote:

But my point is slightly different: the article is NOT intended to provide a solution, it is intended to WARN about the effect. Once you KNOW this may happen, you will be aware of it when hunting down the subtle bugs it may cause.

Jan


Ahh... I get it (now...)
It was a "warning/info" not a "how to avoid" question ..

Sorry ... But a "TOUCH" macro could come in handy anyways.

Even though one probably still has to check if "the beast" does what "You think/expect you have told it to do" :)

/Bingo
 
ArnoldB
PostPosted: Jun 10, 2010 - 06:54 AM
Raving lunatic


Joined: Nov 29, 2007
Posts: 3219


Do I get it right, you are trying to move an age-old avr-libc discussion to avrfreaks, to build up some pressure on the decision makers there?
 
wek
PostPosted: Jun 10, 2010 - 07:48 AM
Raving lunatic


Joined: Dec 16, 2005
Posts: 3086
Location: Bratislava, Slovakia

ArnoldB wrote:
Do I get it right, you are trying to move an age-old avr-libc discussion to avrfreaks, to build up some pressure on the decision makers there?
Well, there are no real decisions made there, as far as substantial issues of gcc are concerned.

I'm just trying to document the status quo.

JW
 
dl8dtl
PostPosted: Jun 10, 2010 - 10:47 AM
Raving lunatic


Joined: Dec 20, 2002
Posts: 7276
Location: Dresden, Germany

I've asked Jan to discuss this here, so his contribution can go into
the avr-libc documentation, as what's currently there might be a
little terse if you never thought about all those details.

_________________
Jörg Wunsch

Please don't send me PMs, use email if you want to approach me personally.
Please read the `General information...' article before.
 
anders_m
PostPosted: Jun 10, 2010 - 01:50 PM
Wannabe


Joined: Dec 18, 2009
Posts: 79


One thing I've thought about is a new "reorder_barrier" GCC function attribute. The intended effect would be to prevent the compiler from moving any code across a call to a function with that attribute, i.e. from the programmer's point of view basically the same effect as a memory clobber but not quite as expensive.

However, I'm not enough of a language lawyer to judge just how bad an idea this would be.
 
theusch
PostPosted: Jun 10, 2010 - 04:10 PM
10k+ Postman


Joined: Feb 19, 2001
Posts: 25889
Location: Wisconsin USA

I haven't dug into the internals, but if the volatile accesses were made a "sequence point"--or the equivalent in the GCC approach--then the code couldn't be moved outside of its "area". The shortest analogy I can think of is a label in assembler that is the target of a branch/jump/call/... . One must be careful about placing code before/after this destination. This doesn't affect register contents at the "fall through" so may not be as brutal as the memory clobber mentioned.
 
ArnoldB
PostPosted: Jun 10, 2010 - 07:33 PM
Raving lunatic


Joined: Nov 29, 2007
Posts: 3219


And what is wrong with first starting to support the built-ins GCC already has? Or, in case they are already reasonably supported for AVRs, spread the news about them?

In particular I am talking about

http://gcc.gnu.org/onlinedocs/gcc-4.3.5 ... ltins.html

anders_m wrote:
One thing I've thought about is a new "reorder_barrier" GCC function attribute.

For those asking for just a memory barrier, let me point you to __sync_synchronize() on the above mentioned page.

I really don't see the point of inventing new syntax, new attributes or new semantics. Get the __sync_*() stuff running and you are in business. No point in letting the not-invented-here syndrome take over.

And if you don't like the function names, wrap them in convenience macros.
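Such a wrapper could look roughly like this. full_barrier() and store_twice() are hypothetical names picked for the sketch; __sync_synchronize() is the GCC built-in from the page linked above. Whether a given target emits an actual instruction for it is target-dependent:

```c
/* Convenience wrapper around the GCC built-in (hypothetical name). */
#define full_barrier() __sync_synchronize()

int a;

void store_twice(void)
{
    a = 1;            /* this store is expected to survive because of the barrier */
    full_barrier();
    a = 2;
}
```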

And I hate it when things that are named like an assembler opcode (e.g. cli(), sbi(), or nop()) do even an iota more than the corresponding opcodes.
 
wek
PostPosted: Jun 10, 2010 - 09:41 PM
Raving lunatic


Joined: Dec 16, 2005
Posts: 3086
Location: Bratislava, Slovakia

ArnoldB wrote:
Get the __sync_*() stuff running and you are in business.
Good point. Do you know of any timeline for when this will appear in avr-gcc, more precisely, in WinAVR/its successor?

JW
 
theusch
PostPosted: Jun 10, 2010 - 10:02 PM
10k+ Postman


Joined: Feb 19, 2001
Posts: 25889
Location: Wisconsin USA

Hmmm--my amateur reading of the C standard says to me that if the sei() and cli() are considered volatile (I'd think they should be as it is the same as modifying the SREG volatile sfr, right?), then the following should never get out of order:

Code:

cli();
frog = dog;
sei();


I use this draft for reference:
http://www.open-std.org/jtc1/sc22/wg14/ ... /n1124.pdf

Searching for sequence point, especially as related to volatile, I find:
Quote:
5.1.2.3 Program execution

The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant.

Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. (A summary of the sequence points is given in annex C.)

As the cli() is an access of a volatile object (at least I'd think that), things look pretty clear, as "frog = dog;" is an "expression statement" and Annex C lists
Quote:
the expression in an expression statement (6.8.3);
as a sequence point, and 6.8.3 is the expression statement.
 
ArnoldB
PostPosted: Jun 10, 2010 - 10:04 PM
Raving lunatic


Joined: Nov 29, 2007
Posts: 3219


The __sync_*() built-ins appeared in GCC around v4.1.

I pointed to the 4.3.5 documentation because that is a current GCC version for Linux avr-gcc, and WinAVR is surely not much different.

So the built-ins have been in GCC for some time now. I don't know if they have been implemented for the AVR target. If they have, why not use them? If they haven't, avr-libc could implement those that make sense under the corresponding __sync_*_n names.
 
ArnoldB
PostPosted: Jun 10, 2010 - 10:27 PM
Raving lunatic


Joined: Nov 29, 2007
Posts: 3219


I just ran
Code:

int a;

void f() {
   a = 1;
   __sync_synchronize();
   a = 2;
}

void g() {
   a = 1;
   a = 2;
}
Through a 4.3.3 avr-gcc with -Os optimization and no other options.
Code:

   .file   "a.c"
__SREG__ = 0x3f
__SP_H__ = 0x3e
__SP_L__ = 0x3d
__tmp_reg__ = 0
__zero_reg__ = 1
   .global __do_copy_data
   .global __do_clear_bss
   .text
.global   g
   .type   g, @function
g:
/* prologue: frame size=0 */
/* prologue end (size=0) */
   ldi r24,lo8(2)
   ldi r25,hi8(2)
   sts (a)+1,r25
   sts a,r24
/* epilogue: frame size=0 */
   ret
/* epilogue end (size=1) */
/* function g size 7 (6) */
   .size   g, .-g
.global   f
   .type   f, @function
f:
/* prologue: frame size=0 */
/* prologue end (size=0) */
   ldi r24,lo8(1)
   ldi r25,hi8(1)
   sts (a)+1,r25
   sts a,r24
   ldi r24,lo8(2)
   ldi r25,hi8(2)
   sts (a)+1,r25
   sts a,r24
/* epilogue: frame size=0 */
   ret
/* epilogue end (size=1) */
/* function f size 15 (14) */
   .size   f, .-f
   .comm a,2,1
/* File "a.c": code   22 = 0x0016 (  20), prologues   0, epilogues   2 */
Looks like __sync_synchronize(); is implemented.
 
wek
PostPosted: Jun 10, 2010 - 10:45 PM
Raving lunatic


Joined: Dec 16, 2005
Posts: 3086
Location: Bratislava, Slovakia

Sounds like good news.
Is that the avr-gcc which comes with WinAVR20100110?
Could you please try it also on the example above (I am not at a computer with avr-gcc)?
Are the other __sync_xxx() functions also implemented? If yes, how do they achieve atomicity - through cli()/sei()?

Thanks,

JW
 
ArnoldB
PostPosted: Jun 10, 2010 - 11:37 PM
Raving lunatic


Joined: Nov 29, 2007
Posts: 3219


wek wrote:
Sounds like good news.
Is that the avr-gcc which comes with WinAVR20100110?
Could you please try it also on the example above (I am not at a computer with avr-gcc)?
Are the other __sync_xxx() functions also implemented? If yes, how do they achieve atomicity - through cli()/sei()?
That is an avr-gcc build with the Bingo scripts on Linux.

Regarding the original example, it doesn't work. I think the reason is the same as for the version without sync, the compiler just doesn't consider cli or sei to be memory operands, so it doesn't care.
Code:

test2:
/* prologue: frame size=0 */
/* prologue end (size=0) */
        mov r22,r24
        mov r23,r25
/* #APP */
        cli
/* #NOAPP */
        ldi r24,lo8(-1)
        ldi r25,hi8(-1)
        rcall __udivmodhi4
        sts (ivar)+1,r23
        sts ivar,r22
/* #APP */
        sei
/* #NOAPP */
/* epilogue: frame size=0 */
        ret
/* epilogue end (size=1) */
/* function test2 size 16 (15) */
        .size   test2, .-test2
        .comm ivar,2,1
/* File "b.c": code   16 = 0x0010 (  15), prologues   0, epilogues   1 */


Let's see what __sync_add_and_fetch() does
Code:

int a;
void f() {
   a = 1;
   __sync_add_and_fetch(&a, 4);
   a = 2;
}

void g() {
   a = 1;
   a += 4;
   a = 2;
}

int main() {
   return 1;
}
Code:

   .file   "c.c"
__SREG__ = 0x3f
__SP_H__ = 0x3e
__SP_L__ = 0x3d
__tmp_reg__ = 0
__zero_reg__ = 1
   .global __do_copy_data
   .global __do_clear_bss
   .text
.global   g
   .type   g, @function
g:
/* prologue: frame size=0 */
/* prologue end (size=0) */
   ldi r24,lo8(2)
   ldi r25,hi8(2)
   sts (a)+1,r25
   sts a,r24
/* epilogue: frame size=0 */
   ret
/* epilogue end (size=1) */
/* function g size 7 (6) */
   .size   g, .-g
.global   main
   .type   main, @function
main:
/* prologue: frame size=0 */
/* prologue end (size=0) */
   ldi r24,lo8(1)
   ldi r25,hi8(1)
/* epilogue: frame size=0 */
   ret
/* epilogue end (size=1) */
/* function main size 3 (2) */
   .size   main, .-main
.global   f
   .type   f, @function
f:
/* prologue: frame size=0 */
/* prologue end (size=0) */
   ldi r24,lo8(1)
   ldi r25,hi8(1)
   sts (a)+1,r25
   sts a,r24
   ldi r22,lo8(4)
   ldi r23,hi8(4)
   ldi r24,lo8(a)
   ldi r25,hi8(a)
   rcall __sync_add_and_fetch_2
   ldi r24,lo8(2)
   ldi r25,hi8(2)
   sts (a)+1,r25
   sts a,r24
/* epilogue: frame size=0 */
   ret
/* epilogue end (size=1) */
/* function f size 18 (17) */
   .size   f, .-f
   .comm a,2,1
/* File "c.c": code   28 = 0x001c (  25), prologues   0, epilogues   3 */

So they aren't implemented in GCC 4.3.3 for the AVR. And the called function is also not in the library:
Code:

c.c:(.text+0x28): undefined reference to `__sync_add_and_fetch_2'

There is some opportunity for avr-libc to provide them. And to convince the GCC programmers to include cli/sei when considering memory operations.
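As a rough idea of what such a library routine might do on a single-core AVR, here is a sketch that saves SREG, disables interrupts, does the read-modify-write and restores SREG. Everything here is an assumption for illustration: the function name sketch_sync_add_and_fetch_2, the counter variable and the IRQ_* helper macros are invented, and this is not how GCC or avr-libc actually implement it. It is deliberately not named __sync_add_and_fetch_2 to avoid clashing with the built-in. On non-AVR hosts the interrupt handling is replaced by plain compiler barriers:

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#include <avr/interrupt.h>
/* Save the status register, then disable interrupts. */
#define IRQ_SAVE_DISABLE(s) do { (s) = SREG; cli(); } while (0)
/* Restore the saved status register (re-enables interrupts if they were on). */
#define IRQ_RESTORE(s)      do { SREG = (s); } while (0)
#else
/* Host stand-ins: compiler barriers only, no real interrupt control. */
#define IRQ_SAVE_DISABLE(s) do { (s) = 0; __asm__ __volatile__("" ::: "memory"); } while (0)
#define IRQ_RESTORE(s)      do { (void)(s); __asm__ __volatile__("" ::: "memory"); } while (0)
#endif

/* Hypothetical shared 16-bit counter. */
volatile uint16_t counter;

/* Hypothetical stand-in for a 2-byte add-and-fetch. */
uint16_t sketch_sync_add_and_fetch_2(volatile uint16_t *ptr, uint16_t value)
{
    uint8_t saved;
    uint16_t result;

    IRQ_SAVE_DISABLE(saved);
    result = (uint16_t)(*ptr + value);  /* 16-bit wrap-around on overflow */
    *ptr = result;
    IRQ_RESTORE(saved);

    return result;
}
```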
 
cpluscon
PostPosted: Jun 11, 2010 - 05:24 AM
Raving lunatic


Joined: Jul 10, 2006
Posts: 2654
Location: Minneapolis

Apparently, sei() and cli() don't work as advertised with GCC. Save the C standard references, this is a bug. Someone fix it, and let us know.
 
wek
PostPosted: Jun 11, 2010 - 07:42 AM
Raving lunatic


Joined: Dec 16, 2005
Posts: 3086
Location: Bratislava, Slovakia

ArnoldB wrote:
That is an avr-gcc build with the Bingo scripts on Linux.
IIRC they should result in an avr-gcc quite close to what is in latest WinAVR.

ArnoldB wrote:
Regarding the original example, it doesn't work. I think the reason is the same as for the version without sync, the compiler just doesn't consider cli or sei to be memory operands, so it doesn't care.

Well, then that does not solve the problem either.

It's then exactly the same situation as with the "volatile __asm__()". For the latter, there is no binding description of what exactly it does; the documentation consists mainly of hand-waving, and it has also changed quite often (probably as a result of users gradually finding out that it does not do exactly what the previous version of the documentation promised).

Even if __sync_xxx() symptomatically did what we want, unless it is exactly documented, it means that its author simply included some ad-hoc kludge in the gcc sources to solve some specific problem he came across, and does not know (did not study thoroughly) its implications in various other situations. And the same would also happen with "convince the GCC programmers to include cli/sei when considering memory operations". This inevitably ends up with a "works most of the time and we won't (because we can't and also don't want to) say when it won't work", and "worked more reliably in previous version".

What we'd really need is a reverse process: first, to define what "code reordering" means, produce a standard-like description of what "code reordering prevention" would mean, and then get it implemented.

Of course this just won't happen in the present constellation of things.

--

It does not mean that it's not worth studying the __sync_xxx() stuff, it's just that I'm tired of finding more and more interim solutions.

JW
 
clawson
PostPosted: Jun 11, 2010 - 10:30 AM
10k+ Postman


Joined: Jul 18, 2005
Posts: 62228
Location: (using avr-gcc in) Finchingfield, Essex, England

Quote:

Is that avr-gcc which comes with WinAVR20100110?

Code:
D:\test>avr-gcc -dumpversion
4.3.3

D:\test>avr-objdump -v
GNU objdump (WinAVR 20100110) 2.19

_________________
 
skeeve
PostPosted: Jun 11, 2010 - 03:35 PM
Raving lunatic


Joined: Oct 29, 2006
Posts: 2640


There seems to be some confusion about the meaning of asm volatile.
asm volatile is not the same as an access of a volatile variable.
The volatile in asm volatile says to the compiler
that it has side effects about which the compiler
is not otherwise being told.
The compiler is allowed to assume that the
side effects are orthogonal to all others.
Thus, absent some other constraint,
asm volatile statements may be reordered at will.
Only the number of each is important.
Ordering can be influenced with operands.

_________________
Michael Hennebry
Iluvatar is the better part of Valar.
 
stu_san
PostPosted: Jun 11, 2010 - 03:59 PM
Raving lunatic


Joined: Dec 30, 2005
Posts: 2327
Location: Fort Collins, CO USA

In the avr-gcc mail list, they've been talking about this (in fact, Jan was asked to post here on 'Freaks so that he (and we) would have a wider audience). I believe that all of the above test cases are too simple.

The specific case that really caused problems was:
Code:
void foo(void)
{
   some_temp_variable = result_of_expensive_computation;
                        /* I think it's been a division. */
   cli();
   something_time_critical = some_temp_variable;
   sei();
}
(example from Joerg Wunsch)

The intention here is to do all of the "expensive" calculation outside the interrupt-free zone, then do the time critical assignment (say to a multi-byte variable such as a timer count) inside the interrupt-free zone.

Unfortunately, the optimizer sees the temp variable as "part" of the final assignment and so brings the entire calculation into the section where the interrupts are off.

Certainly, this can be "fixed" with making the temp volatile (or maybe not - the jury's still out on whether this would work), but that dodges the point, doesn't it. The idea here is that the compiler could do the computation, save the result in a register, turn off the interrupts, then do the assign. Making the temp "volatile" will make the assignment more expensive.

This is (apparently) a problem in the structure of C and the optimizer and is not readily solved. As Jan has said, there is no explicit way in C to tell the compiler/optimizer "don't do anything across this boundary". There are lots of things you can do whose side effect is to cause the boundary, but no explicit method. Well, except maybe this __sync_* stuff - need to read up on that.

I am fascinated by what has come from this discussion, rock on!

Stu

_________________
Engineering seems to boil down to: Cheap. Fast. Good. Choose two. Sometimes choose only one.

Newbie? Be sure to read the thread Newbie? Start here!
 
skeeve
PostPosted: Jun 11, 2010 - 05:12 PM
Raving lunatic


Joined: Oct 29, 2006
Posts: 2640


stu_san wrote:
In the avr-gcc mail list, they've been talking about this (in fact, Jan was asked to post here on 'Freaks so that he (and we) would have a wider audience). I believe that all of the above test cases are too simple.

The specific case that really caused problems was:
Code:
void foo(void)
{
   some_temp_variable = result_of_expensive_computation;
                        /* I think it's been a division. */
   cli();
   something_time_critical = some_temp_variable;
   sei();
}
(example from Joerg Wunsch)

The intention here is to do all of the "expensive" calculation outside the interrupt-free zone, then do the time critical assignment (say to a multi-byte variable such as a timer count) inside the interrupt-free zone.

Unfortunately, the optimizer sees the temp variable as "part" of the final assignment and so brings the entire calculation into the section where the interrupts are off.

Certainly, this can be "fixed" with making the temp volatile (or maybe not - the jury's still out on whether this would work), but that dodges the point, doesn't it. The idea here is that the compiler could do the computation, save the result in a register, turn off the interrupts, then do the assign. Making the temp "volatile" will make the assignment more expensive.
In principle, cli() and sei() could be reordered at will.
They are useful because the compiler is usually not malicious.
Making the temporary an IO operand of cli would put it between the assignments.
If something_time_critical is volatile,
I think that making it an input operand of
sei would ensure that sei not come too early.
There remains the problem of ensuring that unrelated
junk not be inserted between cli and sei.
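The first suggestion (making the temporary an in/out operand of cli) might be sketched like this. CLI_USING() and SEI() are hypothetical macro names, the division stands in for the "expensive computation", and on non-AVR hosts the instructions are replaced by empty asm statements so the snippet still builds. The "+r" operand claims that the asm both reads and modifies the temporary, which gives the compiler a data dependency in both directions; as noted above, it does not stop unrelated code from being placed between cli and sei:

```c
#include <stdint.h>

#ifdef __AVR__
/* cli that "uses and modifies" tmp: tmp must be computed before, and read after. */
#define CLI_USING(tmp) __asm__ __volatile__("cli" : "+r"(tmp) : : "memory")
#define SEI()          __asm__ __volatile__("sei" ::: "memory")
#else
/* Host stand-ins: same operands, no instruction. */
#define CLI_USING(tmp) __asm__ __volatile__("" : "+r"(tmp) : : "memory")
#define SEI()          __asm__ __volatile__("" ::: "memory")
#endif

volatile uint16_t something_time_critical;

void foo(uint16_t divisor)
{
    /* Stand-in for the expensive computation. */
    uint16_t some_temp_variable = (uint16_t)(65535U / divisor);

    CLI_USING(some_temp_variable);
    something_time_critical = some_temp_variable;
    SEI();
}
```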

_________________
Michael Hennebry
Iluvatar is the better part of Valar.
 
theusch
PostPosted: Jun 11, 2010 - 05:30 PM
10k+ Postman


Joined: Feb 19, 2001
Posts: 25889
Location: Wisconsin USA

Quote:

In principle, cli() and sei() could be reordered at will.
They are useful because the compiler is usually not malicious.
Making the temporary an IO operand of cli would put it between the assignments.

That's what is losing me here a bit. UBRRL and DDRA and SREG are volatile, right? So the compiler can't move them around before/after/over sequence points, according to the C rules I posted above.

Now, CLI (IIRC) is nothing more than a CBI; etc. Those are operations on a volatile thingy, SREG. Shouldn't they have to follow the same rules?

Now, it is an indirect reference; I guess one wouldn't expect the compiler to interpret embedded ASM. But if you tag the embedded ASM sequence with "volatile" shouldn't the compiler follow the rules for volatile as presented in the C standard?

Lee
 
skeeve
PostPosted: Jun 11, 2010 - 08:36 PM
Raving lunatic


Joined: Oct 29, 2006
Posts: 2640


theusch wrote:
Quote:

In principle, cli() and sei() could be reordered at will.
They are useful because the compiler is usually not malicious.
Making the temporary an IO operand of cli would put it between the assignments.

That's what is losing me here a bit. UBRRL and DDRA and SREG are volatile, right? So the compiler can't move them around before/after/over sequence points, according to the C rules I posted above.

Now, CLI (IIRC) is nothing more than a CBI; etc. Those are operations on a volatile thingy, SREG. Shouldn't they have to follow the same rules?

The compiler never sees the SREG reference.
To the compiler, cli(), sei() and other inline assembly are
black boxes into which it cannot look without assistance.
GNU syntax allows some assistance,
but the compiler will never look at the instructions.

Quote:
Now, it is an indirect reference; I guess one wouldn't expect the compiler to interpret embedded ASM. But if you tag the embedded ASM sequence with "volatile", shouldn't the compiler follow the rules for volatile as presented in the C standard?

"volatile" is overloaded.
The "volatile" in "__asm__ volatile" doesn't mean the
same thing as the "volatile" in "volatile char flag".
One could, I suppose, re-#define cli() as (SREG &= ~0x80) and sei() as (SREG |= 0x80).

SREG is outside CBI range.
CLI is one of eight instructions specifically
for clearing bits in SREG.
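
A rough sketch of the re-#define suggested above, under stated assumptions. The names cli_c() and sei_c() and the two helper functions are hypothetical, chosen so as not to clash with <avr/interrupt.h>. On a non-AVR host a plain volatile byte stands in for SREG so the sketch can be run. Because SREG is outside CBI/SBI range, on a real AVR each macro becomes a read-modify-write of several instructions rather than a single CLI or SEI.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>              /* provides the real, volatile-qualified SREG */
#else
static volatile uint8_t SREG;    /* host stand-in so the sketch runs off-target */
#endif

/* Hypothetical C-level replacements: ordinary volatile accesses to SREG,
   so the usual rules for volatile objects apply to them. */
#define cli_c() (SREG &= (uint8_t)~0x80)   /* clear the I flag (bit 7) */
#define sei_c() (SREG |= 0x80)             /* set the I flag (bit 7)   */

/* Small helpers that report the I flag after each macro. */
uint8_t i_flag_after_cli(void)
{
    sei_c();
    cli_c();
    return (uint8_t)(SREG & 0x80);
}

uint8_t i_flag_after_sei(void)
{
    cli_c();
    sei_c();
    return (uint8_t)(SREG & 0x80);
}
```

This only orders the flag change relative to other volatile accesses; non-volatile code can still move across it.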

_________________
Michael Hennebry
Iluvatar is the better part of Valar.
 
wek
PostPosted: Jun 11, 2010 - 08:58 PM
Raving lunatic


Joined: Dec 16, 2005
Posts: 3086
Location: Bratislava, Slovakia

theusch wrote:
I guess one wouldn't expect the compiler to interpret embedded ASM. But if you tag the embedded ASM sequence with "volatile", shouldn't the compiler follow the rules for volatile as presented in the C standard?
Very well said. But apparently it does not do that. Even worse, I am convinced nobody knows exactly what the effect of volatile on asm is. I believe somebody back then implemented it to fulfill some particular need, and while it might have worked for his own purposes, it did not work as intended in other cases, and the documentation produced afterwards just codified the mess.

I believe the same is happening with the __sync_xxx() stuff.

This is exactly what I warned above: without a precise specification of expectations up front there's more grief than benefit from various ad-hoc implementations of "something".

JW
 
theusch
PostPosted: Jun 11, 2010 - 09:04 PM
10k+ Postman


Joined: Feb 19, 2001
Posts: 25889
Location: Wisconsin USA

Quote:

SREG is outside CBI range.

That's right! I didn't do the decoding; I thought I remembered that CLI & SEI were among those instruction mnemonics that are just a fancy name for another encoded instruction. [I was half right -- CLI is BCLR 7]
 
cpluscon
PostPosted: Jun 12, 2010 - 03:22 AM
Raving lunatic


Joined: Jul 10, 2006
Posts: 2654
Location: Minneapolis

At least break cli() and sei() so they don't compile. Or rename them cli_maybe() and sei_maybe(). Can you imagine the time spent looking for those bugs? Egads. For hobbyists and even students this issue might be excusable, but in production environments it's a dealbreaker.
 
wek
PostPosted: Jun 12, 2010 - 09:41 AM
Raving lunatic


Joined: Dec 16, 2005
Posts: 3086
Location: Bratislava, Slovakia

cpluscon wrote:
At least break cli() and sei() so they don't compile.

These are macros in <avr/interrupt.h>, so if you feel so, you can break them yourself.

The issue is that these macros have been used for ages and it is quite rare that they don't work as intended. I personally checked several dozen of their occurrences in my programs and found no irregularity (even without the memory barrier, which is now being added to them, and which should further push their behaviour toward what's expected). I believe it's not worth breaking thousands of programs, weighed against the potential risk.

On the other hand, the documentation now warns of the risk (pointing to the article which is in the OP). Everybody is supposed to study the documentation thoroughly before using the tools, right? ;-) Also, it is recommended to use the ATOMIC_BLOCK facilities rather than cli()/sei() wherever appropriate; this at least gives a clear path to the future, should proper bullet-proof atomicity ever be implemented in gcc. (I also started to work on a set of purely asm atomic operations (this might end up similar in semantics to the __sync_xxx() stuff); I just don't have time to finish it.)
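
For readers who have not used it, a short sketch of the ATOMIC_BLOCK facility from avr-libc's <util/atomic.h>. The variable tick_count and the two functions are hypothetical. The #else branch is only a stand-in so the sketch compiles and runs on a non-AVR host; it masks nothing and is not part of avr-libc.

```c
#include <stdint.h>

#ifdef __AVR__
#include <util/atomic.h>   /* avr-libc: ATOMIC_BLOCK, ATOMIC_RESTORESTATE */
#else
/* Host stand-in (assumption for off-target builds): runs the body once. */
#define ATOMIC_RESTORESTATE 0
#define ATOMIC_BLOCK(mode) for (int once_ = 1; once_; once_ = 0)
#endif

static volatile uint16_t tick_count;   /* hypothetical counter shared with an ISR */

/* Read the 16-bit counter without an interrupt splitting the two byte reads. */
uint16_t read_ticks(void)
{
    uint16_t copy = 0;
    ATOMIC_BLOCK(ATOMIC_RESTORESTATE) {
        copy = tick_count;
    }
    return copy;
}

/* Update the counter as one uninterrupted read-modify-write. */
void add_ticks(uint16_t n)
{
    ATOMIC_BLOCK(ATOMIC_RESTORESTATE) {
        tick_count = (uint16_t)(tick_count + n);
    }
}
```

ATOMIC_RESTORESTATE puts SREG back to its saved value on exit, so the block is safe to use whether or not interrupts were enabled on entry.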

At the moment, I am afraid, this is the best we can get.

If you think about it, there is always an inevitable risk in using a higher-level language and expecting that it will always work as intended. These languages are designed on the premise that they work on an abstract machine, with no binding to any particular hardware, and with no constraints on memory and time.

JW
 
cpluscon
PostPosted: Jun 12, 2010 - 04:19 PM
Raving lunatic


Joined: Jul 10, 2006
Posts: 2654
Location: Minneapolis

wek wrote:
so if you feel so, you can break them yourself
I think you missed the point. They should be broken specifically for people who are not aware they don't work.
 
TimothyEBaldwin
PostPosted: Jun 14, 2010 - 05:50 PM
Hangaround


Joined: Aug 26, 2008
Posts: 221


skeeve wrote:
There seems to be some confusion about the meaning of asm volatile.
asm volatile is not the same as an access of a volatile variable.
The volatile in asm volatile says to the compiler
that it has side effects about which the compiler
is not otherwise being told.


Which potentially include access to volatile variables. Can anyone provide an example of volatile asm statements being moved over volatile variable access?
 
skeeve
PostPosted: Jun 15, 2010 - 05:29 PM
Raving lunatic


Joined: Oct 29, 2006
Posts: 2640


TimothyEBaldwin wrote:
skeeve wrote:
There seems to be some confusion about the meaning of asm volatile.
asm volatile is not the same as an access of a volatile variable.
The volatile in asm volatile says to the compiler
that it has side effects about which the compiler
is not otherwise being told.


Which potentially include access to volatile variables. Can anyone provide an example of volatile asm statements being moved over volatile variable access?

Putting memory on the clobber list would
tell the compiler that volatile variables
and all other variables might be clobbered.
So far as I know, there is no way to tell the compiler
That only volatile variables might be clobbered.
That only SREG might be clobbered.
That only some specific byte of memory might be clobbered.
That some particular range of bytes might be clobbered.
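
For reference, a small sketch of what putting "memory" on the clobber list looks like. The macro name barrier(), the variable shared_counter and the function bump_twice() are hypothetical. The asm template is empty, so no instruction is emitted; the statement only tells GCC that any memory may have been read or written at that point.

```c
#include <stdint.h>

/* Compiler-level barrier: empty template, "memory" clobber. */
#define barrier() __asm__ volatile ("" ::: "memory")

static uint8_t shared_counter;   /* hypothetical, deliberately not volatile */

uint8_t bump_twice(void)
{
    shared_counter++;   /* value must be written back to memory before the barrier */
    barrier();
    shared_counter++;   /* value must be reloaded from memory after the barrier */
    return shared_counter;
}
```

Without the barrier the compiler would be free to merge the two increments into a single addition of 2. With it, every variable held in memory is affected the same way, which is the all-or-nothing behaviour described above.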

It seems to me that any inline assembly
that does arithmetic would clobber SREG.
To function correctly without understanding the assembly,
avr-gcc would have to kill the lower seven bits of SREG.

Something that just occurred to me:
the status register isn't volatile.
SREG is #defined to be something volatile,
but the register itself is not.

Perhaps it should be allowed to put "volatile" in the clobber list.
The semantics would be that it could not be
reordered with respect to volatile accesses.

In the meantime, how about something like this:
Code:
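/* pre is an input operand, so its value must be computed before cli */
#define cli2(pre) __asm__ volatile ("cli" :: "r"(pre))

/* post is an in/out operand, so later uses of post depend on this asm */
#define sei2(post) __asm__ volatile ("sei" :"+r"(post):)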

_________________
Michael Hennebry
Iluvatar is the better part of Valar.
 
All times are GMT + 1 Hour