| Author |
Message |
|
|
Posted: Jun 11, 2010 - 05:24 AM |
|


Joined: Jul 10, 2006
Posts: 2654
Location: Minneapolis
|
|
| Apparently, sei() and cli() don't work as advertised with GCC. Save the C standard references, this is a bug. Someone fix it, and let us know. |
|
|
| |
|
|
|
|
|
Posted: Jun 11, 2010 - 07:42 AM |
|

Joined: Dec 16, 2005
Posts: 3086
Location: Bratislava, Slovakia
|
|
|
ArnoldB wrote:
That is an avr-gcc build with the Bingo scripts on Linux.
IIRC they should result in an avr-gcc quite close to what is in latest WinAVR.
ArnoldB wrote:
Regarding the original example, it doesn't work. I think the reason is the same as for the version without sync, the compiler just doesn't consider cli or sei to be memory operands, so it doesn't care.
Well, then that does not solve the problem either.
It's then exactly the same situation as with the "volatile __asm__()". For the latter, there is no binding description of what it exactly does, the documentation consists mainly of handwavings and it also changed quite often (probably as result of users gradually finding out that it does not do exactly what the previous version of documentation promised).
Even if __sync_xxx() would symptomatically do what we want, unless it is exactly documented, it means, that the author of it simply included some ad-hoc kludge to gcc sources to solve some specific problem he came across, and does not know (did not studied thoroughly) the implications of it in various other situations. And the same would also happen with "convince the GCC programmers to include cli/sei when considering memory operations". This inevitably ends up with a "works most of the time and we won't (because we can't and also don't want to) say when it won't work", and "worked more reliably in previous version".
What we'd really need is a reverse process: first, to define what "code reordering" means, produce a standard-like description of what "code reordering prevention" would mean, and then get it implemented.
Of course this just won't happen in the present constellation of things.
--
It does not mean that it's not worth to study the __sync_xxx() stuff, it's just that I'm tired of finding more and more interim solutions.
JW |
|
|
| |
|
|
|
|
|
Posted: Jun 11, 2010 - 10:30 AM |
|


Joined: Jul 18, 2005
Posts: 62245
Location: (using avr-gcc in) Finchingfield, Essex, England
|
|
|
Quote:
Is that avr-gcc which comes with WinAVR20100110?
Code:
D:\test>avr-gcc -dumpversion
4.3.3
D:\test>avr-objdump -v
GNU objdump (WinAVR 20100110) 2.19
|
_________________
|
| |
|
|
|
|
|
Posted: Jun 11, 2010 - 03:35 PM |
|

Joined: Oct 29, 2006
Posts: 2640
|
|
There seems to some confusion about the meaning of asm volatile.
asm volatile is not the same as an access of a volatile variable.
The volatile in asm volatile says to the compiler
that it has side effects about which the compiler
is not otherwise being told.
The compiler is allowed to assume that the
side effects are orthogonal to all others.
Thus, absent some other constraint,
asm volatile statements may be reordered at will.
Only the number of each is important.
Ordering can be influenced with operands. |
_________________ Michael Hennebry
Iluvatar is the better part of Valar.
|
| |
|
|
|
|
|
Posted: Jun 11, 2010 - 03:59 PM |
|


Joined: Dec 30, 2005
Posts: 2327
Location: Fort Collins, CO USA
|
|
In the avr-gcc mail list, they've been talking about this (in fact, Jan was asked to post here on 'Freaks so that he (and we) would have a wider audience). I believe that all of the above test cases are too simple.
The specific case that really caused problems was:
Code:
void foo(void)
{
some_temp_variable = result_of_expensive_computation;
/* I think it's been a division. */
cli();
something_time_cricital = some_temp_variable;
sei();
}
(example from Joerg Wunsch)
The intention here is to do all of the "expensive" calculation outside the interrupt-free zone, then do the time critical assignment (say to a multi-byte variable such as a timer count) inside the interrupt-free zone.
Unfortunately, the optimizer sees the temp variable as "part" of the final assignment and so brings the entire calculation into the section where the interrupts are off.
Certainly, this can be "fixed" with making the temp volatile (or maybe not - the jury's still out on whether this would work), but that dodges the point, doesn't it. The idea here is that the compiler could do the computation, save the result in a register, turn off the interrupts, then do the assign. Making the temp "volatile" will make the assignment more expensive.
This is (apparently) a problem in the structure of C and the optimizer and is not readily solved. As Jan has said, there is no explicit way in C to tell the compiler/optimizer "don't do anything across this boundary". There are lots of things you can do whose side effect is to cause the boundary, but no explicit method. Well, except maybe this __sync_* stuff - need to read up on that.
I am fascinated by what has come from this discussion, rock on!
Stu |
_________________ Engineering seems to boil down to: Cheap. Fast. Good. Choose two. Sometimes choose only one.
Newbie? Be sure to read the thread Newbie? Start here!
|
| |
|
|
|
|
|
Posted: Jun 11, 2010 - 05:12 PM |
|

Joined: Oct 29, 2006
Posts: 2640
|
|
|
stu_san wrote:
In the avr-gcc mail list, they've been talking about this (in fact, Jan was asked to post here on 'Freaks so that he (and we) would have a wider audience). I believe that all of the above test cases are too simple.
The specific case that really caused problems was:
Code:
void foo(void)
{
some_temp_variable = result_of_expensive_computation;
/* I think it's been a division. */
cli();
something_time_cricital = some_temp_variable;
sei();
}
(example from Joerg Wunsch)
The intention here is to do all of the "expensive" calculation outside the interrupt-free zone, then do the time critical assignment (say to a multi-byte variable such as a timer count) inside the interrupt-free zone.
Unfortunately, the optimizer sees the temp variable as "part" of the final assignment and so brings the entire calculation into the section where the interrupts are off.
Certainly, this can be "fixed" with making the temp volatile (or maybe not - the jury's still out on whether this would work), but that dodges the point, doesn't it. The idea here is that the compiler could do the computation, save the result in a register, turn off the interrupts, then do the assign. Making the temp "volatile" will make the assignment more expensive.
In principle, cli() and sei() could be reordered at will.
They are useful because the compiler is usually not malicious.
Making the temporary an IO operand of cli would put it between the assignments.
If something_time_critical is volatile,
I think that making it an input operand of
sei would ensure that sei not come too early.
There remains the problem of ensuring that unrelated
junk not be inserted between cli and sei. |
_________________ Michael Hennebry
Iluvatar is the better part of Valar.
|
| |
|
|
|
|
|
Posted: Jun 11, 2010 - 05:30 PM |
|


Joined: Feb 19, 2001
Posts: 25894
Location: Wisconsin USA
|
|
|
Quote:
In principle, cli() and sei() could be reordered at will.
They are useful because the compiler is usually not malicious.
Making the temporary an IO operand of cli would put it between the assignments.
That's what is losing me here a bit. UBRRL and DDRA and SREG are volatile, right? So the compiler can't move them around before/after/over sequence points, according to the C rules I posted above.
Now, CLI (IIRC) is nothing more than a CBI; etc. Those are operations on a volatile thingy, SREG. Shouldn't they have to follow the same rules?
Now, it is an indirect reference; I guess one wouldn't expect the compiler to interpret embedded ASM. But if you tag the embeeded ASM sequence with "volatile" shouldn't the compiler follow the rules for volatile as presented in the C standard?
Lee |
|
|
| |
|
|
|
|
|
Posted: Jun 11, 2010 - 08:36 PM |
|

Joined: Oct 29, 2006
Posts: 2640
|
|
|
theusch wrote:
Quote:
In principle, cli() and sei() could be reordered at will.
They are useful because the compiler is usually not malicious.
Making the temporary an IO operand of cli would put it between the assignments.
That's what is losing me here a bit. UBRRL and DDRA and SREG are volatile, right? So the compiler can't move them around before/after/over sequence points, according to the C rules I posted above.
Now, CLI (IIRC) is nothing more than a CBI; etc. Those are operations on a volatile thingy, SREG. Shouldn't they have to follow the same rules?
The compiler never sees the SREG reference.
To the compiler, cli(), sei() and other inline assembly are
black boxes into which it cannot look without assistance.
Gnu syntax allows some assistance,
but the compiler will never look at the instructions.
Quote:
Now, it is an indirect reference; I guess one wouldn't expect the compiler to interpret embedded ASM. But if you tag the embeeded ASM sequence with "volatile" shouldn't the compiler follow the rules for volatile as presented in the C standard?
"volatile" is overloaded.
The "volatile" in "__asm__ volatile" doesn't mean the
same thing as the "volatile" in "volatile char flag".
One could, I suppose, re#define cli() as (SREG &= ~0x80) and sei() as (SREG |= 0x80).
SREG is outside CBI range.
CLI is one of eight instructions specifically
for clearing bits in SREG. |
_________________ Michael Hennebry
Iluvatar is the better part of Valar.
|
| |
|
|
|
|
|
Posted: Jun 11, 2010 - 08:58 PM |
|

Joined: Dec 16, 2005
Posts: 3086
Location: Bratislava, Slovakia
|
|
|
theusch wrote:
I guess one wouldn't expect the compiler to interpret embedded ASM. But if you tag the embeeded ASM sequence with "volatile" shouldn't the compiler follow the rules for volatile as presented in the C standard?
Very well said. But it apparently does not do that. Even worse - I am convinced nobody knows exactly what is the effect of volatile on asm. I believe somebody back then implemented it to fulfill some particular need, and while it might have worked for his own purposes, it did not work as intended in other cases, and the documentation produced afterwards just codified the mess.
The same I believe is happening with the sync_xxx() stuff.
This is exactly what I warned above: without a precise specification of expectations up front there's more grief than benefit from various ad-hoc implementations of "something".
JW |
|
|
| |
|
|
|
|
|
Posted: Jun 11, 2010 - 09:04 PM |
|


Joined: Feb 19, 2001
Posts: 25894
Location: Wisconsin USA
|
|
|
Quote:
SREG is outside CBI range.
That's right! I didn't do the decoding; I thought I remembered that CLI & SEI were one of those instruction mnemonics that was a fancy name for another encoded instruction. [I was half right -- CLI is BCLR 7] |
|
|
| |
|
|
|
|
|
Posted: Jun 12, 2010 - 03:22 AM |
|


Joined: Jul 10, 2006
Posts: 2654
Location: Minneapolis
|
|
| At least break cli() and sei() so they don't compile. Or rename them cli_maybe() and sei_maybe(). Can you imagine the time spent looking for those bugs? Egads. For hobbyists and even students this issue might be excusable, but in production environments it's a dealbreaker. |
|
|
| |
|
|
|
|
|
Posted: Jun 12, 2010 - 09:41 AM |
|

Joined: Dec 16, 2005
Posts: 3086
Location: Bratislava, Slovakia
|
|
|
cpluscon wrote:
At least break cli() and sei() so they don't compile.
These are macros in <avr/interrupt.h>, so if you feel so, you can break them yourself.
The issue is, that these macros are used for ages and it is quite rare that they don't work as intended. I personally checked several dozen of its occurence in my programs and I've found no irregularity (even without the memory barrier, which is now being added to it, and which should further enforce its working toward what's expected). I believe it's not worth to break thousands of programs, weighting against the potential risk.
On the other hand, the documentation now warns for the risk (pointing to the article which is in the OP). Everybody is supposed to study the documentation thoroughtly before using the tools, right? Also, it is recommended to use the ATOMIC_BLOCK facilities rather than cli()/sei() wherever appropriate; this at least gives clear path to future, would proper bullet-proof atomicity be ever implemented in gcc. (I also started to work on a set of purely asm atomic operation (this might get similar in semantics to the __sync_xxx() stuff) just I don't have time to finish it. )
At the moment, I am afraid, this is the best we can get.
If you think about it, there is always an inevitable risk in using a higher level language and expect it will always work as intended. These languages are designed out of a premise that they work on an abstract machine, with no binding to any particular hardware, and with no constraints in memory and time.
JW |
|
|
| |
|
|
|
|
|
Posted: Jun 12, 2010 - 04:19 PM |
|


Joined: Jul 10, 2006
Posts: 2654
Location: Minneapolis
|
|
|
wek wrote:
so if you feel so, you can break them yourself
I think you missed the point. They should be broken specifically for people who are not aware they don't work. |
|
|
| |
|
|
|
|
|
Posted: Jun 14, 2010 - 05:50 PM |
|

Joined: Aug 26, 2008
Posts: 221
|
|
|
skeeve wrote:
There seems to some confusion about the meaning of asm volatile.
asm volatile is not the same as an access of a volatile variable.
The volatile in asm volatile says to the compiler
that it has side effects about which the compiler
is not otherwise being told.
Which potentially include access to volatile variables. Can anyone provide an example of volatile asm statements being moved over volatile variable access? |
|
|
| |
|
|
|
|
|
Posted: Jun 15, 2010 - 05:29 PM |
|

Joined: Oct 29, 2006
Posts: 2640
|
|
|
TimothyEBaldwin wrote:
skeeve wrote:
There seems to some confusion about the meaning of asm volatile.
asm volatile is not the same as an access of a volatile variable.
The volatile in asm volatile says to the compiler
that it has side effects about which the compiler
is not otherwise being told.
Which potentially include access to volatile variables. Can anyone provide an example of volatile asm statements being moved over volatile variable access?
Putting memory on the clobber list would
tell the compiler that volatile variables
and all other variables might be clobbered.
So far as I know, there is no way to tell the compiler
That only volatile variables might be clobbered.
That only SREG might be clobbered.
That only some specific byte of memory might be clobbered.
That some particular range of bytes might be clobbered.
It seems to me that any inline assembly
that does arithmetic would clobber SREG.
To function correctly without understanding the assembly,
avr-gcc would have to kill the lower seven bit of SREG.
Something that just occurred to me:
the status register isn't volatile.
SREG is #defined to be something volatile,
but the register itself is not.
Perhaps it should be allowed to put "volatile" in the clobber list.
The semantics would be that it could not be
reordered with respect to volatile accesses.
In the mean time, how about something like this
Code:
#define cli2(pre) __asm__ volatile ("cli" :: "r"(pre))
#define sei2(post) __asm__ volatile ("sei" :"+r"(post):)
|
_________________ Michael Hennebry
Iluvatar is the better part of Valar.
|
| |
|
|
|
|
|