C99 Strict Aliasing Rules and Packed Streams

Go To Last Post
12 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Hi Freaks,

Got another coding conundrum on my hands. Recently I've installed the beta of the new AVR32/AVR8 combined toolchain package - the successor to WinAVR - onto my PC for testing with my LUFA code, and I've hit a couple of bumps.

The first bump is that apparently something is misconfigured or broken in the beta, since my existing makefiles complain of missing executables unless I also have the current WinAVR around as a backup. That's probably an issue with the names of the binaries being different, but I haven't investigated further yet.

Now, here's what's got me confused. The new AVR-GCC version seems to enforce C99's strict aliasing rules when -Os is used and C99 mode is specified. Whether this used to be the case I'm not sure, but rebuilding my code with the new toolchain version produces warnings about aliased pointers unless I specifically disable it with -fno-strict-aliasing. This isn't ideal however, as I'd rather make my code C99 compliant instead.

If you're not familiar with the strict aliasing rules of C99, this is a good write up on it. I understand what its trying to achieve, although I rather wish they made all pointers aliasable by default unless the restrict keyword is used like with function parameters.

I use aliased pointers a lot in my code - either for optimizations to manipulate sections of a larger data type, or for type-punning. One part where I can't see how to get around the aliased pointer issues is with packed buffers.

In one instance of my code, the aliasing occurs in my SCSI driver of my Mass Storage driver. When the host sends a SCSI command to the device, it does so by packing the data into a buffer, and then streaming it as a large block. On the device side, I then need to use aliased pointers to re-interpret the bytes in the buffer as the SCSI command values they really are:

static void SCSI_Command_ReadWrite_10(const bool IsDataRead)
{
    uint32_t BlockAddress = SwapEndian_32(*(uint32_t*)&CommandBlock.SCSICommandData[2]);
    uint16_t TotalBlocks  = SwapEndian_16(*(uint16_t*)&CommandBlock.SCSICommandData[7]);

Which makes the compiler complain. The thing is, I can't figure out what the proper procedure would be to convert the bytes of the buffer at position n into a data type of type x without breaking the strict aliasing rules, or using lots and lots of bitshifts and masking or slow memcpy() calls.

Does anyone know of the correct way to achieve what I'm doing (type-punning bytes in the packed array to a native datatype) without aliased pointers? The usual - and, I might add, non-defined - method of unions to perform the type punning won't work, as the positions of the command variables in the buffer are not fixed.

- Dean :twisted:

Make Atmel Studio better with my free extensions. Open source and feedback welcome!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Dean,

I just googled for ways to avoid the strict aliasing warning and, for some code that was now subject to the warning, the suggested patch was:

@@ -744,14 +744,16 @@ static void iptcc_delete_rule(struct rule_head *r)
  * to be called from specific places within the parser */
 static int __iptcc_p_del_policy(TC_HANDLE_T h, unsigned int num)
 {
+       const unsigned char *data;
+
        if (h->chain_iterator_cur) {
                /* policy rule is last rule */
                struct rule_head *pr = (struct rule_head *)
                        h->chain_iterator_cur->rules.prev;
 
                /* save verdict */
-               h->chain_iterator_cur->verdict = 
-                       *(int *)GET_TARGET(pr->entry)->data;
+               data = GET_TARGET(pr->entry)->data;
+               h->chain_iterator_cur->verdict = *(const int *)data;

in which a direct assignment using *(int *) has been split to use a new intermediate variable which presumably suppresses the warning. But I don't wholly understand why the warning isn't given for the last + in this as that seems to be doing the same thing only this time it is *(const int *) but that's surely reinterpreting "const unsigned char *data" ? So why isn't that against the rule as well?

I do think this whole thing is a real shame - one of the powers of C is the ability to reinterpret data in memory through different pointer access - and now they've banned it?

(BTW would void* - the universal "get out of jail free" card help in all this?)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The new C99 aliasing rules permit char* pointers to alias ANY memory -- however the reverse isn't true. That code might suppress the warning, but it's still invalid as the cast from a char* to int* is a forbidden alias.

Basically, char* can alias anything, but all other pointers may not alias - even a char* - unless the pointers differ only by a qualifier such as const. The only other exception to this are function parameters - unless specifically marked with the new "restrict" qualifier, the compiler cannot assume that the pointed-to locations do not overlap. That puts a major downer on pretty much all embedded code.

What gets me is that the "accepted" way around this is to use a union for the type punning -- but the C standard says that officially, the conversion through the union produces undefined results. If we're going to solve one new rule by violating another, have we gained anything other than more unwieldy code?

I think I might just have to settle for disabling strict aliasing and adhering to all other C99 rules, which is a shame. The strict aliasing does give benefits in a few instances, but it also forces us to re-write large swaths of code with clumsy workarounds.

- Dean :twisted:

EDIT: What gets me is that this has been in the standard for TEN YEARS, but it's only now that the compilers are beginning to optimize accordingly that it's being pointed out.

Make Atmel Studio better with my free extensions. Open source and feedback welcome!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Waitaminute, this SwapEndian_32() stuff reshuffles the bytes anyway? Then why are you concerned about bitshifts

I would use single functions (maybe inlined) or macros to fetch the data in the desired form, using bitshifts

static inline uint32_t get_uint32(const char *data, int idx)

Stealing Proteus doesn't make you an engineer.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

abcminiuser wrote:
The first bump is that apparently something is misconfigured or broken in the beta, since my existing makefiles complain of missing executables unless I also have the current WinAVR around as a backup. That's probably an issue with the names of the binaries being different, but I haven't investigated further yet.

https://www.avrfreaks.net/index.p... ?
I had a similar problem except it complained about missing dlls rather than exes.

abcminiuser wrote:
Now, here's what's got me confused. The new AVR-GCC version seems to enforce C99's strict aliasing rules when -Os is used and C99 mode is specified. Whether this used to be the case I'm not sure,

The older gccs did that, too, so your code apparenlty violated the rules already. The warning does not warn for every possible violation of the rules, and the details of the warning algorithm apparently changes between versions. Read about -Wstrict-aliasing in http://gcc.gnu.org/onlinedocs/gc...

abcminiuser wrote:
but rebuilding my code with the new toolchain version produces warnings about aliased pointers unless I specifically disable it with -fno-strict-aliasing.

-fno-strict-aliasing does *not* disable the warning, it does significantly more - it disables the optimisation assuming the programmer strictly follows the aliasing rules (as a side-effect, it does disable the warning, of course, as it then is not needed anymore). As mentioned above, it would be -Wno-strict-aliasing to disable only the warning.

Quote:
The thing is, I can't figure out what the proper procedure would be to convert the bytes of the buffer at position n into a data type of type x without breaking the strict aliasing rules, or using lots and lots of bitshifts and masking or slow memcpy() calls.

As ArnoldB said, this is the only kosher way. Contrary to what Cliff thinks, and contrary to the common usage, this is the only right way... Did I mention I hate C? ;-)

But, would I be you, I wouldn't hesitate to add -fno-strict-aliasing to the compiler's switches. The reasons to do so:

  • you don't need to rewrite lots of working and tried code
  • packed buffers (structs) are gcc extension hence non-standard anyway
  • that switch is here apparently to provide exactly what you want - as mentioned, too many people do type punning through pointers to the gcc developers would dare to drop it altogether
  • the optimisation provided by assuming strict aliasing is in my empirical findings marginal
  • there is an IMHO serious pessimisation related to -fstrict-aliasing, potentially increasing surprisingly RAM (stack) usage, see http://lists.nongnu.org/archive/... (the whole thread has to be read and it's no entertaining reading).
JW

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:

Contrary to what Cliff thinks, and contrary to the common usage, this is the only right way...

You'd be surprised what Cliff thinks! I faced the Endian issue when writing some FAT32 analysis code for a MIPS4150 processor. FAT32 (Intel) is little endian, while MIPS is big endian. What I came up with was (small snippet):

=====================================================================================================
	ide_io(0, (UI64)0, mbr_sector_buffer, 1, ERTFS_DMA_READ, 1); // read LBA 0 (=MBR)
	partition_start = mbr_sector_buffer[0x1C6] + (mbr_sector_buffer[0x1C7] << 8) + (mbr_sector_buffer[0x1C8] << 16) + (mbr_sector_buffer[0x1C9] << 24);
	CJL_sector_info.BPB_sector = partition_start;

	ide_io(0, (UI64)partition_start, bpb_sector_buffer, 1, ERTFS_DMA_READ, 1); // read boot/BPB sector
	fsinfo_offset = bpb_sector_buffer[48]+ (bpb_sector_buffer[49]<<8);
	CJL_sector_info.FSINFO_sector = partition_start + fsinfo_offset;
	BPB_RsvdSecCnt = bpb_sector_buffer[14]+ (bpb_sector_buffer[15]<<8);
	DFAT1_start = partition_start + BPB_RsvdSecCnt;

	ide_io(0, (UI64)partition_start + fsinfo_offset, fsinfo_sector_buffer, 1, ERTFS_DMA_READ, 1); // read FSinfo sector

where I'm afraid I reverted to the multiple shift solution, 4 shifts for the 32bit fields and 2 for the 16 bit fields :-(

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:

Waitaminute, this SwapEndian_32() stuff reshuffles the bytes anyway? Then why are you concerned about bitshifts

In this instance, yes, I just want to grab the uint32_t at the specified indexes and swap the bytes, as the host sends them big-endian and GCC is little-endian. The resulting code from this is just two direct loads (very compact) as the SwapEndian_XXX() are inline functions which use the union method of playing with the bytes, which makes optimal code.

Other instances in my codebase don't need to swap the byte ordering, I just need to extract out the integers in the received buffer. Since the sent command in the buffer isn't fixed, I can't just use a union and structs to grab the values the "koshier" way. I tried bitshifts on the buffer bytes, but GCC actually produced a truly awful literal compilation, with a bunch of redundant register clears and shifts.

Quote:

-fno-strict-aliasing does *not* disable the warning, it does significantly more - it disables the optimisation assuming the programmer strictly follows the aliasing rules (as a side-effect, it does disable the warning, of course, as it then is not needed anymore). As mentioned above, it would be -Wno-strict-aliasing to disable only the warning.

My statement was a bit ambiguous there, I meant I disabled strict aliasing altogether rather than just hid the warning. Hiding warnings are bad when you actually know your code violates what the warning is about - you need to either fix the code or disable the optimization.

Thanks for the tips. From here, I see the ONLY really, truly standards compliant way of type punning to an integer would be to make a char* pointer to the data, and then copy the bytes over and shift as needed. In some situations the union method is also appropriate, although that violates the standards also, strictly speaking.

I like to do the right thing where possible, but I think I'll just cut my losses, retain my optimal output and just keep strict aliasing turned off.

- Dean :twisted:

Make Atmel Studio better with my free extensions. Open source and feedback welcome!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

abcminiuser wrote:
Does anyone know of the correct way to achieve what I'm doing (type-punning bytes in the packed array to a native datatype) without aliased pointers? The usual - and, I might add, non-defined - method of unions to perform the type punning won't work, as the positions of the command variables in the buffer are not fixed.

#define _byteleu32(source, offset, index) \
   (((uint32_t)(((unsigned char*)&source)[offset+index]))<<(8*index))

#define little_endian_to_uint32_t(source, offset) \
    _byteleu(source, offset, 0) | \
    _byteleu(source, offset, 1) | \
    _byteleu(source, offset, 2) | \
    _byteleu(source, offset, 3)

"SCSI is NOT magic. There are *fundamental technical
reasons* why it is necessary to sacrifice a young
goat to your SCSI chain now and then." -- John Woods

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

skeeve wrote:

#define _byteleu32(source, offset, index) \
   (((uint32_t)(((unsigned char*)&source)[offset+index]))<<(8*index))

#define little_endian_to_uint32_t(source, offset) \
    _byteleu(source, offset, 0) | \
    _byteleu(source, offset, 1) | \
    _byteleu(source, offset, 2) | \
    _byteleu(source, offset, 3)

I wrote a function that used this.
Wow. Talk about bloated.
60 instructions.
12 of them LDI *,0
6 of them EOR Rx,Rx
It used the stack.
I think that avr-gcc has a problem in
general with typecasts to larger types.

"SCSI is NOT magic. There are *fundamental technical
reasons* why it is necessary to sacrifice a young
goat to your SCSI chain now and then." -- John Woods

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:

I wrote a function that used this.
Wow. Talk about bloated.
60 instructions.
12 of them LDI *,0
6 of them EOR Rx,Rx
It used the stack.
I think that avr-gcc has a problem in
general with typecasts to larger types.

Now you see my dilemma - I saw the same thing when I tried to convert to the Kosher way of using shifts. With my type punning through the pointer (and even with the endian swap function) I get two LDI instructions as the output. There just isn't any way I can see to maintain the level of compactness of my code without sacrificing some C99 compatibility.

The ONLY ways to do it are to memcpy() the bytes, or to bitshift. A union can also be used in some places, but that goes against the union specification of the standard.

Things would have been much better if they had just made all pointers aliased as they were before, and required people to add the "restrict" qualifier to all pointers where the alias restrictions should be enforced.

I'd rather my code conformed, but my tests show that my non-trivial Bluetooth code only gains TWO BYTES when aliasing restrictions are enforced. Not worth the effort when following the rules causes the code size and speed to become several orders of magnitude slower, especially in an embedded app.

- Dean :twisted:

Make Atmel Studio better with my free extensions. Open source and feedback welcome!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

This compiles well with -fstrict-aliasing:

unsigned long greek(unsigned char *buf, unsigned char index)
{
union {
    unsigned long ul;
    unsigned char uc[sizeof(long)];
} fred;
/* Characters have a special dispensation with regard to aliasing.
   I think it applies here. */
fred.uc[3]=buf
; fred.uc[2]=buf[index+1]; fred.uc[1]=buf[index+2]; fred.uc[0]=buf[index+3]; return fred.ul; }

Presumably it would do better inlined.

000000a6 :
{
union {
unsigned long ul;
unsigned char uc[sizeof(long)];
} fred;
fred.uc[3]=buf
; a6: fc 01 movw r30, r24 a8: e6 0f add r30, r22 aa: f1 1d adc r31, r1 fred.uc[2]=buf[index+1]; fred.uc[1]=buf[index+2]; ac: 72 81 LDD r23, Z+2 ; 0x02 fred.uc[0]=buf[index+3]; ae: 63 81 LDD r22, Z+3 ; 0x03 return fred.ul; } b0: 81 81 LDD r24, Z+1 ; 0x01 b2: 90 81 LD r25, Z b4: 08 95 RET

Using a pointer to unsigned char also compiles without error,
but generates bloated code.
Edit:

Tim Rentsch at http://groups.google.com/group/c... wrote:
the Standard mentions
union type punning explicitly, in 6.5.2.3 p3 (in a footnote):

If the member used to access the contents of a union object is
not the same as the member last used to store a value in the
object, the appropriate part of the object representation of
the value is reinterpreted as an object representation in the
new type as described in 6.2.6 (a process sometimes called
"type punning"). This might be a trap representation.

The aforementioned standard is C99.

"SCSI is NOT magic. There are *fundamental technical
reasons* why it is necessary to sacrifice a young
goat to your SCSI chain now and then." -- John Woods

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

abcminiuser wrote:
Things would have been much better if they had just made all pointers aliased as they were before, and required people to add the "restrict" qualifier to all pointers where the alias restrictions should be enforced.
"restrict" and strict aliasing rules are designed for somewhat different things.
Most hunters would be rather surprised if shooting an elk killed an elephant.
A union explicitly tells the compiler that a
particular elk might be a particular elephant in disguise.
"restrict" tells the compiler that what looks
like two elk is not the same elk twice.
It allows the sort of optimizations that have
been available since fortran was called FORTRAN.

"SCSI is NOT magic. There are *fundamental technical
reasons* why it is necessary to sacrifice a young
goat to your SCSI chain now and then." -- John Woods