Forum Menu




 


Log in Problems?
New User? Sign Up!
AVR Freaks Forum Index

Post new topic   Reply to topic
View previous topic Printable version Log in to check your private messages View next topic
Author Message
foxbatgb
PostPosted: Aug 19, 2010 - 12:34 PM
Newbie


Joined: Aug 19, 2010
Posts: 6


I've been a lurker here for a while and have decided to register to get some help. There is a bug in code generation by gcc 4.5.0/4.5.1 that causes a crash/reset when class instances are declared at global scope. It affects the mega1280. People have come to see it as a bug in the HardwareSerial arduino class because that class creates several global instances of itself:

http://www.arduino.cc/cgi-bin/yabb2/YaB ... 084886/all
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44617

I think I am close to narrowing down the problem and have created the following minimal code that, when compiled, will exhibit the problem. It's designed to light the LED on pin 13 if initialization works. As presented, it will fail and the LED stays off. However, change anything significant at all, such as commenting out the _u2x = u2x; constructor line will cause it to work, as will commenting out one of the Instance* declarations.

What I'm asking for is a little help from someone that can read assembly language. I've attached broken and working assembler listing files. Can anyone spot where the compiler is messing this up?

Regards,
- Andy Brown

Code:

#include <avr/io.h>


class MyClass
{
private:
    uint8_t _rxen;
    uint8_t _txen;
    uint8_t _rxcie;
    uint8_t _udre;
    uint8_t _u2x;

public:
  MyClass(unsigned char rxen, unsigned char txen, unsigned char rxcie, unsigned char udre, unsigned char u2x)
   {
     _rxen = rxen;
     _txen = txen;
     _rxcie = rxcie;
     _udre = udre;
     _u2x = u2x;
   }
};


MyClass Instance1(RXEN1, TXEN1, RXCIE1, UDRE1, U2X1);
MyClass Instance2(RXEN2, TXEN2, RXCIE2, UDRE2, U2X2);


int main()
{
   DDRB=0xff;
   PORTB=128;

   for(;;);
}


The compiler is gcc 4.5.1 (Windows 7 x64, mingw compiled), invocation flags:

avr-g++ -Wall -g2 -gstabs -Os -fpack-struct -fshort-enums -funsigned-char -funsigned-bitfields -fno-exceptions -mmcu=atmega1280 -DF_CPU=16000000UL -MMD -MP -MF"main.d" -MT"main.d" -c -o"main.o" "../main.cpp"
 
 View user's profile Send private message  
Reply with quote Back to top
westfw
PostPosted: Aug 21, 2010 - 05:40 AM
Posting Freak


Joined: Jun 19, 2002
Posts: 1463
Location: SF Bay area

OK. I'm game. If you strip the disassembled object files of all the text and look at just the lines that claim to be emitted object code, it looks like the "broken" image has an unexplained "gap" between address 0x6 and address 0x10. I don't have a disassembler handy, but it LOOKS like the code is otherwise similar in length and content. But who knows what will happen if that "gap" stays uninitialized in flash.
broken.txt also has a mysterious
Code:
  98                  68,0,31,.LM1-.LFBB1

in that same spot, which would occupy the right amount of space if those are somehow "int" values. (intermediate code that wasn't translated? I dunno.)
It's a bit hard to tell what is really there and what is an artifact of the listing that gcc is producing; do you have disassembly of the final object code?
 
 View user's profile Send private message  
Reply with quote Back to top
Koshchi
PostPosted: Aug 21, 2010 - 06:28 PM
10k+ Postman


Joined: Nov 17, 2004
Posts: 15092
Location: Vancouver, BC

Actually, neither looks correct. I think that the "working" one just happens to set the bit in PORTB and the other doesn't. Both listings look quite jumbled and are missing output codes in places. Both don't have the LDI R24, 0xFF, and the working does not have the first STS. The gap that westfw spotted should contain the RJMP 0 and the first LDI to initialize the member variables of the instances. It is also missing that first STS and three others.

The emitted address/opcodes should be beside the assembly that they represent. Here is the same code compiled in 4.3.3:

_________________
Regards,
Steve A.

The Board helps those that help themselves.
 
 View user's profile Send private message  
Reply with quote Back to top
westfw
PostPosted: Aug 22, 2010 - 01:02 AM
Posting Freak


Joined: Jun 19, 2002
Posts: 1463
Location: SF Bay area

Hmmph. I tried to build gcc 4.5.1 on my Mac so I could look into this further, but was unsuccessful. Admittedly, the Mac environment is a bit weird (a native gcc and lives in /usr/bin, but fink and other downloaded packages live in /sw/... (and the target directory for the avr version is /sw/cross/...) But still...

1) Unlike the configure steps for (all) other packages like gmp, mpfr, and mpc, gcc's configure doesn't seem to figure out that it should use -m64, leading to "architecure mismatches" in the sanity testing. search web, examine logs, Add CLFAGS=-m64 to configure cmd line.

2) Building gcc/dfp.o can't find gstdint.h, even though that file exists in the apparently correct place (builddir/../libgcc/). (build the .o manually with added -I switch.)

3) libcpp is apparently missing a "configure" file? (the build doesn't find libcpp.a, and all the steps I can think of won't build it...) (still stuck on this one.)

these are errors that ought not exist Sad
(or problems with my system that ought to be caught during "configure" ?) Grr.
 
 View user's profile Send private message  
Reply with quote Back to top
westfw
PostPosted: Aug 22, 2010 - 03:35 AM
Posting Freak


Joined: Jun 19, 2002
Posts: 1463
Location: SF Bay area

Finally got gcc/g++ installed. Other issue were:
1) Several other "configure" files missing. (fix: download them individually via svn browser.) (maybe I just downloaded a bad tar file?)
2) libgmpxx needed, but not checked for. (fix: configure gmp with --enable-cxx)
2) libppl too old, but not checked in configure.
3) libppl required newer libgmp (not checked by configure for gcc; not sure it should have.)
4) newer libgmp apparently doesn't compile on macs due to problems in the apple version of gcc WRT "extern inline". (workaround: patch gmp.h so that it thinks it doesn't have "extern inline")
Grr.
 
 View user's profile Send private message  
Reply with quote Back to top
westfw
PostPosted: Aug 22, 2010 - 07:05 AM
Posting Freak


Joined: Jun 19, 2002
Posts: 1463
Location: SF Bay area

I have duplicated your findings with my 4.5.1 install an Mega. Attached are the "full" (elf) disassemblies, which look much better than the "assembly listing" that C seems to generate (! a bug in itself?)
However, both main() and initializer functions look fine

Quote:
The gap that westfw spotted should contain the RJMP 0 and the first LDI to initialize the member variables of the instances.

Well, no. The static class initialization is done by a function that is called BEFORE main(). You won't see it if you just disassemble the .o file...

It looks a bit like the initialization code was changed to allow the table of initialization functions to be located beyond the first 64k of flash. Perhaps this is not quite correct (though it's a bit mysterious that it would work at all, ever, if not...)
 
 View user's profile Send private message  
Reply with quote Back to top
westfw
PostPosted: Aug 22, 2010 - 07:39 AM
Posting Freak


Joined: Jun 19, 2002
Posts: 1463
Location: SF Bay area

Ah hah! The new initialization code uses R20 to hold the high 8 bits (of 24) of the table of initialization functions. However, R20 is NOT preserved by the initialization functions!! So the next time through the loops, we fetch an address from some random location of flash instead of where it should be coming from, and go off into never-never land!

I'll have to read up to decide whether this is a problem with the init functions using registers that they shouldn't, or the loop not preserving a register that needs preserved... But it's definitely wrong!

Relevant bits of code in __do_global_ctors
Code:

    ldi   r20, hh8(__ctors_end)
   rjmp   .L__do_global_ctors_start
.L__do_global_ctors_loop:
   sbiw   r28, 2
   sbc     r20, __zero_reg__
   mov_h   r31, r29
   mov_l   r30, r28
   out     __RAMPZ__, r20
   XCALL   __tablejump_elpm__
.L__do_global_ctors_start:
   cpi   r28, lo8(__ctors_start)
   cpc   r29, r17
   cpc   r20, r16
   brne   .L__do_global_ctors_loop
And in the class setup code
Code:
  MyClass(unsigned char rxen, unsigned char txen, unsigned char rxcie, unsigned char udre, unsigned char u2x)
   {
     _rxen = rxen;
 152:   44 e0          ldi   r20, 0x04   ; 4
 154:   40 93 00 02    sts   0x0200, r20
     _txen = txen;
 158:   33 e0          ldi   r19, 0x03   ; 3
 15a:   30 93 01 02    sts   0x0201, r19
     _rxcie = rxcie;
 15e:   27 e0          ldi   r18, 0x07   ; 7
 160:   20 93 02 02    sts   0x0202, r18
     _udre = udre;
 164:   95 e0          ldi   r25, 0x05   ; 5
 166:   90 93 03 02    sts   0x0203, r25
     _u2x = u2x;
 16a:   81 e0          ldi   r24, 0x01   ; 1
 16c:   80 93 04 02    sts   0x0204, r24
    uint8_t _u2x;
 
 View user's profile Send private message  
Reply with quote Back to top
westfw
PostPosted: Aug 22, 2010 - 08:03 AM
Posting Freak


Joined: Jun 19, 2002
Posts: 1463
Location: SF Bay area

This patch appears to fix things. I'm not sure it's a fully correct patch, or if it's the best patch, but it works with your example case!
I'll see if I can navigate bug submition...
 
 View user's profile Send private message  
Reply with quote Back to top
westfw
PostPosted: Aug 22, 2010 - 08:26 AM
Posting Freak


Joined: Jun 19, 2002
Posts: 1463
Location: SF Bay area

That was more fun than I've had in a while...
Smile
 
 View user's profile Send private message  
Reply with quote Back to top
foxbatgb
PostPosted: Aug 22, 2010 - 08:38 PM
Newbie


Joined: Aug 19, 2010
Posts: 6


westfw, you're a genius Smile

Interesting that you should mention that the bug seemed to come about as the result of a change to allow initializers above 64K. Perhaps fixing this bug introduced the new one.

I'll apply your patch and see if it works for my projects. Glad you had fun tracking this one down!
 
 View user's profile Send private message  
Reply with quote Back to top
westfw
PostPosted: Aug 23, 2010 - 12:35 AM
Posting Freak


Joined: Jun 19, 2002
Posts: 1463
Location: SF Bay area

For better or worse, I've had a lot of experience tracking down bugs. "What's changed since it worked" is one of those easy questions, IF you can narrow down the pieces that are likely to have broken.

I want to be sure you realize how important your "minimalized" example was. I'm not a compiler guru, and would normally not have tried to fix an apparent compiler bug. But since this was down to the "this 400 byte program works, but this nearly identical 400 byte program doesn't work", I figured I had a reasonable chance of at least indentifying an area for the compiler folks to look at. An in the end, it wasn't exactly a compiler bug at all (assuming that my diagnosis is correct.)
 
 View user's profile Send private message  
Reply with quote Back to top
westfw
PostPosted: Aug 23, 2010 - 09:36 AM
Posting Freak


Joined: Jun 19, 2002
Posts: 1463
Location: SF Bay area

Aw; someone else found the same problem about 2 weeks ago...
 
 View user's profile Send private message  
Reply with quote Back to top
EW
PostPosted: Aug 23, 2010 - 12:31 PM
Raving lunatic


Joined: Mar 01, 2001
Posts: 5013
Location: Rocky Mountains

Also, if you're going to post a patch to GCC, please use the -u switch when generating the patch. It makes it easier to read. Wink

_________________
Eric Weddington
www.linkedin.com/in/ericweddington/
 
 View user's profile Send private message Send e-mail Visit poster's website 
Reply with quote Back to top
westfw
PostPosted: Aug 26, 2010 - 03:49 PM
Posting Freak


Joined: Jun 19, 2002
Posts: 1463
Location: SF Bay area

I got a private email along the lines of "great, this is fixed now, right?" Lest anyone else be similarly confused:

NO, this is still an open problem.

It's now gcc bug http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45263 (which as of today, is still "unassigned" and "unconfirmed"; it has two "amateur" patches but no official attention.)

There is currently NO version of gcc that correctly supports (c++ only?) programs larger than 80k. Older versions cannot access the class initializer table at all if it its beyond the 64k limit, and newer versions access it "wrong" under some (common) circumstances...
 
 View user's profile Send private message  
Reply with quote Back to top
westfw
PostPosted: Aug 28, 2010 - 03:04 AM
Posting Freak


Joined: Jun 19, 2002
Posts: 1463
Location: SF Bay area

Quote:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29141


And the other bad news:

I don't think I believe that the fix implemented for 29141 actually fixes all cases of the bug, even not counting that it has bad code in the piece it does implement. The fix for 29141 allows the table of constructors created by the compiler to be at a location beyond the first 64k, but the constructor function addresses IN that table seem to still be only 16 bits long. It looks like the linker only produces a 16bit table, and then the code that walks through the table hardwires a zero into the upper 8 bits for the called constructor as well. This is beyond me; since I'm not even aware of how gcc handles "far" addresses on big AVRs. (You might be able to work around this by making sure the .o files containing constructors occur in the first 64k of flash.)

The good news is that I think it ought to be possible to "fix" older versions of the compiler by using linker switches (or a linker script) to make sure that the table of constructor addresses (the .ctors and .dtors segments from .o files) is put into flash at the beginning of the flash .text segment instead of at the end. This would theoretically allow an updated Arduino IDE or Makefile to work around the bug without needing a patched compiler. (OTOH, linker scripts seem even more like Black Magic than AVR Assembler. I can't figure out how to actually do this, or even if it's really possible. And link wizards out there?)

As explanation, the pseudocode looks like this:

Code:
code_t * constructor_table[] = {serial_constructor_func, ethernet_constructor_func, ...};
init() {
   :
  function_ptr_t *entry = &constructor_table[0];
  while (entry < END_OF(constructor_table)) {
     function_ptr func = *entry;
     envokeFunction(func);
     entry += sizeof(code_t *);
  }
  :
}


So, 29141 fixed things so that function_ptr_t is now 24 bits, and fixed "envokeFunction()" to handle a 24bit pointer, but it 1) used a register that the constructors could trash, and 2) did NOT increase the size of "code_t *"
 
 View user's profile Send private message  
Reply with quote Back to top
ricklon
PostPosted: Sep 17, 2010 - 08:34 AM
Newbie


Joined: Jan 02, 2010
Posts: 1


I just duplicated this entire process and got to the same point. Anyone taken a further look at this?
 
 View user's profile Send private message  
Reply with quote Back to top
westfw
PostPosted: Nov 01, 2010 - 12:41 AM
Posting Freak


Joined: Jun 19, 2002
Posts: 1463
Location: SF Bay area

Quote:
The fix for 29141 allows the table of constructors created by the compiler to be at a location beyond the first 64k, but the constructor function addresses IN that table seem to still be only 16 bits long.

I can't tell whether this is a bug or not. there are "hints" that all call destinations >64k are handled by a "trampoline" area in the low 64k, in which case things could be OK.

But I can't find any summary of just how >64k AVRs are supposed to be supported Sad
 
 View user's profile Send private message  
Reply with quote Back to top
EW
PostPosted: Nov 01, 2010 - 12:12 PM
Raving lunatic


Joined: Mar 01, 2001
Posts: 5013
Location: Rocky Mountains

Here is a hint: Most of the GCC developers for the AVR port don't come to AVR Freaks on a regular basis. If you are going to be involved in working on AVR GCC, then I suggest that you subscribe to the avr-gcc-list mailing list and/or the avr-libc-dev mailing list for the avr-libc project. All of the AVR GCC developers are subscribed to those mailing lists and you're likely to get better feedback than posting a thread here.

_________________
Eric Weddington
www.linkedin.com/in/ericweddington/
 
 View user's profile Send private message Send e-mail Visit poster's website 
Reply with quote Back to top
bennettmarks439
PostPosted: Sep 09, 2011 - 08:02 PM
Newbie


Joined: Jun 24, 2011
Posts: 1


Actually the bug with clobbering R20 when calling __tablejump_elpm__ from __do_global_ctors has now been fixed in the GNU mainline v4.6.1. (BUG # 45263, New Revision: 174427, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45263) My question is, since Atmel is still shipping 4.5.1, how do I get this fix into my libgcc.a? And where do I get it? Or do I have to assemble it myself?
 
 View user's profile Send private message  
Reply with quote Back to top
clawson
PostPosted: Sep 09, 2011 - 09:35 PM
10k+ Postman


Joined: Jul 18, 2005
Posts: 71717
Location: (using avr-gcc in) Finchingfield, Essex, England

Quote:

Or do I have to assemble it myself?

Yes.

_________________
 
 View user's profile Send private message  
Reply with quote Back to top
Display posts from previous:     
Jump to:  
All times are GMT + 1 Hour
Post new topic   Reply to topic
View previous topic Printable version Log in to check your private messages View next topic
Powered by PNphpBB2 © 2003-2006 The PNphpBB Group
Credits