Am I crazy? Local static array weirdness. [solved]


Hello!

I've been ripping my hair out for a couple of hours in a large project which conks out spectacularly. I've isolated the problem in a small test case:

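#include <stdint.h>

/* Fill a function-local static array with 0..31, then print it back. */
void testarray(void) {
  static uint8_t array[32];
  uint8_t i;
  for(i=0;i<32;i++) array[i]=i;   /* expected afterwards: array[i] == i */
  for(i=0;i<32;i++) {
    /* bbprint() and bbprintdec() are my own output routines */
    bbprint("array[");
    bbprintdec(i);
    bbprint("]=");
    bbprintdec(array[i]);
    bbprint("\n");
  }
}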

Which produces:

array[0]=0
array[1]=1
array[2]=2
array[3]=3
array[4]=4
array[5]=5
array[6]=6
array[7]=7
array[8]=8
array[9]=9
array[10]=0
array[11]=161
array[12]=0
array[13]=3
array[14]=0
array[15]=161
array[16]=2
array[17]=52
array[18]=0
array[19]=174
array[20]=0 
array[21]=21
array[22]=2
array[23]=172
array[24]=24
array[25]=25
array[26]=26
array[27]=27
array[28]=28
array[29]=29
array[30]=30
array[31]=31

If I make the array global I get the {0..31} expected result.

This is with avr-gcc 4.4.4. Could there be a problem with it or am I doing something wrong?

Does this ring a bell with anyone?

Thanks in advance.

odokemono. I try to entice electrons to do my bidding.


Are there any interrupts occurring when this code runs? Also when you build what does avr-size report for "Data:" size?


odokemono wrote:
This is with avr-gcc 4.4.4. Could there be a problem with it

Yes.

Unless you like weird and unexplainable problems like this one, try to stick to the "official" editions.

JW


Quote:
If I make the array global I get the {0..31} expected result.
What happens if you just remove the "static"?

Regards,
Steve A.

The Board helps those that help themselves.


I've tried reverting to avr-gcc 4.3.4, same results, and global/static doesn't help apparently, it seems more related to code size. The bigger the source, the nastier things get. I even see code corruption now, self-contained functions which worked fine before are now crashing. It's all very confusing.

Quote:
Are there any interrupts occurring when this code runs? Also when you build what does avr-size report for "Data:" size?

No interrupts at all, none vectored, none hardware, no timer use, etc... Data size:

   text    data     bss     dec     hex filename
   1490     226      68    1784     6f8 testarrayprob.elf

This should be fine on the attiny88 I'm using.
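For anyone following the numbers: on AVR both .data and .bss occupy SRAM (the initial values of .data are also stored in flash), so the avr-size line above can be turned into a rough RAM budget. A minimal sketch in C; the struct and function names are made up for illustration, and the 512-byte RAM size is the ATtiny88 figure quoted later in this thread:

```c
#include <assert.h>
#include <stdint.h>

/* Section sizes as printed by avr-size in its default (Berkeley) format. */
typedef struct {
    uint32_t text;  /* code and constants kept in flash */
    uint32_t data;  /* initialized variables: live in RAM, image kept in flash */
    uint32_t bss;   /* zero-initialized variables: RAM only */
} section_sizes;

/* RAM already claimed before main() runs: .data plus .bss. */
uint32_t static_ram_bytes(section_sizes s)
{
    return s.data + s.bss;
}

/* Flash used: .text plus the initial-value image of .data. */
uint32_t flash_bytes(section_sizes s)
{
    return s.text + s.data;
}

/* RAM left over for the stack (there is no heap use in this test case). */
uint32_t stack_room_bytes(section_sizes s, uint32_t ram_size)
{
    assert(static_ram_bytes(s) <= ram_size);
    return ram_size - static_ram_bytes(s);
}
```

With the posted figures (text 1490, data 226, bss 68) this gives 294 bytes of static RAM, which would leave 218 bytes for the stack on a 512-byte part, provided the stack really starts at the top of RAM.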

I'm currently looking at assembler listings to try to figure out what's happening.

Quote:
Unless you like weird and unexplainable problems like this one, try to stick to the "official" editions.

I had avr-binutils-2.19.1, avr-gcc-4.3.4, and avr-libc-1.6.7 built, but I'll try with the current building script in the sticky in the meantime. Sure, I like weird things but gcc optimization is already a bit too weird for me.

If worse comes to worst, I might ditch C and try to code in assembler completely. Then if things fail I'll only have myself to blame. :)

Thanks for the pointers, guys!

odokemono. I try to entice electrons to do my bidding.


Quote:
If I make the array global I get the {0..31} expected result.
I'd guess the problem is still there but simply moved in memory. Likely you have some other errant code, apparently accessing an array of 7 ints in the wrong spot. Or perhaps bbprintdec is getting whacked somehow, or maybe it can't handle some arguments >9?

Without seeing all the code it is hard to say much more.

C: i = "told you so";


Please post an example of the printout like in the first post along with part of .lss (result of objdump) of that routine.

Don't some of the unexpected values correspond with the addresses from where bbprint() is called?

Couldn't printing the stack pointer reveal something?
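Printing the stack pointer takes very little code. A rough sketch, assuming avr-libc's SP register definition from <avr/io.h> on the target; current_sp() and format_hex16() are hypothetical helper names, and the actual output would go through whatever print routine the project already has (bbprint() in this thread):

```c
#include <assert.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
/* Read the 16-bit stack pointer through avr-libc's SP definition.
 * Only built for AVR targets; good enough for a diagnostic print. */
uint16_t current_sp(void)
{
    return SP;
}
#endif

/* Write v as "0xHHHH" plus a terminating NUL into out (needs 7 bytes). */
void format_hex16(uint16_t v, char out[7])
{
    assert(out != 0);
    out[0] = '0';
    out[1] = 'x';
    for (int i = 0; i < 4; i++) {
        /* Take the nibbles from most to least significant. */
        uint8_t d = (uint8_t)((v >> (12 - 4 * i)) & 0xF);
        out[2 + i] = (char)(d < 10 ? '0' + d : 'A' + (d - 10));
    }
    out[6] = '\0';
}
```

The digits are computed instead of looked up in a table, so the helper itself adds no constant data to RAM.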

JW


This sounds a lot like the problem I'm having here
https://www.avrfreaks.net/index.p...
which I've been unable to resolve.


I've just started on the problem again, looking closely at the assembly code that avr-gcc produces, and I'm seeing things that aren't quite right.

The first has to do with the beginning of a function which needs local memory. It first pushes the registers the function will use then decreases the stack pointer by the number of bytes the function will need. Very sensible, but let's look at the listing:

void puzzle(uint8_t size) {
 3e2: 2f 92         push  r2
 3e4: 3f 92         push  r3
 3e6: cf 92         push  r12
 3e8: df 92         push  r13
 3ea: ef 92         push  r14
 3ec: ff 92         push  r15
 3ee: 0f 93         push  r16
 3f0: 1f 93         push  r17
 3f2: df 93         push  r29
 3f4: cf 93         push  r28
 3f6: cd b7         in  r28, 0x3d ; 61
 3f8: de b7         in  r29, 0x3e ; 62
 3fa: a6 97         sbiw  r28, 0x26 ; 38
 3fc: 0f b6         in  r0, 0x3f  ; 63
 3fe: f8 94         cli
 400: de bf         out 0x3e, r29 ; 62
 402: 0f be         out 0x3f, r0  ; 63
 404: cd bf         out 0x3d, r28 ; 61
 406: 89 83         std Y+1, r24  ; 0x01

Now, "in r28, 0x3d" and "in r29, 0x3e" read the current SP value into the register pair r29:r28. "sbiw r28, 0x26" decreases that value by 38. That's fine so far; 38 bytes is the memory this function will use. "in r0, 0x3f" reads the current value of the status register, so that the I bit (whether or not interrupts are enabled) can be restored later. This is important because the next instruction is a CLI.

The reason for the CLI is obvious: the stack pointer has to be updated atomically, by writing back both bytes that form the word.

Now here comes the problem: It first does "out 0x3e, r29", which is the high-byte value of SP. Then it restores the status register ("out 0x3f, r0") which re-enables interrupts if they were enabled at the beginning. This is bad because at this point we have a partially restored SP value and we just re-enabled interrupts. If one interrupt gets called at this point it will obviously push and pop registers on the stack with a bad pointer. Lastly, the low-byte value of the SP is set properly with "out 0x3d, r28", but by this time it may be already too late.

Sure, this is a very small window of opportunity for a bug to manifest itself, but it's sloppy programming nonetheless.

The next worrisome thing I see in the listing is at the very beginning:

00000028 <__ctors_end>:
  28: 11 24         eor r1, r1
  2a: 1f be         out 0x3f, r1  ; 63
  2c: cf ef         ldi r28, 0xFF ; 255
  2e: d1 e0         ldi r29, 0x01 ; 1
  30: de bf         out 0x3e, r29 ; 62
  32: cd bf         out 0x3d, r28 ; 61

This sets the stack pointer to RAMEND, the end of RAM. But wait, my target is the attiny88, which has 512 bytes of RAM starting at 0x100. That number should be 0x2ff, not 0x1ff, as clearly stated in the specs.
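The arithmetic behind that claim, as a small sketch. The function names are made up for illustration, and the SRAM start address of 0x0100 is an assumption taken from the ATtiny88 memory map, not something visible in the listing:

```c
#include <assert.h>
#include <stdint.h>

/* Last valid SRAM address for a part whose SRAM starts at ram_start. */
uint16_t ramend(uint16_t ram_start, uint16_t ram_size)
{
    assert(ram_size > 0);
    return (uint16_t)(ram_start + ram_size - 1);
}

/* SP value the startup code loads: r29 holds the high byte, r28 the low byte. */
uint16_t startup_sp(uint8_t r28, uint8_t r29)
{
    return (uint16_t)((r29 << 8) | r28);
}
```

ramend(0x0100, 512) gives 0x02FF, while the two ldi instructions in the listing load startup_sp(0xFF, 0x01) = 0x01FF. That is the value the same formula gives for only 256 bytes of SRAM.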

Just 10 minutes of looking at the assembly listing made me very worried indeed. Maybe the attiny88 target isn't quite ready for production. I'll next try with another chip and keep posting my results.

odokemono. I try to entice electrons to do my bidding.


Quote:
This is bad because at this point we have a partially restored SP value and we just re-enabled interrupts. If one interrupt gets called at this point it will obviously push and pop registers on the stack with a bad pointer.
Not so obvious since the next opcode (the restoring of the low byte of the stack pointer in this case) is guaranteed to be run before any interrupt will occur. This has been discussed before, it is not a bug.
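The rule can be played through with a toy model. This is only a sketch of the behaviour described in this reply, not an AVR emulator; all names are made up, and the interrupt is assumed to arrive right after the cli, which is the worst case for the sequence in the listing:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* State needed to follow the sequence: cli / out SPH / out SREG / out SPL. */
typedef struct {
    uint8_t  sph, spl;     /* the two halves of the stack pointer */
    bool     i_flag;       /* global interrupt enable bit in SREG */
    bool     irq_pending;  /* an interrupt is waiting to be served */
    bool     holdoff;      /* I bit was just set: one more instruction runs first */
    bool     isr_ran;
    uint16_t sp_seen;      /* SP value found on interrupt entry */
} toy_cpu;

/* Called between instructions: serve the pending interrupt if allowed. */
static void service_irq(toy_cpu *c, bool honor_holdoff)
{
    bool blocked = honor_holdoff && c->holdoff;
    c->holdoff = false;
    if (c->i_flag && c->irq_pending && !blocked) {
        c->sp_seen = (uint16_t)((c->sph << 8) | c->spl);
        c->isr_ran = true;
        c->irq_pending = false;
    }
}

/* Returns the SP an interrupt handler would see during the update.
 * honor_holdoff = false shows what would happen without the rule. */
uint16_t sp_seen_by_isr(uint16_t old_sp, uint16_t new_sp, bool honor_holdoff)
{
    toy_cpu c = { .sph = (uint8_t)(old_sp >> 8), .spl = (uint8_t)old_sp,
                  .i_flag = true };
    bool saved_i = c.i_flag;            /* in  r0, 0x3f */

    c.i_flag = false;                   /* cli */
    c.irq_pending = true;               /* interrupt arrives while masked */
    service_irq(&c, honor_holdoff);

    c.sph = (uint8_t)(new_sp >> 8);     /* out 0x3e, r29 */
    service_irq(&c, honor_holdoff);

    c.i_flag = saved_i;                 /* out 0x3f, r0 */
    c.holdoff = saved_i;                /* the next instruction still runs first */
    service_irq(&c, honor_holdoff);

    c.spl = (uint8_t)new_sp;            /* out 0x3d, r28 */
    service_irq(&c, honor_holdoff);

    assert(c.isr_ran);
    return c.sp_seen;
}
```

For a 38-byte frame, sp_seen_by_isr(0x0210, 0x01EA, true) returns the complete new value 0x01EA. With honor_holdoff set to false the handler would see the half-updated 0x0110, which is the failure the original post was worried about.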

Regards,
Steve A.

The Board helps those that help themselves.


Tried with ATmega644P: No problems, SP points to correct address.

I'm going to code on the mega644 for a while so I enjoy myself more and see if the problem bites me again with larger code.

Quote:
Not so obvious since the next opcode (the restoring of the low byte of the stack pointer in this case) is guaranteed to be run before any interrupt will occur. This has been discussed before, it is not a bug.

Yup, while avrfreaks was down I read that in the specs. The ordering is still counter-intuitive though. It wouldn't hurt avr-gcc to re-out SPh and SPl one right after the other.

odokemono. I try to entice electrons to do my bidding.


Quote:
It wouldn't hurt avr-gcc to re-out SPh and SPl one right after the other.
Except that interrupts would remain disabled longer.


odokemono wrote:
The ordering is still counter-intuitive though. It wouldn't hurt avr-gcc to re-out SPh and SPl one right after the other.

There is no doubt that the ordering is counter-intuitive, that is unfortunate, but it doesn't make sense to rearrange it because that would guarantee another cycle of interrupt latency where it does not do any good. Given the fact that it would be increasing latency simply to make reading a bit easier, there is really no reason to do it because, as you found out, the behavior is documented.

Martin Jay McKee

As with most things in engineering, the answer is an unabashed, "It depends."


https://www.avrfreaks.net/index.p...
So it is a known bug, but it wouldn't hurt to submit it as a bug at https://savannah.nongnu.org/bugs...

JW


Just to let everyone know: I've found the problem. I have a couple of global variables that end up being allocated below the stack pointer.

From the beginning of a function which has a local array of about 36 bytes:

&BBDELAY=0x0214
&BBCKPR=0x0216
&puz=0x022A
Stack Pointer=0x0227

puz is the array local to the function, BBDELAY and BBCKPR are globals declared elsewhere. 0x227-0x216 gives me just 17 bytes of breathing room before BBCKPR gets clobbered, which it does when the function later calls another one which pushes a bunch of registers.
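The same subtraction as a small sketch. The function names are made up for illustration; the only hardware fact relied on is that an AVR push stores at SP and then decrements SP:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bytes that can still be pushed before the byte at addr is overwritten.
 * The locations SP, SP-1, ..., addr+1 are still free. */
uint16_t push_room(uint16_t sp, uint16_t addr)
{
    assert(sp >= addr);
    return (uint16_t)(sp - addr);
}

/* True if pushing n more bytes from sp would overwrite the byte at addr. */
bool push_clobbers(uint16_t sp, uint16_t addr, uint16_t n)
{
    return n > push_room(sp, addr);
}
```

push_room(0x0227, 0x0216) is 17, so in this model the 18th byte pushed lands on the first byte of BBCKPR. A nested call that saves a dozen registers plus return addresses gets there quickly.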

I don't know why my globals are at those addresses. Globals and statics should either be allocated from RAMSTART and up or from RAMEND and down with a properly positioned stack pointer at start.

There's no way in heck that I've got over 400 bytes of globals or static variables, in fact I've only got two uint8s and one uint16, that's more in the approximate region of 4 bytes. :)

I'll make a test case after lunch today and try to figure out more.

odokemono. I try to entice electrons to do my bidding.

Last Edited: Mon. Sep 20, 2010 - 02:01 PM

Quote:

There's no way in heck that I've got over 500 bytes of globals or static variables

Constant strings, perhaps?

Quote:
Just to let everyone know: I've found the problem.

Then only one question remains. Are you actually crazy? :wink:


"Some questions have no answers."[C Baird] "There comes a point where the spoon-feeding has to stop and the independent thinking has to start." [C Lawson] "There are always ways to disagree, without being disagreeable."[E Weddington] "Words represent concepts. Use the wrong words, communicate the wrong concept." [J Morin] "Persistence only goes so far if you set yourself up for failure." [Kartman]


Quote:
Constant strings, perhaps?
You know, when I decided to take the plunge back into micro-controllers, after 20+ years of coding only in C on Unix machines, I chose the AVR family of chips because I thought: "Hey! Cool toolchain, all open source, GCC, I can build my own programmer in 5 minutes! I know this like the back of my hand!"

I made a lot of assumptions that I shouldn't have based on knowledge that doesn't apply in this different domain.
I forgot Valentine Michael Smith's motto: "I am only an egg".

You, Sir Johan, deserve my most heartfelt thanks. My globals now reside very near RAMSTART, well away from that destructive pointer from the depths of hell and my strings-a-plenty have been safely segregated to PROGMEM, where they wave happily with PSTR.
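For readers who haven't met PROGMEM and PSTR: a minimal sketch of printing a string straight from flash, assuming avr-libc's <avr/pgmspace.h> on the target. print_P, putc_fn and the demo sink are hypothetical names, not the routines from this project, and the #else branch is only a stand-in so the sketch can be tried on a PC:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Host stand-ins (assumption, for off-target testing only). */
#define PROGMEM
#define PSTR(s) (s)
#define pgm_read_byte(p) (*(const uint8_t *)(p))
#endif

/* Output routine supplied by the caller, one character at a time. */
typedef void (*putc_fn)(char c);

/* Print a NUL-terminated string that lives in flash; returns its length. */
size_t print_P(const char *flash_str, putc_fn out)
{
    size_t n = 0;
    assert(flash_str != 0 && out != 0);
    for (;;) {
        /* Flash must be read through pgm_read_byte, not dereferenced. */
        char c = (char)pgm_read_byte(flash_str);
        if (c == '\0')
            break;
        out(c);
        flash_str++;
        n++;
    }
    return n;
}

/* Demo sink that collects the output in a small buffer. */
char   demo_buf[32];
size_t demo_len;

void demo_putc(char c)
{
    if (demo_len < sizeof demo_buf - 1) {
        demo_buf[demo_len++] = c;
        demo_buf[demo_len] = '\0';
    }
}
```

A call such as print_P(PSTR("array["), demo_putc) keeps the literal out of .data, so it no longer takes up SRAM at startup.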

Quote:
Then only one question remains. Are you actually crazy?

That's a possibility that I haven't the luxury to eliminate, ever.

I'll see about placing a bug report on the wrongful SP initialization for attiny88 though. The problem has been known for more than 1.5 years; it should still be fixed.

Oh, I forgot: good points all-around about the interrupt things, it does make a lot of sense.

Problem solved, caused by PEBKAC, end of thread, thanks everyone!

odokemono. I try to entice electrons to do my bidding.


odokemono wrote:
I'll see about placing a bug report on the wrongful SP initialization for attiny88 though. The problem has been known for more than 1.5 years; it should still be fixed.
Have a look at the newest toolchain from the beta site (www.atmel.no/beta_ware, AVR Toolchain Installer for use with AVR Studio 4.18 SP3 Release Candidate). That's the latest one from the developers (no, the public sources are not up to date); maybe it's already fixed.

JW


Hi Odo,

I hope you're (a little bit) crazy. My experience is that most "normal" people are boring.

I prefer crazy over boring.

Doing magic with a USD 7 Logic Analyser: https://www.avrfreaks.net/comment/2421756#comment-2421756

Bunch of old projects with AVR's: http://www.hoevendesign.com


wek wrote:
odokemono wrote:
I'll see about placing a bug report on the wrongful SP initialization for attiny88 though. The problem has been known for more than 1.5 years; it should still be fixed.
Have a look at the newest toolchain from the beta site (www.atmel.no/beta_ware, AVR Toolchain Installer for use with AVR Studio 4.18 SP3 Release Candidate). That's the latest one from the developers (no, the public sources are not up to date); maybe it's already fixed.

JW

It wasn't. I placed a bug report and it's been fixed since: http://savannah.nongnu.org/bugs/?31086

odokemono. I try to entice electrons to do my bidding.


Oh and by the way thanks to everyone's help and pointers, my project using the tiny88 is coming along well. It's a little board that receives WWV timecode signals (atomic clock) to discipline a freewheeling software oscillator and provides a 1 PPS for NTP use.

It also has a large display, which I like:
(That's a container of Play-Doh it's sitting on. The cabinet will come when the electronics and software are finished.)

odokemono. I try to entice electrons to do my bidding.


odokemono wrote:
It also has a large display

Nice. For some reason, I like 7-segments (where appropriate) more than a million-pixel LCD...

odokemono wrote:
The cabinet will come when the electronics and software are finished.)
Oh, this sounds SO familiar... ;-)

JW


Damn.
You've got a bigger display than I do.

But this one still beats yours:
http://www.standard-time.com/bil... :-)

Keep up the good work!

Doing magic with a USD 7 Logic Analyser: https://www.avrfreaks.net/comment/2421756#comment-2421756

Bunch of old projects with AVR's: http://www.hoevendesign.com


> http://www.standard-time.com/bil...
This is insane...
But the following is doable:
http://www.artlebedev.com/everyt...
Pity it's only a simulation.

JW