Help debug my ATtiny20 not running after indirect store instruction ST Z

Summary

 

I have a small C/C++ program that I'm compiling for an ATtiny20. I'm bringing it up for the first time by gradually letting more of the code run to see what breaks. I start by turning on an LED, then place an infinite loop that just spins progressively later in main() to confirm that everything before it works.

 

I've narrowed it down to two independent chunks of C/C++; commenting out either one "fixes" the issue. However, I need (read: would like) those lines in my program. I don't understand how they affect the running program, so I don't know how to fix it while keeping the features I want.

 

Unfortunately, I cannot post the entire piece of code. I can however post many of the relevant parts and hopefully I've included them all.

 

If there are pertinent missing pieces of information, please let me know and I can probably post them.

 


 

First "breaking" code

 

 

One of the first things my code does is turn on an LED, among other initialization. Here is my main() for reference.

 

Main.cpp

#include ...

cRGBW ws2812;

void main() {
  // Make sure watchdog timer is off
  wdt_disable();

  // Set clock source
  CCP = 0xD8;
  CLKMSR = 0;
  
  // Set clock prescaler
  CCP = 0xD8;
  CLKPSR = 0;

  // Set button IOpin as input with pull-up enabled
  button.input();
  button.enablePullUp();
  
  // Set vibrator control line as output
  vib.output();
  vib.off();
  
  // Set LED IOpin as output
  led.output();
  led.on();

  MLX::init();

  // while (1); // Uncomment to keep led on.

  ws2812.g = 255;

  while (1); // Prevent rest of code from running for now
  
  // ...
}

 

And here we have the seemingly problematic line:

ws2812.g = 255;

Now, this seems pretty simple. ws2812 is statically allocated. Just set all the bits in one byte.

 

But looking at the compiled assembly (generated with `avr-gcc -S`), that line generates 4 lines of assembly, one of which is suspect. (Bonus: Why doesn't this compile to a single Store Direct to SRAM (STS) instruction?)

  .file	"main.cpp"
__SP_H__ = 0x3e
__SP_L__ = 0x3d
__SREG__ = 0x3f
__CCP__ = 0x3c
__tmp_reg__ = 16
__zero_reg__ = 17
  .section	.text.startup,"ax",@progbits
.global	main
  .type	main, @function
main:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
  ldi r20,lo8(-40)
  ldi r21,0
/* #APP */                     //Start of wdt_disable()
  ;  29 "src/main.cpp" 1
  in __tmp_reg__,__SREG__
  cli
  wdr
  out 60,r20
  in  r21,49
  cbr r21,8
  out 49,r21
  out __SREG__,__tmp_reg__
  
  ;  0 "" 2
/* #NOAPP */                   //End of wdt_disable()
  out __CCP__,r20              //CCP = 0xD8;
  out 0x37,__zero_reg__        //CLKMSR = 0;
  out __CCP__,r20              //CCP = 0xD8;
  out 0x36,__zero_reg__        //CLKPSR = 0;
  cbi 0x5,2                    //button.input();
  sbi 0x7,2                    //button.enablePullUp();
  sbi 0x1,6                    //vib.output();
  cbi 0x2,6                    //vib.off();
  sbi 0x1,5                    //led.output();
  sbi 0x2,5                    //led.on();
  rcall _ZN8MLX90363IN3AVR5IOpinINS0_5Ports1AELh2EEEN10libCameron16BitBangMasterSPIINS1_IS3_Lh1EEENS1_INS2_1BELh0EEENS1_IS3_Lh0EEELb0ELb0ELj0EEEE4initEv   //MLX::init();
  ldi r20,lo8(-1)             // These lines go away if I uncomment the earlier while (1);
  ldi r30,lo8(ws2812)         // These lines go away...
  ldi r31,hi8(ws2812)         // These lines go away...
  st Z,r20                    // These lines go away... This I believe is the instruction that kills it
.L2:
  rjmp .L2
  .size	main, .-main
.global	ws2812
  .type	ws2812, @object
  .size	ws2812, 4
ws2812:
  .zero	4
  .ident	"GCC: (AVR_8_bit_GNU_Toolchain_3.5.4_1709) 4.9.2"
.global __do_clear_bss

I believe the call to MLX::init() isn't doing anything to mess with the LED: if I leave that call in and spin in the `while (1);` right after it, the LED stays on.

 

Looking at the 4 lines of assembly that are apparently causing the problem, I don't see how an `LDI` instruction could mess anything up. My bet is that the address of ws2812 is too large or something and it's accidentally messing with the stack pointer when `ST Z,x` is run. However, using readelf -s to list symbols, as far as I understand the output, the SRAM location of ws2812 is only 0x4b which isn't that high. Am I reading this wrong?

Symbol table '.symtab' contains 153 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 SECTION LOCAL  DEFAULT    1
[...]
    74: 0080004b     4 OBJECT  GLOBAL DEFAULT    3 ws2812
[...]
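
As a sanity check on that reading, here is a minimal sketch of how the symbol value can be decoded. It relies on the avr-gcc/binutils convention of placing data-space symbols at an offset of 0x800000 in the ELF file. The SRAM window of 0x0040 to 0x00BF (128 bytes) is an assumption taken from the ATtiny20 memory map and should be checked against the datasheet; the helper names `sramAddress` and `fitsInSram` are made up for illustration.

```cpp
#include <cstdint>

// avr-gcc/binutils link the data space at this offset so that SRAM
// addresses do not collide with flash addresses inside the ELF file.
constexpr std::uint32_t kElfDataOffset = 0x800000;

// Assumed ATtiny20 SRAM window (128 bytes). Verify against the datasheet.
constexpr std::uint16_t kSramStart = 0x0040;
constexpr std::uint16_t kSramEnd   = 0x00BF;

// Convert a data symbol's ELF value into the address the CPU actually uses.
constexpr std::uint16_t sramAddress(std::uint32_t elfValue) {
  return static_cast<std::uint16_t>(elfValue - kElfDataOffset);
}

// True if an object of `size` bytes at `elfValue` lies entirely inside SRAM.
constexpr bool fitsInSram(std::uint32_t elfValue, std::uint16_t size) {
  return sramAddress(elfValue) >= kSramStart &&
         sramAddress(elfValue) + size - 1 <= kSramEnd;
}

// ws2812 from the symbol table above: value 0x0080004b, size 4.
static_assert(sramAddress(0x0080004b) == 0x4b, "ws2812 decodes to SRAM 0x4b");
static_assert(fitsInSram(0x0080004b, 4), "ws2812 lies inside the assumed SRAM window");
```

Under those assumptions, 0x4b is a valid SRAM address near the bottom of the window, so the store itself should not land on the stack pointer registers.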

Second "breaking" lines

 

Diving into this deeper, I realized I am using a global instance of one of my simple classes, which means its constructor is run before main() by _init(). I know this language feature can be prone to problems, but I believe I've used it carefully. However, it apparently is part of the problem, since commenting out the body of the constructor below keeps the LED on. Alternatively, commenting out the global instance of the class (which prevents the constructor from running at all) also makes the LED work. Is there something I'm doing here that is problematic? What am I missing?

 

template <class MOSI, class MISO, class SCLK, bool CPOL_ = false, bool CPHA_ = false, unsigned int halfBitPeriodClockCycles = 0>
class BitBangMasterSPI {
	static inline void clkIdle() {SCLK::set(CPOL_);}
	static inline void clkActive() {SCLK::set(!CPOL_);}
	
	// ...
};

template <class MOSI, class MISO, class SCLK, bool CPOL_, bool CPHA_, unsigned int halfBitPeriodClockCycles>
BitBangMasterSPI<MOSI, MISO, SCLK, CPOL_, CPHA_, halfBitPeriodClockCycles>::BitBangMasterSPI() {
  MOSI::off();
  clkIdle();
  
  MOSI::output();
  SCLK::output();
  MISO::input();
}
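
To make the ordering concrete, here is a minimal host-side sketch of a global instance whose constructor runs during startup, before main() is entered. `FakeSpi` is a hypothetical stand-in, not the real BitBangMasterSPI, and a plain flag takes the place of the port writes.

```cpp
// Hypothetical stand-in for a class that configures pins in its constructor.
struct FakeSpi {
  static bool configured;            // stands in for the port/DDR writes
  FakeSpi() { configured = true; }   // runs during startup, not from main()
};

// Constant-initialized, so it is already false before any constructor runs.
bool FakeSpi::configured = false;

// Global instance: the startup code calls the constructor before main().
FakeSpi spi;

// By the time anything called from main() executes, the constructor has run.
bool constructorRanBeforeMain() { return FakeSpi::configured; }
```

On the AVR, the same mechanism means the constructor's pin writes and its `rcall` happen before wdt_disable() and the clock setup in main(), using whatever stack pointer the startup code has established by then.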

Looking at the compiled assembly, I don't see anything weird (besides the fact that clkIdle() fully inlines and compiles down to a single instruction with -O1 or -O2 but not with -Os). All the functions report a stack usage of 0, so I don't expect that I'm running out of SRAM. I realize that each function call has a 2-byte overhead for the return address, but I don't believe I'm calling that many layers deep. Maybe I'm being naive.

  .file	"BitBangMasterSPI.impl.cpp"
__SP_H__ = 0x3e
__SP_L__ = 0x3d
__SREG__ = 0x3f
__CCP__ = 0x3c
__tmp_reg__ = 16
__zero_reg__ = 17
  .section	.text._ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE9clkActiveEv,"axG",@progbits,_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE9clkActiveEv,comdat
  .weak	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE9clkActiveEv
  .type	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE9clkActiveEv, @function
_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE9clkActiveEv:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
  sbi 0x2,0                                     // SCLK::set(!CPOL_);
  ret
  .size	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE9clkActiveEv, .-_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE9clkActiveEv
  .section	.text._ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE18halfBitPeriodDelayEv,"axG",@progbits,_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE18halfBitPeriodDelayEv,comdat
  .weak	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE18halfBitPeriodDelayEv
  .type	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE18halfBitPeriodDelayEv, @function
_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE18halfBitPeriodDelayEv:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
  ret
  .size	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE18halfBitPeriodDelayEv, .-_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE18halfBitPeriodDelayEv
  .section	.text._ZN3AVR5IOpinINS_5Ports1AELh0EE3offEv,"axG",@progbits,_ZN3AVR5IOpinINS_5Ports1AELh0EE3offEv,comdat
  .weak	_ZN3AVR5IOpinINS_5Ports1AELh0EE3offEv
  .type	_ZN3AVR5IOpinINS_5Ports1AELh0EE3offEv, @function
_ZN3AVR5IOpinINS_5Ports1AELh0EE3offEv:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
  cbi 0x2,0                                     // SCLK::set(CPOL_);
  ret
  .size	_ZN3AVR5IOpinINS_5Ports1AELh0EE3offEv, .-_ZN3AVR5IOpinINS_5Ports1AELh0EE3offEv
  .section	.text._ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE7clkIdleEv,"axG",@progbits,_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE7clkIdleEv,comdat
  .weak	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE7clkIdleEv
  .type	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE7clkIdleEv, @function
_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE7clkIdleEv:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
  rjmp _ZN3AVR5IOpinINS_5Ports1AELh0EE3offEv
  .size	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE7clkIdleEv, .-_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EE7clkIdleEv
  .section	.text._ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EEC2Ev,"axG",@progbits,_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EEC5Ev,comdat
  .weak	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EEC2Ev
  .type	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EEC2Ev, @function
_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EEC2Ev:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
  cbi 0x2,1                                             // MOSI::off();
  rcall _ZN3AVR5IOpinINS_5Ports1AELh0EE3offEv           // clkIdle();
  sbi 0x1,1                                             // MOSI::output();
  sbi 0x1,0                                             // SCLK::output();
  cbi 0x5,0                                             // MISO::input();
  ret
  .size	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EEC2Ev, .-_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EEC2Ev
  .weak	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EEC1Ev
  .set	_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EEC1Ev,_ZN10libCameron16BitBangMasterSPIIN3AVR5IOpinINS1_5Ports1AELh1EEENS2_INS3_1BELh0EEENS2_IS4_Lh0EEELb0ELb0ELj0EEC2Ev
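
On the stack question above, a rough budget can be sketched as follows. The 128-byte SRAM size is an assumption from the ATtiny20 datasheet, the 2 bytes per call is the return address pushed by `rcall`, and the figure of 15 static bytes is hypothetical: it assumes data starts at 0x40 and that ws2812 (0x4b, 4 bytes) is the last static object, which the truncated symbol table does not confirm.

```cpp
// Assumed ATtiny20 SRAM size in bytes. Verify against the datasheet.
constexpr unsigned kSramBytes = 128;

// Return address pushed by each rcall (2-byte program counter assumed).
constexpr unsigned kBytesPerCall = 2;

// SRAM left over for the stack once .data/.bss are placed.
constexpr unsigned stackHeadroom(unsigned staticBytes) {
  return staticBytes >= kSramBytes ? 0 : kSramBytes - staticBytes;
}

// How many nested calls fit, given extra frame bytes used by each call.
constexpr unsigned maxCallDepth(unsigned staticBytes, unsigned frameBytesPerCall) {
  return stackHeadroom(staticBytes) / (kBytesPerCall + frameBytesPerCall);
}

// Hypothetical numbers: 15 static bytes, frames of size 0 as in the listings.
static_assert(stackHeadroom(15) == 113, "113 bytes would remain for the stack");
static_assert(maxCallDepth(15, 0) == 56, "about 56 nested calls would fit");
```

If those assumptions hold, a call chain a few levels deep is nowhere near exhausting SRAM, which would point away from a plain stack overflow and toward where the stack pointer is initialized.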

Questions

 

Is there something more going on in _init() that I'm not realizing?

Is there anything else that could be causing this problem?

 

Bonus:

 

Why aren't all of my small inline functions compiling all the way down to single lines of assembly? They seem to do it sometimes, but not when there is a longer chain of trivial function calls. I've tried with and without -fno-inline-small-functions and a bunch of other compiler flags.

Why is gcc using a Store Indirect (and 4 instructions) instead of a single Store Direct?

 


Reference

 

  • avr-gcc version: avr-g++ (AVR_8_bit_GNU_Toolchain_3.5.4_1709) 4.9.2
  • Using Atmel ICE and Atmel Studio 7 to load .hex files
  • Interrupts are not used, WDT is not enabled with fuses
  • I'm using avr-gcc's default _init()
  • I'm using -Os optimizations (-O2 produces the same assembly for the first issue, but I can't test it on hardware because the output hex is too large)
  • Using custom Makefile (c/cpp files are compiled to .o, some .o archived into .a files, all .o and .a files linked into elf)
  • avr-g++ simplified flags:  -c -std=gnu++11 -fno-exceptions -mmcu=attiny20 -DF_CPU=8000000UL -UAVR -Os -pipe -ffreestanding -Wall -fshort-enums -funsigned-char -funsigned-bitfields -fno-inline-small-functions -fno-strict-aliasing -fpack-struct -ffunction-sections -fdata-sections

 
