Adding a third case breaks my switch statement. What am I doing wrong?

Go To Last Post
24 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I’ve spent two full days now troubleshooting the following problem and I can’t figure it out for the life of me.  The code seems so trivial that I feel like an idiot that it's not working.  I have the following function with a switch/case block inside, which works just fine on my ATmega168.  The function takes an argument row of type uint8_t and returns another uint8_t corresponding the selected case:

uint8_t lcd_get_offset(uint8_t row) {
  uint8_t row_offset = 0x00;  // initialize to zero
  switch(row) {
  case 1 :
    row_offset = 0x01;
    break;
  case 2 :
    row_offset = 0x02;
    break;
  default :
    row_offset = 0x06;
    break;
  }
  return row_offset;
}

Easy: the result of lcd_get_offset(1) is 1, as expected. The result of lcd_get_offset(2) is 2, as expected as well. The function returns 6 for any other value. No problemo.

 

If I simply add a third case to the switch statement, however, things go awry.  Now, the following function with case 3 added returns 255 for function argument values of row=1, 2, or 3, which is wrong, but it still returns the default value of 6 for any other argument.  All I did was add the third case, the function is otherwise exactly the same.

uint8_t lcd_get_offset(uint8_t row) {
  uint8_t row_offset = 0x00; // initialize to zero
  switch(row) {
  case 1 :
    row_offset = 0x01;
    break;
  case 2 :
    row_offset = 0x02;
    break;
  case 3 :   // adding case 3 breaks things if input argument row =1, 2 or 3.
    row_offset = 0x03;
    break;
  default :
    row_offset = 0x06;
    break;
  }
  return row_offset;
}

If I comment out the three lines starting with "case 3 :” then the function works just fine once again.  Is there a fundamental issue with my switch/case statement?  Otherwise, is this some kind of issue with the compiler? About a year ago I upgraded my toolchain to the latest version of Crosspack-AVR, with avr-gcc 4.8.1, but haven't had any issues until now.  The second version of my function works fine and returns the expected values when compiled with gcc as a plain old non-AVR program on my computer, which makes me think the code itself is fine.  Perhaps it's a bug in avr-gcc?  Any insight is appreciated!

 

Thanks!

This topic has a solution.
Last Edited: Mon. Feb 21, 2022 - 01:54 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

How do you "know" what is being returned?  Any chance you are misreading what the function is giving?  It seems rather odd, since 255 is not ever assigned at all.

 

Just for grins, what happens if you use char instead of unint8_t?

 

Could there be some hidden characters (invisible on screen), throwing things into the ditch?---you could try retyping the whole thing.

I've very rarely had lines fail for "no reason" and retyping them did the trick.

 

Aside from all that, looks like you could make this much shorter...if it is <=3 return the value, else return 0x06...why make it complicated & a mile longer?

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

Last Edited: Sun. Feb 20, 2022 - 09:52 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Thanks for the suggestions!  I just tried retyping the functions using char instead of uint8_t, and unsigned char, and the behavior is exactly the same.  The first version of my function returns 1 for case 1, and 2 for case 2, and 6 for anything else, while the second version (adding case 3) returns 255 for argument values of 1, 2, or 3.

 

 

I "know" what the function returns because I've tried printing it to the UART, and to an LCD, and in both cases the first function works just fine and prints the expected value. The second version of my function just returns 255.  All I did was add the third case.   Any other suggestions to verify what my function is actually returning?

 

As for why I'm using the switch statement and don't just return the value - I eventually need to have several more cases for different switch arguments, and the cases will return other 8-bit values that aren't just the argument like I've illustrated above. I'm trying to build up to that point, but wanted to get it working with just a couple of cases first.

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The only thing I can think of is to start your function off with return 0x06 (basically the function does nothing) & see if you are print 6 as expected...if it still prints 255, you know to concentrate on the printing.

 

Edit: well I sorta take that idea back, since you said you do get 6 in the smaller number of cases.

 

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

Last Edited: Sun. Feb 20, 2022 - 10:29 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 2

> Is there a fundamental issue with my switch/case statement?

No that's just fine. You proved this yourself on your PC.

 

> Otherwise, is this some kind of issue with the compiler?

No - More like some issue with the code calling this function.

 

> Perhaps it's a bug in avr-gcc?

After looking at the ASM generated (for avr-gcc 5.4.0)  then certainly not:

 

We freaks often ask for a complete minimal test program that demonstrates the problem. This is one of those cases.

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Another idea:

Could the length or position(s) of your code cause some interference, say the LCD code was expecting something at a certain address, or accidentally using something in flash, that becomes different with the added code (such as normally 0xFF, but no longer)? Those kind of problems can be rather head scratchers, until you flesh them out.  You could have some timing sensitivity, where if the response was delayed 0.01 us, you get bogus results (though you'd have to work hard to create such a beast).

 

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

Last Edited: Sun. Feb 20, 2022 - 10:48 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 1

This behaviour is really strange.

Try making your row parameter an uint16_t!

Did you check with step-by-step-debugging what is going on?
Doing this, try to understand what it is going on in the disassembly window.
Try other optimzation levels - especially lower levels
Try the function with an assignment of the return value to an volatile static variable - just to be sure.
My gut feeling and experience says, that the problem may be somewhere else...

Amazingly, as soon as you're doing it right, it works!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The only thing I can think of is to start your function off with return 0x06 (basically the function does nothing) & see if you are print 6 as expected...

Sure enough, if I explicitly enter an integer literal for the function return, that is what gets printed to both the UART and the LCD.  I've used the same UART library for several other small projects and have had no issues with it before, so I'm pretty comfortable trusting the return values streamed over the serial port.

We freaks often ask for a complete minimal test program that demonstrates the problem. This is one of those cases.

Good idea.  My program isn't that big but it's split into a 6 different files on account of the UART and LCD libraries.  I'll try to consolidate everything into the shortest program that still has the issue.  

 

Thanks again.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I'll try to consolidate everything into the shortest program that still has the issue.  

That really makes me wonder if it isn't two chunks of code somehow banging into each other, somehow length or placement dependent.  At least you are finding out things bit -by-bit 

 

There could be a marker (such as end of line) that gets squeezed out of existence...many many years ago I had some LCD flicker, everything worked great, except this damn occasional flicker...took a full day to find it. Debugger proved there was no issue & certain routines were executing differently than actually observed with the scope when not debugging--an impossibility!!   After a full day of aggravation, I discovered a terminator was left off of one menu item, causing an LCD writing overflow wraparound causing some other routines to run (displaying garbage/flicker for a fraction of a ms), until it finally hit something in memory that looked like the terminator.   It just so happened the ram debugger inserted the same terminator character (what was the chance of that!!) to allow single step debugging...with that insertion, the problem went away, since it hit inserted character before wraparound could occur...so debugging would "fix the problem". 

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Rwilliams06 wrote:
Perhaps it's a bug in avr-gcc?  Any insight is appreciated!

 

You are describing "magical" behavior that can't be diagnosed without extra information. An assembly listing for the function in question would be useful.

Dessine-moi un mouton

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I’ve spent two full days now troubleshooting the following problem and I can’t figure it out for the life of me.

Spent a a little time figuring out how to read the compiler asm output, and you will get your answer. There is nothing that should be a mystery if you look at the asm output. Get your compile process to produce an lss file via objdump so you can see what is going on with the compiler.

 

A primer-

https://godbolt.org/z/rzq5fjbv8

In this example (using gcc 4.6.4), reading the asm shows that there is nothing wrong or could be wrong with that function. Choosing a different compiler version or optimization options could make the compiler inline the use if it knows the row value (no call, it already knows what the result will be), but the results should still be correct.

 

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 1

Thanks to everyone for the helpful responses!  I determined that it's definitely an issue with the optimization settings in the version of avr-gcc 4.8.1 that I'm using.  If I select optimization level 2 or higher, and the function with the switch is in a separate .c file, the switch/case does not work properly if I add a third case statement.  Going down to optimization level 1 or 0 makes everything work fine again.

 

The funny thing is that if I compile the exact same files with avr-gcc 4.3, I can use all of the optimization levels and don't have any problems.

 

I'm new to looking at the asm output and need to spend more time figuring out how to save and interpret the asm output.  Whenever I use objdump on my program object file the result is almost 2k lines long and it's not clear to me yet which parts are relevant.  I'll study up on this and follow up once I have a better grasp on the asm dump.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 1

Thanks to everyone for the helpful responses!  I determined that it's definitely an issue with the optimization settings in the version of avr-gcc 4.8.1 that I'm using. 

 Well, that hardly seems palatable--your function is only the barest of bones in simplicity- rudimentary logic---why would it undergo such a wild gyration response?  I'd think there must be more to the story.  keep poking at looking at the ASM output.

 

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 2

avrcandies wrote:
I'd think there must be more to the story.
+1

 

But until we see the short, compilable program that still exhibits the problem it's all just idle speculation and conjecture. 

This reply has been marked as the solution. 
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 1

See this

 

https://www.avrfreaks.net/forum/...

 

The original post appears to describe exactly the same issue.

 

There's also https://www.avrfreaks.net/forum/...

Dessine-moi un mouton

Last Edited: Mon. Feb 21, 2022 - 12:53 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Please let me know if there's a better way to post my program other than copy-and-paste.  Here are the files which I'm compiling in avr-gcc 4.8.1.  I've stripped them all of the UART and other LCD code from my program, and whittled it town to a single function in lcd.c which is called by the main program.  There are big differences in the asm dump depending on if all three cases are included, or if the third case is commented out.  Here's the code for lcd.c, including the switch statement:

// lcd.c

#include "lcd.h"

char lcd_get_offset(char row) {
  char row_offset = 0x00; // initialize to zero
  switch(row) {
  case 1 :
    row_offset = 0x01;
    break;
  case 2 :
    row_offset = 0x02;
    break;
  case 3 :
    row_offset = 0x03;
    break;
  default :
    row_offset = 0x06;
    break;
  }
  return row_offset;
}

The header is just:

//lcd.h
char lcd_get_offset(char row);

The program that calls the function is:

// trigger_counter.c
#define F_CPU 14745600  // Clock speed in Hz

#include "./lcd.h"

int main() {

  char switch_result0 = 0;
  char switch_result1 = 0;
  char switch_result2 = 0;

  switch_result0 = lcd_get_offset(0);
  switch_result1 = lcd_get_offset(1);
  switch_result2 = lcd_get_offset(2);

  while(1) {
    // do nothing
  }

  return 0;
}

The makefile I'm using to compile this is

GCCFLAGS=-g -O2 -Wall -mmcu=atmega168

all:	trigger_counter.c lcd.c
	avr-gcc ${GCCFLAGS} -o lcd.o -c lcd.c
	avr-gcc ${GCCFLAGS} -o trigger_counter.o trigger_counter.c lcd.o
	avr-objcopy -j .text -O ihex trigger_counter.o trigger_counter.hex

 

Using avr-objdump to look at the asm generated by the compiler, the output is very different depending on if I include the third case in the switch statement.  With the third case included, the dump of lcd.o is as follows. 

 

// output with all 3 cases included in switch statement
lcd.o:     file format elf32-avr

Disassembly of section .text:

00000000 <lcd_get_offset>:
// lcd.c

#include "lcd.h"

char lcd_get_offset(char row) {
   0:	81 50       	subi	r24, 0x01	; 1
   2:	83 30       	cpi	r24, 0x03	; 3
   4:	00 f4       	brcc	.+0      	; 0x6 <lcd_get_offset+0x6>
   6:	e8 2f       	mov	r30, r24
   8:	f0 e0       	ldi	r31, 0x00	; 0
   a:	e0 50       	subi	r30, 0x00	; 0
   c:	f0 40       	sbci	r31, 0x00	; 0
   e:	80 81       	ld	r24, Z
  10:	08 95       	ret
  char row_offset = 0x00; // initialize to zero
  switch(row) {
  12:	86 e0       	ldi	r24, 0x06	; 6
  default :
    row_offset = 0x06;
    break;
  }
  return row_offset;
}
  14:	08 95       	ret

 

And now here's the object dump with case 3 commented out in the switch statement, still using -O2.  The program works as expected in this case.  It's interesting that commenting out the third switch case results in a significantly larger asm dump:

// output with only 2 cases included in switch statement
lcd.o:     file format elf32-avr

Disassembly of section .text:

00000000 <lcd_get_offset>:

#include "lcd.h"

char lcd_get_offset(char row) {
  char row_offset = 0x00; // initialize to zero
  switch(row) {
   0:	81 30       	cpi	r24, 0x01	; 1
   2:	01 f0       	breq	.+0      	; 0x4 <lcd_get_offset+0x4>
   4:	82 30       	cpi	r24, 0x02	; 2
   6:	01 f4       	brne	.+0      	; 0x8 <lcd_get_offset+0x8>
  case 1 :
    row_offset = 0x01;
    break;
  case 2 :
    row_offset = 0x02;
   8:	82 e0       	ldi	r24, 0x02	; 2
  default :
    row_offset = 0x06;
    break;
  }
  return row_offset;
}
   a:	08 95       	ret
    break;
    //case 3 :
    //row_offset = 0x03;
    //break;
  default :
    row_offset = 0x06;
   c:	86 e0       	ldi	r24, 0x06	; 6
    break;
   e:	08 95       	ret

char lcd_get_offset(char row) {
  char row_offset = 0x00; // initialize to zero
  switch(row) {
  case 1 :
    row_offset = 0x01;
  10:	81 e0       	ldi	r24, 0x01	; 1
  12:	08 95       	ret

I'm still figuring out what all of the asm instructions mean, but this second asm dump with just 2 cases in the switch has much more going on than the first dump, which had all three cases in the switch. 

 

I can also post assembly dumps of the whole program which is much, much longer, and the asm dumps with optimization off, but I think these two capture the issue. As I mentioned before, with the optimization totally off, the switch/case statement works whether I have two cases, or three.

 

Any thoughts?  Otherwise I'm going to keep learning the asm to try and make sense of this.  Thanks again. 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

AndreyT wrote:

See this

 

https://www.avrfreaks.net/forum/...

 

The original post appears to describe exactly the same issue.

 

There's also https://www.avrfreaks.net/forum/...

 

Sure enough, that looks like the exact same issue as mine.  I just changed my makefile to include avr-objcopy -R .eeprom -R .fuse -R .lock instead of -j .text and now it works, regardless of optimization level and the number of cases!  Perhaps upgrading to avr-gcc 4.8.1 changed the way some of the compiled program was represented in memory and my old makefile that worked with gcc 4.3 didn't work anymore?

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 2

but this second asm dump with just 2 cases in the switch has much more going on than the first dump, which had all three cases in the switch. 

When there are fewer cases, the compiler just used 'immediate' data (it created inline) because of optimization. When switch size increases, at some point the compiler decides to use a table of data to lookup (setting up Z for the lookup in this case). The data is constant data, so goes into the .rodata subsection of the .data section, and the linker script will give the .data section a load address at the end of the text region, which means in the startup code that constant data gets copied from flash to ram. So optimization is hiding your problem (using objcopy), until you get enough switch cases for the compiler to start producing a constant lookup table. Different versions of the compiler may have different ideas about when to create that lookup table.

 

If you use objcopy to keep only .text, the data that is given a load address (into flash) is ignored. The startup code is still doing its job (moving flash data to ram data, expecting the data to be there), but since the resulting hex file did not include the .data section, you end up reading erased flash.

 

I can take your same code, generate an elf file, then use objcopy with -j .text to create a hex file, and the table will be missing from the hex file. Doesn't matter which version of compiler used.

 

 

Simple example-

 

//main.c

char data='a'; //this will be stored in flash, loaded into ram at startup

int main(){ SREG = data; while(1){} } //read data (force to be used via sreg, so not optimized away)

 

compile, then use objcopy -j .text, this is the relevant output- 

:0200A000FFCF90
:00000001FF

end of text, then end of file record

 

use objcopy -j .text -j .data, this is the relevant output- 

:0200A000FFCF90 //end of text
:0200A2006100FB //data to load into ram (0x61), a padding byte (0x00) added by linker script (align/fill) in this case, ignore
:00000001FF

notice that now the data that needs moving from flash to ram is now included

 

Not sure how you can ever get away with only -j .text if using any initialized data (flash to ram).

 

That is my take on it anyway.

 

edit- also not sure why the need to do any section specifying when using objcopy to produce a hex file, since the different kinds of memory are differentiated in the hex file for an avr, via address. Maybe it is a primary way to specify to a programmer 'what' to program (no fuse data seen = no fuse programming, for example), instead of other command line or gui options to indicate what you want programmed. Something like mplabx will use objcopy with no other options other than -O ihex, and the options to program eeprom, fuse bits takes place elsewhere (the programmer then ignores the data in the hex file that is not needed).

Last Edited: Mon. Feb 21, 2022 - 06:54 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Rwilliams06 wrote:
Please let me know if there's a better way to post my program other than copy-and-paste.

Copy-Paste means we get to see the code in the post - which often helps.

 

Another option is to zip-up the entire project, and attach that file.

This is particularly useful when the problem is beyond just source code, and includes the project setup. And when the code is just too big for a post.

 

 

Top Tips:

  1. How to properly post source code - see: https://www.avrfreaks.net/comment... - also how to properly include images/pictures
  2. "Garbage" characters on a serial terminal are (almost?) invariably due to wrong baud rate - see: https://learn.sparkfun.com/tutorials/serial-communication
  3. Wrong baud rate is usually due to not running at the speed you thought; check by blinking a LED to see if you get the speed you expected
  4. Difference between a crystal, and a crystal oscillatorhttps://www.avrfreaks.net/comment...
  5. When your question is resolved, mark the solution: https://www.avrfreaks.net/comment...
  6. Beginner's "Getting Started" tips: https://www.avrfreaks.net/comment...
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Why are you using 4.8.1?

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 1

Rwilliams06 wrote:
About a year ago I upgraded my toolchain to the latest version of Crosspack-AVR, with avr-gcc 4.8.1, but haven't had any issues until now.

https://github.com/obdev/CrossPack-AVR/blob/master/manual-source/releasenotes.html#L5

Download CrossPack for AVR Development

 

edit :

avr-gcc | MacPorts

 

"Dare to be naïve." - Buckminster Fuller

Last Edited: Mon. Feb 21, 2022 - 02:25 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 2

AVR Toolchain for Mac (avr-gcc 5.4.0, same as in Studio for PCs) is here: https://www.microchip.com/en-us/...

 

(although, this problem seems not so much a "bug" in 4.8.1, but rather a change in which section jump tables for case statements were placed in, causing an incompatibility with older makefiles use of objcopy ?  (occurring some time between 4.3 and 4.8)

 

Last Edited: Mon. Feb 21, 2022 - 05:58 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 1

@curtvm, Thanks for the explanation - reading more about the data sections, i think that all makes sense to me.

 

It's good to know there's a more recent AVR mac toolchain from Microchip.  Why have I been using an out-of-date toolchain? About 10 years ago I bought a "learn AVR" kit that included the MCU, LCD, and some tutorials, They recommended installing the AVR Crosspack toolchain since there wasn't an official Atmel mac toolchain at the time. I wrote a few simple programs to count trigger signals and send triggers with different time delays to some ultrasound equipment and I used the Makefile and LCD drivers included in the kit.  The tutorials didn't really get into the details of memory sections or what the -j .text part of the makefile did, but everything worked until I recently downloaded the newer Crosspack binaries.  That broke the LCD position selector, which had this switch statement in it... Long story short, I've learned more in the last few days about the AVR than I have in a while. I appreciate all of the responses.

Last Edited: Tue. Feb 22, 2022 - 12:28 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

BTW: Upgrading to avr-gcc 5.4.0 will not solve the OP's problem, It will still use .rodata under -Os optimisation: (Not that that's reason NOT to upgrade)

	.size	main, .-main
	.section	.rodata.CSWTCH.1,"a",@progbits
	.type	CSWTCH.1, @object
	.size	CSWTCH.1, 3
CSWTCH.1:
	.byte	1
	.byte	2
	.byte	3
	.ident	"GCC: (AVR_8_bit_GNU_Toolchain_3.6.1_1752) 5.4.0"
.global __do_copy_data