AS7 Bad Code Generation

Go To Last Post
22 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I am an experienced Embedded Developer and New to AS7 using the ATMega328PB.

 

The project works but while playing with the debugger strange things happened with the sequence of breakpoints and single steps.  The problem is the code in main() shows up multiple times in the .lss listing file.

 

Here is the .c :

 

int main (void)
{

	int SwLast;

	// Set PortB Bit 7 as Output
	DDRB |= BMSK_Led;
	// Set Port B Bit 5 as Input
	DDRB &= ~(BMSK_Switch);
	
	uart_0_init();

	sei();

	AX_Led_On();
	
	while( 1 ) 
	{
	// Loop forever

		if((PORT_Switch & BMSK_Switch) EQ BMSK_Switch)
		{
			// Switch Off
			if(SwLast EQ 1)
			{
				Msg[0] = 'T';
				Msg[1] = 0;
				Msg[2] = 1;
				Msg[3] = 2;
				Msg[4] = 3;
				Msg[5] = 4;
				Msg[6] = 5;
				Msg[7] = 6;
				Msg[8] = 7;
				
				Msglen = 5;
								
				if( SPkt_SendMsg() NE SPKT_SUCCESS )
				{
					AX_Led_Off();
				}
			}
			SwLast = 0;
		}
		else
		{
			// Switch On
			SwLast = 1;

		}
	}

}

And the illustrative part of the .lss file with narration:

 

At location e4, the compiler(linker) starts to output the object code for main().

 

After the call to AX_Led_On(), the compiler seems to completely skip the code for the two if() statements and goes into the output of the Msg[] assignment code where it seems to contain some gibberish although it might be considered as some type of optimization.  The compiler seems to stop after the assignment of Msg[2].

 

At location 102, the compiler again starts to output the object code for main() although now skipping a portion of the code which appeared in the last.  Performing the assignment for SwLast and then skipping the assignments and picking up at Msg[5].

 

At location 140, the compiler seems to finally get it right and the compile looks good and contains all the conditional code and the correct assignments in the correct sequence.

 

When the code at location 140 finally gets to run, the code works as advertised and debugs normally.  Before that, the debugger skips all around and lands on non executable lines.  This tells me there are messed up symbol tables and other things wrong with the Compile, Assemble, Link process.

 

000000e4 <main>:
{

	int SwLast;

	// Set PortB Bit 7 as Output
	DDRB |= BMSK_Led;
  e4:	25 9a       	sbi	0x04, 5	; 4
	// Set Port B Bit 5 as Input
	DDRB &= ~(BMSK_Switch);
  e6:	27 98       	cbi	0x04, 7	; 4
	
	uart_0_init();
  e8:	0e 94 be 00 	call	0x17c	; 0x17c <uart_0_init>

	sei();
  ec:	78 94       	sei

	AX_Led_On();
  ee:	0e 94 6e 00 	call	0xdc	; 0xdc <AX_Led_On>
		if((PORT_Switch & BMSK_Switch) EQ BMSK_Switch)
		{
			// Switch Off
			if(SwLast EQ 1)
			{
				Msg[0] = 'T';
  f2:	00 e0       	ldi	r16, 0x00	; 0
  f4:	11 e0       	ldi	r17, 0x01	; 1
  f6:	0f 2e       	mov	r0, r31
  f8:	f4 e5       	ldi	r31, 0x54	; 84
  fa:	4f 2e       	mov	r4, r31
  fc:	f0 2d       	mov	r31, r0
				Msg[1] = 0;
				Msg[2] = 1;
  fe:	55 24       	eor	r5, r5
 100:	53 94       	inc	r5

char Msg[10];  // Msg Buffer
int Msglen;

int main (void)
{
 102:	e1 2c       	mov	r14, r1
 104:	f1 2c       	mov	r15, r1

			SwLast = 1;
 106:	d1 e0       	ldi	r29, 0x01	; 1
 108:	c0 e0       	ldi	r28, 0x00	; 0
				Msg[5] = 4;
				Msg[6] = 5;
				Msg[7] = 6;
				Msg[8] = 7;
				
				Msglen = 5;
 10a:	0f 2e       	mov	r0, r31
 10c:	f5 e0       	ldi	r31, 0x05	; 5
 10e:	cf 2e       	mov	r12, r31
 110:	d1 2c       	mov	r13, r1
 112:	f0 2d       	mov	r31, r0
			if(SwLast EQ 1)
			{
				Msg[0] = 'T';
				Msg[1] = 0;
				Msg[2] = 1;
				Msg[3] = 2;
 114:	68 94       	set
 116:	66 24       	eor	r6, r6
 118:	61 f8       	bld	r6, 1
				Msg[4] = 3;
 11a:	0f 2e       	mov	r0, r31
 11c:	f3 e0       	ldi	r31, 0x03	; 3
 11e:	7f 2e       	mov	r7, r31
 120:	f0 2d       	mov	r31, r0
				Msg[5] = 4;
 122:	68 94       	set
 124:	88 24       	eor	r8, r8
 126:	82 f8       	bld	r8, 2
				Msg[6] = 5;
 128:	0f 2e       	mov	r0, r31
 12a:	f5 e0       	ldi	r31, 0x05	; 5
 12c:	9f 2e       	mov	r9, r31
 12e:	f0 2d       	mov	r31, r0
				Msg[7] = 6;
 130:	0f 2e       	mov	r0, r31
 132:	f6 e0       	ldi	r31, 0x06	; 6
 134:	af 2e       	mov	r10, r31
 136:	f0 2d       	mov	r31, r0
				Msg[8] = 7;
 138:	0f 2e       	mov	r0, r31
 13a:	f7 e0       	ldi	r31, 0x07	; 7
 13c:	bf 2e       	mov	r11, r31
 13e:	f0 2d       	mov	r31, r0

char Msg[10];  // Msg Buffer
int Msglen;

int main (void)
{
 140:	8e 2d       	mov	r24, r14
 142:	9f 2d       	mov	r25, r15
 144:	02 c0       	rjmp	.+4      	; 0x14a <main+0x66>

			SwLast = 1;
 146:	8d 2f       	mov	r24, r29
 148:	9c 2f       	mov	r25, r28
	
	while( 1 ) 
	{
	// Loop forever

		if((PORT_Switch & BMSK_Switch) EQ BMSK_Switch)
 14a:	1f 9b       	sbis	0x03, 7	; 3
 14c:	fc cf       	rjmp	.-8      	; 0x146 <main+0x62>
		{
			// Switch Off
			if(SwLast EQ 1)
 14e:	01 97       	sbiw	r24, 0x01	; 1
 150:	b9 f7       	brne	.-18     	; 0x140 <main+0x5c>
			{
				Msg[0] = 'T';
 152:	f8 01       	movw	r30, r16
 154:	40 82       	st	Z, r4
				Msg[1] = 0;
 156:	11 82       	std	Z+1, r1	; 0x01
				Msg[2] = 1;
 158:	52 82       	std	Z+2, r5	; 0x02
				Msg[3] = 2;
 15a:	63 82       	std	Z+3, r6	; 0x03
				Msg[4] = 3;
 15c:	74 82       	std	Z+4, r7	; 0x04
				Msg[5] = 4;
 15e:	85 82       	std	Z+5, r8	; 0x05
				Msg[6] = 5;
 160:	96 82       	std	Z+6, r9	; 0x06
				Msg[7] = 6;
 162:	a7 82       	std	Z+7, r10	; 0x07
				Msg[8] = 7;
 164:	b0 86       	std	Z+8, r11	; 0x08
				
				Msglen = 5;
 166:	d0 92 95 01 	sts	0x0195, r13	; 0x800195 <Msglen+0x1>
 16a:	c0 92 94 01 	sts	0x0194, r12	; 0x800194 <Msglen>
								
				if( SPkt_SendMsg() NE SPKT_SUCCESS )
 16e:	0e 94 72 01 	call	0x2e4	; 0x2e4 <SPkt_SendMsg>
 172:	89 2b       	or	r24, r25
 174:	29 f3       	breq	.-54     	; 0x140 <main+0x5c>
				{
					AX_Led_Off();
 176:	0e 94 70 00 	call	0xe0	; 0xe0 <AX_Led_Off>
 17a:	cc cf       	rjmp	.-104    	; 0x114 <main+0x30>

0000017c <uart_0_init>:
	 else
	 {
		 // Character(s) in Buffer
		 return( S_RX_CHAR );
	 }
 }
 17c:	10 92 4d 01 	sts	0x014D, r1	; 0x80014d <S_Tx0_Head+0x1>
 180:	10 92 4c 01 	sts	0x014C, r1	; 0x80014c <S_Tx0_Head>
 184:	10 92 93 01 	sts	0x0193, r1	; 0x800193 <S_Tx0_Tail+0x1>
 188:	10 92 92 01 	sts	0x0192, r1	; 0x800192 <S_Tx0_Tail>
 18c:	10 92 51 01 	sts	0x0151, r1	; 0x800151 <S_Rx0_Head+0x1>
 190:	10 92 50 01 	sts	0x0150, r1	; 0x800150 <S_Rx0_Head>
 194:	10 92 4b 01 	sts	0x014B, r1	; 0x80014b <S_Rx0_Tail+0x1>
 198:	10 92 4a 01 	sts	0x014A, r1	; 0x80014a <S_Rx0_Tail>
 19c:	10 92 c5 00 	sts	0x00C5, r1	; 0x8000c5 <__TEXT_REGION_LENGTH__+0x7e00c5>
 1a0:	89 e1       	ldi	r24, 0x19	; 25
 1a2:	80 93 c4 00 	sts	0x00C4, r24	; 0x8000c4 <__TEXT_REGION_LENGTH__+0x7e00c4>
 1a6:	10 92 c0 00 	sts	0x00C0, r1	; 0x8000c0 <__TEXT_REGION_LENGTH__+0x7e00c0>
 1aa:	86 e0       	ldi	r24, 0x06	; 6
 1ac:	80 93 c2 00 	sts	0x00C2, r24	; 0x8000c2 <__TEXT_REGION_LENGTH__+0x7e00c2>
 1b0:	88 e9       	ldi	r24, 0x98	; 152
 1b2:	80 93 c1 00 	sts	0x00C1, r24	; 0x8000c1 <__TEXT_REGION_LENGTH__+0x7e00c1>
 1b6:	08 95       	ret

 

I have included my complete project in a .zip file. 

 

Using Optimize -O1 and tried -O0 with no discernable changes.

 

Here is the clipboard copy from AS7 -> help -> about

 

Atmel Studio 7 (Version: 7.0.1417 - )
© 2015 Atmel Corp.
All rights reserved.

OS Version: Microsoft Windows NT 6.2.9200.0
Platform: Win32NT

Installed Packages: Shell VSIX manifest - 7.0
Shell VSIX manifest
Version: 7.0
Package GUID: e874ffe4-fbe3-4624-9a17-61014ede02d0
Company: Atmel Corporation

Installed Packages: Atmel Start - 1.0.102.0
Atmel Start
Version: 1.0.102.0
Package GUID: F8853255-9C7B-4DC2-8E0F-64D9324AEB0E
Company: Atmel

Installed Packages: Atmel Software Framework - 3.34.2.776
ASF
Version: 3.34.2
Package GUID: 4CE20911-D794-4550-8B94-6C66A93228B8
Company: Atmel
HelpUrl: http://asf.atmel.com/3.34.2
Release Description: ASF - 3.34.2 Release

ASF
Version: 3.34.1
Package GUID: 4CE20911-D794-4550-8B94-6C66A93228B8
Company: Atmel
HelpUrl: http://asf.atmel.com/3.34.1
Release Description: ASF - 3.34.1 Release

ASF
Version: 3.33.0
Package GUID: 4CE20911-D794-4550-8B94-6C66A93228B8
Company: Atmel
HelpUrl: http://asf.atmel.com/3.33.0
Release Description: ASF - 3.33.0 Release

ASF
Version: 3.32.0
Package GUID: 4CE20911-D794-4550-8B94-6C66A93228B8
Company: Atmel
HelpUrl: http://asf.atmel.com/3.32.0
Release Description: ASF - 3.32.0 Release

ASF
Version: 3.31.0
Package GUID: 4CE20911-D794-4550-8B94-6C66A93228B8
Company: Atmel
HelpUrl: http://asf.atmel.com/3.31.0
Release Description: ASF - 3.31.0 Release

ASF
Version: 3.30.1
Package GUID: 4CE20911-D794-4550-8B94-6C66A93228B8
Company: Atmel
HelpUrl: http://asf.atmel.com/3.30.1
Release Description: ASF - 3.30.1 Release

ASF
Version: 3.29.0
Package GUID: 4CE20911-D794-4550-8B94-6C66A93228B8
Company: Atmel
HelpUrl: http://asf.atmel.com/3.29.0
Release Description: ASF - 3.29.0 Release

ASF
Version: 3.28.1
Package GUID: 4CE20911-D794-4550-8B94-6C66A93228B8
Company: Atmel
HelpUrl: http://asf.atmel.com/3.28.1
Release Description: ASF - 3.28.1 Release

ASF
Version: 3.27.3
Package GUID: 4CE20911-D794-4550-8B94-6C66A93228B8
Company: Atmel
HelpUrl: http://asf.atmel.com/3.27.3
Release Description: ASF - 3.27.3 Release

ASF
Version: 3.27.0
Package GUID: 4CE20911-D794-4550-8B94-6C66A93228B8
Company: Atmel
HelpUrl: http://asf.atmel.com/3.27.0
Release Description: ASF - 3.27.0 Release

ASF
Version: 3.26.0
Package GUID: 4CE20911-D794-4550-8B94-6C66A93228B8
Company: Atmel
HelpUrl: http://asf.atmel.com/3.26.0
Release Description: ASF - 3.26.0 Release

 

Installed Packages: LiveWatch - 2.0.54
LiveWatch
Version: 2.0.54
Package GUID: 7DF6DCFD-2BCA-41C7-9C0E-1B7F606B008E
Company: Atmel

Installed Packages: GdbConsole - 7.0.188
GdbConsole
Version: 7.0.188
Package GUID: 49258291-0FED-4501-881B-6BAA91BEBCA8
Company: Atmel

Installed Packages: Atmel Kits - 7.0.98
Atmel Kits
Version: 7.0.98
Package GUID: 6F4B8FE4-C464-4916-8B43-AC92431C1CDF
Company: Atmel

Installed Packages: AtmelToolchainProvider - 7.0.894
AtmelToolchainProvider
Version: 7.0.894
Package GUID: AtmelToolchainProvider.Atmel.10EF9C74-D8DA-4872-85F5-D8BB3101E245
Company: Atmel

Installed Packages: Data Visualizer Extension - 2.11.655
Data Visualizer Extension
Version: 2.11.655
Package GUID: 25dc067d-df31-4e22-be7f-cc6a77ccc7f3
Company: Atmel

Installed Packages: Atmel Gallery - 7.8
Atmel Gallery
Version: 7.8
Package GUID: AtmelStudio7ExtensionManager
Company: Atmel

Installed Packages: Visual Assist for Atmel Studio - 10.9.2093.2
Visual Assist for Atmel Studio
Version: 10.9.2093.2
Package GUID: 7997A33C-B154-4b75-B2AC658CD58C9510
Company: Whole Tomato Software

 

 

Attachment(s): 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0
if((PORT_Switch & BMSK_Switch) EQ BMSK_Switch)

Show the macros used in that.

 

But anyway this is just optimisation isn't it? The compiler is free to re-order and split statements as long as the ultimate behaviour is still that which was requested.

 

Have you actually analysed what the opcode sequence achieves - does that differ from what you intended?

 

BTW when it does split statements it's not unusual to see the same piece of annotation several times in a file. The classic example is the for() loop. You may well see the same line being used to annotate three separate blocks of code (the init, the test and the per iteration code (increment usually).

 

EDIT: missed the fact that there was a code archive attached so for the benefit of others:

if((PORT_Switch & BMSK_Switch) EQ BMSK_Switch)

is actually

if((PINB & 0x80) == 0x80)

(which is OK!)

 

The actual code of that (and indeed the main loop of the while(1) is therefore:

 146:	8d 2f       	mov	r24, r29
 148:	9c 2f       	mov	r25, r28
	
	while( 1 ) 
	{
	// Loop forever

		if((PORT_Switch & BMSK_Switch) EQ BMSK_Switch)
 14a:	1f 9b       	sbis	0x03, 7	; 3
 14c:	fc cf       	rjmp	.-8      	; 0x146 <main+0x62>

The code will spend most if its time jumping back from 14c to 146. Only when bit 7 (0x80) in PINB (0x03) is set will it skip the rjmp to execute the rest.

Last Edited: Mon. May 15, 2017 - 03:32 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

clawson,

 

It is much more than optimization.  The same code is output several times in an incomplete fashion before it is output correctly in its entirety.  Turning OFF optimization using -O0 has no effect on this code generation run amok.  As far as I know, optimization would never generate blocks from the same c code which are in themselves incomplete and or incorrect.

 

The code you point to begins at location 140 and is correct as you point out.  It is the code before which is completely incorrect and deviant and in no way captures the intent of the programmer.

 

Locations e4 to 100 is code which is executed and is incorrect and out of sequence.

Locations 102 to 13E is incorrect code which is executed and is incorrect and out of sequence.

At location 140 is the complete and correct code which finally gets control and then is able to work correctly!

 

I believe this is a very serious issue.

 

Peter

 

 

 

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

PeterHouse wrote:
I believe this is a very serious issue.

I and others like "puzzles" like this.

 

To make it easier for us to explore it is useful to post a complete test program.

 

[interesting -- 'PB.  Same generation for 'PA?]

 

Standard blablabla -- Complete test program; toolchain; version; optimization setting(s); target clock speed.

 

 

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

PeterHouse wrote:
Using Optimize -O1

PeterHouse wrote:
The same code is output several times in an incomplete fashion before it is output correctly in its entirety.

I'm not a GCC guru.  If -O1 is interpreted as "speed" then indeed you might see pieces of code "unrolled" and see more than one sequence.

 

 

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

PeterHouse wrote:
int SwLast;

Indeed the .LSS seems to "jump into" an unintended sequence.

 

But let's clear up SwLast first.  You have made it an automatic variable, and a signed int at that.  Automatic variables have no defined initial value, so the compiler can assume any value it wants.  I'm guessing that if one follows through the generated code it is putting its assumed case of what will happen the first time through at the top of the loop -- which saves a couple cycles getting to the code.

 

The while(1) is fairly simple, so probably no harm.  In other cases, though, it may do unintended things the first time through.  Didn't you get a build warning on that?  Give an initial value of 0, and you still might get the same order...

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The complete code in the form of an AS7 project is posted in the original posting as a .zip file.  The target speed is 16mHz provided by an external oscillator - and no relation to the compiler generating bad code.  This product is being breadboarded using an ATMEL ATMega328PB XPlained Mini module.

 

This does not appear to be a compiler optimization setting problem.  -O1 is the lowest optimization setting other than complete off, -O0, and this setting does not change the bad code generated.

 

The code is classic embedded program code, setup code followed by a forever loop.  The compiler is generating 3 instances of the startup and forever loop and the first two are just incorrect and the third is correct and executes as such.  Fortunately, the execution path falls into the third block of correct code after executing the first two blocks of incorrect code.

 

Two big problems:

   1. how to debug - setting a breakpoint on the first instruction stop correctly and then single stepping the program send the debugger off into the woods until it gets to the "real" code at location 140.

   2. Trust - how do you trust a compiler which will do this?

 

Peter
 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

PeterHouse wrote:
Locations e4 to 100 is code which is executed and is incorrect and out of sequence.

Locations 102 to 13E is incorrect code which is executed and is incorrect and out of sequence.

At location 140 is the complete and correct code which finally gets control and then is able to work correctly!

I disagree.

As Cliff said that are split statements. E.g. "Msg[7] = 6;":

				Msg[7] = 6;
 130:	0f 2e       	mov	r0, r31
 132:	f6 e0       	ldi	r31, 0x06	; 6
 134:	af 2e       	mov	r10, r31
 136:	f0 2d       	mov	r31, r0

				Msg[7] = 6;
 162:	a7 82       	std	Z+7, r10	; 0x07

First (at 130) r10 is prepared with the value 6, later (at 162) the prepared value is stored to Msg[7].

Stefan Ernst

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

For the project, the value SwLast does get assigned in time for the device to work and the SwLast only has two possible values: 0 and 1 to store the last values of an input switch which will not be in the final product.  While debugging, the RAM does seem to be initialized although I know this will NOT be the case when run on a real chip at power up.

 

Eliminating this variable has no impact on the bad code generated.  Other than removing this variable.

 

Peter

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The COMPLETE and CORRECT code as it should be generated begins at location 140 and ends at 17a. 

 

The code between locations e4 and 13e is at best incomplete and is unnecessary.

 

I understand the concept of compiler optimization and stuffing registers with values used many time later and even optimizing out sections of code.  This is not a compiler optimization issue - the compiler is simply getting confused and outputting the SAME code multiple times. 

 

I believe there is something in my code which is causing this issue.  After having loaded and compiled several examples and not been able to see this issue anywhere else there is no other conclusion.

 

Peter

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

PeterHouse wrote:
The code between locations e4 and 13e is at best incomplete and is unnecessary.
Well that is not true for starters. It's not E4 as the lower limit because at E4 is exactly what you asked for:

DDRB |= BMSK_Led;
  e4:	25 9a       	sbi	0x04, 5	; 4
	// Set Port B Bit 5 as Input
	DDRB &= ~(BMSK_Switch);
  e6:	27 98       	cbi	0x04, 7	; 4
	
	uart_0_init();
  e8:	0e 94 be 00 	call	0x17c	; 0x17c <uart_0_init>

	sei();
  ec:	78 94       	sei

	AX_Led_On();
  ee:	0e 94 6e 00 	call	0xdc	; 0xdc <AX_Led_On>

As far as I can see all that is implementing exactly what you asked for. So I guess you are saying it's F2 onwards that is "superfluous"?

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

PeterHouse wrote:
Trust - how do you trust a compiler which will do this?

 

No problem there. As long as the effects of the executed code are matching "the intentions of the source code" then it is not breaching the C standard. Generating less-than-optimal code is not a breach of the C standard. E.g. the C standard does not have any requirements re timing.

 

If the situation is what you describe then that is a tick in the inefficient-code-is-generated column for avr-gcc. Bad but not wrong.) If you can pull together a minimal test program that shows this behavior then the avr-gcc developers would probably be interested in hearing about it. (But beware - they are, rightly IMO, rather picky about high quality of bug/deficiency reports.)

 

(If my what-I-get-paid-for Java code does what I want it to then I might be tempted to have a look tonight at the ZIP you've attached.)

As of January 15, 2018, Site fix-up work has begun! Now do your part and report any bugs or deficiencies here

No guarantees, but if we don't report problems they won't get much of  a chance to be fixed! Details/discussions at link given just above.

 

"Some questions have no answers."[C Baird] "There comes a point where the spoon-feeding has to stop and the independent thinking has to start." [C Lawson] "There are always ways to disagree, without being disagreeable."[E Weddington] "Words represent concepts. Use the wrong words, communicate the wrong concept." [J Morin] "Persistence only goes so far if you set yourself up for failure." [Kartman]

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

PeterHouse wrote:
The COMPLETE and CORRECT code as it should be generated begins at location 140 and ends at 17a.
Nonsense.

 

This is that code:

 140:	8e 2d       	mov	r24, r14
 142:	9f 2d       	mov	r25, r15
 144:	02 c0       	rjmp	.+4      	; 0x14a <main+0x66>

			SwLast = 1;
 146:	8d 2f       	mov	r24, r29
 148:	9c 2f       	mov	r25, r28
	
	while( 1 ) 
	{
	// Loop forever

		if((PORT_Switch & BMSK_Switch) EQ BMSK_Switch)
 14a:	1f 9b       	sbis	0x03, 7	; 3
 14c:	fc cf       	rjmp	.-8      	; 0x146 <main+0x62>
		{
			// Switch Off
			if(SwLast EQ 1)
 14e:	01 97       	sbiw	r24, 0x01	; 1
 150:	b9 f7       	brne	.-18     	; 0x140 <main+0x5c>
			{
				Msg[0] = 'T';
 152:	f8 01       	movw	r30, r16
 154:	40 82       	st	Z, r4
				Msg[1] = 0;
 156:	11 82       	std	Z+1, r1	; 0x01
				Msg[2] = 1;
 158:	52 82       	std	Z+2, r5	; 0x02
				Msg[3] = 2;
 15a:	63 82       	std	Z+3, r6	; 0x03
				Msg[4] = 3;
 15c:	74 82       	std	Z+4, r7	; 0x04
				Msg[5] = 4;
 15e:	85 82       	std	Z+5, r8	; 0x05
				Msg[6] = 5;
 160:	96 82       	std	Z+6, r9	; 0x06
				Msg[7] = 6;
 162:	a7 82       	std	Z+7, r10	; 0x07
				Msg[8] = 7;
 164:	b0 86       	std	Z+8, r11	; 0x08
				
				Msglen = 5;
 166:	d0 92 95 01 	sts	0x0195, r13	; 0x800195 <Msglen+0x1>
 16a:	c0 92 94 01 	sts	0x0194, r12	; 0x800194 <Msglen>
								
				if( SPkt_SendMsg() NE SPKT_SUCCESS )
 16e:	0e 94 72 01 	call	0x2e4	; 0x2e4 <SPkt_SendMsg>
 172:	89 2b       	or	r24, r25
 174:	29 f3       	breq	.-54     	; 0x140 <main+0x5c>
				{
					AX_Led_Off();
 176:	0e 94 70 00 	call	0xe0	; 0xe0 <AX_Led_Off>
 17a:	cc cf       	rjmp	.-104    	; 0x114 <main+0x30>

Complete? Then please tell me where the registers r4-r11, that are written to Msg[X], get their values in that code.

Stefan Ernst

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The only "odd" thing is the way the compiler has seen the need to save/restore R31 on almost every array element:

 11a:	0f 2e       	mov	r0, r31
 11c:	f3 e0       	ldi	r31, 0x03	; 3
 11e:	7f 2e       	mov	r7, r31
 120:	f0 2d       	mov	r31, r0
				Msg[5] = 4;
 122:	68 94       	set
 124:	88 24       	eor	r8, r8
 126:	82 f8       	bld	r8, 2
				Msg[6] = 5;
 128:	0f 2e       	mov	r0, r31
 12a:	f5 e0       	ldi	r31, 0x05	; 5
 12c:	9f 2e       	mov	r9, r31
 12e:	f0 2d       	mov	r31, r0
				Msg[7] = 6;
 130:	0f 2e       	mov	r0, r31
 132:	f6 e0       	ldi	r31, 0x06	; 6
 134:	af 2e       	mov	r10, r31
 136:	f0 2d       	mov	r31, r0

One might argue that it should either have just put an R31 save/restore across the entire block or even why not use r24 (or similar) that can be corrupted.

 

Personally I would have just had it memcpy_P() this data into place from a __flash array or something along those lines.

 

EDIT: in passing one might note the "cleverness" in using T when it's loading powers of 2:

 122:	68 94       	set
 124:	88 24       	eor	r8, r8
 126:	82 f8       	bld	r8, 2

Last Edited: Tue. May 16, 2017 - 02:29 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

JohanEkdahl, Sternst, and clawson,

 

You have humbled me and caught me in some misunderstanding and incomplete analysis of the issue.

 

Here is how the problem began:

 

This is a test program for testing a serial packetization library I am building for a project.

 

I decided to try the debugger with single step and met the following very odd behaviour:

 

Debug session:
  Set a breakpoint at Msg[0] = 'T';

  Click to start a debug session
  Program stops correctly
  Click Step Into (F11)
  Program Stops at Msg[2] = 1, skipping Msg[1] = 0
  Program stops with cursor on open brace of main()
  Click Step Into (F11)
  Program stops at assignment SwLast = 1;   near end of main()
  Click Step Into (F11)
  Program stops at Msglen = 5;   in middle of main()
  Click Step Into (F11)
  Program stops at Msg[3] = 2;
  Click Step Into (F11)
  Program stops at Msg[4] = 3;
  Click Step Into (F11)
  Program stops at Msg[5] = 4;
  Click Step Into (F11)
  Program stops at Msg[6] = 5;
  Click Step Into (F11)
  Program stops at Msg[7] = 6;
  Click Step Into (F11)
  Program stops at Msg[8] = 7;
  Click Step Into (F11)
  Program stops with cursor on open brace of main()
  Click Step Into (F11)
  Program stops at if((PORT_Switch & BMSK_Switch) EQ BMSK_Switch)
  Click Step Into (F11)
  Program stops with cursor on if(SwLast EQ 1)
  Click Step Into (F11)
  Program stops with cursor on open brace of main()

  Program debugs as expected from here on.

 

This lead me to look at the .lss file and as a LONG TIME and primarily assembly programmer, I did not like what I saw :)

 

I am looking further into the issue to determine if there really is an issue.  I would love any continued feedback on this and will be watching the forum.

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I wrote some words a while back...

https://www.avrfreaks.net/forum/t...

 

that may/may not be of interest. In that I talk about it splitting single statements into multiple blocks of code. For example the Msg[6] stuff consists of:

				Msg[6] = 5;
 128:	0f 2e       	mov	r0, r31
 12a:	f5 e0       	ldi	r31, 0x05	; 5
 12c:	9f 2e       	mov	r9, r31
 12e:	f0 2d       	mov	r31, r0

then much later:

				Msg[6] = 5;
 160:	96 82       	std	Z+6, r9	; 0x06

bit the optimiser is completely free to do this kind of thing if it chooses. In fact when you think about it, it's doing you a favour. The actual STD could occur multiple times (each time the input bit is set) and while I suppose the natural reaction for putting 5 into that location would have been:

                        ldi     r24, 5
 160:	96 82       	std	Z+6, r24	; 0x06

then the "repeated" code is actually doing a load each time that does not need to be in the loops.

 

However I still wonder about all the R31 save/restores and whether all that could have been handled better.

 

Of course a bit of the "blame" here actually rests in your court. Once set the contents of Msg[] (in this example anyway) are unchanging so why is it in the loop and conditional on:

		if((PORT_Switch & BMSK_Switch) EQ BMSK_Switch)
		{
			// Switch Off
			if(SwLast EQ 1)
			{

anyway? Surely the more efficient C would have been:

        int SwLast;

	// Set PortB Bit 7 as Output
	DDRB |= BMSK_Led;
	// Set Port B Bit 5 as Input
	DDRB &= ~(BMSK_Switch);

	uart_0_init();

	sei();

	AX_Led_On();

	Msg[0] = 'T';
	Msg[1] = 0;
	Msg[2] = 1;
	Msg[3] = 2;
	Msg[4] = 3;
	Msg[5] = 4;
	Msg[6] = 5;
	Msg[7] = 6;
	Msg[8] = 7;

	Msglen = 5;

	while( 1 )
	{
	// Loop forever

		if((PORT_Switch & BMSK_Switch) EQ BMSK_Switch)
		{
			// Switch Off
			if(SwLast EQ 1)
			{
				if( SPkt_SendMsg() NE SPKT_SUCCESS )
				{
					AX_Led_Off();
				}
			}
			SwLast = 0;
		}
		else
		{
			// Switch On
			SwLast = 1;

		}
	}

The compiler has kind of "half" implemented it that way anyway.

Last Edited: Tue. May 16, 2017 - 04:42 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Re the "odd behaviour" you saw during a single-stepping debug session this is a well known thing: If there, due to optimization,  is no 1-to-1 mapping between C source-code lines and blocks of generated machine code then it can very well be that stepping line-by-line in the C source will create something that seems "odd". We get posts about this every month or so, by folks who are either just confused or scream "Compiler bug!".

 

You are in luck, though. You are an experienced assembler programmer and can inspect the generated code and "see through" the disguise and map chunks of machine code to parts of C source.

 

Think of the beginners that come here and have C or C++ (or perhaps Java or C#), running under a desktop OS (Windoze, GNU/Linux, Mac) as their only experience. In those environments debugging on the source code level rarely displays these "odd" behaviors. They can be totally lost, and we have a hard time talking to them about the generated code. So ... you're actually lucky.

 

I'll leave the detailed analysis to e.g. Cliff. He's much more skilled at it, and a lot faster at it too! (He also seems to enjoy it - another sign of his mild lunacy ;-) , his post count of-course being the first.)

 

Oh, almost forgot.. Welcome to AVRfreaks! :-)

As of January 15, 2018, Site fix-up work has begun! Now do your part and report any bugs or deficiencies here

No guarantees, but if we don't report problems they won't get much of  a chance to be fixed! Details/discussions at link given just above.

 

"Some questions have no answers."[C Baird] "There comes a point where the spoon-feeding has to stop and the independent thinking has to start." [C Lawson] "There are always ways to disagree, without being disagreeable."[E Weddington] "Words represent concepts. Use the wrong words, communicate the wrong concept." [J Morin] "Persistence only goes so far if you set yourself up for failure." [Kartman]

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Are you sure that it's wrong?

 

Can you show the .lss with either -Os or -O2 I will guess that it will make it cleaner!

 

And I guess this "odd" code is because of some front end stuff that is good for ARM!

 

It looks like it build up the msg string in registers up front, and that might confuse you, but if has many registers it don't use so why not.

 

If if you want a better (less ARM) try the winAVR 2010 version. 

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

JohanEkdahl wrote:
Cliff... (He also seems to enjoy it - another sign of his mild lunacy)

"I learned long ago, never to wrestle with a pig. You get dirty, and besides, the pig likes it."

George Bernard Shaw

As the project is more than just the simple main() posted, I didn't try to rebuild it.  Indeed, first poking at the .LSS was a bit confusing in the loop code.  As I stated early on, I think if one would force the SwLast to e.g. PORTx or some other volatile value, then the test would be done first in the loop and look more conventional. Perhaps not.

 

sparrow2 wrote:
Can you show the .lss with either -Os or -O2 I will guess that it will make it cleaner!

It might be an interesting comparison, in any case.  Perhaps get rid of some of the register-whomping.  Again as a result of a very quick read, the compiler apparently decided that LDI to Rhigh and then move to Rlow was the best way to do the assignments.

 

lol -- takes me back to one of my past lives where I did some compiler writing myself.  Never to the level of a full-language compiler for a mainstream language; usually more on the line of industry-specific subsets and similar tools.  Anyway, hoisting of "loop-invariant code" is often done piecemeal and each piece might indeed be done on a heuristic basis and not necessarily sequential as us humans might expect.  So the result can indeed be seemingly random fragments in no apparent order.  Perfectly legal and proper as long as the result is the same -- in this case Msg[] contents, probably.  (It is hard to tell as the SendMsg takes no parameters)

 

 

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Thanks for all the input.  At this time I will STOP calling it a "compiler bug" or referring to it as "bad code".  I will say this compiler makes pretty good obfuscated assembler output !

 

The reason for some of the issues you guys pointed out is it is test code and very temporary.  I long ago stopped using debuggers and just wanted to play around with this one.  All I need is a way to compile and burn the chip and all of my code writing is basically Write, Burn, Test and then wash, rinse, and repeat.

 

As long as the code works and is reliable I will be OK.

 

There are so many CPU cycles available to burn, the issues with the redundant fiddling with r31 and r0 will probably have no impact and the amount of extra code generated by the calls and returns and all the pushing and popping during interrupts generates far more code than my actual interrupt routines.  If this begins to be a problem, I will consider writing the interrupt code in assembler and figuring how to set up a few registers which are unique to interrupts (any idea how to do this in c on this platform?).

 

Thank you for all your help, I will monitor this forum and try to pay it back.

 

Peter

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

PeterHouse wrote:
this compiler makes pretty good obfuscated assembler output !

That is in the very nature of optimising compilers!

 

Top Tips:

  1. How to properly post source code - see: https://www.avrfreaks.net/comment... - also how to properly include images/pictures
  2. "Garbage" characters on a serial terminal are (almost?) invariably due to wrong baud rate - see: https://learn.sparkfun.com/tutorials/serial-communication
  3. Wrong baud rate is usually due to not running at the speed you thought; check by blinking a LED to see if you get the speed you expected
  4. Difference between a crystal, and a crystal oscillatorhttps://www.avrfreaks.net/comment...
  5. When your question is resolved, mark the solution: https://www.avrfreaks.net/comment...
  6. Beginner's "Getting Started" tips: https://www.avrfreaks.net/comment...
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

PeterHouse wrote:
...he amount of extra code generated by the calls and returns and all the pushing and popping during interrupts generates far more code than my actual interrupt routines. If this begins to be a problem, I will consider writing the interrupt code in assembler ...

 

This is the first I remember seeing in this thread about ISRs.  That is a whole 'nother topic.

 

Indeed, I like CodeVision's "smart ISRs", and GCC usually can't match that.  That said, it will do a good job for you -- if you write your C code to give it half a chance.

 

Indeed, in the AVR8 C toolchains I'm familiar with, all compiler working registers will be saved and restored -- but ONLY if you call a function from the ISR.  Only logical, as when the ISR is invoked the state of registers in the mainline could be anything, and the called function expects to play by the rules of the code generation model.

 

But I very rarely will call a function from an ISR.  Why do it?  Are you really sharing that piece of code with mainline uses?  Generally not.

 

Now, if you want to put that ISR code into a function just for readability or whatever -- use inline or always inline [the gurus will need to give you the best incantation].  [you'll need to consult same gurus about reserving registers in that toolchain]

 

Anyway, if you give the compiler half a chance then GCC ISRs will be pretty tight.  If vanilla C code, I don't think you'll gain much if anything with ASM.

 

[edit] https://www.avrfreaks.net/forum/r... has a pretty good example of the above, in a C vs. ASM battle.  In #18, an ISR consisted of nothing except calling

PinChangeInterrupt();

...and then decrying

__vector_3:
.LM17:				;Pin change interrupt 
	push __zero_reg__
	push __tmp_reg__
	in __tmp_reg__,__SREG__
	push __tmp_reg__
	clr __zero_reg__
	push r18
	push r19
	push r20
	push r21
	push r22
	push r23
	push r24
	push r25
	push r26
	push r27
	push r30
	push r31
.LM18:
	call PinChangeInterrupt
	pop r31
	pop r30
	pop r27
	pop r26
	pop r25
	pop r24
	pop r23
	pop r22
	pop r21
	pop r20
	pop r19
	pop r18
	pop __tmp_reg__
	out __SREG__,__tmp_reg__
	pop __tmp_reg__
	pop __zero_reg__
	reti

But there is no reason to do that!  Where else is that "function" code going to be used?!?  Put the code in the ISR.  Or make it inline.

 

 

 

 

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Last Edited: Tue. May 16, 2017 - 08:55 PM