Atmega 1284P, unknown behaviour


Hello, I will try to explain this as well as I can. I am developing a project consisting of a few sensors and an HM-10 copy. After the code started growing, I encountered a problem. The code works fine when the board is first powered, but after about 4 minutes the behaviour changes. I will try to explain what I observed:

 

I start sending data from the sensor continuously. The program works without any problem for 3 to 4 minutes; after this time the MCU does not respond to any command sent through bluetooth and it stops sending data. The MCU does not restart, it only hangs.

 

What is interesting is that no interrupt works anymore (ex: an interrupt is generated when the phone is connected to the bluetooth module). The program is made as a state machine, and the MCU is still able to switch to the other state. For example, I have 2 states, ACTIVE and SLEEP; if no BLE connection is up for 1 minute, it enters sleep mode. After hanging, the MCU enters sleep mode, but no interrupt can wake it up anymore. Usually when it is working as soon as I connect the phone to the module, it exits sleep mode.


We're going to need to see code, schematics and preferably the smallest cut down code that still exhibits the same fault. In preparing the latter you will likely find the issue anyway!

 

If the code is modular and each module has unit tests then the points of failure should already be identified anyway.


nikel1992 wrote:
after this time the MCU does not respond to any command sent through bluetooth

Are you certain that the BT is still working?

 

What if you take BT out, and just use a wired connection?

 ex: an interrupt is generated when the phone is connected to the bluetooth module

 

Is it?

 

Again, how have you verified that it's not the BT module that's stopped generating interrupts?

 

The program is made as a state machine

So instrument the state machine to allow you to see what state it's in.

 

Usually when it is working as soon as I connect the phone to the module, it exits sleep mode.

But sometimes it doesn't?

 

You need to take a methodical approach to find where the problem occurs - this is the essence of debugging.

 

http://www.avrfreaks.net/comment...

 


awneil wrote:
take a methodical approach to find where the problem occurs

 

I am reading the incoming signal from the Bluetooth module with a serial converter. The Bluetooth module is connected to the MCU through UART. The RX of the serial converter is also connected to the RX pin of the MCU, and I receive the text sent from the phone.

ex: an interrupt is generated when the phone is connected to the bluetooth module

Yes, it is. Before the MCU hangs I can check that the interrupt is generated. Also, the Bluetooth module has an LED that shows the connection.

 

So instrument the state machine to allow you to see what state it's in. 

I am using an LED for this, and I can see when the state changes; it is able to do this.

Usually when it is working as soon as I connect the phone to the module, it exits sleep mode.

I said "usually" by mistake. Every time I connect the phone, before this error occurs, it works.

 

We're going to need to see code, schematics and preferably the smallest cut down code that still exhibits the same fault. In preparing the latter you will likely find the issue anyway!

 

 

I was thinking that this is the next step, but I am not able to post any code or schematics here. I am trying to find some clues about where to look. I am also thinking about memory corruption. Is there any way to check the memory state, or the memory usage, from time to time? I am thinking of badly behaved pointers or stack overflows, but I am not sure. I have lost a lot of time trying to identify the problem. Also, if anyone wants to see the code, we can use a remote control program.


two things come to mind, for projects that work for awhile then suddenly stop working.

Stack overflow, corrupting memory,  can you place "guard bytes" below the stack to track how deep it goes before the problem happens?

or improper sleep mode setup, i.e. no wake up source setup before going to sleep mode.

 

For the first one, can you substitute a larger family member part, e.g. tiny85 instead of tiny25, mega328 instead of mega48, etc., so you have more headroom to work with?

..... OK, scratch that, I see you're using a large family member already!

Jim

 

 

edit: family part correction...

Last Edited: Wed. Sep 27, 2017 - 03:20 PM

It may not be your problem but the 1284P is known for having problems when using an external crystal and running the oscillator not in full swing mode. Especially when using the USART.

'This forum helps those who help themselves.'

 

pragmatic  adjective dealing with things sensibly and realistically in a way that is based on practical rather than theoretical consideration.


ki0bk wrote:

Stack overflow, corrupting memory,  can you place "guard bytes" below the stack to track how deep it goes before the problem happens?

I don't know how to place such guard bytes. If you can tell me how, it would be perfect.

It may not be your problem but the 1284P is known for having problems when using an external crystal and running the oscillator not in full swing mode. Especially when using the USART.

I am using the oscillator in full swing mode. I have a crystal attached to XTAL1 and XTAL2.


For stack checking, one of the GCC freaks will need to guide you on how to do it with that toolchain.

But for an overview see, https://en.wikipedia.org/wiki/Bu...

 

Jim

 


ki0bk wrote:
Stack overflow, corrupting memory,  can you place "guard bytes" below the stack to track how deep it goes before the problem happens?

Atmel Studio

Stack Overflow Detection Using Data Breakpoint

http://www.atmel.com/webdoc/GUID-ECD8A826-B1DA-44FC-BE0B-5A53418A47BD/index.html?GUID-A4FC8DB5-6B28-4893-93BA-7A4406698E5D

 

"Dare to be naïve." - Buckminster Fuller


gchapman wrote:

ki0bk wrote:
Stack overflow, corrupting memory,  can you place "guard bytes" below the stack to track how deep it goes before the problem happens?

Atmel Studio

Stack Overflow Detection Using Data Breakpoint

http://www.atmel.com/webdoc/GUID-ECD8A826-B1DA-44FC-BE0B-5A53418A47BD/index.html?GUID-A4FC8DB5-6B28-4893-93BA-7A4406698E5D

 

 

I need to mention that I do not have access to debug. I am programming through ISP.


I have used a modification of the technique described here when developing code:

http://www.avrfreaks.net/forum/soft-c-avrgcc-monitoring-stack-usage

David (aka frog_jr)


nikel1992 wrote:
I need to mention that I do not have access to debug. I am programming through ISP.

 

Then test for a change in your main() loop and light an LED (or provide some indication) when detected.

 

Jim

 


I am doing the LED thing already, but as I said, I cannot get any specific clues from it.

 

frog_jr wrote:

I have used a modification of the technique described here when developing code:

http://www.avrfreaks.net/forum/soft-c-avrgcc-monitoring-stack-usage

 

I will try this right now.

 


Is the error mode reproducible?

 

i.e. Will it always crash after 3 - 4 minutes?

 

Will it crash if it just sits still for 4 minutes without the BT connection going active?

 

Does increasing the number of BT connection / disconnection episodes shorten the time to crash?

 

How do you in your programming language set the stack?

(i.e. in Basic I set three stack sizes in the program's header)

 

If the crash is reproducible, can you simply significantly increase your stack sizes and see if that changes the crash behavior?

 

JC


DocJC wrote:

Is the error mode reproducible?

 

i.e. Will it always crash after 3 - 4 minutes?

 

Will it crash if it just sits still for 4 minutes without the BT connection going active?

 

Does increasing the number of BT connection / disconnection episodes shorten the time to crash?

 

How do you in your programming language set the stack?

(i.e. in Basic I set three stack sizes in the program's header)

 

If the crash is reproducible, can you simply significantly increase your stack sizes and see if that changes the crash behavior?

 

JC

 

Just now, while I was testing, the device stopped responding again. It hangs in the data-sending state and does not respond to any command. What I tested today was whether the amount of data is what upsets the MCU. I tried to send more data over serial by decreasing the delays; my states had delays to slow down the data transmission. I removed all the delays, and the MCU hangs after the same amount of time as before.


frog_jr wrote:

I have used a modification of the technique described here when developing code:

http://www.avrfreaks.net/forum/soft-c-avrgcc-monitoring-stack-usage


 

I downloaded the files, included them in my project, and called only the function StackCount. I receive the value 0 no matter what I do, even if I call the function right after port initialization, before calling any other function. What am I doing wrong?


Did you add the StackPaint to your program (in the .init1 section)?

And if using gcc, you use the following (depending on what you want to measure):

extern uint8_t _etext;		// End of program (actually points to next not used location)
extern uint8_t __data_start;	// Start of initialized variables (bottom of SRAM)
extern uint8_t __data_end;	// End of initialized variables (== __bss_start)
extern uint8_t __bss_start;	// Start of uninitialized variables
extern uint8_t __bss_end;	// End of uninitialized variables (== __heap_start,  == _end)
extern uint8_t _end;			
extern uint8_t __heap_start;	// Start of dynamically allocated memory (malloc)
extern uint8_t __stack;		// Top of Stack == Top of SRAM
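
A minimal host-side sketch of the stack-painting idea, using a plain array in place of SRAM so that it can be run on a PC. The names `stack_paint` and `stack_unused` and the 0xC5 fill value are assumptions made for illustration; they are not the routines from the linked stackmon files. On an avr-gcc target the two bounds would presumably be `&_end` and `&__stack` from the list above, and the paint step would have to run before `main()` starts using the stack.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_CANARY 0xC5u  /* arbitrary fill pattern (assumption) */

/* Fill the free region [low, high] with the pattern. On target the bounds
 * would come from the linker symbols; here the caller passes an array. */
void stack_paint(uint8_t *low, uint8_t *high)
{
    for (uint8_t *p = low; p <= high; ++p) {
        *p = STACK_CANARY;
    }
}

/* Count pattern bytes starting from the low end. The AVR stack grows
 * downward from `high`, so bytes at the low end that still hold the
 * pattern were never touched. A result near 0 means the stack has
 * reached the variables below it. */
size_t stack_unused(const uint8_t *low, const uint8_t *high)
{
    size_t n = 0;
    for (const uint8_t *p = low; p <= high && *p == STACK_CANARY; ++p) {
        ++n;
    }
    return n;
}
```

If the count comes back as 0 immediately after start-up, one thing to check is whether the paint routine actually ran before the count was taken.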

 

 

David (aka frog_jr)


OK. I added the folders to my project, included them, and did the following:

 

#include "string.h"
#include <avr/wdt.h>
#include "stackmon.h"


void StackPaint(void) __attribute__ ((naked)) __attribute__ ((section (".init3")));

first part .

 

memorySpace=StackCount();
	printf("Valoare %d\r\n",memorySpace);

second one


Now the MCU is restarting continuously, after the same amount of time. I think I could time the restart period :))). It is as if a counter is doing this stupid thing. I really don't know how to solve this.


Sounds like the Watchdog ...


I am checking the MCUSR register. No watchdog reset flag is set there.


I am trying to implement the StackCount, but I have had no success so far. I don't know how to do it.


The MCU restarts at around 7 minutes. Once it restarted at exactly 7 minutes.


What does mcusr say >>is<< the restart cause? It may be time to go get yourself a debugger. In lieu of that, you could insert some code into .init0 which could test mcusr and send captured state to the usart in the event it traps a restart with no cause.
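
A minimal sketch of the decoding half of that suggestion, written as plain C so it can be tried on a host. It assumes the start-up code has saved a copy of MCUSR and then cleared the register, so that flags from an earlier reset are not reported again. The bit positions follow the ATmega1284P datasheet; the `FLAG_*` names, the enum, and `decode_reset_cause` are hypothetical names chosen for this example.

```c
#include <stdint.h>

/* MCUSR flag bit positions as listed in the ATmega1284P datasheet. */
#define FLAG_PORF  (1u << 0)  /* power-on reset */
#define FLAG_EXTRF (1u << 1)  /* external reset (RESET pin) */
#define FLAG_BORF  (1u << 2)  /* brown-out reset */
#define FLAG_WDRF  (1u << 3)  /* watchdog reset */
#define FLAG_JTRF  (1u << 4)  /* JTAG reset */

enum reset_cause {
    RESET_NONE,       /* no hardware reset flag was set */
    RESET_POWER_ON,
    RESET_EXTERNAL,
    RESET_BROWN_OUT,
    RESET_WATCHDOG,
    RESET_JTAG
};

/* `mcusr` is the saved copy of the register, not the live register.
 * Power-on is tested first because other flags may also be set then. */
enum reset_cause decode_reset_cause(uint8_t mcusr)
{
    if (mcusr & FLAG_PORF)  return RESET_POWER_ON;
    if (mcusr & FLAG_WDRF)  return RESET_WATCHDOG;
    if (mcusr & FLAG_BORF)  return RESET_BROWN_OUT;
    if (mcusr & FLAG_EXTRF) return RESET_EXTERNAL;
    if (mcusr & FLAG_JTRF)  return RESET_JTAG;
    return RESET_NONE;
}
```

If the program restarts and the result is `RESET_NONE`, one possibility is that execution reached the reset vector without a hardware reset, for example through a stray jump or return.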

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"Read a lot.  Write a lot."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 


joeymorin wrote:
time to go get yourself a debugger

+999999999999999999999999999999999999999999999999

 

Time for that quote again:

 

js wrote:
[not having a debugger is] like a mechanic not having any spanners.

 

See: http://www.avrfreaks.net/comment...

 

And: http://www.avrfreaks.net/comment...


joeymorin wrote:
send captured state to the usart in the event it traps a restart
  It does not show any restart source. The problem is that i designed the whole program for not having debugger. On the pins that a debugger is connected i have other things.


The problem is that i designed the whole program for not having debugger. On the pins that a debugger is connected i have other things.

Seems like you've painted yourself into a corner.

 

Looks like you'll have to:

insert some code into .init0 which could test mcusr and send captured state to the usart in the event it traps a restart with no cause.

... or some other interface in lieu of the usart.

 

Ask if you need help with that.

nikel1992 wrote:
On the pins that a debugger is connected i have other things.

So not only do you have no spanners yourself, but you've also ensured that spanners cannot be used at all!

 


awneil wrote:
So not only do you have no spanners yourself, but you've also ensured that spanners cannot be used at all!

 

Kinda! I made some changes today (the entire day, to be exact) and connected the JTAG (Dragon). Now the problem is that I don't know how to look for the problem, or which memory to check. On the memory tab of the debugger I have:

 

progFlash

prog BOOT_SECTION_1

prog BOOT_SECTION_2

prog BOOT_SECTION_3

prog BOOT_SECTION_4

data registers

data MAPPED_IO

data EEPROM

data IRAM

osccal osccal

 

What should I do to track the problem? Is there any chance that one of you could connect with me on Skype to help me a little bit more? I want to find this error quickly :(


nikel1992 wrote:
What should I do to track the problem.
If you're fortunate a lint would identify the actualized defect.

Otherwise, sprinkle the source code with asserts.

If you have data or information or intuition that there's a probable/likely stack overflow then the assert will test for SP in the stack guard region.

Other asserts can examine for buffer overflows, divide by zero, range constraints, etc (overflow results in an infinite loop that's terminated by the watchdog)

The problem with AVR libc's assert macro is that it calls abort(); when SP is out of bounds, abort() behaves inconsistently, so exit() may not be invoked (set a breakpoint in exit(), then use the debugger to read the data).

Could create an application-specific assert macro to do what you want like write a byte in .noinit that's specific to that instance of assert or write that byte to a spare port to be captured by a logic analyzer.
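
A minimal sketch of such an application-specific assert, assuming a single byte is enough to identify the failing site. `APP_ASSERT`, `app_assert_id`, `msg_put`, and the ID value 0x11 are hypothetical names and values for illustration. On an AVR target the variable would be placed in `.noinit` so that it survives a restart, and it would then need to be cleared explicitly after a power-on reset; both are left out of this host-runnable version.

```c
#include <stdint.h>

/* ID of the first failed check; 0 means none has failed. On target this
 * would carry a .noinit section attribute (omitted in this host sketch). */
volatile uint8_t app_assert_id;

/* Record the ID of the first failing check instead of calling abort().
 * A real version might also write the ID to a spare port. */
#define APP_ASSERT(cond, id)                      \
    do {                                          \
        if (!(cond) && app_assert_id == 0) {      \
            app_assert_id = (uint8_t)(id);        \
        }                                         \
    } while (0)

#define MSG_LEN 8
static uint8_t msg[MSG_LEN];

/* Example site: a range-checked write into a fixed-size buffer. */
void msg_put(uint8_t index, uint8_t value)
{
    APP_ASSERT(index < MSG_LEN, 0x11);  /* 0x11: arbitrary ID for this site */
    if (index < MSG_LEN) {
        msg[index] = value;
    }
}
```

Keeping only the first failing ID is a design choice: later failures are often consequences of the first one.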

 

An alternative is a circular buffer in .noinit that contains "breadcrumbs"; this is akin to the in-RAM instruction trace buffer of AVR32 UC3 and ARM Cortex-M.

Another is, if there's a spare SPI or a spare port, to write a unique byte for a source code line; this is the instrumented trace of MPLAB X with REAL ICE's logic analyzer.

 


Automatically Debugging Firmware, by Jack Ganssle (The Ganssle Group; major rewrite: May 2014)

http://www.ganssle.com/dbc.htm

http://www.nongnu.org/avr-libc/user-manual/group__avr__assert.html

http://www.nongnu.org/avr-libc/user-manual/mem_sections.html

While I was trying to debug, I discovered something. I was in debug mode, paused. Even though I was paused at an instruction, the MCU restarted itself and the code was re-executed.

 

 

Last Edited: Sun. Oct 1, 2017 - 03:41 PM

nikel1992 wrote:
Even though I was paused at an instruction, the MCU restarted itself and the code was re-executed.
IIRC, there's a debugger configuration during break to disable ISRs and/or timers.

Otherwise, an oscillating Vdd could trigger BOD or the reset signal is inadvertently active.

This is another photo of the stack pointer, and it seems to keep its value. How can I see what is writing to that region?


To me it does not look so good. I think I know the function that writes to that part of memory.


I've not read the whole thread but if you have rogue writes why can't you use a data breakpoint to catch the culprit? 


It has been a long time since I wrote here. I still have not been able to find the problem. Since I cannot post the real code here, I have renamed the functions so that I can post it, and maybe someone will see the leak that I am not able to see.

 


case STATE1:{
static int state1Counter=0;
led_color(STATE1);
// state1Counter > a*100: for a = 10 we have approx. 1 minute until it enters lock and sleep

if (atoi(g_data.var1) >5){
state1Counter=0;
}
if (Connexion==1){
state1Counter=0;

}
if( function1( function1_var1, function1_var2, function1_var3)){
states=STATE4;
accidentState=1;
led_color(STATE4);
}
if(flagFunction==1){
GP_function1(&g_data,nmea_msg,"RMC",msg2);
GP_function2(&g_data);
clear_vector_dimension((unsigned char*)nmea_msg,300);
flagFunction=0;
}
//sms read
if (messageReceivedUart1 == 1){
function2();
function3( &states,states,g_data,p_numbers.uNumber,index_sms);
messageReceivedUart1=0;
clear_rx_buffer(msg1,&msg1_contor,&messageReceivedUart1);
clear_vector_dimension((unsigned char*)msg1,200);
SMSreceived=0;
}
// read data from BLE
if (uart_test()>0){
//bt_string_correction(&BTJustConnected);
get_line(msg0,Line);

if (get_cmd_type1(msg0)==2){
if (function4(&states,get_cmd_subtype(msg0))==1) continue;
if (function5(&states,get_cmd_subtype(msg0))==1) continue;
if (function6(&states,get_cmd_subtype(msg0))==1) continue;
if (function7(states,get_cmd_subtype(msg0))==1) continue;
if (function8(&states,get_cmd_subtype(msg0))) continue;
if (function9(get_cmd_subtype(msg0))) continue;
if (function10(get_cmd_subtype(msg0))) continue;
if (function11(get_cmd_subtype(msg0))) continue;
if (function12(get_cmd_subtype(msg0))) continue;

}
// EEPROM data
}
clear_rx_buffer(msg0,&msg0_contor,&messageReceivedUart0);
clear_vector_dimension(msg0,200);
}
if(peripheral_function1(&qw, &qx,&qy,&qz,&vx,&vy,&vz)) {
peripheral_function2(qw, qx, qy, qz, &function1_var1, &function1_var2, &function1_var3);
peripheral_function3(&vxg,&vyg,&vzg,qw,qx,qy,qz);
peripheral_function4(&XTh,&YTh,&ZTh,vxg,vyg,vzg,vx,vy,vz);
function1_var1=function1_var1-function1_var1Offset;
function1_var2=function1_var2-function1_var2Offset;
}
//led[0].r=0;led[0].g=0;led[0].b=100;
//ws2812_setleds(led,2);
//_delay_ms(400);
state1Counter++;
if(state1Counter>20*100){
//PORTA&= ~(1<<7);
//sendSMS_input_text(p_numbers.uNumber,"Your motorcycle will automatically lock","OK",50,msg1,&messageReceivedUart1,&msg1_contor);
//PORTA|=(1<<7);
led_color(STATE2);
state1Counter=0;
states=STATE2;
}
_delay_ms(20);
break;
}





-------------------------------------------------------------


case STATE3:{
led_color(STATE3);
static int i=0;
char s[11];

if (Connexion==0){
states=STATE1;
led[0].r=0;led[0].g=255;led[0].b=0;
ws2812_setleds(led,2);

}
if( function1( function1_var1, function1_var2, function1_var3)){
states=STATE4;
accidentState=1;
led[0].r=255;led[0].g=0;led[0].b=0;
ws2812_setleds(led,2);
}
if(flagFunction==1){
//printf("Nmea mesaj este %s\r\n",nmea_msg);
GP_function1(&g_data,nmea_msg,"RMC",msg2);
////////printf("valoare latitudinii%s\r\n",gps_data.var2);
GP_function2(&g_data);
clear_vector_dimension(nmea_msg,300);
flagFunction=0;
}
if (i>=STATE3_FREQ_500ms){
//memorySpace=StackCount();
//printf("Valoare %d\r\n",memorySpace);
if(peripheral_function1(&qw, &qx,&qy,&qz,&vx,&vy,&vz) == 1) {
peripheral_function3(&vxg,&vyg,&vzg,qw,qx,qy,qz);
peripheral_function4(&XTh,&YTh,&ZTh,vxg,vyg,vzg,vx,vy,vz);
function1_var1=function1_var1-function1_var1Offset;
function1_var2=function1_var2-function1_var2Offset;
}
clear_tlmdata(&datetel);
_delay_ms(1);
strcpy(datetel.var2,g_data.var2);
strcpy(datetel.var3,g_data.var3);
strcpy(datetel.var1,g_data.var1);
sprintf(s,"%f",FxTh);
strcpy(datetel.var4,s);
memcpy(s,0,strlen(s));
sprintf(s,"%f",(function1_var1));
strcpy(datetel.function1_var1,s);
memcpy(s,0,strlen(s));
sprintf(s,"%f",(function1_var2));
strcpy(datetel.function1_var2,s);
memcpy(s,0,strlen(s));

send_command_4(datetel);
i=0;
}

if (uart_test()>0){
get_line(msg0,Line);
_delay_ms(10);
if (get_cmd_type(msg0)==2){
//bt_telemetric_data_off(&states,get_cmd_subtype(msg0));
//function5(&states,get_cmd_subtype(msg0));
//function6(&states,get_cmd_subtype(msg0));
if (function7(states,get_cmd_subtype(msg0))==1) continue;
if (function5(&states,get_cmd_subtype(msg0))==1) continue;
if (function6(&states,get_cmd_subtype(msg0))==1) continue;
}

clear_rx_buffer(msg0,&msg0_contor,&messageReceivedUart0);
clear_vector_dimension(msg0,200);
}
_delay_ms(10);
i++;
break;
}
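
Two details in the STATE3 listing may be worth checking, whether or not they turn out to be the cause here. `sprintf(s,"%f",...)` writes into `char s[11]`, yet a value such as -123.456789 needs 12 bytes including the terminator, and `memcpy(s,0,strlen(s))` copies from address 0 rather than filling `s` with zeros. The sketch below shows one bounded alternative; `format_value` is a hypothetical helper, not a function from the original project. On avr-libc, `%f` additionally requires linking the floating-point variant of printf.

```c
#include <stdio.h>
#include <string.h>

/* Format `value` into `out` without ever writing past `out_size` bytes.
 * Returns 1 if the whole text fitted, 0 if it was truncated or failed. */
int format_value(char *out, size_t out_size, double value)
{
    /* Zero-fill the buffer; this is what the memcpy(s, 0, ...) calls in
     * the listing appear to be meant to do. */
    memset(out, 0, out_size);

    /* snprintf stops at out_size - 1 characters and always terminates. */
    int n = snprintf(out, out_size, "%f", value);

    return n >= 0 && (size_t)n < out_size;
}
```

A caller that gets 0 back knows the value did not fit and can decide what to send instead of silently overwriting whatever follows the buffer.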

 

I will try to explain what happens. In the second state the MCU works without any problem; it runs for a long time without any restart. When I keep the MCU in the second state, with the Connexion flag TRUE so that it does not change state, it restarts by itself after three and a half minutes. I added code that reports the cause of every restart. Apparently none of the reset causes that can be tracked with MCUSR is set. The question is: why? :( I hope the code is understandable. Normally I would have posted the original code, but I am not able to do this. Thank you very much.

Last Edited: Mon. Nov 20, 2017 - 09:30 PM

Your code is basically unreadable. Write simple code and your bugs tend to be simple as well. Also check every bit of data that comes from the outside - garbage in, garbage out. At a guess, what you've described is probably buffer overrun or stack overflow. Buffer overruns can be avoided with defensive programming. Stack overflow can be a little trickier. If you're using function pointers, a simple buffer overrun can cause very bizarre symptoms. This can be a compelling reason to avoid function pointers. In fact, MISRA outlawed them in their early revisions, but they've now relaxed that.

The secret of debugging is to make the invisible, visible. If you can't see the problem, then you've got no way of fixing it. Some of these hints have been given before:

1. Scale your code back to the minimum that exhibits the problem. It's the needle-in-a-haystack problem: minimise the haystack.

2. Review your code for points of danger - as I've mentioned, function pointers and buffer overrun.

3. Flashing an LED only gets you so far. Still, you can extract a lot of info using this technique. Implement different flash sequences.

4. Output diagnostics via the serial port. In order to minimise the impact, send single chars.

5. Put 'sentinels' in memory - these are specific values in specific addresses. If you have a buffer, size it so that it should never get full (!) and put a known value at the end. Write code to frequently check for that value. If it's not there, stop and flash an error code. In an embedded system, you usually know what values the function pointer should have - test for these - especially before calling the function!
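
The sentinel idea in point 5 can be sketched as follows. This is a minimal host-runnable illustration, assuming a 200-byte receive buffer and an arbitrary sentinel value of 0xA5; `guarded_rx`, `rx_init`, `rx_intact`, and `faulty_fill` are hypothetical names. The deliberately faulty routine writes through a byte pointer to the whole struct so that the simulated overrun is well defined on a host.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_SIZE  200u
#define SENTINEL 0xA5u  /* arbitrary known value (assumption) */

/* The sentinel sits directly after the data area, so a write that runs
 * off the end of `data` hits it first. */
struct guarded_rx {
    uint8_t data[RX_SIZE];
    uint8_t sentinel;
};

void rx_init(struct guarded_rx *rx)
{
    rx->sentinel = SENTINEL;
}

/* Call this frequently, e.g. once per main-loop pass. On failure a real
 * system would stop and flash an error code. */
int rx_intact(const struct guarded_rx *rx)
{
    return rx->sentinel == SENTINEL;
}

/* Simulates a copy routine that ignores the buffer size. */
void faulty_fill(struct guarded_rx *rx, size_t count)
{
    uint8_t *raw = (uint8_t *)rx;
    for (size_t i = 0; i < count && i < sizeof *rx; ++i) {
        raw[i] = 0;
    }
}
```

A check like `rx_intact()` only reports that an overrun happened since the last check, so calling it after each suspect operation narrows down which one did it.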

 

You start with a question (are my buffers overflowing?), then test for it. If you prove your buffers aren't overflowing, then that's one less problem. Rinse and repeat. With some careful thought and strategy, you can hopefully narrow down the problem field and eventually nail the cause. Invariably, it is due to a code defect or something happening that you didn't expect. "It can never happen!" - how many times have I heard this! Even declared it myself. One in a million happens a few times a second with microcontrollers. With experience, one learns how to avoid the simple bugs, but this leaves the more complicated ones. Consider this: when you write your code, ask yourself "how will I test this?"

 

Also remember the analogy of a train wreck. You know where the wreck ended, but the cause is up the track.

Last Edited: Tue. Nov 21, 2017 - 02:53 AM

Kartman wrote:
 careful thought and strategy

Yes - they are the key requirements to effective debugging!

 

Here are some more tips:

 

http://www.8052.com/faqs/120313

 

http://www.ganssle.com/articles/developingagoodbedsidemanner.htm

 

 

EDIT

 

I note that "careful thought and strategy" has already been mentioned - see #3

Last Edited: Tue. Nov 21, 2017 - 09:14 AM

Kartman wrote:
The secret of debugging is to make the invisible, visible. 

Indeed.

 

And a debugger is a key tool for doing that - see #25

 

Not having a debugger is like tying at least one of your hands behind your back when trying to debug.


Actually, I tried to debug with my Dragon. But it always restarts and I don't know how to catch it.


Switch to disassembly view and put a breakpoint on location 0x0000


I will do this as soon as I get home. And after that, how can I track it backwards?


You mean find out how it got to the reset? Not easy in AVR. In more advanced micros you get things like "MTB = Micro Trace Buffer" where the micro keeps a history of execution flow so you can see where it came from to get to a break. In AVRs all you can do is take a punt on what might be causing it (quite possibly an interrupt vector that has not been implemented) and put breaks on points where you think it might be coming from and hope you guessed right. But if it's an atrocious bug like:

void foo(void) {
    uint8_t buff[8];
    for(uint8_t i=0; i < 10; i++) {
        buff[i] = 0;
    }
}

then, when CALLed, that function has a strong chance of returning to 0x0000 (can you see why?), and if it's something like this it can be a nightmare to catch how it got to 0. You could break at 0, look at SP, and see if the most recently popped value was 0x0000. If so, it might have been a RET to 0.
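
The effect can be imitated on a host with a simplified model of the stack frame. This is only a sketch under stated assumptions: it places the return address directly above the local array and makes it 2 bytes wide, whereas a real frame depends on the compiler and may hold saved registers in between. `frame_model`, `run_buggy_loop`, and `return_address` are hypothetical names.

```c
#include <stdint.h>

/* Simplified model of foo()'s frame: the local array sits just below the
 * saved return address (assumption; real layouts vary). */
struct frame_model {
    uint8_t buff[8];
    uint8_t ret_addr[2];
};

/* Same loop bounds as foo(), run against the model through a byte
 * pointer so the out-of-range indices are well defined on a host. */
void run_buggy_loop(struct frame_model *f)
{
    uint8_t *raw = (uint8_t *)f;
    for (uint8_t i = 0; i < 10; i++) {
        raw[i] = 0;  /* i = 8 and i = 9 land on ret_addr */
    }
}

/* Combine the two stored bytes into the address a RET would jump to. */
uint16_t return_address(const struct frame_model *f)
{
    return (uint16_t)((f->ret_addr[0] << 8) | f->ret_addr[1]);
}
```

In the model, the two extra writes zero the stored return address, which is why the RET at the end of such a function can end up at 0x0000.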


clawson wrote:
In more advanced micros you get things like "MTB = Micro Trace Buffer" where the micro keeps a history of execution flow so you can see where it came from 

If you have some spare RAM, you can simulate that by writing some "trace" values and then examining in the debugger.

 

Probably best to arrange it as a circular buffer.
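
A minimal sketch of such a circular trace buffer, assuming one byte per marker and a depth of 16 entries. `trace_mark` and `trace_recent` are hypothetical names. On target the two variables could be placed in `.noinit` so that they survive a restart and can be examined in the debugger afterwards; that attribute is omitted here so the code runs on a host.

```c
#include <stdint.h>

#define TRACE_DEPTH 16u  /* power of two, so the index can be masked */

static volatile uint8_t trace_buf[TRACE_DEPTH];
static volatile uint8_t trace_head;  /* next slot to write */

/* Record one marker; the oldest entry is overwritten once the buffer
 * has wrapped. */
void trace_mark(uint8_t id)
{
    trace_buf[trace_head] = id;
    trace_head = (uint8_t)((trace_head + 1u) & (TRACE_DEPTH - 1u));
}

/* Return the marker written `age` calls ago (0 = most recent). */
uint8_t trace_recent(uint8_t age)
{
    uint8_t idx = (uint8_t)((trace_head - 1u - age) & (TRACE_DEPTH - 1u));
    return trace_buf[idx];
}
```

Calling `trace_mark()` with a distinct value at the entry of each function of interest leaves the last 16 markers in RAM, which gives a short history of where execution went before a break at address 0.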


Is it possible that the optimization level causes these problems? Right now I use -O1. I will also try the others. I hope I will find this error. The main idea is that I have two states with small differences between them. From my point of view, those differences should not cause any trouble.


If you catch the jump to 0, you can have a look at the SP and maybe get the return address from the stack.

If not, you can use 'breadcrumbs' to hopefully narrow down the location. These are, again, specific values in specific locations. Basically, when you enter a function, set the breadcrumb variable to a given value. When you enter a different function, store a different value. This should allow you to narrow down the specific part of code where things go wrong. You can then put a breakpoint in that area and step through the code. Hopefully the problem becomes evident. If it's a function that gets called from many places, this complicates things a little. You may need to put in more breadcrumbs to narrow down to the line of code. You might even need to add some extra code to further narrow down the conditions of the failure. Generally, once you've determined the general area where the error occurs, you can evaluate the code and identify where it could possibly go wrong.

 

Optimisation might change the observed problem, but the root cause remains.


So, while digging for the error every day, I found something. Yesterday, while trying to replace parts of the code, I found that the MCU restarts if I don't call a certain function. I started digging into that function, and it seems that if I declare a char array[100]; in the function and I call it, my MCU does not reboot. I presume it is a bad pointer which leaks and this memory allocation stops the leakage. If anyone has other ideas, I will listen to them carefully.


nikel1992 wrote:
 if I declare a char array[100]; in the function 

That's quite a lot of data to be putting on the stack!

 


nikel1992 wrote:

...a bad pointer which leaks and this memory allocation stops the leakage...

 

Or moves it somewhere which isn't so important (yet).
