Proper multiple USART usage - ATmega2560

Total votes: 0

Due to lack of space (many additional functions), I'm porting the C/C++ source of an old GPS project from the ATmega328P to the ATmega2560. The old 328P prototype was stable (4800 baud software serial from the GPS module and 115200 baud hardware UART from the device to the PC).

Now the ATmega2560 version has become unstable, resetting at irregular intervals, possibly because of one (or both) of these facts:

1. The new code uses only hardware USARTs

2. The entire code is now > 64 KB

Arduino software 1.8.5 is used, with a 16 MHz crystal and a stable 5 V battery supply.

The main functionality is to receive NMEA data from the GPS module in an interrupt, place it in a circular buffer, then resend it to the PC from the main loop. The GPS baud rate is configurable from the default 4800 up to 115200; the PC link is also configurable and defaults to 115200.

To narrow down the cause of the reset, I can imagine a situation where one USART starts receiving data from the GPS module while the main loop is sending data to the PC through another. In that case the receive interrupt may fire while the other USART is transmitting, causing a reset.

Is this scenario actually possible?

What would be a safe way to use multiple hardware USARTs, including proper baud rates?

Regarding program memory > 64 KB: the whole firmware is now 130 KB. The reset or corruption may actually be caused by some bug in GCC or in the Arduino libraries used (for instance, PROGMEM data read without the far extension, or functions in different 64 KB blocks not being called properly); however, that is another issue, irrelevant to this topic...

This topic has a solution.
Last Edited: Sat. Dec 23, 2017 - 02:17 PM
Total votes: 1

Define what you mean by "reset" (the contents of MCUCSR will be particularly helpful for this - Don't forget to reset it after each read).

Total votes: 0

mcp601 wrote:
I can imagine a situation where one USART starts receiving data from the GPS module while the main loop is sending data to the PC through another. In that case the receive interrupt may fire while the other USART is transmitting, causing a reset.

 

Is this scenario actually possible?

Yes, it's entirely possible to have both UARTs active simultaneously - but there is no reason why this, in itself, should cause a reset. 

 

If a reset happens, there is a fault in your code.

 

Obvious things to check would be stack overflow & buffer overrun.

 

Is your watchdog enabled? 

 

may be caused by some bug in GCC

Unlikely

 

http://www.catb.org/esr/faqs/sma...

 

 

 

 

Top Tips:

  1. How to properly post source code - see: https://www.avrfreaks.net/comment... - also how to properly include images/pictures
  2. "Garbage" characters on a serial terminal are (almost?) invariably due to wrong baud rate - see: https://learn.sparkfun.com/tutorials/serial-communication
  3. Wrong baud rate is usually due to not running at the speed you thought; check by blinking a LED to see if you get the speed you expected
  4. Difference between a crystal, and a crystal oscillator: https://www.avrfreaks.net/comment...
  5. When your question is resolved, mark the solution: https://www.avrfreaks.net/comment...
  6. Beginner's "Getting Started" tips: https://www.avrfreaks.net/comment...
Total votes: 0

As it's GCC I should have also mentioned the ISR "catch all". The default action of __bad_interrupt is a JMP 0 (so you've probably enabled some xxIE bit with no matching ISR).

Last Edited: Fri. Dec 22, 2017 - 11:50 AM
Total votes: 0

It means program execution starts again from the beginning. A standard software reset, I presume. If I disable sending data (Serial.write in the main loop), it seems to become more stable...

I have no experience so far with using more than one USART, particularly on the ATmega2560, and this question is explicitly about the functionality of the hardware.

Last Edited: Fri. Dec 22, 2017 - 12:07 PM
Total votes: 0

 

Please stop with the sarcastic remarks; I simply do not need them.

If you are not capable of on-topic academic discussion, please do not reply.

Total votes: 0

Just pointing out that the likelihood of it being due to a GCC bug is very small - which is on-topic, as you mentioned it in your OP.

 

Yes, of course it is still possible - but not very likely.

Total votes: 0

mcp601 wrote:
this question is explicitly regarding functionality of the hardware

As far as the hardware is concerned, there is no inherent reason why using two - or more - UARTs simultaneously should cause any problems.

 

 

Total votes: 0

mcp601 wrote:
If I disable sending data (Serial.write in the main loop), it seems to become more stable...

Does that mean it does still crash - just less often?

 

Total votes: 0

Are you aware that 16 MHz and 115k2 do not go well together?

I read in your first entry "New code use only hardware USARTs". Could it be that the old code used software UARTs, which gave less deviation in baud rate?

Patrick

 

Total votes: 0

but that would just cause baud rate error - not software crashing?

 

OP says 115200 is just from AVR to PC - so shouldn't be any problems with receiving corrupt data at the AVR?

Total votes: 0

Paddy wrote:

Are you aware that 16 MHz and 115k2 do not go well together?

I read in your first entry "New code use only hardware USARTs". Could it be that the old code used software UARTs, which gave less deviation in baud rate?

 

Yes, I'm aware of the small error at 16 MHz and 115200 baud. However, that is irrelevant to the reset problem.

Also, NMEA data carries a checksum, so each sentence is checked and processed before it is sent to the PC, and sent only if correct; on the PC, my own GPS visualizer rejects any sentence that fails the check. Even so, I never saw a single rejected line in the NMEA log for the old 328P-based device.
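
For readers unfamiliar with the check mentioned here: an NMEA sentence ends in `*HH`, where `HH` is the hexadecimal XOR of every character between `$` and `*`. The following is only a minimal sketch in plain C, not the project's actual code; the names `hex_value` and `nmea_checksum_ok` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper: true if `s` has the form "$<payload>*HH" and HH
 * equals the XOR of all payload characters. It never reads past the
 * string terminator, even for truncated input. */
bool nmea_checksum_ok(const char *s)
{
    if (s == NULL || s[0] != '$') return false;

    uint8_t sum = 0;
    size_t i = 1;
    while (s[i] != '\0' && s[i] != '*') {
        sum ^= (uint8_t)s[i];
        i++;
    }
    if (s[i] != '*') return false;      /* no checksum field present */

    int hi = hex_value(s[i + 1]);
    if (hi < 0) return false;           /* also catches an early '\0' */
    int lo = hex_value(s[i + 2]);
    if (lo < 0) return false;

    return sum == (uint8_t)((hi << 4) | lo);
}
```

A sentence that fails this check would simply be dropped rather than parsed, so corrupt bytes never reach code that indexes tables or buffers.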

 

Last Edited: Fri. Dec 22, 2017 - 01:18 PM
Total votes: 0

awneil wrote:

mcp601 wrote:
If I disable sending data (Serial.write in the main loop), it seems to become more stable...

Does that mean it does still crash - just less often?

No; I cannot be certain without at least a 24 h test run. So far it has run for several hours without any reset; earlier, it reset after several minutes. Same firmware and conditions; only sending data to the PC was disabled, via a button on the device.

Perhaps a simple, clean test case with two UARTs in a similar configuration will point towards what is causing the problem. Since the old 328P prototype worked flawlessly for almost 3 years, it is quite unlikely that the clean C/C++ code I wrote is causing the problem.

Last Edited: Fri. Dec 22, 2017 - 01:20 PM
Total votes: 0

mcp601 wrote:
Perhaps a simple, clean test case with two UARTs in a similar configuration will point towards what is causing the problem

+1

 

Since the old 328P prototype worked flawlessly for almost 3 years, it is quite unlikely that the clean C/C++ code I wrote is causing the problem.

Not sure I follow that?

 

If I understand correctly:

 

  1. You had working code on the 328, using only soft UARTs;
     
  2. You have changed processor to 2560,
    and you are now using hardware UARTs,
    and you have added some new code.

 

That's rather a lot of changes all at once!

 

I would go back to the 328 code, and just change one thing at a time; eg,

 

  1. Just port the code to 2560 - still using soft UARTs.
  2. Change to one hardware UART - keep one soft UART
  3. Change to both hardware UARTs
  4. Add extra code.

 

At each step, test thoroughly for crashing.

 

EDIT

 

be sure to keep copies at each step - a software version control system is ideal here, but simply copying the entire project folder tree will do ...

Last Edited: Fri. Dec 22, 2017 - 01:28 PM
Total votes: 0

I fear my previous points may have been too subtle. Before you can determine the cause here you need to know what kind of "reset" you have. By "reset" I take it you mean code execution has returned to 0. But you need to find out why PC now contains 0x0000. The MCU Status Register is there to help. If it's a power on reset (maybe Vcc dropped?) there's a flag to tell you. If it was the watchdog there's a flag for that. If no such flags are set then it's a soft reset. Something deliberately / accidentally put 0x0000 in PC. That could be a call through a NULL function pointer or maybe writes to stack frame autos over-ran and a function return address was corrupted to 0x0000 but I'm guessing this is the well-known GCC "catch all" operating. Hence my point about xxIE flags.
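
The flag check described above can be sketched as follows. This is a hedged illustration rather than anyone's actual code: the bit positions are those of the ATmega2560's MCU Status Register (MCUSR), while the `RST_*` names and the `reset_cause` helper are hypothetical. It assumes the value passed in is a copy of the register taken early in startup, before anything else has cleared it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* MCUSR bit positions on the ATmega2560. */
#define RST_PORF   (1u << 0)   /* power-on reset  */
#define RST_EXTRF  (1u << 1)   /* external reset  */
#define RST_BORF   (1u << 2)   /* brown-out reset */
#define RST_WDRF   (1u << 3)   /* watchdog reset  */
#define RST_JTRF   (1u << 4)   /* JTAG reset      */

/* Hypothetical helper: map a saved MCUSR value to a short label.
 * Several flags may be set at once; supply-related causes are
 * reported first. */
const char *reset_cause(uint8_t mcusr)
{
    if (mcusr & RST_PORF)  return "power-on";
    if (mcusr & RST_BORF)  return "brown-out";
    if (mcusr & RST_WDRF)  return "watchdog";
    if (mcusr & RST_EXTRF) return "external";
    if (mcusr & RST_JTRF)  return "JTAG";
    return "none: no hardware reset flag set";
}
```

On the target, the saved value would be read once at startup and the register then written back to 0, so that flags from earlier resets do not accumulate. A result of "none" after a restart points to execution reaching address 0 by software.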

Total votes: 0

mcp601 wrote:
only disabled sending data to PC by device button.
So perhaps something in that routine interferes with the "normal" operation?

David (aka frog_jr)

Total votes: 0

awneil wrote:

Not sure I follow that?

If I understand correctly:

 

Not quite. And it is not that simple.

As I wrote in the first message:

1. On the 328P: 4800 baud software UART from the GPS module (a 115200 baud software UART is not possible), 115200 baud hardware UART to the PC. Arduino libraries: Nokia5510, SD card, IR...

2. On the 2560, the same except:

- The software UART from the GPS is replaced by a hardware UART, 4800-115200 baud

- All PROGMEM declarations in the Arduino libraries are removed and the code changed accordingly, to avoid any explicit pgm code

- Some port and interrupt settings are modified to reflect the change to the 2560

- Code is added for many other functions (statistics, graphics) that are not used in the main mode (showing parsed data on the LCD and sending it to the PC), which is where the reset actually happens (default behavior).

That is all.

 

Total votes: 0

OK - thanks for correction & clarification.

 

But even more reason to go back to the original, and only make small changes at a time.

 

But your suggestion of making a simplified test case for the UARTs is probably a good place to start.

 

But first, pay attention to clawson's advice about checking the reset reason ...

 

Total votes: 0

I also still wonder (in spite of "...I'm aware of the small error at 16 MHz and 115200 baud. However, that is irrelevant to the reset problem.") about the Baud Rate error at 115.2K BPS using 16MHz:

Even with the U2Xn bit the error exceeds what I would accept as allowable.

Backing off to 76.8K or 38.4K or lower would seem appropriate...
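
The size of that error follows from the datasheet formula baud = f_osc / (16 * (UBRR + 1)), with 8 in place of 16 when U2Xn is set. A minimal sketch of the arithmetic in plain C; the function names `ubrr_for` and `baud_error_percent` are hypothetical, and UBRR is assumed to be rounded to the nearest integer:

```c
#include <assert.h>
#include <stdint.h>

/* Nearest UBRR value for the requested baud rate.
 * Assumes baud > 0 and a baud rate low enough that the result is >= 0. */
uint32_t ubrr_for(uint32_t f_cpu, uint32_t baud, int u2x)
{
    uint32_t div = (u2x ? 8u : 16u) * baud;
    return (f_cpu + div / 2u) / div - 1u;    /* round to nearest */
}

/* Deviation of the baud rate actually generated, in percent. */
double baud_error_percent(uint32_t f_cpu, uint32_t baud, int u2x)
{
    uint32_t ubrr = ubrr_for(f_cpu, baud, u2x);
    double actual = (double)f_cpu / ((u2x ? 8.0 : 16.0) * (double)(ubrr + 1u));
    return (actual / (double)baud - 1.0) * 100.0;
}
```

For 16 MHz and 115200 baud this gives UBRR = 8 and roughly -3.5 % in normal mode, or UBRR = 16 and roughly +2.1 % with U2Xn set; 76800 and 38400 baud both come out at about +0.2 %.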

 

 

David (aka frog_jr)

Total votes: 0

But, again, how could a Tx baud rate error make the microcontroller crash?

 

And the OP says (IIUC) that it is Tx only - so no problems due to Rx baud error.

 

EDIT

 

Ah - OP says that GPS now has the option to do 115200.

 

Not clear whether it actually is at 115200, or still at the original 4800 ... ?

 

Last Edited: Fri. Dec 22, 2017 - 02:12 PM
Total votes: 0

clawson wrote:
I fear my previous points may have been too subtle. Before you can determine the cause here you need to know what kind of "reset" you have. By "reset" I take it you mean code execution has returned to 0. But you need to find out why PC now contains 0x0000. The MCU Status Register is there to help. If it's a power on reset (maybe Vcc dropped?) there's a flag to tell you. If it was the watchdog there's a flag for that. If no such flags are set then it's a soft reset. Something deliberately / accidentally put 0x0000 in PC. That could be a call through a NULL function pointer or maybe writes to stack frame autos over-ran and a function return address was corrupted to 0x0000 but I'm guessing this is the well-known GCC "catch all" operating. Hence my point about xxIE flags.

 

I tested most of these points earlier.

1. I have a stable battery-powered 5 V supply (4.8 - 4.95 V) that I can monitor with the device itself (in a different mode, not used in the default mode mentioned). During an explicit test it never dropped below 4.78 V.

2. The watchdog timer was used in the 328P-based device, but it is disabled for now until its use on the 2560 is proven.

3. The interrupt code responsible for reading UART data needs additional checking. The interrupt and port were changed according to the 2560 datasheet; however, I no longer recall whether the RX pin is checked before the actual read... I will certainly recheck that.

All in all, as things stand, it looks as if resending NMEA data from the main loop causes the reset. Of course, the buffer is small, 80 characters, and null-termination is ensured (it cannot be overrun: as mentioned, it is a circular buffer, even if garbage is received from the GPS)...
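
For comparison, a circular buffer shared between a receive ISR and the main loop is often structured along the lines below. This is a minimal sketch in plain C under stated assumptions, not the code from this project: the names `ring_t`, `rb_put` and `rb_get` are hypothetical, and the size is 128 rather than 80 so that the index wrap is a simple mask. The indices are single bytes, which an 8-bit AVR reads and writes atomically.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 128u                 /* power of two, so wrapping is a mask */
#define RB_MASK (RB_SIZE - 1u)

typedef struct {
    volatile uint8_t head;           /* written only by the producer (ISR)  */
    volatile uint8_t tail;           /* written only by the consumer (main) */
    uint8_t data[RB_SIZE];
} ring_t;

/* Producer side, e.g. called from the RX-complete ISR.
 * Returns false and drops the byte when the buffer is full. */
bool rb_put(ring_t *rb, uint8_t byte)
{
    uint8_t next = (uint8_t)((rb->head + 1u) & RB_MASK);
    if (next == rb->tail) return false;      /* full: never overwrite */
    rb->data[rb->head] = byte;
    rb->head = next;
    return true;
}

/* Consumer side, called from the main loop. Returns false when empty. */
bool rb_get(ring_t *rb, uint8_t *out)
{
    if (rb->tail == rb->head) return false;
    *out = rb->data[rb->tail];
    rb->tail = (uint8_t)((rb->tail + 1u) & RB_MASK);
    return true;
}
```

One slot is deliberately left unused so that "full" and "empty" can be told apart without a separate counter, which means the ISR and the main loop never write the same variable.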

 

When the code grew beyond 64 KB, everything started to break down more obviously; that is why I removed and corrected the code for all PROGMEM data...

All of that was actually the reason for my recent JTAG question...

 

 

Last Edited: Fri. Dec 22, 2017 - 02:41 PM
Total votes: 0

My point still remains: "catch all" ISR. What have you done to check this? Have you tried implementing BADISR_vect? If you change the state of an LED (say) within it do you see that occur?
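
A minimal sketch of such a trap, under stated assumptions: `BADISR_vect` and `ISR()` are the avr-libc names, but the LED on PB7 (pin 13 on an Arduino Mega 2560 board), the counter and the helper `note_bad_interrupt` are hypothetical choices for illustration. The non-AVR branch exists only so the logic can be compiled and checked off-target.

```c
#include <assert.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#include <avr/interrupt.h>
/* Assumption: a diagnostic LED is wired to PB7. */
#define DIAG_LED_ON()  do { DDRB |= _BV(PB7); PORTB |= _BV(PB7); } while (0)
#else
/* Off-target build: a plain variable stands in for the LED pin. */
volatile uint8_t diag_led;
#define DIAG_LED_ON()  do { diag_led = 1; } while (0)
#endif

volatile uint8_t bad_isr_hits;       /* saturating count of stray interrupts */

/* Record that an interrupt fired for which no handler exists. */
void note_bad_interrupt(void)
{
    if (bad_isr_hits < UINT8_MAX) bad_isr_hits++;
    DIAG_LED_ON();
}

#ifdef __AVR__
/* Replaces the avr-libc default handler, which jumps to address 0. */
ISR(BADISR_vect)
{
    note_bad_interrupt();
}
#endif
```

If the LED ever lights, some interrupt enable bit is set without a matching ISR; if it stays dark across a restart, the catch-all can be ruled out.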

Total votes: 0

awneil wrote:

Ah - OP says that GPS now has the option to do 115200.

Not clear whether it actually is at 115200, or still at the original 4800 ... ?

 

And that is exactly the second question in the original post - whether to use the same, a slightly different, or a clearly higher baud rate?

So far I have tested 115200, 57600 and 4800 baud on the hardware UART from the GPS. I even switched between different USART ports - the result is the same (reset).

BTW, again: whoever points to baud rate data errors is missing the point - it is an MCU reset issue, not a data error issue! That certainly should not reset the MCU. Please read the first post carefully.

 

 

 

Last Edited: Fri. Dec 22, 2017 - 03:24 PM
Total votes: 0

mcp601 wrote:
Standard software reset, I presume.

As mentioned earlier, don't presume on this.  Log each startup cause.  If there is no logged reset cause, then somehow the program ended up at 0.

 

Interesting situation, given the mention of code >64KB -- you ported the app to the different processor, and it doubled in size?

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Total votes: 0

clawson wrote:

My point still remains: "catch all" ISR. What have you done to check this? Have you tried implementing BADISR_vect? If you change the state of an LED (say) within it do you see that occur?

 

Yes, I will implement it and report result tomorrow. Thank you.

Total votes: 0

theusch wrote:

Interesting situation, given the mention of code >64KB -- you ported the app to the different processor, and it doubled in size?

 

Why do most people answer without reading the source?

"- Code is added for many other functions (statistics, graphics) that are not used in the main mode (showing parsed data on the LCD and sending it to the PC), which is where the reset actually happens (default behavior)."

May I ask a moderator to delete such messages, including this one? They just create unnecessary pollution. Clawson?

In my original post I did not even ask for any analysis or suggestions, just about a hardware behavior I have not found in the datasheet.

Last Edited: Fri. Dec 22, 2017 - 03:29 PM
Total votes: 0

clawson keeps pointing you to methods that might establish how the reset is occurring, and you keep ignoring his advice.

  1. Implement the BADISR_vect to see if that is being triggered
  2. Add a routine to check the value of MCUCSR on startup to see what might have triggered the reset

David (aka frog_jr)

Total votes: 0

frog_jr wrote:

clawson keeps pointing you to methods that might establish how the reset is occurring, and you keep ignoring his advice.

 

I do not need insinuations either...

clawson makes some valuable points about the reset situation, which I will certainly follow up, but some others create quite a lot of pollution, including your post about the baud rate data error (how is that valuable? If you can explain, use a PM or open another topic)...

I have also replied that I will run a test case for it. Did you read that?

All in all, it is frustrating to ask one question and get numerous unrelated replies, insinuations and sarcastic comments. It leaves quite a bitter impression...

 

Last Edited: Fri. Dec 22, 2017 - 04:49 PM
Total votes: 0

mcp601 wrote:
 it is an MCU reset issue, not a data error issue! That certainly should not reset the MCU.

It is certainly possible that  corrupt data could cause code to crash - ie, reset.

 

 

Total votes: 0

mcp601 wrote:
I did not even ask for any analysis or suggestions, just about a hardware behavior

So that was answered in #3:

I wrote:
it's entirely possible to have both UARTs active simultaneously - but there is no reason why this, in itself, should cause a reset

And again in #8:

I wrote:
As far as the hardware is concerned, there is no inherent reason why using two - or more - UARTs simultaneously should cause any problems.

 

If that's all you wanted, then mark one of them as The Solution: https://www.avrfreaks.net/comment...

Total votes: 0

After some further testing:

1. There is no MCUCSR register in the 2560; there are MCUSR (0x34) and MCUCR (0x35). Just for clarity: the functionality is the same...

2. After enabling the watchdog and deliberately triggering it, the WDRF flag in the MCUSR register should be set, according to the datasheet. The MCU does reset; however, WDRF is 0. In fact both MCUSR and MCUCR are 0. The value is printed first thing inside setup().

3. BADISR_vect is defined and never triggered.

It is possible that:

1. MCUSR and MCUCR are reset automatically somewhere in the code by an Arduino init routine. However, I failed to find the specific code. A minimal test avoiding the Arduino software, or better, clean ASM, is once again necessary to eliminate that suspicion. I do not see any errata documentation for the 2560.

2. The 2560 is a fake or only partially functional. That would explain all the trouble, except that it was bought from a reputable electronics shop for 10 euros and never "abused" in any way.

 

Total votes: 1

Suggestions of fake silicon or compiler bugs are usually grasping at straws trying to cover the inevitable: your code simply has bugs.

Total votes: 0

mcp601 wrote:
It is possible that:

1. MCUSR and MCUCR are reset automatically somewhere in the code by arduino init routine.

Indeed - see: http://forum.arduino.cc/index.ph...

 

EDIT

 

The Arduino forum doesn't necessarily seem to go to (quite) the right position in the thread with that link.

 

Specifically, it's the post by westfw: #9, Jun 11, 2014, 09:39 pm 

Last Edited: Sat. Dec 23, 2017 - 12:51 PM
Total votes: 0

clawson wrote:
Suggestions of fake silicon or compiler bugs are usually grasping at straws trying to cover the inevitable: your code simply has bugs.

Agreed.

 

But now prepare to be castigated for sarcasm, insinuation, pollution, etc, ...

 

:(

This reply has been marked as the solution. 
Total votes: 0

Indeed, it uses the default Arduino bootloader, and I was not aware that it resets MCUSR... Thanks, awneil, for at least one useful comment instead of misreading and misunderstanding the topic, insinuation, sarcasm....

Clawson, you too missed the point of the topic, even though you are a moderator. Mixing up possibilities with suggestions, your incorrect conclusion, and your refusal to delete off-topic messages say much about this forum...

There is quite a low ability here even to understand the essence of the topic, despite it being clearly explained...  :(

Now, since this thread has gone down the drain, and there is also one thing left that I actually should have done first (check the old PCB, especially for cold joints, corrosion or shorts, particularly in the 74HC4050 block):

1. As far as I have tested, a reset by code can also be triggered (besides unbalanced PUSH/POP) by an unknown opcode (including the default 0xFFFF)

2. Explain how a bug in data read by RX, with a circular buffer and ensured null-termination, can reset the MCU?

3. Even if point 2 is possible, how can sending corrupted data with ensured null-termination to the PC reset the MCU?

4. Can a short or cold joint in the 74HC4050 block, or even unused input pins on both (MCU and 4050), be the cause of the reset?

All this is rhetorical. And for hardware questions, in any case, the free Microchip/Atmel support is much better.

Last Edited: Sat. Dec 23, 2017 - 02:21 PM
Total votes: 0

I cannot address your numbered question list to your satisfaction, as you have reported no reset cause.  Thus, the AVR is not resetting.

 

But indeed, it is arriving back at zero.  And that tells us something.

 

Side note on 1. -- on AVR8s, "unknown" op code of 0xffff (which unused flash is populated with) is generally benign.  For almost all purposes, think of it as a NOP.  So, if your PC (Program Counter) goes awry and ends up in unused flash the AVR will indeed march along until the PC wraps, and the zero is reached.

 

How does the PC go to 0?  There are probably many causes; respondents above have mentioned some of the most common, which are thus worth considering first to make efficient use of the time spent digging out the cause of your situation.  As mentioned, a common one when doing a port, especially of a large app, could be a "misspelled interrupt vector", given that the BADISR action is a jump to 0.   It isn't clear to me whether you implemented your own trap and have eliminated that [most?] common situation.

 

Hmmm--a bootloader does a jump to 0, but no bootloader has been mentioned.

 

It could be something with the >64KB and trampolines and such -- I'm not familiar with your toolchain.  As this appears to be a sticky situation it might well be worth your while to cut the app down to below 64KB and see if the symptoms change.

 

The obvious C situation -- from when AVRs were just a twinkle in Alf's eye -- is using a null (value of 0) function pointer.  Does your code use function pointers anywhere?  I'm not a heavy "call back" user, but when I do use them in my production apps I'll have a guard test before invoking, and trap/log code if out of range.
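
A guard test of the kind described can be sketched as below. This is an illustration only, with hypothetical names (`callback_t`, `invoke_guarded`, `null_callback_traps`, `demo_callback`); it checks for a null pointer but does not attempt the range check also mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*callback_t)(void);

volatile uint16_t null_callback_traps;   /* times a null pointer was caught */

/* Call `cb` only if it is non-null; otherwise record the event
 * instead of jumping to address 0. */
bool invoke_guarded(callback_t cb)
{
    if (cb == NULL) {
        null_callback_traps++;
        return false;
    }
    cb();
    return true;
}

/* Example callback, here only to show the call path. */
unsigned demo_calls;
void demo_callback(void)
{
    demo_calls++;
}
```

Calling `invoke_guarded(demo_callback)` runs the callback and returns true; `invoke_guarded(NULL)` returns false and bumps the trap counter, which could then be logged or shown on a display.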

 

 

 

 

 

 

 

 

 

 

Total votes: 0

theusch wrote:
no bootloader has been mentioned

But Arduino has - I guess a bootloader is implicit in that?

 

it might well be worth your while to cut the app down to below 64KB and see if the symptoms change.

This was suggested in #13 (and affirmed in #14) - but no results reported.

 

Total votes: 0

mcp601 wrote:
Explain how a bug in data read by RX ... can reset the MCU?

Just one obvious example, straight off the top of my head: If the data is used to index an array/table, then a corrupt byte could cause problems...

 

Can a short or cold joint in the 74HC4050 block, or even unused input pins on both (MCU and 4050), be the cause of the reset?

Yes,  hardware faults/errors around the microcontroller could cause problems.

 

This would be one reason for checking the Reset Reason: to see if your power supply is actually dropping-out - perhaps due to a faulty/intermittent connection ...
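
A minimal sketch of decoding the reset flags, written so that it can be tried on a PC. The bit positions are those of the MCUSR register in the ATmega2560 datasheet; the `_BIT` macro names, the enum and `classify_reset` are hypothetical. On the target you would pass in a copy of MCUSR taken early at startup and then clear the register.

```c
#include <stdint.h>

/* MCUSR bit positions (ATmega2560 datasheet); macro names are made up here
   so they do not clash with the ones in <avr/io.h>. */
#define PORF_BIT  0   /* power-on reset  */
#define EXTRF_BIT 1   /* external reset  */
#define BORF_BIT  2   /* brown-out reset */
#define WDRF_BIT  3   /* watchdog reset  */
#define JTRF_BIT  4   /* JTAG reset      */

typedef enum {
    RESET_NONE,       /* no flag set */
    RESET_POWER_ON,
    RESET_BROWN_OUT,
    RESET_EXTERNAL,
    RESET_WATCHDOG,
    RESET_JTAG
} reset_reason_t;

/* Several flags can be set at once; power-on and brown-out are checked first. */
reset_reason_t classify_reset(uint8_t mcusr)
{
    if (mcusr & (1u << PORF_BIT))  return RESET_POWER_ON;
    if (mcusr & (1u << BORF_BIT))  return RESET_BROWN_OUT;
    if (mcusr & (1u << EXTRF_BIT)) return RESET_EXTERNAL;
    if (mcusr & (1u << WDRF_BIT))  return RESET_WATCHDOG;
    if (mcusr & (1u << JTRF_BIT))  return RESET_JTAG;
    return RESET_NONE;
}
```

If the flags were cleared after the previous read and the result is `RESET_NONE`, that hints that the code restarted without a hardware reset (for example a jump to address 0). Note that a bootloader may already have read or cleared the register before the application runs.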

 

 

 

Quote:
for hardware questions, in any case, Microchip/Atmel free support is much better.

But, again, this is unlikely to be a problem with the MCU hardware - far more likely a software bug.

 

 

EDIT

 

typos

Last Edited: Sat. Dec 23, 2017 - 03:03 PM
Total votes: 1

mcp601 wrote:
Since the old project prototype with the 328p worked flawlessly for almost 3 years, it is quite unlikely that the clean C/C++ code I wrote causes the problem.

Beware that it is entirely possible - and not uncommon - for code to have latent ("hidden") bugs that give no (visible) symptoms. Possibly for extended periods of time.

 

And making changes - even apparently unrelated changes - will often bring such bugs to light.

 

EDIT

 

 "long time"

Last Edited: Sat. Dec 23, 2017 - 03:05 PM
Total votes: 0

awneil wrote:

But, again, this is unlikely to be a problem with the MCU hardware - far more likely a software bug.

 

Again, my question was strictly a hardware one about proper multiple USART usage. I did not find it in the datasheet, and I only described the basic functionality, not details, since I expected an answer only for that. But no one read or understood that, leading many of you to make bad assumptions while chasing the cause of the reset!

 

This is my long-term hobby, and besides being a professional senior desktop programmer, I'm hardly a beginner, even though this is one of my first messages here. Even so, I have some minor to major "gaps" in this hobby, but not in C/C++ or numerous higher-level languages and ASM, so nobody can make such claims. However, I'm trying to fill some "gaps" with this hobby, looking deep "under the hood", reading datasheets and asking "in a reputable place" (at least I consider it one, or considered it so).

 

And for the questions marked as rhetorical I'm aware of the answer, so yes, they really are rhetorical.

 

Therefore, my question was short and explicit, specifically about what I need to know but failed to find or understand.

 

One of the rules in any higher-level programming language that supports procedural types is to avoid them as much as possible. They are quite handy in test cases; however, they should be avoided, especially in the embedded world.

 

An error in data sent by UART cannot cause an MCU HW reset; that is simply not possible, AFAIK. If the code cannot handle received garbage, or does not handle the length of the buffer or null-termination, that would be a fatal error only a novice programmer makes, and it is always reproducible. Etc...

 

Such "stories", as well as the "great manners of participants" provoked by an elementary hw question and the current "washing of hands", leave this thread quite polluted and down the drain... And leave quite a bitter taste in the mouth...

 

That is the main reason why any of you who considers himself able to provide a useful on-topic answer should make it a priority to focus on the essence, unless the OP asks something else related to it but not quite on topic. Otherwise, "Help me!", "I'm lost!" and similar, even without a reaction from the moderator or even with his support, are quite adequate topic titles for anything.

 

I hope it is all clear and there is no reason to continue further off topic. As the old Latin saying goes: "Sapienti sat!" :)

 

Last Edited: Sat. Dec 23, 2017 - 04:49 PM
Total votes: 0

I'm lost.  I tried to give a fleshed-out response, but we are still, apparently, off-topic.

 

So I went to the original post.  The first thing I noticed was that OP had marked a solution.  I must have missed that!  So I went there--but #19 isn't marked as the solution. ???

 

And OP goes on later to say that pointing out that bad rate/clock speed combination, though not practical for useful and reliable comms, is beside the point of the resets.
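
For reference, the rate/clock combination can be checked with a few lines of arithmetic. This is a minimal sketch assuming the standard AVR asynchronous-mode relation baud = f_cpu / (N * (UBRR + 1)), with N = 16 in normal mode and N = 8 with U2X set; the function names are hypothetical.

```c
#include <stdint.h>

/* Nearest UBRR value for the requested baud rate (u2x != 0 selects
   double-speed mode, i.e. a divisor of 8 instead of 16). */
uint16_t ubrr_for(uint32_t f_cpu, uint32_t baud, int u2x)
{
    uint32_t divisor = (u2x ? 8u : 16u) * baud;
    return (uint16_t)((f_cpu + divisor / 2u) / divisor - 1u);
}

/* Baud rate error in tenths of a percent (e.g. -35 means -3.5 %). */
int32_t baud_error_tenths_pct(uint32_t f_cpu, uint32_t baud, int u2x)
{
    uint32_t ubrr   = ubrr_for(f_cpu, baud, u2x);
    uint32_t actual = f_cpu / ((u2x ? 8u : 16u) * (ubrr + 1u));
    return ((int32_t)actual - (int32_t)baud) * 1000 / (int32_t)baud;
}
```

With a 16 MHz clock and 115200 baud this gives UBRR = 8 and about -3.5 % in normal mode, or UBRR = 16 and about +2.1 % with U2X set, which matches the tables in the datasheet. Whether that error is tolerable depends on how far off the other end is; it would show up as garbled data rather than as a reset.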

 

OP keeps going back to "resets".  I've stated several times that there is no evidence of resets.  IMO if you continue on this quest for multiple-USART-usage-causing-AVR8-resets, you will be on a fruitless search for the Holy Grail.  [Side note:  I have several lines of production controllers using Mega640-class AVR8s, and two to four USARTs active.  There are no timing problems regardless of when e.g. start bits hit on multiple USARTs.  If there were, those apps with all four going at the same time would have eventually hit such timing windows, and operation would not be smooth.  So I would know about it.]

 

 


Total votes: 0

mcp601 wrote:
I did not find it in the datasheet, and I only described the basic functionality, not details, since I expected an answer only for that.

So go on, then - what "details" is it that you require?

 

What do you feel is lacking from what's in the datasheet?

 

An error in data sent by UART cannot cause an MCU HW reset; that is simply not possible, AFAIK.

Well, it can cause the software to crash; which can appear as a reset - see #38 for one example.

 

 

If the code cannot handle received garbage, or does not handle the length of the buffer or null-termination, that would be a fatal error only a novice programmer makes

Well, an experienced programmer should be less likely to make such mistakes - but of course it is still possible.

 

 

Total votes: 0

mcp601 wrote:
Clawson, you missed the point of the topic as well, even though you are a moderator. Mixing possibilities with suggestions, your incorrect conclusion and your failure to delete off-topic messages say much about this forum...

 

Yes, it says that he has the knowledge and vision to know that 99% of the problems we see are programming errors and not Physical hardware.  EVERYONE here has been trying to get you to see that.  

 

mcp601 wrote:
There is quite low ability here even to understand the essence of the topic, despite it being clearly explained...  :(

NONSENSE!!  It is YOU who cannot explain yourself clearly.

 

mcp601 wrote:
All this is rhetorical. And for hardware questions, in any case, Microchip/Atmel free support is much better.

Then by all means go there.

 

 

 

99.9% of the time the problems brought here are programming/code errors, and not the hardware.  The other 0.1% are hardware related, and usually it's because the OP does not bother to read the errata in the datasheets.

 

 

All the best on your quests.

Jim

 

 

If you want a career with a known path - become an undertaker. Dead people don't sue! - Kartman

Why is there a "Highway to Hell" and only a "Stairway to Heaven"? A prediction of the expected traffic load?  - Lee "theusch"

 

Speak sweetly. It makes your words easier to digest when at a later date you have to eat them ;-)  - Source Unknown

Please Read: Code-of-Conduct

Atmel Studio6.2/AS7, DipTrace, Quartus, MPLAB user

Total votes: 0

Again, my question was strictly hardware one about proper multiple USART usage.

Oh well, here it goes...

 

As others have said, many of us have multiple projects that use multiple USARTs, typically interrupt driven and ring buffered, with NO PROBLEMS.

 

So, as said before, the answer to your highly focused question is this:  No, there are no intrinsic limitations in using multiple hardware modules simultaneously.

(And hence no discussion of any such limitations within the data sheet.)
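
For anyone finding this later, here is a minimal sketch of the usual interrupt-driven, ring-buffered receive pattern, with one buffer per USART. It is written as plain C so it can be tried on a PC; the names `ring_buffer_t`, `rb_put` and `rb_get` and the buffer size are assumptions for this example. On the AVR, `rb_put` would be called from each USART's receive interrupt with the byte read from that USART's data register, and `rb_get` from the main loop.

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 64u   /* must be a power of two, at most 256 */

typedef struct {
    volatile uint8_t head;      /* written only by the producer (ISR)       */
    volatile uint8_t tail;      /* written only by the consumer (main loop) */
    uint8_t data[RB_SIZE];
} ring_buffer_t;

/* Producer side: store one byte; returns false (byte dropped) when full. */
bool rb_put(ring_buffer_t *rb, uint8_t byte)
{
    uint8_t next = (uint8_t)((rb->head + 1u) & (RB_SIZE - 1u));
    if (next == rb->tail) {
        return false;           /* full: one slot is always left empty */
    }
    rb->data[rb->head] = byte;
    rb->head = next;
    return true;
}

/* Consumer side: fetch one byte; returns false when the buffer is empty. */
bool rb_get(ring_buffer_t *rb, uint8_t *out)
{
    if (rb->head == rb->tail) {
        return false;
    }
    *out = rb->data[rb->tail];
    rb->tail = (uint8_t)((rb->tail + 1u) & (RB_SIZE - 1u));
    return true;
}
```

Because each index is a single byte and has exactly one writer, the main loop and the ISR do not need to lock each other out in this single-producer, single-consumer arrangement. Each USART gets its own `ring_buffer_t`, so the modules do not share any state.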

 

That said, I guess the Thread could be closed.

 

Of course if the OP is frustrated with a failed project migration to a new uC, then the above "off topic" discussions and comments might well lead one to a solution!

 

JC

 

Total votes: 0

jgmdesign wrote:

Yes, it says that he has the knowledge and vision to know that 99% of the problems we see are programming errors and not Physical hardware.  EVERYONE here has been trying to get you to see that.  

 

 

NONSENSE!!  It is YOU who cannot explain yourself clearly.

 

Then by all means go there.

 

You all present yourselves in a great light. Congratulations!

And you definitely deserve one another.

As far as I'm concerned, you can delete the whole thread, even any I started.

 

Only one thing is missing: to make this forum closed and admit only electrical engineers.

Similis simili gaudet, in the end.

 

I have already gone, thank you very much for your involvement as well.

Last Edited: Sat. Dec 23, 2017 - 09:45 PM
Total votes: 0

If you dropped a bunch of PROGMEM type stuff, you may be using more SRAM for stuff. What's your actual runtime data usage? Stack usage? The hardware can all handle multiple hardware UARTs at once, but if you have too much stuff in main memory, you can start getting *really weird* failure modes that can be very hard to diagnose, and certainly "apparently resetting" would be an obvious one.
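
One common way to answer the stack-usage question is "stack painting": fill the unused RAM with a known byte at startup and later count how much of it is still intact. Below is a minimal sketch of the idea on an ordinary array so it can be tried on a PC; the names `paint_region` and `untouched_bytes` and the canary value are assumptions. On the AVR the region would be the free RAM between the end of the static data/heap and the stack, which grows downward toward it.

```c
#include <stddef.h>
#include <stdint.h>

#define CANARY 0xC5u   /* arbitrary fill pattern */

/* Fill the region with the canary pattern (done once, early at startup). */
void paint_region(uint8_t *region, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        region[i] = CANARY;
    }
}

/* Count canary bytes still intact from the low end of the region.
   With a downward-growing stack entering from the high end, this is
   the amount of RAM that has never been used. */
size_t untouched_bytes(const uint8_t *region, size_t len)
{
    size_t n = 0;
    while (n < len && region[n] == CANARY) {
        n++;
    }
    return n;
}
```

If the count ever gets close to zero, the stack has been running into the data area, which would fit the "really weird" failure modes described above.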

Total votes: 0

@the_real_seebs - did you not notice:

in #45, mcp601 wrote:
I have already gone

 

the_real_seebs wrote:
if you have too much stuff in main memory, you can start getting *really weird* failure modes that can be very hard to diagnose

 

Indeed - as noted in #3:

 

I wrote:
Obvious things to check would be stack overflow & buffer overrun.

Total votes: 0

Oh, I noticed. But the thing about the Internet is, this thread will still be here, and other people might some day have similar issues and want help.

 

I'd rather post a helpful suggestion than not, so future generations might find the thread more rewarding.