Issue with atmega328pb - I/O pins not reading or turning on

#1

Hello all,

We have an application running on an ATmega328PB. It has been in development for several months, and we have just started testing it in-house on multiple pieces of hardware. In doing so, we have discovered a bizarre issue that I just can't figure out.

Every X power-on resets (where X varies widely: so far as low as 20 and as high as 100), the device enters a state where the I/O ports do not respond to writes or reads. Some examples:

  • A read of ADC5 with a 3.3V signal applied returns 0 (this result is inferred from the application behaviour, not observed directly).
  • There are two status LEDs connected from the 5V rail to PORTC pins (with appropriate resistors). These are controlled by the application and should be off or flashing, but they are constantly on (the two I/O pins are constantly low).
  • There is a debug LED that should be flashing, but its pin is constantly low.
  • A temperature sensor is attached to ADC6. The application is returning ADC values suggesting that there is ~35V on that pin (there isn't!)

 

All these features have been tested and work correctly in normal operation (i.e. almost all power-on-resets result in a successful start and sensible behaviour of the application and the hardware).

 

When this condition occurs, the application code is still running. I can send/receive commands/data to the application, but if that command/data ends up touching an I/O pin, it won't change the state of that pin, or read the pin correctly.

The sections of code that interface with I/O pins are certainly running - or at least, lines of code before/after those lines are executed (for instance, logging commands placed immediately before/after will result in output).

 

Clearly, the I/O ports are still working to some extent, because both USARTs are sending/receiving data.

 

I have only seen this behaviour at POR, not when reset from external reset or the watchdog.

 

Information:

 - ATMEGA328PB, 16MHz crystal

 - Fuses (Low, High, Extended): 0xCE, 0xDE, 0xFF

 - Power supply: input is 48VDC, 5V through linear regulator (<-- if you're wondering, yes this gets hot! It's a D2PAK package. Other design constraints meant we couldn't use a switching supply. The MCU is far enough from the regulator that it doesn't get very warm, and the error can occur when the board has not had time to warm up.)

 - I am running a bootloader, which is optiboot modified to use RS485

 - The application uses 61.6% of program memory, 58.1% of data.

 - The watchdog is enabled with a 500ms timeout.

Some additional background which I don't think is particularly relevant, but included for completeness:

 - the application takes MODBUS commands over an RS485 bus and controls some LEDs using PWM from the 16-bit timers.

 - The second USART is used as a logging output

 - I am using the output compare on TimerB, and I have implemented the workarounds described here: https://www.avrfreaks.net/comment/1717946#comment-1717946

 

I'm truly baffled by this. Are there conditions in which the I/O hardware might enter a state where they become unresponsive, unreliable or otherwise deviate from normal operation?

Since it can take a long time to produce the issue, I'm wary of just testing around it blindly, so I thought I would post early and see if anyone has any advice on the best approach to this.

Right now I'm focusing on producing software builds with additional diagnostics in the hope of getting more information about the MCU state.
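
One low-cost diagnostic along these lines is to log the reset cause at every start-up, so that a bad start can be tied to the flags in MCUSR. Below is a minimal sketch in C, assuming the usual MCUSR bit positions of the ATmega328P/PB family (PORF, EXTRF, BORF, WDRF in bits 0 to 3). The function name and the text format are made up for illustration, and the register value is passed in as a parameter so the function can also be exercised off-target.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* MCUSR flag bits on the ATmega328P/PB family. */
#define RESET_PORF  (1u << 0)  /* power-on reset */
#define RESET_EXTRF (1u << 1)  /* external reset */
#define RESET_BORF  (1u << 2)  /* brown-out reset */
#define RESET_WDRF  (1u << 3)  /* watchdog reset */

/* Hypothetical helper: writes the names of all set reset flags into buf
 * (for example "POR BOR "), or "none" if no flag is set. Returns buf. */
const char *describe_reset(uint8_t mcusr, char *buf, size_t len)
{
    snprintf(buf, len, "%s%s%s%s",
             (mcusr & RESET_PORF)  ? "POR " : "",
             (mcusr & RESET_EXTRF) ? "EXT " : "",
             (mcusr & RESET_BORF)  ? "BOR " : "",
             (mcusr & RESET_WDRF)  ? "WDT " : "");
    if (len > 0 && buf[0] == '\0') {
        snprintf(buf, len, "none");
    }
    return buf;
}
```

On the target this would be called once, early in main(), with the value of MCUSR, and the text sent to the logging USART. Because a bootloader runs first in this setup, it may already have cleared MCUSR, so it is worth checking how the modified Optiboot build hands the reset cause over to the application.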

 

Any help, information or advice is (as always) very much appreciated. I hope I've been clear enough about what's happening. Apologies for any omissions, errors or spelling/grammar mistakes. This is causing a bit of stress.

 

Thanks in advance,

James 

 

Edit: added clarifying sentence about USARTs working

Edit 2: added extra sentence about reset behaviour

Last Edited: Mon. Aug 21, 2017 - 03:59 PM
#2

When you are doing the POR, has the power had a chance to completely go to 0V?

Your fuse settings suggest you are not using the Brown-out Detection... This "might" be a source of the problem...
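
For reference, the fuse bytes quoted in the first post can be picked apart with a few masks. This is only a sketch, assuming the ATmega328P-style layout, which I believe the ATmega328PB shares for these fields: BODLEVEL[2:0] in bits 2..0 of the extended fuse, SUT[1:0] in bits 5..4 and CKSEL[3:0] in bits 3..0 of the low fuse. The helper names are hypothetical.

```c
#include <stdint.h>

/* BODLEVEL[2:0]: low three bits of the extended fuse byte (assumed layout). */
uint8_t fuse_bodlevel(uint8_t efuse)
{
    return (uint8_t)(efuse & 0x07u);
}

/* A fuse bit reads 1 when unprogrammed; BODLEVEL = 111 means BOD is off. */
int bod_enabled(uint8_t efuse)
{
    return fuse_bodlevel(efuse) != 0x07u;
}

/* SUT[1:0]: start-up time selection, bits 5..4 of the low fuse byte. */
uint8_t fuse_sut(uint8_t lfuse)
{
    return (uint8_t)((lfuse >> 4) & 0x03u);
}

/* CKSEL[3:0]: clock source selection, bits 3..0 of the low fuse byte. */
uint8_t fuse_cksel(uint8_t lfuse)
{
    return (uint8_t)(lfuse & 0x0Fu);
}
```

With the posted values, bod_enabled(0xFF) is 0 because all BODLEVEL bits are unprogrammed, and the low fuse 0xCE decodes to SUT = 0 and CKSEL = 0xE. That combination is worth looking up in the crystal oscillator start-up time table of the datasheet.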

 

Edit: typo

David (aka frog_jr)

Last Edited: Mon. Aug 21, 2017 - 03:26 PM
#3

I'll have to do some more testing to determine that. Good call on the BOD though, I hadn't considered that (since, as you say, I'm not using it).

#4

A read of ADC5, with a 3.3V signal applied is read as 0

Does the input voltage go to zero when the chip is off? Otherwise the chip is likely to be phantom powered by the ADC input with unknown consequences.

John Samperi

Ampertronics Pty. Ltd.

www.ampertronics.com.au

* Electronic Design * Custom Products * Contract Assembly

#5

Need to see a schematic of the board, including the power supply.

As mentioned above:

Pseudo-powering the micro, with power applied via an I/O pin while the micro's power is off, can latch up / lock up the chip.

Likewise, as mentioned, there are specs on the power supply that are not usually an issue, but they include such things as the rate of rise of Vcc, and the requirement that Vcc must be monotonically increasing.

If, at some partial Vcc voltage on power up, some component(s) suddenly kick on and draw more current, and the power supply can't handle it, then Vcc can have a glitch in its waveform on power up, and that can lock up the chip.

The other item to add is to make sure that EVERY Vcc and AVcc pin is connected to V+, that EVERY Ground and AGround pin is connected to Ground, and that there is a by-pass cap, perhaps 0.1 uF, across EVERY Vcc/Gnd and AVcc/Gnd pair of pins, mounted as close to the micro's pins as possible.

It is very possible to have erratic operation of the micro if it is not properly by-passed: it can work sometimes and fail sporadically...

Recall, also, that AVcc powers the ADC and a few of the port pins (the lower Port C pins on this family, I believe; I'm not looking at a data sheet at the moment).

If you forgot to connect AVcc to V+ then you will have erratic operation on those pins.

 

JC

#6

It might not be hardware at all. It could be a variable that, for instance, is not setting your DDRx port directions, so that thereafter the ports won't work at all, even if various app commands are arriving.

Check that your initializations are not being interrupted (hence don't allow any interrupts until initializations are completed).

Make sure you haven't implemented any reentrant code (at least not without extreme care): don't call routines within interrupts which might also be called elsewhere.

Be on the lookout for race conditions or misused variables (such as a slightly wrong name)... a debounce event happens sooner than expected, so the counter variable my_port_val is occasionally different than on other runs. Except you also used it where you meant to use my_port_dir (even though you've stared at the code 100 times).

Ignoring carries and rollovers can cause whacko issues. Or "temporary" overflows (3+123456789+2-123456789==5??? maybe, maybe not).
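
The point about "temporary" overflows can be shown with 16-bit unsigned arithmetic, which is the width of a plain unsigned int on an AVR. This is a sketch with made-up function names: an add/subtract chain survives an intermediate wrap, because unsigned arithmetic is modular, but a multiply-then-divide does not.

```c
#include <stdint.h>

/* Add/subtract chain held in 16 bits. The running sum may wrap past
 * 65535, but the final subtraction wraps it back again. */
uint16_t add_sub_wrapped(uint16_t a, uint16_t big, uint16_t b)
{
    uint16_t acc = a;
    acc = (uint16_t)(acc + big);
    acc = (uint16_t)(acc + b);
    acc = (uint16_t)(acc - big);
    return acc;
}

/* Scale with the product truncated to 16 bits, as happens when the
 * multiplication is done in a 16-bit type. The lost high bits are
 * not recovered by the division. */
uint16_t scale_narrow(uint16_t value, uint16_t mul, uint16_t div)
{
    uint16_t product = (uint16_t)((uint32_t)value * mul);
    return (uint16_t)(product / div);
}

/* Same scaling with a 32-bit intermediate product. */
uint16_t scale_wide(uint16_t value, uint16_t mul, uint16_t div)
{
    uint32_t product = (uint32_t)value * mul;
    return (uint16_t)(product / div);
}
```

add_sub_wrapped(3, 65535, 2) returns 5 even though the running sum wraps, while scale_narrow(1023, 500, 1023) returns 51 where scale_wide(1023, 500, 1023) gives the intended 500.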

 

"I have only seen this behaviour at POR, not when reset from external reset or the watchdog" ...be careful, it might be happening either way but much more rarely (especially if it is some sort of "race" error).

Strip a few things out, then more, until it stops failing, then add them back in. That can quickly narrow down the problem, or give insight.

"Right now I'm focusing on producing software builds with additional diagnostics" ...forget about that for now, it will just add confusion & uncertainty (you can't get a regular LED working, let alone one used to tell you diagnostics; how would you believe anything?)... Remove code, cancel routines; eventually you will be down to 15 lines of code & it will work :)

If it is gnd lines, power lines, caps, wiring, xtal, etc., make a small pattern blinker program... load it & see how often it fails (or doesn't). That will tell you whether the hardware is junked or ok.
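
A pattern blinker of that kind could look like the sketch below. The LED pin assignment (PC0 and PC1), the pattern and the 250 ms step are assumptions for illustration, not taken from the board in question. The LEDs are treated as active-low because the first post says they are wired from the 5V rail. The pattern logic sits in plain functions so it can be checked off-target, and the AVR-specific part only compiles with avr-gcc.

```c
#include <stdint.h>

#define LED_MASK 0x03u  /* hypothetical: status LEDs on PC0 and PC1 */

/* Which LEDs should be lit at a given step: one, the other, both, none. */
uint8_t pattern_for_step(uint8_t step)
{
    static const uint8_t pattern[] = { 0x01u, 0x02u, 0x03u, 0x00u };
    return (uint8_t)(pattern[step % 4u] & LED_MASK);
}

/* New port value: LED bits driven low where lit (active-low LEDs),
 * all other port bits left as they were. */
uint8_t port_value(uint8_t port, uint8_t lit)
{
    return (uint8_t)((port & (uint8_t)~LED_MASK) |
                     ((uint8_t)~lit & LED_MASK));
}

#ifdef __AVR__
#ifndef F_CPU
#define F_CPU 16000000UL  /* 16MHz crystal, as in the first post */
#endif
#include <avr/io.h>
#include <util/delay.h>

int main(void)
{
    uint8_t step = 0;

    DDRC |= LED_MASK;  /* LED pins as outputs */
    for (;;) {
        PORTC = port_value(PORTC, pattern_for_step(step));
        step++;
        _delay_ms(250);
    }
}
#endif
```

If a program this small also fails after some power-on resets, the application code is unlikely to be the cause.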

When in the dark remember-the future looks brighter than ever.

Last Edited: Tue. Aug 22, 2017 - 12:15 AM
#7

Hello all,

I'm in the same situation. In my case, I'm replacing an ATMEGA328P with an ATMEGA328PB, and the hardware is the same.

Sometimes, after a power-on reset, some I/O pins don't respond. The program is running, but these pins don't respond.

The symptom is not always the same. Sometimes four pins of PORTD are not responding, and sometimes it is other pins.

The steps to generate the .hex file were:

- I created a new project for the ATMEGA328PB

- I imported the files from the ATMEGA328P project

- I changed the register names which have changed (PRR -> PRR0, for example)

- To finish, I compiled

jfowkes, did you resolve your issue?

Thanks

 

#8

As was mentioned in the above issue, are you using BOD and is it set for the correct level?

 

Jim

 

Mission: Improving the readiness of hams world wide : flinthillsradioinc.com

Interests: Ham Radio, Solar power, futures & currency trading - whats yours?

 

#9

Yes, I'm using BOD. I'm using the same configuration as in the previous project (ATMEGA328P).

#10

depunto wrote:
I'm replacing ATMEGA328P to ATMEGA328PB, and the hardware is the same.

Well, almost...

Pins 3/6 used to be GND/VCC, now they are I/O pins (PE0/PE1). Has the h/w been modified to account for this?

Are you using the Analog Comparator? The ACO signal can appear on pin 3.

 

Jim

 

Mission: Improving the readiness of hams world wide : flinthillsradioinc.com

Interests: Ham Radio, Solar power, futures & currency trading - whats yours?

 

Last Edited: Thu. Mar 22, 2018 - 01:19 PM
#11

The PCB is the same. As I understand it, that is not a problem.

After reset, PORT E is configured as input (high impedance), so the microcontroller will just read 1 or 0 on this port. That should not be a problem, and the port should not be damaged.

 

 

Last Edited: Thu. Mar 22, 2018 - 01:29 PM
#12

I'm not using the Analog comparator

#13

I did fix this issue in the end, though I'm afraid I can't remember exactly how (it's in the comments/commit log, but I don't have access to those anymore). Should have replied when I fixed it.

 

From memory, I did the following:

 - Turn on brownout detect (can't remember the setting)

 - Set the clock startup as slow as possible

 - Turn on Clock Failure Detection

 

I'm afraid I can't remember more detail than this, but pretty sure one (or a combination) of these resolved the issue.
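
In fuse terms, the first two items amount to changing the BODLEVEL and SUT/CKSEL fields. Below is a sketch of how new bytes could be derived from the values in the first post (low 0xCE, extended 0xFF). It assumes the ATmega328P-style field layout, and it assumes that BODLEVEL = 100 selects the roughly 4.3V threshold. Both assumptions, and the polarity of the Clock Failure Detection fuse bit (left untouched here), should be checked against the ATmega328PB datasheet. The helper names are hypothetical.

```c
#include <stdint.h>

/* Replace BODLEVEL[2:0] (bits 2..0 of the extended fuse, assumed
 * layout) and keep all other bits of the byte unchanged. */
uint8_t with_bodlevel(uint8_t efuse, uint8_t level)
{
    return (uint8_t)((efuse & 0xF8u) | (level & 0x07u));
}

/* Set SUT[1:0] = 11 and CKSEL0 = 1 in the low fuse. In the
 * ATmega328P crystal oscillator table this is the longest start-up
 * option (assumed to carry over to the ATmega328PB). */
uint8_t with_longest_crystal_startup(uint8_t lfuse)
{
    return (uint8_t)(lfuse | 0x31u);
}
```

with_longest_crystal_startup(0xCE) gives 0xFF, and with_bodlevel(0xFF, 0x04) gives 0xFC.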

#14

Regarding the settings you changed:

 

- BOD ON

- Clock startup -> Internal RC 6 CK / 14CK +65ms

- Clock Failure Detection: OFF -> This setting is OFF because I understand I don't need it, since I'm using the internal RC clock. If I understand correctly, when the external clock fails, this mechanism switches automatically to the internal RC clock. Is that right?

#15

Any ideas?

Thanks for your collaboration