AVR programming with !RESET watchdog?


I've got a project that has a Mega328PB with an external clock, and the AVR watchdog isn't quite enough to ensure reliability: during the first few minutes after startup there's an occasional glitch that's enough to lock things up. No, there's really nothing I can do about this. Previous versions of the project used an ATtiny841, which can switch clock sources programmatically, but the 328PB can't. So I've added an external watchdog circuit. Everything is fine once the part is programmed, but the bootstrap is a problem. Before programming has occurred, there's effectively a 500 Hz (or so) square wave on !RESET, and avrdude says that the chip isn't responding when I attempt to program it.

 

Now, I'd expect this to be a non-issue. Programming starts by pulling !RESET low (the watchdog output is an open-drain, so something else pulling !RESET low is no problem). Why should it matter if !RESET is *already* low?

 

I've added a solder jumper to allow the watchdog to be inhibited for programming, but this seems like it oughtn't to be necessary.


In the M328PB, the RESET pin is used as a control line for ISP programming and it is used as DebugWire I/O for debugging. An external WatchDog is the last thing you want connected to that pin. I really question the need for that external WatchDog.

 

Jim

Jim Wagner Oregon Research Electronics, Consulting Div. Tangent, OR, USA http://www.orelectronics.net


during the first few minutes after startup there's an occasional glitch that's enough to lock things up

Why not get rid of this glitch..doesn't sound like a healthy design.   

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!


avrcandies wrote:

during the first few minutes after startup there's an occasional glitch that's enough to lock things up

Why not get rid of this glitch..doesn't sound like a healthy design.   

 

Sigh.

 

Because I can't redesign the FEI 5680A rubidium oscillator module, that's why I can't get rid of the glitch.

 

Gadzooks, I hate it when asking a simple question turns into a dissertation defense.

 

 


ka7ehk wrote:

In the M328PB, the RESET pin is used as a control line for ISP programming and it is used as DebugWire I/O for debugging.

 

That's something I hadn't considered. I could try turning off the debug wire fuse and see if programming works, but it would just be an academic point, as it still has to work with a virgin chip.

 

Quote:

An external WatchDog is the last thing you want connected to that pin. I really question the need for that external WatchDog.

 

Welp, if the internal watchdog would work as advertised, it wouldn't come to that.


How does the internal watchdog not work? Are you aware that you need to write some carefully timed bytes? From the spec sheet (emphasis added):

To further ensure program security, alterations to the Watchdog set-up must follow timed sequences. The sequence for clearing WDE and changing time-out configuration is as follows:

  1. In the same operation, write a logic one to the Watchdog change enable bit (WDCE) and WDE. A logic one must be written to WDE regardless of the previous value of the WDE bit.

  2. Within the next four clock cycles, write the WDE and Watchdog prescaler bits (WDP) as desired, but with the WDCE bit cleared. This must be done in one operation. 

I've had no problems when I follow these rules.
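For anyone following along, here is a minimal sketch of those two steps. It is written to build on a PC, so `fake_wdtcsr` is a stand-in for the real WDTCSR register and the `X_`-prefixed names are my own; the bit positions are the ones listed for WDTCSR in the ATmega328-family datasheet. On real hardware you would normally just call `wdt_enable()` from avr-libc's `<avr/wdt.h>`, which takes care of the timing.

```c
#include <assert.h>
#include <stdint.h>

/* WDTCSR bit positions per the ATmega328-family datasheet.
   Prefixed with X_ so they cannot clash with <avr/io.h>. */
enum { X_WDP0 = 0, X_WDP1 = 1, X_WDP2 = 2, X_WDE = 3, X_WDCE = 4, X_WDP3 = 5 };

/* Stand-in for the real WDTCSR register so the sketch builds on a PC. */
static volatile uint8_t fake_wdtcsr;

/* Step 1 value: WDCE and WDE written as ones in a single operation. */
static uint8_t wdt_unlock_value(void) {
    return (uint8_t)((1u << X_WDCE) | (1u << X_WDE));
}

/* Step 2 value: WDE plus the 4-bit prescaler, with WDCE cleared. */
static uint8_t wdt_config_value(uint8_t prescaler) {
    assert(prescaler <= 9);  /* 0..9 are the defined WDP3:0 settings */
    uint8_t v = (uint8_t)(1u << X_WDE);
    v |= (uint8_t)(prescaler & 0x07u);      /* WDP2:0 live in bits 2:0 */
    if (prescaler & 0x08u) {
        v |= (uint8_t)(1u << X_WDP3);       /* WDP3 lives in bit 5 */
    }
    return v;
}

static void wdt_enable_sketch(uint8_t prescaler) {
    /* Compute the final value first so step 2 is a single store. */
    uint8_t cfg = wdt_config_value(prescaler);
    fake_wdtcsr = wdt_unlock_value();  /* step 1 */
    fake_wdtcsr = cfg;                 /* step 2: within four cycles on real hardware */
}
```

On the real part, interrupts should be off around the two writes so nothing can land between them and blow the four-cycle window.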

 

Jim 

Jim Wagner Oregon Research Electronics, Consulting Div. Tangent, OR, USA http://www.orelectronics.net

Last Edited: Sun. Feb 18, 2018 - 06:23 AM

The watchdog is mainly intended for unexpected failures as a form of ultimate backup; expected or known problems, such as glitch sensitivity, should be dealt with in the system design.   Stating it's impossible to get rid of the glitch going to the processor is suspicious at best.  This does not mean you need to redesign your oscillator, it perhaps needs some additional circuitry to prevent any glitches from reaching the processor.   This might include a secondary clock circuit taking over, if the primary has a momentary lapse, failure, or startup issues. The secondary could start things up & switch to the primary after a time delay. If both of these fail, then the watchdog can kick in and take the system to a safe, recovery, or shutdown state. 

 

Techniques to make clock switching glitch free:

https://www.eetimes.com/document.asp?doc_id=1202359

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!


Maybe the dongle is smart enough and sees that the reset pin is not stable. And says "no" to programming from the start, without even trying...

Just guessing. What dongle is it?

 

Weirder still is the fact that the internal watchdog is useless. That's quite bad news, in fact, since it runs from a separate 128 kHz internal oscillator which is supposed to always run.

 

Another option to use on 328PB is the CFD (Clock Failure Detection). Works on external clocks too, switches to 1MHz internal (8MHz/8) and sets a flag. You can periodically poll the flag and generate a hard reset (through the watchdog) when it's set.
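A rough sketch of that polling idea, written so it builds on a PC rather than on the AVR. The flag bit position and all the names are my assumptions for illustration; on the real part you would read the clock-failure status register described in the 328PB datasheet instead of a plain byte, and check the bit position there.

```c
#include <stdint.h>

/* Assumed position of the clock-failure flag in the status byte.
   Hypothetical: verify against the 328PB datasheet before use. */
#define ASSUMED_CFD_FLAG_BIT 1

typedef enum {
    ACTION_KICK_WATCHDOG,   /* clock looks fine, keep servicing the watchdog */
    ACTION_WAIT_FOR_RESET   /* failure flagged: stop kicking, let the watchdog bite */
} action_t;

/* Decide what the main loop should do on this pass, given the status byte. */
static action_t supervise(uint8_t status_reg) {
    if (status_reg & (1u << ASSUMED_CFD_FLAG_BIT)) {
        return ACTION_WAIT_FOR_RESET;
    }
    return ACTION_KICK_WATCHDOG;
}
```

The main loop would call `supervise()` once per pass and only execute `wdt_reset()` on `ACTION_KICK_WATCHDOG`; on the other result it simply spins until the internal watchdog times out and resets the part.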

 

 

 


I suspect the problem is that the RST line is toggling without your programmer knowing. IIRC the RST pin does not go low and stay low for the whole of the programming cycle but toggles at least a couple of times.

 

So your problem is that your chip is externally clocked by your oscillator module and initial instability in that clock signal causes lock-ups?

 

Have you determined whether these lock-ups are software ones or hardware?

 

I'm guessing you clock the chip externally for time-keeping purposes?

 

How about feeding your oscillator signal to a PLL with a long time-constant and fairly narrow lock-in range. When the signal is stable your chip will still accurately track your reference; when the reference glitches the PLL will smooth things out.

#1 This forum helps those that help themselves

#2 All grounds are not created equal

#3 How have you proved that your chip is running at xxMHz?

#4 "If you think you need floating point to solve the problem then you don't understand the problem. If you really do need floating point then you have a problem you do not understand." - Heater's ex-boss


nsayer wrote:
Gadzooks, I hate it when asking a simple question turns into a dissertation defense.

I want to hear more about the deficiencies of the AVR watchdog.  That dissertation defense will be useful to all of us with production AVR8 designs that depend on the watchdog. 

 

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


ka7ehk wrote:

How does the internal watchdog not work? Are you aware that you need to write some carefully timed bytes? From the spec sheet (emphasis added):

To further ensure program security, alterations to the Watchdog set-up must follow timed sequences. The sequence for clearing WDE and changing time-out configuration is as follows:

  1. In the same operation, write a logic one to the Watchdog change enable bit (WDCE) and WDE. A logic one must be written to WDE regardless of the previous value of the WDE bit.

  2. Within the next four clock cycles, write the WDE and Watchdog prescaler bits (WDP) as desired, but with the WDCE bit cleared. This must be done in one operation. 

I've had no problems when I follow these rules.

 

Jim 

 

I'm really, really, REALLY sure I'm doing it right. I know that because if I stick an endless loop in the code without wdt_reset(), it does, in fact, reset.

 

It does not, however, recover properly when the very occasional clock glitch occurs from the oscillator before it has indicated that it is warmed up. And, unfortunately, I can't programmatically change from an internal clock to the external clock on the M328PB. Back when I was using the Tiny841, that was the solution - switch to the external clock only once the oscillator has told us that its warm-up is complete. But I ran out of flash in the 841, so I went to the 328PB, where - again - I can't programmatically change clock source on the fly.

 

Again - dissertation defense.

 

 


Brian Fairchild wrote:

I suspect the problem is that the RST line is toggling without your programmer knowing. IIRC the RST pin does not go low and stay low for the whole of the programming cycle but toggles at least a couple of times.

 

Nope. The output of the hardware watchdog is an open-drain. There's a 10k pull-up resistor, but the programmer can easily overcome that.

 

Quote:

 

So your problem is that your chip is externally clocked by your oscillator module and initial instability in that clock signal causes lock-ups?

 

Have you determined whether these lock-ups are software ones or hardware?

 

 

Since the internal watchdog is ineffective, I'd have to conclude they're hardware.

 

Quote:

 

I'm guessing you clock the chip externally for time-keeping purposes?

 

 

Well, frequency-keeping, but yeah.

 

Quote:

 

How about feeding your oscillator signal to a PLL with a long time-constant and fairly narrow lock-in range. When the signal is stable your chip will still accurately track your reference; when the reference glitches the PLL will smooth things out.

 

Because the entire purpose of this board is to be the PLL that controls the frequency of the oscillator. It's a GPS discipline board for the 5680/5660.

 


It does not, however, recover properly when the very occasional clock glitch occurs from the oscillator before it has indicated that it is warmed up.

Why are you even running the AVR at all before the clock source is known to be stable?  I've read some of the specs of that rubidium oscillator.  There is a 'warmed up' output.  Use that to keep the AVR in reset until the osc is stable.

 

during the first few minutes after startup there's an occasional glitch that's enough to lock things up

 

Define 'lock things up'.  Surely, if it's a matter of sending the PC off into la-la-land, then the internal watchdog is sufficient to provoke a device reset.  If it means something else, then you should tell us.  The quality and relevance of the answers you receive may improve.

 

Again - dissertation defense.

You are asking for free help from strangers.  If you're unsatisfied with the answers you're getting, you can just ask for your money back.

 

EDIT:  Typos

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 

Last Edited: Sun. Feb 18, 2018 - 09:22 PM

Since the internal watchdog is ineffective, I'd have to conclude they're hardware.

I would conclude that you're not correctly using the watchdog.  'Correct' here goes beyond the basic configuration of the appropriate registers to get the WDT going.  It speaks to properly assessing the health of your app by collecting statistics from the various mission critical modules of which your app is composed, and making a determination that everything is OK.  When it is, and >>only<< when it is, you reset the watchdog.  Simply having a wdr opcode at the bottom of your super loop does next to nothing for you.  If the glitch you're suffering from results in a corruption of some SRAM variable, or a misconfiguration of an I/O register, you will never catch it.  Robust software must rely on more than a single unqualified wdr.
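To make that concrete, here is a minimal sketch of a 'qualified' watchdog kick, assuming three made-up modules. The module names, the check-in scheme and the `kicks` counter are all hypothetical; in real firmware the counter increment would be the `wdt_reset()` call (and/or the toggle of an external watchdog's input).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mission-critical modules that must each report in. */
enum { TASK_GPS = 0, TASK_PLL = 1, TASK_UI = 2, TASK_COUNT = 3 };

#define ALL_TASKS_MASK ((uint8_t)((1u << TASK_COUNT) - 1u))

static uint8_t checkins;   /* one bit per module that has reported healthy */
static unsigned kicks;     /* stand-in for the actual wdr instruction */

/* Each module calls this only after it has verified its own state. */
static void task_checkin(int task) {
    assert(task >= 0 && task < TASK_COUNT);
    checkins |= (uint8_t)(1u << task);
}

/* Called from the main loop. Kicks the watchdog only when every module
   has checked in since the previous kick, then clears the slate. */
static bool watchdog_service(void) {
    if (checkins != ALL_TASKS_MASK) {
        return false;      /* someone is silent: let the watchdog time out */
    }
    checkins = 0;
    kicks++;
    return true;
}
```

If any one module stops checking in, `watchdog_service()` stops kicking and the reset follows after the watchdog period, which a bare wdr at the bottom of the loop would never achieve.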

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 


nsayer wrote:
It [mega328PB] does not, however, recover properly when the very occasional clock glitch occurs from the oscillator before it has indicated that it is warmed up.
fyi, XMEGA AVR are more tolerant of that than megaAVR.

http://ww1.microchip.com/downloads/en/DeviceDoc/Microchip%20AVR%20microcontroller%20ATmega328PB%20Data%20Sheet%2040001906B.pdf (3.8MB)

(bottom of page 409)

Change in period from one clock cycle to the next

2% max

via http://www.microchip.com/wwwproducts/en/atmega328pb

http://ww1.microchip.com/downloads/en/DeviceDoc/Atmel-8153-8-and-16-bit-AVR-Microcontroller-XMEGA-E-ATxmega8E5-ATxmega16E5-ATxmega32E5_Datasheet.pdf (2MB)

(bottom of page 85)

Change in period from one clock cycle to the next

10% max

via http://www.microchip.com/wwwproducts/en/atxmega32e5
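To show what those two limits mean in practice, here is a small sketch that checks a list of measured clock periods against a maximum cycle-to-cycle change. The function names are mine and any periods you feed it are your own measurements; only the 2% and 10% figures come from the datasheets above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if the period changed by no more than max_pct percent
   relative to the previous cycle. */
static bool period_change_ok(double prev, double cur, double max_pct) {
    assert(prev > 0.0 && cur > 0.0);
    double diff = cur - prev;
    if (diff < 0.0) {
        diff = -diff;
    }
    return diff * 100.0 <= max_pct * prev;
}

/* Index of the first cycle whose period differs too much from the one
   before it, or -1 if the whole capture is within the limit. */
static long first_violation(const double *periods, size_t n, double max_pct) {
    for (size_t i = 1; i < n; i++) {
        if (!period_change_ok(periods[i - 1], periods[i], max_pct)) {
            return (long)i;
        }
    }
    return -1;
}
```

A capture that jumps from a 100 ns period to a 106 ns period, for example, would fail the 2% megaAVR limit but pass the 10% XMEGA one.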

 

"Dare to be naïve." - Buckminster Fuller


Instead of a watchdog, why not use a really long one-shot (or better yet, a tinyxx processor) to keep the '328 in reset for a few minutes after power-up, until the oscillator has settled? Also, many rubidium oscillators have a warmed-up/frequency-locked logic output. Couldn't that be used to let the '328 exit the reset state? Virtually no parts needed. Once the locked signal is given, release reset; I'd hope there are no more clock glitches and your '328 is ready to roll. Of course the '328 would do nothing until that time, but right now you are resetting/watchdogging during that initial time anyhow.

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!


nsayer wrote:

Again - dissertation defense.

 

Your attitude sucks. Good luck with your project.

#1 This forum helps those that help themselves

#2 All grounds are not created equal

#3 How have you proved that your chip is running at xxMHz?

#4 "If you think you need floating point to solve the problem then you don't understand the problem. If you really do need floating point then you have a problem you do not understand." - Heater's ex-boss


joeymorin wrote:

Since the internal watchdog is ineffective, I'd have to conclude they're hardware.

I would conclude that you're not correctly using the watchdog.  'Correct' here goes beyond the basic configuration of the appropriate registers to get the WDT going.  It speaks to properly assessing the health of your app by collecting statistics from the various mission critical modules of which your app is composed, and making a determination that everything is OK.  When it is, and >>only<< when it is, you reset the watchdog.  Simply having a wdr opcode at the bottom of your super loop does next to nothing for you.  If the glitch you're suffering from results in a corruption of some SRAM variable, or a misconfiguration of an I/O register, you will never catch it.  Robust software must rely on more than a single unqualified wdr.

 

Since you seem to know everything about how to use a watchdog properly, explain why the external watchdog works in the same application in which the internal watchdog doesn't. In your words, the single unqualified wdr doesn't work, but a single unqualified port pin toggle does.

 


How can I, since the OP has given zero details about how he has implemented his hardware watchdog, nor any explanation of how he tried the AVR's own watchdog yet failed to get it to work 'as advertised'.  I'm with Brian.  Good luck gents.

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 


joeymorin wrote:

How can I, since the OP has given zero details about how he has implemented his hardware watchdog, nor any explanation of how he tried the AVR's own watchdog yet failed to get it to work 'as advertised'.  I'm with Brian.  Good luck gents.

 

The OP's attitude is a bit disappointing, but the issue is more important. That's why I'm here. The OP said clearly that he's interested only in the programming issue. Some others here identified a more important problem, the internal watchdog issue. We are all here to learn from each other, not to give lessons.

 

I, for example, have a theory about his current problem: maybe his dongle is too smart. If it is an Atmel-ICE, I would try a USBasp dongle (if practical) or something of that kind.

 


My belief is that if there is a known lockup issue, the watchdog should not be considered as a "fix".  The fix should be the fix.   The watchdog should be used to gracefully protect/recover from unexpected issues.   Why bother fixing the steering algorithm & avoiding that tree?--we'll let our seatbelts protect us.

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!


joeymorin wrote:

How can I, since the OP has given zero details about how he has implemented his hardware watchdog, nor any explanation of how he tried the AVR's own watchdog yet failed to get it to work 'as advertised'.  I'm with Brian.  Good luck gents.

 

Ok, then.

 

The hardware watchdog is a STWD100NPWY3F. !WDO is connected to !RESET and there is a 10kΩ pull-up resistor to Vcc. WDI is connected directly to PD7, which the firmware will toggle at the same point where it calls wdt_reset(). !EN was connected directly to Vcc, but when that's true, programming (with a USBTiny) fails as I've described.

 

With an unprogrammed chip, if you look at !RESET on a scope, you see a ~500 Hz square wave - as you'd expect. But if you pull !RESET low with the programmer, it stays low - as you'd expect given that the watchdog output is an open drain and there's just a 10kΩ pull-up.

 

The open question that remains is: what is it about programming an ATmega328PB that is seemingly sensitive to the fact that !RESET may be pulled low by something else before the programmer begins to assert !RESET at the start of programming?

 

That's the ONE question I came here to ask. And rather than answer it, a lot of responses have gone into questions about other details that aren't particularly relevant to the ONE question that needs answering. If that has given me what you call a "bad attitude," then sorry, not sorry.

 


That tells us nothing about how you tried and failed to use the internal watchdog, so I still can't answer rammon's question.

 

    !WDO is connected to !RESET and there is a 10kΩ pull-up resistor to Vcc

You probably don't need the pull-up, as /RESET has an internal pull-up already.

 

 

With an unprogrammed chip, if you look at !RESET on a scope

I'd be far more interested in seeing what MOSI/MISO/SCK are doing.  You may be able to determine whether or not your programmer is failing to send the programming enable command, or if the target is failing to respond.

 

This:

 

 

... >>may<< be the source of your trouble.  If the target fails to respond at first and your programmer tries to pulse /RESET high, your watchdog may be holding it low at that moment and the pulse will never be seen by the target.  Only a full trace of /RESET, MOSI, MISO and SCK will give you the picture you need to put together what's happening.

 

How does your app determine that a 'glitch' has occurred, and that a device reset is required?

 

Have you tried any other programmers, or just your USBTiny?

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 

Last Edited: Tue. Feb 20, 2018 - 08:10 PM

Perhaps the programmer is monitoring the reset pin & looking for a falling edge (verifying a high-to-low transition)...would need a programmer schematic.   Can the programmer drive (or try to drive) it high?  Maybe it's arm-wrestling with the watchdog..  Maybe add 10k  series to the watchdog output & let the programmer win.  Maybe the chip can't "enter"  programming if it is already in reset--would have to review the chip programming protocol state machine.

 

1. Power-up sequence: Apply power between VCC and GND while RESET and SCK are set to “0”. In some systems, the programmer can not guarantee that SCK is held low during power-up. In this case, RESET must be given a positive pulse of at least two CPU clock cycles duration after SCK has been set to “0”.

2. Wait for at least 20ms and enable serial programming by sending the Programming Enable serial instruction to pin MOSI.

3. The serial programming instructions will not work if the communication is out of synchronization. When in sync. the second byte (0x53), will echo back when issuing the third byte of the Programming Enable instruction. Whether the echo is correct or not, all four bytes of the instruction must be transmitted. If the 0x53 did not echo back, give RESET a positive pulse and issue a new Programming Enable command.

4. The Flash is programmed one page at a time. The memory page is loaded one byte at a time by
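Steps 2 and 3 can be sketched as a retry loop. This is a toy model that runs on a PC, not any real programmer's code: the `target_t` struct and its behaviour are hypothetical, and only the 0xAC 0x53 0x00 0x00 instruction and the 0x53 echo come from the datasheet text. The point is that if something else holds /RESET low, the resync pulse never reaches the target and every retry fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of an ISP target. Everything here is hypothetical. */
typedef struct {
    bool in_sync;         /* target echoes bytes only when in sync */
    bool reset_held_low;  /* e.g. an external watchdog pulling /RESET low */
    uint8_t prev_byte;    /* last byte shifted in, echoed on the next transfer */
} target_t;

/* Programmer releases /RESET briefly. The target only sees a positive
   pulse if nothing else is holding the line low at that moment. */
static void pulse_reset_high(target_t *t) {
    if (!t->reset_held_low) {
        t->in_sync = true;
    }
    t->prev_byte = 0;
}

/* One SPI byte exchange: returns what the target shifts out. */
static uint8_t spi_transfer(target_t *t, uint8_t out) {
    uint8_t in = t->in_sync ? t->prev_byte : 0xFF;
    t->prev_byte = out;
    return in;
}

/* Send Programming Enable (0xAC 0x53 0x00 0x00) and look for the 0x53
   echo while the third byte goes out. On failure, pulse /RESET and retry. */
static bool enter_programming_mode(target_t *t, int max_attempts) {
    assert(max_attempts > 0);
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        spi_transfer(t, 0xAC);
        spi_transfer(t, 0x53);
        uint8_t echo = spi_transfer(t, 0x00);
        spi_transfer(t, 0x00);  /* all four bytes are sent regardless */
        if (echo == 0x53) {
            return true;
        }
        pulse_reset_high(t);
    }
    return false;
}
```

In this model an out-of-sync target with a free /RESET line recovers on the second attempt, while the same target with `reset_held_low` set never does, no matter how many attempts are allowed.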

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

Last Edited: Tue. Feb 20, 2018 - 08:05 PM

nsayer wrote:

The open question that remains is: what is it about programming an ATmega328PB that is seemingly sensitive to the fact that !RESET may be pulled low by something else before the programmer begins to assert !RESET at the start of programming?

I'm pretty sure this is not a problem. It doesn't matter who pulls the reset line low first. The problem may be (and probably is) the fact that the reset line may be released programmatically by the programmer (taken high) at some points, in some cases.

A good way to get rid of your solder jumper is to address the root of the problem. Instead of PD7, just use SCK from the programming header to retrigger the external watchdog. This way, the programmer will itself retrigger the watchdog. And don't worry about pin contention: when the reset line is low, the SCK pin is high-Z. I assume you don't use SCK/MISO/MOSI as SPI in your application here.

 

My theory about an intelligent dongle has fallen. But I may try the reverse theory also... Try an Atmel-ICE if you have one.