SAMD21 (SAMD21E16B) sporadically locks up and does not wake up from STANDBY sleep mode


Hi all!

 

I configure the SAMD21 to wake up from STANDBY sleep mode on an EXTINT or an RTC alarm. We use a test jig to generate an EXTINT signal every 60 seconds. It is actually an old analogue clock with a magnet fixed to the second hand (see attached “clock test jig.jpg”). As the hand revolves through 360 degrees, it triggers the unit’s magnetic sensor, which is supposed to wake up the SAMD21. After 8 to 48 hours, at least one in four test units stops responding to the EXTINT and does not wake up. An RTC alarm is unable to wake up the SAMD21 either. After a week, most or all of the units have locked up.

 

The FW configuration is:

  1. 48 MHz DFLL feeds system clock via GCLK0
  2. 32 kHz external oscillator feeds the DFLL via GCLK1
  3. 32 kHz external oscillator also feeds EXTINT via GCLK1
  4. 32 kHz oscillator feeds RTC via GCLK2
  5. Flash wait states set to 3

 

I enabled the output of GCLK0, which shows when the ARM core is getting an active clock. During STANDBY sleep mode, there is no clock output (as has been confirmed by Atmel). I also toggle a debug pin in all of the interrupt handlers to show when the core wakes up. When the SAMD21 locks up, there is no GCLK0 clock output and the debug pin is not toggled. An SWD debugger (Segger J-Link) is also not able to wake up the SAMD21 and reports “Error: SAMD (connect): Could not power-up debug port” (see attached “J-Link error.png”).

 

I have tried everything that I can think of (HW & FW) and have still not identified the root cause or a workaround :( I have been working with Atmel support for a long time, but without any joy.

 

I don’t expect anyone to solve it for me, but I just want to know if anyone else has experienced the same issue? Have you managed to fix it?

 

Thanks in advance,

Pieter

http://piconomic.co.za

 

P.S. I hope this is the right forum to post in. If not, please point me in the right direction.

Last Edited: Wed. Jul 13, 2016 - 05:25 PM

P.S. I am starting to look at Errata 10416 with new eyes:

 

"If APB clock is stopped and GCLK clock is running, APB read access to read-synchronized registers
will freeze the system. The CPU and the DAP AHB-AP are stalled, as a consequence debug operation is
impossible. Errata reference: 10416"

 

Could a read from a read-synchronized register (e.g. getting the RTC time) just before going into STANDBY sleep mode (that disables the APB clocks) somehow trigger the lock-up?


Moved to SAM community

Jim


Hi pieterc,

 

Did you ever solve your issue? I have related problems (my SAMD21G18A locks up during sleep) and found something. Please have a look at https://community.atmel.com/forum...

 

regards

 

spachner


Hello pieterc,

 

did you solve your problem in the end?

My electronic projects blog >> www.limpkin.fr


Hello Jim,

Would it be possible to provide a link to where you moved this topic?

I am having a similar problem and would like to check on the follow up about this issue.

Regards


This topic was moved here, this is the SAM community.

/Lars


Lajon wrote:

This topic was moved here, this is the SAM community.

/Lars

Oh, that explains a lot :)

Thanks


Hello. Does anyone have a solution for this? I have a similar problem, posted here:

https://community.atmel.com/forum/samd20-not-recovering-deep-sleep


Hi @spachner, @limpkin, @0Pal and @ekalyvio,

 

I apologize for not responding earlier. I did not receive notifications about posts to this thread :(

 

The only "work around" I could find was implementing the watch dog early warning interrupt. It wakes up the SAMD21 every 8 seconds which then "kicks the dog". If the SAMD21 does not wake up, the watch dog will "bite" and reset it 8 seconds later. It works, but restoring the firmware state to the same condition before reset and keeping accurate RTC time is problematic (it loses up to a second during reset).
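
As a rough check of the timing described above, the early-warning and time-out periods follow directly from the WDT clock rate. This is only a sketch: it assumes the WDT is clocked at 1.024 kHz (an assumption, the generator frequency is not stated in this thread), and `wdt_period_ms` is a hypothetical helper, not an ASF function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of ASF): convert a WDT period given in
 * WDT clock cycles into milliseconds for a given WDT clock frequency. */
static uint32_t wdt_period_ms(uint32_t cycles, uint32_t wdt_clock_hz)
{
    assert(wdt_clock_hz != 0u);
    /* 64-bit intermediate so cycles * 1000 cannot overflow */
    return (uint32_t)(((uint64_t)cycles * 1000u) / wdt_clock_hz);
}
```

With the assumed 1024 Hz clock, an early warning after 8192 cycles comes out at 8000 ms and a time-out after 16384 cycles at 16000 ms, which matches a reset 8 seconds after a missed early warning.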

 

I used a test farm of 4 units and, by trial and error, got a version of demonstration firmware (using the watchdog early warning interrupt) that was able to regularly produce one failure within 4 days (usually within 24 hours). It seems to be a hardware-related race condition, as any change in the code makes the problem less likely (you have to wait much longer to see a failure). I sent boards and demo firmware to (then) Atmel, but they were unable to discover the root cause, as they were approaching the problem from a purely firmware perspective :(

 

Recently I have been tasked to look at this problem again, because some units have been resetting up to 12 times a day! I found the "ST STM32F40x and STM32F41x Errata sheet", section 2.1.3 and 2.1.5 which has provided me with a new idea to test. I am running a new test on a test farm of 7 units and should know by the end of the weekend if I am on to something.

 

I will keep you posted. If you would like to find out the early details, you may PM me. Please let me know if you are still experiencing the problem and if you have found something.

 

Regards,

Pieter

https://piconomix.com/contact/

 


Hi,

 

 

I've been watching this thread for a while with interest; I too have been having problems related to SAMD chips not waking from sleep. My problem is related to waking from I2C slave. As such, there's slightly more observability as to what the micro is doing, so I can see it start to wake up and then fault. A post including traces is here:

 

https://community.atmel.com/foru...

 

I'll see if the community has any thoughts, then chase Atmel/Microchip. I too suspect a race condition in the hardware when bringing up the APB / synchronising after sleep.

 


Hi everyone,

 

I hope it's not premature, but I think I found a solution. Please test and post feedback on this thread. Here it is:

 

RAMFUNC void main_standby_sleep(void);

 

void main_standby_sleep(void)
{
    // Disable interrupts
    __disable_irq();
    __DMB();
    // Select STANDBY sleep mode
    SCB->SCR |=  SCB_SCR_SLEEPDEEP_Msk;
    // Wait for Interrupt
    __DSB();
    __WFI();
    // Enable interrupts
    __DMB();
    __enable_irq();
}

 

I replaced the ASF version of the code with my own version and placed the function in SRAM (“RAMFUNC” attribute). If you look in the ASF version, you will see:

 

static inline enum status_code system_set_sleepmode(
    const enum system_sleepmode sleep_mode)
{

#if (SAMD20 || SAMD21 || SAMR21)

    /* Get MCU revision */
    uint32_t rev = DSU->DID.reg;

    rev &= DSU_DID_REVISION_Msk;
    rev = rev >> DSU_DID_REVISION_Pos;

#if (SAMD20)
    if (rev < _SYSTEM_MCU_REVISION_E) {
        /* Errata 13140: Make sure that the Flash does not power all the way down
         * when in sleep mode. */
        NVMCTRL->CTRLB.bit.SLEEPPRM = NVMCTRL_CTRLB_SLEEPPRM_DISABLED_Val;
    }
#endif

#if (SAMD21 || SAMR21)
    if (rev < _SYSTEM_MCU_REVISION_D) {
        /* Errata 13140: Make sure that the Flash does not power all the way down
         * when in sleep mode. */
        NVMCTRL->CTRLB.bit.SLEEPPRM = NVMCTRL_CTRLB_SLEEPPRM_DISABLED_Val;
    }
#endif

 

My hypothesis is that Atmel fixed the Flash power-down problem (errata 13140), but not perfectly. There are still (rare) timing circumstances where the Flash does not power up fast enough after the ARM core wakes up, so the core is fed garbage as the next instruction to execute after the "WFI" (Wait for Interrupt) assembly instruction, which causes a hard fault. Placing the code in SRAM avoids this problem, because the SRAM is always powered and ready to supply the next instruction.
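
For readers not using ASF, here is a minimal sketch of what a RAMFUNC-style attribute can look like with GCC, plus a sanity check that a function address really lies in SRAM. It assumes the linker script has a ".ramfunc" input section that the startup code copies to SRAM, and that SRAM starts at 0x20000000. `DEMO_RAMFUNC`, `demo_sleep_stub` and `demo_address_is_in_sram` are hypothetical names; on a non-ARM host the macro expands to nothing so the file still compiles.

```c
#include <stdint.h>

#if defined(__GNUC__) && defined(__arm__)
/* Put the function in the ".ramfunc" section; noinline keeps the compiler
 * from inlining it into a caller that lives in Flash. */
#define DEMO_RAMFUNC __attribute__((section(".ramfunc"), noinline))
#else
#define DEMO_RAMFUNC /* host build: no special placement */
#endif

/* Assumed SAMD21 memory map: SRAM starts at this address. */
#define DEMO_SRAM_BASE 0x20000000u

/* Stand-in for the sleep routine; on target the body would hold the
 * WFI sequence shown earlier. */
DEMO_RAMFUNC void demo_sleep_stub(void)
{
}

/* Returns 1 if addr lies inside [SRAM base, SRAM base + sram_size). */
static int demo_address_is_in_sram(uintptr_t addr, uint32_t sram_size)
{
    return (addr >= (uintptr_t)DEMO_SRAM_BASE) &&
           (addr < (uintptr_t)DEMO_SRAM_BASE + sram_size);
}
```

On target, something like `demo_address_is_in_sram((uintptr_t)&demo_sleep_stub, 8u * 1024u)` (8 KB is the SRAM size I would expect on a SAMD21E16B) or a look at the map file can confirm that the attribute actually took effect.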

 

Regards,

Pieter

https://piconomix.com/contact

 

Last Edited: Wed. Nov 7, 2018 - 01:10 PM

Well, it's not working for me.

I added the RAMFUNC in my sleep function which waits for an interrupt, but my software still freezes.

 

If I find anything, I'll write in here.

Last Edited: Thu. Nov 29, 2018 - 09:45 PM

tglaria wrote:

 

Well, it's not working for me.

I added the RAMFUNC in my sleep function which waits for an interrupt, but my software still freezes.

 

If I find anything, I'll write in here.

 

I found this generally fixed things for me (on the SAMD11). However, I had to make sure that a wakeup didn't occur too closely to going to sleep. I'm still debugging the exact details...


Hi @tglaria and 2darrenw,

 

My client reported that my "fix" was a great improvement, but the SAMD21 still locks up on occasion. Unfortunately I'm renovating and won't be able to perform any tests, but I want to share my ideas (right or wrong) in the hope that it will spark the right solution.

 

Everyone monitoring this thread, PLEASE respond to this thread indicating if you still have a problem or not. It could help to put pressure on Microchip to investigate this issue properly if they see the full extent. Please contact Microchip support too so that they realize that "Houston, we have a problem". At the moment it is being swept under the rug by the Microchip support team.

 

Ideas:

 

1. Set NVMCTRL->CTRLB.bit.SLEEPPRM = NVMCTRL_CTRLB_SLEEPPRM_WAKEUPINSTANT_Val;

See the CTRLB register, SLEEPPRM[1:0] bit field, page 395. The default value (0x0) wakes up the NVM block upon first access. 0x1 wakes up the NVM block when exiting sleep. My reasoning is that the sleep code is executing in SRAM, and waking up the NVM block immediately on exit from STANDBY sleep mode gives the NVM block more time to recover before the code jumps back to Flash.

 

2. Make sure all input pins are close to 0V or 3.3V by using pull-down or pull-up resistors

I have observed that sometimes my circuit board is drawing more current in STANDBY sleep mode (30 uA versus 20 uA). It could be that the SAMD21's internal regulator can not cope with the increased power consumption and that is why it fails to exit low-power mode cleanly. A floating input that is in the grey area (say 0.99V to 1.82V) will cause increased power consumption. Sometimes this is tricky to track down. For example, I have AT45D DataFlash connected via SPI and I put it into Ultra-Deep Power-Down before going into STANDBY sleep mode. The AT45D DataFlash is probably not driving the MISO pin and hence that pin is floating on the SAMD21. I should enable the pull-down resistor before going into STANDBY sleep mode to prevent it from floating.

 

3. Internal Voltage Regulator Low Power (LP) mode

SAMD21 datasheet "Chapter 16. PM - Power Manager; 16.1 Overview", page 140 states that:

"Before entering the STANDBY sleep mode the user must make sure that a significant amount of clocks and peripherals are disabled, so that the voltage regulator is not overloaded. This is because during STANDBY sleep mode the internal voltage regulator will be in low power mode."

Unfortunately the datasheet does not say what the impact of each clock and peripheral is, so I cannot judge whether the voltage regulator is being overloaded or not. It could be operating right on the edge and sometimes go over it. This paragraph needs clarification from Microchip. I thought that all of the peripheral clocks were stopped anyway when the core was put into STANDBY sleep mode (?). In my case the RTC is still running using an external crystal. Is this too much or not?
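
Idea 2 (parking a possibly floating input such as MISO on a pull-down before STANDBY) could be sketched roughly as follows. This is a host-compilable illustration, not target code: `struct demo_port_group` is a simplified stand-in for one SAMD21 PORT group (on target you would use the PORT registers from the device header), `demo_pin_enable_pulldown` is a hypothetical helper, and the pin number is whatever your board uses.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for one PORT group, holding only the registers
 * this sketch touches. */
struct demo_port_group {
    volatile uint32_t DIR;        /* 1 = output, 0 = input */
    volatile uint32_t OUT;        /* input with pull enabled: 0 = pull-down, 1 = pull-up */
    volatile uint8_t  PINCFG[32]; /* per-pin configuration */
};

#define DEMO_PINCFG_PULLEN (1u << 2) /* PULLEN is bit 2 of PINCFG */

/* Make the pin an input with the internal pull-down enabled, leaving
 * the other PINCFG bits of that pin untouched. */
static void demo_pin_enable_pulldown(struct demo_port_group *group, unsigned pin)
{
    assert(group != 0 && pin < 32u);
    uint32_t mask = (uint32_t)1u << pin;

    group->DIR &= ~mask; /* input */
    group->OUT &= ~mask; /* select pull-down rather than pull-up */
    group->PINCFG[pin] = (uint8_t)(group->PINCFG[pin] | DEMO_PINCFG_PULLEN);
}
```

The helper would be called just before entering STANDBY sleep mode; whether the pull has to be removed again after wake-up depends on the device driving the pin.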

 

These are my ideas. Good luck!

Pieter

 

 


Hi everyone,

 

It would also be useful to determine if the SAMD21 locks up while it is trying to execute from SRAM or FLASH. Here is a proposed debug trick: Mark debug counter variables as "NO_INIT" so that their values will not be reset to zero when the FW starts up. Increment one counter before exiting FLASH, one after entering SRAM, one after WFI (before exiting SRAM) and one after entering FLASH again. When the FW starts up and detects that a watchdog reset occurred, it will wait for you to connect a debug probe, so that you can inspect the debug counter values. The values will then give you a better indication of where it locked up.

 

Here is the (almost complete) pseudo code:

 

NO_INIT static uint32_t standby_sleep_enter_counter;
NO_INIT static uint32_t standby_sleep_exit_counter;

NO_INIT static uint32_t wfi_enter_counter;
NO_INIT static uint32_t wfi_exit_counter;

 

RAMFUNC void main_standby_sleep(void);

void main_standby_sleep(void)
{
    // Disable interrupts
    __disable_irq();
    __DMB();
    // Select STANDBY sleep mode
    SCB->SCR |=  SCB_SCR_SLEEPDEEP_Msk;
    // Increment WFI enter counter (executing from SRAM)
    wfi_enter_counter++;
    // Wait for Interrupt
    __DSB();
    __WFI();
    // Increment WFI exit counter (executing from SRAM)
    wfi_exit_counter++;

    // Enable interrupts
    __DMB();
    __enable_irq();
}

 

static void wdt_early_warning_handler(void)
{
}

 

static void main_watchdog_enable(void)
{
    struct wdt_conf config;

    wdt_get_config_defaults(&config);
    config.clock_source         = GCLK_GENERATOR_3;
    config.early_warning_period = WDT_PERIOD_8192CLK;
    config.timeout_period       = WDT_PERIOD_16384CLK;
    wdt_set_config(&config);

    wdt_register_callback(wdt_early_warning_handler, WDT_CALLBACK_EARLY_WARNING);
    wdt_enable_callback(WDT_CALLBACK_EARLY_WARNING);

    wdt_reset_count();
}

 

static void main_watchdog_disable(void)
{
    struct wdt_conf config;

    wdt_get_config_defaults(&config);
    config.enable = false;
    wdt_set_config(&config);
}

 

static void main_watchdog_restart(void)
{
    wdt_reset_count();
}

 

int main (void)
{
    // Watchdog reset?
    if(PM->RCAUSE.reg & PM_RCAUSE_WDT)
    {
        main_watchdog_disable();

        // Wait here for debug session to inspect debug counter values

        for(;;)
        {
            led_red_on();
            delay_ms(100);
            led_red_off();
            delay_ms(100);
        }
    }

    // Enable watchdog
    main_watchdog_enable();
    // Reset debug counters
    standby_sleep_enter_counter = 0x80000000;
    standby_sleep_exit_counter  = 0x80000000;
    wfi_enter_counter           = 0x80000000;
    wfi_exit_counter            = 0x80000000;

    while (true)
    {
        // Increment enter standby sleep counter (executing from FLASH)
        standby_sleep_enter_counter++;
        // Enter STANDBY Sleep
        main_standby_sleep();
        // Increment exit standby sleep counter (executing from FLASH)
        standby_sleep_exit_counter++;
        // Restart watchdog
        main_watchdog_restart();
    }
}
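
Once a debug probe is attached, the four counter values can be decoded mechanically. The following is a minimal sketch of that decoding, assuming the increment order used in the code above (enter in FLASH, before WFI in SRAM, after WFI in SRAM, exit in FLASH); the enum and function names are hypothetical.

```c
#include <stdint.h>

enum demo_hang_location {
    DEMO_HANG_OUTSIDE_SLEEP, /* all counters equal: the sleep path completed */
    DEMO_HANG_FLASH_TO_SRAM, /* left FLASH, never reached the SRAM increment */
    DEMO_HANG_IN_WFI,        /* reached WFI, never executed the code after it */
    DEMO_HANG_SRAM_TO_FLASH, /* passed WFI, never got back to the FLASH main
                              * loop (includes a pending ISR that hangs) */
    DEMO_HANG_INCONSISTENT   /* values do not fit the expected sequence */
};

/* Classify where execution stopped from the four NO_INIT counters. */
static enum demo_hang_location demo_decode_counters(uint32_t sleep_enter,
                                                    uint32_t wfi_enter,
                                                    uint32_t wfi_exit,
                                                    uint32_t sleep_exit)
{
    if (sleep_enter == wfi_enter && wfi_enter == wfi_exit && wfi_exit == sleep_exit) {
        return DEMO_HANG_OUTSIDE_SLEEP;
    }
    if (sleep_enter == wfi_enter + 1u && wfi_enter == wfi_exit && wfi_exit == sleep_exit) {
        return DEMO_HANG_FLASH_TO_SRAM;
    }
    if (sleep_enter == wfi_enter && wfi_enter == wfi_exit + 1u && wfi_exit == sleep_exit) {
        return DEMO_HANG_IN_WFI;
    }
    if (sleep_enter == wfi_enter && wfi_enter == wfi_exit && wfi_exit == sleep_exit + 1u) {
        return DEMO_HANG_SRAM_TO_FLASH;
    }
    return DEMO_HANG_INCONSISTENT;
}
```

For example, enter counters that are both one ahead of the exit counters point at the WFI itself, or at the very first instruction fetched after it.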

Last Edited: Wed. Dec 5, 2018 - 08:27 AM

I'm very keen for this issue to be resolved. It is a serious hardware bug somewhere that prevents wake-up from an RTC event, especially for the low-power project we are trying to implement.

I lost a day's work struggling to fix this, and none of the above solved it.

I am going to put the RTC aside until the Atmel team finds a robust solution for this problem, and try a timer instead to see whether that wakes up correctly.


Hi everyone!

 

Microchip support finally responded over the December holidays with a concrete hardware related explanation and fix. Here is their response verbatim (I hope that's OK):

 

We have identified the issue with WDT reset. It happens due to SysTick timer.

 

Issue is that WDT initiates the wake up, but then SysTick interrupt starts to get handled first before system is actually ready. Basically, what happens is that SysTick interrupt does not wait for the RAM to properly wake up from sleep.

 

So if you wake up from WDT, the system will wait for the RAM, but the core clock will actually be running, so SysTick interrupt may happen too. SysTick interrupt does not wait on the RAM, so the core attempts to run the SysTick handler and fails, since RAM is not ready. This causes a Hard Fault (in our testing SRAM is so slow to wake up even Hard Fault handler).

 

This mean that device wakes up, getting into the Hard Fault, stay there until WDT fully expires.

 

You can reproduce the issue quicker by running SysTick timer faster, and WDT wake ups also quicker.

 

The solution for the customer is to disable SysTick interrupt before going to sleep and enable it back after the sleep.

 

// Disable systick interrupt

SysTick->CTRL &= ~SysTick_CTRL_TICKINT_Msk;

// Deep sleep
sleepmgr_sleep(SLEEPMGR_STANDBY);

// Enable systick interrupt
SysTick->CTRL |= SysTick_CTRL_TICKINT_Msk;

 

I have been running a test for a few days and so far no lock ups. Please try the fix and respond to this thread to let us know if it works for you (or not).

 

Regards,

Pieter


Hi Pieter,

 

I'm new to this, so apologies.  I've just started using the processor on the MK1000WiFi board, and have this issue.  It sometimes does not come out of sleep.

 

Being new, though: where do I add this code, please? Is it in the RTCZero.cpp file, or are you not using a library and putting the processor to sleep yourself?

 

Cheers

 

Rob


Hi Rob,

 

I'm not familiar with the MK1000WiFi board or the "RTCZero.cpp" file. I use the ASF (Atmel Software Framework) library that comes standard with Atmel Studio projects. The "sleepmgr_sleep(SLEEPMGR_STANDBY);" call is to a function provided by the ASF, which puts the microcontroller into STANDBY sleep mode. Lines such as "SysTick->CTRL &= ~SysTick_CTRL_TICKINT_Msk;" write directly to the SysTick peripheral's memory-mapped register. You can consult the datasheet to understand how it works.

 

Best of luck!

Pieter


So, apparently, this fixed it for me.

Even though I didn't have the WDT running and wasn't using the watchdog interrupt to wake up the device, on my last test, the device hasn't hung up yet.

I'll keep it running, obviously, to see if this fixed it. I'll let you know.

 

RobFurlong123 wrote:

Hi Pieter,

 

I'm new to this, so apologies.  I've just started using the processor on the MK1000WiFi board, and have this issue.  It sometimes does not come out of sleep.

 

Being new, though: where do I add this code, please? Is it in the RTCZero.cpp file, or are you not using a library and putting the processor to sleep yourself?

 

Cheers

 

Rob

I am using Arduino on another board with the same microcontroller (SAMD21) and I just put those lines around my sleep function.

Just find where you put your device to sleep, put the line

SysTick->CTRL &= ~SysTick_CTRL_TICKINT_Msk;

right before, and put the line

SysTick->CTRL |= SysTick_CTRL_TICKINT_Msk;

right afterwards.
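
If you prefer to keep the two lines in one place, they can be wrapped in a small helper. This is only a sketch under stated assumptions: the helper and the `demo_` names are hypothetical, the register is passed in as a pointer so the code can be exercised off-target, and it restores TICKINT only if it was set beforehand (a design choice of this sketch, so that it does not switch the interrupt on in firmware that never used it).

```c
#include <assert.h>
#include <stdint.h>

/* TICKINT is bit 1 of the SysTick control register
 * (SysTick_CTRL_TICKINT_Msk in CMSIS). */
#define DEMO_TICKINT_MSK ((uint32_t)1u << 1)

typedef void (*demo_sleep_fn)(void);

/* Mask the SysTick interrupt, run the sleep routine, then restore the
 * TICKINT bit to whatever it was before. */
static void demo_sleep_with_tickint_masked(volatile uint32_t *systick_ctrl,
                                           demo_sleep_fn enter_sleep)
{
    assert(systick_ctrl != 0 && enter_sleep != 0);
    uint32_t tickint_was_set = *systick_ctrl & DEMO_TICKINT_MSK;

    *systick_ctrl &= ~DEMO_TICKINT_MSK; /* disable SysTick interrupt */
    enter_sleep();                      /* e.g. your existing sleep call */
    *systick_ctrl |= tickint_was_set;   /* re-enable only if it was on */
}

/* Host-side stand-ins so the helper can be tried without hardware. */
static volatile uint32_t demo_ctrl;
static uint32_t demo_ctrl_during_sleep;

static void demo_enter_sleep(void)
{
    demo_ctrl_during_sleep = demo_ctrl; /* record what "sleep" would see */
}
```

On target the call would look something like `demo_sleep_with_tickint_masked(&SysTick->CTRL, my_sleep_function);`, where `my_sleep_function` stands for whatever routine currently puts the device to sleep.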

 

Last Edited: Wed. Jan 23, 2019 - 04:01 PM

TL;DR: Solved by disabling the SysTick interrupt.

 

Previously the MCU hung roughly every 10 minutes; now the microcontroller has been running for more than 3 days and has not hung once.

I call this a success.

 

Thanks @pieterc