FLASH data shelf life and locked bootloader ideas


Below is my assessment and solution to the time limit on FLASH memory data storage. This becomes a problem when used with secure bootloaders, where the bootloader FLASH can never be updated by the customer. I would like to know if anyone else is willing to share any thoughts on a better way to do this.

I am implementing an AES bootloader which will have the bootloader FLASH locked. The On-Chip Debug, JTAG and Serial program download fuse bits will all be set to disabled. In addition, the Memory Lock Bits (LB2:1) and Bootloader Memory Lock Bits (BLB12:11) will be set to disable bootloader programming access. This will prevent further programming and verification in Parallel and Serial Programming mode and prevent the application section from using LPM or SPM on the bootloader FLASH.

I almost forgot the AVR FLASH has a time limit on how long it will retain its data. Most FLASH devices will guarantee 10 years, but I cannot find this specification for the AT90CAN128 FLASH. Since the bootloader has the AES encryption keys, I cannot send a copy of the bootloader out to the customer for reprogramming, even if they have the AVR parallel programmer hardware.

This configuration of bootloader security combined with the FLASH data shelf life appears to make the AVR chip a kind of ticking self-destruct time bomb, even if it does take more than 10 years to destruct.

The solution I came up with is a program that lives in the bootloader space and copies the bootloader FLASH back onto itself. It is a maintenance program that would probably only need to be run once every 9 years or so. The maintenance program can be selected and run when the external ISP computer is connected through the bootloader code. The maintenance program does have one problem. If anything serious goes wrong, like a power failure during the FLASH maintenance rewrite, it will corrupt/disable the AVR chip and require repair (factory full chip erase and parallel reloading). So, the maintenance program should only be run as often as needed to help mitigate this risk.
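For what it is worth, the refresh loop described above can be sketched on a host PC. This is a minimal model, not the actual bootloader: `sim_flash`, `sim_page_erase`, `sim_page_write`, `refresh_page` and `refresh_all` are made-up names, and the 256-byte page size and 4-page section are assumptions. On a real AVR the erase and write steps would go through the SPM instruction (avr-libc wraps it as `boot_page_erase`, `boot_page_fill` and `boot_page_write` in `<avr/boot.h>`), with the code arranged so it never erases the page it is executing from.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE  256u  /* bytes per flash page (assumed; check the datasheet) */
#define PAGE_COUNT 4u    /* tiny simulated boot section (assumed) */

/* Simulated flash array standing in for the real boot section. */
static uint8_t sim_flash[PAGE_COUNT][PAGE_SIZE];

/* Simulated page erase: every bit in the page returns to 1. */
static void sim_page_erase(unsigned page)
{
    memset(sim_flash[page], 0xFF, PAGE_SIZE);
}

/* Simulated page write: programming can only clear bits (1 -> 0). */
static void sim_page_write(unsigned page, const uint8_t *buf)
{
    for (unsigned i = 0; i < PAGE_SIZE; i++)
        sim_flash[page][i] &= buf[i];
}

/* Refresh one page in place. Returns 0 on success, -1 if the verify fails. */
int refresh_page(unsigned page)
{
    uint8_t buf[PAGE_SIZE];

    assert(page < PAGE_COUNT);
    memcpy(buf, sim_flash[page], PAGE_SIZE);  /* 1. copy the page to RAM */
    sim_page_erase(page);                     /* 2. erase the page       */
    sim_page_write(page, buf);                /* 3. program it again     */
    /* 4. verify against the RAM copy */
    return memcmp(buf, sim_flash[page], PAGE_SIZE) == 0 ? 0 : -1;
}

/* Refresh every page in the section; stops at the first failure. */
int refresh_all(void)
{
    for (unsigned p = 0; p < PAGE_COUNT; p++)
        if (refresh_page(p) != 0)
            return -1;
    return 0;
}
```

The verify step only reports that a rewrite went wrong. It cannot undo a power failure between the erase and the write, which is the risk mentioned above.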

Since the maintenance program is self-contained in the bootloader, running it will not expose the AES keys or other bootloader programming to the outside world. Its job is simply to rewrite the bootloader FLASH program (including itself), so that the 10 or whatever year data shelf limit gets a fresh start with another new 10 or whatever years.

In order to ensure the ability to recover from a failed attempt to load new application FLASH firmware, my AVR reset vector fuse is set for the bootloader. The bootloader software decides if the application or bootloader is going to be run. There is a physical emergency recovery jumper to force the bootloader to run in case the application program is toast. When the application program is working, it can also start the bootloader if desired.
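The start-up decision described above could look roughly like the following. This is only a sketch under assumptions: the function and enum names are invented, and the `app_valid` input (for example a checksum over the application section) is a hypothetical addition, since the post does not say how the bootloader judges the application.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { RUN_APPLICATION, RUN_BOOTLOADER } boot_target_t;

/*
 * Decide what to run after reset (the reset vector points at the bootloader).
 *   jumper_fitted    - emergency recovery jumper is installed
 *   loader_requested - the application asked for the bootloader
 *   app_valid        - hypothetical validity check on the application image
 */
boot_target_t choose_boot_target(bool jumper_fitted,
                                 bool loader_requested,
                                 bool app_valid)
{
    if (jumper_fitted)
        return RUN_BOOTLOADER;   /* the jumper always wins */
    if (loader_requested)
        return RUN_BOOTLOADER;   /* a working application handed over control */
    if (!app_valid)
        return RUN_BOOTLOADER;   /* do not jump into a broken application */
    return RUN_APPLICATION;
}

/* Usage check covering the cases described in the post. */
void boot_target_self_check(void)
{
    assert(choose_boot_target(true,  false, true)  == RUN_BOOTLOADER);
    assert(choose_boot_target(false, true,  true)  == RUN_BOOTLOADER);
    assert(choose_boot_target(false, false, false) == RUN_BOOTLOADER);
    assert(choose_boot_target(false, false, true)  == RUN_APPLICATION);
}
```

Checking the jumper first means a corrupted application can never block recovery.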

I recently asked ATMEL support about the FLASH data shelf life specification. I also asked if I had to erase the page or if I could skip the erase and just rewrite it to begin another 10 or whatever year data shelf life cycle. Obviously, if the page must be erased, then maintenance programs must exist in two different bootloader pages to avoid erasing the maintenance program code while it is running. If anyone is interested, I will post the ATMEL support answer after I get the reply.
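If the erase does turn out to be required, the two-copy arrangement might be organised like this. It is a sketch only: the page numbers and all names are hypothetical, and it models nothing more than the rule that a page is never refreshed by the copy of the routine that lives in it.

```c
#include <assert.h>

#define COPY_A_PAGE 0u  /* hypothetical page holding copy A of the routine */
#define COPY_B_PAGE 1u  /* hypothetical page holding copy B of the routine */

typedef enum { USE_COPY_A, USE_COPY_B } refresher_t;

/* Pick which copy of the routine refreshes a given page. */
refresher_t refresher_for_page(unsigned page)
{
    assert(COPY_A_PAGE != COPY_B_PAGE);  /* copies must be in different pages */
    /* Copy A handles everything except its own page, which copy B handles. */
    return (page == COPY_A_PAGE) ? USE_COPY_B : USE_COPY_A;
}

/* Returns 1 if no page is ever erased by the routine copy stored in it. */
int layout_is_safe(unsigned page_count)
{
    for (unsigned p = 0; p < page_count; p++) {
        refresher_t r = refresher_for_page(p);
        unsigned home = (r == USE_COPY_A) ? COPY_A_PAGE : COPY_B_PAGE;
        if (home == p)
            return 0;
    }
    return 1;
}
```

This assumes each copy fits inside a single page; a routine that spans pages would need the same check applied to every page it occupies.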

Any discussion or better ideas, etc.?


Flash has a minimum of 100,000 write cycles until it can no longer be written to. Most flash memory will last about 1,000,000 but it depends. When the flash 'dies' you will be able to read from it but you just won't be able to write. 100,000 writes is a 100,000 writes.....I don't think there is any way you can change that. Note: 100,000 writes per flash 'cell/block'. This is all off the top of my head so correct me if I'm wrong about something.....



The number of FLASH memory write cycles is a different issue.

FLASH memory and EEPROM do not remember forever. They will eventually lose their charge and forget everything that was ever programmed into them. I have seen specifications of 10 years minimum in modern FLASH and EEPROM chips. The memory can still be programmed for its 10,000 or 100,000 cycles. In fact, when it eventually forgets, it needs to be programmed again.

Remember, the 10 years (or whatever the limit is) starts when the memory is properly programmed. If you reprogrammed a FLASH memory every 10 years for its full 10,000 programming cycles, it would theoretically be 100,000 years before you wore the memory out. As long as the memory is reprogrammed before it forgets at its 10 year limit, it could theoretically make the entire 100,000 years without losing any data at all.
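The arithmetic above is easy to check. A small sketch, with invented function names and an assumed 30 year product life for the second calculation:

```c
#include <assert.h>
#include <stdint.h>

/* Years until the endurance limit is reached if the memory is rewritten
 * once per refresh interval and never otherwise. */
uint64_t refresh_lifetime_years(uint32_t endurance_cycles,
                                uint32_t refresh_interval_years)
{
    assert(refresh_interval_years > 0);
    return (uint64_t)endurance_cycles * refresh_interval_years;
}

/* Number of refreshes that fall within a given product life. */
uint32_t refreshes_needed(uint32_t product_life_years,
                          uint32_t refresh_interval_years)
{
    assert(refresh_interval_years > 0);
    return product_life_years / refresh_interval_years;
}

/* Usage check with the figures from the post. */
void lifetime_self_check(void)
{
    /* 10,000 cycles, one rewrite every 10 years */
    assert(refresh_lifetime_years(10000u, 10u) == 100000u);
    /* assumed 30 year product life, refreshed every 10 years */
    assert(refreshes_needed(30u, 10u) == 3u);
}
```

Either way, the handful of rewrites a real product would ever see is nowhere near the endurance limit.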


Ohhhh, I misunderstood your question. Thanks for the clarification!

Quote:
FLASH memory and EEPROM do not remember forever. They will eventually lose their charge and forget everything that was ever programmed into them. I have seen specifications of 10 years minimum in modern FLASH and EEPROM chips.

Hmmm...I didn't know this...learned another thing today!



Flash retention time is an interesting question. I saw it thrashed to death in another forum where they were discussing how to make a device that would sleep for 100 years then wake up and do something. Physically, Flash is very similar to UV EPROM, so the question is how long will the buried gate retain its trapped charge - and the answer seems to be, nobody knows. The technology has only been around for about 30 years. Ten years seems to be an estimate on the safe side, because there's plenty of 20 to 25 year old equipment around - with EPROMs - that still works.

Your self-refreshing idea seems like a good one, except that (as you say) if you get any failure for any reason - cosmic ray, lightning, etc - during the refresh operation, you'll lose the remaining unrefreshed lifetime. Since you can rely on your devices retaining their code until 2016, and the majority of them probably until 2026, I seriously doubt whether it's worth the trouble to preserve just the boot loader. After 10 or 20 years, nobody's going to reflash the application. If the device fails, they won't be able to tell if it's hardware or software. They'll just buy a factory reconditioned part.


It's not entirely true that "nobody knows" about data retention times.

There are mechanisms that accelerate data loss. One is high temperature. The manufacturers do accelerated life testing which can then be extrapolated to "normal" environments. But, even data loss has a life-distribution. The figure of "10 years" is likely to be the value met by some high percentage of devices. If this is so, then there could be life failures on a few units out of a large population in less than 10 years.

Of course, there is also the "cosmic ray" issue. It's not clear to me whether the odds of a failure due to things like this are factored into the 10 years. I also don't know whether such a failure permanently damages the memory cell; if it does, then a "reflash" would do no good, and you could not be certain of the integrity of the code used for the reflash, either.

Jim

Jim Wagner Oregon Research Electronics, Consulting Div. Tangent, OR, USA http://www.orelectronics.net


I copied and pasted most of the reply from the AVR applications group below (I edited the page erase comments and put them in my own words):

The minimum flash retention for all AVR MCUs is 20 years.
They have a minimum of 100 000 write cycles (can be rewritten 100000 times, minimum).

So, within your application lifetime, NOTHING will happen to the flash unless some external mechanism has an impact. Such mechanisms can be:
- storage outside of the specified temperature range (see datasheet).
- exposure to under-voltage due to poor voltage supply.
- other power failures like sudden spikes and drops, when not using the BOD.
- errors in your bootloader application.

It turns out the page erase is one half of the refresh operation. Page erase writes all the one values. Writing a page only writes the zero values. If I skip the page erase, then none of the one values get refreshed. I think I will keep the maintenance routine. Maybe I should put a prize in a trust fund for the first customer (if any) that reports actually using it at 20 years :).


Quote:
Page erase writes all the one values. Writing a page only writes the zero values. If I skip the page erase, then none of the one values get refreshed.

But a '1' bit is a gate-discharged condition! It's the natural state of minimum energy, where all your bits are headed during the next century or three. There's nothing to refresh in a '1' bit.


Pert,

You have a very good point there.

I guess I shouldn't have edited it. Here is the original (my bad):

Quote:

Furthermore; changing flash cells is comprised by these two operations:
- writing : cell --> '0'
- erasing: cell --> '1'

Page erase is just like writing all 0xff to the page buffer, then go pagewrite. Writing a page is like only applying a grid of only the 0's in the page buffer....
So you can see how wanting to refresh a flash array not consisting solely of 0's, must involve erasing also. Otherwise, you will not be refreshing the 1's.

Your point seems to contradict this and I tend towards your explanation, given that a FLASH 1 bit is a discharge state. However, maybe there is also some value to discharging all the cells before recharging some of them. I honestly cannot say.
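The logical side of the write/erase behaviour in the quote can be modelled in a few lines. This is a simplified sketch with invented names: it treats a page write as a bitwise AND with the existing contents, which matches the "writing only applies the 0's" description, and it says nothing about the physical charge on the cells, which is the part actually in dispute here.

```c
#include <assert.h>
#include <stdint.h>

#define ERASED 0xFFu  /* an erased cell reads as all ones */

/* Programming can only turn 1 bits into 0 bits. */
uint8_t flash_program(uint8_t cell, uint8_t data)
{
    return (uint8_t)(cell & data);
}

/* Erasing returns every bit to 1, whatever was there before. */
uint8_t flash_erase(uint8_t cell)
{
    (void)cell;
    return ERASED;
}

/* Usage check for the cases discussed in the thread. */
void flash_model_self_check(void)
{
    /* Normal sequence: erase, then program. */
    assert(flash_program(flash_erase(0xA5u), 0x5Au) == 0x5Au);
    /* Rewriting the same data without an erase leaves the value unchanged. */
    assert(flash_program(0xA5u, 0xA5u) == 0xA5u);
    /* Writing different data without an erase gives the AND of old and new. */
    assert(flash_program(0xA5u, 0x5Au) == 0x00u);
}
```

In this model, rewriting identical data without an erase is harmless, but only the erase ever touches the bits that read as 1.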

Originally I was afraid the AVR internal FLASH programming hardware might look at the data in the FLASH, notice it was already correct and not do any actual programming at all. I guess it's just safer to erase the pages first. It only cost about 21 extra words of code to split the maintenance program into the two pieces necessary to support erasing the pages.

My marketplace has endured many products (not mine) that turn into a useless doorstop if there is any problem at all during the normal programming cycle (this is just the customer data being entered into the device). Problems programming firmware would also make doorstops. All doorstops need attention, usually from the factory for $$$$$. Much of this doorstop behavior came from the factory designing traps to keep hackers from using programming to enable features that the factory wanted to be paid to enable (which is their right). However, the traps cannot tell the difference between legitimate program corruption, like a power loss during programming or a loose cable connector, and a hacking attempt. The product dies either way. Let's just say my marketplace is very sensitive to programming issues. So, I'm much better off making a solid case for a robust field recovery ability if a customer data or firmware programming cycle goes bad. Even the somewhat ridiculous 20 year maintenance program has its place. Also, as another victim of planned obsolescence (aren't most of us?), I feel better letting the customer decide when they want to retire the product, no matter how long it's been around.


Quote:
However, maybe there is also some value to discharging all the cells before recharging some of them.

Well if you're going to reprogram at all, I agree it's best to follow the whole sequence and erase first. Flash needs precise amounts of charge to program and erase and it's quite finely balanced. In the early days you could make it unprogrammable by over-erasing - giving it too long an erase pulse. These days it's all self-timed so you can't make that particular mistake, but I take it as a warning not to over-program individual bits in case they get stuck. Doing a chip erase first equalizes all the charges and keeps things in balance.


I hope that Mike B and the others concerned about re-freshing flash every 10 years have no electrolytic caps in their units, or any other components that might age or wear out.

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


A bit off the subject, but I wonder if RoHS will have unintended consequences for equipment lifetime. Tin whiskers could be quite a problem for small packages with pins closer than 0.5mm.


Lee has a very good point. Actually I'm replacing an existing 64 pin microprocessor, external UVPROM and external EEPROM with an AVR mounted on a small plug-in circuit board. Most of the parts in the original product that are susceptible to aging are field replaceable/serviceable. Obviously locked boot loader code does not fit into this category.

About RoHS, I guess we are the guinea pigs that get to field test this :(. Now where did I put that planned obsolescence "how to" book.... :wink:


I would like to thank everyone for their input.

BTW, I seem to remember something about the early space program Gemini phase. When they first started extended missions, they found a ball bearing material that would grow whiskers in the orbital environment, something it never did on earth. I cannot remember which alloy it was, but the ball bearings started developing lots of friction when they were up there long enough. I think I forgot to reFLASH my old brain and lost or mangled the original information :lol:. Oh well.


I think I have seen the ultimate folly of trying to refresh the boot loader FLASH. The fuse bits are most likely FLASH themselves. What do you do when the clock source select fuse goes out on a memory protected device? Well, after 20, 30 or whatever + years, I could just make the software public domain if anyone still cares?