FatFs and disk removal

I cannot recall having seen previous questions about this.

 

I have a data logger with a removable SD card.

I have connected the "disk present" line of the socket to an interrupt pin.

 

At the moment I still have to check whether the switch already opens when the card is pressed in for removal, or only when the card is actually removed.

But I assume the contact will already be open when the card is pressed in to release it, which gives you a little more time.

 

ISR( INT7_vect)
{
	tEvent l_Event;

		// close file as fast as possible
	f_sync(&file1);								// first make sure the file is synchronized
	f_close(&file1);							// then close the file

Inside the interrupt my intention was to immediately sync and close the file

 

But I now realize that the LCD is also controlled through the same SPI interface, so the LCD might be mid-update at the moment of removal or, worse, the disk might be in the middle of a write.

I wonder what I should do now.

Technically, whenever I write to the disk I also immediately sync the file, so it is up to date. A close of the file is all that would remain, but that still gives me a conflict.

 

On old projects where I used an SD card I never closed a file, and always let the system crash by removing the batteries.

I did always do a sync after writing and have never experienced corrupted files or crashed disks.

But there should be a somewhat nicer way to do things.

 

Now I have a couple of flags that indicate whether the disk is present/initialized/mounted and whether a file is open. These I also clear immediately in the interrupt.

But that does not solve my problem of ensuring that the file is at least synchronized.

 

Hope you can shed some light on this.

The best thing would be to make the card non-removable, but at this point that is not an option, as there is no other access to this disk and we want to do a readout.

 

 


I used the internet archive to step back in time to 2014 and take a look at an ffsample.zip from the Chan site at that time. I haven't looked at a recent release (though I know that later some of this stuff was lost) but back then the sample implementation of disk_timerproc() in mmc_avr.c looked like:

void disk_timerproc (void)
{
	BYTE n, s;

	n = Timer1;				/* 100Hz decrement timer */
	if (n) Timer1 = --n;
	n = Timer2;
	if (n) Timer2 = --n;

	s = Stat;

	if (SOCKWP)				/* Write protected */
		s |= STA_PROTECT;
	else					/* Write enabled */
		s &= ~STA_PROTECT;

	if (SOCKINS)			/* Card inserted */
		s &= ~STA_NODISK;
	else					/* Socket empty */
		s |= (STA_NODISK | STA_NOINIT);

	Stat = s;				/* Update MMC status */
}

that is using macros at the top of the file:

#define SOCKINS		(!(PINB & 0x10))	/* Card detected.   yes:true, no:false, default:true */
#define SOCKWP		(PINB & 0x20)		/* Write protected. yes:true, no:false, default:false */

So every 10ms (the timer runs at 100Hz) the interrupt code is sensing the state of the Write Protect and Card Present switches and setting STA_PROTECT, STA_NODISK and STA_NOINIT accordingly.

 

So the rest of the code is then checking things like:

DRESULT disk_write (
	BYTE pdrv,			/* Physical drive number (0) */
	const BYTE *buff,	/* Pointer to the data to be written */
	DWORD sector,		/* Start sector number (LBA) */
	UINT count			/* Sector count (1..128) */
)
{
	if (pdrv || !count) return RES_PARERR;
	if (Stat & STA_NOINIT) return RES_NOTRDY;
	if (Stat & STA_PROTECT) return RES_WRPRT;

	if (!(CardType & CT_BLOCK)) sector *= 512;	/* Convert to byte address if needed */
...

HOWEVER none of this protection really helps.

 

FAT(12/16/32) are inherently dangerous filing systems and can be easily corrupted. If you are old enough just cast your mind back to the days of Lotus-123, Word and so on running on MS-DOS and writing to (initially) floppy disks in FAT12 format.

 

If you remember that, you almost certainly remember chkdsk.exe (later scandisk.exe), files called file0000.chk and so on??

 

The fact is that if you started to write a file from your spreadsheet or your word processor and power failed part way through or the write(s) were otherwise interrupted you would often get "orphaned chains" on your floppy disk when you later checked it with chkdsk.

 

That's because during a write the system started to add an entry to the FAT chain (as it crossed an allocation unit boundary) and yet it didn't get the chance to complete so the file was left part written with "dangling bits".

 

Similarly the way FAT writing works is that it writes the FAT (if a new allocation unit was necessary), then it writes the data sectors in the current AU ("cluster") then it goes back to the directory and rewrites the entry for the file with an updated "modified time" and "file size". Well suppose it had written the data but power failed (or someone just flipped the catch to release the floppy) before it got the chance to update the filesize. Again the filesystem would be compromised.
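To make that ordering concrete, here is a toy model in C of the three updates described above (FAT chain, data sectors, directory entry) with the write cut off part way through. It is only a sketch: `ToyVolume`, `toy_append`, `toy_is_consistent` and the 4096-byte cluster size are all made up for illustration and have nothing to do with the real FatFs internals.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the three on-disk structures touched by an append.
   All names here are hypothetical; this is not FatFs code. */
typedef struct {
    uint32_t fat_clusters;   /* clusters linked into the file's FAT chain */
    uint32_t data_bytes;     /* bytes of file data actually on the medium */
    uint32_t dir_file_size;  /* file size recorded in the directory entry */
} ToyVolume;

enum { CLUSTER_BYTES = 4096 };  /* assumed allocation unit size */

/* Append len bytes in the order FAT -> data -> directory entry, but stop
   after `steps_completed` steps (0..3) to mimic power loss or removal. */
static void toy_append(ToyVolume *v, uint32_t len, int steps_completed)
{
    uint32_t new_size = v->data_bytes + len;
    uint32_t needed = (new_size + CLUSTER_BYTES - 1) / CLUSTER_BYTES;

    if (steps_completed < 1) return;
    if (needed > v->fat_clusters) v->fat_clusters = needed;  /* 1: FAT chain */

    if (steps_completed < 2) return;
    v->data_bytes = new_size;                                /* 2: data sectors */

    if (steps_completed < 3) return;
    v->dir_file_size = new_size;                             /* 3: directory entry */
}

/* Consistent when the directory size matches the data on the medium and
   the chain is exactly long enough to hold it. */
static bool toy_is_consistent(const ToyVolume *v)
{
    uint32_t needed = (v->dir_file_size + CLUSTER_BYTES - 1) / CLUSTER_BYTES;
    return v->dir_file_size == v->data_bytes && v->fat_clusters == needed;
}
```

Stopping after step 1 leaves a chain longer than the recorded size needs (the "orphaned chain" case), and stopping after step 2 leaves data on the medium that the directory entry does not know about.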

 

Lots of secretaries, switching about between lots of floppies, ended up with a lot of disks with corrupt filing systems.

 

That is the nature of FAT!

 

There's no real protection for this (short of a servo that operates a catch and physically locks the SD/MMC in place until the filing system is happy that a complete update has been made and it's safe to release!).

 

So FAT storage (SD, CompactFlash, Floppy, USB, whatever) is going to be damaged from time to time.

 

All you can hope is that your key system files weren't in the middle of an update when the damage occurred.

 

FatFs helps a bit because (unlike MS-DOS etc with loads of RAM) it doesn't have room to cache too much in RAM so it's never more than a sector or two from having written everything.

 

Now you may have spotted that FAT is often set up to have TWO copies of the FAT table? A "good" operating system will write updated info to both (MS-DOS always did). HOWEVER sadly no one (certainly not Microsoft) ever did anything to make use of this for "crash recovery". The closest anyone ever got was Peter Norton in the famous Norton Utilities. He had utilities there that could try to resurrect very broken disks by examining FAT2 if FAT1 didn't make sense. But the problem here is how do you know which one is the "intact" copy?

 

Later filing systems like NTFS, EXT3 (then EXT4) and so on are "journalling filing systems": they log operations and try to make a complete update "atomic", so it either worked completely or it failed, but you don't get "half done" updates as was possible with FAT. So they are more robust about things like power failure or media withdrawal etc.

 

BTW if you think all this is bad consider:

[Image: Sky+ HD box]

Sky TV had a deployed population of over 10 million of these. Our company made 3+ million of them. I was the guy trying to explain to Sky why 5%+ of our boxes were returned under warranty, with about 80..90% of those being because of corruption in the disk filing system (which was based on a specially interleaved "dual FAT", where a small part of the disk used "normal" FAT with 16KB (was it?) allocation units while the main recording area used "VFATs", which was FAT but using 1.5MB clusters).

 

I wrote all kinds of analysis tools and learned more than it's healthy to know about FAT (and Seagate/Western Digital hard drive technologies!).

 

But the annoying thing is that loads of people (millions!) are in the habit of turning off the TV (and the TV recorder) at the wall each night, and because a satellite TV recorder is recording ALL the time (it's always buffering the last hour of the current channel) you are bound to lose power part way through a write. The hope was that this was just a corruption in one of the 1.5MB AUs, as a bit of video "chopped off" did not matter. But if there was a critical write to the 16KB AU area where the main data indices were held, then you could completely screw either just the one recording or the user's last 2 years of recordings.

 

They were not happy bunnies!!

 

(bottom line: FAT is simple but it's actually a bit crap!)

Last Edited: Thu. Feb 21, 2019 - 03:32 PM

In the documentation, take a look at f_sync(). I use this every so often (on a scale of seconds) so that lost data is only a few samples from the logger if the disk is improperly removed.
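As a rough sketch of that "sync every few seconds" idea, the bookkeeping can be kept separate from the FatFs calls. Everything named here (`SyncPolicy`, `SYNC_INTERVAL_MS`, the `sync_policy_*` helpers and the millisecond tick passed in) is an assumption for illustration; only f_sync() itself comes from FatFs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interval; pick it by how many samples you can afford to lose. */
#define SYNC_INTERVAL_MS 2000u

typedef struct {
    uint32_t last_sync_ms;   /* tick value at the last successful sync */
    bool     dirty;          /* data written since the last sync */
} SyncPolicy;

/* Call after each write to the log file. */
static void sync_policy_note_write(SyncPolicy *p)
{
    p->dirty = true;
}

/* Returns true when the caller should call f_sync() now.
   Unsigned subtraction keeps this correct across tick counter wraparound. */
static bool sync_policy_due(const SyncPolicy *p, uint32_t now_ms)
{
    return p->dirty && (uint32_t)(now_ms - p->last_sync_ms) >= SYNC_INTERVAL_MS;
}

/* Call after f_sync() has returned FR_OK. */
static void sync_policy_done(SyncPolicy *p, uint32_t now_ms)
{
    p->last_sync_ms = now_ms;
    p->dirty = false;
}
```

In the main loop you would check sync_policy_due() with your own tick counter, call f_sync() on the open file when it returns true, and call sync_policy_done() only if f_sync() returned FR_OK. Skipping the sync when nothing was written avoids needless card writes.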

 

Jim

Jim Wagner Oregon Research Electronics, Consulting Div. Tangent, OR, USA http://www.orelectronics.net


Things like f_sync()ing and f_close()ing can certainly help as you are less likely to be caught in a "half open" state but what you cannot really cater for is the power dying or the card being removed while a write is actually in progress.
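One way to avoid touching the SPI bus from inside the card-detect interrupt is to have the ISR only raise a flag and let the main loop do the close once the bus is free. This is a host-compilable sketch under assumptions: `spi_busy`, `file_open`, `close_log_file()` and the other names are hypothetical application code, and `close_log_file()` merely stands in for the f_close(&file1) from the first post. It can only help if the switch really does open before the card loses contact.

```c
#include <stdbool.h>

/* Set from the card-detect interrupt, read from the main loop. */
static volatile bool card_removal_pending = false;

/* Hypothetical application state, not part of FatFs. */
static bool spi_busy = false;    /* true while the LCD or card owns the SPI bus */
static bool file_open = false;   /* true between f_open() and f_close() */
static int  close_calls = 0;     /* counts calls to the stand-in below */

/* Stand-in for f_close(&file1). f_close() itself writes back cached data,
   so a separate f_sync() immediately before it is not needed. */
static void close_log_file(void)
{
    close_calls++;
    file_open = false;
}

/* Body of the card-detect ISR (ISR(INT7_vect) on the AVR): only raise a flag,
   never start an SPI transfer from here. */
static void card_detect_isr_body(void)
{
    card_removal_pending = true;
}

/* Called from the main loop between SPI transactions. */
static void service_card_removal(void)
{
    if (!card_removal_pending || spi_busy) return;  /* retry on the next pass */
    card_removal_pending = false;
    if (file_open) close_log_file();
}
```

The flag is a single byte, so reading it in the main loop is atomic on an AVR. The main loop has to call service_card_removal() often enough for the close to finish before the card actually leaves the socket.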

 

Again if we cast our minds back to the 80's and MS-DOS plus floppy disks (or even HDDs later on) then the user was given a visual clue: