FatFs and disk removal


I can not recall having seen previous questions about this.

 

I have a data logger with a removable SD card.

I have connected the "disk present" line of the socket to an interrupt pin.

 

I still have to check whether the switch opens at the moment the card is pushed in to release it, or only when the card is actually pulled out.

I assume the contact already opens when the card is pushed in for removal, which would give me a bit more time.

 

ISR( INT7_vect)
{
	tEvent l_Event;

	// close file as fast as possible
	f_sync(&file1);								// first make sure the file is synchronized
	f_close(&file1);							// then close the file
	// ... (rest of the handler not shown)
}

My intention was to sync and close the file immediately, inside the interrupt.

 

But I now realize that the LCD is controlled through the same SPI interface, so at the moment of removal the display might be in the middle of an update or, worse, the disk might be in the middle of a write.

I wonder what I should do now.

Technically, whenever I write to the disk I also immediately sync the file, so it is up to date. Closing the file is all that would remain, but that still leaves me with the bus conflict.

 

On old projects where I used an SD card I never closed a file, and always let the system crash by removing the batteries.

I did always do a sync after writing and have never experienced corrupted files or crashed disks.

But there should be a somewhat nicer way to do things.

 

Now I have a couple of flags that indicate whether the disk is present/initialized/mounted and whether a file is open. I also clear these immediately in the interrupt.

But that does not solve my problem of ensuring that the file is at least synchronized.
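
One way around the shared-SPI conflict is to let the interrupt only record the event and do the FatFs work from the main loop, once the bus is free. The following is a minimal sketch of that idea, not a tested driver: `card_detect_isr`, `service_card_removal`, `spi_busy`, `file_open` and `close_log_file` are made-up names, and `close_log_file` is a stand-in for `f_close(&file1)` so the sketch compiles on a PC.

```c
#include <stdbool.h>

/* Set by the card-detect interrupt, consumed by the main loop. */
static volatile bool card_removed = false;

/* Hypothetical application state, not part of FatFs. */
static bool file_open   = false;  /* true while file1 is open            */
static bool spi_busy    = false;  /* true while LCD or card owns the SPI */
static int  close_calls = 0;      /* counts stand-in close operations    */

/* Stand-in for f_close(&file1); on the target this would be the real call. */
static void close_log_file(void)
{
    close_calls++;
    file_open = false;
}

/* What the body of ISR(INT7_vect) would reduce to: no SPI traffic at all. */
void card_detect_isr(void)
{
    card_removed = true;
}

/* Call this from the main loop between LCD updates and log writes. */
void service_card_removal(void)
{
    if (!card_removed || spi_busy) {
        return;                   /* nothing to do, or bus still in use */
    }
    card_removed = false;
    if (file_open) {
        close_log_file();         /* may fail if the card is already gone */
    }
}
```

The flag stays set until the main loop has handled it, so an LCD update or a write that is in progress can finish first. Whether the close then still succeeds depends on how early the socket switch opens.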

 

Hope you can shed some light on this.

The best thing would be to make the card non-removable, but at this point that is not an option, as there is no other access to the disk and we want to do a readout.

 

 


I used the internet archive to step back in time to 2014 and take a look at an ffsample.zip from the Chan site at that time. I haven't looked at a recent release (though I know that later some of this stuff was lost) but back then the sample implementation of disk_timerproc() in mmc_avr.c looked like:

void disk_timerproc (void)
{
	BYTE n, s;

	n = Timer1;				/* 100Hz decrement timer */
	if (n) Timer1 = --n;
	n = Timer2;
	if (n) Timer2 = --n;

	s = Stat;

	if (SOCKWP)				/* Write protected */
		s |= STA_PROTECT;
	else					/* Write enabled */
		s &= ~STA_PROTECT;

	if (SOCKINS)			/* Card inserted */
		s &= ~STA_NODISK;
	else					/* Socket empty */
		s |= (STA_NODISK | STA_NOINIT);

	Stat = s;				/* Update MMC status */
}

that is using macros at the top of the file:

#define SOCKINS		(!(PINB & 0x10))	/* Card detected.   yes:true, no:false, default:true */
#define SOCKWP		(PINB & 0x20)		/* Write protected. yes:true, no:false, default:false */

So every 10 ms (the timer runs at 100 Hz) the interrupt code senses the state of the Write Protect and Card Present switches and sets STA_PROTECT, STA_NODISK and STA_NOINIT accordingly.

 

So the rest of the code is then checking things like:

DRESULT disk_write (
	BYTE pdrv,			/* Physical drive number (0) */
	const BYTE *buff,	/* Pointer to the data to be written */
	DWORD sector,		/* Start sector number (LBA) */
	UINT count			/* Sector count (1..128) */
)
{
	if (pdrv || !count) return RES_PARERR;
	if (Stat & STA_NOINIT) return RES_NOTRDY;
	if (Stat & STA_PROTECT) return RES_WRPRT;

	if (!(CardType & CT_BLOCK)) sector *= 512;	/* Convert to byte address if needed */
...

HOWEVER none of this protection really helps.

 

FAT(12/16/32) are inherently dangerous filing systems and can be easily corrupted. If you are old enough just cast your mind back to the days of Lotus-123, Word and so on running on MS-DOS and writing to (initially) floppy disks in FAT12 format.

 

If you remember that, you almost certainly remember chkdsk.exe (later scandisk.exe), files named file0000.chk and so on?

 

The fact is that if you started to write a file from your spreadsheet or your word processor and power failed part way through or the write(s) were otherwise interrupted you would often get "orphaned chains" on your floppy disk when you later checked it with chkdsk.

 

That's because during a write the system started to add an entry to the FAT chain (as it crossed an allocation unit boundary) and yet it didn't get the chance to complete so the file was left part written with "dangling bits".
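
To make the "orphaned chain" idea concrete, here is a toy model of what a checker like chkdsk looks for: clusters that the FAT marks as allocated but that no directory entry leads to. This is only a sketch on a made-up 16-entry FAT held in RAM. `NUM_CLUSTERS`, `FAT_FREE`, `FAT_EOC` and `count_lost_clusters` are illustrative names, not taken from FatFs or MS-DOS.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_CLUSTERS 16
#define FAT_FREE 0x0000u   /* entry value for an unused cluster         */
#define FAT_EOC  0xFFFFu   /* end-of-chain marker in this toy model     */

/* Count clusters that are marked allocated in the FAT but are not
   reachable from any of the given directory start clusters. */
int count_lost_clusters(const uint16_t fat[NUM_CLUSTERS],
                        const uint16_t *starts, int num_files)
{
    bool reachable[NUM_CLUSTERS] = { false };

    for (int f = 0; f < num_files; f++) {
        uint16_t c = starts[f];
        /* Follow the chain; stop on end-of-chain, a bad index, or a loop. */
        while (c >= 2 && c < NUM_CLUSTERS && !reachable[c]) {
            reachable[c] = true;
            c = fat[c];
        }
    }

    int lost = 0;
    for (int c = 2; c < NUM_CLUSTERS; c++) {   /* entries 0 and 1 are reserved */
        if (fat[c] != FAT_FREE && !reachable[c]) {
            lost++;
        }
    }
    return lost;
}
```

A chain that was allocated just before the power failed, with no directory entry pointing at it yet, shows up here as lost clusters. Those are the pieces that end up in the .chk files.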

 

Similarly the way FAT writing works is that it writes the FAT (if a new allocation unit was necessary), then it writes the data sectors in the current AU ("cluster") then it goes back to the directory and rewrites the entry for the file with an updated "modified time" and "file size". Well suppose it had written the data but power failed (or someone just flipped the catch to release the floppy) before it got the chance to update the filesize. Again the filesystem would be compromised.

 

Lots of secretaries, switching about between lots of floppies, ended up with a lot of disks with corrupt filing systems.

 

That is the nature of FAT!

 

There's no real protection for this (short of a servo that operates a catch and physically locks the SD/MMC in place until the filing system is happy that a complete update has been made and it's safe to release!).

 

So FAT storage (SD, CompactFlash, Floppy, USB, whatever) is going to be damaged from time to time.

 

All you can hope is that your key system files weren't in the middle of an update when the damage occurred.

 

FatFs helps a bit because (unlike MS-DOS etc with loads of RAM) it doesn't have room to cache too much in RAM so it's never more than a sector or two from having written everything.

 

Now you may have spotted that FAT is often set up to have TWO copies of the FAT? A "good" operating system will write updated info into both (MS-DOS always did). HOWEVER sadly no one (certainly not Microsoft) ever did anything to make use of this for "crash recovery". The closest anyone ever got was Peter Norton in the famous Norton Utilities. He had utilities there that could try to resurrect very broken disks by examining FAT2 if FAT1 didn't make sense. But the problem here is how do you know which one is the "intact" copy?
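
The "which copy is intact?" problem is easy to see in code. Comparing the two copies tells you where they disagree, but nothing in the comparison says which side is right. A small sketch, assuming both copies have already been read into RAM as arrays of 16-bit entries (`first_fat_mismatch` is a made-up helper name):

```c
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first entry where the two FAT copies differ,
   or -1 if they are identical. A mismatch only proves that an update was
   interrupted; it does not say which copy holds the correct value. */
long first_fat_mismatch(const uint16_t *fat1, const uint16_t *fat2,
                        size_t entries)
{
    for (size_t i = 0; i < entries; i++) {
        if (fat1[i] != fat2[i]) {
            return (long)i;
        }
    }
    return -1;
}
```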

 

Later filing systems like NTFS, EXT3 (then EXT4) and so on are "journalling filing systems": they log operations and try to make a complete update "atomic", so it either worked completely or it failed, but you don't get "half done" updates as was possible with FAT. So they are more robust about things like power failure or media withdrawal etc.

 

BTW if you think all this is bad consider:

[Image: Sky+ HD box]

Sky TV had a deployed population of over 10 million of these. Our company made 3+ million of them. I was the guy trying to explain to Sky why 5%+ of our boxes were returned under warranty, with about 80..90% of those being because of corruption in the disk filing system (which was based on a specially interleaved "dual FAT" where a small part of the disk used "normal" FAT and 16KB (I think it was) allocation units, while the main recording area used "VFATs", which was FAT but using 1.5MB clusters).

 

I wrote all kinds of analysis tools and learned more than it's healthy to know about FAT (and Seagate/Western Digital hard drive technologies!).

 

But the annoying thing is that loads of people (millions!) are in the habit of turning off the TV (and the TV recorder) at the wall each night and because a satellite TV recorder is recording ALL the time (it's always buffering the last hour of the current channel) then you are bound to lose power part way through a write. The hope was this was just a corruption in one of the 1.5MB AUs as a bit of video "chopped off" did not matter but if there was a critical write to the 16KB AU area where the main data indices were held then you could completely screw either just the one recording or the user's last 2 years of recordings.

 

They were not happy bunnies!!

 

(bottom line: FAT is simple but it's actually a bit crap!)

Last Edited: Thu. Feb 21, 2019 - 03:32 PM

In the documentation, take a look at f_sync(). I use this every so often (on a scale of seconds) so that lost data is only a few samples from the logger if the disk is improperly removed.

 

Jim

 

Until Black Lives Matter, we do not have "All Lives Matter"!

 

 


OK, that clarifies a lot.

 

fortunately it is a 1 off project for myself, so if things go wrong it is both designer and operator error in one go, hahahahaha

 

Fortunately it is not very simple to remove the card, so the chance that it is going to happen by accident is very small.

I do have the possibility to shut down the device, so normal procedure will be turning the device off and then remove the card.

After I have written each record I do use f_sync() to synchronize the file. In the past, when I did not do that, I had very regular disk crashes, with the result that I had to run chkdsk and other tools to revive the disk (back in the good old days........ memories). So syncing after each write seems to be a good first protection against corrupting the disk.

 

I am going to check when I get a warning and how much time I then have to at least sync the file. Then again, it might be better to lose the last running recording and close the file immediately.

 

I have been thinking about using an indicator, but I only have the display... Since that is on SPI too, I would first have to update the display, then write the file and then update the display again. Most likely updating the display takes much more time than the plain file write, making the system very busy. And as the file is updated every second, and not much data is written to the file, it would most likely become an irritating flashing of the display.

 

thanks for the insights.


I use f_sync() about once a minute. That means the worst-case data loss would be a minute's worth of data. It's your choice. You can use it after every write, but I think that slows things down, because it has to flush the firmware disk buffer, and if that buffer isn't full there is a lot of extra "action". For my application, the former implementation did not use f_sync() and had files that were 24 hours long. That meant a loss of up to 24 hours of data if the file was not closed properly, so this is a huge improvement over that.
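
The "sync every so often" policy can be kept separate from the logging code with a small helper that says when a flush is due. This is only a sketch: `SYNC_INTERVAL_MS`, `last_sync_ms` and `sync_due` are made-up names, and it assumes the application has some free-running millisecond tick to pass in.

```c
#include <stdbool.h>
#include <stdint.h>

#define SYNC_INTERVAL_MS 60000u   /* assumed: about once a minute */

static uint32_t last_sync_ms = 0; /* tick value at the last flush */

/* Returns true when the caller should flush (e.g. call f_sync) now.
   Unsigned subtraction keeps the comparison correct across tick wrap-around. */
bool sync_due(uint32_t now_ms)
{
    if ((uint32_t)(now_ms - last_sync_ms) >= SYNC_INTERVAL_MS) {
        last_sync_ms = now_ms;
        return true;
    }
    return false;
}
```

In the logging loop this would be used after each write, calling f_sync(&file1) only when sync_due() returns true. Setting the interval to 0 gives the sync-after-every-write behaviour, so the trade-off between speed and worst-case data loss becomes a single constant.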

 

Jim

 


 

 


Still missing the point. Suppose the power dies or the card is pulled out half way through the operation of an f_sync() call.


Still better odds than nothing!

 

Jim

 


 

 


My Pipe organ controllers have a simple SD card socket, resistor divider level shifter and a mega328 on them.

 

In the first iteration I ported my floppy disk FAT12 reader to FAT16/FAT32 using ASM.  Then I found FatFs and converted much of the project to C.  I had some corruption with the first (ASM) version when they left the recorder on, then turned the organ off and went home.  I still have 100s of little *.chk files in a folder somewhere.  (I think I wrote a script to reconstruct most of the recorded playback, as they have an internal clock in each frame.)  Since then no one has powered down with the recorder active.

 

I was actually working on some of this today.  Replaced one of my systems back in 2013 with a more professional system.  Another project has no budget, so it inherited the old boards.  Had to rewrite some of the code to match the current master.  This is the part called a combination action, which slaves off the master.  There are about 40 buttons below the keyboards called pistons.  These can activate 260 or so solenoid pairs moving the tabs (called stops) up and down.  Every time a button is pressed, the file is opened, the bits are read and the file is closed.  The buttons are one shots with about a 200 millisecond delay, so there is plenty of time to fire or reset the coils. Edit: the file can also be written with a two button press. The set button enables write.  Then the piston becomes a write button and the state of the stops is written to the SD card. There are LED disk access indicators.

 

Had an issue testing.  I did not trust FatFs at first, so I was buffering the frames, attempting to keep as much in memory as possible to reduce disk reads.  Looking at this all these years later, it looks like I was double or triple buffering the frames. I found that out the hard way when I changed some of the hard-coded seeks, where I was directly manipulating the file pointer, to use f_tell and f_lseek.  The newer master code uses the standard calls, as I may port the code.

 

While it is unlikely someone would attempt to remove the disk during a write,  I do still have disk access LEDs (left over from the floppy days.)  The real risk is a glitch from all the EMF when the coils fire.  There are 2 30 amp supplies for a total of 60 amps.  The input logic is a separate 12 volt supply regulated to 5V and 3.3 for the electronics.   Input gating on these systems is quite clever and has been used widely for about 30 or 40 years.  10K pull ups or pull downs in front of a 74*165 with a 100K resistor on the line.  The internal clamp diode does the work as so little current reaches the pin. (Or so I have been told.)  So far I  have not had a glitch in the system causing a brownout reset of the AVR328.

 

Did have an interesting failure on the RS485 line earlier in the week.  Plugged the board in backwards (not easy to do as the sockets are keyed).  Put -12V on the VCC line.  Magic smoke came out.  Swapped out the AVR.  System seemed to run.  One of the slaves was not seeing data.  Finally lugged out the scope while rewriting some of the old FatFs code to make it more portable, as above.  Half of the RS485 TX pair was out. The RX pair was working perfectly.  Looked at the chip under magnification.  Neatest little hole I ever did see next to the TX leg.  Blew the encapsulation right off next to the index dot.  Thought I might have damaged the SD card.  SD card reads fine. Files seem intact.

 

So I guess FatFs is now mature enough to be trusted.  As long as one plays by the rules.

 

 

 

 

Last Edited: Sat. Feb 23, 2019 - 05:09 AM

Not entirely missing the point.

I understand that removing the card while a write is in progress will cause trouble.

 

In the past I wrote lots of small strings to the card and synchronized at the end.

I now first build the entire record, then write it to the file in one go and sync, which minimizes the time spent writing to the file.
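
The "build the whole record first" approach could look something like the sketch below. The record layout (seconds, a semicolon, one value, CR/LF) and the name `build_record` are invented for illustration. The actual logger format is not shown in this thread.

```c
#include <stddef.h>
#include <stdio.h>

/* Format one complete log record into buf.
   Returns the number of bytes to write, or 0 if the record does not fit. */
size_t build_record(char *buf, size_t size, unsigned long seconds, int value)
{
    int n = snprintf(buf, size, "%lu;%d\r\n", seconds, value);
    if (n < 0 || (size_t)n >= size) {
        return 0;                 /* formatting error or truncated record */
    }
    return (size_t)n;
}
```

On the target the buffer would then go out with a single f_write(&file1, buf, n, &bw) followed by f_sync(&file1), where bw is a UINT that receives the number of bytes actually written. Checking that bw equals n catches a full or missing card. That keeps the window in which the file is inconsistent down to one write plus one sync per record.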

 

I have checked and, stupidly enough, the card present switch only opens when the card is almost disconnected, instead of at the moment the card is pushed inward to remove it.

In the latter case there would have been enough time to actually sync and close the file, I guess.

 

Had an interesting thing yesterday evening. Although I shut down my device by syncing and closing the open file and then removing power, the card was no longer recognized by the PC.

I had to format the disk in order for it to work properly again. Note that I did put it back in the device before reformatting, and there it initialized, mounted and I could open a file without problems (I did not get an error back).

I thought syncing and closing the open files should be enough to keep the card good, but apparently there is something else going on.

I do use an older version of FatFs at the moment, and I know I still have to update to the latest version, but previous attempts at doing so did not work out.

 

 

 

 

 


meslomp wrote:

...

Had an interesting thing yesterday evening. Although I shutdown my device by syncing and closing the open file and then removing power. It was no longer recognized in the PC.

I had to format the disk in order for it to work properly again. Note that I did put it back in the device, before reformatting, and there it initialized, mounted and I could open a file without problems ( I did not get an error back).

 

I use Apple OS computers. Apple has a nasty habit of writing some hidden 'metadata' files to the disk.  There are 3 or so files one can write which disables some of this.  The OS still sometimes writes a copy of the file name with a dot (.) in front of it. I think the hidden attributes are also set.

The names of the files that turn off indexing are:

.Trashes  (Make this a zero byte file entry, if this is a folder files are not deleted and their directory entry is moved inside)

.metadata_never_index  (this is a zero byte file, it turns spotlight search to the drive off,  The drive can not be searched.)

.fseventsd (this is a folder it contains a zero byte file called no_log,  If this is not present many metadata files can be written to this folder.)

 

There is not much one can do about the finder writing the . metadata duplicated filename when the media can be written to.  Apple feels they own the data on media.  In the old floppy systems they had motors which ejected the disk.  Even in the new systems they give one a nasty warning if the device is removed before ejecting.

 

Of course with these files in the drive (or folder) Some of the OS 'features' are disabled.

 

As for formatting: do not use the OS format programs.  These can mess up the wear leveling blocks and cause data corruption. The SD Association makes a formatting program which correctly allocates the blocks based on the card specifications.  This is a free download from https://www.sdcard.org/downloads/formatter_4/eula_windows/ and using it can sometimes recover bricked SD cards.  There is also a Mac version in the left menu.  The docs on the manufacturers' association website, also in the left menu, are well worth a read.

 

Raspberry pi users, have to deal with file corruption, if the device is powered down before being shut down.  This mostly happens when the pi is used headless.  I think this is an inherent weakness in the way these cards are used.

 

 

 


jporter -

 

Thanks for the notes. I have used the DiskRepair utility to reformat SD cards, always trying (not always successfully) to be careful to use MSFAT disk format (it seems to choose the proper format for the disk size). Will take a look at the formatter utility you described. That could be a more certain way of doing it.

 

Cheers

Jim

 

 


 

 


jporter,

thanks for the tool.

I had to do a full format. I tried a quick format at first, as that would keep the data from being removed so that a recovery program should be able to restore things, but only a full format got Windows to recognize the card again.

Next time the card crashes I will first have a go with this tool to see if it can solve the problem.

Till now it was all bench testing. In the coming couple of days I am going to use the logger for real, as we will be going on holiday for a couple of days.

So keeping my fingers crossed it does not crash again.


meslomp wrote:

I had to do a full format, tried a quick format at first as that would save the data from being removed and thus a recover program should be able to restore things, but I could only do a full format in order for windows to recognize the card again.

Next time it happens, don't format the card at all.  Mirror the entire card as a single block, and use fs recovery tools.  Under Linux, there's photorec and testdisk.  Under Windows, there's EaseUS (used to be GDB or Get Data Back) https://www.easeus.com

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 


Joey,

thanks for the link.

I just reformatted the card as it held no other data that at that point was interesting. I made a number of logs, but they were all from the back of my house and thus not interesting.

Used the logger last week and it did not crash again during that use.

If it does crash again I will not format, but will indeed try these tools first.