One Developer's Experience With Real Life Bitrot Under HFS+
New submitter jackjeff (955699) writes with an excerpt from developer Aymeric Barthe about data loss suffered under Apple's venerable HFS+ filesystem. "HFS+ lost a total of 28 files over the course of 6 years. Most of the corrupted files are completely unreadable. The JPEGs typically decode partially, up to the point of failure. The raw .CR2 files usually turn out to be totally unreadable: either completely black or having a large color overlay on significant portions of the photo. Most of these shots are not so important, but a handful of them are. One of the CR2 files in particular is a very good picture of my son when he was a baby. I printed and framed that photo, so I am glad that I did not lose the original."
(Barthe acknowledges that data loss and corruption certainly aren't limited to HFS+; "bitrot is actually a problem shared by most popular filesystems. Including NTFS and ext4." I wish I'd lost only 28 files over the years.)
An old partition of some 20000 files, most of them 10 years or older, in which I found 7 or 8 files - coincidentally jpg images as well - that were corrupted. It struck me as nothing other than filesystem corruption, as the drive was and still is working just fine.
We know how to build good file systems. We have done it for years with ZFS and now Btrfs. Sticking to legacy file systems which are prone to corruption is simply not acceptable. It is about time that legislative authorities make it illegal for Apple and other negligent vendors to ship file systems that are essentially faulty by design. A noticeable fine per corrupted file would be appropriate, with the possibility of prison time upon recurring incidents.
1. All filesystems and/or storage hardware are imperfect - and what do you mean by "integrity check"?! This thing has ROUNDED CORNERS;
2. I don't back up my data adequately;
3. For some reason, 1 is more important than 2.
shouldn't you have backups?
It's the contents of the files that are corrupted. This has nothing to do with HFS+, and everything to do with a lack of redundancy, and a bad hard drive.
Bitrot isn't the fault of the filesystem unless something is badly buggy. It's the fault of the underlying storage-device itself. Attacking HFS+ for something like that is just silly. Now, with that said there are filesystems out there that can guard against bitrot, most notably Btrfs and ZFS. Both Btrfs and ZFS can be used just like a regular filesystem where no parity-information or duplicate copies are saved and in such a case there is no safety against bitrot, but once you enable parity they can silently heal any affected files without issues. The downside? Saving parity consumes a lot more HDD-space, and that's why it's not done by default by most filesystems.
Good backups aren't enough. If the filesystem isn't flagging corruption as it happens, the backup software will happily back up your corrupted data over and over until the last backup which has the valid file in it has expired or become unrecoverable itself.
This is why Apple should resurrect its ZFS project. Overnight they would be the largest ZFS vendor to match with being the largest UNIX vendor.
Trolling is a art,
Sure, a modern filesystem should be designed to catch and possibly work around bit errors, but in the end, hardware which causes that many bit errors is defective and needs to be fixed or replaced. RAM would be my first suspect if there aren't any error messages in SMART or disk related entries in system logs. If the RAM is defective, can you really blame the filesystem? What if the files got corrupted in RAM while you were working on them?
The solution is to not become too attached to data. It's all ephemeral anyway, in the grand scheme of things.
You are welcome on my lawn.
Due to their commanding smartphone marketshare, along with millions of devices with embedded Linux shipped every year, wouldn't Samsung be the largest UNIX vendor?
Oh? What's that? You weren't counting embedded Linux and I'm a pedantic #$(*#$&@!!!. Can't argue with that!
Bitrot is not usually the issue for most files. Sometimes, but it's rare. What I lost is a mayhem repository of hardware and software and human failure. Thanks for backup, life :)
On Bitrot:
- MP3s and M4As I had that suddenly started to stutter and jump around. You play the music and it starts to skip. Luckily I have backups (read on for why I have multiple backups of everything :) ) so when I find them, I just revert to the backup.
- Images having bad sectors like everyone else. Once or twice here or there.
- A few CDs due to CD degradation. That includes one that I really wish I'd still have, as it was a backup of something I lost. However, the CD takes hours to read, and then eventually either balks up or not for the directory. I won't tell you about actually trying to copy the files, especially with normal timeouts in modern OSes or the hardware pieces or whatnot.
Not Bitrot:
- Two RAID Mirror hard drives, as they were both the same company, and purchased at the same time (same batch), in the same condition, they both balked at approximately the same time, not leaving me time to transfer data back.
- An internal hard drive, while I was making backups to CDs (at that time). For some reason I still cannot explain, the software thought my hard drive was both the source and the destination!!!! The computer froze completely after a minute or two, then I tried rebooting to no avail, and my partition block now contained a 700 MB CD image, a quarter full with my stuff. I still don't know how that's possible, but hey, it happened. Since I was actually making my first CD at the time and it was my first backup in a year, I lost countless good files, many of which I gave up on (especially my favorite '90s music video sources, ripped from the original Betacam tapes in 4:2:2 by myself).
- A whole batch of HDs on the Mac, when I tried putting the journal on another internal SSD. I have dozens of HDDs, and I thought it'd go faster to use that nifty "journal on another drive" option. It did work well, although it was hell to initialize, as I had to create a partition for each HDD, then convert them to journaled partitions. Worked awesomely, very quick, very efficient. One day, after weeks of usage, I had to hard power off the computer and its HDDs. When they remounted, they all remounted in the wrong order, somehow using the wrong partition order. So imagine you have perfectly healthy HDDs, each thinking it has to use another HDD's journal. Mayhem! Most drives thought they were other ones, so my music HDD became my photos HDD RAID, and my system HDD thought it was the backup HDD, but only for what was in the journal. It took me weeks with DiskWarrior and Data Rescue to get 99% of my files back (I'm looking at you, DiskWarrior, a 32-bit app not supporting my 9 TB photo drive) from a combination of the original drive files and the backup drive files. It took months to rebuild the Aperture database from that.
- All my pictures from when I met my wife to our first travels. I had them on a computer, and I made a copy for sure. But I cannot find any of that anywhere, no matter where I look. Many computers have come and gone since that time, so I don't know where it could have ended up. But I'm really sad to have lost these.
- Did a paid photoshoot for a unique event. Took four 32 GB cards' worth of priceless pictures. Once done with a card, I was sifting through the pictures on my camera and noticed it had issues reading the card. I removed it immediately. At home, I put the card in my computer, which had all the trouble in the world reading it (but was able to do so), and I was (barely) able to import its contents to Aperture (4-5 pictures didn't make the cut, a few dozen had glitches). The card would then (dramatically, as if it had drawn its last breath after relinquishing its precious data) not read or mount anywhere, not even being recognized as a card by the readers. Kids, use new cards regularly for your gigs :)
- A RAID array b
Bitrot. It's a thing. It's been a thing since at least the very first tape drive - hell, it was a thing with punch cards (when it might well have involved actual rot). While the mechanism changes, every single consumer-level data-storage system in the history of computing has suffered from it. It's a physical phenomenon independent of the file system, and impossible to defend against in software unless the software transparently invokes the one and only defense: redundant data storage. Preferably in the form of multiple redundant backups.
So what is the point of this article?
--- Most topics have many sides worth arguing, allow me to take one opposite you.
It's not a matter of CPU load. Suppose you have one checksum block for every eight data blocks. In order to verify the checksum on read, you have to read the checksum block and all eight data blocks. So you have to read a total of nine blocks instead of one. Reading from the disk is one of the slowest operations in a computer, so doing it nine times instead of once slows things down considerably.
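To make the nine-reads-for-one arithmetic concrete, here is a minimal Python sketch of that layout. It is only an illustration under stated assumptions: the in-memory `CountingDisk`, the CRC32 group checksum, and the "eight data blocks then one checksum block" ordering are all made up for the example and do not describe any real filesystem.

```python
import zlib

GROUP = 8  # data blocks covered by one checksum block (the figure used above)


class CountingDisk:
    """Hypothetical in-memory 'disk' that counts block reads."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.reads = 0

    def read(self, index):
        self.reads += 1
        return self.blocks[index]


def group_checksum(blocks):
    """CRC32 over all blocks of a group, stored as a 4-byte checksum block."""
    crc = 0
    for block in blocks:
        crc = zlib.crc32(block, crc)
    return crc.to_bytes(4, "big")


def build_disk(data_blocks):
    """Lay out GROUP data blocks followed by one checksum block, repeated.

    Assumes len(data_blocks) is a multiple of GROUP.
    """
    layout = []
    for start in range(0, len(data_blocks), GROUP):
        group = data_blocks[start:start + GROUP]
        layout.extend(group)
        layout.append(group_checksum(group))
    return CountingDisk(layout)


def verified_read(disk, data_index):
    """Return one data block, but only after verifying its whole group."""
    group_no, offset = divmod(data_index, GROUP)
    base = group_no * (GROUP + 1)
    group = [disk.read(base + i) for i in range(GROUP)]  # 8 reads
    stored = disk.read(base + GROUP)                     # 1 more read
    if group_checksum(group) != stored:
        raise OSError("checksum mismatch in group %d" % group_no)
    return group[offset]
```

Calling `verified_read` once on a fresh disk leaves `disk.reads` at 9, which is the cost described above. A design that keeps one checksum per data block would not need to read the neighbours, at the price of more checksum storage.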
The real article would be titled "file systems with no data redundancy and no checksums are vulnerable to bitrot".
That covers just about any file system, with the lone exception of ZFS when run on a RAID, maybe Btrfs, and I guess some mainframe stuff.
I apologize for the lack of a signature.
In a footnote he admits that the corruption was caused by hardware issues, not HFS+ bugs, and of course the summary ignores that completely.
So, for that, let me counter his anecdote with my own anecdote: I have an HFS+ volume with a collection of over 3,000,000 files on it. This collection started in 2004, approximately 50 people access thousands of files on it per day, and occasionally after upgrades or problems it gets a full byte-to-byte comparison to one of three warm standbys. No corruption found, ever.
Now THAT is serious.
Now that IS serious.
Now that is SERIOUS.
NOW that is serious.
People talking about "bit rot" usually have no clue, and this guy is no exception.
It's extremely unlikely that a file would become silently corrupted on disk. Block devices include per-block checksums, and you either have a read error (maybe he has) or the data read is the same as the data previously written. As far as I know, ZFS doesn't help to recover data from read errors. You would need RAID and / or backups.
Main memory is the weakest link. That's why my next computer will have ECC memory. So, when you copy the file (or otherwise defragment or modify the file, etc), you read a good copy, some bit flips in RAM, and you write back corrupted data. Your disk receives the corrupted data, happily computes a checksum, therefore ensuring you can read back your corrupted data faithfully. That's where ZFS helps. Using checksumming scripts is a good idea, and I do it myself. But I don't have auto-defrag on Linux, so I'm safer : when I detect a corrupted copy, I still have the original.
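For anyone wondering what such a checksumming script might look like, here is a minimal Python sketch. It is not the poster's script; the function names and the "content changed but mtime did not" heuristic are assumptions made for illustration. The idea is to record a SHA-256 and the modification time per file, then later flag files whose hash differs even though the mtime is unchanged, since a legitimate edit normally updates the mtime.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files do not fill RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to (sha256 hex, mtime in ns)."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            manifest[rel] = (file_digest(path), os.stat(path).st_mtime_ns)
    return manifest


def suspect_files(old, new):
    """Files whose content changed while the mtime stayed the same."""
    return sorted(
        rel
        for rel, (digest, mtime) in new.items()
        if rel in old and old[rel][1] == mtime and old[rel][0] != digest
    )
```

In practice you would save the manifest somewhere other than the disk being checked (JSON would do) and compare before each backup run, so a damaged file gets restored instead of being backed up again. Note that this only detects damage; it cannot tell you whether the flip happened on disk or in RAM.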
ext2 was introduced in 1993, and so was NTFS. ext4 is just ext2 updated (ext was a different beast). If anything, HFS+ is more modern, not that it makes a difference. All of them are updated. By the way, I noticed recently that Mac OS X resource forks sometimes contain a CRC32. I noticed it in a file coming from Mavericks.
I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
Anyone who owned a Mac since the 80s remembers having to use Norton Disk Doctor and later DiskWarrior at least once per month to repair the filesystem. Entire folders could go randomly missing each time you booted up your Mac, and if you accidentally lost power to your hard drive, the use of one of those was mandatory.
I've slowly been moving all my systems to Btrfs from least important to most important and have had no problems so far.
ayottesoftware.com
Some people are talking about the fact that bitrot could happen as a result of bad RAM. Are you talking about bad system RAM or the RAM onboard the HDD's controller board?
If it was indeed bad system RAM, wouldn't bad system RAM cause a random BSOD (Windows) or Kernel Panic (Linux)? With how much RAM we use these days it's very likely we're going to be using all of the storage capacity of each of the DIMMs that we have in our systems.
Myself, I have 16 GB of RAM in my Windows machine, and at any moment I'm using at the very least 40% of it, with spikes up to at least 60% depending on what I'm doing at the time. So, with that said, I figure kernel memory structures are going to get corrupted at some point if any memory is bad (even in the less used DIMMs in your system). I'm not sure how the memory in the DIMMs is being used, though. Is it used sequentially (DIMM 0, chip 1... 2... 3... 4, DIMM 1, chip 1... 2... 3... 4, etc.)? Or is the data thrown about randomly on the DIMMs?
Myself, if I had a random BSOD just happen, I'd be running MemTest86+ in a hot second to test my system RAM and asking Corsair (the company that made my DIMMs) for an RMA.
So if it does indeed turn out to be bad system RAM that causes this, I guess it's a good idea not to buy cheap RAM to begin with. Myself, I've never had a problem with Corsair Vengeance RAM modules, so I will continue to buy that line of Corsair memory.
This sounds like actual disk errors. File systems can't do much about them, you really need something like a RAID.
RAM may have a low error rate, much better than HDDs or SDs. That does not mean that you won't have errors, even if you have a good brand and treat it well. Bit-level errors can and do happen all the time without us knowing. Other times one happens in the wrong place and we notice, but think it is something else; it isn't until it gets really bad that we notice the real cause.
For example, say your RAM has a 1% bit loss rate (ignore that this is insanely high). If 90% of what is in memory is data rather than touchy code, the odds are that you will not notice a single bit getting flipped very often. Then there is the fact that RAM could maintain the same error rate over decades of smaller, faster RAM, but now you are storing MORE data and cycling it MORE than was possible on older computers. So, if you had 1 bit error per gigabyte of throughput on a slow 1 MHz computer with 1 MB of RAM, it would take a long time for that bit flip to happen (and if you noticed, you would still be unlikely to blame the RAM), but today we pump through in seconds what that old machine would take a year to move, so the error would occur quite often. Storage has the SAME problem, plus an additional one: it still has the same lifespan requirements, whereas RAM can be refreshed and checked.
Something else to consider: the error correction schemes in use today are being pushed hard by the demand for higher-density storage. Your HD isn't doing Hamming codes or any of those old, simple bit recovery schemes; drives moved beyond that long ago, to the next generation of what your 56k modem was doing to fight phone line noise. They could make it better... but you would be giving up significant storage space. Perhaps somebody with a good marketing scheme and enough upset consumers could get you to pay MORE for less storage space... I know I would buy into it.
Essentially, we are at a point where HDDs expect you to scrub them for errors every year to avoid bit rot, which is what I now do. I haven't detected an error in years. However, the block-level checksums the HDD uses have a false positive rate (just like CRC16 does), and while the odds of a false positive may be low, we are working in the trillions now, up near their limits. (I'm assuming whatever they use now has scaled, but it may not have, which may be why more people are talking about these issues. We know it's unlikely that the industry has adapted evenly to these trends over the decades; it's likely to become a minor problem before they are forced to move devices to a newer proprietary checksum and error correction scheme.)
Do serious work? Use ECC RAM. I'm still waiting for some low-power AM1 motherboard that supports ECC so I can build a ZFS server... the AM1 chip supports ECC but no motherboards do.
Democracy Now! - uncensored, anti-establishment news
There are only two options for reliable data archiving: 1. Spinning disks with redundancy and regular checks 2. Archival grade tape. There used to be MOD as well, but as nobody cared enough to buy it, development stalled and then died. The OP simply was naive and stupid and did not bother to find out how to archive data properly. It is well-known how to do it and has been for a long time. I have not lost a single bit that I care about. Of course, I have a 3-way RAID1 with regular SMART and RAID consistency checks. I have off-site backups that are made with full or at least crypto-hash comparison to the original. I have lost plenty of bits that were not on RAID and I have to replace a disk in that RAID1 about every 1-2 years because of read errors, but none of that is surprising.
In short: The OP is lamenting his own stupidity and he is not even aware of it. Dunning-Kruger effect at work.
And BTW, before I forget: SSDs have worse properties for archiving than spinning disks. As people are generally stupid, I expect the "problem" of bit-rot will get worse. At least as long as people are too lazy to find out how to do things properly or are unwilling to spend the money that doing things right takes.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
The only way to avoid bitrot is with forward error correction such as par2 (pypar2 is a good GUI) and rsbep.
...and you want to prevent those nosy congressmen pawing through your emails looking for felonies...
Lawrence Person (lawrencepersonh@gmailh.com (remove all "h"s to mail)
http://www.lawrenceperson.com/
What is the best overview doc/book out there for covering backup-archiving options?
I want to be more conversant with the subject before starting work with a FileMaker Pro DB consultant.
I will be doing a mission critical but small database, so data storage size won't be an issue as far as existing 1-4TB HDDs go & RAID arrays. Losing a day's or even an hour's data entry is not an option.
If this guy had had that many printed photographs sitting around in physical form, far more than 28 of them would have been destroyed in 6 years of non-professional storage. By any historical criteria this is an unbelievably successful archival medium!
The problem with bit rot is that backups don't help. The corrupted file goes into the backup and eventually replaces the good copy, depending on the retention policy. You need a file system which uses checksums on all data blocks, so that it can detect a corrupted block after reading it and flag the file as corrupted, so that you can restore it from a good backup.
Sounds like this was written by someone who is confused about what backups are for and who hasn't heard of making archive copies. Backups are for recovering from a sudden catastrophe or minor mishaps (recovering a recently lost file). Archive copies are, well, for digging out archived copies of files looooooooooong after their creation.
You should have multiple separate archive copies on different kinds of media, preferably read-only media, and store at least one copy offsite.
An archive copy can be a full clone saved in perpetuity, or it can be made with dedicated archiving software which provides additional features and bookkeeping.
Losing a day's or even an hour's data entry is not an option.
If you have that kind of requirement (less than an hour of lost data), then you are not looking for just backup/archive. You are looking for a fully redundant storage system.
In addition to the backup system, of course.
For reading, check out backupcentral.com, symantec.com (Backup Exec/NetBackup), and emc.com (Avamar, NetWorker).
I once managed a FileMaker database server (v5), and it has a built-in feature to copy the database files for backup. Real simple. I cannot remember if the database had to be taken offline, as we had users only during normal working hours, but these days that should NOT be a requirement.
Simple: It is a "Datasheet" covering an "archival grade medium". If you do not know that, you have absolutely no business working on any kind of "mission critical" storage, as you are simply incompetent with regard to that subject.
IIUC, the main reason for the problem is that the magnetic value of some bits weakens to the point that it cannot be correctly read. Assuming that the actual magnetic coating is not damaged, couldn't we avoid this problem simply by having a low-level low-priority task running that would read/write each block (without actually updating the various file dates)? By reading and then rewriting, the bit values would be reinforced. The task would simply cycle through all blocks on the disk and then start over.
For SSDs, if bitrot is even a problem therein, you would probably want the task to run somewhat infrequently, say once a month?
Simple: It is a "Datasheet" covering an "archival grade medium". If you do not know that, you have absolutely no business working on any kind of "mission critical" storage, as you are simply incompetent with regard to that subject.
Easy, there, big fella. Posting a link to a datasheet would have sufficed. Ain't right to call a man incompetent for asking a question. Truly, an incompetent is one who don't never ask the question assuming he already knows. Credit is due for seeking to learn something.
Take it easy, Charlie, I've got an Angle...
You don't understand, it's gweihir's JOB here to insult other people and call them incompetent!
He has one job, he's gonna do it...
I googled a little bit and came up with this link -
http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/
It talked about bitrot and the next-gen filesystem and according to it, the so-called "next gen" filesystem would detect bitrot
I do not know if it's the case, so I am posting the link here for others to discuss this matter further
You are correct. All computer memory should be ECC. Do not buy non-ECC RAM as this marks you as reckless and shortsighted.
If your ECC RAM is failing you will see warnings in OS logs while your system still operates correctly and stores your data safely. This is your opportunity to replace the RAM.
If you do not have ECC RAM, then failing memory will corrupt an unknown amount of data before you work out that your problems are RAM related.
Manufacturers of non-ECC RAM should be sued. Their products are not "fit for purpose".
GNU stands for GNU's Not UNIX.
Linux is just "Linus with an 'x' at the end to make the name look UNIX-y"
Well, IMHO I'd say it's more important for The Open Group (POSIX) to figure out what role they should play in a world where we don't have a variety of mostly coequal Unixes. Instead we have a rather highly fragmented family of Unixes in Linux, including Android; a very popular desktop Unix that violates most of the Unix norms in spirit in OS X; and the only remaining big-box Unix (AIX), which is more aimed at bringing over cool features from the mainframe. None of them really care about running each other's software. So really the question is: what role should The Open Group play in such a world?
Do any file systems support single-disk parity?
Set a parity ratio depending on risk vs. space loss tolerance. Say it is 1000. You can lose any one of the 1000 bytes in a parity group and recover, while only giving up about 0.1% of your disk space to parity.
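A rough Python sketch of what such a parity group could look like, assuming plain XOR parity (one parity byte per 1000 data bytes, so roughly 0.1% overhead). The function names are made up, and this is not how any particular filesystem implements it.

```python
from functools import reduce


def parity_byte(group: bytes) -> int:
    """XOR of every byte in the group; this single byte is the stored parity."""
    return reduce(lambda a, b: a ^ b, group, 0)


def recover(group: bytes, lost_index: int, parity: int) -> int:
    """Rebuild the byte at lost_index from the other bytes plus the parity.

    Whatever currently sits at lost_index is ignored, so it may be garbage.
    """
    others = group[:lost_index] + group[lost_index + 1:]
    return parity_byte(others) ^ parity
```

One caveat: XOR parity only repairs a byte when you already know which one is bad, for example because the drive reported a read error for that sector. For a silently flipped byte, the parity tells you the group is damaged but not where, so you would still need a checksum or a stronger code (par2-style Reed-Solomon, say) to locate it.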
ALL filesystems and HDs suffer from file corruption over time. It's not even the filesystem's fault, so I'm a little confused why this guy is blaming HFS+. Maybe he didn't want to blame anything else? All media, whether it is CDs, magnetic HDs, flash, etc., will suffer this fate in the end. The best thing the end user can do is scan the HD and files for problems regularly.
Long story short, my PM 9600 still boots just fine, and that's from '97 using HFS+. I've also been using the HFS filesystem since the late 80's (I think), and I've found it to be one of the BEST file systems I have ever had to use on any computer.
If someone wants to come up with a sort of ECC file system, then that would be awesome. But until someone does, you're just going to have to deal with losing a few bits here and there.
Just for clarity, I'm not going to run the backup/archival system as an FMPro consultant will do that.
I need to get some more background, so I have knowledge of where the tradeoffs are. I know this is done all the time, but I'm sure there are still choices to be made.
I dug up the study.
"End-to-end Data Integrity for File Systems: A ZFS Case Study"
Zhang, Rajimwale, Arpaci-Dusseau
Cosmic rays do happen; odds go up as elevation increases. I would guess location also matters.
Further looking turned up this gem:
Google reports that more than 8% of DIMMs are affected by errors each year. Google found that the error rates were several orders of magnitude larger than small-scale studies had shown.
Just some minor corrections in your text.
ZFS always detects bit rot. ZFS can always repair bit rot if it is configured to do so, that is, if some redundancy is used, for instance RAID or a mirror. BUT! You can also configure ZFS to store two (or three) copies of all data on a single disk, so ZFS can repair data using only a single disk. BTRFS can also do this, and it does so by partitioning the disk into two partitions and building a mirror on a single disk. This is extremely cumbersome; ZFS does not do that. You don't need to repartition or anything with ZFS, just specify "copies=2". Done.
BTW, there is no research on whether BTRFS is safe; on the other hand, there are several research projects showing that ZFS is safe. Read the research papers cited in the ZFS Wikipedia article.
I read some guy speculating about a storage solution that would repair a corrupted data block from a given checksum, by trying different candidate data blocks until one matches the checksum. He googled this and it turned out that someone had already tried that. Guess which solution? Yep, ZFS. But that feature was dropped because it took too much time. Pretty cool anyway.
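To give a feel for why that kind of repair is slow, here is a small Python sketch of the general idea: flip one bit at a time and keep the candidate whose hash matches the stored checksum. This is only an illustration of the brute-force approach described above; it is not taken from ZFS, and the function name and the choice of SHA-256 are assumptions.

```python
import hashlib


def repair_single_bit(block: bytes, expected_digest: bytes):
    """Try every single-bit flip; return the repaired block, or None.

    Returns the block unchanged if it already matches the digest.
    """
    if hashlib.sha256(block).digest() == expected_digest:
        return block
    candidate = bytearray(block)
    for i in range(len(candidate)):
        for bit in range(8):
            candidate[i] ^= 1 << bit  # flip one bit
            if hashlib.sha256(candidate).digest() == expected_digest:
                return bytes(candidate)
            candidate[i] ^= 1 << bit  # undo the flip and move on
    return None
```

Even for a single flipped bit this hashes the whole block once per bit, so a 128 KiB block needs about a million hash computations in the worst case, and allowing two flipped bits makes the search grow quadratically. That fits the remark that the approach took too much time.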
"Incompetent" is a state, not an insult. And no, I cannot post any datasheets as I do not know what kind of equipment will be used.
You need to find out the details of the backup/archival system being used. There is no way around that. It cannot be modeled as an opaque component, you need to understand the whole stack.
And it is AC's job to spread lies. So take all he says with a grain of salt.
was what happened to Microsoft coder's socks.. I have that on good authority from someone at Apple..
That should be OK for differential backups (depending on how often you make a "full"). If a file changes by 1 bit, that'll affect a byte. A differential will be written, but it hardly takes any space and you can still roll back.
Makes a good reason to *check* backups every now and then, though.