One Developer's Experience With Real Life Bitrot Under HFS+
New submitter jackjeff (955699) writes with an excerpt from developer Aymeric Barthe about data loss suffered under Apple's venerable HFS+ filesystem. HFS+ lost a total of 28 files over the course of 6 years. Most of the corrupted files are completely unreadable. The JPEGs typically decode partially, up to the point of failure. The raw .CR2 files usually turn out to be totally unreadable: either completely black or having a large color overlay on significant portions of the photo. Most of these shots are not so important, but a handful of them are. One of the CR2 files in particular is a very good picture of my son when he was a baby. I printed and framed that photo, so I am glad that I did not lose the original.
(Barthe acknowledges that data loss and corruption certainly aren't limited to HFS+; "bitrot is actually a problem shared by most popular filesystems. Including NTFS and ext4." I wish I'd lost only 28 files over the years.)
I have an old partition of some 20,000 files, most of them 10 years old or older, in which I found 7 or 8 files (coincidentally JPEG images as well) that were corrupted. It struck me as nothing other than filesystem corruption, as the drive was and still is working just fine.
We know how to build good file systems. We have done it for years with ZFS and now Btrfs. Sticking to legacy file systems which are prone to corruption is simply not acceptable. It is about time that legislative authorities make it illegal for Apple and other negligent vendors to ship file systems that are essentially faulty by design. A noticeable fine per corrupted file would be appropriate, with the possibility of prison time upon recurring incidents.
shouldn't you have backups?
Bitrot isn't the fault of the filesystem unless something is badly buggy. It's the fault of the underlying storage-device itself. Attacking HFS+ for something like that is just silly. Now, with that said there are filesystems out there that can guard against bitrot, most notably Btrfs and ZFS. Both Btrfs and ZFS can be used just like a regular filesystem where no parity-information or duplicate copies are saved and in such a case there is no safety against bitrot, but once you enable parity they can silently heal any affected files without issues. The downside? Saving parity consumes a lot more HDD-space, and that's why it's not done by default by most filesystems.
The point is that there are good file systems that can detect when the storage unit fails, give you an alert and allow you to restore the file from a good backup. Without this feature the corrupted file will just get backed up like any other file and eventually replace the good backup.
Good backups aren't enough. If the filesystem isn't flagging corruption as it happens, the backup software will happily back up your corrupted data over and over until the last backup which has the valid file in it has expired or become unrecoverable itself.
This is why Apple should resurrect its ZFS project. Overnight they would be the largest ZFS vendor to match with being the largest UNIX vendor.
Trolling is a art,
Sure, a modern filesystem should be designed to catch and possibly work around bit errors, but in the end, hardware which causes that many bit errors is defective and needs to be fixed or replaced. RAM would be my first suspect if there aren't any error messages in SMART or disk related entries in system logs. If the RAM is defective, can you really blame the filesystem? What if the files got corrupted in RAM while you were working on them?
The solution is to not become too attached to data. It's all ephemeral anyway, in the grand scheme of things.
You are welcome on my lawn.
Due to their commanding smartphone marketshare, along with millions of devices with embedded Linux shipped every year, wouldn't Samsung be the largest UNIX vendor?
Oh? What's that? You weren't counting embedded Linux and I'm a pedantic #$(*#$&@!!!. Can't argue with that!
Bitrot is not usually the issue for most files. Sometimes, but it's rare. What I lost is a mayhem repository of hardware and software and human failure. Thanks for backup, life :)
On Bitrot:
- MP3s and M4As I had that suddenly started to stutter and jump around. You play the music and it starts to skip. Luckily I have backups (read on for why I have multiple backups of everything :) ) so when I find them, I just revert to the backup.
- Images having bad sectors like everyone else. Once or twice here or there.
- A few CDs due to CD degradation. That includes one that I really wish I'd still have, as it was a backup of something I lost. However, the CD takes hours to read, and then eventually either balks up or not for the directory. I won't tell you about actually trying to copy the files, especially with normal timeouts in modern OSes or the hardware pieces or whatnot.
Not Bitrot:
- Two RAID Mirror hard drives, as they were both the same company, and purchased at the same time (same batch), in the same condition, they both balked at approximately the same time, not leaving me time to transfer data back.
- An internal hard drive, as I was making backups to CDs (at that time). For some reason I still cannot explain, the software thought my hard drive was both the source and the destination! The computer froze completely after a minute or two, then I tried rebooting to no avail, and my partition block now contained a 700 MB CD image, a quarter full with my stuff. I still don't know how that's possible, but hey, it happened. Since I was actually making my first CD at the time and it was my first backup in a year, I lost countless good files, many of which I gave up on (especially my favorite '90s music video sources, ripped from the original Betacam tapes in 4:2:2 by myself).
- A whole batch of HDDs on a Mac when I tried putting the journal on another internal SSD. I have dozens of HDDs, and I thought it'd go faster to use that nifty "journal on another drive" option. It did work well, although it was hell to initialize, as I had to create a partition for each HDD, then convert them to journaled partitions. Worked awesomely, very quick, very efficient. One day, after weeks of usage, I had to hard power off the computer and its HDDs. When they remounted, they all remounted in the wrong order, somehow using the wrong partition order. So imagine you have perfectly healthy HDDs, each thinking it has to use another HDD's journal. Mayhem! Most drives thought they were other ones, so my music HDD became my photos HDD RAID, my system HDD thought it was the backup HDD, but only for what was in the journal. It took me weeks with DiskWarrior and Data Rescue to get 99% of my files back (I'm looking at you, DiskWarrior, as a 32-bit app not supporting my 9TB photo drive) using a combination of the original drive files and the backup drive files. Took months to rebuild the Aperture database from that.
- All my pictures from when I met my wife to our first travels. I had them in a computer, I made a copy for sure. But I cannot find any of that anywhere. Nowhere to be found, no matter where I look. Since that time, many computers happened, so I don't know where it could've been sent. But I'm really sad to have lost these
- Did a paid photoshoot for a unique event. Took four 32GB cards' worth of priceless pictures. Once done with a card, I was sifting through the pictures on my camera and noticed it had issues reading the card. I removed it immediately. At home, I put the card in my computer, which had all the trouble in the world reading it (but was able to do so), and I was (barely) able to import its contents into Aperture (4-5 pictures didn't make the cut, a few dozen had glitches). After that (dramatically, as if it had drawn its last breath after relinquishing its precious data) it would not read or mount anywhere, not even being recognized as a card by the readers. Kids, use new cards regularly for your gigs :)
- A RAID array b
Bitrot. It's a thing. It's been a thing since at least the very first tape drive - hell, it was a thing with punch cards (when it might well have involved actual rot). While the mechanism changes, every single consumer-level data-storage system in the history of computing has suffered from it. It's a physical phenomenon independent of the file system, and impossible to defend against in software unless it transparently invokes the one and only defense: redundant data storage. Preferably in the form of multiple redundant backups.
So what is the point of this article?
--- Most topics have many sides worth arguing, allow me to take one opposite you.
It's not a matter of CPU load. Suppose you have one checksum block for every eight data blocks. In order to verify the checksum on read, you have to read the checksum block and all eight data blocks. So you have to read a total of nine blocks instead of one. Reading from the disk is one of the slowest operations in a computer, so doing it nine times instead of once slows things down considerably.
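The arithmetic in that comment can be written down as a small sketch. It assumes exactly the layout the commenter describes (one checksum block computed over a whole group of eight data blocks); the function names are made up for illustration and do not come from any real filesystem.

```python
def blocks_read_per_request(data_blocks_per_group: int, wanted_blocks: int = 1) -> int:
    """Blocks that must be read to verify `wanted_blocks` taken from one group.

    Assumption: a single checksum block covers the whole group, so the
    entire group plus its checksum block must be read to verify anything.
    """
    if not 1 <= wanted_blocks <= data_blocks_per_group:
        raise ValueError("wanted_blocks must fit inside one group")
    return data_blocks_per_group + 1


def read_amplification(data_blocks_per_group: int, wanted_blocks: int = 1) -> float:
    """Ratio of blocks physically read to blocks the caller asked for."""
    return blocks_read_per_request(data_blocks_per_group, wanted_blocks) / wanted_blocks


if __name__ == "__main__":
    print(read_amplification(8))      # single-block read from a group of eight
    print(read_amplification(8, 8))   # whole-group read
```

The nine-to-one penalty only applies to small random reads under this particular layout. When the whole group is read anyway the overhead drops to 9/8, and designs that keep one checksum per block (as ZFS does in its block pointers) do not need to read neighbouring data blocks at all.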
The real article would be titled "file systems with no data redundancy and no checksums are vulnerable to bitrot".
That covers just about any file system, with the lone exception of ZFS when run on a RAID, maybe Btrfs? And I guess some mainframe stuff.
I apologize for the lack of a signature.
In a footnote he admits that the corruption was caused by hardware issues, not HFS+ bugs, and of course the summary ignores that completely.
So, for that, let me counter his anecdote with my own anecdote: I have an HFS+ volume with a collection of over 3,000,000 files on it. This collection started in 2004, approximately 50 people access thousands of files on it per day, and occasionally after upgrades or problems it gets a full byte-to-byte comparison to one of three warm standbys. No corruption found, ever.
People talking about "bit rot" usually have no clue, and this guy is no exception.
It's extremely unlikely that a file would become silently corrupted on disk. Block devices include per-block checksums, and you either have a read error (maybe he has) or the data read is the same as the data previously written. As far as I know, ZFS doesn't help to recover data from read errors. You would need RAID and / or backups.
Main memory is the weakest link. That's why my next computer will have ECC memory. So, when you copy the file (or otherwise defragment or modify the file, etc), you read a good copy, some bit flips in RAM, and you write back corrupted data. Your disk receives the corrupted data, happily computes a checksum, therefore ensuring you can read back your corrupted data faithfully. That's where ZFS helps. Using checksumming scripts is a good idea, and I do it myself. But I don't have auto-defrag on Linux, so I'm safer : when I detect a corrupted copy, I still have the original.
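For readers wondering what such a "checksumming script" might look like, here is a minimal sketch. It is not the commenter's script; the function names and the manifest layout are assumptions. The idea is to record size, modification time and a SHA-256 digest per file, then flag files whose content changed although size and mtime did not, which is the signature of silent corruption rather than a deliberate edit.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files do not fill RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each relative path under `root` to (size, mtime_ns, digest)."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            rel = os.path.relpath(full, root)
            manifest[rel] = (st.st_size, st.st_mtime_ns, file_digest(full))
    return manifest


def suspect_files(old, new):
    """Files whose digest changed while size and mtime stayed the same."""
    suspects = []
    for rel, (size, mtime, digest) in old.items():
        if rel in new:
            new_size, new_mtime, new_digest = new[rel]
            if (size, mtime) == (new_size, new_mtime) and digest != new_digest:
                suspects.append(rel)
    return sorted(suspects)
```

Run `build_manifest` once, store the result somewhere safe, and compare against a fresh manifest before each backup rotation. Note that this only detects the damage; restoring still needs an intact copy somewhere else.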
ext2 was introduced in 1993, and so was NTFS. ext4 is just ext2 updated (ext was a different beast). If anything, HFS+ is more modern, not that it makes a difference. All of them are updated. By the way, I noticed recently that Mac OS X resource forks sometimes contain a CRC32. I noticed it in a file coming from Mavericks.
I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
I've slowly been moving all my systems to Btrfs from least important to most important and have had no problems so far.
ayottesoftware.com
Anyone who owned a Mac since the 80s remembers having to use Norton Disk Doctor and later DiskWarrior at least once per month to repair the filesystem. Entire folders could go randomly missing each time you booted up your Mac, and if you accidentally lost power to your hard drive, the use of one of those was mandatory.
Oh yes. I remember those days well. Journaled HFS+ fixed that, and for about the last decade the only times I have encountered a corrupted file system on a Mac, that discovery was followed shortly by total failure of the hard disk.
So, what was your fucking point?
Anyone who owned a Mac since the 80s remembers having to use Norton Disk Doctor and later DiskWarrior at least once per month to repair the filesystem. Entire folders could go randomly missing each time you booted up your Mac, and if you accidentally lost power to your hard drive, the use of one of those was mandatory.
No, not "anyone who owned a Mac since the 80s...". My first Mac was a Mac Plus bought in 1987 (IIRC), and I have never used those tools nor experienced the problems you mention.
Some people are talking about the fact that bitrot could happen as a result of bad RAM. Are you talking about bad system RAM or the RAM onboard the HDD's controller board?
If it was indeed bad system RAM, wouldn't bad system RAM cause a random BSOD (Windows) or Kernel Panic (Linux)? With how much RAM we use these days it's very likely we're going to be using all of the storage capacity of each of the DIMMs that we have in our systems.
Myself, I have 16 GB of RAM in my Windows machine, and at any moment in time I'm using at the very least 40% of the RAM in the system, with spikes up to at least 60% depending upon what I'm doing at the time. So with that said, I figure that kernel memory structures being corrupted at some point while using memory (even in the less used DIMMs in your system) is going to happen. I'm not sure how the memory in the DIMMs is being used though. Is it being used sequentially? (DIMM 0, chip 1... 2... 3... 4, DIMM 1, chip 1... 2... 3... 4, etc.) Or is the data thrown about randomly on the DIMMs?
Myself, if I had a random BSOD just happen, I'd be running MemTest86+ in a hot second to test my system RAM and be asking Corsair (the company that made my DIMMs) for an RMA.
So if it does indeed turn out to be bad system RAM that causes this, I guess that it's a good idea not to be buying cheap RAM to begin with. Myself, I've never had a problem with Corsair Vengeance RAM modules, so I will continue to buy that line of Corsair memory.
This sounds like actual disk errors. File systems can't do much about them, you really need something like a RAID.
Sufficiently advanced RAID implementations will carry checksums of those blocks for exactly that purpose.
RAM may have a low error rate, much better than HDDs or SDs. That does not mean that you won't have errors even if you have a good brand and treat it well. Bit-level errors can and do happen all the time without us knowing. Other times one happens in the wrong place and we notice (but think it is something else); it isn't until it gets really bad that we notice the real cause.
For example, say your RAM has a 1% bit loss rate (ignore that this is insanely high). If 90% of your memory holds data rather than touchy code, the odds are that you would not notice 1 bit getting flipped that often. Then there is the fact that RAM could maintain the same error rate over decades of smaller, faster RAM, but now you are storing MORE data and cycling it MORE than was possible on the older computers. So, if you had 1 bit error for every gigabyte of throughput on a slow 1 MHz computer with 1 MB of RAM, it would take a long time for that bit flip to happen (and if you noticed, you'd still not likely blame the RAM). But today, pumping through in seconds what that old machine would take a year to move, the error would occur quite often. SAME problem with storage, but with an additional problem in that drives still have the same lifespan requirements, whereas RAM can be refreshed and checked.
Something else to be considered: the error correction schemes being used today are being pushed by the demand for higher density storage. Your HD isn't doing Hamming codes or any of those old simple bit recovery schemes; drives moved beyond that long ago to the next generation of what your 56k modem was doing to fight phone line noise. They could make it better, but you would be giving up significant storage space. Perhaps somebody with a good marketing scheme and enough upset consumers could get you to pay MORE for less storage space... I know I would buy into it.
Essentially, we are at a point where HDDs expect you to scrub them for errors every year to avoid the bit rot, which is what I now do. I haven't detected an error in years. However, the block-level checksums the HDD uses have a false positive error rate (just like CRC16 does), and while the odds of a false positive may be small, again, we are working in the trillions now, up near its limitations. (I'm assuming whatever they use now has scaled, but it may not have, which is why more people are talking about these issues. We know it's unlikely industry has adapted to the trends evenly over the decades; it will likely become a minor problem before they are forced to change devices to a newer proprietary checksum and error correction scheme.)
Do serious work? Use ECC RAM. I'm still waiting for some low power AM1 motherboard that supports ECC so I can build a ZFS server... the AM1 chip supports ECC but no motherboards do.
Democracy Now! - uncensored, anti-establishment news
ZFS raid does that.
There are only two options for reliable data archiving: 1. Spinning disks with redundancy and regular checks 2. Archival grade tape. There used to be MOD as well, but as nobody cared enough to buy it, development stalled and then died. The OP simply was naive and stupid and did not bother to find out how to archive data properly. It is well-known how to do it and has been for a long time. I have not lost a single bit that I care about. Of course, I have a 3-way RAID1 with regular SMART and RAID consistency checks. I have off-site backups that are made with full or at least crypto-hash comparison to the original. I have lost plenty of bits that were not on RAID and I have to replace a disk in that RAID1 about every 1-2 years because of read errors, but none of that is surprising.
In short: The OP is lamenting his own stupidity and he is not even aware of it. Dunning-Kruger effect at work.
And BTW, before I forget: SSDs have worse properties for archiving than spinning disks. As people are generally stupid, I expect the "problem" of bit-rot will get worse. At least as long as people are too lazy to find out how to do things properly or are unwilling to spend the money that doing things right takes.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Yes, he did it to himself by making assumptions he liked and zero verification whether they hold up in the real world. Now he blames others for his stupidity.
No, there are not. There are data-archival systems that can do this though. This is not a filesystem-layer problem at all.
That does not happen unless you fail to verify the data when placing it on that raid. For bit-rot detection on the disks, the disk-internal checking is more than enough. While the manufacturers state "1 uncorrectable sector in 10^15" read, it is more like "1 in 10^30" undetected faulty sector. And of course, any sane RAID setup includes a full disk data consistency check every 14 days or so. If you place defective data on the RAID, the RAID can do nothing for you.
I would also really recommend to read up on RAID and disk technology, you do not seem to understand how things work.
Bullshit. That is not a RAID-layer task. And the disks do that themselves just fine. Historically, there were actually RAID implementations that did what you describe, but they were scrapped due to various problems. Doing this in RAID is the wrong approach.
...and you want to prevent those nosy congressmen pawing through your emails looking for felonies...
Lawrence Person (lawrencepersonh@gmailh.com (remove all "h"s to mail)
http://www.lawrenceperson.com/
What is the best overview doc/book out there for covering backup-archiving options?
I want to be more conversant with the subject before starting work with a FileMaker Pro DB consultant.
I will be doing a mission critical but small database, so data storage size won't be an issue as far as existing 1-4TB HDDs go & RAID arrays. Losing a day's or even an hour's data entry is not an option.
Yes, there are. There are filesystems that do per-block checksums. If data corruption occurs, it knows about it as soon as it tries to read the block. If it has no redundancy, ZFS will tell you which file is corrupt and suggest restoring it from backup.
Nope, wagnerrp is correct. raidz does exactly what he describes, and your claim that raidz was "scrapped due to various problems" is incorrect.
Anyone who owned a Mac since the 80s remembers having to use Norton Disk Doctor and later DiskWarrior at least once per month to repair the filesystem. Entire folders could go randomly missing each time you booted up your Mac, and if you accidentally lost power to your hard drive, the use of one of those was mandatory.
I think you're confusing generic Disk Repair with rebuilding the Desktop File...
Unless your drives were seriously damaged (floppies thrown in a backpack were always a bad idea no matter where you were), missing icons and whatnot were at the disk catalog level (used by Finder), not the HFS level. Command-Option on disk insert would fix it for me.
In the event of a power outage or something similar, it was always advisable to run Disk First Aid (later versions, System 7.5+ or Mac OS 8.1 maybe?, would run it automatically for you in the event of an unsafe shutdown), but that's just morally equivalent to running an fsck.
Hire a Linux system administrator, systems engineer,
Losing a day's or even an hour's data entry is not an option.
If you have that kind of requirements (less than an hour lost data), then you are not looking for just backup/archive. You are looking for a fully redundant storage system.
In addition to the backup system, of course.
For reading, check up on backupentral.com, Symantec.com (Backup Exec/Netbackup) emc.com (Avamar, networker).
I once managed a FileMaker database server (v5), and it has a built-in feature to copy the database files for backup. Real simple. I cannot remember if the database had to be taken offline, as we had users only during normal working hours, but these days that should NOT be a requirement.
Simple: It is a "Datasheet" covering an "archival grade medium". If you do not know that, you have absolutely no business working on any kind of "mission critical" storage, as you are simply incompetent with regard to that subject.
You have no clue what you are talking about. Simple per-block checksums on the HDDs are already doing that. This is not a filesystem issue. This is also not a subject topic where clueless idiots like you can contribute anything.
Simple: It is a "Datasheet" covering an "archival grade medium". If you do not know that, you have absolutely no business working on any kind of "mission critical" storage, as you are simply incompetent with regard to that subject.
Easy, there, big fella. Posting a link to a datasheet would have sufficed. Ain't right to call a man incompetent for asking a question. Truly, an incompetent is one who don't never ask the question assuming he already knows. Credit is due for seeking to learn something.
Take it easy, Charlie, I've got an Angle...
GNU stands for GNU's Not UNIX.
Linux is just "Linus with an 'x' at the end to make the name look UNIX-y"
+1 this. The only time I've ever had Mac filesystem problems was when there was an unexpected power loss. When you lose power while writing to the drive, bad things can happen. But I've not seen even that since Mac OS 7 or so. :-)
Enable 3D printed prosthetics!
Well, IMHO it's more important for The Open Group (POSIX) to figure out what role they should play in a world where we don't have a variety of mostly coequal Unixes. Instead we have a highly fragmented family of Unixes in Linux (including Android), a very popular desktop Unix in OS X that violates most of the Unix norms in spirit, and the only remaining big-box Unix (AIX), which is more aimed at bringing over cool features from the mainframe. None of them really care about running each other's software. So really the question is: what role should The Open Group play in such a world?
Do any file systems support single disc parity?
Set a parity ratio depending on risk vs. space loss tolerance. Say it is 1000. You can lose any of 1000 bytes in a parity group and recover while only giving up .1% of your disk space to parity.
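A rough sketch of what that proposal could mean in practice, assuming plain XOR parity with one parity byte per group of 1000 data bytes. The function names are hypothetical and this is not how any particular filesystem lays out parity.

```python
from functools import reduce


def xor_parity(group: bytes) -> int:
    """XOR of every byte in the group; this single byte is the stored parity."""
    return reduce(lambda a, b: a ^ b, group, 0)


def recover_byte(group_with_hole: bytes, missing_index: int, parity: int) -> int:
    """Rebuild the byte at `missing_index` from the other bytes and the parity.

    Assumption: the caller already knows WHICH byte is bad, for example
    because the drive reported that location as unreadable.
    """
    others = group_with_hole[:missing_index] + group_with_hole[missing_index + 1:]
    return xor_parity(others) ^ parity


if __name__ == "__main__":
    group = (bytes(range(256)) * 4)[:1000]   # 1000 data bytes
    parity = xor_parity(group)               # 1 extra byte, roughly 0.1% overhead
    damaged = bytearray(group)
    damaged[417] = 0                         # pretend this byte was lost
    print(recover_byte(bytes(damaged), 417, parity) == group[417])
```

Two caveats apply to this sketch. XOR parity on its own cannot say which byte went bad, so it has to be paired with a checksum or a drive-reported read error. And a failing sector takes out hundreds or thousands of adjacent bytes at once, so the members of a parity group would have to be spread across different sectors for the scheme to help.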
Just for clarity, I'm not going to run the backup/archival system as an FMPro consultant will do that.
I need to get some more background, so I have knowledge of where the tradeoffs are. I know this is done all the time, but I'm sure there are still choices to be made.
So, what was your fucking point?
He was just thinking back the ole times.
The irony of your calling someone else clueless...
Drives do indeed have checksums on their blocks. That does not prevent them from sometimes feeding you back garbage anyway -- see misdirected and phantom reads and writes. Since ZFS uses a self-validating merkle tree, whereas disk checksums live in the same block as the data, ZFS is largely immune to this problem.
If you've worked with disks any length of time, as in actually trying to write a robust filesystem, you'd know that disks sometimes lie. They usually work but every now and then they do the most ridiculous things, due to mechanical, electrical or firmware problems. That's why filesystems like ZFS were created (what, you thought Sun spent man-decades of expert time on it for giggles?). kthreadd is correct.
Please just stay away from storage. The topic is much more complicated than you make it out to be.
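The phantom-write case mentioned above can be illustrated with a toy model. This is a sketch under stated assumptions, not ZFS code: `ToyDisk` stands in for a drive whose per-sector checksum lives next to the data, and `make_pointer` / `verified_read` stand in for a parent block that records the checksum of the child it points to.

```python
import hashlib


def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


class ToyDisk:
    """Each sector stores its data together with a checksum of that data."""

    def __init__(self):
        self.sectors = {}

    def write(self, addr: int, data: bytes) -> None:
        self.sectors[addr] = (data, digest(data))

    def read(self, addr: int) -> bytes:
        data, embedded = self.sectors[addr]
        if digest(data) != embedded:
            raise IOError("sector checksum mismatch")
        return data


def make_pointer(addr: int, data: bytes) -> dict:
    """What the parent remembers: where the child is and what it should hash to."""
    return {"addr": addr, "checksum": digest(data)}


def verified_read(disk: ToyDisk, pointer: dict) -> bytes:
    data = disk.read(pointer["addr"])
    if digest(data) != pointer["checksum"]:
        raise IOError("parent checksum mismatch: stale or misplaced block")
    return data


if __name__ == "__main__":
    disk = ToyDisk()
    disk.write(7, b"old contents")
    # The filesystem issues a write of new data, but the drive silently drops it.
    pointer = make_pointer(7, b"new contents")
    print(disk.read(7))              # the sector-level check passes on stale data
    try:
        verified_read(disk, pointer)
    except IOError as exc:
        print("caught:", exc)
```

The stale sector is internally consistent, so a checksum stored alongside the data has nothing to complain about. Only a checksum kept somewhere else, here in the parent pointer, can tell that the block is not the one that was supposed to be there.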
Uh huh. As somebody who uses raidz, and has a decent high-level idea of how it works, I'm going to say you're full of shit.
Still pathetic. And no, you have absolutely no clue. Do you even know what an ECC is and how low the probability of it not detecting an error is for HDDs? And while you are looking that up, look up the Dunning-Kruger effect as well.
I don't know why all your posts are so focused on the drive's internal checksums not detecting errors. As you say, that's very rare. A far more common occurrence, and one that I've seen many times, is a drive detecting corrupt data and being unable to correct it. At that point, it's up to the filesystem to use whatever redundancy you've provided (be it duplication or parity) to recover the lost data. The error correction on a drive can't do squat if a block is sufficiently corrupt.
You act as if the drive missing corruption is the problem. It's not.
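The recovery path described in this comment (the drive reports a block as unreadable, and the layer above falls back on redundancy) might look roughly like the following. It is a simplified sketch with hypothetical names, using a CRC32 as the stored checksum and `None` to represent a copy the drive refused to return.

```python
import zlib


def read_with_repair(copies, expected_crc):
    """Return good data from a list of replicas and heal the bad ones in place.

    copies: list whose entries are bytearray, or None when the drive
            reported an unrecoverable read error for that replica.
    """
    good = None
    for replica in copies:
        if replica is not None and zlib.crc32(replica) == expected_crc:
            good = bytes(replica)
            break
    if good is None:
        raise IOError("no intact copy left; restore from backup")
    # Rewrite every missing or mismatching replica from the good one.
    for i, replica in enumerate(copies):
        if replica is None or zlib.crc32(replica) != expected_crc:
            copies[i] = bytearray(good)
    return good


if __name__ == "__main__":
    payload = b"family photo bytes"
    crc = zlib.crc32(payload)
    mirror = [None, bytearray(payload)]   # first drive returned a read error
    print(read_with_repair(mirror, crc) == payload)
    print(mirror[0] == mirror[1])         # the bad side has been rewritten
```

With a single copy and no parity the same function can only raise, which matches the point made elsewhere in the thread: detection without redundancy tells you to go to your backups, nothing more.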
All the Macs I've owned have always been my main personal computer, and the first couple were my only computer at the time. I did everything on them: schoolwork, gaming, stuff for my dad's office and for others, etc. Looking back, I believe I spent way more time with them than I should have.
Did I experience system crashes with the dreaded bomb box? Yes, plenty of them. Did I experience sad Macs? Yes, occasionally. (I believe it was supposed to appear on hardware failure, but after restarting the computers continued to hum along for years). I never owned (nor pirated) a copy of Norton Disk Doctor, although I did see it running on other people's computers.
It's not my fault that my experience differs from yours.
I dug up the study.
"End-to-end Data Integrity for File Systems: A ZFS Case Study"
Zhang, Rajimwale, Arpaci-Dusseau
Cosmic rays do happen; odds go up as elevation increases. I would guess location also matters.
Further looking provided this gem:
Google reports that more than 8% of DIMMs are affected by errors each year. Google found that the error rates were several orders of magnitude larger than small-scale studies showed.
Just some minor corrections in your text.
ZFS always detects bit rot. ZFS can always repair bit rot if it is configured to do so, that is, if some redundancy is used, for instance raid or mirror. BUT! You can also configure ZFS to store two (or three) copies of all data on a single disk, so ZFS can repair data using only a single disk. BTRFS can also do this, and it does so by partitioning the disk into two partitions and building a mirror on a single disk. This is extremely cumbersome; ZFS does not do that. You don't need to repartition or anything with ZFS, just specify "copies=2". Done.
BTW, there is no research on whether BTRFS is safe; on the other hand, there are several research projects showing that ZFS is safe. Read the research papers cited in the ZFS Wikipedia article.
I read some guy speculating about a storage solution that would repair a corrupted data block from a given checksum by trying different candidate data blocks until one matches the checksum. He googled this and it turned out that someone had already tried that. Guess which solution? Yep, ZFS. But that approach was dropped because it took too much time. Pretty cool anyway.
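The idea of repairing a block from nothing but its checksum can be sketched as a brute-force search. This is an illustration of the general technique under the assumption that exactly one bit has flipped; it is not taken from ZFS, and nothing in this thread confirms what ZFS actually tried.

```python
import hashlib


def repair_single_bit(block: bytes, expected_digest: bytes):
    """Try every single-bit flip until the SHA-256 matches again.

    Returns the repaired block, the original block if it was already
    intact, or None when no single-bit flip restores the digest.
    """
    if hashlib.sha256(block).digest() == expected_digest:
        return block
    candidate = bytearray(block)
    for i in range(len(candidate)):
        for bit in range(8):
            candidate[i] ^= 1 << bit          # flip one bit
            if hashlib.sha256(candidate).digest() == expected_digest:
                return bytes(candidate)
            candidate[i] ^= 1 << bit          # undo the flip
    return None


if __name__ == "__main__":
    original = bytes(range(64))
    wanted = hashlib.sha256(original).digest()
    damaged = bytearray(original)
    damaged[10] ^= 0x10
    print(repair_single_bit(bytes(damaged), wanted) == original)
```

The cost is why this stays a curiosity. A 128 KiB block has about a million bits, so a single flip needs up to about a million hash computations, and two flipped bits need on the order of hundreds of billions. A damaged sector usually loses far more than one bit.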
My claim is exactly the other way round. The claim by others was that extra error detection on RAID layer was needed. It is not.
"Sufficiently advanced RAID implementations will carry checksums of those blocks for exactly that purpose." is wrong. That is all I am saying. RAID does not carry block checksums because they are not needed. RAID may carry redundancy in several different forms, but redundancy (even ECC) is not "checksums".
What people here seem to completely miss is that filesystem-level data checksums are not there to detect corruption on the disk. The disk does that just fine. They are there to detect data corruption due to corruption in the path from main memory to the disk and back from it.
"Incompetent" is a state, not an insult. And no, I cannot post any datasheets as I do not know what kind of equipment will be used.
You need to find out the details of the backup/archival system being used. There is no way around that. It cannot be modeled as an opaque component, you need to understand the whole stack.
And it is AC's job to spread lies. So take all he says with a grain of salt.
ZFS definitely can detect bit rot! It's designed to do so.
OK, so you're saying that manufacturers are wrong by f'n 15 orders of magnitude when they say "less than 1 uncorrectable error for every 10^14 bits read". That's such an immense amount of error that you're basically accusing them of being incompetent, possibly even of fraud. Next you're also proposing that consumer hard drives have a bit error rate 13 orders of magnitude less than that claimed by LTO tape manufacturers. You're like the drunk guy running into walls, tripping over himself, shouting and pissing himself, while bitching about everyone else having craptastic balance and smelling like alcohol and urine and talking way too loudly.
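To put the quoted "1 uncorrectable error for every 10^14 bits read" figure into more familiar units, here is a small worked calculation. It treats the spec as an average rate with independent errors, which is a simplifying assumption; real drives do not fail that uniformly, and the function names are made up for this example.

```python
import math


def bytes_per_expected_error(bits_per_error: float) -> float:
    """Amount of data read, in bytes, per one expected unrecoverable error."""
    return bits_per_error / 8


def prob_at_least_one_error(bits_read: float, bits_per_error: float) -> float:
    """Poisson-style estimate of seeing one or more errors while reading."""
    return -math.expm1(-bits_read / bits_per_error)


if __name__ == "__main__":
    spec = 1e14                                   # bits per error, as quoted
    print(bytes_per_expected_error(spec) / 1e12)  # terabytes per expected error
    full_read_4tb = 4e12 * 8                      # bits in one pass over a 4 TB drive
    print(round(prob_at_least_one_error(full_read_4tb, spec), 3))
```

Under these assumptions 10^14 bits is 12.5 TB, so a single full read of a 4 TB drive would hit at least one unrecoverable sector a bit more than a quarter of the time if the drive performed exactly at the quoted limit. These are reported read errors, not silent ones, which is the distinction the two sides of this subthread are arguing about.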
was what happened to Microsoft coder's socks.. I have that on good authority from someone at Apple..
Stuff disappears on me all the time ;-)
That should be OK for differential backups (depending on how often you make a "full"). If a file changes by 1 bit, that'll affect a byte. A differential will be written, but it hardly takes any space and you can still roll back.
Makes a good reason to *check* backups every now and then though.