Ask Slashdot: Practical Bitrot Detection For Backups?
An anonymous reader writes "There is a lot of advice about backing up data, but it seems to boil down to distributing it to several places (other local or network drives, off-site drives, in the cloud, etc.). We have hundreds of thousands of family pictures and videos we're trying to save using this advice. But in some sparse searching of our archives, we're seeing bitrot destroying our memories. With the quantity of data (~2 TB at present), it's not really practical for us to examine every one of these periodically so we can manually restore them from a different copy. We'd love it if the filesystem could detect this and try correcting first, and if it couldn't correct the problem, it could trigger the restoration. But that only seems to be an option for RAID type systems, where the drives are colocated. Is there a combination of tools that can automatically detect these failures and restore the data from other remote copies without us having to manually examine each image/video and restore them by hand? (It might also be reasonable to ask for the ability to detect a backup drive with enough errors that it needs replacing altogether.)"
http://www.quickpar.org.uk/
http://chuchusoft.com/par2_tbb/
A single command will do that:
zpool scrub <pool>
Not all cloud storage is expensive. It's only $4 a month for unlimited backups to CrashPlan.
They also do checksums and versioning and can be set to never remove deleted files from the backup.
I have 12.8TB backed up to them and it's been working great.
Other than that, ZFS can't be beat. I use that as well.
I never archive any significant amount of data without first running this script at the top:
find . -type f -not -name md5sum.txt -print0 | xargs -0 md5sum >> md5sum.txt
It's always good to run md5sum --check right after copying or burning the data. In the past, at least a couple of percent of all the DVDs that I've burned had some kind of immediate data error.
(A while back, I rescanned a couple of hundred old DVDs that I had burned, some of them up to 10 years old, and I didn't find a single additional data error. I think that in a lot of cases where people report that DVDs deteriorate over time, they never had good data on them in the first place and only discovered it later.)
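For anyone who wants the same check in a form that is easy to run against several copies of an archive, here is a minimal Python sketch of the idea. It is an illustration, not the script above: it swaps MD5 for SHA-256, keeps the manifest as an in-memory dict rather than an md5sum.txt file, and the function names (file_digest, build_manifest, verify) are made up for this example.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_digest(full)
    return manifest


def verify(root, manifest):
    """Return the sorted relative paths that are missing or no longer match."""
    bad = []
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_digest(full) != expected:
            bad.append(rel)
    return sorted(bad)
```

Build the manifest once when the archive is created, store it alongside every copy (for example with json.dump), and rerun verify on each copy periodically. Any path it returns is a candidate for restoring from a copy where the same file still matches.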
Bitrot is a myth in modern times. Floppies and cheap-ass tape drives from the 90s had this problem, but anything reasonably modern (GMR) will read what you wrote until mechanical failure.
This isn't just wrong, it's laughably wrong. ZFS has proven that a wide variety of chipset bugs, firmware bugs, actual mechanical failures, etc. are still present and actively corrupting our data. It applies to HDDs and flash. Worse, in most cases this corruption appears randomly over time, so your proposal to verify the written data immediately is useless.
Prior to the widespread deployment of this new generation of check-summing filesystems, I made the same faulty assumption you made: that data isn't subject to bit rot and will reproduce what was written.
ZFS or BTRFS will disabuse you of these notions very quickly. (Be sure to turn on idle scrubbing).
It also appears that the error rate is roughly constant but storage densities are increasing, so the bit errors per GB stored per month are increasing as well.
Microsoft needs to move ReFS down to consumer products ASAP. BTRFS needs to become the Linux default FS. Apple needs to get with the program already and adopt a modern filesystem.
Natural != (nontoxic || beneficial)