Ask Slashdot: Best File System For the Ages?
New submitter Kormoran writes: After many, many years of internet, I have accumulated terabyte HDDs full of software, photos, videos, eBooks, articles, PDFs, music, etc. that I'd like to save forever. The problem is, my HDDs are fine, but some files are corrupting. Some videos show missing keyframes and some photos are ill-colored. RAID systems can protect online data (to a degree), but what about offline storage? Is there a software solution, like a file system or a file format, specifically tailored to avoid this kind of bit rot?
It's pretty sad that in this day and age, only one person has highlighted the relevance of ZFS here, and they're an AC. Someone mod parent up. RAID is borderline necessary if you don't have multiple backups (to recover from random corruption caused by gamma rays from outer space or a butterfly flapping its wings on another continent or whatever), but so far as I know, only ZFS has built-in checksumming to detect/prevent the data corruption in the first place.
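For offline disks that aren't on ZFS, the detection half of that idea can be approximated at the file level. What follows is a minimal Python sketch, not how ZFS itself works (ZFS checksums individual blocks and can repair them from redundant copies; this only tells you that a file changed). The function names and the idea of keeping a path-to-hash manifest are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_changed(root, manifest):
    """List files that are missing or whose digest no longer matches."""
    root = Path(root)
    changed = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p) != expected:
            changed.append(rel)
    return changed
```

Build the manifest once when the archive is written, store a copy of it somewhere other than the archive disk, and rerun `find_changed` every so often. Anything it lists has to be restored from another copy, since a checksum alone cannot repair a file.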
Schrodinger's bit rot. If you never look in the box again after putting the cat in it, you can pretend it lived forever.
Seriously, minimalism is underrated. There is such a thing as too much useless data. It's hard to catalog, it's hard to track, and if you sat down and sorted out what you actually could still use, most of it is probably worthless or something you'd never find the time to use again. You might say "well, it's still worth storing IN CASE I ever find a use for it", but that's a typical data-hoarder sentiment, and it's unsustainable. You can't just keep buying media to store everything and never delete; it's a management nightmare that results in these very issues.
I guarantee you, if you find you've deleted something and actually want to get it back, it's available somewhere on the Internet. If it's NOT, then it's a candidate for keeping. That's how minimalism works.
You've got terabytes of information you will never access again. How about just getting rid of most of it? Pick some subset you want to keep, then buy 3 HDDs and create triple copies of it. Repeat this every year and you'll probably not lose any of the information.
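One thing three copies give you is a tiebreaker: if one copy rots, the other two should still agree. A rough Python sketch of that check is below. It assumes the same file exists at the same relative path on all three drives, and the function names are made up for illustration.

```python
import hashlib
from collections import Counter


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def copies_disagreeing_with_majority(paths):
    """Compare copies of one file by hash.

    Returns the paths whose contents differ from the majority version
    (an empty list if all copies agree), or None if no version is held
    by more than half of the copies.
    """
    digests = {p: sha256_of(p) for p in paths}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(paths) // 2:
        return None  # no majority, so there is no safe way to pick a winner
    return [p for p in paths if digests[p] != winner]
```

Running this over every file before the yearly re-copy means the new set is made from a copy that at least two drives agree on, rather than from whichever drive happens to be plugged in.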
Tell me about a usable Linux distribution that has a fully working ZFS implementation.
I should have an answer for you shortly. Say, in half a decade or so, give or take.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
(there's an undetectable error rate, something along the lines of 1 in 10^20 bytes read or so will have an undetected error)
I just want to call this out because it's so important. That number, 10^20, sounds big, but considering the size of modern drives it's really not.
Randomly picking the WD 8TB Red NAS drive (WD60EFRX), which is designed for consumer RAID, as an example:
The spec sheet says the URE (unrecoverable read error) rate is at worst 1 per 10^14 bits read. However, that drive holds 8 x 10^12 bytes! If you were to read every single byte, you should expect about 0.64 unrecoverable read errors, which (assuming independent errors at that worst-case rate) works out to roughly a 47% chance of hitting at least one.
(8 x 8 (bits per byte) x 10^12) / (1 x 10^14) = 64,000,000,000,000 / 100,000,000,000,000 = 0.64 expected errors, and 1 - e^(-0.64) is about 0.47
Correct my math if I'm wrong, but this should make anyone think twice about using any kind of RAID as a "backup" solution. If you have a disk fail, you have close to a 50/50 chance of hitting a read error during the rebuild process!
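For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes bit errors are independent and occur at the spec-sheet worst-case rate of 1 per 10^14 bits, which is a simplification, and `read_error_stats` is a name made up for illustration. It reports both the expected number of errors for one full read of the drive (the 0.64 figure) and the probability of seeing at least one, which under these assumptions is 1 - (1 - p)^n.

```python
import math


def read_error_stats(capacity_bytes, bits_per_error):
    """Return (expected errors, P(at least one error)) for one full read.

    Assumes each bit fails independently with probability 1/bits_per_error.
    """
    bits_read = capacity_bytes * 8
    p = 1.0 / bits_per_error
    expected = bits_read * p
    # 1 - (1 - p)**n, computed via log1p/expm1 so the tiny p is not lost
    # to floating-point rounding.
    p_any = -math.expm1(bits_read * math.log1p(-p))
    return expected, p_any


expected, p_any = read_error_stats(8e12, 1e14)
print(f"expected read errors per full pass: {expected:.2f}")
print(f"chance of at least one read error:  {p_any:.1%}")
```

Plugging in a 10^15 spec (`read_error_stats(8e12, 1e15)`) shows how much the quoted error rate matters for a drive of the same size.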
Frankly, ZFS-style checksumming is the future of file systems. It has to be for any data you care about.
"What do you despise? By this are you truly known." --Princess Irulan, Manual of Muad'Dib
Concur. File corruption due to "age" will not occur without hard read errors. Also, "ill-coloured photos" likely would not be ill-coloured in the case of actual data corruption, but would have whole blocks of hash in them. The user claims to have multiple terabyte-sized hard drives. Hard drives in this size category used for archival storage are simply not old enough to be suffering data corruption due to age. The only hard drives suffering so are MFM hard drives that the poster likely wouldn't have a clue how to even interface to a current computer. Hard drives used for archival data storage will likely not degrade with age before the interface standard they are based on becomes obsolete. Thus, a perfectly reasonable archival data storage strategy is to simply copy data from one hard drive to a newer (likely much larger and faster) drive when the next generation interface becomes standard, and before the previous generation is totally obsolete. For example, one can still get PATA + SATA USB adapters, SATA + M.2 adapters, etc.
If the user who submitted this question is actually experiencing a problem at all, I suggest it's PEBCAK. A better explanation is that the poster is not actually experiencing any problems, but is simply trying to sound important with inflated claims of reams of data, and that Slashdot has been had.
Further, no person with Slashdot posting authority should have been ignorant of any of the issues in this question that make its legitimacy questionable at best, and certainly not Slashdot worthy in any circumstance.