Slashdot Mirror


Ask Slashdot: Best File System For the Ages?

New submitter Kormoran writes: After many, many years of internet, I have accumulated terabyte HDDs full of software, photos, videos, eBooks, articles, PDFs, music, etc. that I'd like to save forever. The problem is, my HDDs are fine, but some files are corrupting. Some videos show missing keyframes and some photos are ill-colored. RAID systems can protect online data (to a degree), but what about offline storage? Is there a software solution, like a file system or a file format, specifically tailored to avoid this kind of bit rot?

475 comments

  1. Stone tablet and chisel by Anonymous Coward · · Score: 5, Funny

    I prefer to chisel the 0s and 1s into a stone tablet. Very secure, no bit rot.

    1. Re: Stone tablet and chisel by Anonymous Coward · · Score: 0

      We have a winner.

    2. Re:Stone tablet and chisel by TheFakeTimCook · · Score: 1

      I prefer to chisel the 0s and 1s into a stone tablet. Very secure, no bit rot.

      Reasonably akin to that, and a helluva lot more convenient, we have:

      https://en.wikipedia.org/wiki/...

      Now, finding a READER in a thousand years...

    3. Re: Stone tablet and chisel by Anonymous Coward · · Score: 2, Informative

      Damn beat me to it!
      Stone is secure!

      Dude, if your hard drives were fine, your files wouldn't be corrupted. Keep RAID backups if you want a solution. The file system doesn't make a Fing difference.

    4. Re: Stone tablet and chisel by I'm+just+joshin · · Score: 4, Funny

      I give you the 15
      *drops one*
      10 commandments...

    5. Re:Stone tablet and chisel by Anonymous Coward · · Score: 0

      I prefer to chisel the 0s and 1s into a stone tablet. Very secure, no bit rot.

      Weathering ?

    6. Re:Stone tablet and chisel by DontBeAMoran · · Score: 2

      What about bit fungus?

      --
      #DeleteFacebook
    7. Re:Stone tablet and chisel by magarity · · Score: 4, Funny

      Yes, weathering. That is why casting in bronze is vastly superior to mere chiseling in stone.

    8. Re:Stone tablet and chisel by CaptainDork · · Score: 1

      Why in simple hell is this not modded +1, Funny?

      --
      It little behooves the best of us to comment on the rest of us.
    9. Re:Stone tablet and chisel by Anonymous Coward · · Score: 0

      Why in simple hell is this not modded +1, Funny?

      Because it isn't funny. Every time someone asks about long term storage on Slashdot (and it has been asked many times) someone says "stone tablet". It wasn't even funny the first time.

    10. Re:Stone tablet and chisel by hduff · · Score: 1

      M-disk sounds nice, but I have not had any success using it with Linux, although it "should" work.

      --
      "I believe in Karma. That means I can do bad things to people all day long and I assume they deserve it." : Dogbert
    11. Re:Stone tablet and chisel by rrohbeck · · Score: 1

      But you need a lot of slaves for a decent data rate. And some of them need to be good at math to do the error correction.

    12. Re: Stone tablet and chisel by Anonymous Coward · · Score: 0

      lol it sounds like you know nothing about file systems.

    13. Re:Stone tablet and chisel by Anonymous Coward · · Score: 0

      Because it is false.
      Stone tablets wear out and a lot of information written in stone has been lost despite archeologists doing their best to retrieve the information.
      Loss of history is not funny.

    14. Re:Stone tablet and chisel by Joce640k · · Score: 1

      I prefer to chisel the 0s and 1s into a stone tablet.

      I know the OP said "forever" but what he really meant was "until my grandchildren throw all the old crap in the dumpster".

      (assuming he has offspring).

      A much better question is: WTF is an "ill-colored" image?

      (also: "Why didn't you make two or three copies if disks are so cheap?")

      --
      No sig today...
    15. Re: Stone tablet and chisel by Joce640k · · Score: 1

      You jest but there's actually over 300 commandments in Exodus 20-31.

      The only place where it says "Ten Commandments" in The Bible is Exodus 34:28.

      The commandments in Exodus 34 are more "Do not cook a young goat in its mother’s milk" than "Thou shalt not kill" but don't tell the Christians because it upsets them.

      --
      No sig today...
    16. Re:Stone tablet and chisel by Anonymous Coward · · Score: 0

      Chiseling is too laborious.

      Clay tablets are the way to go.
      Program your dot-matrix printer for cuneiform also, too.

    17. Re:Stone tablet and chisel by hawkinspeter · · Score: 1

      It's not generally a good idea using trade secret and patent protected products for long term data archival. Eventually, the patents will expire, but in the short term, it restricts people from implementing and testing it and so you'll probably only discover the issues after it's too late.

      --
      You're a temporary arrangement of matter sliding towards oblivion in a cold, uncaring universe
    18. Re: Stone tablet and chisel by johnsmithperson123 · · Score: 1

      You mean the two commandments, for those who still use Base-10 and not binary.

    19. Re: Stone tablet and chisel by laie_techie · · Score: 2

      You mean the two commandments, for those who still use Base-10 and not binary.

      When the Jews asked Jesus which of the commandments was the most important, he condensed them into Love the Lord thy God with all thy heart and Love thy neighbor as thyself. If you obey these two commandments you won't break the spirit of the traditional 10 given to Moses.

    20. Re: Stone tablet and chisel by Anonymous Coward · · Score: 0

      drops 2 to get 5

    21. Re: Stone tablet and chisel by cant_get_a_good_nick · · Score: 1

      It's good to be the King

    22. Re:Stone tablet and chisel by TheFakeTimCook · · Score: 1

      It's not generally a good idea using trade secret and patent protected products for long term data archival. Eventually, the patents will expire, but in the short term, it restricts people from implementing and testing it and so you'll probably only discover the issues after it's too late.

      Well, therein lies the rub.

      Innovations will almost ALWAYS be Patented for whatever the max term (generally 20 years) allows.

      So, do you deny yourself the POSSIBILITY that something truly IS "the answer" for nearly a quarter-century; or do you just jump in and hope for the best?

    23. Re:Stone tablet and chisel by Shirley+Marquez · · Score: 1

      At least it's only the discs themselves and the writing that are proprietary. A bog-standard optical drive can read them, so we are spared one set of potential problems.

    24. Re: Stone tablet and chisel by saloomy · · Score: 1

      For those who do: http://zfsonlinux.org/

    25. Re: Stone tablet and chisel by WallyL · · Score: 1

      You jest

      No, he's joshing. Subtle difference!

    26. Re:Stone tablet and chisel by rthille · · Score: 1

      I prefer water-jet cutting 1/2" thick plates of titanium.

      --
      Awesome furniture, accessories and cabinetry in Santa Rosa, CA: http://humanity-home.com/
    27. Re:Stone tablet and chisel by martinfb · · Score: 1

      I disagree! My stone backups got eroded when the last civilization allowed climate change and the glacier melted. Damn event eroded my stones, AND the Grand Canyon!

      --
      Self-importance and self-indulgence is the root of ALL evil.
    28. Re: Stone tablet and chisel by Anonymous Coward · · Score: 0

      RAID is not a backup.

    29. Re:Stone tablet and chisel by hawkinspeter · · Score: 1

      The best bet is to go with already tested open standards that anyone can implement, test and evaluate.

      --
      You're a temporary arrangement of matter sliding towards oblivion in a cold, uncaring universe
    30. Re:Stone tablet and chisel by TheFakeTimCook · · Score: 1

      The best bet is to go with already tested open standards that anyone can implement, test and evaluate.

      Maybe that is the "best" bet; but you're treating it like it's the ONLY bet.

      Quit being such an ideologue, you'll live longer.

    31. Re:Stone tablet and chisel by hawkinspeter · · Score: 1

      I was thinking more in terms of practicality - give it 50 years and most proprietary systems are dead in the water. Long term thinking is essential for any kind of long term archiving otherwise you'll just end up transferring the data around every 5-10 years.

      --
      You're a temporary arrangement of matter sliding towards oblivion in a cold, uncaring universe
    32. Re:Stone tablet and chisel by TheFakeTimCook · · Score: 1

      I was thinking more in terms of practicality - give it 50 years and most proprietary systems are dead in the water. Long term thinking is essential for any kind of long term archiving otherwise you'll just end up transferring the data around every 5-10 years.

      Welcome to the world of the Library of Congress. I read an article a few years back about how they have had to do exactly that. And I am SURE they have looked-into some ways of "future-proofing" their data stores...

    33. Re:Stone tablet and chisel by aliquis · · Score: 1

      Google Palmyra ISIS.

    34. Re:Stone tablet and chisel by TWX · · Score: 1

      Bronze is worth more as a material for other uses than it is as a sheet of numbers. Stone is also worth something for uses other than storing numbers, but it is arguably less valuable. That bronze won't survive a century in most cases; the stone might.

      --
      Do not look into laser with remaining eye.
    35. Re: Stone tablet and chisel by Anonymous Coward · · Score: 0

      Btfs ;)

  2. bit rot by Anonymous Coward · · Score: 5, Informative

    zfs

    1. Re:bit rot by Narcocide · · Score: 5, Insightful

      It's pretty sad that in this day and age, only one person has highlighted the relevance of ZFS here, and they're an AC. Someone mod parent up. RAID is borderline necessary if you don't have multiple backups, (to recover from in the event of random corruption caused by gamma rays from outer space or a butterfly flapping their wings on another continent or whatever) but so far as I know, only ZFS has built-in checksumming to detect/prevent the data corruption in the first place.

    2. Re:bit rot by Noryungi · · Score: 1

      zfs

      ZFS is a pretty good solution. Multiple NAS ZFS systems with snapshots and replication are even better.

      I personally like XFS in production (including LVM), but ZFS is hard to beat if bitrot is your #1 concern.

      --
      The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
    3. Re:bit rot by Calibax · · Score: 1

      Like all hardware, disk drives have two states - failed and going to fail. Bitrot will also occur with long term storage, whether you notice or not.

      A self-healing file system with substantial redundancy capabilities like ZFS is the obvious answer.

      However, there are many ways to configure ZFS, and some configurations have better redundancy than others. A misconfigured system would be worse than useless because of the false sense of security. Exactly how many terabytes of data you have also matters for creating the best configuration at lowest cost. Be prepared to spend a little time learning about ZFS before jumping in.

    4. Re:bit rot by tlhIngan · · Score: 4, Informative

      It's pretty sad that in this day and age, only one person has highlighted the relevance of ZFS here, and they're an AC. Someone mod parent up. RAID is borderline necessary if you don't have multiple backups, (to recover from in the event of random corruption caused by gamma rays from outer space or a butterfly flapping their wings on another continent or whatever) but so far as I know, only ZFS has built-in checksumming to detect/prevent the data corruption in the first place.

      No, RAID is not sufficient to prevent bit-rot. In fact, RAID can accelerate it. You see, using a redundant mode like 1, 5, 6, most controllers (software and hardware) will only read enough disks to get the data, 1 drive in the case of RAID1, N-1 for RAID5 and N-2 for RAID6 (the non-parity ones, to save a parity calculation). But the drives can return bit errors - it's rare, but it does happen (there's a undetectable fault error rate, something along the lines of 1 in 10^20 bytes read or so will have an undetected error). And this the RAID controller will happily return to you since it didn't check the redundant drives to verify correctness. And it's possible it gets written back corrupted, thus causing corruption.

      You really need something like ZFS which puts a checksum on every file and verifies it, so if it does get an error it can resolve it.

    5. Re:bit rot by Anonymous Coward · · Score: 0

      It's pretty sad that in this day and age, only one person has highlighted the relevance of ZFS here, and they're an AC. Someone mod parent up. RAID is borderline necessary if you don't have multiple backups, (to recover from in the event of random corruption caused by gamma rays from outer space or a butterfly flapping their wings on another continent or whatever) but so far as I know, only ZFS has built-in checksumming to detect/prevent the data corruption in the first place.

      Tell me about a usable linux distribution that has a fully working zfs implementation.

    6. Re:bit rot by __aaclcg7560 · · Score: 4, Informative

      You really need something like ZFS which puts a checksum on every file and verifies it, so if it does get an error it can resolve it.

      ZFS also has its own flavors of RAID 1/5/6.

    7. Re:bit rot by MightyMartian · · Score: 4, Informative

      Whose to say zfs will be around in a few decades?

      The real solution here is relatively frequent backups, multiple copies in different filesystem and physical formats (ie. flash, hard drive, optical). Over time you just keep moving your file store to the new mediums. I have files that are over twenty five years old now, some of them coming from DOS and Windows 3.1, others from my old original Slackware 3 installs. Along the way some of those files have been on CD-Rs, DVDs, early USB thumb drives, various hard drives running everything from FAT, FAT32, ReiserFS, HPFS, NTFS, ext2 and ext3. And I'll keep on doing that until I drop dead, and I'll leave it up to my family to decide whether they want to keep any of the documents, pictures, music files, videos and so on that I've been collecting.

      At no point do I ever assume a mere file system sitting on one physical and/or logical volume is ever going to do the job of keeping my files available over the long haul. RAID and file systems in all their glory are not intended for that. Multiple physical copies at multiple locations on multiple types of media, that's the only real way to assure your files remain accessible and safe over time.

      --
      The world's burning. Moped Jesus spotted on I50. Details at 11.
    8. Re:bit rot by davidwr · · Score: 3, Insightful

      Tell me about a usable linux distribution that has a fully working zfs implementation.

      I should have an answer for you shortly. Say, in half a decade or so, give or take.

      --
      Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
    9. Re:bit rot by Anonymous Coward · · Score: 1

      ...but so far as I know, only ZFS has built-in checksumming to detect/prevent the data corruption in the first place.

      BTRFS has it too. Now whether or not BTRFS is production ready or not is another debate.

    10. Re:bit rot by Anonymous Coward · · Score: 0

      only one person has highlighted the relevance of ZFS here, and they're an AC.

      I wish people would settle on he / she / ve / ze / whatever, all but a f*cking plural.

    11. Re:bit rot by Anonymous Coward · · Score: 0

      > Whose to say

      Ours!

    12. Re:bit rot by Anonymous Coward · · Score: 0

      Usable is subjective. I have been using ZFS with Gentoo for years.

    13. Re:bit rot by Solandri · · Score: 1

      That would've been my vote a few years ago. But since Oracle demonstrated they're willing to sue people who use software Sun formerly released as open source, ZFS is dead. Nobody is going to touch it with a 10 ft pole, and Oracle has shown little interest in continuing to develop it.

      We're just gonna have to wait for the great features in ZFS to be re-implemented in some other filesystem, free of Oracle's clutches.

    14. Re: bit rot by X86BSD · · Score: 1

      Stop using inferior solutions. FreeBSD and illumos use ZFS natively and have for many years.

    15. Re:bit rot by shellster_dude · · Score: 3, Informative

      ZFSonLinux works just fine. I've got a huge Debian based NAS running it. Sure, you can't really boot to it, but who cares? I can reinstall the base OS, the irreplaceable part is my personal files which are on a ZFS RAIDZ2 with an encrypted cloud backup.

    16. Re: bit rot by X86BSD · · Score: 0

      This is quite possibly the most ignorant comment in this entire thread. ZFS is dead? What are you smoking? It's in illumos and FreeBSD. Which is the base for many many of the largest storage v noses on earth. Emc, nexenta, spectra logic. Your idea that zfs is dead should be completely ignored as ignorant.

    17. Re: bit rot by Anonymous Coward · · Score: 0

      The singular for an unknown sex would be "it". Calling someone an it is kind of rude though so when in doubt plural is the way to go. This PSA brought to you by English 101.

    18. Re: bit rot by Anonymous Coward · · Score: 1

      That's bollocks because every sector has a checksum at the hardware level, bit rot would be detected at read time.

    19. Re:bit rot by 0100010001010011 · · Score: 1

      As of Ubuntu 16.04 it's in Ubuntu's main repository. You don't even have to install another repository.

      If you're not Ubuntu you should be intelligent enough to add a repository to what ever distro you do use.

    20. Re:bit rot by Blaskowicz · · Score: 1

      To take a lazy example, Ubuntu 16.04 advertised its ZFS support.

    21. Re:bit rot by Anonymous Coward · · Score: 0

      It's pretty sad that zfs was mentioned over stone tablet and chisel. Zfs costs a lot more money, and time was NOT given as a factor. Stone tablet and chisel easily wins.

    22. Re: bit rot by Anonymous Coward · · Score: 1

      FreeBSD *nix

    23. Re:bit rot by Anonymous Coward · · Score: 0

      So angry, aren't I?

    24. Re:bit rot by Anonymous Coward · · Score: 0

      Ubuntu (since 16.04) is currently the only linux distribution that has pre-compiled ZFS modules built-in. This means you don't have to rebuild the ZFS modules after every kernel update, or hope that DKMS doesn't mess it up like I've experienced with Fedora.

      The FreeBSD implementation is pretty nice; I believe it's the only distribution (out of all the BSD, Linux distros) that supports TRIM if you're using SSDs.

    25. Re:bit rot by Anonymous Coward · · Score: 1

      Whose to say zfs will be around in a few decades?

      It's open source, and the data structures are documented:

      * http://open-zfs.org/wiki/Main_Page

      It's also in FreeBSD, which has been around for a few decades. Also, it's available on Linux, which won't be going away any time soon either. By the time either of these systems are replaced, I'm sure the writing would be on the wall for a while which would give the OP time to migrate.

    26. Re: bit rot by Anonymous Coward · · Score: 0

      Shakespeare used "they" this way.

    27. Re:bit rot by fluffernutter · · Score: 2

      I tried ZFS once on a backup server, my system crumbled under the load. Did the same config with ext4 and it has been running fine since.

      --
      Laws are rules for the court, but merely a bottom bar to hit for life. Think beyond laws in your actions always.
    28. Re:bit rot by Spazmania · · Score: 4, Informative

      He said a filesystem for the ages. While it has wonderful features, ZFS isn't even a filesystem for this age, let alone ages to come. FAT32 and ISOFS are your best bets for being readable 20 years from now.

      Bear in mind that your hard disk checksums each block and returns an error if the block is uncorrectable upon read rather than give you bad data. So, if you're getting bit rot at all then you have a hardware problem.

      With or without a hardware problem you want to be able to recover your data. The answer is par2, such as parchive or QuickPar. Par2 uses a Reed-Solomon code to take a set of source files and produce a set of recovery files such that the original files can be checked for correctness and up to N original files can be corrected where N is the number of recovery files created.

      And that's your answer. A filesystem like FAT32 or ISOFS that's likely to still be implemented in future OSes, plus recovery files that let you rebuild anything that suffers from bit rot.
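
The recovery-file idea can be illustrated with a much simpler code than the Reed-Solomon scheme par2 uses. The sketch below swaps in plain XOR parity: one recovery block protects a set of equal-length blocks and can rebuild any single lost one. The function names and the toy 8-byte "files" are illustrative assumptions, not part of par2.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Build one recovery block protecting a set of equal-length blocks."""
    return xor_blocks(data_blocks)


def recover_missing(surviving_blocks, parity):
    """Rebuild the single lost block from the survivors plus the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    blocks = [b"photo-01", b"video-02", b"ebook-03"]  # toy 8-byte "files"
    parity = make_parity(blocks)
    # Pretend the middle block was lost and rebuild it from the other two.
    rebuilt = recover_missing([blocks[0], blocks[2]], parity)
    print(rebuilt == blocks[1])
```

XOR parity only survives one lost block per parity block, and it has to be told which block is missing. Reed-Solomon codes generalize the same idea so that several recovery blocks can repair several damaged ones.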

      --
      Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
    29. Re: bit rot by Nutria · · Score: 1

      storage v noses on earth

      Eh?

      --
      "I don't know, therefore Aliens" Wafflebox1
    30. Re: bit rot by Anonymous Coward · · Score: 0

      OpenZFS is what lives on. It's not compatible with Oracle's ZFS.

    31. Re:bit rot by Vairon · · Score: 1

      Btrfs and ZFS have metadata and data checksum support.
      XFS has only metadata checksum support.

    32. Re:bit rot by higuita · · Score: 1

      ZFS was built to run on big servers with lots of RAM and battery-protected RAIDs. On whatever OS it runs, it tries to use a lot more resources than other filesystems, especially if you enable dedupe and compression. It is a good filesystem, but a lot of people think it is a good general filesystem for all users, which it is not. It can be used, but it is not light; it works best in dedicated fileservers.

      --
      Higuita
    33. Re:bit rot by networkBoy · · Score: 1

      OR you can store as PAR files.

      --
      whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
    34. Re:bit rot by fluffernutter · · Score: 1

      Ok so wouldn't that pretty much exclude it from being used for backups? Who is going to want to buy a big server to load hard drives with data?

      --
      Laws are rules for the court, but merely a bottom bar to hit for life. Think beyond laws in your actions always.
    35. Re:bit rot by pak9rabid · · Score: 1

      This answer is the correct answer.

    36. Re:bit rot by Anonymous Coward · · Score: 0

      > Whose to say

      Who's*

    37. Re:bit rot by KingMotley · · Score: 1

      Except most RAID controllers also have things like patrolling reads (bad sectors) and consistency checks (mismatched data/bitrot) to catch these types of issues. Granted, it's not a function of RAID itself, but most decent implementations include them.

    38. Re: bit rot by X86BSD · · Score: 1

      How dare you mention btrfs and ZFS in the same breath. How dare you sir!

    39. Re: bit rot by X86BSD · · Score: 1

      Vendors. That was supposed to be vendors. Auto correct is the bane of forum threads!

    40. Re: bit rot by Anonymous Coward · · Score: 0

      Obviously ntfs is the best.

    41. Re:bit rot by Anonymous Coward · · Score: 0

      PAR2 is another option.

      Just be prepared to dedicate recovery space.

      Another option is to prune and delete stuff. Do you *really* need that funny vid you saw 15 years ago? Do you really need 6 copies of an application? So consider using duplicate file finder programs to root out extras.

      I have been blowing away extra stuff on my 'slush drive'. From 4TB down to ~500MB. Of that I suspect I can blow most of it away.

    42. Re:bit rot by nmb3000 · · Score: 4, Insightful

      (there's a undetectable fault error rate, something along the lines of 1 in 10^20 bytes read or so will have an undetected error)

      I just want to call this out because it's so important. That number, 10^20, sounds big, but considering the size of modern drives it's really not.

      Randomly picking the WD 8TB Red NAS drive (WD60EFRX), which is designed for consumer RAID, as an example:

      The spec sheet says the URE (unrecoverable read error) rate is at worst 1 per 10^14 bits read. However, that drive holds 8 x 10^12 bytes! If you were to read every single byte, you would expect about 0.64 unrecoverable read errors, or roughly a 47% chance of at least one if errors are independent.

      (8 x 8 (bits per byte) x 10^12) / (1 x 10^14) = 64,000,000,000,000 / 100,000,000,000,000 = 0.64

      Correct my math if I'm wrong, but this should make anyone think twice about using any kind of RAID as a "backup" solution. If you have a disk fail you have close to a 50/50 chance of introducing corrupt data during the rebuild process!
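
This kind of estimate is easy to script for other drive sizes. A minimal sketch, assuming read errors are independent and that the spec-sheet rate applies uniformly to every bit (both simplifications); the function names are illustrative:

```python
import math


def expected_read_errors(capacity_bytes, bits_per_error):
    """Expected number of unrecoverable read errors in one full read."""
    return capacity_bytes * 8 / bits_per_error


def prob_at_least_one_error(capacity_bytes, bits_per_error):
    """Chance of at least one error in a full read, assuming independent bits."""
    bits = capacity_bytes * 8
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))


if __name__ == "__main__":
    capacity = 8e12  # 8 TB drive, in bytes
    rate = 1e14      # one error per 10^14 bits read (spec-sheet worst case)
    print(expected_read_errors(capacity, rate))
    print(prob_at_least_one_error(capacity, rate))
```

For the 8 TB example, 0.64 is the expected number of errors per full read. The probability of at least one error under these assumptions is 1 - e^-0.64, or about 0.47.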

      Frankly, ZFS-style checksumming is the future of file systems. It has to be for any data you care about.

      --
      "What do you despise? By this are you truly known." --Princess Irulan, Manual of Muad'Dib
      /)
    43. Re:bit rot by ChrisMaple · · Score: 1

      ZFS is the file system. Stone tablet is the medium. Use ZFS on stone tablets.

      --
      Contribute to civilization: ari.aynrand.org/donate
    44. Re:bit rot by Anonymous Coward · · Score: 0

      Raid controllers are generally discouraged. Lots of RAM is less than most people would put in a new desktop these days, not super relevant if you just want an offline disk protected.

      You can do a single disk ZFS pool with multiple copies enabled as an offline backup for protection against bit rot.

      https://blogs.oracle.com/relling/entry/zfs_copies_and_data_protection

    45. Re:bit rot by Nostalgia4Infinity · · Score: 1

      I'm running ZFS on a atom based motherboard with 8gb ram. It runs amazing.

      FreeBSD merlin.bbridg01.fl.comcast.net 10.3-RELEASE-p5 FreeBSD 10.3-RELEASE-p5 #0 r301477: Mon Jun 6 05:19:08 EDT 2016 root@merlin.bbridg01.fl.comcast.net:/usr/obj/usr/src/sys/MERLIN amd64
      CPU: Intel(R) Atom(TM) CPU N2800 @ 1.86GHz (1866.78-MHz K8-class CPU)
      Origin="GenuineIntel" Id=0x30661 Family=0x6 Model=0x36 Stepping=1
      real memory = 8589934592 (8192 MB)
      avail memory = 8264859648 (7881 MB)

      root@merlin:/usr/local/sbin # zpool status
      pool: tank2
      state: ONLINE
      scan: scrub in progress since Mon Mar 6 20:38:02 2017
      3.51G scanned out of 652G at 58.0M/s, 3h11m to go
      0 repaired, 0.54% done
      config:
      NAME STATE READ WRITE CKSUM
      tank2 ONLINE 0 0 0
      ada1 ONLINE 0 0 0
      errors: No known data errors

    46. Re:bit rot by Anonymous Coward · · Score: 1, Informative

      No it isn't. It's completely wrong on a number of things.

      1. "ZFS isn't even a filesystem for this age" - WTF does that even mean? The software exists and is open source; it runs on BSD which is one of the most conservative OSes around and is used by many companies who take data storage seriously. It'll be around for decades.

      2. "if you're getting bit rot at all then you have a hardware problem" - Wrong. Bit-rot is an issue inherent to any storage medium and will, with some low-but-not-low-enough-to-ignore-on-large-datasets probability occur on any hard drive. If you think a hard drive where bit-rot can occur is faulty then there isn't a functioning hard drive on the planet.

      3. "With or without a hardware problem you want to be able to recover your data. The answer is par2" - Or just use ZFS (or BTRFS) which has the same feature built into the filesystem so you don't have to manually generate recovery files every time your data changes.

    47. Re:bit rot by ShanghaiBill · · Score: 2

      Whose to say zfs will be around in a few decades?

      Why wouldn't it be? The only things that could wipe out all implementations of a widely used format like ZFS would be nuclear war or an ELE asteroid strike. In either event, reading disk drives would be the least of your problems.

    48. Re:bit rot by fnj · · Score: 1

      Actually, because of its design, notably the COW principles, ZFS shines in its resistance to power interruptions and system crashes. I have a 36 TB ZFS server and a 21 TB ZFS server. Both run 24x7, no UPS, and have ridden out quite a few power interruptions over the years, with never a bit of data loss. Reboot requires no fsck. ZFS doesn't even have an fsck. It is IMPOSSIBLE for it to get corrupted. The most you can lose is some very recent data that was in the actual process of being written out at the moment of power interruption. All that happens is that the volume state gets rolled back to a few seconds previous, with complete consistency.

      I experienced a horrific complete volume data loss with XFS under the same conditions. I will never use that garbage again.

      BTW, ZFS compression does not use significant RAM. Dedupe, on the other hand, DOES. It's virtually never a viable option; CERTAINLY not for a home user who does not have HUNDREDS or THOUSANDS of GB of RAM in his server.

    49. Re:bit rot by fnj · · Score: 1

      You're not limited to a big server to run ZFS. Yes, it gives you a whole lot of advantages if you do, but you still get the massive benefit of data checksumming and incredibly good power loss data protection even with a single disk.

    50. Re:bit rot by grcumb · · Score: 4, Funny

      (there's a undetectable fault error rate, something along the lines of 1 in 10^20 bytes read or so will have an undetected error)

      I just want to call this out because it's so important. That number, 10^20, sounds big, but considering the size of modern drives it's really not.

      Vhrist, you guys. Why so p[aranoid? FAT has been workking just fine since day one, and there's not reason to beliveve it won't keep workingn that way for

      --
      Crumb's Corollary: Never bring a knife to a bun fight.
    51. Re:bit rot by Anonymous Coward · · Score: 0

      As long as you're not using the built-in RAID-5/6 options, it's fine.

    52. Re: bit rot by Anonymous Coward · · Score: 0

      Yea, verily.

    53. Re:bit rot by fluffernutter · · Score: 1

      Except back to my original comment. I had 5-6 disks and it was already bringing my system to a halt. So yes, apparently you do need a big server.

      --
      Laws are rules for the court, but merely a bottom bar to hit for life. Think beyond laws in your actions always.
    54. Re:bit rot by dbIII · · Score: 1

      ZFSonLinux works just fine

      Currently performance sucks compared with ZFS on other platforms. That is changing but for now "works just fine" means doing file operations eventually instead of running smoothly.
      For a single user on a desktop PC it kind of works. For a file server used for staging backups, kind of gets there in the end. For a multi-user fileserver with even low usage, it can't keep up, switch to *bsd and try again after the linux port is more mature.

    55. Re: bit rot by dannys42 · · Score: 2

      Multiple copies may be one solution, but it introduces another problem that doesn't have an elegant solution... you need a tool that can verify the integrity of your data (across the multiple copies). How do you choose which one is "correct" when you migrate and copy to a new system? In addition, how are you sure that any given copy is actually complete? What if you want to permanently delete a file from your archive?
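
One common way to approach the "which copy is correct" question is to hash every copy and trust the content that a strict majority agrees on. The sketch below assumes three or more copies held in memory as bytes; the function names and the media labels in the example are hypothetical.

```python
import hashlib
from collections import Counter


def sha256_hex(data):
    """Hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def pick_majority_copy(copies):
    """Return (name, digest) for a copy backed by a strict majority, else None.

    `copies` maps a label (e.g. the medium it came from) to the file's bytes.
    """
    if not copies:
        return None
    digests = {name: sha256_hex(data) for name, data in copies.items()}
    digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(copies):
        return None  # no strict majority, so a human has to decide
    winner = next(name for name, d in digests.items() if d == digest)
    return winner, digest


if __name__ == "__main__":
    copies = {"hdd": b"holiday.jpg bytes", "dvd": b"holiday.jpg bytes",
              "usb": b"holiday.jpg bytez"}  # the usb copy has rotted
    print(pick_majority_copy(copies))
```

Voting breaks down with only two copies that disagree. A manifest of digests recorded when the archive was first written, and stored alongside every copy, avoids that problem because each copy can then be checked on its own.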

      I mitigated some of these problems for my photo library by using version control software. But they're not really designed for this purpose. Git runs into memory issues when you have repositories that run up to tens of GB. Subversion works, but you end up with a duplicate copy of all your files in your work tree.

      There really isn't a very good archival solution I've found so far that allows you to be sure about the integrity of your data in the long term when talking even at the 100GB level, let alone the multi TB level.

    56. Re:bit rot by dbIII · · Score: 1

      I've got a 32 bit Pentium 4 thing with 16 IDE disks and only 2GB of memory dusted off and used as a test box every now and again (good for teaching people how to replace drives, configuring pools etc). Your hardware is not the problem and neither is ZFS, the problem is more likely to be that ZFS is not trivial to set up so I'd say something went wrong on what seems to be your one and only attempt.
      The tool isn't bad but not learning how to use it can lead to bad results.

      That said, if ext4 and hardware RAID works for you go for it.

    57. Re:bit rot by fluffernutter · · Score: 1

      No, I read all the documents and set it up properly. Unless possibly there were options set by default that tuned it for a server that I should have shut off. I didn't get that far. It seemed to be a lot of commands to do very little.

      --
      Laws are rules for the court, but merely a bottom bar to hit for life. Think beyond laws in your actions always.
    58. Re:bit rot by sjames · · Score: 1

      Worse, even if the RAID controller reads all of the disks all it can say is the data is corrupt. It has no way to know which block is the problem. ZFS and BTRFS both have sufficient checksumming in redundant modes to detect which copy is bad and repair it.
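
      A toy Python sketch of that idea, for illustration only (this is not how ZFS or BTRFS are actually implemented): keep a checksum separately from the data, use it on read to pick whichever mirror copy still matches, then overwrite the bad copy. The names checksum and read_with_repair, and the use of SHA-256, are assumptions made up for this example.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Hypothetical stand-in for a filesystem block checksum."""
    return hashlib.sha256(block).hexdigest()


def read_with_repair(copies: list[bytearray], expected: str) -> bytes:
    """Return a copy that matches the stored checksum and heal the others."""
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        raise IOError("every copy fails the checksum; nothing to repair from")
    for c in copies:
        if checksum(bytes(c)) != expected:
            c[:] = good  # overwrite the corrupted copy in place
    return good


data = b"family photo bytes"
stored_sum = checksum(data)            # kept apart from the data itself
mirror = [bytearray(data), bytearray(data)]
mirror[0][3] ^= 0x01                   # simulate a single flipped bit on one disk
result = read_with_repair(mirror, stored_sum)
```

      A plain mirror without the stored checksum can see that the two copies differ but cannot tell which one to trust.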

    59. Re: bit rot by adolf · · Score: 1

      Isn't BTRFS's built in RAID still broken?

    60. Re:bit rot by Anonymous Coward · · Score: 0

      There are exactly four very simple and required elements that make up the full answer to this question...
      1) Corruption sucks... Buy CPUs and motherboards that support ECC RAM, then buy and install the fucking ECC RAM, and don't fucking overclock. Really.
      2) Filesystems suck... Use ZFS, it doesn't suck. Really.
      3) You suck... Make backups, physically disconnect them from your computer, store them safely, test them. Really.
      4) Your OS sucks... Use FreeBSD, it doesn't suck. Really.

      Someone else with half a brain and more time will explain for you why these rules exist.
      If you don't do all four of these things, you will continue to corrupt and lose data.
      Now go! You have a lot of documentation to read and old habits to break.

    61. Re:bit rot by Anonymous Coward · · Score: 0

      FreeBSD with ZFS. It boots from ZFS. And it's not nearly as resource intensive as people here would claim. Granted, it requires more memory than ext4, but that's the price you pay for an extremely reliable filesystem.

      BTRFS is perpetual alpha code. I'd no sooner trust it with my data than I would a politician with my money.

    62. Re:bit rot by Anonymous Coward · · Score: 0

      You must use ECC RAM to prevent and detect a major source of instruction and data rot in your entire box.
      You must use ZFS and strong checksumming and raidz options to prevent and detect it on your disks.
      Otherwise all your "oh so buttsaving precious backups you thought you were smart about",
      will inherit all the shit your corrupt system undetectably reads and scribbles back on them.

    63. Re:bit rot by Galactic+Dominator · · Score: 1

      Stone tablets are known to be broken by fits of jealousy.

      --
      brandelf -t FreeBSD /brain
    64. Re:bit rot by Neuronwelder · · Score: 1

      Hey Narcocide, what's your opinion on the Linux file system? P.S. ZFS sounds pretty good; I looked it up.

    65. Re:bit rot by Anonymous Coward · · Score: 0

      I don't see the big deal? RAM is cheap and ZFS has been doing just fine for me. Most people really don't need dedupe, so just turn it off. Compression works just fine, but in most cases you don't even need that, considering most people's compression ratio is way below 2. If you're that concerned about duplication, just run other software periodically over your files to deal with it.

    66. Re:bit rot by Anonymous Coward · · Score: 0

      Practically anyone with actual valuable data to backup?

      Even SMEs with NAS backup targets generally have a backup server doing the work (NAS merely presents storage as iSCSI or similar), or have a storage device with enough beans to handle things like ZFS. For eg, an entry level rackmount ReadyNAS.

    67. Re:bit rot by grcumb · · Score: 1

      (Whoever modded this offtopic either has no sense of humour, or actually likes FAT. Either way....)

      --
      Crumb's Corollary: Never bring a knife to a bun fight.
    68. Re:bit rot by Neuronwelder · · Score: 1

      Thanks 0100010001010011! Wish I didn't run out of points so soon. Thanks for the info on Ubuntu! I didn't know ZFS was Linux compatible!

    69. Re:bit rot by EmeraldBot · · Score: 1

      Actually, because of its design, notably the COW principles, ZFS shines in its resistance to power interruptions and system crashes. I have a 36 TB ZFS server and a 21 TB ZFS server. Both run 24x7, no UPS, and have ridden out quite a few power interruptions over the years, with never a bit of data loss. Reboot requires no fsck. ZFS doesn't even have an fsck. It is IMPOSSIBLE for it to get corrupted. The most you can lose is some very recent data that was in the actual process of being written out at the moment of power interruption. All that happens is that the volume state gets rolled back to a few seconds previous, with complete consistency.

      I experienced a horrific complete volume data loss with XFS under the same conditions. I will never use that garbage again.

      BTW, ZFS compression does not use significant RAM. Dedupe, on the other hand, DOES. It's virtually never a viable option; CERTAINLY not for a home user who does not have HUNDREDS or THOUSANDS of GB of RAM in his server.

      ZFS compression is actually faster for some types of data on a modern computer than plain data is. That's because the time it takes for the processor to go through lz4 is less than it would take your hard drive to read that data uncompressed. It was a pleasant surprise I discovered, and in tandem with the ARC and prefetching, ZFS can be amazingly quick for often accessed files. Yet another reason why it's incredible...

      --
      "Set a man a fire, he'll be warm for the rest of the night. Set a man afire, he'll be warm for the rest of his life."
    70. Re:bit rot by Neuronwelder · · Score: 1

      Wow fluffernutter ! I didn't know that ZFS liked ext4 so much. Yay! Linux!

    71. Re: bit rot by EmeraldBot · · Score: 1

      This is quite possibly the most ignorant comment in this entire thread. ZFS is dead? What are you smoking? It's in illumos and FreeBSD. Which is the base for many many of the largest storage v noses on earth. Emc, nexenta, spectra logic. Your idea that zfs is dead should be completely ignored as ignorant.

      Not to mention that FreeBSD's fork is completely patent free and open source, available to everyone. FreeBSD's version of ZFS has as much chance of becoming Oracle fodder as ext4 does. Which is to say, fairly likely since Oracle considers all successful software to be theirs, but they won't be able to build a case with it.

      --
      "Set a man a fire, he'll be warm for the rest of the night. Set a man afire, he'll be warm for the rest of his life."
    72. Re:bit rot by spongman · · Score: 1

      i have a machine running debian squeeze with a raidz4+1 ssd cache pool that has hosted several VMs with heavy load for 3 years now. i have seen 1 error that was found and fixed during a routine scrub.
      http://zfsonlinux.org/ has more.

    73. Re: bit rot by Neuronwelder · · Score: 1

      Please excuse me, but ZFS is not quite dead. It currently comes with Ubuntu 16, fully functioning. EXT4 likes it.

    74. Re:bit rot by spongman · · Score: 1

      you can definitely boot to zfsonlinux on jessie.

      i have used the following to set up a jessie machine with a single root raidz8 pool:

      https://github.com/zfsonlinux/...

    75. Re:bit rot by Spazmania · · Score: 4, Informative

      "ZFS isn't even a filesystem for this age" - WTF does that even mean?

      It means that even back when FAT was a johnny come lately it already had greater market penetration than ZFS. With decades behind it and broad market penetration today, there's good reason to believe it won't vanish with the advent of the next development in filesystem architecture. ZFS is likely to be a blip on the radar, a pause before the next innovation. Not what you want for an archival format.

      Bit-rot is an issue inherent to any storage medium

      Bit rot, aka corrupted data, is not inherent to correctly operating hardware. As implemented, you'll see tens of thousands of unreadable blocks on a hard disk before you see a single one in which data has been undetectably corrupted. Every single sector gets a checksum in hardware and if the checksum does not pass you get the famous Abort Retry Ignore. For most storage you get Forward Error Correction coding so that some number of bit errors can be corrected on read before having to throw an error.

      When you see bit rot, the storage media is usually not at fault. More often the data passes through faulty non-parity ram, a noisy memory bus or an overheated controller and gets corrupted on its way to storage rather than getting corrupted at rest on the storage. It died when you used an overclocked piece of garbage to copy it from an old hard disk to a newer, bigger one.

      --
      Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
    76. Re:bit rot by Spazmania · · Score: 3, Informative

      Bit-rot is an issue inherent to any storage medium

      Here's a quick article which explains how hard disks use error correcting codes so that the user-level experience is no bit rot but rather many many read failures before even a single block of undetectably corrupted data. Next time you can know what you're talking about.

      http://www.pcguide.com/ref/hdd...

      --
      Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
    77. Re:bit rot by Anonymous Coward · · Score: 0

      FreeBSD will boot off ZFS on root no problem at all, and with all things ZFS, as the primary development OS and first native adopter, it has done so for YEARS.

      https://planet.freebsd.org/lulf/2008/12/16/setting-up-a-zfs-only-system/
      https://wiki.freebsd.org/ZFSOnRoot
      https://wiki.freebsd.org/RootOnZFS
      https://wiki.freebsd.org/ZFS
      https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html

      ZFS on FreeBSD is just plain baller.

    78. Re:bit rot by FredrikKarlsson · · Score: 1

      "Whose to say zfs will be around in a few decades?" If you are using it on you computer then it will be around for decades. Linux loves backwards-compatibility you can find file system support for FAT and old (almost unused) terminal support. By the way you don't need to upgrade to the latest thing. and last Windows XP is still around.... 20 years and going strong. Software-wise for a few decades should not be a problem.

    79. Re: bit rot by Anonymous Coward · · Score: 0

      I thought it came from Solaris?

    80. Re:bit rot by AaronW · · Score: 2

      I have been using XFS for many years and have found it to be quite reliable and have been able to recover data when the underlying data store got corrupted. It's also quite mature in Linux and relatively fast. My last experience with BTRFS was a failure (several years ago) due to it being incredibly slow when there were thousands of small files in directories. Once ZFS is stable in the Linux kernel I'll give it a try.

      --
      This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
    81. Re: bit rot by Anne+Thwacks · · Score: 1
      Vendors?

      Personally, I think V Noses described them better!

      --
      Sent from my ASR33 using ASCII
    82. Re:bit rot by Zontar+The+Mindless · · Score: 1

      This grammar fascist has no problem using "he" or "he or she" for formal settings, and "they" for informal settings. Slashdot qualifies as the latter. Get a grip already.

      --
      Il n'y a pas de Planet B.
    83. Re: bit rot by Zontar+The+Mindless · · Score: 1

      Auto-correct: That annoying thing that I disable whenever it rears its stupid and ugly head.

      --
      Il n'y a pas de Planet B.
    84. Re:bit rot by Anonymous Coward · · Score: 0

      Every storage media has built-in protection like zfs. Yes, even hard disks. They are ultimately analog devices with added error correction code, just like zfs.

    85. Re:bit rot by Anonymous Coward · · Score: 0

      The real question on ZFS is:
      Does it run windows?

      I am only partially sarcastic!
      For a system to be adapted across the board, it has to be usable by major systems. i.e. Windows users should be able to use ZFS system; the open source (including ZFS) community may need to write the drivers for that to work. Do not rely on M$ to do something for the Open Source community.

    86. Re: bit rot by ssam · · Score: 2

      Only the parity modes (raid 5,6). BTRFS raid 0,1,10 work great and give better protection than traditional raid thanks to the checksumming.

    87. Re:bit rot by Anonymous Coward · · Score: 0

      You are soooo absolutely right. Start with using a plural for 'he' or 'she' and you enter a slippery slope that will end up with using a plural for 'thou'.

    88. Re:bit rot by Anonymous Coward · · Score: 0

      "They" has been used as a gender-neutral singular pronoun since the 14th century. https://en.wikipedia.org/wiki/...

    89. Re:bit rot by dbIII · · Score: 1

      You THINK you set it up properly. Sometimes you don't get to find out if you did or didn't on your first attempt with things.
      For example, deduplication is a trap for new players - don't do it unless you have a LOT of memory. There are a few others, such as incorrect block sizes on recent drives that can really kill performance on both ZFS and RAID. Also ZFS on linux still kind of sucks in terms of performance and was even worse a year ago - it has a long way to go to catch up with the *bsd versions or on even Solaris10.

    90. Re:bit rot by leptechie · · Score: 1

      This doesn't address silent bit rot. Note OP refers to silent corruption of stored data, not notifications from the disk or FS that the data is corrupt. You'd need to manually scan data with your PAR tool to confirm integrity before accessing every single one.

    91. Re: bit rot by Anonymous Coward · · Score: 0

      Your maths is wrong.

      If you had 1 billion bits to read, and a 1 in a billion chance of an error, would that make exactly one error a certainty?

      Perhaps, when reading 2 billion bits of data, the probability of an error would be 2?

      Always better to make double sure.

    92. Re:bit rot by thegarbz · · Score: 1

      Just jump on Wikipedia for a list of Linux distributions and start reading sequentially down the list. As long as you don't give a shit about the supposed license incompatibilities, ZFS runs on any Linux distribution. Some even ship native ZFS and support it, like the LTS releases of Ubuntu.

    93. Re:bit rot by thegarbz · · Score: 1

      You have failed to understand the issue, and your solution will happily replicate, back up, and restore damaged data. Backups do not resolve bit rot issues.

    94. Re:bit rot by Bongo · · Score: 1

      You really need something like ZFS which puts a checksum on every file and verifies it, so if it does get an error it can resolve it.

      In a small enterprise, I've been down this road of what to do, and the worst problem was, it seems to me, the not knowing whether the bits are there and safe.

      How many backups should I keep? How can I trust this one rather than that one? When does a backup turn bad? What if weird hardware does a weird thing? And so on.

      With ZFS I can scrub, I can get notifications, I can send filesystems to remote pools, and because stuff is just a file, not in some archive package, I can idle my hours browsing through files to see they are openable and readable, as one last sporadic manual check by human.

      Snapshots and volume management are icing on the cake as far as I'm concerned, very useful, but without checksums, would be elaborate theatre. Just yesterday I was dismayed to see a drive in a desktop with a desktop filesystem, showing signs that it had been suffering silent data corruption for a long time. But the backups are in ZFS pools which go back years, and so all is not lost.

      I love ZFS, and as for the other issues people raise, data does need to be watched, whatever tech one is using.

      Otherwise it is just Schrödinger's bits.

    95. Re:bit rot by Anonymous Coward · · Score: 0

      This answer is partially correct.
      It would be correct if your disks were automagically connected to the RAM, which they are not.
      Even if your disk is doing all it can to provide correct data it cannot guarantee that nothing goes wrong (transient or permanent errors) along the I/O path to the memory (say connectors, cables, controllers, bus lines and so on ...).

      ZFS does ensure that the data which came along this whole path is indeed genuine, if you allow it to have its own redundancy setup (mirror, RAIDZ1-2).
      If ZFS detects any bad data it will try its best to get the data from any other available copies of the data AND will also try to correct the corrupted data source.

    96. Re:bit rot by Anonymous Coward · · Score: 0

      ZFS is very much alive and well. It is very actively being developed under BSD, and the linux version is rapidly catching up to the BSD feature set. There are commercial products out there that are completely based on the fact that it's BSD, and they are doing fine and growing.

    97. Re:bit rot by Lord+Crc · · Score: 1

      What was your hardware specs like, and did you enable de-duplication?

      If you enabled de-dupe, then that is the problem. De-dupe sounds fine on paper but has some very serious downsides, including crippling performance in some cases.

    98. Re:bit rot by Anonymous Coward · · Score: 0

      Who's to say zfs will be around in a few decades?

      FTFY.

    99. Re:bit rot by Anonymous Coward · · Score: 0

      ZFS was build to be run on big server, with lot of ram, with battery protected raids.

      Stop spreading lies. ZFS was designed to run on all servers, and specifically without hardware RAID. ZFS was designed specifically because Sun was having so many problems of data loss caused by shitty and expensive RAID controllers scrambling data and returning bad sectors.

      It is a good filesystem, but lot of people think that it is a good general filesystem for all users, where it is not.

      More lies. I run zfs every day on my laptop. I run it on our NAS, with over 256 FC attached disks, and I run it on almost all our other servers. It runs great everywhere.

    100. Re: bit rot by fluffernutter · · Score: 1

      But the features that are left over can be done with ext4 and lvm2 anyway.

      --
      Laws are rules for the court, but merely a bottom bar to hit for life. Think beyond laws in your actions always.
    101. Re: bit rot by Anonymous Coward · · Score: 0

      This is quite possibly the most ignorant comment in this entire thread. ZFS is dead? What are you smoking? It's in illumos and FreeBSD. Which is the base for many many of the largest storage v noses on earth. Emc, nexenta, spectra logic. Your idea that zfs is dead should be completely ignored as ignorant.

      You forgot NetApp. They also run FreeBSD, but they run their own proprietary (and IMO superior) filesystem instead of zfs.

    102. Re:bit rot by s13g3 · · Score: 1

      Ah, THAT's what I was looking for, thanks! I was thinking there was something relevant I remembered from my USENet days, and figured maybe, just maybe, if I looked far enough down in the comments, I might find a hint of the old /., and lo and behold, there it was.

      "Parchive" on Wikipedia: https://en.wikipedia.org/wiki/...

      --
      "Inveniemus Viam Aut Faciemus" 'We will find a way... Or we will make one!' --Hannibal of Carthage
    103. Re:bit rot by thegarbz · · Score: 1

      FAT32 and ISOFS are your best bets for being readable 20 years from now.

      For the ages in this case implies that you are certain of the data you're reading. FAT32 and ISOFS may allow you to read data, but you won't be sure if it is still the right data.

      Don't read headlines, read content.

    104. Re:bit rot by rl117 · · Score: 1

      Ubuntu, works out of the box since 16.04. Additionally, while it's not supported by the installer, you can set it up to boot directly from the pool with GRUB. Other implementations such as in FreeBSD remain better integrated and more featureful and usable, but you can use it today on Linux without any special effort.

    105. Re:bit rot by rl117 · · Score: 1

      LZ4 is performant, since it was intended to trade off compression efficiency for less CPU and memory usage. And is cleverly run multiple times over the same data to improve the compression, then bail out once it's no longer worth repeating. But some of the other options have terrible performance; gzip is dire, and gzip-9 barely usable. When I tried benchmarking these a few months back, they nearly tanked the server by making the filesystem nonresponsive and pushing the CPU utilisation through the roof copying a few hundred gigabytes of data (FreeBSD 11). LZ4 in comparison is barely noticeable--it's definitely the best choice.

    106. Re: bit rot by dbIII · · Score: 1

      Not quite but close with lvm snapshotting. On linux you get better performance now, and probably for the next year, on ext4 plus lvm2 but ZFS on *bsd and solaris is a bit of a step up from that again. That stupidly old IDE box I mentioned can still saturate gigabit, which I'm certain is better performing than whatever you tried to set up that disappointed you.
      Quick questions - was your failed ZFS setup on linux? Was it a couple of years ago?

    107. Re:bit rot by GameboyRMH · · Score: 1

      My first thought was ZFS, if it makes you feel better :-P although I don't use it myself, it's just overkill for what I need.

      --
      "When information is power, privacy is freedom" - Jah-Wren Ryel
    108. Re: bit rot by AmiMoJo · · Score: 1

      PAR3 works well enough for TB sized data sets. There are open source implementations in C, so it's unlikely to become unreadable in our lifetimes. You can choose how much recovery data you want to generate. Basically if you have 1% parity data you can lose 1% of your actual data (10GB/TB) and recover it, so it's a little bit like RAID5 but more flexible.
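
      For anyone who hasn't seen parity recovery in action, here is a deliberately simplified Python sketch. It uses plain XOR parity (the RAID5-style trick), not the Reed-Solomon coding that PAR tools actually use, so it can only rebuild one lost block per parity block. The function name xor_blocks and the sample blocks are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


blocks = [b"AAAA", b"BBBB", b"CCCC"]    # hypothetical equal-sized data blocks
parity = xor_blocks(blocks)             # one extra block of recovery data

# Pretend block 1 became unreadable: XOR the survivors with the parity block.
survivors = [blocks[0], blocks[2], parity]
recovered = xor_blocks(survivors)
```

      Real PAR files generalise this so that any N recovery blocks can replace any N damaged blocks, which is where the "choose how much recovery data" flexibility comes from.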

      As for filesystems, UDF is a good choice. It's used on CDs, DVDs and BluRay discs. All likely to be readable for decades to come, and with solid open-source implementations. On HDDs EXT2 might be a good bet, again it is unlikely to become unreadable for many decades.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    109. Re:bit rot by rl117 · · Score: 1

      ZFS can be more intelligent though, since it has more information about what the data means, rather than simply looking at raw blocks. If you have a copy count greater than one, it can use the redundant copies in addition to the mirrors or raid parity to self-heal. It can also ignore errors when it's in an unused block; that would otherwise be an error since there's no way to know the block is unimportant.

    110. Re:bit rot by Anonymous Coward · · Score: 0

      First of all, I don't think you understand what bit rot means. Please read carefully what the OP needs.

      Don't make me laugh about FAT32. That filesystem already lost support from Windows several years ago. I run a Windows 7 machine that is currently unable to format a disk in FAT32 through the graphical user interface; it will only allow exFAT. Some days ago I used the command prompt (terminal) to format a 500 GB external drive in FAT32 with command

      format e: /Q /FS:FAT32

      and Windows' reply was that "the volume is too big for FAT32." Strangely enough, I was later able to format it in FAT32 just fine on OS X.

    111. Re:bit rot by rl117 · · Score: 1

      Is it? After repeated dataloss using it with raid 0, 1 or none at all, plus other problems like becoming unusable from getting completely unbalanced, I'm a bit shy about trusting it quite so readily. It's hard to say it's "fine" when in all cases it was "fine" right up until the point it was "not fine" and lost all its data or went read only. "Working" isn't a useful observation for a filesystem; validating that it copes with every conceivable failure case by having the test cases to do that would be, but it's been clear from all the failure scenarios I've encountered that this is missing. While the worst dataloss bugs have been fixed, that doesn't provide confidence that others aren't still lurking in there, or that regressions won't reintroduce bugs. Btrfs continues to be a gamble, which is not where you want to be when you care about data integrity. One of the reasons I switched to ZFS two years back. Once bitten, twice shy. But I've been badly bitten by ZFS on several occasions...

    112. Re:bit rot by Anonymous Coward · · Score: 0

      With that, you have only moved the error checking to yourself so that you have to do it manually. If there is a file system or set of programs that do this automatically, they would be preferable to having a very large collection of files in which you don't really know if they are corrupt or not.

    113. Re: bit rot by rl117 · · Score: 1

      It did, but it was taken from OpenSolaris and ported to a number of platforms, notably FreeBSD and IllumOS (OpenIndiana, SmartOS, Nexenta), and also Linux. The featureset, integration and usability on Linux are poorer than the rest. Though it works well, it's better on the others.

    114. Re:bit rot by Anonymous Coward · · Score: 0

      Correct my math if I'm wrong

      The maths is wrong. The error rate is 1 in 100000000000000. So if we read 100000000000000 bytes the probability of us hitting an error is ~63% (link).

      However, we are only reading 8000000000000 bytes, which is 8% of 100000000000000. So the probability of hitting an error somewhere in 8000000000000 bytes is ~5% (0.63*0.08).
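
      As a sanity check on the arithmetic, here is a small Python sketch. It assumes errors are independent and treats the quoted rate as "per unit read", the way the comment does; the function name p_at_least_one_error is just for illustration. The chance of at least one error in n reads is 1 - (1 - p)^n, which does not scale linearly from the 63% point.

```python
import math


def p_at_least_one_error(units_read: float, error_rate: float) -> float:
    """1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p."""
    return -math.expm1(units_read * math.log1p(-error_rate))


rate = 1e-14                                   # one error per 10^14 units read
p_full = p_at_least_one_error(1e14, rate)      # about 0.632
p_8tb = p_at_least_one_error(8e12, rate)       # about 0.077
print(f"{p_full:.3f} {p_8tb:.3f}")
```

      Under those assumptions the 8 TB figure comes out nearer 7.7% than 5%, since 1 - e^(-0.08) is roughly 0.077.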

    115. Re:bit rot by Anonymous Coward · · Score: 0

      but so far as I know, only ZFS has built-in checksumming to detect/prevent the data corruption in the first place.

      Btrfs has data checksumming on by default.
      https://btrfs.wiki.kernel.org/index.php/FAQ#What_checksum_function_does_Btrfs_use.3F

    116. Re: bit rot by fuzznutz · · Score: 1

      Multiple copies may be one solution, but it introduces another problem that doesn't have an elegant solution... you need a tool that can verify the integrity of your data (across the multiple copies). How do you choose which one is "correct" when you migrate and copy to a new system? In addition, how are you sure that any given copy is actually complete? What if you want to permanently delete a file from your archive?

      This is how I mitigate bit rot in my home photos. My personal policy is that all changes must be made against a master folder on one designated computer. I have a script that recursively runs an md5, sha1, and sha512 checksum/hash on all the files in that master folder. I run a verify with all three hashes prior to making any changes to the master folder. Once changes are made, I create new hashes and rsync to multiple locations on my home network, including one portable drive I keep offsite. Each copy is independently tested with the hashes to verify it is correct. If at any time any of the hashes fail (they have not yet) on the master folder, I (would) recover the failed file from one of the other copies after verifying it is correct with the hashes. I have considered adding some type of versioning scheme, but I only add, never delete. It is possible that a bitflip could occur between deleting the old hashes and creating the new ones, but since I have never had an error trapped despite making multiple copies and potential drive wear, I think this is a remote possibility.
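
      For anyone wanting to try something similar, a minimal Python sketch of the hash-and-verify part might look like the following. It is not my actual script: it uses only SHA-512 rather than all three hashes, and the names file_sha512, build_manifest and verify are invented for the example.

```python
import hashlib
from pathlib import Path


def file_sha512(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos don't have to fit in memory."""
    digest = hashlib.sha512()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-512 hash."""
    return {
        p.relative_to(root).as_posix(): file_sha512(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or file_sha512(target) != expected:
            bad.append(rel)
    return bad
```

      The idea would be to build the manifest on the master folder, copy it along with the files, and run verify against each copy; an empty list means every file still matches.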

      Back in the old days before it got so big, I would tar and gpg the entire archive and put a copy to Dropbox too.

    117. Re:bit rot by Anonymous Coward · · Score: 0

      BTRFS also has checksumming to prevent bit rot.

    118. Re:bit rot by squiggleslash · · Score: 1

      Frankly, ZFS-style checksumming is the future of files systems. It has to be for any data you care about.

      Really, that file system level checksumming should be redundant, and if it's redundant then the question is why you're bothering.

      What you want is more than error detection; you want error correction. That means using regular archives with PAR files providing a matrix based correction system. Store that on a RAID and there's no reason for obscure file systems that have a high probability of being unsupported 50 years from now.

      Not that I'm happy ZFS counts as an obscure file system, but it is. Sun fucked up the licensing, and so it never got the adoption or mindshare it deserved. You absolutely should never, ever, use it for anything you plan to read a long time from now. You'd be better off 'tar'ing files directly to /dev/sda.

      --
      You are not alone. This is not normal. None of this is normal.
    119. Re:bit rot by Anonymous Coward · · Score: 0

      FreeNAS & NAS4Free come to mind... but 8GB of RAM is required.

    120. Re:bit rot by squiggleslash · · Score: 1

      The only things that could wipeout all implementations of a widely used format like ZFS would be...

      Or for it not to be "widely used". Which it really isn't.

      That said: whether this matters depends on whether you're going to be backing up your backups - or rather, regularly copying it to newer media.

      If someone genuinely is looking towards using current media to back things up to be read 50 years from now, then I suspect they're never going to be happy. I can't find an economical SCSI adapter for my collection of SCSI (50 pin) drives any more, I doubt many here could find an RLL/MFM controller either, and that's just common storage technologies from 25-30 years ago (SCSI was still widely used 20 years ago.) I guess it's plausible USB might survive in a well supported form, but would you want to count on it?

      So, from that point of view, a better approach might be to forget trying to pick a file system and media technology, and instead focus on ensuring you can regularly copy from an old system to a new system. That almost certainly means thinking in terms of an archive format, like tar or zip, and, preferably, an error correction system like PAR.

      File system? Red herring. It's not going to help. And the more obscure and advanced the file system you pick, the lower the probability that you'll be able to get tools to read it in future. Don't even bother, just copy your archive to a current technology on a regular basis.

      --
      You are not alone. This is not normal. None of this is normal.
    121. Re: bit rot by sjames · · Score: 1

      In my tests on a VM, BTRFS RAID 1 worked flawlessly but RAID 5 crashed and burned. The word from the development team suggests that's par for the course.

    122. Re:bit rot by Carewolf · · Score: 1

      It's pretty sad that in this day and age, only one person has highlighted the relevance of ZFS here, and they're an AC. Someone mod parent up. RAID is borderline necessary if you don't have multiple backups, (to recover from in the event of random corruption caused by gamma rays from outer space or a butterfly flapping their wings on another continent or whatever) but so far as I know, only ZFS has built-in checksumming to detect/prevent the data corruption in the first place.

      No, RAID is not sufficient to prevent bit-rot. In fact, RAID can accelerate it. You see, using a redundant mode like 1, 5, 6, most controllers (software and hardware) will only read enough disks to get the data, 1 drive in the case of RAID1, N-1 for RAID5 and N-2 for RAID6 (the non-parity ones, to save a parity calculation). But the drives can return bit errors - it's rare, but it does happen (there's an undetectable fault error rate, something along the lines of 1 in 10^20 bytes read or so will have an undetected error). And this the RAID controller will happily return to you since it didn't check the redundant drives to verify correctness. And it's possible it gets written back corrupted, thus causing corruption.

      You really need something like ZFS which puts a checksum on every file and verifies it, so if it does get an error it can resolve it.

      At least for spinning disk, they will not return bit-errors. The medium is rather analog and has a powerful error-detection and -correction built in by necessity. This means a classic disk will either give you correct data, or tell you outright that the data cannot be read. This is what RAID is meant to work on top of, being told the data is corrupted by the layer below.

    123. Re: bit rot by Carewolf · · Score: 1

      That's bollocks because every sector has a checksum at the hardware level, bit rot would be detected at read time.

      Is that also true for SSDs?

    124. Re:bit rot by Anonymous Coward · · Score: 0

      Who's to say ZFS will be around in a few decades?

      Why wouldn't it be? The only things that could wipe out all implementations of a widely used format like ZFS would be nuclear war or an ELE asteroid strike. In either event, reading disk drives would be the least of your problems.

      Politics could also wipe out all implementations of a widely used format like ZFS.

    125. Re:bit rot by Anonymous Coward · · Score: 0

      You must use ECC RAM to prevent and detect a major source of instruction and data rot in your entire box.

      Doing that. Check.

      You must use ZFS and strong checksumming and raidz options to prevent and detect it on your disks.

      Doing that. Check.

      Otherwise all your "oh so buttsaving precious backups you thought you were smart about",
      will inherit all the shit your corrupt system undetectably reads and scribbles back on them.

      Correct.

      My main storage box at home has 6 * 3TB drives, with 4 of them in a stripe of two ZFS mirrors, with checksums and automatic weekly scrubbing enabled.

      As mentioned above, it uses ECC memory with a supported CPU and motherboard.

      One drive is hot-spare. The final drive is cold-spare, and is not running at all until needed.

      The box also uses ZFS snapshots, again on an automatic schedule, for those "oh, shit, I messed with the wrong part of the filesystem" moments.

      It uses rsnapshot to efficiently rsync everything over to an external drive with a staggered retention scheme covering three months back in time.

      Finally, the most critical pieces of the stored data (the ones I cannot otherwise recreate from other sources) are rsynced elsewhere (not to the cloud, though) in case of the whole building going up in flames, however unlikely.

      Works well for me.

    126. Re:bit rot by Anonymous Coward · · Score: 0

      ZFS also has its own flavors of RAID 1/5/6.

      using a redundant mode like 1, 5, 6, most controllers (software and hardware) will only read enough disks to get the data, 1 drive in the case of RAID1, N-1 for RAID5 and N-2 for RAID6

      A normal RAID5/6 has a "stride" of 64 - 256kByte. This is about the size where reading anything smaller doesn't make a spinning disk much faster just because of the time it takes to settle the head vs. the time to let this much data go past, so it's an old choice based on spinning disks.

      If you're reading a sequence of blocks that doesn't cross a stride boundary, RAID5/6 will only seek a single disk, and if you cross a boundary, it will seek two. Only if it gets an error from the drive will it read the full stripe from all the disks and reconstruct.

      You'll notice an option to inform the filesystem at creation time of the stride size in mkfs. I guess it aligns things to avoid crossing disk boundaries as often. I'm not really sure.

      In any case, a database will often make tiny reads, like 4kB, which definitely don't cross a stride boundary.
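
      A rough sketch of the boundary arithmetic described above. It assumes a simplified layout in which consecutive 64 KiB chunks sit on different disks, and it ignores parity placement entirely; the function name and the chunk size are illustrative assumptions, not any particular controller's behavior.

```python
def chunks_touched(offset: int, length: int, chunk_size: int = 64 * 1024) -> int:
    """Return how many stride-sized chunks a read of `length` bytes at `offset` spans."""
    if length <= 0:
        return 0
    first = offset // chunk_size                 # chunk holding the first byte
    last = (offset + length - 1) // chunk_size   # chunk holding the last byte
    return last - first + 1


# A 4 KiB read aligned to 4 KiB stays inside one 64 KiB chunk: one disk seeks.
print(chunks_touched(offset=8 * 4096, length=4096))

# A 4 KiB read starting 2 KiB before a boundary spans two chunks: two disks seek.
print(chunks_touched(offset=64 * 1024 - 2048, length=4096))
```

      Under this assumption a small read only stays on one disk when it is aligned; an unaligned 4kB read can still straddle a boundary.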

      Writing 4kB is worse: I think you have to read 4kB from n disks then write 3 disks? but for reads, a RAID6 of n disks will deliver nearly the io/s of n - 2 disks.

      ZFS always reads and writes the whole stripe, I think. Anyway, a raidz2 of n disks delivers the io/s of 1 disk in practice. This is a pretty well-established metric on zfs-discuss, and it is definitely worse than raid6 and definitely people notice.

      I still like ZFS because it solves a lot of problems and seems to be fairly high-quality work with a few annoyances. But this performance hit can be a difficult thing to swallow.

    127. Re:bit rot by pnutjam · · Score: 2

      but rot.. Sounds serious...

    128. Re:bit rot by pnutjam · · Score: 1

      SuSE is using btrfs as the default filesystem.

    129. Re:bit rot by RatherBeAnonymous · · Score: 0

      "ZFS isn't even a filesystem for this age" - WTF does that even mean?

      It means that even back when FAT was a johnny come lately it already had greater market penetration than ZFS. With decades behind it and broad market penetration today, there's good reason to believe it won't vanish with the advent of the next development in filesystem architecture. ZFS is likely to be a blip on the radar, a pause before the next innovation. Not what you want for an archival format.

      Market penetration is nice, but is FAT even sufficient? FAT32 files are limited to a maximum of 4GB. That is a real issue for video files. There is exFAT, but people barely know it exists. I have little faith it will be supported in 20 years. NTFS is better in all respects, other than the fact that only MS products officially support writing to it. ZFS is supported natively by BSD and Linux supports it with a driver. Lots of big storage vendors use it for their SAN file systems. Few file systems have such broad-based support.

      Bit-rot is an issue inherent to any storage medium

      Bit rot, aka corrupted data, is not inherent to correctly operating hardware. As implemented, you'll see tens of thousands of unreadable blocks on a hard disk before you see a single one in which data has been undetectably corrupted. Every single sector gets a checksum in hardware and if the checksum does not pass you get the famous Abort Retry Ignore. For most storage you get Forward Error Correction coding so that some number of bit errors can be corrected on read before having to throw an error.

      When you see bit rot, the storage media is usually not at fault. More often the data passes through faulty non-parity ram, a noisy memory bus or an overheated controller and gets corrupted on its way to storage rather than getting corrupted at rest on the storage. It died when you used an overclocked piece of garbage to copy it from an old hard disk to a newer, bigger one.

      How do you know your hardware is good? Diagnostics can not detect every problem, not to mention power failures, dodgy power supplies, and cosmic rays triggering random bit flips. In addition, as storage densities get higher, magnetic domains are compressed, magnetic boundary migration is accelerated, and data life expectancy is reduced. With software calculated checksums and write verification you can get protection against data transfer errors and verify that your data at rest is still correct. When I archive data I take these precautions. You can choose a filesystem that supports these functions or you can use a combination of software products to do the same job. Pick your poison.

    130. Re: bit rot by the_B0fh · · Score: 0

      Ah. Anonymous Idiot. CRC is a *VERY* simple checksum that is prone to failure all the time. You might want to read up on it.

    131. Re:bit rot by the_B0fh · · Score: 1

      The medium is rather analog and has a powerful error-detection and -correction built in by necessity. This means a classic disk will either give you correct data, or tell you outright that the data cannot be read.

      CRC is now considered "powerful error-detection" ?!?!?! Seriously? Are people that ignorant nowadays?

    132. Re: bit rot by fluffernutter · · Score: 1

      OK, yes, I was trying to use dedup. There are two features of zfs I found particularly useful: dedup and snapshots, especially across the network. Snapshots ended up being overkill in that I would have had to set up zfs everywhere, and the fact that dedup needs so many resources makes it overkill as well. Getting back to the original post, I guess I'm not sure what is left that zfs does that makes it better for archiving that can't be done with ext4 and lvm.

      --
      Laws are rules for the court, but merely a bottom bar to hit for life. Think beyond laws in your actions always.
    133. Re:bit rot by Wolfrider · · Score: 1

      --YEP. ZFS on Linux is mature enough for the past several years to trust my irreplaceable/important data to. ZFS+Samba is the killer app for file sharing on my LAN.

      --Cron Snapshots every day (auto-deleted after a month) for an extra layer of protection, monthly scheduled scrubs and SMART tests. Most of the time the server runs with the drives spun down to save power, and boot/roots off a USB3 thumbdrive.

      --Apart from tape, I just don't trust "offline" data storage. DVD backups get bitrot (not enough data on Bluray life expectancy yet, although I would tend to trust M-DISC), disconnected hard drives can fail to spin up again after years of storage.

      --Keep it up and running on a UPS and checked semi-frequently, replace disks that go bad, and grow your disk capacity (number of drives) and sizes when you need to, all in-situ. ZFS rocks. :-)

      --
      .
      == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
    134. Re:bit rot by Wolfrider · · Score: 1

      > ZFS doesn't even have an fsck. It is IMPOSSIBLE for it to get corrupted

      --As much as I love ZFS, I wouldn't use the word that you use. Take a look here:

      https://github.com/zfsonlinux/...

      https://github.com/zfsonlinux/...

      --Complex software always has bugs somewhere. Can't say for certain on FreeBSD or Solaris implementations, but I do track the Linux bug reports.

      --
      .
      == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
    135. Re:bit rot by Anonymous Coward · · Score: 0

      That's not how probabilities work. You have a 1 in 10^14 chance of a bit error, which means the probability of reading a bit correctly is 1-1/10^14. To get the probability of reading n bits correctly, multiply that probability for each bit, so the probability of reading 64*10^12 bits correctly is (1-1/10^14)^(64*10^12), which is 53%. In conclusion, the probability that you read one or more bits incorrectly is 47%.
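
      The arithmetic above can be checked with a few lines of Python. This is only a sketch of the same calculation, assuming independent bit errors at a rate of 1 in 10^14 and 64*10^12 bits read; the function name is made up for illustration.

```python
import math


def p_at_least_one_error(bits_read: float, bit_error_rate: float) -> float:
    """Probability of one or more bit errors across `bits_read` independent bit reads."""
    # P(all correct) = (1 - rate) ** bits_read; log1p/expm1 keep precision
    # when the rate is tiny and the exponent is huge.
    return -math.expm1(bits_read * math.log1p(-bit_error_rate))


p = p_at_least_one_error(bits_read=64e12, bit_error_rate=1e-14)
print(f"P(all bits correct)   = {1 - p:.3f}")
print(f"P(at least one error) = {p:.3f}")
```

      This reproduces the roughly 53% / 47% split: the exponent works out to -0.64, and e^-0.64 is about 0.527.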

    136. Re: bit rot by Anonymous Coward · · Score: 0

      Disagree. ZFS has already replaced EXT4 and UFS in a lot of systems. When something gets ported to BSD, you know it's important, because they don't have the resources to dick around with the superfluous. ZFS is here to stay.

    137. Re:bit rot by Shirley+Marquez · · Score: 1

      ZFS is part of FreeNAS, a storage solution that is becoming popular. It's also now available for Ubuntu Linux. We're still in the "not widely used" phase but that is changing.

    138. Re:bit rot by Shirley+Marquez · · Score: 1

      Try FreeNAS. It makes setting up a storage server with ZFS painless. So long as you don't turn on deduplication (which as others have pointed out takes a LOT of memory) you don't need a killer machine. I'm running it on a system with an Athlon 5350 - and to be fair, 16GB RAM that I bought when RAM was really cheap last summer. Kabini isn't the world's most powerful platform but it's good enough for my needs, and a 25W TDP processor is appealing for a system that is on 24/7.

    139. Re: bit rot by Lord+Crc · · Score: 1

      I'm not very familiar with ext4 nor lvm, but for me it's the fact that zfs validates the checksums on _reads_, and repairs corrupted data if needed, and that you can configure it to store N copies of each block. These two features work on a single disk (non-raid).

      So if you have a single disk, you can set it to store say 3 copies of each block, and when reading the data and it finds that one block is corrupted, it will try to use one of the other two.

    140. Re:bit rot by Ocrad · · Score: 1

      And that's your answer. A filesystem like FAT32 or ISOFS that's likely to still be implemented in future OSes, and recovery files which let you rebuild anything that suffers from bit rot.

      Lzip and lziprecover can help you keep your data safe in the long term. (For the kind of data where it makes sense to use a lossless compressor, that is).

    141. Re:bit rot by wallsg · · Score: 1

      The real solution here is relatively frequent backups, multiple copies in different filesystem and physical formats (ie. flash, hard drive, optical). Over time you just keep moving your file store to the new mediums. I have files that are over twenty five years old now, some of them coming from DOS and Windows 3.1, others from my old original Slackware 3 installs. Along the way some of those files have been on CD-Rs, DVDs, early USB thumb drives, various hard drives running everything from FAT, FAT32, ReiserFS, HPFS, NTFS, ext2 and ext3. And I'll keep on doing that until I drop dead, and I'll leave it up to my family to decide whether they want to keep any of the documents, pictures, music files, videos and so on that I've been collecting.

      You don't think my Jumbo 120 backups are sufficient?

    142. Re:bit rot by thegarbz · · Score: 1

      I hate my phone.

    143. Re: bit rot by dannys42 · · Score: 1

      I'll have to look into PAR3. Thanks!

    144. Re:bit rot by Anonymous Coward · · Score: 0

      It's pretty sad that in this day and age, only one person has highlighted the relevance of ZFS here, and they're an AC. Someone mod parent up. RAID is borderline necessary if you don't have multiple backups, (to recover from in the event of random corruption caused by gamma rays from outer space or a butterfly flapping their wings on another continent or whatever) but so far as I know, only ZFS has built-in checksumming to detect/prevent the data corruption in the first place.

      Keep in mind, please, that if ZFS detects a correctable in-place corruption event, it corrects it _in memory_ and returns the correct result to the reader. It _does not_ immediately re-write the bad block with the corrected block from memory. You need to schedule a `zpool scrub <pool>` periodically to re-write bad blocks, otherwise even the Glorious ZFS Masterrace will rot on disk. You can get the status of an in-flight scrub with `zpool status -v <pool>`. (In both of the above, replace the <pool> with your zpool name...)

      Re-writing the bad block immediately seems like a no-brainer. I can only assume ZFS works this way due to performance concerns with immediate re-write.

    145. Re:bit rot by rl117 · · Score: 1

      That should have read "But I've been badly bitten by *Btrfs* on several occasions..."; wish slashdot allowed editing of mistakes.

    146. Re:bit rot by Anonymous Coward · · Score: 0

      You're thinking too big. For a non-obsessed home user, a "big server" is likely a headless mini tower with two or three hard drives in your closet. It could easily be your old game rig from 8 years ago, underclocked and sans gpu.

    147. Re:bit rot by Carewolf · · Score: 1

      The medium is rather analog and has a powerful error-detection and -correction built in by necessity. This means a classic disk will either give you correct data, or tell you outright that the data cannot be read.

      CRC is now considered "powerful error-detection" ?!?!?! Seriously? Are people that ignorant nowadays?

      It is not just CRC. As the density has grown, the bits allocated to error detection have grown even faster, and at this point spinning disks are so dense that reading them is closer to stochastic analysis of analog signals than to digital processing with CRC.

    148. Re:bit rot by __aaclcg7560 · · Score: 1

      Writing 4kB is worse: I think you have to read 4kB from n disks then write 3 disks? but for reads, a RAID6 of n disks will deliver nearly the io/s of n - 2 disks.

      Newer hard drives have 4kB sectors. It would make sense to read/write data in 4kB chunks.

    149. Re:bit rot by HuguesT · · Score: 1

      I recommend using a RAIDZ6 on heaps of tablets then. Should be good up to two broken tablets by VDEV of 8-12 tablets. Do a vigorous scrub now and then, perhaps with a metal brush. If in doubt, convert to read-only by burying the tablets.

      Hope this helps.

    150. Re:bit rot by HuguesT · · Score: 1

      No, not at all. ZFS is designed to work on bare disks with as little hardware and software as possible between the devices and itself. No battery protected stuff; hardware RAID is a big no no. It was also started on ancient Sun hardware that is so very slow compared with modern hardware that it is not even funny.

      I have a ZFS system that has been running for years on a 2010-era Pentium. It does require a lot of memory. Don't bother with deduplication (off by default) but do turn on compression.

    151. Re:bit rot by HuguesT · · Score: 1

      ZFS is harder to set up and manage than EXT4 but easier than LVM + md. If you wanted to do any kind of mirroring / software RAID it is pretty much the recommended way under Linux right now. The only thing that could have been a problem is deduplication. Stay away from that stuff.

    152. Re:bit rot by Blaskowicz · · Score: 1

      Maybe exFAT will be part of that short list eventually, once it is not "protected" by patents anymore? By which time we will all be old...

    153. Re:bit rot by Anonymous Coward · · Score: 0

      Data scrubbing helps.

      I have one in my crontab.

      https://en.wikipedia.org/wiki/Data_scrubbing

    154. Re:bit rot by Anonymous Coward · · Score: 0

      The answer is par2, such as parchive or QuickPar. Par2 uses a Reed-Solomon code to take a set of source files and produce a set of recovery files such that the original files can be checked for correctness and up to N original files can be corrected where N is the number of recovery files created.

      And it's slow as hell too. Or at least was, back when I tried to fill the empty space of my DVDs with par2...

      Invaluable for alt.binaries.* though.
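
      To make the recovery-file idea quoted above concrete, here is a deliberately simplified sketch. It uses plain XOR parity, not the Reed-Solomon code that Par2 uses, so one recovery block can rebuild only one lost block of the same length; the function names and sample data are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the single missing block from the surviving blocks plus the parity block."""
    return xor_blocks(surviving + [parity])


data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)  # plays the role of the "recovery file"

# Pretend the middle block was lost, then rebuild it from the rest.
recovered = rebuild_missing([data[0], data[2]], parity)
print(recovered == data[1])
```

      Reed-Solomon generalizes this so that several recovery blocks can repair several damaged blocks, which is where the extra computation comes from.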

    155. Re:bit rot by Anonymous Coward · · Score: 0

      Who's to say ZFS will be around in a few decades?

      The real solution here is relatively frequent backups, multiple copies in different filesystem and physical formats (ie. flash, hard drive, optical). Over time you just keep moving your file store to the new mediums. I have files that are over twenty five years old now, some of them coming from DOS and Windows 3.1, others from my old original Slackware 3 installs. Along the way some of those files have been on CD-Rs, DVDs, early USB thumb drives, various hard drives running everything from FAT, FAT32, ReiserFS, HPFS, NTFS, ext2 and ext3. And I'll keep on doing that until I drop dead, and I'll leave it up to my family to decide whether they want to keep any of the documents, pictures, music files, videos and so on that I've been collecting.

      At no point do I ever assume a mere file system sitting on one physical and/or logical volume is ever going to do the job of keeping my files available over the long haul. RAID and file systems in all their glory are not intended for that. Multiple physical copies at multiple locations on multiple types of media, that's the only real way to assure your files remain accessible and safe over time.

      Hi, for a semi-tech aware person, the proposed solutions you have are not consumer friendly. What is your recommendation for a family to save photos, documents, etc. to avoid bit rot (e.g. WD Red/Purple, etc.?)

    156. Re:bit rot by supremebob · · Score: 1

      Ah, ReiserFS. Now THAT was a killer file system.

      Whatever happened to that one?

    157. Re:bit rot by higuita · · Score: 1

      Notice that RAID cards are recommended to store cached writes, not to build RAID arrays. You can replace that with UPS WITH CONTROLLED SHUTDOWN, so you have time to flush caches.

      Just because something works, that doesn't mean that it is safe, that you won't hit internal bugs because it was not developed to work that way, or that it will perform well.

      Let's listen to a guy who really understands ZFS and built a 71TB NAS with it:

      http://louwrentius.com/please-...

      How much desktop hardware do you have with ECC? Mostly only AMD, and even then, not all motherboards and CPUs.

      http://louwrentius.com/things-...

      From the page above:
      "If your NAS is attached to a UPS, this is not much of a risk, you can perform a controlled shutdown before the batteries run out of power."

      So read this and understand: if you have a ZFS system without a battery-backed RAID card or a UPS with controlled shutdown, you risk data loss.

      "The absolute minimum RAM for a viable ZFS setup is 4 GB but there is not a lot of headroom for ZFS here. ZFS is quite memory hungry because it uses RAM as a buffer so it can perform operations like checksums and reorder all I/O to be sequential."

      So this excludes most low-end systems and most desktops for ZFS.

      "ZFS was never designed for consumer hardware, it was destined to be used on server hardware using ECC memory. "

      Another url, just for fun:
      http://louwrentius.com/should-...

      --
      Higuita
    158. Re:bit rot by Anonymous Coward · · Score: 0

      FreeNAS 10/Corral just came out with lots of ZFS love, and GUI usability improvements that make home NAS storage easy. FreeNAS 9 isn't bad either and the ZFS stuff is the same underneath.

      NexentaStor used to be the ZFS gold standard (OpenSolaris/illumos kernel, with the solaris iSCSI daemon COMSTAR) despite the limited driver support (which basically meant Intel CPU/ethernet and LSI SAS HBA's, so a SuperMicro server), but the community version getting kneecapped at 18TB, and v5 locking out shell access and requiring a second server to just admin it was the last straw.

      Now, if there was a stable object storage server that paired well with ZFS, that would be awesome...

  3. Terabytes over decades on NTFS by DogDude · · Score: 2, Interesting

    I've got somewhere between 20-30 TB that has been accumulating for more than 20 years on NTFS, and I've never seen any examples of "bit rot". My files today are identical to what they were 20+ years ago. I have to wonder what kind of filesystem the poster is using.

    --
    I don't respond to AC's.
    1. Re:Terabytes over decades on NTFS by Bizzeh · · Score: 1

      A broken one?

    2. Re:Terabytes over decades on NTFS by Narcocide · · Score: 2

      Not a single example in 30TB over 20 years? I think you should check again.

    3. Re:Terabytes over decades on NTFS by Anonymous Coward · · Score: 0

      How do you know your files are identical?

    4. Re:Terabytes over decades on NTFS by xanthines-R-yummy · · Score: 1

      I tried downloading an old attachment (6-7 years ago now) from my gmail account but the attachment is corrupted. No matter how many times I download it or to what computer, it's corrupted. I wonder what Google is using?

    5. Re:Terabytes over decades on NTFS by Narcocide · · Score: 5, Insightful

      Schrodinger's bit rot. If you never look in the box again after putting the cat in it, you can pretend it lived forever.

    6. Re:Terabytes over decades on NTFS by Blymie · · Score: 1

      No kidding.

      I have two raids, clones of each other. On the weekends, during off-hours, I run md5sums on them. Scripts automatically compare to prior versions.

      So far the newer the raid card, the less I've seen. A lot of older (eg, 10, 15 year old) raid cards didn't patrol read automatically, or do consistency checks automatically. My current cards do, and I've scheduled them for weekly.

      With *that*, I see files 'caught' via bitrot once in a while... and corrected. Maybe, 2 or 3 files per year, out of billions... on a 60TB raid.

      But, on top of all that? There have been severe potential data loss bugs with EVERY filesystem over the last 20 years, from FAT to ext2/3/4, xfs, you name it.

    7. Re:Terabytes over decades on NTFS by Anonymous Coward · · Score: 0

      He's probably using FAT32 or anything without a bit level delta journal like NTFS, which automatically votes out corruption. Siphoning off these bit level differences as they occur is critical to the continuous recovery embodied in backup products like Macrium Reflect. More to the point, however: who knew that the essential NTFS delta technology was stolen by Microsoft back in 1988, when it was known as Wavelink?

    8. Re:Terabytes over decades on NTFS by marko123 · · Score: 1

      I tried downloading an old attachment (6-7 years ago now) from my gmail account but the attachment is corrupted. No matter how many times I download it or to what computer, it's corrupted. I wonder what Google is using?

      What type of file is it? It might be a media format the player software no longer recognises (find an older player). Or if it is an exe it might be a 16 or 32 bit exe that won't run in a 64 bit environment. (find an older operating system). If it's not confidential, could you post a link so we can try it?

      --
      http://pcblues.com - Digits and Wood
    9. Re:Terabytes over decades on NTFS by QuietLagoon · · Score: 1

      ...'ve got somewhere between 20-30 TB that has been accumulating for more than 20 years on NTFS...

      Given what appears to be Microsoft's strategy of slowly morphing away from [consumer] OS's, I'd be reluctant to rely on Microsoft for anything long-term.

    10. Re:Terabytes over decades on NTFS by Anonymous Coward · · Score: 0

      I've got somewhere between 20-30 TB that has been accumulating for more than 20 years on NTFS, and I've never seen any examples of "bit rot". My files today are identical to what they were 20+ years ago. I have to wonder what kind of filesystem the poster is using.

      How would you know if your file system doesn't have checksums? Have you created a SHA-1/256 hash of every file to tell if it's been changed? Perhaps a 'parchive' file to recover any bit flips?
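
      As a minimal sketch of the hashing idea, assuming nothing beyond the Python standard library: record a SHA-256 digest per file when the archive is written, rebuild the list later, and compare. The function names and the manifest layout are illustrative assumptions, not an existing tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List files present in both manifests whose digests differ."""
    return sorted(name for name in old.keys() & new.keys() if old[name] != new[name])
```

      A changed digest on a file you never edited tells you that it is damaged, not how to repair it; that part is what recovery files or a second copy are for.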

    11. Re:Terabytes over decades on NTFS by Anonymous Coward · · Score: 0

      You most certainly do have bit-rot, you just don't know it.

      A bit flipped here and there in a large movie or jpeg file is probably not the end of the world - but if that bit gets flipped in file system metadata, or some other critical file, you are in a world of hurt.

    12. Re: Terabytes over decades on NTFS by Anonymous Coward · · Score: 0

      Google scrubs your data.

    13. Re:Terabytes over decades on NTFS by fnj · · Score: 1

      Why would you doubt it? ZFS with its checksumming was designed specifically to detect and correct bit rot. If you have redundancy (MIRRORING or RAID-Z), it will automatically repair bit rot as soon as it detects it.

    14. Re:Terabytes over decades on NTFS by Anonymous Coward · · Score: 0

      Does that mean, in a many-worlds interpretation, that your uncorrupted data is safely backed up - just in another universe?

    15. Re:Terabytes over decades on NTFS by guruevi · · Score: 1

      Just because you haven't noticed bit rot doesn't mean it doesn't exist. With 20TB, you're experiencing AT LEAST 1 missing data sector per year (based on my experience with 200TB of data).

      --
      Custom electronics and digital signage for your business: www.evcircuits.com
    16. Re:Terabytes over decades on NTFS by hoggoth · · Score: 1

      How would you know?
      Even if you had the time to check 30 TB of data for a single bit error, which would take WEEKS running 24x7... what would you check it against?

      --
      - For the complete works of Shakespeare: cat /dev/random (may take some time)
    17. Re:Terabytes over decades on NTFS by fishfrys · · Score: 1

      And it might have.

    18. Re:Terabytes over decades on NTFS by Anonymous Coward · · Score: 0

      I have a very old MP3 collection that has moved from system to system over time and fairly often I find songs that don't work anymore.

  4. Is this possible by Bizzeh · · Score: 1

    Is this even possible long term? What would have happened if you had stored all of your information on PATA drives 10 years ago? It's rare to find a motherboard with PATA on it now. Yes, there are converters and 3rd party PCI cards, but those are eventually going to dry up too.

    Now, say you choose SATA, what happens when M2 becomes the de facto standard? So, why don't you choose M2? What happens when M2 is phased out?

    It is not just the file system and the data you need to think about, it's the physical hardware too. With the rate things change in hardware, and in connecting that hardware to other hardware, it's unrealistic to expect to be able to use your current storage media in 10 years, let alone 20, 30 or 40 years.

    1. Re:Is this possible by pushing-robot · · Score: 1

      That's easy: use an external USB controller. You can still buy cheap PATA-USB interfaces, and of course SATA and M.2.

      USB has been around 20 years, and it could be another 20 before we lose USB 2.0 / 3.0 compatibility.

      --
      How can I believe you when you tell me what I don't want to hear?
    2. Re:Is this possible by marko123 · · Score: 1

      Is this even possible long term? What would have happened if you had stored all of your information on PATA drives 10 years ago? It's rare to find a motherboard with PATA on it now. Yes, there are converters and 3rd party PCI cards, but those are eventually going to dry up too.

      Now, say you choose SATA, what happens when M2 becomes the de facto standard? So, why don't you choose M2? What happens when M2 is phased out?

      It is not just the file system and the data you need to think about, it's the physical hardware too. With the rate things change in hardware, and in connecting that hardware to other hardware, it's unrealistic to expect to be able to use your current storage media in 10 years, let alone 20, 30 or 40 years.

      This is the problem with maintaining your own hardware, and a really useful use case for cloud storage, so long as you can trust the provider to keep the hardware up to date while your files stay clean, private and available.

      --
      http://pcblues.com - Digits and Wood
    3. Re:Is this possible by __aaclcg7560 · · Score: 2

      USB has been around 20 years, and it could be another 20 before we lose USB 2.0 / 3.0 compatibility.

      Before that we had FireWire 400/800 and SCSI I/II/III. Won't be long before Apple obsoletes USB 1/2/3 for something with a much smaller connector.

    4. Re:Is this possible by 0100010001010011 · · Score: 2

      I've had a Ship of Theseus ZFS pool that I started years ago on a set of PATA drives: RAID-Z2 on OpenSolaris. It's since moved to SATA drives, been expanded a few times, and moved from Debian to FreeBSD to now FreeNAS.

      Set up a pool with the level of redundancy you need and, as technology changes, use a system compatible with the old and new tech and just replace drives as needed.

    5. Re:Is this possible by __aaclcg7560 · · Score: 2

      This is the problem with maintaining your own hardware, and a really useful use case for cloud storage, so long as you can trust the provider to keep the hardware up to date while your files stay clean, private and available.

      If you want to keep your data private, get it off the Internet. No cloud provider can guarantee your data will stay private, much less clean and available.

    6. Re:Is this possible by Anonymous Coward · · Score: 0

      If you're trusting your files to cloud storage they're as good as gone already.

      Only an idiot would trust someone else to keep their important data safe.

    7. Re:Is this possible by Anonymous Coward · · Score: 0

      Unless you get a Mac Pro to read your disks. When will Apple pull the plug on USB for some new "courageous" system?

      Digital files are fragile; they are capable of being lost very easily. On my home systems I have used ST-506 et al., ATA and now SATA. For removable media: cassette tapes, stringy floppies, floppy drives, Bernoulli Box, Zip Drives, CD, DVD and Blu-Ray. While working in the industry, add in various sorts of SCSI and SAS. Each generation is incompatible with the last; moving from one generation to the next is expensive and it takes a long time (talking wall clock time). Since time is money, at some point it is best to just give up and cut your losses. Keeping anything for more than five years in digital format is a pipe dream.

    8. Re:Is this possible by pushing-robot · · Score: 1

      Apple has been using USB for 19 years (they were the first major adopter, in fact). They're in the process of switching to Type-C ports, but at most you'll need a new cable, adaptor or hub to connect your older devices.

      Firewire was never particularly successful, but Apple kept it around for well over a decade, and again you can still buy an adaptor for modern Thunderbolt PCs.

      SCSI? Really? We put up with the terrible hardware for the speed and protocol improvements over early IDE, but SCSI products were always niche in the consumer world and the writing was on the wall by the mid-90s. But on the off chance you saved all your data on SCSI drives and then hid under a rock for 25 years, you can still pick up a PCI-e SCSI controller (and a PCI-e to Thunderbolt box if you really have to hook it up to your new Mac).

      While Apple is relatively aggressive at removing "non-essential" ports and features, USB seems the least likely of all ports to be removed, and even then Apple would sell an adapter to get you through the next decade or more. Past that, you could probably find an aftermarket USB-to-new-whizbang interface for another decade or more. Hell, your Mac can still party like it's 1989 with an ADB adapter and Token Ring bridge; USB will be with us for a long, long time. Even beyond that, you could borrow an older machine to access the data; it's not hard to find a ten- or twenty-year-old PC; most Slashdotters have one in their house.

      But this is academic; by then you'd surely have shifted your data onto a newer drive anyway. Drives fail, even when not in use, so if you care about your data you'll be maintaining a few copies on various media and spending a few hours per decade moving that data onto fresh disks. Compatibility should never be a hurdle.

      --
      How can I believe you when you tell me what I don't want to hear?
    9. Re:Is this possible by PhunkySchtuff · · Score: 1

      What, you mean like they already have with USB-C only on all new laptops?

    10. Re:Is this possible by __aaclcg7560 · · Score: 1

      But this is academic; by then you'd surely have shifted your data onto a newer drive anyway.

      From RLL (20MB 5.25" full height) to EIDE (520MB 3.5" half height) to SCSI-2 (40GB 3.5") to Firewire (250GB 3.5" PATA internal) to SATA (1TB 3.5" or 2.5") drives. Been there, done that.

    11. Re:Is this possible by __aaclcg7560 · · Score: 1

      What, you mean like they already have with USB-C only on all new laptops?

      You mean the replacement for Thunderbolt that Apple previously promoted as the next big thing?

    12. Re:Is this possible by mrchaotica · · Score: 1

      Or encrypt it before uploading it.

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    13. Re:Is this possible by Cajun+Hell · · Score: 1

      Perhaps the "filesystem of the ages" includes hardware migration every 5-10 years. Didn't make the commitment to periodically review and possibly renew the hardware? Then according to the filesystem specs, you did not format the volume correctly.

      --
      "Believe me!" -- Donald Trump
    14. Re:Is this possible by jon3k · · Score: 1

      Only an idiot would trust someone else to keep their important data safe.

      The vast majority of people don't have the technical skill to keep their data safer than cloud providers do.

  5. Not iPhone by Anonymous Coward · · Score: 0

    Anything but that

  6. Clay Pots in the desert. by Anonymous Coward · · Score: 0

    The only historically tried and proven method of storing data for the very, very long term is hiding clay pots in the desert.

    I recommend that.

    1. Re:Clay Pots in the desert. by Anonymous Coward · · Score: 0

      in a cave, sealed by salt.

  7. Photos are ill colored? by Anonymous Coward · · Score: 0

    Are you sure this is from an aging HDD? Maybe it's your eyes.

  8. Tape drive? by Anonymous Coward · · Score: 0

    I mean, they were literally made for offline storage.
    Otherwise, a bit more affordable and probably somewhat future proof, there's M-Disc, the discs that are supposedly made to last a thousand years.

    1. Re:Tape drive? by Noryungi · · Score: 1

      Tape suffers from bit rot.

      And tape standards themselves also suffer from obsolescence. QIC-80 format, anyone?

      --
      The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
    2. Re:Tape drive? by Cramer · · Score: 1

      EXACTLY. Hard drives are the worst solution ever for ARCHIVAL. They need to be on and spinning to do the things necessary to keep the data remotely readable. And even then, they only last single-digit years, if you're lucky. Tapes -- enterprise grade tapes -- will last decades sitting quietly in a drawer away from strong magnets and the like. (I have 8MM DAT tapes -- exabyte 8200 and 8500 series -- that are 3 decades old and still perfectly readable -- if you can find a working drive. And that's sitting in a kitchen drawer next to the sink.)

      If the data is important to you, you make multiple copies, on various different types of media. (tape, cd, dvd, blu-ray, thumb drives, raid arrays, "cloud" backup systems, etc.)

    3. Re:Tape drive? by Miamicanes · · Score: 1

      Or, a hundred times worse than QIC... QIC-EXtra, which only really worked with Verbatim's own backup software.

      Verbatim's scam:

      1. Bundle a free "SE" version of their backup software that could "sort of" make backups under Windows 9x, but crashed too easily and often to reliably restore files from the backup later.

      2. Release an expensive "Pro" edition of the backup software that fixed the worst of those bugs.

      3. Excuse the SE version's near-unusability by saying, "well, maybe the Windows 9x SE version IS kind of buggy... but the free DOS version should still work". Except the DOS version had no concept of long filenames, so it couldn't be used to restore a full system backup to a blank hard drive. Using it to restore a backup was more like picking through the wreckage that used to be your house after a hurricane reduces it to a debris pile.

      4. ???

      5. Profit.

    4. Re:Tape drive? by Anonymous Coward · · Score: 0

      You just gave me a flashback of the sound of a QIC-80 tape helplessly winding and re-winding in a fruitless effort to read what I had written to it.

    5. Re:Tape drive? by RatherBeAnonymous · · Score: 1

      The trouble with archival tapes, IMO, is that they last TOO long. By the time anyone realizes it's time to migrate the data, the equipment to read and process the tapes is nigh impossible to find.

  9. Clay pots in the desert by DalM · · Score: 2

    The only historically tried and proven method of storing information for the long term.

    1. Re:Clay pots in the desert by Narcocide · · Score: 1

      It doesn't have to be crockery. Flat clay tablets work fine too, if you don't bomb them.

    2. Re:Clay pots in the desert by LeftCoastThinker · · Score: 1

      LOL this... Had someone a while back who wanted data stability for a millennium, including the system to read the data. The conclusion we came to was to carve it in marble or in fired ceramic, including the instructions for building the data reader in plain text.

      --
      If you disagree, please post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like
    3. Re:Clay pots in the desert by DalM · · Score: 1

      What? Who was the client? The British Monarchy?

    4. Re:Clay pots in the desert by Anonymous Coward · · Score: 0

      Bad idea. There is a lot of survivorship bias to that.
      Most data written in marble or in ceramic have been lost.

    5. Re:Clay pots in the desert by DalM · · Score: 1

      Yes, but you have never seen a 1000 year old hard drive, have you?

    6. Re:Clay pots in the desert by LeftCoastThinker · · Score: 1

      Can't say because they wanted their involvement in the project kept confidential, but they were so serious about it that they had a dry disused mine to store their data in.

      --
      If you disagree, please post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like
  10. HDDs are NOT fine by archer,+the · · Score: 2

    If the bits on your drive are changing while the drive is offline, that isn't a filesystem issue. A filesystem issue would be if your OS wrote the wrong information to the drive, but that can't happen with an offline drive.

    1. Re:HDDs are NOT fine by Anonymous Coward · · Score: 0

      It could be an issue if the FS doesn't or can't detect the bit-rot.

    2. Re:HDDs are NOT fine by thegarbz · · Score: 1

      No. The filesystem issue is that it leaves the problems undetected.

  11. Tapes by Anonymous Coward · · Score: 0

    Tapes will store your stuff for upwards of 10 years, up to 30 if you store them really well. They're also available in large sizes and are pretty cheap (about a cent per GB).

    1. Re:Tapes by Noryungi · · Score: 1

      Tapes will store your stuff for upwards of 10 years, up to 30 if you store them really well. They're also available in large sizes and are pretty cheap (about a cent per GB).

      And if you believe any of that, I have a very interesting investment offer for you...

      --
      The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
    2. Re:Tapes by Narcocide · · Score: 1

      Lemme guess, it's a used tape drive.

    3. Re:Tapes by Anonymous Coward · · Score: 0

      A fully consistent 50TB flash drive with a decade of power-off behind it?

    4. Re:Tapes by Nutria · · Score: 1

      You obviously know nothing of enterprise tape drives.

      --
      "I don't know, therefore Aliens" Wafflebox1
  12. RAID by buchner.johannes · · Score: 3, Informative

    RAID systems can protect online data (to a degree), but what about offline storage?

    Still RAID is a good choice for your redundancy of choice.

    Or paper: http://ollydbg.de/Paperbak/#1

    --
    NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
    1. Re:RAID by scdeimos · · Score: 1

      RAID is not a backup. RAID checksums are only evaluated when you read data and are only calculated when writing. If the data is just sitting there for years without any kind of access you can guarantee that it's going to die from bitrot.

    2. Re:RAID by Anonymous Coward · · Score: 0

      You can forget about paper. One sheet only stores on the order of a megabyte. A gigabyte will be a thick book. For terabytes, you're looking at a small library.
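
      A back-of-the-envelope check of those numbers, as a rough sketch only. It assumes about 1 MB per sheet (the figure above), decimal units, and a hypothetical 500 sheets per bound volume; the binding size is an assumption for illustration, not something from this thread.

```python
# Rough estimate of how much paper an archive would need.
# Assumptions (labelled): ~1 MB per sheet, decimal units, and a
# hypothetical 500 sheets per bound volume.
BYTES_PER_SHEET = 1_000_000
SHEETS_PER_VOLUME = 500  # hypothetical binding size


def sheets_needed(total_bytes: int) -> int:
    """Number of sheets, rounded up."""
    return -(-total_bytes // BYTES_PER_SHEET)


def volumes_needed(total_bytes: int) -> int:
    """Number of bound volumes, rounded up."""
    return -(-sheets_needed(total_bytes) // SHEETS_PER_VOLUME)


for label, size in [("1 GB", 10**9), ("1 TB", 10**12)]:
    print(f"{label}: {sheets_needed(size):,} sheets, "
          f"{volumes_needed(size):,} volumes")
```

      Under those assumptions a gigabyte is a couple of thick volumes and a terabyte runs to thousands of them, which matches the "small library" estimate.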

    3. Re:RAID by Blymie · · Score: 1

      Not entirely true. Most modern hardware raid cards will automatically perform consistency checks, and additional block checks often. Mine do so weekly, and were defaulted to monthly.

      It's true that if the raid card + drives never power up -- no go. But, if the computer is just on? The raid card and drives powered? Modern hardware raid will automatically scan the entire drive and fix issues.

    4. Re:RAID by Anonymous Coward · · Score: 0

      RAID can cope with a small amount of bitrot in a backup.

    5. Re:RAID by Anonymous Coward · · Score: 0

      Having a RAID system does not cure hitting the delete key.
      RAID is there for HA and hardware fault tolerance, not backup. Backups store information on devices that are not live, as RAID systems tend to be. I have had multiple simultaneous RAID spindles fail, resulting in total data loss (4 spindle RAID 5 losing two drives on 6+ year old SCSI drives). At my last job we had a set of bad SATA drives in a new data center attached to a faulty RAID enclosure. The RAID enclosure had a bad power supply which took down the entire enclosure (even though the vendor said the power supplies were "redundant"). Once the enclosure was replaced, the drives started to experience head crashes at random times due to the power spike from the "redundant" power supplies. It took months to "fix" the issue with the one server (56TB attached storage) due to the vendors not having available disks to swap out the "bad" drives. This happened to a set of 14 servers running the disaster recovery backup system.
      No, RAID is not a backup - not by any stretch of the imagination.

    6. Re:RAID by Billly+Gates · · Score: 1

      RAID is not backup!

      It's been said a million times from any competent system administrator.

    7. Re:RAID by Anonymous Coward · · Score: 0

      I have a Redundant Array of Independent Duplicates, you insensitive clod!

    8. Re:RAID by Anonymous Coward · · Score: 0

      These days hardware RAID does one thing and one thing only... adds MORE points of failure to the system.
      You want ZFS on raw disks.
      That's it.

      The only time you need more hardware is if you run out of SATA channels.
      Then you add port multipliers.

      And no matter what chain of disk hardware you're using, if you're not using ZFS with skein/sha checksumming on it, and making backups, you're going to fail.

    9. Re:RAID by sjames · · Score: 1

      RAID only helps when one copy is unambiguously lost, such as when a drive fails entirely. If one copy gets a bit flipped, it can't tell which version is correct. You need an FS with its own error detection to be able to tell which copy is good and re-write the bad copy.
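
      A minimal sketch of that difference. The function name `pick_good_copy` is made up for illustration, and CRC32 is used only because it ships with Python's standard library; real checksumming filesystems use stronger checksums, and nothing here is their actual code.

```python
import zlib


def pick_good_copy(copy_a: bytes, copy_b: bytes, stored_crc=None):
    """Return the trustworthy copy, or None if that cannot be decided."""
    if stored_crc is None:
        # Plain mirror: if the copies differ there is nothing to say
        # which one is the original.
        return copy_a if copy_a == copy_b else None
    # Checksum recorded at write time: keep whichever copy still matches.
    for candidate in (copy_a, copy_b):
        if zlib.crc32(candidate) == stored_crc:
            return candidate
    return None


original = b"family photo bytes"
crc = zlib.crc32(original)                            # saved when written
rotted = bytes([original[0] ^ 0x01]) + original[1:]   # one flipped bit

print(pick_good_copy(original, rotted))               # mirror only
print(pick_good_copy(original, rotted, crc) == original)
```

      Without the stored checksum the mismatch is visible but undecidable; with it, the good copy can be identified and used to re-write the bad one.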

  13. How about DNA? by PseudoThink · · Score: 1

    Joking...but not really. From today's Reddit Science AMA with Yaniv Erlich: https://www.reddit.com/r/scien...

    1. Re:How about DNA? by PseudoThink · · Score: 1

      Also from the AMA: https://www.reddit.com/r/scien...
      "Our colleagues from ETH Zurich did a test and found that the half life of DNA after a chemical treatment can be 4000 years in room temperature, much better than my CDs!"

    2. Re:How about DNA? by DalM · · Score: 1

      So... you could record all human knowledge into a chromosome, inject them into human embryos and preserve our knowledge for all humankind. dude.....

    3. Re:How about DNA? by suutar · · Score: 1

      yeah, but now you can't update wikipedia til you have a kid. And if mom and dad have conflicting edits, watch out...

    4. Re:How about DNA? by emacsdarwin · · Score: 1

      Look for "Craig Venter 1.0" encoded in alt contigs in the upcoming hg39 assembly.

  14. Papyrus by Anonymous Coward · · Score: 0

    There's some very old papyrus around.

    1. Re:Papyrus by Blaskowicz · · Score: 1

      In sealed Egyptian tombs and other dry environments. Pretty much all European papyrus writings rotted away.

  15. Error correction codes. PAR2, btrfs, partitions,VM by raymorris · · Score: 5, Informative

    The magic phrase to Google is "error correction codes" (ECC).

    PAR2 uses Reed-Solomon error correction. parchive is the ECC file format specification; for Linux you will want PyPar or par2tbb, and on Windows you use a GUI called QuickPar.

    Btrfs can be set to use ECC on a single disk.

    You can slice a single disk into partitions and then use RAID1 or LVM mirroring, or RAID5 or RAID6. LVM can also be useful to divide (and combine) any number of drives into any number of volumes, then you can RAID across the volumes.

    If you Google "ecc disk", "ecc backup", or "ecc archive" you'll find other options, with details about each option.
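
    To show the idea behind parity-based recovery, here is a toy sketch. Note the swap: it uses plain XOR parity (the RAID5-style scheme, able to rebuild exactly one lost block), not the Reed-Solomon coding that PAR2 uses, which can rebuild several. All function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))


def make_parity(data_blocks):
    """Parity block: the XOR of every data block."""
    return xor_blocks(data_blocks)


def recover(blocks_with_gap, parity):
    """Rebuild the single block marked None from survivors plus parity."""
    missing = [i for i, b in enumerate(blocks_with_gap) if b is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing block")
    survivors = [b for b in blocks_with_gap if b is not None]
    result = list(blocks_with_gap)
    result[missing[0]] = xor_blocks(survivors + [parity])
    return result


data = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(data)
damaged = [data[0], None, data[2]]       # the middle block was lost
print(recover(damaged, parity) == data)
```

    The cost is one extra block of storage per group; the limit is that the damaged block has to be known (for example via a per-block checksum) before it can be rebuilt.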

  16. ext4 by hcs_$reboot · · Score: 1

    ext4 is journaled and prevents loss in case of some file-corruption-prone events (like a sudden shutdown).

    --
    Slashdot, fix the reply notifications... You won't get away with it...
    1. Re:ext4 by Anonymous Coward · · Score: 0

      Yes but does nothing to help against bit-rot.

  17. ZFS by Anonymous Coward · · Score: 0

    ZFS lets you use the multiple copies in a RAID array to correct such bit rot and seems to generally be popular with people storing multiple terabytes. You might also want to ask this question on reddit's /r/datahoarder for some experience. For offline storage you should then probably activate it and run a scrub once in a while.

    The other suggestion would be to look at solutions employed on larger scales (libraries, archives), e.g. tape, distributed storage. For long-term storage you should also consider the possibility of soft- and hardware changes and thus maybe a "dumb" filesystem and easy access 20 years later might be more beneficial than a complicated filesystem and no access.

  18. ZFS and lots of redundancy by Chewbacon · · Score: 4, Informative

    ZFS will guard against bit rot. That's not enough. RAID isn't enough. You need redundancy outside your home or office. Cloud may be expensive for the amount of data you have, but Amazon S3 may be the most affordable in that range. You could get S3 for maybe $15-20 a month if you have a terabyte of data. If that's cost prohibitive, rotate external drives regularly and keep one at work. You'll lose very little data since you're archiving things.

    --
    Chewbacon
    The Bible is like Wikipedia: written by a bunch of people and verifiable by questionable sources.
    1. Re:ZFS and lots of redundancy by hawguy · · Score: 1

      ZFS will guard against bit rot. That's not enough. RAID isn't enough. You need redundancy outside your home or office. Cloud may be expensive for the amount of data you have, but Amazon S3 may be the most affordable in that range. You could get S3 for maybe $15-20 a month if you have a terabyte of data. If that's cost prohibitive, rotate external drives regularly and keep one at work. You'll lose very little data since you're archiving things.

      AWS S3 pricing is $0.023/GB or $23/TB/month.

      But for infrequently accessed data, AWS Glacier offers the same durability of S3 for only $0.004/GB or $4/TB/month. There's an infrequent access tier in between those two for $12.50/TB/month.

      Volume discounts kick in above 50TB.
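
      The arithmetic, as a small sketch using only the per-GB rates quoted above (current prices may differ). It assumes 1 TB = 1000 GB and deliberately ignores request, retrieval and bandwidth charges.

```python
# Storage-only monthly cost from the per-GB rates quoted in this comment.
# Assumptions (labelled): 1 TB = 1000 GB; no request, retrieval or
# bandwidth fees are included.
RATES_USD_PER_GB_MONTH = {
    "S3 standard": 0.023,
    "S3 infrequent access": 0.0125,
    "Glacier": 0.004,
}


def monthly_cost(gigabytes: float, rate_per_gb: float) -> float:
    """Storage-only monthly cost in USD, rounded to cents."""
    return round(gigabytes * rate_per_gb, 2)


for tier, rate in RATES_USD_PER_GB_MONTH.items():
    per_month = monthly_cost(1000, rate)
    print(f"{tier}: ${per_month:.2f}/month, ${per_month * 12:.2f}/year per TB")
```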

    2. Re:ZFS and lots of redundancy by Anonymous Coward · · Score: 1

      If it's only disaster recovery, Amazon Glacier is probably cheaper. It's cheaper to use - by far - so long as you hardly ever read from it. That's what makes it ideal - if you have a local-NAS/SAN that stores your actual copy of the data (to handle your read requests) and Glacier storing your long-term DR copy, you should be fine. Glacier is also there for the 'audit logging' or 'long term record archival' where you need to store large quantities but only have the occasional read request; and won't ever look at most of it.

      My Synology NAS box at home is configured to automatically back-up to Glacier.

    3. Re:ZFS and lots of redundancy by heypete · · Score: 1

      But for infrequently accessed data, AWS Glacier offers the same durability of S3 for only $0.004/GB or $4/TB/month. There's an infrequent access tier in between those two for $12.50/TB/month.

      Volume discounts kick in above 50TB.

      Online.net's C14 service is even cheaper, at EUR 0.002/GB/month plus EUR 0.01/GB for "operations" (such as creating an archive from the temporary staging area, manually verifying archives on demand, or recovering an archive), and offers the same 99.999999999% durability as Glacier. No bandwidth costs and no complicated retrieval speed costs like Glacier, and you can use rsync to upload to the staging area. Naturally, they perform behind-the-scenes error checking and repair, but the manually-selected verification process is nice to explicitly verify that things are intact.

      They offer an "Enterprise" level with even more durability for increased costs (EUR 0.004/GB/month + EUR 0.025/GB for operations), as well as a new "Intensive" level that costs EUR 0.005/GB/month with no operations fees (it's intended for more frequent accesses to backed-up data).

      Online.net is owned by Iliad, who in turn is owned by Free, a major French telecom, so the risk of suddenly going out of business is low.

      Disclosure: I'm a happy C14 user, but otherwise have no connection to the company.

    4. Re:ZFS and lots of redundancy by sithlord2 · · Score: 1

      ZFS also has the "set copies=n" option which stores a file multiple (n) times. If you really want maximum protection, you can try something like this:

      - use ZFS mirroring
      - use "set copies" to store files multiple times (You can even use this in a single-disk non-mirroring setup as well).
      - use "zfs send" and "zfs receive" over SSH to make offsite backups to a remote ZFS machine (or multiple machines).

      --
      ...You are over-qualified and under-paid. If we give you a raise, we will break the cosmic balance of the universe.
    5. Re:ZFS and lots of redundancy by nine-times · · Score: 1

      ZFS will guard against bit rot. That's not enough. RAID isn't enough.

      Yeah, honestly, if you want your data to be really safe, it's not going to be enough to determine a storage medium and filesystem. You need a process.

      Conceptually, the process needs to include having multiple copies in multiple locations. Each location needs to have a complete list of files along with a checksum for each file, and each file needs to be checked against that checksum on a regular basis. If a file is found to not match the checksum, you then need to have a mechanism whereby the same file is checked at other locations until a copy of the file is found that matches its checksum, and then *that* copy of the file needs to be redistributed to any locations where the file appeared to be corrupt.
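
      A minimal sketch of that process for locations that are reachable as local directories. Everything here is illustrative: the function names are made up, SHA-256 is just one reasonable checksum choice, and a real setup would also need to protect the manifest itself and handle remote sites.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file in 1 MiB chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its checksum."""
    return {str(p.relative_to(root)): sha256_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def verify(root: Path, manifest: dict) -> list:
    """Return relative paths that are missing or fail their checksum."""
    return [rel for rel, expected in manifest.items()
            if not (root / rel).is_file()
            or sha256_of(root / rel) != expected]


def repair(root: Path, replicas: list, manifest: dict) -> list:
    """Overwrite bad files with the first replica copy that verifies.

    Returns the paths for which no good copy was found anywhere.
    """
    unrepaired = []
    for rel in verify(root, manifest):
        for replica in replicas:
            candidate = replica / rel
            if candidate.is_file() and sha256_of(candidate) == manifest[rel]:
                (root / rel).parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(candidate, root / rel)
                break
        else:
            unrepaired.append(rel)
    return unrepaired


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        site_a, site_b = Path(tmp, "site_a"), Path(tmp, "site_b")
        for site in (site_a, site_b):
            site.mkdir()
            (site / "photo.jpg").write_bytes(b"good data")
        manifest = build_manifest(site_a)
        (site_a / "photo.jpg").write_bytes(b"goof data")   # simulated rot
        print("bad before repair:", verify(site_a, manifest))
        print("unrepaired:", repair(site_a, [site_b], manifest))
```

      Run `verify` on a schedule at every location; anything it reports gets pulled from whichever other location still matches the manifest.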

      I'm not sure if there's some kind of rsync/bittorrent-like file that will handle that for you automatically. I've thought about it before, that it would be nice if I could buy a series of NAS devices, and scatter them around different locations with some kind of archival redundant file system that does that. I suppose using Amazon/Azure/Google cloud storage might serve as well as anything, though, but I personally don't know exactly what kind of redundancy/integrity safeguards each of those services have in place.

    6. Re:ZFS and lots of redundancy by hoggoth · · Score: 1

      Two ZFS systems. One in your house and one at your office or friend or brother's house. One does a zfs send to the other periodically. Checksums keep you safe from bit rot, hardware errors and cosmic rays. Snapshots keep you safe from malware and your nephew asking 'what does rm -rf * do?'

      --
      - For the complete works of Shakespeare: cat /dev/random (may take some time)
  19. Any Linux FS by MouseR · · Score: 2, Interesting

    I'd go for any Linux file system because Linux is the platform that evolves the least. It's still in the 90s so in 2037 it will still be current.

    (Watch out for the hater storm! Here they come!)

    But it's kinda true if you omit the snideness of the first statement. Because it's maintained by the user base, it's less likely to "devolve" into something incompatible due to market pressure. I, myself, would go for an Apple file system, but Apple isn't so keen on keeping the Mac current and that doesn't bode well for the future. There might be a great change on the horizon.

    1. Re:Any Linux FS by Anonymous Coward · · Score: 0

      How do you explain systemd?

    2. Re:Any Linux FS by Anonymous Coward · · Score: 0

      I'd go with FreeBSD because that's the best OS for ZFS.

    3. Re:Any Linux FS by Anonymous Coward · · Score: 0

      abomination

    4. Re:Any Linux FS by vtcodger · · Score: 1

      "How do you explain systemd?"

      Systemd is the inevitable result of too much sex, drugs, and rock and roll music corrupting the minds of our youth.

      --
      You can't see ANYTHING from a car, You've got to get out of the goddamned contraption and walk...Edward Abbey
    5. Re:Any Linux FS by Anonymous Coward · · Score: 0

      More like the LACK of sex, drugs, and rock and roll music

    6. Re:Any Linux FS by Zontar+The+Mindless · · Score: 1
      --
      Il n'y a pas de Planet B.
    7. Re:Any Linux FS by Anonymous Coward · · Score: 0

      Using snideness for humor works better if you know what you are talking about. It just reads as Apple fanboyism. And for that matter, there's nothing magical or so great about HFS+ that competing file systems don't have on Linux. And to say Linux filesystems are stuck in the 90s is just silly. Even just ext4's first stable release was only in about 2008 or so. What you wrote just reads arrogant but suggests you don't know much about the topic.

  20. "some photos are ill-colored" by hcs_$reboot · · Score: 3, Informative

    That's a well-known problem to photographers: photo colors are affected over time. Keep the photo negatives in a safe place!

    --
    Slashdot, fix the reply notifications... You won't get away with it...
    1. Re:"some photos are ill-colored" by Anonymous Coward · · Score: 0

      Photo negatives? Is that when you XOR your JPGs?

    2. Re:"some photos are ill-colored" by marko123 · · Score: 1

      That's a well-known problem to photographers: photo colors are affected over time. Keep the photo negatives in a safe place!

      That struck me as odd too. If the colours in digital photos or movies don't look right, I would try to display them with different software. It's more likely that the software that displays them is reading and interpreting the format of the file differently than that bit-rot would affect only the colour palette and not make the whole file unreadable.

      --
      http://pcblues.com - Digits and Wood
    3. Re: "some photos are ill-colored" by Anonymous Coward · · Score: 0

      If critical bits describing the color are corrupted, the image will look goofy, digital or not. Poster also commented on missing video keyframes.

    4. Re:"some photos are ill-colored" by erice · · Score: 1

      That struck me as odd too. If the colours in digital photos or movies don't look right, I would try to display them with different software. It's more likely that the software that displays them is reading and interpreting the format of the file differently than that bit-rot would affect only the colour palette and not make the whole file unreadable.

      Or the OP is using a different monitor. It doesn't matter if the new monitor is better or worse than the old one. If it is different and the photos are adjusted for the old monitor, they will look "off".

    5. Re:"some photos are ill-colored" by Anonymous Coward · · Score: 0

      The problem with software developing photos is that the process may change, or the software may simply not work in, say, 10 years.

    6. Re:"some photos are ill-colored" by Anonymous Coward · · Score: 0

      I've had corruption in JPEGs that create wacky color effects, like dropping an entire channel halfway down the image. If that's the sort of 'ill-coloring' OP is talking about, that's corruption.

    7. Re:"some photos are ill-colored" by Anonymous Coward · · Score: 0

      Or the OP has become a better photographer over time and learned how to light a shot.

    8. Re:"some photos are ill-colored" by Anonymous Coward · · Score: 0

      I have had jpg pictures turn a bluish / yellowish tint before.

      There may be 100 photos in the series, with one having that tint.

      It's obvious that some bit got changed somewhere.

    9. Re:"some photos are ill-colored" by Anonymous Coward · · Score: 0

      So you've never seen a photo in your library suddenly half or ¾ tinted green or purple? How lucky for you. Bit rot is real and JPEG has more than 2 failure modes due to it.

    10. Re:"some photos are ill-colored" by thegarbz · · Score: 1

      God this thread makes people who understand colour management hang their heads.

    11. Re:"some photos are ill-colored" by Anonymous Coward · · Score: 0

      The problem that's being overlooked is that digital photos have a color look-up table that maps pixel values to RGB colors (in GIF and similar), or the parameters for the wavelets that generate RGB colors (in JPG and similar). If that information takes a hit, the picture might be ok, but the colors are wrong.

      The best example is the SHA-1 collision proof PDF, found here: https://shattered.io/
      When a few bits of the file were corrupted, the background color of the PDF changed, as the corruption was inside the embedded JPG file.
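
      A toy illustration of why a hit to the colour table is so visible. This models a generic indexed-colour image (pixels store palette indices, in the spirit of GIF); it is not a GIF or JPEG decoder, and the names are invented for the example.

```python
# Toy indexed-colour image: pixels hold palette indices, not colours.
palette = [(255, 255, 255), (200, 30, 30)]   # index 0 = white, 1 = red
pixels = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]


def render(pixels, palette):
    """Expand palette indices into RGB tuples."""
    return [[palette[i] for i in row] for row in pixels]


def flip_bit(value: int, bit: int) -> int:
    """Simulate bit rot by toggling one bit."""
    return value ^ (1 << bit)


def count_changed(before, after):
    """Count pixels whose rendered colour differs."""
    return sum(a != b
               for row_a, row_b in zip(before, after)
               for a, b in zip(row_a, row_b))


clean = render(pixels, palette)

# Case 1: one bit rots inside the palette entry for index 1.
r, g, b = palette[1]
rotted_palette = [palette[0], (r, g, flip_bit(b, 7))]

# Case 2: one bit rots inside a single pixel's index.
rotted_pixels = [row[:] for row in pixels]
rotted_pixels[0][0] = flip_bit(rotted_pixels[0][0], 0)

print("palette hit:", count_changed(clean, render(pixels, rotted_palette)))
print("pixel hit:  ", count_changed(clean, render(rotted_pixels, palette)))
```

      One flipped bit in the shared table recolours every pixel that points at that entry, while the same flip in pixel data touches a single pixel, so the image stays readable but looks wrong.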

  21. ZFS with regular scrubs by Anonymous Coward · · Score: 0

    In a word, no: I don't think there's any filesystem which is designed to combat bitrot while offline. Logically that would just mean duplication of data anyway and hoping that the duplicates don't both get corrupted over whatever period they're offline.

    Instead what you really want is a RAID array using ZFS with regular scrubs. A 'scrub' being where ZFS scans the entire contents of the disks, confirms checksums all still match and, if they don't, rewrites the data using the redundant disks in the array. Obviously it needs to be online to perform the scrubs but you could just boot it once a month for a few hours to do that.

    If you only need a mirrored RAID rather than RAID5/6/7 then BTRFS can offer the same functionality with some additional flexibility and is also native to Linux (rather than BSD for ZFS).

    1. Re: ZFS with regular scrubs by Anonymous Coward · · Score: 0

      Storing archive data on a hard drive or flash drive faces a few issues that Write Once Read Many (WORM) storage such as CDs and DVDs does not face.

      The first is the typo issue (a.k.a. bumping your head). One wrong command and you can lose TB of data in a blink of an eye.

      The other issue is ransomware. Should you get infected you may have to pay through the nose.

      I am not saying CD and DVD storage is perfect. They can be damaged by light, and they can "fade" after a few years. With streaming media and flash drives becoming more popular in the last few years, computer manufacturers have been cutting back on CD drives.

      Hard drives can seize up if not used periodically, but they wear out if you use them. One wrong slip and you have a paperweight.

      As was pointed out in another posting, you need to look at your interface also. PATA is dead, long live SATA, well, until mSATA or some other interface replaces it. Stick with USB-A as it will never go away... except USB-C just might replace it...

      Maybe the best solution is to spread the risk by storing multiple copies on multiple types of media with multiple interface formats.

      To detect and correct bitrot you will have to use some hash and parity programs. But what happens when your selected algorithm is no longer available? For example SHA1 and MD5 were quite popular but are now being shunned due to being cryptographically insecure. Should your favorite OS drop these algorithms, error recovery will become more complex. To mitigate this you might think about storing a copy of those utilities' source code on the storage media. The more basic the software source code, the more likely it will be useful in the future. Running an 8 bit application from 1983 on today's 64 bit OS might not be an option.
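
      A minimal sketch of the hashing half, using only Python's standard `hashlib`. The function names and the choice of SHA-256 are assumptions for illustration, not something the post prescribes. It records a digest per file and later reports which files no longer match.

```python
import hashlib
import os
import tempfile

def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_digest(full)
    return manifest

def find_changed(root, manifest):
    """Return files that are missing or whose digest no longer matches."""
    changed = []
    for rel, digest in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_digest(full) != digest:
            changed.append(rel)
    return sorted(changed)

# Demo on a throwaway directory: record digests, damage one file, re-check.
with tempfile.TemporaryDirectory() as root:
    for name, data in [("a.bin", b"hello"), ("b.bin", b"world")]:
        with open(os.path.join(root, name), "wb") as f:
            f.write(data)
    manifest = build_manifest(root)
    with open(os.path.join(root, "a.bin"), "wb") as f:
        f.write(b"hellp")                  # simulate a corrupted byte
    changed = find_changed(root, manifest)

print(changed)
```

      A manifest like this only detects damage; repairing it still needs parity data or a second copy. Storing the algorithm name next to the digests would help a future reader pick the right tool.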

  22. Not all RAIDs are equal by kiviQr · · Score: 1

    Not all RAIDs are equal. If you want your data to be safe, use RAID 1 with the second volume in a remote location (a.k.a. offline backup).

    1. Re:Not all RAIDs are equal by helsinki92 · · Score: 1

      How do you get the second volume, or set of volumes in a RAID 1 offline in a remote location? If you are reading and writing to a RAID 1, I believe it has to be online unless I missed some neat technology in the past 10 years

    2. Re:Not all RAIDs are equal by Jamu · · Score: 1

      Mail the HDDs?

      --
      Who ordered that?
    3. Re:Not all RAIDs are equal by davidwr · · Score: 1

      If you are reading and writing to a RAID 1, I believe it has to be online unless I missed some neat technology in the past 10 years

      Use software RAID with Network block device for the underlying remote disk. Performance will suffer greatly but it should function as long as nothing times out.

      --
      Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
    4. Re:Not all RAIDs are equal by Billly+Gates · · Score: 1

      RAID is not backup.

      It doesn't guarantee your data will be there if data on the disk gets reset over time by Earth's magnetic field. Tapes too, but they last longer as they are designed not just to detect but to recover from errors.

      What if your hardware raid card won't fit in a computer 10 years from now or have drivers that work?

    5. Re:Not all RAIDs are equal by Vairon · · Score: 1

      You can also use DRBD (Distributed Replicated Block Device) https://www.drbd.org/en/ to replicate block devices between servers.

    6. Re:Not all RAIDs are equal by kiviQr · · Score: 1

      Concur. In case mailing is not an option, Amazon has a truck available!

  23. Backblaze: SMART metrics of imminent failure by archer,+the · · Score: 1

    Backblaze published a report on which SMART metrics they see indicating imminent drive failure: https://www.backblaze.com/blog...

    1. Re:Backblaze: SMART metrics of imminent failure by GameboyRMH · · Score: 1

      My home server has a script that monitors each drive for these and shows a warning when I log in if any values are out of line. You'll rarely get a SMART error so bad that it triggers a general health warning before a drive fails; you have to watch the stats.

      Another good one to watch on some drives is the Raw Read Error Rate - on some drives this normally stays at zero and if it climbs above zero, it means a failure is coming. On other drives the value harmlessly racks up over time and it means nothing. The same script writes a report that notes this for each drive, so if I see one that I know normally stays at zero climbing, I know to watch out.
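
      The per-drive logic could look roughly like the sketch below. It is not my actual script: the drive names, attribute labels and numbers are hypothetical sample data, and the code deliberately leaves out how the values are read from the drives. The idea is just to compare each attribute that normally stays flat on a given drive against its recorded baseline.

```python
def smart_warnings(baseline, current, quiet_attrs):
    """Return warnings for attributes that normally stay flat but have risen.

    baseline / current: {drive: {attribute: raw_value}}
    quiet_attrs: {drive: [attributes known to stay constant on that drive]}
    """
    alerts = []
    for drive, attrs in current.items():
        for attr in quiet_attrs.get(drive, ()):
            before = baseline.get(drive, {}).get(attr, 0)
            now = attrs.get(attr, 0)
            if now > before:
                alerts.append(f"{drive}: {attr} rose from {before} to {now}")
    return alerts

# Hypothetical sample data: sda normally sits at zero, sdb's counter climbs harmlessly.
baseline = {
    "sda": {"Raw_Read_Error_Rate": 0, "Reallocated_Sector_Ct": 0},
    "sdb": {"Raw_Read_Error_Rate": 1200},
}
current = {
    "sda": {"Raw_Read_Error_Rate": 7, "Reallocated_Sector_Ct": 0},
    "sdb": {"Raw_Read_Error_Rate": 5400},
}
quiet_attrs = {
    "sda": ["Raw_Read_Error_Rate", "Reallocated_Sector_Ct"],
    "sdb": [],   # on this drive the counter is known to be noise
}

alerts = smart_warnings(baseline, current, quiet_attrs)
print(alerts)
```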

      --
      "When information is power, privacy is freedom" - Jah-Wren Ryel
  24. Lots of parity by ilsaloving · · Score: 1

    No media is perfect. There's just a varying likelihood of errors over time, depending on the quality of the media. Without knowing ahead of time whether a specific piece of media is going to fail, the question needs to change from "How do I keep it from getting corrupted" to "How do I mitigate eventual corruption?"

    And the question basically boils down to one answer: redundancy.

    Off the top of my head, I can think of three things you can do, and these are not mutually exclusive.
    1. Multiple copies of data, stored in different locations. If something happens to a specific location, then at least the media is still safe elsewhere. Even if nothing happens to the location, media failure can still occur. The more copies you have, the more likely you will still have at least one good copy when the time comes that you want to access it.

    2. Parity. There are plenty of tools available that allow you to add parity information to your files. For example, the RAR compression utility will allow you to add a 'recovery record' to your file. You choose how much RR you want to add, up to 10% of the file. Obviously this takes up additional space, but you can have a sizable portion of your .rar file become corrupted, and you can still retrieve it. Another thing you can use, is a tool that was popular in the old days of newsgroups: PAR. Unlike RAR which encapsulates your file into an archive, PAR files sit beside your data files. But the function is basically the same. PAR files provide parity data, which you can use to reconstruct files that have been damaged. I'm sure there are other tools available as well.

    3. Migrate your data over time. The unfortunate fact is that media changes. If you want to keep your data for the long haul, you have two choices: make sure that you keep backup hardware to read the media you want to read (which brings its own longevity problems), because it may not exist in the future (e.g. it's pretty darn hard to find 8" floppy drives anymore), or periodically migrate your data to a new standard format. Just in the past 30 years, we've gone from Floppies->CDs->DVDs->Bluray->Flash(thumbdrives,SD,etc).
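
    To make the parity idea in point 2 concrete, here is the simplest possible version as a Python sketch: one XOR parity block over equal-sized data blocks. This is not how PAR2 works internally (it uses Reed-Solomon coding, which can repair more damage); the `xor_blocks` helper is made up for illustration, and the sketch assumes you already know which block was lost.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)                  # stored alongside the data

# Block 1 becomes unreadable: XOR of the survivors plus parity gives it back.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)
```

    One parity block costs the space of a single data block and tolerates the loss of any one block; tools built on stronger codes let you trade more space for tolerating more damage.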

  25. ZFS + Amazon Glacier storage, or equivalent by Anonymous Coward · · Score: 0

    For the average Joe, you can't do much better than a simple array of disks with ZFS (it offers good quality integrity checks out of the box),
    combined with an off-site backup which you would likely be unable to do anywhere else cheaper than Amazon Glacier's service.

    Last time I checked, fitting 6x8 of disks striped in raidz2 configurations gave an optimal balance of reliability, capacity and speed, all feasible to have in a single box.
    Offline is not a problem: you just switch off its power when not in use.

  26. ZFS by johnslater · · Score: 2

    "Is there a software solution, like a file system or a file format, specifically tailored to avoid this kind of bit rot?"

    Yes, ZFS is specifically tailored for this. Configure a zpool running RAID-Z2 with a hot spare or RAID-Z3. Half a dozen 6TB or 8TB disks should suffice.

    Set it to auto-scrub regularly. Send logs and warnings to your email, and pay attention to them. (This is the hard part). Especially pay attention if they stop arriving. (This is even harder).

    I have used Nexenta for some time, but the free product has a limit of 18TB of raw storage. If I was starting today I would use FreeNAS which has no such restriction.

    The other comments about the futility of trying to do this long term are worth heeding, but that doesn't mean you shouldn't try. The key is to make this an active project rather than a passive archive, and to re-evaluate the best approach every few years.

  27. FS not related at all by Anonymous Coward · · Score: 0

    Whatever media you choose must be tested from time to time. Even a tape can suffer data loss if not used eventually. And HDDs may be susceptible to errors because of cosmic rays or magnetism if not powered for a long time, just guessing...

  28. Online by lobiusmoop · · Score: 1

    'Forever' is a long time.

    'Offline' is difficult to deal with long-term (I am thinking decades to centuries); such is the nature of technology and the lack of any real history we have of digital data management.
    Personally I would say the best bet is keeping your data 'live' online to some extent; it is the only real way to monitor and control the inevitable decay.
    Basically your data's lifespan is related to how long you can convince someone to care for it for you.

    --
    "I bless every day that I continue to live, for every day is pure profit."
  29. Different objectives mean different solutions by Noryungi · · Score: 1

    Pick your poison:

    - Tape: inexpensive and slow, require frequent testing (backup we do, it's restoration the problem!), usually unreadable after 6 to 12 months or less (that's in production people).

    - WORM: more expensive than tape and just as slow, works well in the medium term (meaning 10 years tops).

    - XFS NAS: faster than the above, requires good hardware and a bit more work than either tape or WORM. Don't forget to set up replication to multiple systems. May suffer from bitrot in the long term (checksumming/hashing files might be a good idea). Very stable, large capacity file system. Tape backup is always a good idea.

    - ZFS NAS: slightly slower than XFS (at least, that's my experience, YMMV). Ultra-large capacity. Snapshotting is just a breeze. Again, replication to multiple, distant systems is mandatory. Very stable file system. Tape backup is always a good idea.

    - DNA, 3D crystal lattice, holographic memory: what we are all going to use in the future. Still in beta testing, though.

    - DVD: don't make me laugh.

    --
    The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
    1. Re:Different objectives mean different solutions by Nutria · · Score: 1

      usually unreadable after 6 to 12 months or less

      What kind of crappy tapes do you use? We've restored DLT tapes after 7 years in Iron Mountain.

      --
      "I don't know, therefore Aliens" Wafflebox1
    2. Re:Different objectives mean different solutions by Cramer · · Score: 1

      LTO? Every LTO tape I've ever sent to IM has come back trash. (at least a DLT/SDLT tape can be erased and reused)

      It's too bad Quantum discontinued the DLT technology. LTO is a very poor substitute.

    3. Re:Different objectives mean different solutions by Nutria · · Score: 1

      at least a DLT/SDLT tape can be erased and reused

      LTO can't????

      --
      "I don't know, therefore Aliens" Wafflebox1
    4. Re:Different objectives mean different solutions by HBI · · Score: 1

      I have DC2000 tapes from 1992 that I can still read.

      --
      HBI's Law: Frequency of calling others Nazis is directly correlated with the likelihood of the accuser being Communist.
    5. Re:Different objectives mean different solutions by Cramer · · Score: 1

      Take a magnet to one (degauss) and get back to me. Like a hard drive, tracking/alignment information is magnetically stored on the tape. Lose even one bit of that, and the tape is forever ruined. DLT/SDLT have physical alignment marks on the back of the tape, read by laser; you cannot magnetically destroy a DLT tape -- erase, sure. ruin, no.

    6. Re:Different objectives mean different solutions by Nutria · · Score: 1

      Interesting. I've only ever used DLT & SDLT (on DEC/Compaq/HP systems).

      --
      "I don't know, therefore Aliens" Wafflebox1
  30. hard drives have a finite lifespan by Anonymous Coward · · Score: 0

    As has been stated, your hard drives are not fine if you are starting to see data corruption. They are starting to die. The filesystem used is irrelevant. You have a hardware deterioration problem. Hard drives only last so long. When in constant use they'll eventually wear out. When sporadically used they're susceptible to other kinds of hardware failure issues. This includes magnetic issues, heat cycle issues, etc.

    Hard drives are not permanent storage. If you choose to keep your data on hard drives, and hard drives alone, then you will have to accept the fact that you will forever be buying new drives and copying the old data to the new drives. Whether that's done manually or is simply an automatic procedure is up to you and how you choose to set up your system. If you don't do this, you will eventually lose all your data because chances are your drives will stop working at some point. They all do.

    As has also been mentioned, your best bet is to set up some hardware redundancy and some filesystem redundancy on top of that, plus one or more extra copies in a different physical location. I'm confused as to how this basic stuff is on slashdot, frankly.

  31. Re:Filesystems with CRCs... by archer,+the · · Score: 1

    It looks like there are (at least) two with CRC: zfs and btrfs. Here's info for btrfs CRCs: https://en.wikipedia.org/wiki/...

    You'd still need a backup or RAID solution to replace a bad block.

  32. this doesn't make any sense by trybywrench · · Score: 1

    if bits were randomly changing you'd have corruption issues not faded images and videos missing keyframes. This is ridiculous.

    --
    I came to the datacenter drunk with a fake ID, don't you want to be just like me?
  33. Use permanent storage. by Gravis+Zero · · Score: 1

    HDDs will die. If you want something that will last for many decades or even centuries without getting corrupted then you need to stop using a volatile filesystem. The best option is to go with write once media. The best option I know is M-DISC.

    M-DISC's design is intended to provide greater archival media longevity.[3][4] Millenniata claims that properly stored M-DISC DVD recordings will last 1000 years.[5] While the exact properties of M-DISC are a trade secret,[6] the patents protecting the M-DISC technology assert that the data layer is a "glassy carbon" and that the material is substantially inert to oxidation and has a melting point between 200 and 1000 C.[7][8] -- Wikipedia

    --
    Anons need not reply. Questions end with a question mark.
    1. Re:Use permanent storage. by Dorianny · · Score: 2

      HDDs will die. If you want something that will last for many decades or even centuries without getting corrupted then you need to stop using a volatile filesystem. The best option is to go with write once media. The best option I know is M-DISC.

      M-DISC's design is intended to provide greater archival media longevity.[3][4] Millenniata claims that properly stored M-DISC DVD recordings will last 1000 years.[5] While the exact properties of M-DISC are a trade secret,[6] the patents protecting the M-DISC technology assert that the data layer is a "glassy carbon" and that the material is substantially inert to oxidation and has a melting point between 200 and 1000 C.[7][8] -- Wikipedia

      Did you even bother reading the wiki you linked to or did you just copy and paste the first paragraph ?

      "However, according to the French National Laboratory of Metrology and Testing at 90 C and 85% humidity the DVD+R with inorganic recording layer such as M-DISC show no longer lifetimes than conventional DVD±R.[11]"

  34. Two distinctly different problems by Tablizer · · Score: 1

    It may have nothing to do with bits. It's possible the problem is a media player and/or driver compatibility issue or bug. I've seen where one media player/displayer can display an image or video fine, but another gags on it or distorts it. Probably a bug in the encoder and/or decoder.

    As far as backups, make at least 2 copies. Bit-error-recovery schemes will usually require more storage space such that it's probably less hassle and more "insurance" to keep 2 regular copies rather than one copy with some fancy bit-correcting on it. Plus, in the future you may not be able to find a decoder for the fancy file encoder scheme.

  35. ReFS if you aren't booting from it by Anonymous Coward · · Score: 1

    Essentially MS's new ReFS does everything, plus self-healing, except there are no alternate data streams and no booting from it. Your use case sounds like archiving, which is exactly what this can be for.

    1. Re: ReFS if you aren't booting from it by Anonymous Coward · · Score: 0

      Confused why more people aren't talking about ReFS for archival; its current version is available to fuss with.

  36. Snapraid by silas_moeckel · · Score: 2

    ZFS is nice and I use it, but it makes assumptions about sane gear that are not safe on desktop-grade hardware. BTRFS, which I also use, works great. But for your specific use case snapraid is the thing to use. By that use case I mean things that never change: a big pile of files you keep adding to. Mind you, you're going to have to replace drives over time.

    --
    No sir I dont like it.
  37. What you might need by OrangeTide · · Score: 1

    An archival optical format. M-DISC DVDs and Blu-rays are theoretically able to retain data for 1000 years. And DVD already uses some error correcting codes, Reed-Solomon I believe.
    An SSD is a bad choice for archival; in some cases MLC flash can decay and accumulate errors in 3 months while unpowered.
    For a file system that is likely to be understood in the distant future, ISO 9660 with no file larger than 2 GiB should do the trick.
    Packing your data into a custom archive file format that has more sophisticated forward error correction, like Turbo Codes, could be useful although perhaps inconvenient if you need special software to decode the files.
    Keeping a file of hashes (MD5, SHA1, crc32, cksum, cfv, whatever) for file integrity verification is very helpful for checking whether you have bit rot. I've found that most proprietary file formats cause programs to crash when they are corrupt.

    Making N copies of your data and sending the discs to N destinations would allow you to recover most instances of partial data loss among all the discs, and total data loss of N-1 discs. I think N=2 or N=3 is plenty of paranoia without much overhead for an individual.
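
    With N=3, one way to merge damaged copies is a per-byte majority vote. The Python sketch below is only an illustration under simplifying assumptions: `majority_merge` is a made-up helper, the copies are the same length, and damage shows up as changed bytes rather than unreadable discs. With N=2 a vote cannot break ties, so you would need the hash file to tell which copy is right.

```python
from collections import Counter

def majority_merge(copies):
    """Rebuild data byte by byte from several equal-length copies."""
    merged = bytearray()
    for column in zip(*copies):
        value, _count = Counter(column).most_common(1)[0]   # most frequent byte wins
        merged.append(value)
    return bytes(merged)

original = b"family photo bytes"

copy1 = bytearray(original)
copy1[3] ^= 0xFF                            # rot in copy 1
copy2 = bytearray(original)
copy2[10] ^= 0x01                           # rot elsewhere in copy 2
copy3 = bytearray(original)                 # copy 3 survived intact

restored = majority_merge([bytes(copy1), bytes(copy2), bytes(copy3)])
print(restored == original)
```

    The vote fails only where two copies are damaged at the same position, which is why spreading the discs across different locations and conditions helps.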

    For short term, just throw it into the cloud. If your local backup

    NOTE: in 100-1000 years, people interested in old data won't need an off-the-shelf DVD drive to read the data off a DVD; any researcher should be able to construct a purpose-built drive. I mention this because USB, SATA and PATA won't be around as standards and the old electronics likely won't work reliably anyway. Even today, I think building a device to read a CD or DVD is within reach of a clever teenager.

    --
    “Common sense is not so common.” — Voltaire
  38. Here's how I'd do it. by SuricouRaven · · Score: 1

    1. Add lots of redundancy in the form of PAR2 files.
    2. Store the whole lot as a tar format, dumped to the drive as a block device. This format is so simple that a future programmer will have no trouble reverse-engineering it, even if all documentation has somehow been lost, and there are no key structures which will render the whole thing impossible to read if lost. Just to be sure, the first thing going on there is a copy of the tar format specification.
    3. Include also a copy of the par2 software for several operating systems, source code, mathematical explanation and format specification.
    4. dd copy the drive to as many other drives as your budget allows.
    5. Distribute the drives.
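
    Step 2 can be sketched with Python's standard `tarfile` module. In this illustration an in-memory buffer stands in for the raw block device, the file names and contents are placeholders, and the plain USTAR variant is chosen explicitly; all of that is my assumption for the example, not a requirement.

```python
import io
import tarfile

def pack(files):
    """Write {name: bytes} into an uncompressed USTAR archive and return its bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def unpack(raw):
    """Read every member of the archive back into {name: bytes}."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
        for member in tar.getmembers():
            out[member.name] = tar.extractfile(member).read()
    return out

files = {
    "tar-format-spec.txt": b"placeholder for the format specification",
    "photos/img001.jpg": b"placeholder image bytes",
}
raw = pack(files)

print(raw[:19])                 # the first member's name sits in plain text at offset 0
print(len(raw) % 512)           # tar is laid out in 512-byte blocks
```

    Because each member is a plain header block followed by its data, damage to one region does not prevent reading the members after it, which is the property the list above relies on.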

    This approach should do for the next forty years or so. After that point it might get difficult for people to source a SATA controller, so you will have to migrate to new media.

  39. Delete shit by Anonymous Coward · · Score: 2, Insightful

    Seriously, minimalism is underrated. There is such a thing as too much useless data. It's hard to catalog, it's hard to track, and if you sat down and sorted out what you actually could still use, most of it is probably worthless or you'd never find the time to use it ever again. You might ask "well it's still worth storing IN CASE I ever find a use for it", but that's a typical data-hoarder sentiment that is unsustainable. You can't just keep buying media to store everything and never delete; it's a management nightmare that results in these very issues.

    I guarantee you, if you find you've deleted something and actually want to get it back, it's available somewhere on the Internet. If it's NOT, then it's a candidate for keeping. That's how minimalism works.

    1. Re:Delete shit by Anonymous Coward · · Score: 1

      I knew someone in college with over 9,000 1.2M 5.25" floppies worth of porn. This was in the days before hard drives were affordable. I wonder what 10,000 floppy disks cost.

    2. Re:Delete shit by Anonymous Coward · · Score: 0

      I've got about 2TB of stuff, mainly movies and TV shows, and I'm thinking these days that I could probably just delete most of it. Some of it is hard to get because it's out of print and quite large (like every episode of Doctor Who or MST3K), but most of this junk I could stream or go on Amazon and buy the whole series at better quality if I ever want to watch it again.

    3. Re:Delete shit by davidwr · · Score: 2

      I knew someone in college with over 9,000 1.2M 5.25" floppies worth of

      If he was smart, at least half of them were used for backup.

      --
      Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
    4. Re:Delete shit by Anonymous Coward · · Score: 0

      I can't agree more emphatically with this.

      When you do file cleanup, move all your files to a directory named "delete-after-${6-months-from-today}".

      If you find you need any of the data, recover the specific files you need into a clear, sane folder structure that will clearly identify the file contents. If you don't find you need any of that data, delete the fucking folder and don't look back.

      If you're REALLY anal retentive about it, you can stage it - move it off your primary disk to a secondary storage location after 6 months, then delete from secondary storage a year after moving it.

      Seriously - if you don't know exactly what's there and why you're keeping it, it's useless garbage.
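
      Generating the holding-directory name is simple enough to script. A small Python sketch follows; the helper names are made up, and clamping to the last day of a shorter month is my own choice for what "6 months from today" means on, say, August 31.

```python
import calendar
from datetime import date

def add_months(day, months):
    """Shift a date forward by whole months, clamping to the month's last day."""
    month_index = day.month - 1 + months
    year = day.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))

def holding_dir_name(today, months=6):
    """Name of the directory that may be deleted once the date has passed."""
    return "delete-after-" + add_months(today, months).isoformat()

print(holding_dir_name(date(2024, 8, 31)))
print(holding_dir_name(date.today()))
```

      ISO dates in the name sort correctly, so a plain directory listing shows which folders are already past due.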

  40. RAID if you must, but cloud is better by hawguy · · Score: 1

    Just RAID it (preferably mirroring) and store multiple redundant copies, physically separated. Either use a checksumming filesystem (e.g. zfs) or make your own checksums so you can recognize bitrot.

    But you'll never know when things have degraded beyond recovery.

    Unless you're prepared to regularly validate that the data is still readable, you'd be better off storing the data at any major cloud vendor and let *them* verify integrity over time. Or better, mirror the data across multiple cloud providers.

    My most important data is family photos (some scanned images date back to the early 1900's). I keep the image files on a RAID-6 hard disk array, which is backed up to a separate hard drive in another part of the house once a week (for quick local restores), everything is also backed up to a Crashplan cloud backup account, and all of the files are also backed up to AWS Glacier in a different country from me.

  41. ANSI-labelled mag tape ... by Anonymous Coward · · Score: 0

    ... obviously.

  42. Filesystems and hardware that mitigate bit rot by davidwr · · Score: 1

    If I understand you correctly, you are asking what filesystems can error-correct in the face of physical bit rot.

    I don't know of any commonly-used "disk-type" (local, not specifically designed for archival/offline media) file systems that have checksumming or RAID-style redundant data within the filesystem itself. Some distributed/clustered file systems have features like this, but they aren't well suited for offline storage in the way that you are thinking about (or, when used for offline storage, the redundancy is likely "optimized away."). I'm not familiar enough with the filesystems used by optical media and tape drives or their underlying hardware to know how much redundancy exists or at what layer the redundancy exists at, but I suspect it is "below" the filesystem level.

    If you aren't interested in inventing an "optimal" solution in terms of storage space or time-to-create-or-read the backup, a "not much thought required/you can think it through in far less than an hour" solution is to create checksums for every backup you make (either per-file, per-"block," or some other way) then make a second copy of both the backup and the checksums. Store the two copies in different places. If it's very important, make a third copy but use a different format for this backup (for some documents, like a business letter, a printout is an acceptable backup).

    --

    If it's really really important, encrypt it and upload it to PasteBin and tell the world that it's political dirt on [insert politician's name here] and that you will release the encryption key if the politician doesn't resign. This will ensure that there will always be many copies in existence. *joke*

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  43. Digital hoarding by Anonymous Coward · · Score: 0

    Back in the day i had a 4.3gb hard drive in my computer; that's right a hdd the same size as a DVD. I had to uninstall one game to get enough space to install another. I used to run it as lean as i possibly could, but in these days of multi terabyte drives i have relaxed and succumbed to a little bit of hoarding; maybe a few hundred gb worth.
    But people who feel the need to keep multiple terabytes really need to look at themselves and think is it really necessary or is it simply hoarding. Digital hoarding can become very expensive very quickly; having to keep multiple drives containing the same data to ensure failure won't wipe it out. Having to routinely spin up the drives to ensure no damaged sectors have corrupted the data; doing a bit comparison to ensure both copies are identical.
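
    For what it's worth, the bit comparison itself is only a few lines. A Python sketch, where `first_difference` is a made-up helper and in-memory streams stand in for the two files:

```python
import io

def first_difference(stream_a, stream_b, chunk_size=1 << 16):
    """Return the byte offset where two streams first differ, or None if identical."""
    offset = 0
    while True:
        a = stream_a.read(chunk_size)
        b = stream_b.read(chunk_size)
        if a != b:
            for i, (x, y) in enumerate(zip(a, b)):
                if x != y:
                    return offset + i
            return offset + min(len(a), len(b))   # one stream ended early
        if not a:
            return None                           # both ended without a mismatch
        offset += len(a)

same = first_difference(io.BytesIO(b"abcdefgh"), io.BytesIO(b"abcdefgh"), chunk_size=4)
flipped = first_difference(io.BytesIO(b"abcdefgh"), io.BytesIO(b"abcdefXh"), chunk_size=4)
shorter = first_difference(io.BytesIO(b"abc"), io.BytesIO(b"abcd"), chunk_size=4)
print(same, flipped, shorter)
```

    Note that with only two copies a mismatch tells you that one is damaged, not which one; you need a stored checksum or a third copy to decide.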

    You seriously need to look at the data and consider if it's worth keeping and if it is can it benefit from compression, even lossy. Do you really need a 50gb bluray rip when a 15gb x265 encode will be all but indistinguishable? Do you really need a 20mb png of an image that can be converted to a 1mb jpg with minimal loss of fidelity?

    Just because we live in the age of single 10tb drives doesn't mean we should simply stop being selective or logical about our storage needs.

    I have 2 backup routines. The first is a (nearly) whole system backup onto a 2tb external hard drive (excluding steam folder and other things that can be re-downloaded) - this is something i can just restore if my drives go bang.
    The second routine is to back up the things i simply cannot live without. The things that i have written over the past decade or so which i simply could never replicate again. That is encrypted and mirrored in nearly a dozen separate places - on a usb stick, on my primary phone's memory and sd card, on my secondary phone's memory and sd card, on my mp3 player, in 3 separate 'clouds', on my tablet and on another external hard drive. That single encrypted file is less than 500mb. That is the difference between keeping the things you cannot live without and digital hoarding.

    1. Re:Digital hoarding by davidwr · · Score: 1

      Back in the day i had a 4.3gb hard drive in my computer

      Ah, kids these days. Back in my day 4.3mb was huge.

      "Nine megs for the secretaries fair,
      Seven megs for the hackers scarce,
      Five megs for the grads in smoky lairs,
      Three megs for system source;

      One disk to rule them all,
      One disk to bind them,
      One disk to hold the files
      And in the darkness grind 'em."

      --
      Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
    2. Re:Digital hoarding by Anonymous Coward · · Score: 0

      A whole lot of poorly-written waffle, but quite correct nevertheless.

      You could certainly have left out your personal digital hoarding strategies, but your point stands. However, the real answer is to use simply two separate cloud services, and be done with it.

    3. Re:Digital hoarding by Cmdln+Daco · · Score: 1

      Ahem.

      Back in the day I had a 5 megabyte hard drive in my system. Since my floppy drives were 360K, it held the equivalent of 15 diskettes worth of data.

      A 20mb png image??? Seriously? How many floppy diskettes will it take to back that up? Isn't a picture only worth a thousand words?

    4. Re:Digital hoarding by Cmdln+Daco · · Score: 1

      A whole lot of poorly-written waffle, ...

      I ran a Waffle BBS for a time. It was an oddball waffle system, because I was running it on an MS-DOS system, and it was mostly hosted on UNIX systems, I was told.

      There was this one punk kid who kept trying to crack it. He'd dial up and just keep throwing esc sequences and random characters out of his modem at my modem. I think he thought my Waffle BBS was running on a UNIX system. It would have served him right to crack his way to an MS-DOS prompt. But when I figured out what he was doing I linekilled him and banned his account.

  44. Stone tablet by Anonymous Coward · · Score: 0

    Get out your chisel and mallet. Carved into stone tablets your data can last for millennia. You will want to throw in a little error correction. Better get to chiseling...

    Or use an optical format made for archiving.

  45. Get therapy by Anonymous Coward · · Score: 0

    for your OCD, hoarding, and anxiety disorder.

  46. Are you sure it's bit-rot? by davidwr · · Score: 1

    It may be that the codec you are using now isn't bug-for-bug compatible with the codec that was used to store the file.

    It's also possible that the file was saved in a "not quite industry standard format" but that it would look fine on vintage hardware running a vintage OS with vintage device drivers and vintage software, but today's hardware and software interprets these "not quite industry standard-format" files in a way that exposes their flaws.

    Got a Pentium II computer and a copy of Windows 98 in the basement?

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  47. How about getting rid of it? by swb · · Score: 4, Insightful

    You've got terabytes of information you will never access again. How about just getting rid of most of it? Pick some subset you want to keep and then buy 3 HDDs and create triple copies of it. Repeat this every year and you'll probably not lose any of the information.

    1. Re:How about getting rid of it? by Anonymous Coward · · Score: 2, Insightful

      I completely agree. Someday you will die. Maybe .01% of what you have stored will be valuable to your posterity (including photos and videos). After two generations - nobody will care except for a picture or two.

      Too bad you have now dumped terabytes of uncurated junk on them that they will now have to spend days checking for anything useful. Do your posterity a favor and save the useful stuff and junk the rest.

      By the way, if you don't, this will repeat with your children, and children's children, until each generation is handing down exabytes to the next. What a terrible legacy, pretending we are so important.

    2. Re:How about getting rid of it? by Anonymous Coward · · Score: 2, Insightful

      Never in human history have we had the ability to save such an amount of useless, pointless "information". Unless you are super famous (and even then only 1-3 generations will give a shit) or end up becoming some very important figure in history, nobody and I mean nobody will look at your shit after you die.

      If my dad dumps TBs of data on me, I'm shredding it. I won't even plug the fucking things in.

      The time, money and energy you are putting into this is a complete and utter waste. Read a book, clean some windows, do any other mundane chore and you'll have achieved more. Go cut grass with scissors, you'll be more productive and it's a story your great grand children will actually hear about. They won't ever see a single photo or anything else you've "saved".

    3. Re:How about getting rid of it? by Anonymous Coward · · Score: 0

      It's not for future generations, it's for me. I'll judge what I feel is important and store based on the worth I assign to it.
      But thanks for your less than useful input.

    4. Re:How about getting rid of it? by Waccoon · · Score: 1

      You may have terabytes of information, but you never know which megabyte you'll need.

      I once needed to dig up a logo I created in college almost 20 years ago. Due to my good organizational skills, recovering it was as simple as going into my folder of college stuff, and... voilà.

      Ask somebody who archives company invoices how important data can be 10-20 years later.

    5. Re:How about getting rid of it? by Anonymous Coward · · Score: 0

      Suuuuure. But even if you mean that, there's a discrepancy between what you feel is important and what will actually be important to you, because you are failing to realise that you'll die soon.
      There's simply no way that such a huge amount of data is going to be useful to you. We aren't talking about your pictures here, we're talking about video and games. These are items which take considerable time to consume, especially when you've got so much of it. If you keep accumulating crap, at some point you'd have to be doing nothing but watching video all day just to finish it before you die.
      But you won't, because I know your type. You'll keep watching new stuff and store it on more and more drives and you'll watch maybe one or two series again, if ever. And you'll probably download those again, because finding them in your trash pile takes too much time and the internet has the remastered re-release or whatever. And you'll ‘save’ that too. And you'll complain to your friends about your accumulating backlog of series to watch.
      And then you'll die and nobody will ever look at any of it again.

    6. Re:How about getting rid of it? by swb · · Score: 2

      It's worse than that. A friend of mine is in the estate sale business. He and his partner have been doing it, along with sidelines in collectible art and furniture, for close to 30 years, and cater to a who's-who list of local old money families.

      Unless you are an extremely serious collector of high value objects, about half your stuff will sell for 10 cents on the dollar and the rest will go to the landfill. Vintage silver service? Valued at the melt value of the silver.

      I helped him move stuff to the dumpster a couple of years ago and stumbled across (in the dumpster) 5 photo albums with family pictures. Nobody wanted them, not even the family, although I think in this case "family" was 4 cousins in their 50s who lived out of state.

    7. Re:How about getting rid of it? by Anonymous Coward · · Score: 0

      Photos of landscapes and buildings get valuable after only one decade if those areas have been demolished and built over. Your photos become valuable to grandchildren and great-grandchildren, and valuable to fashion archivists if you were wearing the fashions of the time. There were photographers who just went around capturing pictures of everyday scenes of the time. Now those become iconic photos of the past.

    8. Re:How about getting rid of it? by Anonymous Coward · · Score: 1

      Photos of landscapes and buildings get valuable after only one decade if those areas have been demolished and built over. Your photos become valuable to grandchildren and great-grandchildren, and valuable to fashion archivists if you were wearing the fashions of the time.

      No, YOUR photos don't. YOUR photos are blurry, out of focus, poorly-lit abortions which give the people looking at them headaches if they look too long.

      The VALUABLE photos are the ones which are clear, well-lit, well-composed, and properly focused. Those tend to be taken by people who are professional photographers, or at least GOOD amateurs. And those people will take dozens of digital photos to capture that one GOOD photo that will be valuable some day.

      Look, I've got thousands of candid photos through the years of my now-17 year old daughter and my wife, whom I love dearly. I would be very disappointed if I lost them, because they capture little moments and memories that are meaningful to me. Some of them MAY be meaningful to my daughter someday, but pretending that "all of these photos" are worth keeping and valuable is just fucking stupid. If I want to leave my daughter something of "value", I'll cull through all those low-quality photos to find the 20 or 30 really GOOD photos, get them printed professionally, and present them to her as a memento. I won't hand her a 10-terabyte hard drive full of 40,000 photos, and say "There's some good memories in here, you should find them."

      Looking through a nice presentation of 30 good photos is a lot more rewarding than paging through 40,000 terrible photos. At some point, the scale of the data is simply too great for any single person to manage.

      How many photos do you think were taken in Times Square on August 14, 1945? "V-J Day In Times Square" is an iconic photo. Out of the probable hundreds of thousands of photos taken that day in Times Square, one photo emerged as iconic - and it was published nationally in Life in the week after it was taken, which suggests that other people recognized its importance. If you think your random cameraphone photos are *likely* to become "iconic," then I'm guessing you also have a "sure-fire" way to buy lottery tickets that just hasn't quite paid off for you yet.

    9. Re:How about getting rid of it? by Anonymous Coward · · Score: 0

      I tend to agree: Reduce. That said, I have kept downloads of some niche content from web sites which no longer exist (think GeoCities sites with content about really old computer games) that is meaningful to me, but won't be to my descendants.

      At some point, how do you even know what you have? Even with indexing and searching, if you have forgotten about something completely, you wouldn't even think to look for it unless you spend a bunch of time combing through your archives....

  48. BTRFS another option by Anonymous Coward · · Score: 2, Informative

    In addition to ZFS, BTRFS also handles bitrot. I'm running a 4-disk BTRFS RAID 10 in my closet, mounted on a development machine on my desk via NFS. It's been working fine for about a year, and I have a scrub scheduled a couple of times a month whose purpose is exactly this: to catch and correct bitrot. It does so by using a CRC32 check, and if it detects a problem on one slice it overwrites that slice from the data on the good slice.
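
    For anyone curious what a scrub does conceptually, here is a toy sketch in Python. It is not how BTRFS is implemented: the in-memory `mirror_a`/`mirror_b` lists, the `scrub` function and the sample blocks are all made up for illustration, and plain `zlib.crc32` stands in for the filesystem's checksum.

```python
import zlib


def scrub(mirror_a, mirror_b, checksums):
    """Check every block on both mirrors against its stored CRC32.

    A copy that fails the check is overwritten from the copy that passes.
    Returns a list of (block_index, what_happened) tuples.
    """
    repairs = []
    for i, expected in enumerate(checksums):
        ok_a = zlib.crc32(mirror_a[i]) == expected
        ok_b = zlib.crc32(mirror_b[i]) == expected
        if ok_a and not ok_b:
            mirror_b[i] = mirror_a[i]
            repairs.append((i, "b"))
        elif ok_b and not ok_a:
            mirror_a[i] = mirror_b[i]
            repairs.append((i, "a"))
        elif not ok_a and not ok_b:
            # Both copies are bad: detection only, nothing to repair from.
            repairs.append((i, "unrecoverable"))
    return repairs


# Hypothetical data: three blocks stored on two mirrored "disks".
blocks = [b"photo data", b"video data", b"ebook data"]
checksums = [zlib.crc32(b) for b in blocks]
mirror_a = list(blocks)
mirror_b = list(blocks)

mirror_a[1] = b"video dbta"  # simulate silent corruption on one copy
repairs = scrub(mirror_a, mirror_b, checksums)
print(repairs)
```

    The point of the sketch is that the checksum only tells you which copy is bad; the repair still needs a second, good copy to read from.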

    Also I have offline and offsite backups of very important items.

    When using BTRFS, read the wiki and settle on a kernel version and btrfs tools version that are sufficiently up to date. It has stabilized sufficiently for these kinds of things, but only if you are careful to run an up-to-date version that isn't marked as buggy on the wiki.

  49. ZFS by Anonymous Coward · · Score: 0

    If your data is on-line (stored in disks that are plugged in) then you want ZFS. Preferably with either mirroring, RAID or multi-copy turned on so you have more than one copy of each file. This allows the file system to repair files that fall victim to bitrot.

    If your file system is off-line (not plugged in) then you should make multiple copies of each file/disk and store your backups in a different location. Chances are the same file will not get corrupted in both places in the same time frame.

  50. HAHA! by s.petry · · Score: 1

    Thanks for that one!

    --

    -The wise argue that there are few absolutes, the fool argues that there are no probabilities.

  51. Usenet by Anonymous Coward · · Score: 0

    Somehow all those postings I made back in the 90's are still available.

  52. Re:Filesystems with CRCs... by davidwr · · Score: 2

    It looks like there are (at least) two with CRC: zfs and btrfs. Here's info for btrfs CRCs: https://en.wikipedia.org/wiki/... [wikipedia.org]

    You'd still need a backup or RAID solution to replace a bad black.

    If only Slashdot posts had CRC or something like that, the posts wauld say what the poster intended.

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  53. OFFLINE Storage, with FS Access by williamyf · · Score: 5, Informative

    That's a job for Linear Tape File System

    https://en.wikipedia.org/wiki/...

    Tape is (still) the best medium for Long Term Storage. Over the years tape (or more likely, the engineers) has aggressively incorporated into the standards things like FEC codes (from Reed-Solomon to more exotic ones nowadays).

    And since 2010, with LTFS, you can access the files with the convenience of a normal filesystem (but bear in mind, access is slow as hell).

    Back up your data to tape (more than one set), and send it to specialized offline storage facilities (climate controlled: i.e. temperature/humidity/dust/light control) from different providers, in different geographical areas.

    Since there is now only one true tape standard (LTO-7, released in 2015; the tape business has been shrinking, so the proliferation of standards seems to be over), if you use that today, chances are you will still find equipment to read it 50 years from now. Nonetheless, keep a few (as in two or more) SYSTEMS (Computer+Drive+SW) set up so that you can re-read the tapes. A cheapo micro form factor mobo with an Atom processor (but NOT the Atom C2000 series, PLEASE), Linux, a 1 Gbps NIC and a tape drive should be more than enough. ....

    Now, for Online, as other posters have said, ZFS WITH ECC memory (and therefore, a very expensive Xeon, or AMD server type mobo) and JBOD will do the trick.

    --
    *** Suerte a todos y Feliz dia!
    1. Re:OFFLINE Storage, with FS Access by laing · · Score: 1

      LTO drives cost between $2k and $5k and the tapes are over $100 each. With only a 6TB maximum capacity, you may as well buy a bunch of 6TB hard drives for $160 each. LTO only becomes economical when the size of your tape archive exceeds 300TB.

    2. Re:OFFLINE Storage, with FS Access by Anonymous Coward · · Score: 0

      Tape is (still) the best medium for Long Term Storage.

      Define best. Tape backups are reputed to be good for about 30 years. M-Disk is (supposed to be) good for 1000 years (estimated). HOWEVER, the biggest M-Disk you can currently get is 100gb.

    3. Re:OFFLINE Storage, with FS Access by fnj · · Score: 1

      ZFS WITH ECC memory (and therefore, a very expensive Xeon...)

      Bullshit detector triggered. The socket 1151 Xeon e3-1220v5 is $206.49. It's a 4 core 3 GHz and supports up to 64 GB of DDR 3 or 4 ECC or non-ECC RAM.

    4. Re:OFFLINE Storage, with FS Access by Miamicanes · · Score: 1

      If you can live with 25gb capacity, the most cost-effective current medium you can get is non-LTH BD-R.

      M-Disk is just a non-LTH BD-R disc with the spiral geometry and pit dimensions of a DVD. It's a nifty idea, but unless you LITERALLY have to be able to stick the disc into a consumer DVD player and watch it directly, you'll get the same data-durability and save a substantial amount of money by just ripping your DVDs to .iso files & writing several of them to a single BD-R disc. Likewise, unless you need to be able to read your files on a PC that can't read Blu-Ray discs, there's NO GOOD REASON to buy M-Disks instead of non-LTH BD-R discs.

      Well, OK, I can think of one... even if you know what you're looking for, trying to buy non-LTH BD-R discs is kind of like trying to buy a multi-TT USB hub... it's rarely advertised as an explicit feature, so you can spend literally HOURS combing through product listings at Amazon & Newegg looking for the non-LTH needle in a growing haystack of LTH discs (searching rarely works, because the fucking search algorithms at Amazon and Newegg treat "LTH" as a match for "non-LTH"). Whereas if the disc is branded "M-Disk" and has a capacity of 25gb, you know beyond doubt that it's non-LTH BD-R.

    5. Re:OFFLINE Storage, with FS Access by dbIII · · Score: 1

      Not bullshit I'd say, just using "very expensive" in comparison with the bottom end desktop CPUs around the $50 mark.

    6. Re:OFFLINE Storage, with FS Access by Anonymous Coward · · Score: 0

      Expensive?

      Well if all the fucking stupid windows gamers growing up to be stupid corporate admins would LISTEN to SAGE ADVICE and run ECC RAM on whatever cpu/mobo combo supports it, from their first computer onwards...

      THEN THERE WOULD BE NO EXPENSE BECAUSE THERE'D BE NO MARKET for shit ram and shit systems based upon their stupid KIDDIE theories.

    7. Re:OFFLINE Storage, with FS Access by Anonymous Coward · · Score: 0

      HAHAHAHA !!!!

      For all consumer end user level quantities of data, and quite frankly for anything BOTH under an exabyte AND that you expect to have actual data demand for,
      the COST of LTO and semantics of tape completely prohibits itself.

      You're better off buying and building multiple entire redundant offsite chassis full of 10TB+ HDDs
      for less than cost of a single LTO-7 drive with 1 tape and 1 cleaning cartridge.

      You need to be a bigdata corp / gov / edu before you start thinking tape.

    8. Re:OFFLINE Storage, with FS Access by dunkelfalke · · Score: 1

      How about a LGA2011 Xeon E5-2660? 8 cores, 16 threads, 20 megs of cache. $50 to $90 on ebay. Very fast. The motherboards are somewhat expensive, though, but then again you'll be able to buy used server RAM with ECC and register chips that is cheaper than normal DDR3 RAM. You can buy that xeon with a cooler, motherboard and 32 gigs of RAM for about $300 altogether.

      --
      "It's such a fine line between stupid and clever" -- David St. Hubbins, Spinal Tap
    9. Re:OFFLINE Storage, with FS Access by dbIII · · Score: 1

      How about a free computer out of dumpster? All I'm referring to is a poster above getting insulting over the obvious that a new Xeon is not the cheapest CPU available. For some reason it annoyed me so I posted out of boredom or something instead of letting it go.

    10. Re:OFFLINE Storage, with FS Access by Anonymous Coward · · Score: 0

      You don't need an expensive Xeon. Some Pentium processors for circa $50 support ECC. Or you can buy a used Supermicro server or HP workstation. Without disks, you are looking at below $400 for a system, either new desktop grade or used server grade.

      I would suggest FreeNAS (uses ZFS), just read docs and buy good storage controller, if your board does not provide you with one.

    11. Re:OFFLINE Storage, with FS Access by fnj · · Score: 1

      $200 is not "very expensive" for a CPU. Sheesh.

    12. Re:OFFLINE Storage, with FS Access by Anonymous Coward · · Score: 0

      Keeping a working fleet of LTO drives with lots of tape moving past their heads is a major headache so my gut reaction was "fuck no to this guy."

      However the reason is few vendors, small population of drives in service, high in-use wear. None of those things apply in 100 years. What will be easier for a futuristic society to fix or replicate, a BD-ROM or a tape drive?

      I just wish there were some form of tape with better endurance. https://en.wikipedia.org/wiki/Creo is one idea, and it seems easy to read.

    13. Re:OFFLINE Storage, with FS Access by Anonymous Coward · · Score: 0

      Doesn't have to be expensive, a Celeron based server with enough (8GB+) ECC RAM will do just fine for FreeNAS with ZFS. For example a HP Microserver Gen8 can be bought for around $200.

    14. Re:OFFLINE Storage, with FS Access by fnj · · Score: 1

      I just discovered something that surprised me greatly, and should please you as it does me. The lowly Celeron G3900 dual-core Skylake supports ECC - 64GB of ECC. It doesn't have hyperthreading, and only 2MB cache, but it DOES have full VT-x with EPT, and VT-d.

      All for only $44.75.

  54. You have a hardware problem. FS choice won't help. by Anonymous Coward · · Score: 1

    Hard drives already use ECC on the physical platters (and SSDs do this too) to ensure bit rot is correctable (or at least detectable in the case of 2-bit failure). Unless you have a particular type of 3-bit failure, monitoring SMART stats and catching any read errors thrown by the drive is sufficient.

    There are WAY too many ZFS/btrfs zealots. "Oh, it has CRCs!" Yeah, so what? CRCs will tell you that data is already damaged just like a hard drive will. "Oh, it has ECC!" Yeah, so does the physical medium, so how is that going to make a difference beyond the hard drive? It might (MIGHT) detect hardware failures between the medium and the CPU/RAM, but if that's going out then you've got a much bigger problem than a filesystem with ECC is going to be able to help. Plus your CRCs only help if you either read the data or scrub frequently; untouched data CRCs won't get checked otherwise which is no change from no CRCs at all.

    The solution is what it always was: keep a backup copy of everything and restore from backup when something is damaged. If you want to detect data loss due to surface failure more quickly, you'll need to do a comparison of the backup against the data and see if a file has actually changed (rsync -rcvn source/path/ dest/path/ will display all changed files detected by full file checksums instead of time+size difference).
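
    If you'd rather script that comparison than run rsync, a rough Python equivalent of the dry-run checksum compare could look like the sketch below. The `changed_files` and `file_digest` names are made up for this example, SHA-256 is my choice of hash (rsync uses its own checksum), and like the rsync command above it only reports files that differ from or are missing in the backup, not extra files on the backup side.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a whole file in 1 MiB pieces so large videos don't fill RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(source, backup):
    """List source files whose backup copy is missing or has different content.

    Nothing is copied or modified; this only reports, like a dry run.
    """
    source, backup = Path(source), Path(backup)
    changed = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = backup / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            changed.append(rel.as_posix())
    return changed
```

    Usage would be something like `changed_files("source/path", "dest/path")`. A file that shows up in the list without you having edited it is your bit rot candidate; you still have to decide which of the two copies is the good one.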

    ZFS and btrfs have lots of spiffy features that may be helpful in avoiding or mitigating data loss (snapshots come to mind), but I really wish people would stop acting like they're a magic bullet that kills your bit rot problems. They're not. They never will be.

    RAID-5, XFS, and an array on a different machine periodically mirroring with rsync snapshots has NEVER failed me and likely never will. If you're losing so much data that multiple files are getting corrupted, ZFS and btrfs ARE NOT GOING TO SAVE YOU.* Fix your computer.

  55. RAID isn't nearly enough by Anonymous Coward · · Score: 0

    RAID assumes statistically decoupled failure modes. One place I worked, they shut down a rack of servers for maintenance, and had over a dozen simultaneous lost drives when powering them up. Stiction. A design flaw in the bearings or heads or something caused the disks to get scraped to hell, completely unreadable. One brand of recent drives can't shut down properly because the capacitor that powers the emergency-parking the heads isn't strong enough to overcome unexpectedly high friction in the bearings, if the grease settles for too long.

    So yeah, just assume you're going to lose that data some day, and learn to live with that fact.

  56. parchment by Anonymous Coward · · Score: 0

    Data recorded on parchment has survived for hundreds of years, through numerous world wars, environmental catastrophes, human stupidity, etc.

    Make sure to back it all up on clay tablets.

  57. Forever? by Anonymous Coward · · Score: 0

    > I'd like to save forever

    Why? Nobody will want your crap after you die. What you are doing is called 'hoarding'. If you have traces of sanity, you should destroy this crap and start living your life.

    A related thought: whoever worries too much about 'bit rot' should keep reminding themselves that this is exactly what happens to us - both our bodies and minds - in the course of each and every minute. Both our genetic material and our brain cells continually deteriorate.

    Don't waste your time.

  58. SnapRaid by galvanash · · Score: 2

    It's not the only solution of its type, but it is imo the best:

    http://www.snapraid.it/

    It is perfect for your kind of situation - long term, reliable, efficient storage of lots of data that seldom changes. Think of it as offline RAID backup: it works like RAID, but it computes parity during your backup operations, "offline".

    The beauty of it, imo, is that it is not file system dependent. It works with NTFS, EXT2, HFS, whatever. It works on Linux, Windows, Macs, whatever. You don't need special controllers, and your hard drives do not have to be matched to each other. You can even include drives on different buses (some on USB, some on SATA, whatever).

    It doesn't mess with your data at all - your files are stored normally and can be accessed normally, there is no difference between using it and not using it under normal operation - there is no performance impact at all (it only does anything during backup operations - and even then it is very lightweight if your data doesn't change drastically day to day). You just schedule it to run on a regular basis and it does its thing. It detects and recovers from bit rot in much the same way as ZFS (although you need double parity or more to really ensure full protection from multiple drive failures). You can be as paranoid as you want, it just takes more storage to be more paranoid :)

    It isn't good for frequently changing data, and it isn't so great for huge numbers of small files either. It takes a long time to generate the initial parity if you have lots of data. You have to be comfortable with command line usage and you have to have some way to schedule jobs. Those issues aside, for things like media libraries and archival storage, it is easily the least painful, most effective solution I have ever used. And it's free to boot (and open source).

    Highly Recommended.

    --
    - sigs are stupid
  59. Paper Tape by arfonrg · · Score: 1

    Paper Tape - As long as you don't damage it, it will never suffer data loss.

    --
    Your thin skin doesn't make me a troll
    1. Re:Paper Tape by Nutria · · Score: 1

      Someone with a 5-digit ID should have seen paper turning brown & brittle.

      --
      "I don't know, therefore Aliens" Wafflebox1
    2. Re:Paper Tape by caseih · · Score: 1

      Yeah sadly paper from the early 1900s to the 70s was rubbish. Paper from the 1600s, now that will not be turning brown and brittle and is just as readable today as it was back then.

    3. Re:Paper Tape by Nutria · · Score: 1

      But there's no paper tape from the 1600s... :)

      Also, I've got paperback books from the 1980s that are brown & brittle.

      --
      "I don't know, therefore Aliens" Wafflebox1
    4. Re:Paper Tape by Anonymous Coward · · Score: 0

      Someone with a 5-digit ID should have seen paper turning brown & brittle.

      I call BS, I still have programs and data on pre-1973 paper tape. They are neither brown nor brittle. But I no longer have any hardware capable of reading them.

    5. Re:Paper Tape by Cmdln+Daco · · Score: 1

      There is a mylar variety of punched paper tape. I've seen it used to store Printed Circuit Board drilling data. It should also be well-suited to storing ASCII data. I know some of the first programs I ever wrote (in BASIC) were stored that way. It's the only storage we had on the MERITS dialup timesharing system when I was in High School.

    6. Re:Paper Tape by Anonymous Coward · · Score: 0

      In our research lab, someone left their reference papers on a shelf exposed to direct sunlight. After a year, those papers had gained a sun-tan.

  60. Re:Error correction codes. PAR2, btrfs, partitions by heypete · · Score: 4, Informative

    QuickPar on Windows is long-obsolete. MultiPar is the more modern variant.

  61. Whenever I write data "for keeps" ... by aix+tom · · Score: 1

    ... be it on an optical disk or another storage medium, I first add ~25% error correction data with http://www.dvdisaster.com/en/i....

    So far I have only needed it once, when I wheeled over a DVD with my office chair, and it was enough to recover the data.

  62. Use winrar to by weedjams · · Score: 1

    archive stuff and upload everything to usenet.

  63. Check your RAM by GuB-42 · · Score: 2

    A lot of bit rot is actually caused by faulty RAM.
    When data is moved around, it has to go through RAM, and even smart filesystems like ZFS may not help you there. Servers usually have ECC memory for that reason and ZFS explicitly recommends it.

    1. Re:Check your RAM by Anonymous Coward · · Score: 0

      First, advise people to USE ECC RAM !!!
      After you've done that thus substantially reducing their odds of corruption in the first place,
      then if you like those much better odds you won't have to tell them to check it,
      because the system log will be reporting it.
      Otherwise sure, memtest86 once a year is fine if you don't like it.
      But guess what most people NEVER do???

      They never fully test the modules because they never swap their locations and run the test in each location.
      Their OS takes up base ram, morons.

  64. Scrubbing by xlsior · · Score: 1

    For live data, some NAS devices like Synology have a 'scrubbing' option where it can rewrite your dataset once a month to prevent magnetic levels from degrading too much, and prevent bit rot by doing so.

  65. open zfs / zfs on linux by itr2401 · · Score: 1

    ZFS on Linux (http://zfsonlinux.org/) is a great option, thanks to the great work done by Lawrence Livermore National Laboratory. Also, have a look at: http://open-zfs.org/wiki/Distr... and http://open-zfs.org/wiki/Compa... for solutions where ZFS is integrated into various products.

  66. Re:Filesystems with CRCs... by Anonymous Coward · · Score: 0

    You'd still need a backup or RAID solution to replace a bad black.

    I hope you mean a Western Digital Black.

  67. Bit-rot happens in transit by Anonymous Coward · · Score: 0

    Bit-rot doesn't happen in storage. (Because it has an ECC on the hard disk; you either get your data back, or you get a read error!)

    If you have numerous examples of file corruption, they happened as you were reading, writing, moving/copying the data. Think about how you've done that in the past. Especially think about whether you used any external docks, USB adapters, or similar.

  68. As has already been stated... by brantondaveperson · · Score: 1

    ..forever is a very long time. Your stated aim is simply impossible. Delete your data, because the reality here is that no-one, not even you, if you really examined your own feelings on the matter, honestly cares about your terabytes of digital driftwood.

    Of course, if you are really intent on storing this information forever, then you're going to have to consider what happens when you die. For this, you're going to have to become rich, because no-one is going to look after this stuff for free. You'll also need a library of hashes of the files, to ensure integrity, and naturally at least two copies of each. You'll have to write software to continually re-calculate the hashes, and check against your library, but that's OK, because you could probably sell this kind of archival service to other OCD-stricken humans, which takes care of your money problems.

    In fact, it occurs to me, that the real and only answer to your problem, is to invest all your time in building a company that provides this service, use the proceeds to look after your data too, and write provisions into your will that your data is preserved forever.

    And thermodynamics, be damned.
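
    The "library of hashes" part at least is not much software. A minimal sketch in Python, assuming SHA-256 and a plain dict as the library (the `build_manifest` and `verify` names are made up here; a real archive would persist the manifest somewhere safer than the disk it describes):

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash one file in 1 MiB pieces."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map every file under root (by relative path) to its SHA-256."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the files that are missing or no longer match the manifest."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != expected:
            bad.append(rel)
    return bad
```

    Run `build_manifest` once when you archive, keep the result with each copy, and run `verify` against each copy on whatever schedule your paranoia dictates. It only detects damage; the second copy is what repairs it.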

  69. The storage medium is important too by pestilence669 · · Score: 1

    Over the years, I've had failures in my CD/DVD archive, hard disks, and solid state storage (USB, CF, MMC, SD). Consumer grade hardware isn't designed for longevity. The only rock solid archival medium that's never failed me is obscure and dead. That said, ISO filesystems on CD-ROM will likely be readable for a decade or two longer... as long as your media doesn't rot. FAT is the next most ubiquitous.

  70. Re:Error correction codes. PAR2, btrfs, partitions by Kjella · · Score: 2

    I agree on PAR2, simply because it's a file you can easily copy around, take backups of and so on. From a 1GB file I have ~3000 source blocks and ~30 recovery blocks, so I can recover from a lot of bit flips or failed 4 KB sectors for a 1% size gain. If it's a photo set I usually make sure I can recover at least one completely missing photo. The nice thing is that it's sufficiently overkill that you can probably go through several hardware generations without checking/repairing before you accumulate an unrecoverable number of errors. Which is good, because it's fairly CPU intensive so I wouldn't really want to go through an 8TB drive often. But I've found that an on-demand check when I actually need it is fine for content that is "in storage". It's not like it happens very often, or applications and other more bit-flip sensitive formats would be screwed up quite often.
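
    To spell out the arithmetic behind those numbers, here is a back-of-the-envelope sketch in Python. The exact file size of 1,000,000,000 bytes and the variable names are assumptions for illustration. With PAR2, each recovery block can stand in for one damaged or missing source block, no matter how small the damage inside that block is.

```python
file_size = 1_000_000_000   # assumed: a "1GB" file, in bytes
source_blocks = 3000        # how many pieces the file is split into
recovery_blocks = 30        # extra parity blocks stored alongside

# Each source block covers this many bytes of the file.
block_size = file_size // source_blocks

# Extra storage spent on recovery data, as a fraction of the file.
overhead = recovery_blocks / source_blocks

# Worst case: every error lands in a different block, so only
# `recovery_blocks` separate errors can be repaired.
max_damaged_blocks = recovery_blocks

# Best case: the damage is contiguous, so this many bytes are repairable.
max_repairable_bytes = recovery_blocks * block_size

print(block_size, overhead, max_damaged_blocks, max_repairable_bytes)
```

    So the 1% buys you roughly 10 MB of contiguous loss, or 30 scattered bad sectors, whichever you hit first.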

    --
    Live today, because you never know what tomorrow brings
  71. Re:Error correction codes. PAR2, btrfs, partitions by Anonymous Coward · · Score: 0

    The magic phrase to Google is "error correction codes" (ECC).

    PAR2 uses Reed-Solomon error correction. parchive is the ECC file format specification, for Linux you will want PyPar or par2tbb, and on Windows you use a GUI called QuickPar.

    Btrfs can be set to use ECC on a single disk.

    You can slice a single disk into partitions and then use RAID1 or LVM mirroring, or RAID5 or RAID6. LVM can also be useful to divide (and combine) any number of drives into any number of volumes, then you can RAID across the volumes.

    If you Google "ecc disk", "ecc backup", or "ecc archive" you'll find other options, with details about each option.

    ECC is probably not going to fully cut it. That just increases the number of errors that can be corrected, usually not by a large amount. Long term, entire disks are going to die, or if you're lucky only part of an entire disk.

    I'm thinking of something like torrent. For instance, suppose your set of files decomposes into 100 torrent chunks. Each chunk is a MB or whatever. Each chunk can be copied onto multiple destinations. All you need to do is obtain a full set. You can extend this a bit, by encoding the chunks similarly to how raid-6 does, such that you effectively have to recover two out of four sub chunks to recover that larger chunk. (You would still have to recover all 100 chunks to recover the original file.) (Maybe you can do better than Raid-6...)

    At any rate, each disk should contain a copy of all the file names and the checksums of the major chunks. This is a fairly small amount of info, so its being copied everywhere is harmless. The distributed ECC allows you to recover files with some parts missing or corrupt. Combine that with, as mentioned, just ensuring you have copies of all the pieces reasonably distributed and you're in pretty good shape. The use of something like torrent chunks basically adds the ability to verify that a chunk is intact. Also, breaking things up into pieces increases the odds of finding enough intact pieces to recover the whole.

    I think Backblaze may do some of this. With some actual effort, you could probably figure out how to optimize this and see what failure patterns you really protect against.

    On a side note, the "torrent chunks" should typically try not to smear small files across more than one chunk. We want to be able to do a partial recovery, if the data is badly corrupted. The main key is to decompose the problem into a set of smaller problems such that you can determine with some degree of certainty just how reliable the resulting system is and what your failure profiles are. Depending on how things are structured you may be able to control the possible failures to ensure that things really do fail somewhat gracefully, as opposed to simply either having enough to reconstruct all, or nothing.
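
    A toy version of the chunk idea, in Python. To keep it short this uses a single XOR parity chunk per group (RAID-5 style, so it can rebuild only one bad chunk per group, not two like RAID-6), with SHA-256 digests playing the role of the torrent-style chunk checksums. All the names (`make_group`, `recover`) and the 4-byte chunks are made up for illustration.

```python
import hashlib


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_group(chunks):
    """Return (parity, digests) for a group of equally sized chunks."""
    parity = bytes(len(chunks[0]))  # all zero bytes to start
    for c in chunks:
        parity = xor_bytes(parity, c)
    digests = [hashlib.sha256(c).hexdigest() for c in chunks]
    return parity, digests


def recover(chunks, parity, digests):
    """Rebuild at most one chunk that is missing (None) or fails its digest."""
    bad = [
        i for i, c in enumerate(chunks)
        if c is None or hashlib.sha256(c).hexdigest() != digests[i]
    ]
    if len(bad) > 1:
        raise ValueError("single parity can only rebuild one chunk per group")
    if bad:
        rebuilt = parity
        for i, c in enumerate(chunks):
            if i != bad[0]:
                rebuilt = xor_bytes(rebuilt, c)
        chunks[bad[0]] = rebuilt
    return chunks


# Hypothetical group of four chunks, one of which goes missing.
original = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
parity, digests = make_group(original)
damaged = [b"aaaa", None, b"cccc", b"dddd"]
restored = recover(damaged, parity, digests)
print(restored == original)
```

    The digests are what make the failure graceful: you know exactly which chunks are intact, so even when a group has too many bad chunks to rebuild, every other group (and the small files inside it) is still recoverable.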

  72. DEATH IS NORMAL & NATURAL by Anonymous Coward · · Score: 0

    Your parents, your spouse, your children, your pets,
    everyone you know, everyone you've ever heard of,
    the earth, the solar system, the milky-way galaxy,
    you and your pitiful little hard drive will all die in time.
    I, on the other hand, purchased the Extended Warranty Plan.

  73. UDF Maybe by higuita · · Score: 1

    UDF is the RW format for DVD-RW and can be used on HDs in all modern OSes (it requires format version 2.01).

    The format is resilient, since DVD-R(W) discs may get scratched, and it has CRCs in its metadata... sadly it does not have CRCs for data, because the DVD reader/physical format also has some recovery info, so UDF didn't add it directly.

    It is still a good format; being an ISO standard, it should have a long life and be readable for a long time. Of course, for HDs, I would bet that mechanical problems will probably show up sooner.

    Other than UDF, ZFS and BTRFS both have CRCs and should be resilient, and their formats are set and should not change. But there are other formats with CRCs; check Wikipedia for more options.

    Finally, the format in which you store the files is probably also important: a solid RAR or TAR may cause more problems in the future than compressing each file with gzip. Probably the best option is to store the files using par, as it was created to permit access to the files even if several blocks can't be read. Some backup tools support this, either directly, such as DAR or bup, or indirectly, such as backuppc (search ArchivePar) in the archive step.

    Whatever you do, a followup of this in one year (or more) is a good idea, as the theory and real life may be different things :)

    --
    Higuita
    1. Re:UDF Maybe by Miamicanes · · Score: 1

      For anyone who's interested, here's how to forcibly format a drive with UDF under Windows:

      format [driveletter]: /fs:UDF

      It MUST be done from a command prompt. It MIGHT require an admin command prompt (don't remember offhand). You can't do it via Windows Explorer.

  74. Pendaflex Hanging Folders by Cmdln+Daco · · Score: 1

    Those green hanging folders seem to last forever. When was the last time you saw one wear out to the point where it was thrown away? Typically they seem to last a lifetime.

  75. ZFS on Linux has software RAID. by Futurepower(R) · · Score: 4, Informative

    An Introduction to the Z File System (ZFS) for Linux.

    Quote: "ZFS is capable of many different RAID levels, all while delivering performance that's comparable to that of hardware RAID controllers."

    That sounds good to me. I want to avoid hardware RAID because, when hardware RAID controllers fail, they are often difficult to replace.

    1. Re:ZFS on Linux has software RAID. by thegarbz · · Score: 1

      This right here. The only data loss incident I've ever had has been a failed RAID controller.
      Naturally my backups now ensure that dataloss no longer happens and I've been using software RAID ever since.

    2. Re:ZFS on Linux has software RAID. by whitlocktj · · Score: 1

      I'm going to correct that quote. ZFS is a resource hog. Mind you, I use it everywhere, but you need 1GB of RAM for every TB of data. Even more if you're doing deduplication (not that common). That being said, I'd rather buy a couple extra gigs of memory than buying a RAID card and dealing with proprietary crap when it dies.

    3. Re:ZFS on Linux has software RAID. by Gr8Apes · · Score: 1

      Failed RAID controllers, sata controllers, MFM controllers, various disks (clicking to just... not working) You name it, I think I've experienced it including a Seagate disk long long ago that failed a bearing or something and actually came to a literal screeching halt. Stuff happens, disks fail. Bits rot. I rotate in a new set of cheap disks about every 3 years for a new cloned set. Usually I'll double the size of the disks on each purchase, so I get a 2 for 1 reduction in backups. I put the old disks in a closet. I actually still have a few scsi disks from back in 2001 that I recently grabbed some files from, just to be sure I had a full copy, namely because I was getting rid of those. Interesting what you find on disks from early 2000s. I still have a couple of RLL drives that I'm going to wipe and toss, but I'd like to see what's on them first.

      --
      The cesspool just got a check and balance.
    4. Re:ZFS on Linux has software RAID. by Gr8Apes · · Score: 1

      ZFS ... need 1GB of RAM for every TB of data.

      I finally have an excuse to double my RAM.

      --
      The cesspool just got a check and balance.
    5. Re: ZFS on Linux has software RAID. by Anonymous Coward · · Score: 0

      If you use a RAID card in IT mode it still offloads a lot of the functions and vastly increases performance over an on-board SATA controller. The ZFS pool is still doing the RAID, but the controller helps a lot. 256 MB of RAM per TB is our current model for our internal NAS servers, which we built using an LSI card in IT mode. If the controller fails, replace it and import the pool; we actually tested this when we had a controller fail. The best thing about ZFS not mentioned here is snapshotting and mirroring. We have 2 twin NAS boxes that sync to each other at 40gb; the front data lines from the network are at 10GbE. Make sure to stagger the load for a busy network. When we tested, we needed at least 2x the data bandwidth to ensure a proper mirror.

    6. Re:ZFS on Linux has software RAID. by jon3k · · Score: 1

      This is relevant to my interests. I'm considering switching to RAID next time I replace my disks (currently 2x8TB that I rsync occasionally). Is this accurate or an exaggeration? Do you really need 1GB per TB?

  76. Librarians use LOCKSS by Anonymous Coward · · Score: 0

    Lots Of Copies Keeps Stuff Safe (this is what is recommended).

    Google Drive, OneDrive, Dropbox, Box, Carbonite, CrashPlan...sign up for several and spread copies around.

    At least one is likely to survive the fires, floods, tornadoes that will inevitably doom any single approach.

  77. It's fine... by Tough+Love · · Score: 1

    my HDDs are fine, but some files are corrupting

    Your HDDs are not fine.

    --
    When all you have is a hammer, every problem starts to look like a thumb.
  78. Re:Error correction codes. PAR2, btrfs, partitions by Anonymous Coward · · Score: 0

    CAREFUL. If you're using ECC on your file system you also need ECC memory. If not, a bad bit in memory could trigger an ECC validation failure on good data and then the file system's cascading 'data corrections' may wipe out your entire partition!

    Someone mentioned this on Slashdot awhile ago with a far better explanation than I can give.

  79. UDF by Lehk228 · · Score: 1

    UDF on archive quality optical WORM media, such as BDR M-Disc

    --
    Snowden and Manning are heroes.
  80. Be careful of cascading error correction by raymorris · · Score: 1

    More generally, be careful of cascading error correction. Some types by nature will not cascade (these can generally be thought of as 1-dimensional), other types should check for a cascading effect before doing a correction.

    1. Re:Be careful of cascading error correction by Gunstick · · Score: 1

      can you give examples? I have no idea if reed-solomon is cascading error correction.

      --
      Atari rules... ermm... ruled.
  81. Clay table or chipped stone. by Anonymous Coward · · Score: 0

    That's why we know more about ancient cultures of the middle east than we do about the cultures that destroyed them.

  82. Use ZFS with ECC ram.....smile and rest. by brainchill · · Score: 1

    Simple ...... Use ZFS with ECC ram. It checksums all of your data and using ECC ram will prevent corruption during off disk filesystem operations

  83. Re:Error correction codes. PAR2, btrfs, partitions by Anonymous Coward · · Score: 0

    >"The magic phrase to Google is "error correction codes" (ECC)."

    Ok, maybe o/t, but: I always knew ECC as "Error Checking and Correcting"
    Has the meaning changed over time, or is this a case of a collision in TLA-space?

    Yeah, yeah, I'm too lazy to google/wikipedia it.

  84. Re:Error correction codes. PAR2, btrfs, partitions by Anonymous Coward · · Score: 0

    Do NOT use btrfs. I have real world experience with it and it is horribly unstable, but everybody claims it is stable. It's the edge cases that cause the entire filesystem to become totally unusable, not to mention the features that were never really tested properly and people were using thinking they were stable for years (RAID5).

    It's really alarming that it's a default option on synology disk stations!

  85. Re:Error correction codes. PAR2, btrfs, partitions by Anonymous Coward · · Score: 0

    Or, you know, you can just use ZFS and have all this and more.

    Besides, BTRFS is buggy, beta, and dead.

  86. simple solution by ooloorie · · Score: 1

    If you have multiple backups, you can fill in "bad blocks" in one backup with another backup; that's probably the simplest and most easy-to-use solution. You can calculate the probability of an unrecoverable error easily.
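
    As a rough sketch of that calculation: assume (this is not stated in the comment) that every block in every backup goes bad independently with the same probability. A block is then unrecoverable only if it is bad in all copies, and the archive is damaged if that happens to at least one of its blocks. The function name and the example numbers are hypothetical:

```python
import math


def p_any_block_lost(p_bad: float, copies: int, n_blocks: int) -> float:
    """Probability that at least one of n_blocks is bad in every one of the copies.

    Assumes independent failures with the same per-block probability p_bad.
    """
    p_block_lost = p_bad ** copies  # the block must be bad in all copies at once
    if p_block_lost >= 1.0:
        return 1.0
    # 1 - (1 - q)**n, written with log1p/expm1 so tiny q values keep their precision
    return -math.expm1(n_blocks * math.log1p(-p_block_lost))


# Example with made-up numbers: one-in-a-million bad blocks, a million blocks.
for k in (1, 2, 3):
    print(k, "copies:", p_any_block_lost(1e-6, k, 10**6))
```

    Real failures are correlated (same drive model, same shelf, same fire), so treat the result as an optimistic bound and not as a prediction.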

    If you want something more efficient, you can use various forward error correction tools or file systems. Tahoe-LAFS is one such system, though perhaps more complex than you might want.

  87. Slash rot by Excelcia · · Score: 5, Insightful

    Concur. File corruption due to "age" will not occur without hard read errors. Also, "ill-coloured photos" likely would not be ill-coloured in the case of actual data corruption, but would have whole blocks of hash in them. The user claims to have multiple terabyte sized hard drives - hard drives in this size category used for archival storage are simply not old enough to be suffering data corruption due to age. The only hard drives suffering so are MFM hard drives that likely the poster wouldn't have a clue how to even interface into a current computer. Hard drives used for archival data storage will likely not age degrade before the interface standard they are based on becomes obsolete. Thus, a perfectly reasonable archival data storage strategy is to simply copy data from one hard drive to a newer (likely much larger and faster) drive when the next generation interface becomes standard, and before the previous generation is totally obsolete. For example, one can still get PATA + SATA USB adapters, SATA + M.2 adapters, etc.

    If the user who submitted this question is actually experiencing a problem at all, suggest that PEBCAK. Better explanation is the poster is not actually experiencing current problems at all, but is simply trying to sound important with inflated claims of reams of data and that Slashdot has been had.

    Further, no person with Slashdot posting authority should have been ignorant of any of the issues in this question that make its legitimacy questionable at best, and certainly not Slashdot worthy in any circumstance.

    1. Re:Slash rot by Anonymous Coward · · Score: 0

      Lol. Pretty damn full of yourself aren't you :) Hope you grow up and get over your issues.

    2. Re:Slash rot by cvdwl · · Score: 4, Insightful

      Ahh, there's the Slashdot of old that I miss so much.

      --
      ... grumble, grumble, grumble, mutter, mutter, Millenium... Hand... Shrimp, I tol' 'em, I tol' 'em.
    3. Re: Slash rot by Anonymous Coward · · Score: 0

      Your newsletter sir - I wish to subscribe to it.

    4. Re:Slash rot by MooseMiester · · Score: 1

      Kudos to you Sir and Thank you for pointing this out.

      Two whole days of Slashdot WITHOUT the left wing Trump hating clickbait and I am SO HAPPY to see actual technical topics being discussed. If msmash is on vacation I hope he stays there permanently...

      --
      Murphy was an optimist
    5. Re:Slash rot by lordmage · · Score: 1

      This sounds more like a Copying issue. File corruption gets involved during Raid/Copying from older disks to newer ones and so data gets corrupted.

      Also, using large USB archive drives can corrupt the drive due to bad USB implementation, Drivers, WORK blocks, etc. I have one drive that works everywhere but put it in this one Laptop and half of it works.. half does not and writing at that stage would corrupt the drive.

      --
      I can program myself out of a Hello World Contest!!
  88. Your files are not corrupting. by Anonymous Coward · · Score: 0

    Digital photos do not become "ill colored." The degradation would have to affect the entire photo in a consistent fashion and, depending on compression used, the degradation would need to first decode, change, and then re-encode the image.

    Bit rot does not cause this.

    Depending on the compression used, with real bit rot, you will get either an unusable file, a usable file with a few corrupt blocks, or a regular file with a few pixels messed up here or there. You will not see whole-picture color alterations. These are the only failure modes that bit rot creates in a digital image.

    It is more likely that the lighting in your room, or your monitor, or the color balance settings in the OS, or even the memory in your head has changed, than that bit rot or an unreliable file system will have created changes like you describe, because it's physically impossible unless you have a virus actively altering them.

    1. Re:Your files are not corrupting. by brantondaveperson · · Score: 1

      I think it's fairly obvious that the guy is a bit mad, since he's intent on storing his random collection of software, ebooks, PDFs, images and videos, forever.

  89. DragonFly BSD Hammer file system. by Anonymous Coward · · Score: 0

    Hammer has significant checksums that assure the health of the data. However, nothing can prevent actual bit rot. In addition to checksums to verify data, you need some sort of extended error correction data stored with the file. There are user space applications that can apply this block by block. One example might be the dvdisaster utility available in Debian, which computes ECC blocks to help preserve data on DVD.

  90. PAR2 by cerberusss · · Score: 1

    I've created PAR2 files for all my photos. I've got a kid and although I make multiple backups, I neither trust the filesystem (HFS+) nor the backup (Time Machine and CrashPlan). Especially with photos, it's really easy because it makes sense to put them in directories per time period (for example every quarter or month), for instance, "/Pictures/2017 Q1". When you create a new folder, just create par2 files in the old folder, like so:

    $ par2create par2file *

    To verify them:

    $ par2verify par2file.par2

    Big advantages of par2 versus other methods:
    - It works independent of file system
    - It can not only verify but also repair

    --
    8 of 13 people found this answer helpful. Did you?
  91. md5deep by Anonymous Coward · · Score: 0

    If you don't want to go with the great integrated solutions above, such as ZFS or PAR, you can use a hash program like md5sum. Its multithreaded equivalent md5deep also has a nice "recursive" option. Create a text file with the md5sums of all the files. Then repeat a year later, and compare (via sort, diff). The advantages of an explicit checksum are that it's more compatible, you only ever need one drive connected to read / write data, and the data are available instantly. Disadvantages: more work, and you need twice the storage (where ZFS and PAR can use more efficient encoding).

    I will also note that I have never had bit rot attributable to hard drives on my ZFS pool, over about 5 years. This is using "green" 2TB drives. Check your hardware, especially RAM (credit to GuB-42 who mentioned this above), cables and power supply.
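
    The hash-now, compare-later routine could look roughly like this in Python. This is a sketch standing in for md5deep plus sort and diff, not a reimplementation of either; build_manifest and compare are hypothetical names, and SHA-256 is used as the default where the comment uses MD5 (pass algo="md5" to match it):

```python
import hashlib
import os


def hash_file(path, algo="sha256", bufsize=1 << 20):
    """Hash one file in 1 MiB pieces so large videos never sit in RAM whole."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(bufsize), b""):
            h.update(block)
    return h.hexdigest()


def build_manifest(root, algo="sha256"):
    """Map each file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = hash_file(full, algo)
    return manifest


def compare(old, new):
    """Return (changed, missing, added) relative paths, each sorted."""
    changed = sorted(p for p in old if p in new and old[p] != new[p])
    missing = sorted(p for p in old if p not in new)
    added = sorted(p for p in new if p not in old)
    return changed, missing, added
```

    Save the dictionary from build_manifest somewhere safe (for example as JSON next to the archive), rebuild it a year later, and feed both to compare. A file listed as changed that nobody edited is the bit rot candidate.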

  92. Disks are safe, it's the ECC-less copy to new disk by ext42fs · · Score: 1

    Disks are pretty much safe; it's the ECC-less transfer to new disks which is risky. It is very, very unlikely for bitrot on disk to go unnoticed, thanks to ECC, but when you migrate your data to bigger disks or new filesystems, the chance of an occasional bitflip due to critical timings (SDRAM, buses, chips, clocks), old PATA cables (no data checksum), or EMI, radiation, etc. increases with the size of the dataset. The solution is checksumming before and after the copy. Even deprecated algorithms (MD5, SHA1) will do.
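
    A small sketch of checksumming before and after a copy, using only the Python standard library. The helper names file_digest and copy_verified are made up for illustration, and SHA-1 is chosen only because the comment says even deprecated algorithms are enough for this purpose:

```python
import hashlib
import shutil


def file_digest(path, algo="sha1", bufsize=1 << 20):
    """Digest of a file, read in 1 MiB pieces."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(bufsize), b""):
            h.update(block)
    return h.hexdigest()


def copy_verified(src, dst, algo="sha1"):
    """Copy src to dst, then re-read dst and compare digests.

    Raises OSError if the copy does not hash to the same value as the source.
    """
    before = file_digest(src, algo)
    shutil.copyfile(src, dst)
    after = file_digest(dst, algo)
    if before != after:
        raise OSError(f"checksum mismatch copying {src} -> {dst}")
    return after
```

    One caveat: reading the destination right after writing it may be served from the OS page cache and not from the new disk, so for a real migration it is safer to re-verify after unmounting and remounting the target (or after a reboot).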

  93. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  94. For *OFFLINE* storage... by Anonymous Coward · · Score: 1

    ...I'd recommend RAR with as big a recovery record as you can get away with, or PAR - These both provide really good error correction for small bits (haha) of corruption - and then store the lot on whatever you have to hand (magnetic tape, magnetic disk, optical disk, 3d printed stone tablet etc.) and transfer to newer media every now and then to avoid the data being stuck on obsolete tech. (Like the data I have trapped on my EZ135 cartridges ;_;)

    Ignore the idiots suggesting ZFS and BTRFS - They only do error checking, not correction, so they will only tell you the data is corrupt, but not repair it.
    Actually that's not 100% true - They can repair if they are in a RAID configuration (i.e. 1, 5 or 6) but then you'd have to keep those disks together and, esp. with 5 and 6, the potential to lose all the data on the disks is higher e.g. if some of the disks die (Then you can't even use a data recovery company to get stuff back!)
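
    To illustrate why a parity configuration can repair while a bare checksum can only detect, here is a toy single-parity (RAID 5-style) example. It is a simplification under stated assumptions: real ZFS RAID-Z and btrfs layouts are more involved, and xor_blocks is a hypothetical helper:

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, as if striped across three disks, plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The disk holding block 1 dies; the survivors plus parity rebuild it.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

    The checksum tells the filesystem which block is wrong; the parity supplies the bytes to put back. With one parity block, losing two of the four blocks is unrecoverable, which is the "keep those disks together" problem described above.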

  95. Why so much love for ZFS, none for BTRFS? by leptechie · · Score: 2

    I'm seeing plenty of enthusiasm for a filesystem that has inline checksums that are verified on each file access, particularly ZFS. This doesn't quite address the OP's point: a filesystem for the ages. Given ZFS' license, it is not only outside the mainline kernel now, it likely will be forever.

    btrfs is in mainline now, and has a number of years to have settled down. Even if you don't like the more advanced features, it has some that tick all the boxes:
    - good on-disk checksums to detect errors (incl bit rot) for metadata and data
    - RAID modes to protect from whole disk failures
    - realtime and online scrubbing to detect and recover from checksum failures from another copy (RAID1) or rebuilding from parity (RAID5/6) - no action required from user (contrast with PAR solutions proposed)
    - subvolumes for segregation of data if needed, especially if there is a desire to consolidate multiple older drives and especially useful to pool capacity from these disparate sources to implement RAID modes.
    - online reshaping for the above

    So, even if you're not accessing the data frequently, if OP cares about the data I'm sure it's no hassle to plug the drives in once a year and let a scrub run (for a couple of days if needed, I know this part of the code is still terribly slow). Even if btrfs is deprecated today, it will be a long time before support is removed from the kernel, and even longer before the last distro stops supporting it, and longer yet before that last distro release refuses to boot on whatever incarnation of hardware is available to plug the drives into. All the while the data is free to be migrated onto new spare/surplus drives and a new filesystem if needed.

    1. Re:Why so much love for ZFS, none for BTRFS? by rl117 · · Score: 1

      When it comes to archival, ZFS is a production quality filesystem and volume manager intended for serious use. Btrfs is perpetually pre-alpha. Using it for archival would be foolish. It's also tied to a single implementation on a single operating system. I can (and have) run "zpool export" on a Linux server, removed the disks and slotted them into a FreeBSD server, then run "zpool import": data immediately on-line and mounted. It would also have worked for any other OS implementing ZFS; for data transportability it's the most featureful cross-platform filesystem right now, given that the alternatives are crude filesystems like FAT. Archival implies the ability to read the data in a few decades, and I would bet that ZFS outlasts Btrfs by a significant margin. The single implementation of Btrfs might have been removed or changed incompatibly before you need to reread your data, and that's presupposing that Btrfs wouldn't trash your data unrecoverably in the interim; after several total dataloss incidents with Btrfs due to implementation bugs in Btrfs, let's just say I'm a bit more grounded and objective as to its true merits.

  96. add error correcting files for redundancy by Gunstick · · Score: 1

    There are several projects/tools out there.
    Search for reed-solomon

    https://www.thanassis.space/rs...
    http://unix.stackexchange.com/...

    I used par2 to put my videos on CD-R, but those are now 10 years old and I did not check if it's still readable :-)

    --
    Atari rules... ermm... ruled.
  97. Parity by Anonymous Coward · · Score: 0

    "for the ages" means a tradeoff against bitrot resistance and readability. esoteric fses are unlikely to be easily readable in a few decades.

    I've settled on NTFS because of its support in $any_os and likely support in the decades to come. I supplement it with parity files to not only be aware of bitrot but also to have some limited resistance to it. I wrote a tool that allows you to easily analyse file hierarchies and check, make or monitor their par2 status. https://github.com/brenthuisman/par2deep/

  98. par2 by Anonymous Coward · · Score: 0

    "par2 c -R -r15 your_desired_par2_filename_here ./*" without the quotes

    this is fine with ext4 or ntfs or hfs for that matter

    15% should be ok but you can -r20 or -r25 or whatever your like

    zfs with copies=2 on a single disk or a two-disk mirror or better is probably OK without the par2

  99. M-discs? by Anonymous Coward · · Score: 0

    Archive to M-discs. They last for 1000 years, which should be long enough for most people!

  100. Bullshit by allo · · Score: 1

    That's not how "bit rot" works and your file system doesn't have anything to do with changing files either. If your FS is the problem, you won't be able to access the file. If bitflips were the problem, your file would be corrupt (which doesn't shift your colors, but causes errors in the image; single bitflips probably won't even be visible at all).
    Then, "bit rot" isn't a problem on hard drives manufactured in this millennium. Either your drive fails or has bad sectors, or your files are probably okay. The probability of a single bitflip is very, very low; the bit rot you think you're observing is obviously something else. I guess you got a new monitor and a less tolerant video player or something similar.

  101. Reed-Solomon typically 512-byte blocks by raymorris · · Score: 1

    Reed-Solomon is a block-based ECC rather than a stream-based (convolutional) one, so a memory error would affect only that one block.

    Probably the most convenient way to use Reed-Solomon, where the math works out nicely, is to apply it to 512-byte blocks. That also happens to be the native size of hard drive sectors, so that's how it's most often used. Each sector has its own ECC. The ECC of one block doesn't affect any other block. There are several decoding algorithms for Reed-Solomon which may have different characteristics as far as how many bits in that block might be affected by a memory error of one bit.

    The extended binary Golay code represents 12 bits of data with 24 encoded bits and corrects up to three errors in those 24 bits. It can detect up to seven errors. A memory bit-flip wouldn't be a problem, but eight flipped bits could result in all 12 bits being read incorrectly.

    The other class is convolutional (stream-based) codes. As a class, convolutional codes aren't limited to a fixed-size block, so some set of errors could propagate. Of course smart people design these codes, so I can't think of any off hand that are designed such that they actually propagate errors in an unbounded way. The general type would most likely be one that looks a bit like a cross word puzzle of many dimensions - a single bit gives information about many otherwise unrelated bits.

    Convolutional codes are typically used "close to the metal", with analog values rather than digital.
    Consider that you're applying ECC to the electrical signal in a cable, or a wireless signal. The protocol may specify that 1 volt positive (or higher) is logical true, 1 volt negative (or lower) is logical false. Suppose you're using triple modular redundancy, which simply means you send each signal three times. You might read the following values:
    +1.8
    -0.6
    +0.3

    Even though two of the three values are invalid, we can see that they are clearly biased toward the positive and therefore treat it as logical true. Space probes sending pictures from millions of miles away require convolutional coding with high redundancy.
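
    The three readings above can be run through a tiny sketch that contrasts a hard decision (classify each reading against the 1 volt thresholds, then vote) with a soft decision (add up the raw voltages and look at the sign). Both function names are hypothetical, and summing is just one simple way to use the analog values:

```python
def hard_decision(samples, threshold=1.0):
    """Classify each reading first, then take a strict majority vote.

    Readings between -threshold and +threshold are invalid and cast no vote.
    Returns True, False, or None when neither value has a majority.
    """
    trues = sum(1 for v in samples if v >= threshold)
    falses = sum(1 for v in samples if v <= -threshold)
    if trues > len(samples) / 2:
        return True
    if falses > len(samples) / 2:
        return False
    return None


def soft_decision(samples):
    """Use the analog values directly: the sign of their sum decides."""
    return sum(samples) > 0


readings = [+1.8, -0.6, +0.3]
print("hard:", hard_decision(readings))
print("soft:", soft_decision(readings))
```

    With only one reading past the 1 volt mark, the hard vote has no majority, while the soft decision still sees the overall positive bias and settles on logical true.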

  102. NTFS by RecycledElectrons · · Score: 1

    I've been archiving files since the 1980s, and have a ~20TB collection at the moment.

    Your potential sources of data loss and solutions are:

    1. Problem: Not being able to find the file in all the disks you have.
    Solution: Organize stuff, even if that means using twice the number of disks to leave room to add future files in the right place. Use the Library of Congress Cataloging system when possible; They really did think of everything.

    2. Problem: Not being able to find a reader for the file. (Thus mis-colored images as the image formats evolve.)
    Solution: Use only a few common formats: JPEG, PDF, DOCX, TXT, GIF. Seriously, even avoid TIFF files.

    3. Problem: Disk failure.
    Solution: Keep multiple copies of everything in different physical locations.

    4. Problem: Not being able to read the obsolete disk format (e.g., MFM, RLL, proprietary 1.76MB formatted 3.5" floppies.)
    This is what you asked about.
    Solution: Keep everything in the MOST COMMON formats available. ZFS is evolving so quickly that you will not be able to find a reader for today's version in 5 years. Stick to NTFS only because FAT32 refuses to deal with large files. Stick to USB enclosures, and make sure you can easily remove the SATA drives to access them directly if necessary.

    1. Re:NTFS by RecycledElectrons · · Score: 1

      One more thing, DO NOT USE TAR, ZIP, OR PAR files! They add a point of failure. You will not find a un-rar program in 20 years, because you used the coolest, best thing out right now that turns out not to be supported past the 2029 update.

    2. Re:NTFS by duke_cheetah2003 · · Score: 1

      One more thing, DO NOT USE TAR, ZIP, OR PAR files! They add a point of failure. You will not find a un-rar program in 20 years, because you used the coolest, best thing out right now that turns out not to be supported past the 2029 update.

      Can't agree with this. tar hasn't been changed in what, 30 years? 40 years? Zip format also hasn't seen any significant changes in a few decades either. These formats are probably pretty safe, especially tar. Just be careful of which compression algorithms you use; use old common ones if you're worried about future accessibility.

      An additional thing I do with large repositories of data is make sure to put any relevant 'readers' on the drive with all the data: a copy of a popular archiver, source code to tar, etc.

      I use TrueCrypt too, and to ensure I never am without a copy, I am sure to copy its installers and source code to everything. Hell, even my phone and dashcam SD cards have a copy of TrueCrypt, just in case.

  103. M-DISC by Anonymous Coward · · Score: 1

    It's tough to beat M-DISC's purported shelf-life of over 1,000 years. The discs are a couple dollars apiece and the drives are around $50. If you're concerned about not having the hardware required to access the filesystem on the disc in the future, simply migrate your files from one M-DISC (or equivalent) to another every so often to mitigate this risk. Each time you burn the files, perform a bit-by-bit analysis to ensure that all the hashes line-up, and you should be in good shape.

  104. Data corruption is inevitable. by Anonymous Coward · · Score: 0

    From our own bodies we can learn this (tumours).
    On a long enough timeline even our own DNA becomes corrupt through replication errors and within a relatively small amount of generations little to none of it remains as to be recognisable.
    One cannot reasonably expect to retain data indefinitely- whether one looks at it from a hardware or software perspective.
    This needs to be put in perspective when discussing matters such as AI - eventually it shall almost certainly suffer a computational amnesia, dementia or insanity as a result of the above.

    But for the purpose you seem to have asked, a combination of well chosen hardware and software solutions should give you some good mileage. The rest is really semantics. You can't buy immortality off the shelf.

  105. Letting go is the answer by 14erCleaner · · Score: 2

    Make copies of things you care about occasionally on new media. If you don't care about something, let it rot. It's very liberating, kind of like burning your desk.

    --
    Have you read my blog lately?
  106. Extended Ask Slashdot: Best portable FS? by duke_cheetah2003 · · Score: 1

    I like this slashdot question. But I'd like to expand it. Because like the OP, I have a couple terabytes of crap I'd prefer not to lose. I currently employ a manual mirroring of the drive to an offline drive of equal size and store that drive away from my computers.

    My problem is portability though. Currently, both my drives (the online and offline copy) use NTFS for the filesystem. I chose NTFS because I want the drive to be accessible from both Linux and Windows machines. For example, I like to take the drive with me when I travel so I can watch my videos wherever I go should I get bored.

    So my expanded question is: Which filesystem is the best for data retention and portability across Windows and Linux?

    As an additional, if anyone wants to bite, how come there are no decent third party file system drivers for Windows? It seems like long past due for some good third-party filesystem drivers to be out there and usable.

    ps. Never experienced any form of bit rot on standard spinner HDDs. Only time I've ever had issues with data loss on media is with recordable CDs and DVDs which I've long since stopped using for any purpose due to their proven unreliability. USB flash drives are also similarly unreliable as long term storage. I've multiple times gone to use a USB flash drive and discovered it's blank or scrambled and unusable.

  107. Re:Error correction codes. PAR2, btrfs, partitions by hawk · · Score: 1

    >PAR2 uses Reed-Solomon error correction.

    I'm no expert, but it seems to me that when the correct value of the data is disputed, cutting the data in half as a solution is a Bad Idea(TM) . . .

    hawk

  108. Just make sure you only use ... by Anonymous Coward · · Score: 0

    ... pure, virgin 0's and 1's.

  109. Re:Error correction codes. PAR2, btrfs, partitions by Wolfrider · · Score: 1

    --I wonder if anyone has ever thought of porting par2 to a FUSE filesystem? ;-)

    --Here's the closest thing I found with a quick search:

    http://askubuntu.com/questions...

    --
    .
    == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
  110. I think this as well.. but then, I think again... by gosand · · Score: 2

    First off, people in this thread seem to think that all the information people are saving is about them.
    I have lots of data as well, but most of it is about my kids and family. My kids are starting to hit their teens, and we still go back and watch old family videos. Most are short, under 2 min. They capture points of time in our lives that we won't get back. Our old house, in a different state... friends we had then, neighbors. Not like full documentaries on them, but fading memories. It's not important to anyone else. And if I leave that to them, they may keep it or destroy it. But that is their choice to make. With the ability to capture all of this with much more ease than the previous generation, why shouldn't we? I only have a few photos of my grandparents, and I would like to have more. But they don't exist. I don't have any videos of them at all.

    It's not about being famous, or because you are important in the world's eye. There are entire professions dedicated to preserving that. It's about whatever YOU want to preserve. I once found a website that had information about my family name, I had never seen it anywhere before. Pictures and scans of documents, etc. I saved that information off, and about a year after that, the site went away. Since then, ancestry.com came to be, and I was able to use that information I saved to help stitch together our family tree.

    We are in the information age, I don't understand why someone would be so opposed to preserving that.
    You sound very angry about something, you should probably figure out what that is before it's too late.

    --

    My beliefs do not require that you agree with them.

  111. Re:You have a hardware problem. FS choice won't he by rl117 · · Score: 1

    The device error correction is probabilistic. It won't necessarily know the data is "bad". And there can be firmware bugs which make it return or store bad data. What about phantom reads and writes? https://www.youtube.com/watch?... is a very interesting presentation from Bryan Cantrill about all sorts of bitrot and storage stuff.

  112. ZFS forever (and a day) by epine · · Score: 1

    You absolutely should never, ever, use it for anything you plan to read a long time from now. You'd be better off 'tar'ing files directly to /dev/sda.

    The ZFS on-disk format is extremely well documented and not that terribly hard to understand.

    ZFS On-Disk Specification

    It's conceptually well thought out and doesn't require a lot of corner cases. It's stable and common to every current ZFS implementation. I haven't looked at all the feature bits subsequently added, but I don't think many of them complicate recovery of basic files whatsoever.

    Examining ZFS On-Disk Format Using mdb and zdb: Max Bruning

    This is not ZFS Internals for Dummies, but do note how the available tools are first rate. Plus, the on-disk structure is integrity checked all the way down. Most likely, any misconception about bit-patterns will be brutally put to rest by the next disk block you fetch.
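
    To make "integrity checked all the way down" concrete, here is a minimal sketch of a hash tree, where each level stores digests of the level below it. This is not the ZFS on-disk layout: the function names and the use of SHA-256 are assumptions chosen for illustration, and it assumes the block count has not changed since the trusted tree was built.

```python
import hashlib


def digest(data: bytes) -> bytes:
    """SHA-256 digest of a block (hash choice is an assumption for this sketch)."""
    return hashlib.sha256(data).digest()


def build_levels(blocks: list[bytes]) -> list[list[bytes]]:
    """Hash each block, then hash pairs of digests upward until one root remains."""
    if not blocks:
        raise ValueError("need at least one block")
    level = [digest(block) for block in blocks]
    levels = [level]
    while len(level) > 1:
        level = [digest(b"".join(level[i:i + 2])) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def damaged_blocks(blocks: list[bytes], trusted: list[list[bytes]]) -> list[int]:
    """Return indices of blocks whose digest no longer matches the trusted tree."""
    if build_levels(blocks)[-1] == trusted[-1]:
        return []  # root matches, so every block below it is intact
    return [i for i, block in enumerate(blocks) if digest(block) != trusted[0][i]]
```

    A single flipped bit in any block changes the root digest, so a reader that trusts only the root can still tell which block went bad.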

    Finally, there's very little fundamental churn here, because ZFS was designed for the future on day one.

    ZFS has one Achilles heel: the absence of block-pointer rewrite. Basically, the integrity layer is overly rigid about block placement, and thus certain kinds of desired flexibility are off the table, now and forever, until ZFS 2.0 comes with a different on-disk placement record (which might never happen, as the principals all refer to BPR in hushed voices as some large, daunting project—probably because maintaining the historical testing standard requires industrial-strength support).

    Contrary to your ludicrously uninformed tar-pit scenario, ZFS is a paragon of long-term, binary-format stability.

    You must somehow think the future can barely start a fire by rubbing two sticks together.

    Have you ever turned the pages of Office "Open" XML?

    This is potentially a hundred times more baroque, Byzantine, and baffling to some rub-stick deprived future generation, that wakes up severely hung-over and back-to-the-buggies as the dust settles on the AI apocalypse.

    (How do we finally beat the AI uprising? Probably through a nefarious virus bearing an OOXML payload, which even the ascendant AI-powered globally-distributed firewall fatally misclassifies.)

    For true geeks, ZFS origin story from the horse's mouth.

    The Birth of ZFS by Jeff Bonwick

    BP-rewrite mentioned at 18:00. Something technical about DVA (data virtual address). Then a horrible "bolt on" is mentioned. Then a mike drop.

    Circa 7m: it's not going to be a team of 80 people, it's just going to be me and Matt at the white board all day, every day. (And y'all knows how that goes. Turns out "tank" is a character from the Matrix.)

    Circa 11m20: ztest and zloop

    So we realized very quickly we could just create files in /tmp, pretend that they're disk drives, and then issue all the read and write commands and transactions from our test program and exercise all the code from userland.

    The really big advantage of that was that it gave us the ability not just to test the datapath, which is pretty easy to do with normal stress tests, but also to test the administrative path.

    So you could have things like: What happens if one thread is trying to attach a mirror, while another thread is trying to detach it, while another is trying to write, while somebody else is trying to resilver?

    In practice, those things are very hard to test with actual hardware, because it just takes a long time to do those things, but because we had very small devices that were really just /tmp files which were super fast, and because the cost of killing ztest and restarting it was measured in milliseconds, it meant we could crank through the

  113. Re:Error correction codes. PAR2, btrfs, partitions by timq · · Score: 1

    QuickPar on Windows is long-obsolete. MultiPar is the more modern variant.

    Filesystem for the ages, eh?

  114. Archiving solution by Anonymous Coward · · Score: 0

    For this purpose I would deploy a private bittorrent tracker.
    You'd have to create .torrent files, add them in your torrent client, and run a data integrity check every once in a while. Have at least two of these client nodes and you should be good to go.
    Ensures data recovery in case of corruption (BitTorrent will download just the corrupted blocks from the other peer).
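
    The per-piece idea can be sketched without a tracker or client at all. The code below is only an illustration of hash-verified repair from a second copy, not BitTorrent's wire protocol: the function names and the tiny piece size are assumptions, and both copies are assumed to have the original length. SHA-1 per piece matches what classic .torrent files store.

```python
import hashlib

PIECE_SIZE = 4  # tiny for demonstration; real torrents use much larger pieces


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Return the SHA-1 digest of each fixed-size piece of data."""
    return [
        hashlib.sha1(data[i:i + piece_size]).digest()
        for i in range(0, len(data), piece_size)
    ]


def repair(local: bytes, peer: bytes, expected: list[bytes],
           piece_size: int = PIECE_SIZE) -> tuple[bytes, list[int]]:
    """Replace pieces of `local` that fail verification with the peer's copy.

    Returns the repaired data and the indices of the replaced pieces.
    A peer piece is accepted only if it matches the expected hash itself.
    """
    out = bytearray(local)
    replaced = []
    for index, wanted in enumerate(expected):
        start, end = index * piece_size, (index + 1) * piece_size
        if hashlib.sha1(local[start:end]).digest() == wanted:
            continue  # local piece is intact
        candidate = peer[start:end]
        if hashlib.sha1(candidate).digest() != wanted:
            raise ValueError(f"piece {index} is damaged in both copies")
        out[start:end] = candidate
        replaced.append(index)
    return bytes(out), replaced
```

    Note the failure case: if the same piece is damaged on every node, nothing can rebuild it, so the copies need to live on independent media.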

  115. Only 1 solution: black holes by martinfb · · Score: 1

    No info is ever lost there. It lasts (almost) forever right at the event horizon.

    Retrieving it is a bit challenging, though.

    --


    Self-importance and self-indulgence is the root of ALL evil.
  116. Re:Error correction codes. PAR2, btrfs, partitions by heypete · · Score: 1

    PAR2 is not a filesystem, but rather a means of generating error-correction codes to detect and repair damage to files.

    The actual PAR2 algorithm hasn't changed, though development for PAR3 is ongoing. It's simply that one particular implementation, QuickPar, is obsolete, while MultiPar, a similar program that is completely compatible, is more modern.
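
    For anyone wondering what "error-correction codes" buys you, here is a deliberately simplified sketch. It uses a single XOR parity block, which is not what PAR2 does (PAR2 uses Reed-Solomon, which can rebuild several blocks), and it assumes you already know which block is missing and that all blocks have equal length. The function names are made up for illustration.

```python
from functools import reduce
from typing import Optional


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks: list[bytes]) -> bytes:
    """Compute one recovery block covering all data blocks."""
    return xor_blocks(data_blocks)


def recover(blocks: list[Optional[bytes]], parity: bytes) -> list[bytes]:
    """Rebuild at most one missing block (marked None) from the parity block."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild only one block")
    rebuilt = list(blocks)
    if missing:
        survivors = [block for block in blocks if block is not None]
        rebuilt[missing[0]] = xor_blocks(survivors + [parity])
    return rebuilt
```

    The recovery data costs one extra block here; Reed-Solomon generalizes the trade so that N recovery blocks can repair any N damaged ones.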

  117. SeqBox by Anonymous Coward · · Score: 0

    I know I'm terribly late, but since I read this Ask Slashdot while I was finishing up / testing the tool in subject, I can't avoid a shameless plug. So here it is.

    https://github.com/MarcoPon/SeqBox - Sequenced Box container - A single file container/archive that can be reconstructed even after total loss of file system structures

    You can encode a file in an SBX container, and gain better recoverability if disaster happens, and also bit-rot detection, since each block is CRC tagged.
    In addition, if you make multiple copies of the SBX file (on the same volume, in different media, whatever), and every one of them is damaged in some way, the SeqBox recovery tools can scan for valid SBX blocks on multiple files/block devices, collect all the good parts from all available sources, and (hopefully) gather enough data to reassemble the original container.
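
    The general idea of CRC-tagged, self-identifying blocks can be sketched in a few lines. This is not the real SBX format: the block layout below (sequence number, CRC-32, 8-byte payload) is a hypothetical one invented for illustration, blocks are assumed to stay aligned within each copy, and the original length is passed in separately.

```python
import struct
import zlib

PAYLOAD = 8                    # payload bytes per block (tiny, for demonstration)
HEADER = struct.Struct(">II")  # sequence number, CRC-32 over sequence + payload
BLOCK = HEADER.size + PAYLOAD


def encode(data: bytes) -> bytes:
    """Split data into fixed-size blocks, each tagged with a sequence number and CRC."""
    out = bytearray()
    for seq, start in enumerate(range(0, len(data), PAYLOAD)):
        payload = data[start:start + PAYLOAD].ljust(PAYLOAD, b"\0")
        crc = zlib.crc32(struct.pack(">I", seq) + payload)
        out += HEADER.pack(seq, crc) + payload
    return bytes(out)


def collect(copies: list[bytes]) -> dict[int, bytes]:
    """Gather every block whose CRC verifies, from any of the copies."""
    good: dict[int, bytes] = {}
    for copy in copies:
        for start in range(0, len(copy) - BLOCK + 1, BLOCK):
            seq, crc = HEADER.unpack_from(copy, start)
            payload = copy[start + HEADER.size:start + BLOCK]
            if zlib.crc32(struct.pack(">I", seq) + payload) == crc:
                good.setdefault(seq, payload)
    return good


def reassemble(copies: list[bytes], length: int) -> bytes:
    """Rebuild the original data from the valid blocks found across all copies."""
    good = collect(copies)
    count = -(-length // PAYLOAD)  # ceiling division
    missing = [seq for seq in range(count) if seq not in good]
    if missing:
        raise ValueError(f"no valid copy of blocks {missing}")
    return b"".join(good[seq] for seq in range(count))[:length]
```

    Because every block carries its own sequence number and checksum, two copies damaged in different places can still add up to one complete file.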