Slashdot Mirror


Ask Slashdot: Practical Bitrot Detection For Backups?

An anonymous reader writes "There is a lot of advice about backing up data, but it seems to boil down to distributing it to several places (other local or network drives, off-site drives, in the cloud, etc.). We have hundreds of thousands of family pictures and videos we're trying to save using this advice. But in some sparse searching of our archives, we're seeing bitrot destroying our memories. With the quantity of data (~2 TB at present), it's not really practical for us to examine every one of these periodically so we can manually restore them from a different copy. We'd love it if the filesystem could detect this and try correcting first, and if it couldn't correct the problem, it could trigger the restoration. But that only seems to be an option for RAID type systems, where the drives are colocated. Is there a combination of tools that can automatically detect these failures and restore the data from other remote copies without us having to manually examine each image/video and restore them by hand? (It might also be reasonable to ask for the ability to detect a backup drive with enough errors that it needs replacing altogether.)"

42 of 321 comments

  1. PAR2 by Anonymous Coward · · Score: 5, Informative
    1. Re: PAR2 by Miamicanes · · Score: 4, Informative

      Use non-LTH BD-R media. It's seriously the best media we've ever had for long-term archival storage, hands-down, no contest. Unlike DVD+/-R, it's phase-change magneto-optical WORM... the laser liquefies the plastic, the magnet orients little shiny planar mirrors, the plastic solidifies, and the bits are about as close to 'carved in stone' as you're likely to ever get. As a technology, it's not cheap... but it definitely minimizes the number of things that can go wrong over a ~25-year timeframe:

      * decouples media from its player... the achilles heel of hard drive-based backup schemes. A broken hard drive means a spectacularly expensive data-recovery job. A broken BD drive means buying a new one.

      * phase-change MO media doesn't bleach or darken with age... and if it's going to delaminate or anything (like early optical discs often do), it's overwhelmingly likely to happen sooner rather than later (while you still have the originals available to re-archive if necessary).

      * I think we can safely accept that future evolution to optical discs will remain downwards-compatible with reading older media. Seriously, CDs are THIRTY YEARS OLD, and any Blu-Ray player from China can still play them just fine (plus everything that's ever been commonly burned/stamped into them). A 2037 Apple Eve might have the masses drooling over its legacy-free minimalist purity, but the rest of us will have a 600 petabyte optical drive manufactured by a sweatshop in Uganda or Haiti that can read old BD-R discs just fine (at least, after opening it up and soldering a wire across two pads on the circuit board to make it think it's supposed to be their $6,000 enterprise version instead).

    2. Re: PAR2 by Miamicanes · · Score: 3, Informative

      EEPROM also happens to be the ancestor of SLC flash, not MLC, TLC or worse.

      Flash is like a leaky bucket that starts out full of water, and gets drained to some level when a cell's value is set:

      SLC == "The bucket is either totally empty (0), or has some water in it (1)"

      MLC == "The bucket can be totally empty (00), non-empty to ~33% full (01), ~33%-~66% full (10), or ~66%-100% full (11). After 1/3 of the water leaks out, the cell's value is corrupt."

      TLC == same idea as MLC, but the bucket has EIGHT levels instead of four. Do the math to figure out how much metaphorical water can leak out before the cell's value becomes corrupted.

      BIOS eeproms are also a larger process than high-density flash, so the buckets themselves are larger while the leaks remain relatively constant in size. In other words, you're comparing a metaphorical 55 gallon drum with a slow drip that has to be completely empty to change from 1 to 0 to a thimble with 8 tick marks on the side and a leak of the same size.
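
      To "do the math" from the analogy above, here is a small sketch. It assumes the levels are evenly spaced across the bucket, which is a simplification of the metaphor and not a statement about real flash voltages; leak_margin is a made-up helper name.

```shell
#!/bin/sh
# Illustrative arithmetic only: treats the levels as evenly spaced, as in the
# bucket analogy. leak_margin is a hypothetical helper name.
leak_margin() {
    # $1 = bits stored per cell (1 = SLC, 2 = MLC, 3 = TLC)
    awk -v bits="$1" 'BEGIN { printf "%.1f\n", 100 / (2 ^ bits - 1) }'
}

for bits in 1 2 3; do
    printf '%s bit(s) per cell: about %s%% of a full bucket separates adjacent levels\n' \
        "$bits" "$(leak_margin "$bits")"
done
```

      Under that assumption the gap between adjacent levels shrinks from the whole bucket (SLC) to a third (MLC) to a seventh (TLC).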

  2. ZFS filesystem by Anonymous Coward · · Score: 5, Informative

    One single cmd will do that,

    zpool scrub <poolname>

    1. Re:ZFS filesystem by vecctor · · Score: 5, Informative

      Agreed, ZFS does exactly this, though without the remote file retrieval portion.

      To elaborate:

      http://en.wikipedia.org/wiki/ZFS#ZFS_data_integrity

      End-to-end file system checksumming is built in, but by itself this will only tell you the files are corrupt. To get the automatic correction, you also need to use one of the RAID-Z modes (multiple drives in a software raid). OP said they wanted to avoid that, but for this kind of data I think it should be done. Having both RAID and an offsite copy is the best course.

      You could combine it with some scripts inside a storage appliance (or old PC) using something like Nas4Free (http://www.nas4free.org/), but I'm not sure what it has "out of the box" for doing something like the remote file retrieval. What it would give is the drive health checks that OP was talking about; this can be done with both S.M.A.R.T. info and emailing error reports every time the system does a scrub of the data (which can be scheduled).
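
      A rough sketch of the scheduled scrub plus health report idea, assuming a pool named "tank" and a script path that are both placeholders. It relies on "zpool status -x" printing "all pools are healthy" when nothing is wrong; the ZPOOL variable is only there so the logic can be tried without a real pool.

```shell
#!/bin/sh
# Sketch only. "tank" and the script path below are hypothetical placeholders.
# ZPOOL can be overridden so the logic can be exercised without a real pool.
ZPOOL=${ZPOOL:-zpool}

# Succeeds when "zpool status -x" reports no problems on any pool.
pools_healthy() {
    status=$("$ZPOOL" status -x 2>&1)
    [ "$status" = "all pools are healthy" ]
}

# Example crontab entries (assumed schedule):
#   0 3 * * 0  zpool scrub tank                  # start a scrub early on Sunday
#   0 9 * * 1  /usr/local/bin/check-pools.sh     # report on Monday morning
report_pools() {
    if pools_healthy; then
        echo "OK: all pools are healthy"
    else
        echo "WARNING: zpool reports a problem"
        "$ZPOOL" status -v
    fi
}
```

      Calling report_pools from cron and piping its output to whatever mail command the box has would give the emailed report; S.M.A.R.T. checks would need a separate tool.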

      Building something like this may cost a bit more than for just an external drive, but for this kind of irreplaceable data it is worth it. A small atom server board with 3-4 drives attached would be plenty, would take minimal power, and would allow access to the data from anywhere (for automated offsite backup pushes, viewing files from other devices in the house, etc).

      I run a nas4free box at home with RAID-Z3 and have been very happy with the capabilities. In this configuration you can lose 3 drives completely and not lose any data.

      --
      Why, yes I have been touched by His noodly appendage. And I plan to sue.
    2. Re:ZFS filesystem by Guspaz · · Score: 5, Informative

      You don't need raidz or multiple drives to get protection against corrupt blocks with ZFS. It supports ditto blocks, which basically just means mirrored copies of blocks. It tries to keep ditto blocks as far apart from each other on the disk as possible.

      By default, ZFS only uses ditto blocks for important filesystem metadata (the more important the data, the more copies). But you can tell it that you want to use ditto blocks on user data too. All you do is set the "copies" property:

      # zfs set copies=2 tank

    3. Re:ZFS filesystem by Mike+Kirk · · Score: 2, Informative

      I'm another fan of backups to disks stitched together with ZFS. In the last year I've had two cases where "zpool scrub" started to report and correct errors in files one to two months in advance of a physical hard drive failure (I have it scheduled to run weekly). Eventually the drives faulted and were replaced, but I had plenty of warning, and RAIDZ2 kept everything humming along perfectly while I sourced replacements.

      For offsite backups I currently rotate offline HDD's, but I should move to Cloud storage. Give a bit of my surplus space and bandwidth to someone like Symform, and in turn they give me a free little slice of the Cloud to have TrueCrypt archives mirrored into. Win-win!

    4. Re:ZFS filesystem by cas2000 · · Score: 2

      true, but you do need multiple disks (mirrored or raidz) to protect against drive failure.

      two or more copies of your data on the one disk won't help at all if that disk dies.

      fortunately, zfs can give you both raid-like multiple disk storage (mirroring and/or raidz) as well as error detection and correction.

      That ZFS_data_integrity link in the post you were replying to gives a pretty good summary of how it works.

      The paragraphs immediately above that (titled 'Data integrity', 'Error rates in hard disks', and 'Silent data corruption') also give a good summary of why error-correcting filesystems like ZFS (and btrfs) are necessary, especially with the huge sizes of modern drives.

      In fact, anyone interested should read the entire wikipedia article.

      ps: neither raid nor ZFS is a substitute for backups. you still need backups of your data (preferably with off-site copies) to protect against accidental deletion or overwrite (snapshots can help with this if used intelligently prior to the event) or burglary or catastrophic damage like fire or flood.

  3. ZFS by Electricity+Likes+Me · · Score: 4, Interesting

    ZFS without RAID will still detect corrupt files, and more importantly tell you exactly which files are corrupt. So a distributed group of ZFS drives could be used to rebuild a complete backup by only copying uncorrupt files from each.

    You still need redundancy, but you can get away without the RAID in each case.
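
    The "copy only the uncorrupt files from each" idea can be sketched without ZFS as well, assuming you keep a manifest of known-good SHA-256 checksums (one "hash  relative/path" line per file, as written by GNU sha256sum). The function name and arguments are made up for illustration.

```shell
#!/bin/sh
# Sketch only: repair_from_mirror and its arguments are hypothetical names.
# Assumes GNU coreutils (sha256sum) and a manifest of known-good checksums
# whose paths are relative to the root of each copy.
repair_from_mirror() (
    manifest=$1 primary=$2 mirror=$3
    rc=0
    # Each manifest line is "<hash>  <relative path>".
    while read -r want path; do
        have=$(sha256sum -- "$primary/$path" 2>/dev/null | cut -d' ' -f1)
        [ "$have" = "$want" ] && continue      # primary copy is intact
        other=$(sha256sum -- "$mirror/$path" 2>/dev/null | cut -d' ' -f1)
        if [ "$other" = "$want" ]; then
            # The mirror still matches the known-good hash: restore from it.
            cp -p -- "$mirror/$path" "$primary/$path" && echo "restored: $path"
        else
            echo "UNRECOVERABLE: $path" >&2
            rc=1
        fi
    done < "$manifest"
    exit "$rc"
)
```

    Usage would look like "repair_from_mirror manifest.sha256 /mnt/primary /mnt/offsite" (paths are examples). File names containing newlines or backslashes are not handled by this sketch.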

  4. Re:Excellent question by sandytaru · · Score: 2

    There are, but you'll be paying a lot of $$$ for that kind of storage in the cloud. I get 4GB for free from DropBox. SkyDrive from Microsoft will set you back $1000/month for 2TB - DropBox is about twice that much. It's not really practical for media files.

    A much better solution would be archival-quality Blu-ray discs. They can hold 25 GB apiece and they're supposed to last 100 years, but they really just need to last until a new, even denser storage medium comes along.

    --
    Occasionally living proof of the Ballmer peak.
  5. Re:Excellent question by SirMasterboy · · Score: 5, Informative

    Not all cloud storage is expensive. It's only $4 a month for unlimited backups to CrashPlan.

    They also do checksums and versioning and can be set to never remove deleted files from the backup.

    I have 12.8TB backed up to them and it's been working great.

    Other than that, ZFS can't be beat. I use that as well.

  6. Re:Checksums? by QuietLagoon · · Score: 2
    I use checksums to check for bitrot.

    Once a week, I use openssl to calculate a checksum for each file; and I write that checksum, along with the path/filename, to a file. The next week, I do the same thing, and I compare (diff) the prior checksum file with the current checksum file.

    With about a terabyte of data, I've not seen any bitrot yet.

    Long term, I plan to move to ZFS, as the server's disk capacity will be rising significantly.

  7. Re:uhuh by Anonymous Coward · · Score: 2, Informative

    Warning for all UNIX newbies: that command will reset the file to 0 bytes. Just that you know.

    (I've seen some cases when a rookie is setting up a Linux system and people jokingly throw him these "rm -rf /" commands and the poor guy actually ends up wrecking his system.)

  8. Re:Checksums? by Anonymous Coward · · Score: 2, Interesting

    Periodically checking them is the important part that no one seems to want to do.

    A few years back we had a massive system failure and once we recovered the underlying problems and began recovery we found that most of the server image backup tapes for 6 months+ could not be loaded. The ops guys took a severe beating for it.

    You think this stuff will never happen but it always does. We had triple redundancy with our own power backups but even that wasn't on a regular test cycle. Some maintenance guy left the switch open between floors for some reno job over a year prior and while the generators were running the power didn't make it to infrastructure.... it was as if hundreds of UPSs screamed at once and were silenced when failover didn't happen.

    You really can't beat Murphy's Law, but with regular testing you can soften the effects.

  9. Re:Excellent question by lgw · · Score: 3, Insightful

    Bitrot is a myth in modern times. Floppies and cheap-ass tape drives from the 90s had this problem, but anything reasonably modern (GMR) will read what you wrote until mechanical failure.

    The key therefore is to verify as you write. Usually, verifying a sample of a few GB will let you know if everything went OK. Do your backups with checksums of some sort. A modern tape drive and backup software will do that automatically, and let you schedule a verify automatically as part of backups (2 TB? That's 1 tape - might want to consider that), though ideally you should verify a tape on a different drive than the one you wrote it on.

    For disk-based backups, local or cloud, I strongly recommend archiving to a format with checksums (RAR etc) over some sort of raw file copy. Especially for anything going over the network: RAR a volume/file set locally first, then upload, then test the archive.

    If you have a superstitious fear of bitrot, you can always do some random sampling of archive integrity, and keep multiple historical copies of files just in case (e.g., don't just delete backup N-1 when you do backup N, do a rotation scheme).
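
    For the random-sampling idea, here is a sketch that spot-checks a few entries from a checksum manifest instead of testing a RAR archive. It assumes GNU coreutils (shuf, sha256sum) and a manifest whose paths are relative to the archive root; sample_check is a hypothetical name.

```shell
#!/bin/sh
# Sketch only: sample_check is a hypothetical helper.
#   $1 = archive root, $2 = manifest written by sha256sum, $3 = sample size
sample_check() {
    # Pick $3 random manifest lines and verify just those files.
    # --quiet prints only failures; the exit status is non-zero on any mismatch.
    shuf -n "$3" "$2" | ( cd "$1" && sha256sum --check --quiet )
}
```

    A sample can only show that corruption is present, not that the rest of the archive is clean, so a full check is still worth scheduling now and then.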

    --
    Socialism: a lie told by totalitarians and believed by fools.
  10. A paranoid setup by brokenin2 · · Score: 4, Interesting

    If you really want hassle free and safe, it would be expensive, but this is what I would do:

    ZFS for the main storage - Either using double parity via ZFS or on a raid 6 via hardware raid.

    Second location - Same setup, but maybe with a little more space

    Use rsync between them using the --backup switch so that any changes get put into a different folder.
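
    A sketch of the rsync step, assuming rsync is installed on both machines. The function name and the dated folder layout are illustrative choices, not part of the setup described here.

```shell
#!/bin/sh
# Sketch only: sync_with_history and the directory layout are hypothetical.
#   $1 = source dir, $2 = destination dir, $3 = absolute path for changed files
sync_with_history() {
    src=$1 dest=$2 changed=$3
    # -a preserves metadata; --backup with --backup-dir moves any file that
    # would be overwritten into a dated folder instead of discarding it.
    rsync -a --backup --backup-dir="$changed/$(date +%Y-%m-%d)" "$src/" "$dest/"
}
```

    After each run, anything that shows up under the dated folder is a file that differed between the two sites, which is the "clear list" of changes mentioned in this post.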

    What you get:

    Pretty disaster tolerant
    Easy to maintain/manage
    A clear list of any files that may have been changed for *any* reason (Cryptolocker anyone?)
    Upgradable - just change drives
    Expense - You can build it for about $1800 per machine or $3600 total if you go full-on hardware raid. That would give you about 4TB storage after parity (4 2TB drives - $800, Raid Card - $500, basic server with room in the case - $500)

    What you don't get: Lost baby pictures/videos. I've been there, and I'd pay a lot more than this to get them back at this point, and my wife would pay a lot more than I would..

    Your current setup is going to be time consuming, and you're going to lose things here and there anyway.. If you just try to do the same thing but make it a little better, you're still going to have the same situation, just not as bad. In this setup you have to have like 5 catastrophic failures to lose anything, sometimes even more..

    1. Re:A paranoid setup by cas2000 · · Score: 2

      > Just don't let ZFS know that there's more than 1 drive.

      That is *precisely* the wrong thing to do. As in, the exact opposite of how you should do it.

      Instead, configure the RAID card to be JBOD and let ZFS handle the multiple-drive redundancy (raidz and/or mirroring), as well as the error detection and correction.

      Otherwise, there is little or no benefit in using ZFS. ZFS can't correct many problems if it doesn't have direct control over the individual disks, and RAID simply can't do the things that ZFS can do.

      Of course, this means that you're actually better off with a cheap dumb non-raid HBA card (or even just the SATA ports on your motherboard if there's enough of them) than an expensive HW RAID card. This is another advantage of ZFS.

      (a good option is to use an LSI SAS2008 card or similar, and make sure it's re-flashed to "IT" mode firmware if you're using consumer-grade SATA drives with it to avoid TLER issues. readily available brand new for under $100 for 8 SAS/SATA ports)

      > You can't have them both trying to manage the redundant storage.

      yes. and it's ZFS that should be managing it, not the raid card.

      > ZFS certainly isn't necessary though, if you've got hardware raid.

      wrong. RAID does not provide error detection or correction. RAID protects against drive failures only, not silent corruption.

    2. Re:A paranoid setup by cas2000 · · Score: 3, Informative

      good post, except for three details:

      1. if you're using ZFS on both systems, you're *much* better off using 'zfs send' and 'zfs recv' than rsync.

      do the initial full copy, and from then on you can just send the incremental snapshot differences.

      one advantage of zfs send over rsync is that rsync has to check each file for changes (either file timestamp or block checksum or both) every time you rsync a filesystem or directory tree. With an incremental 'zfs send', it only sends the incremental difference between the last snapshot sent and the current snapshot.

      you've also got the full zfs snapshot history on the remote copy as well as on the local copy.

      (and, like rsync, you can still run the copy over ssh so that the transfer is encrypted over the network)

      2. your price estimates seem very expensive. with just a little smart shopping, it wouldn't be hard to do what you're suggesting for less than half your estimate.

      3. if you've got a choice between hardware raid and ZFS then choose ZFS. Even if you've already spent the money on an expensive hardware raid controller, just use it as JBOD and let ZFS handle the raid function.
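
      The incremental send/recv workflow from point 1 can be sketched roughly as follows. The dataset, snapshot, and host names are placeholders, and the function prints the commands by default instead of running them, so nothing touches a real pool unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. Dataset, snapshot, and host names passed in are hypothetical.
# By default the commands are printed, not executed; set DRY_RUN=0 to run them.
#   $1 = local dataset, $2 = previous snapshot, $3 = new snapshot,
#   $4 = ssh host,      $5 = dataset on the remote side
replicate_incremental() {
    dataset=$1 prev=$2 curr=$3 host=$4 target=$5
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "zfs snapshot $dataset@$curr"
        echo "zfs send -i $dataset@$prev $dataset@$curr | ssh $host zfs recv $target"
    else
        # Take the new snapshot, then send only the delta since the previous one.
        zfs snapshot "$dataset@$curr" &&
            zfs send -i "$dataset@$prev" "$dataset@$curr" | ssh "$host" zfs recv "$target"
    fi
}
```

      For example, "replicate_incremental tank/photos monday tuesday backuphost backup/photos" prints the two commands for review. This assumes the remote side already holds the previous snapshot from the initial full send.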

  11. Re:Excellent question by clickclickdrone · · Score: 2

    Are there cloud storage providers that can do this for the above example of an approx. 2 TB data set, and provide complete security?

    Cloud and complete security together is an oxymoron.

    --
    I want a list of atrocities done in your name - Recoil
  12. Re:Checksums? by Waffle+Iron · · Score: 5, Informative

    I never archive any significant amount of data without first running this script at the top:

    find . -type f -not -name md5sum.txt -print0 | xargs -0 md5sum >> md5sum.txt

    It's always good to run md5sum --check right after copying or burning the data. In the past, at least a couple of percent of all the DVDs that I've burned had some kind of immediate data error.

    (A while back, I rescanned a couple of hundred old DVDs that I burned ranging up to 10 years old, and I didn't find a single additional data error. I think that a lot of cases where people report that DVDs deteriorate over time, they never had good data on them in the first place and only discover it later.)
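
    A small sketch of the check step, assuming GNU md5sum and an md5sum.txt at the top of the copied tree as written by the find command in this post. verify_copy is a made-up wrapper name.

```shell
#!/bin/sh
# Sketch only: verify_copy is a hypothetical helper.
#   $1 = top of the copied or burned tree containing md5sum.txt
verify_copy() (
    cd "$1" || exit 2
    # --quiet lists only the files that fail; the exit status is non-zero
    # if any file is missing or does not match its recorded checksum.
    md5sum --check --quiet md5sum.txt
)
```

    Running "verify_copy /media/dvd" (path is an example) right after burning, and again years later, gives the same comparison described above.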

  13. Re:uhuh by Sarten-X · · Score: 2

    And yet, one of FLOSS's selling points is our great community support...

    --
    You do not have a moral or legal right to do absolutely anything you want.
  14. Re:Excellent question by mlts · · Score: 3, Interesting

    In reality, Dropbox, Skydrive, and other cloud services should be treated as a type of media, just like BD-ROMs, tape, SSD, HDD, and even hard copy.

    The trick is to use different media to protect against different things. My Blu-Ray disks protect an archive against tampering or CryptoLocker (barring a hack that flashes the BD burner's ROM to allow the laser to overwrite written sectors.) However, they have to be maintained in a good environment with a good indexing system. My files stashed on Dropbox bring me accessibility virtually anywhere... but malware that erases files could wipe that volume out in no time.

    Similar with external HDDs. Those are great for dealing with a complete bare metal restore, but provide little to no protection against malware. Tape, OTOH, is expensive for the drive and requires a fast computer, but once the read-only tab is flipped or the WORM session is closed, the data is there until the tape is physically destroyed.

    Of course, there is not just media... there are backup programs. This is why I use the KISS principle when it comes to backups. I use an archiving utility to break up a large backup into segments (with recovery segments to allow the archive to be repaired should media go bad), then burn the segments onto optical media.

    I've found that using a backup utility can work well... until one has to restore, the company is out of business, and one can't find the CD key or serial number so the software will install. One major program I used for years worked excellently... then just refused to support new optical drives (as in ignoring them completely.) So, unless I can find a DVD drive on its antiquated hardware list on eBay, all my backups are inaccessible. I was lucky enough to find that and copy the data to a HDD, but using the lowest common denominator is a good thing.

    Backups are the often neglected underbelly of the IT world. While storage, security, availability and other technologies have advanced significantly, backups on the non-enterprise level are still languishing behind in almost every way possible. It was only a few years ago that encryption became standard with backup utilities [1].

    [1]: With encryption comes key management, and some backup programs make that easy, some make it incredibly hard.

  15. Re:Look to the past by Venotar · · Score: 2
    The tapes may be stable (I'm suspicious of that claim: their temperature tolerances aren't as high as modern hard drives, they actually care about dust, and I would expect them to be more susceptible to magnetic interference); but the tape drives are not. Over time drive heads become misaligned. They continue to write fine and can read what they write; but sufficient misalignment prevents other drives of the same type from reading the tape. That tape then becomes only as useful as the drive that wrote it. Lose the drive, you lose the use of the data on the tape. Unless you test reading the tape in a different drive than it was written from (while the writing drive is still available for pulling the data out), this condition's effectively undetectable until you actually need the data.

    There's a reason so many shops have moved to disk based backups. Tape simply isn't reliable. Tape is cheap; but definitely NOT reliable.

  16. Re:Excellent question by rabtech · · Score: 5, Interesting

    Bitrot is a myth in modern times. Floppies and cheap-ass tape drives from the 90s had this problem, but anything reasonably modern (GMR) will read what you wrote until mechanical failure.

    This isn't just wrong, it's laughably wrong. ZFS has proven that a wide variety of chipset bugs, firmware bugs, actual mechanical failure, etc are still present and actively corrupting our data. It applies to HDDs and flash. Worse, this corruption in most cases appears randomly over time so your proposal to verify the written data immediately is useless.

    Prior to the widespread deployment of this new generation of check-summing filesystems, I made the same faulty assumption you made: that data isn't subject to bit rot and will reproduce what was written.

    ZFS or BTRFS will disabuse you of these notions very quickly. (Be sure to turn on idle scrubbing).

    It also appears that the error rate is roughly constant but storage densities are increasing, so the bit errors per GB stored per month are increasing as well.

    Microsoft needs to move ReFS down to consumer products ASAP. BTRFS needs to become the Linux default FS. Apple needs to get with the program already and adopt a modern filesystem.

    --
    Natural != (nontoxic || beneficial)
  17. Re:Excellent question by Anonymous Coward · · Score: 2, Insightful

    it doesn't seem that way... http://forums.freenas.org/threads/ecc-vs-non-ecc-ram-and-zfs.15449/

  18. Re:BTRFS filesystem by mlts · · Score: 4, Informative

    I'll be the heretic here, but on Windows 8.1 and Windows Server 2012 R2, there is a feature called Storage Spaces. It works similar to ZFS where you toss drives into a pool, then create a volume that is either simple, mirror, or with parity, and Windows does the rest. If a volume needs more space, toss some more drives in the pool.

    To boot, it even offers autotiering so data can be stored on a SSD that is frequently used, or remain on the HDDs if it isn't. Deduplication is handled on the filesystem level [1].

    No, this isn't a replacement for a SAN with RAID 6 and real-time deduplication, but it does get Windows at least in the same ballgame as Oracle with ZFS.

    [1]: Not active deduplication. The data is initially stored duplicated, but a background task finds identical blocks and adds pointers. Of course, the made from scratch filesystem, ReFS (which has the ability to check for bit rot on reads like ZFS), doesn't have this, so one is still stuck with NTFS for this feature.

  19. Have mercy! by c0d3g33k · · Score: 4, Funny

    We have hundreds of thousands of family pictures and videos we're trying to save using this advice. But in some sparse searching of our archives, we're seeing bitrot destroying our memories. With the quantity of data (~2 TB at present),

    As the proud owner of dozens of family photo albums, a stack of PhotoCDs etc which rarely see the light of day, the bigger challenge is whether anyone will ever voluntarily look at those terabytes of photos. Having been the victim of excruciating vacation slide shows that only consisted of 40-50 images on a number of occasions (not to mention the more modern version involving a phone/tablet waving in my face), I can only imagine the pain you could inflict on someone with the arsenal you are amassing.

  20. The old-fashioned method by TheloniousToady · · Score: 4, Interesting

    Don't forget the old-fashioned method: make archival prints of your photos and spread copies among your relatives. Although that isn't practical for "hundreds of thousands", it is practical for the hundreds of photos you or your descendants might really care about. The advantage of this method is that it is a simple technology that will make your photos accessible into the far future. And it has a proven track record.

    Every other solution I've seen described here better addresses your specific question, but doesn't really address your basic problem. In fact, the more specific and exotic the technology (file systems, services, RAID, etc.) the less likely your data is to be accessible in the far future. At best, those sorts of solutions provide you a migration path to the next storage technology. One can imagine that such a large amount of data would need to be transported across systems and technologies multiple times to last even a few decades. But will someone care enough to do that when you're gone? Compare that to the humble black-and-white paper print, which if created and stored properly can last for well over a hundred years with no maintenance whatsoever.

    Culling down to a few hundred photos may seem like a sacrifice, but those who receive your pictures in the future will thank you for it. In my experience, just a few photos of an ancestor, each taken at a different age or at a different stage of life, is all I really want anyway. It's also important to carefully label them on the back, where the information can't get lost, because a photo without context information is nearly meaningless. Names are especially important: a photo of an unknown person is of virtually no interest.

    Sorry I don't have a low-tech answer for video, but video (or "home movies", as we used to call it) will be far less important to your descendants anyway.

    1. Re:The old-fashioned method by Grizzley9 · · Score: 3, Interesting

      Agreed. Looking through a family picture album from the late 1800's I realized my hundreds of GB's of current family pics will likely die with me. There are a ton of family images and a select few family pics may be copied by progeny but unlike their printed counterparts, there are no names or locations on many (and sometimes dates if the exif gets corrupted or overwritten).

      So what good is a bunch of pics or videos of long past events except to the person involved? Digital images today, unless meticulously managed and edited, do little good for historical purposes compared with the photo album of yesterday. That is especially true if they are locked away in some online archive, which stays accessible only if the owner keeps up with format and company changes over the decades and the descendants know where the files are.

  21. Prepare for maintainer-rot, too by Rob+the+Bold · · Score: 3, Interesting

    A family archive maintained by the "tech guy/gal" in the family is also subject to failure from death or disability of the aforementioned maintainer. Any storage/backup solution should therefore be sufficiently documented (probably on paper, too) that the grieving loved ones can get things back after a year or two of zero maintenance and care of the system. That would also imply eschewing home-brew type systems in favor of using standard tools so a knowledgeable tech person not familiar with the creator's original design can salvage things in this tragic but possible scenario. Document the system so even if the family can't do it themselves, and an IT guy has to be contracted to resurrect the data, he'll have the information needed to do so.

    Any system sufficiently dependent on regular maintenance by just one particular person is indistinguishable from a dead-man time-bomb.

    --
    I am not a crackpot.
  22. You need an editing plan more than a backup plan by neo-mkrey · · Score: 4, Interesting

    100,000s -- like 300,000? More? How many of them will you actually ever look at again? Less than 1%, I'm guessing. Here's my advice (and it's what I do). Step 1) When transferring pics to your computer, delete the ones that are out of focus, badly lit, framed poorly, etc. This is about 15%. Step 2) Once a month, go through the photos you have taken the previous month and delete those that just don't mean as much anymore (if they have decreased in emotional value in 30 days, just think how utterly worthless they would be in 5 years). This takes care of another 30%. Step 3) Once every 3 months, my wife and I pick the cream of the crop for physical prints. This is about 10%. These are stuck into photo albums, labeled and kept in a fireproof safe in our basement. So 200 photos a month gets reduced to ~100, and then 10 per month are printed. YMMV

  23. Re:Excellent question by entrigant · · Score: 3, Interesting

    I've been surprised by the lack of reference to properly error-checked data paths so far in these comments. I'm continually saddened by the ever-increasing aggressiveness in clocks and density of RAM in consumer-level systems while vendors stubbornly refuse to implement ECC. Many people are even hostile to the idea, as if ECC RAM is somehow tainted.

    This article points out something else I'd not even considered. A scenario where lack of ECC on a self healing file system can amplify a RAM failure to a catastrophic degree making such filesystems even riskier to run on consumer grade systems.

    Thank you for sharing.

  24. Photos = Lightroom plus DNG on a Drobo by carlcmc · · Score: 2

    Convert photos to DNG in Adobe Lightroom and use the ability for it to check for file changes. Store on a Drobo with dual disk redundancy.

  25. ZFS, of course by rainer_d · · Score: 2

    but there is a catch: to reliably detect bit-rot and other problems, you also need server-grade hardware with ECC.
    ZFS (especially when your dataset-size increases and you add more RAM) is picky about that, too.
    Bit-rot does not only occur in hard-disks or flash.
    You should really, really take a hard look at every set of photos and select one or two from each "set", then have these printed (black and white, for extra longevity).
    If this results in still too many images, only print a selection of the selection and let the rest die.

    --
    Windows 2000 - from the guys who brought us edlin
  26. Do not defrag ? Definitely do not over clock. by perpenso · · Score: 2

    ZFS has proven that a wide variety of chipset bugs, firmware bugs, actual mechanical failure, etc are still present and actively corrupting our data.

    And I expect that defragging aggravates this. Read a perfectly good block of data from disk into flaky RAM, have a bit flip, and write out that corrupted data to its new location. Even if the software is verifying, it's likely to verify against RAM, and it did successfully write what is in RAM.

    And then there is over clocking. If a computer is just used for gaming, no problem. But if it's used for more serious things or archiving things of value to you then you may want to pass on over clocking. Folks who say you can verify an over clocked CPU are mistaken. It's not a crash or no crash thing; at a certain unpredictable point in over clocking an unpredictable CPU instruction may simply give an incorrect result. This incorrect result could end up in your data or image. I've seen over clocked CPUs mess up a text string that is supplied by the CPU itself, CPUID's vendor string.

  27. Re:Just get a carbonite account by gravis777 · · Score: 2

    how can you be sure that your cloud provider is not suffering from bitrot on your stored files?

    http://en.wikipedia.org/wiki/Carbonite_(online_backup)#Product_details

    Works for me - better than what I have going on at home, and cheaper than I could set up something like this. And anyways, I still have my External HDD backups as well. It's just another level of backup to keep me from data loss.

  28. Re:Excellent question by lgw · · Score: 2

    Well, I did backup software and hardware for nearly 20 years. But I can't substantiate that with a link.

    --
    Socialism: a lie told by totalitarians and believed by fools.
  29. Re:Excellent question by bluefoxlucid · · Score: 3, Interesting

    I used to fancy a girl who worked as a data recovery engineer. You wouldn't believe how many people hear the RAID controller alarming and get up to close the case instead of hot swapping a spare drive.. then a week later the second drive goes. She had a fanciful story about how spinning disks used to occasionally fail in such a way that a random sector would go bad, report incorrect data, and a RAID-1 mirror would "fix" it by destroying data on the other drive. She also used to tell me software RAID options had a tendency to actually beat hardware RAID options for data integrity outside of other inline failures--that is, when the system is operating under optimal circumstances, most hardware RAID systems more often self-corrupt than software RAID systems. Just an odd statistic, and I never got overall risk performance stats out of her.

  30. MD5 and a few scripts by MooseTick · · Score: 2

    Here's a cheap easy solution (assuming you can write some basic scripts)

    1. Start by taking an MD5 of all your pics. Save the results.
    2. Back up everything to a 2nd drive. Take MD5s and be sure they match using basic scripts.
    3. Periodically scan drive 1 and 2 and compare against their expected MD5 value. If one has changed, copy it from the other (assuming it is still correct).

    You could expand this with more drives if you are extra paranoid. You could do this cheap, check regularly, and know when bitrot is happening.
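
    The three steps above can be sketched roughly as follows. This is only an illustration, not the poster's actual scripts: the function names (md5_of, build_manifest, verify_and_repair) and the two-drive directory layout are assumptions made for the example.

```python
import hashlib
import shutil
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading in chunks so large videos are not loaded whole."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Step 1: map each file's path (relative to root) to its MD5."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): md5_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_and_repair(manifest, drive1, drive2):
    """Step 3: re-hash both copies; overwrite a bad copy with a good one.

    Returns (repaired, unrecoverable) lists of relative paths.
    """
    repaired, unrecoverable = [], []
    for rel, expected in manifest.items():
        copies = [Path(drive1) / rel, Path(drive2) / rel]
        ok = [c.is_file() and md5_of(c) == expected for c in copies]
        if all(ok):
            continue
        if not any(ok):
            # Both copies differ from the saved MD5: nothing to restore from.
            unrecoverable.append(rel)
            continue
        good, bad = (copies[0], copies[1]) if ok[0] else (copies[1], copies[0])
        bad.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(good, bad)
        repaired.append(rel)
    return repaired, unrecoverable
```

    Step 2 is then just a check that build_manifest(drive2) equals the manifest saved from drive 1. The manifest itself should be saved somewhere other than the drives it describes, since a corrupted manifest would make good files look bad.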

  31. Re:BTRFS filesystem by RR · · Score: 3, Informative

    The only way to truly prevent bitrot is by maintaining at least three complete copies of the data, and regularly compare between them.

    There you go again. Acting like you know what you're talking about, but you don't.

    ZFS and BTRFS have a much more efficient way to ensure correctness: CRC of everything written. That is what is checked when you do a zpool scrub or a btrfs scrub. Random errors are very unlikely to produce the same checksum, so then you only need a second copy that doesn't produce CRC errors.
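
    A toy model of that idea, for anyone who wants to see why two copies are enough once each block carries its own checksum. This is a sketch only: zlib.crc32 stands in for whatever checksum the filesystem really uses, and write_block, is_intact and scrub are made-up names, not part of any ZFS or BTRFS interface.

```python
import zlib


def write_block(data):
    """Store a block alongside the CRC32 computed at write time."""
    return {"data": bytes(data), "crc": zlib.crc32(data)}


def is_intact(block):
    """A block is trusted only if its data still matches its stored CRC."""
    return zlib.crc32(block["data"]) == block["crc"]


def scrub(mirror_a, mirror_b):
    """Check every block on both mirrors; heal a bad block from its intact twin.

    Returns (healed, lost): indices repaired and indices bad on both sides.
    """
    healed, lost = [], []
    for i, (a, b) in enumerate(zip(mirror_a, mirror_b)):
        a_ok, b_ok = is_intact(a), is_intact(b)
        if a_ok and b_ok:
            continue
        if a_ok:
            mirror_b[i] = dict(a)  # copy the good block over the bad one
        elif b_ok:
            mirror_a[i] = dict(b)
        else:
            lost.append(i)  # no intact copy left to heal from
            continue
        healed.append(i)
    return healed, lost
```

    The checksum is what tells you which copy is the bad one. Without it, two mirrors that disagree only tell you that something is wrong, which is why the plain "compare copies" approach needs a third copy to break the tie.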

    Hard drives are nowhere near as reliable as their manufacturers claim. Modern drives don't store the bits that you feed them exactly as you give them. Instead, they use CRC and error correcting codes, so they only need most of the data to be correct. Usually, if the data doesn't match the CRC, and it cannot be corrected by ECC, then you get a read error instead of corrupted data. Which, I guess, is better than getting a corrupted picture. Ideally, a RAID would be able to recreate the missing block, but I can't find any reference to a RAID doing that.

    But I've seen enough errors that I suspect something else is going on. It surely doesn't help that modern computers have many gigabytes of memory, but almost none have ECC on that memory. Your computer can be corrupting your data, and you have no warning that it's happening. In addition, hard drives lie. I'm not optimistic about the long-term storage of electronic data.

    --
    Have a nice time.
  32. Re:BTRFS filesystem by girlintraining · · Score: 2

    There you go again. Acting like you know what you're talking about, but you don't. ZFS and BTRFS have ...

    Exactly dick to do with what I said. The filesystem doesn't matter. The operating system doesn't even matter.

    Modern drives don't store the bits that you feed them exactly as you give them. Instead, they use CRC and error correcting codes, so they

    ... Which again counts for exactly dick. I'm talking about infrastructure and architecture, while you're blubbering on about the hardware.

    Which, I guess, is better than getting a corrupted picture. Ideally, a RAID would be able to recreate the missing block, but I can't find any reference to a RAID doing that.

    That's because you have no experience as a network administrator in a professional environment. Because then you'd know that's the very thing RAID was designed to do: Recover from hardware failure, which includes sectors becoming unreadable. You are clearly confused about both the level of abstraction being discussed (architecture versus hardware) and the different types of failure modes each of these solutions presents. Bit rot is a physical process that occurs in all magnetic media, and at sufficiently small scale, can also affect non-persistent storage such as RAM.

    It surely doesn't help that modern computers have many gigabytes of memory, but almost none have ECC on that memory.

    That's because ECC adds an extra layer of complexity to solve a problem that doesn't occur very often in computers, and when it does, the most severe consequence is usually that the computer crashes or behaves abnormally. For residential, and even most commercial uses, ECC memory just isn't needed. But for a select few use scenarios where data integrity is absolutely critical -- such as, say, nuclear power plants, air traffic control systems, certain types of hospital equipment, or financial processing systems, the added cost is justified because they need high availability/high reliability of those systems. It's also used in certain aerospace applications because the physical mechanism that causes bitrot -- high energy radiation, increases quite a bit at higher altitudes, and in space increases several orders of magnitude -- and if you're going to put something in geostationary orbit, it then takes the full brunt of solar radiation with no mitigation. Correcting for memory problems in these situations is better done at the hardware level; hence ECC memory.

    Your consumer-grade computer's memory is a piece of shit. It's made with commodity capacitors and ICs that are stamped out in bulk for super cheap. And, big surprise -- super cheap doesn't mean super reliable. But we don't need super reliability -- when our system shows obvious signs of a failing memory stick, we just drive to the store, plunk down a $20 and abscond with a new one. Problem solved.

    I'm not optimistic about the long-term storage of electronic data.

    That's because, as previously pointed out, your experience comes from consumer-grade hardware whose design considerations you don't fully understand. NASA has had great success in the long-term storage of magnetic media -- in fact there was an article not long ago about how they had to reverse-engineer equipment designed during the 1960s for the Apollo program to recover data on tape reels, when they lacked the original equipment it was recorded from. They discussed how the tapes themselves had become brittle and the ferrous oxide would actually peel off in chunks while reading, much like how paint peels off a house, but they were able to recover this data anyway. The technology we have today is far more sophisticated and, unlike old tape technology, doesn't require physical contact with the source media to read it. There are companies like OnTrack that specialize in data recovery from hard drives and boast a rema

    --
    #fuckbeta #iamslashdot #dicemustdie
  33. Re:BTRFS filesystem by MarkTina · · Score: 2, Informative

    RAID10 and similar systems are two RAID5 systems which are independent and regularly compare data; These can detect which system is inconsistent, so you will always have at least one copy of your data in a consistent state.

    You were doing quite well up until you said that sentence .....