Slashdot Mirror


Large IDE Drives as Long-Term Archival Media?

PlatterMan asks: "The question of how to cope with backing up disk drives which are rapidly increasing in size onto tape and other backup devices which aren't scaling in size as quickly isn't new to Slashdot. Neither is the use of single, RAIDed, and removable disks as backup devices, which has been covered numerous times on Slashdot. One thing I haven't really seen discussed, however, is the feasibility of disk drives as medium to long-term archival media, say 5 to 10 years. Like many people I'm in the position of now having multiple machines with a combined data pool of about 220 Gig, and backing these up onto DDS or DLT tapes is slow and manual to do, and expensive in tape costs. So I'm looking to add a removable drive bay to my primary backup machine and pick up a bunch of large IDE drives, so that I can do regular disk to disk backups over 100 Meg Ethernet (and for my machines which are in cages, over the Net), pulling out and alternating the backup drives on a 3-way backup cycle."

"Backups are of no use without offsite archival copies so I plan to take one set of disks out of the pool, and archive them offsite on a quarterly basis.

However, I've heard horror stories about the data retention and usability of older disks which have been shelved for archival, for example disk stiction - where people try to restore data off of a 4 to 5 year old drive only to find that the disk won't spin up due to solidification of lubricants, or that they've experienced data degradation.

I'd be interested in the Slashdot crowd's opinion on using large IDE drives as an archival medium. Clearly one possible problem is being able to get hold of a machine in the future with a suitable IDE interface to plug them into for restoration, but I can't see IDE disappearing within 5 years (maybe 10 though). I'm more interested in experiences and opinions on the suitability of the disks themselves for long-term archival.


  • Is stiction still likely to occur on newer makes of IDE drives, or have manufacturers beaten the problems which caused this in the past?
  • Likewise, how likely is bit drop-out and general data degradation over, say, a 5 year and 10 year period, and what do people think would be the likely maximum feasible time that a shelved drive would be usable for?
  • Any suggestions as to how I would need to store drives in order to minimize these types of problem and maximise their feasible life as archival media?
Thanks!"

28 of 710 comments

  1. It's the next AYB^H^H^H Soviet Russia by Dental+Plan · · Score: 5, Funny

    Backing up to IDE hard drives.... That's a paddling

    Not using SCSI like you should... That's a paddling

    The right tool for the job is a tape drive, if you don't use it.... That's definitely a paddling.

  2. Steve Gibson by Jucius+Maximus · · Score: 5, Informative
    Please don't flame me for quoting Steve Gibson, but I think he's right on this account: "There are only two kinds of hard drives -- Those that have failed and those that will fail."

    Hard drives are not non-volatile storage.

  3. warranty period by Clover_Kicker · · Score: 5, Insightful

    Since IDE HD manufacturers recently decreased their warranty period, I'd be *really* reluctant to trust 'em 10 years from now.

    1. Re:warranty period by fishbowl · · Score: 5, Informative

      "Who the fuck has 220GB of personal data? "

      I'm getting there, in audio data.

      My own music, that I write and record, so, going down to the store to replace it isn't exactly an option.
      It's also on DAT, and on CD audio, so you could say
      I have a backup, but that's not really true -- the DAT is the source material, and a CD would represent one view of some of the data.

      Am I going to buy a $65,000 SAN tape library machine, just because I'm getting into volume? (No.) Would I like an inexpensive solution that is less cumbersome than CDR? (Yes.)

      --
      -fb Everything not expressly forbidden is now mandatory.
    2. Re:warranty period by Kaa · · Score: 5, Interesting

      Who the fuck has 220GB of personal data?

      And what's so weird about it?

      A scan of a single frame of a 35mm film, on a high-end consumer film scanner will create a file... let's see:

      The scanner is 4000dpi, so the resulting image is about 4000x6000 pixels. We are working in 16-bit-per-color-channel mode, so that's 6 bytes per single pixel. A bit of multiplication gets you 144MB. As a practical matter, the film frame is slightly smaller so your output TIFF file is about 120MB in size. That is for a single 35mm film frame.

      So raw scans of slightly under 2000 film frames will already hit the 220GB figure.

      Still think it's a ridiculous number?
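The arithmetic in this comment can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, and decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes) are an assumption.

```python
def scan_size_bytes(width_px, height_px, channels=3, bytes_per_channel=2):
    """Uncompressed size of one scanned frame, in bytes."""
    return width_px * height_px * channels * bytes_per_channel

# Nominal 4000x6000 scan at 16 bits (2 bytes) per colour channel.
full_frame = scan_size_bytes(4000, 6000)

# The comment's practical figure of roughly 120 MB per TIFF.
practical_frame = 120 * 10**6

# The 220 GB data pool from the original question.
pool = 220 * 10**9

frames_to_fill_pool = pool // practical_frame

print(full_frame)           # 144000000 bytes, i.e. 144 MB
print(frames_to_fill_pool)  # 1833 frames, "slightly under 2000"
```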

      --

      Kaa
      Kaa's Law: In any sufficiently large group of people most are idiots.
  4. rock and chisel by Lxy · · Score: 5, Funny

    with all the stories I've seen about being unable to retrieve data from just 15 yrs ago (because the format is unreadable, not because the media deteriorated) I'm convinced that archiving data using a chisel and a rock is the best way to go.

    --

    There is no reasonable defense against an idiot with an agenda
    :wq
    1. Re:rock and chisel by nsample · · Score: 5, Interesting


      I know this parent was modded up as +Funny, but it's actually +Informative. "Rock and chisel" are the best thing we have, and there's a real trend toward using it more. Take a look at Norsam's HD-Rosetta. It's an etched nickel plate designed to last for thousands of years. Vive la Rock & Chisel!

  5. Why Tape Is Good by Jucius+Maximus · · Score: 5, Informative
    Tape may be inconvenient but it is still a true backup medium. With hard drives, the reading and writing hardware are enclosed with the platters. So when the read head of the HDD fails, your data may be 100% intact on the platters but you can't get at it without professional help. How many other parts in the HDD could fail without harming the platters? A lot!

    With tape, the failure of a tape drive doesn't separate you from your data (unless it catches on fire with the tape in it or something.) You can just get a new tape drive and you are good to go again.

    Thus, tapes are very good because the storage medium and the read/write hardware are separated and not interdependent.

    1. Re:Why Tape Is Good by BlankTim · · Score: 5, Insightful

      Obviously, you've never had a tape physically fail.

      Maybe it's just me, but after the experiences I've had the last year with crappy tapes, I'm surprised the "tape as a backup medium" idea hasn't been seen for the farce that it is.

      Backing up to IDE or SCSI? Good short term solution, but I don't think I'd trust my backup drives for more than 1 year, tops.

      Burn to CD? Good long term solution, just not practical due to the file sizes involved. Burn to DVD isn't much better.

      It's time for something new. Hell, maybe it will turn into the next "killer thing" and revitalize the economy.

      I vote for soft bubble memory

      --
      Just once, I'd like it if someone called me "Sir".
      Without adding, "You're creating a scene."
  6. Long Term Storage by caseydk · · Score: 5, Informative
    The Library of Congress is attempting to answer this question, as it holds huge amounts of media on rapidly degrading materials (such as nitrate-based film).


    Their answer? A huge RAID array starting at 180TB and growing steadily over time.


    Your answer? Probably figure out which of the data is fixed and which of it changes and attempt to back up accordingly. Does all 220gb change on a weekly basis? That seems unlikely...
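One way to act on that suggestion is to sort files by how recently they were modified. The sketch below is an assumption-laden illustration, not a backup tool: `split_by_age` is a hypothetical helper, the 7-day window is arbitrary, and modification time is only a rough proxy for "this data changes".

```python
import os
import time

def split_by_age(root, max_age_days=7, now=None):
    """Return (changed, unchanged) lists of file paths under root.

    A file counts as "changed" if its modification time falls within
    the last max_age_days days.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    changed, unchanged = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) >= cutoff:
                changed.append(path)
            else:
                unchanged.append(path)
    return changed, unchanged
```

Files in the "unchanged" list are candidates for a one-time archival copy, while the "changed" list is what the regular backup cycle actually needs to cover.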

  7. Tapes *is* the right medium for long term backup by MooRogue · · Score: 5, Insightful

    I'm sorry, but 220GB is easily handled by backup tape. With SDLT and AIT tape capacities exceeding 100GB per tape, two tapes can easily handle your load.

    If you have the budget, get an autoloader so you can perform a full backup in one session, or two tape drives for that matter.

    Personally, I am backing up 600+GB onto tape and it works well. I've had numerous IDE hard disk failures, yet not a single data tape failure so far.

  8. Slashdot - the "Jackass" of tech support by HotNeedleOfInquiry · · Score: 5, Funny
    Here's some more questions:

    Can I use my laser printer to print on Gummy Bears?

    Can I dry my cat in the microwave?

    Can I put rice in my car radiator?

    Can I unplug all the fans in my computer so it will run quieter?

    Can I run 120 VAC on the spare CAT5 pairs?

    --
    "Eve of Destruction", it's not just for old hippies anymore...
  9. Re:GraniteDigital is what I use by coyote-san · · Score: 5, Interesting

    At the least, toss the media into freezer-weight ziplock bags. Better yet is double-bagging it - put the media in a smaller bag, and then in a larger bag with smaller bag's opening on the 'far' side.

    Paper-rated "fire safes" work by using a medium that undergoes a phase change at high temperatures, releasing steam in the process. (Think of the latent heat involved in freezing and melting ice; the same theory is used to keep the interior of the safe at a reasonable temperature.)

    The only problem is that paper tolerates steam fairly well. Ditto the smoke that can make its way into the safe. The paper may be damaged, but it is still readable. Computer media will be destroyed. Fortunately freezer-weight plastic is more than adequate to block the steam, leaving only small openings in the seal. Even this is modest, and the second bag is mostly to allow you to avoid smearing soot onto the media as you remove it from the bag.

    --
    For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
  10. Crappy backups better than nothing by jolshefsky · · Score: 5, Interesting
    I don't know how "pro" you want to go with this, but I ran into a similar situation and resigned myself to the same solution. My DDS2 SCSI tape drive is getting to be too small at 4/8GB. I would like to have a tape solution, but it's too expensive for my purpose. I get drives as pulls and last-years-models so I only spent US$150, but with tapes at US$10, even 8GB is absurdly small. If I were to go with new equipment and step up to DDS-4, I'd be out about US$1000 for the drive and another US$20 for each 20-40GB tape. Total cost for a basic 3-tape rotating backup: US$1060.

    On the other hand, I could spend (as I have) US$40 on a basic (a.k.a. el-cheapo) FireWire-IDE case, US$30 for 3 removable IDE enclosures, and (eventually) about US$70 each for 3 60GB IDE drives. Total cost: US$280.

    What do I sacrifice? Not much ... one of the drives might fail. At that point I'd just replace it with another US$70 capacity drive (which would probably be larger.) If I needed to restore something from backup, I'm already looking at up-to 24-hour old data, and if that drive happened to die, possibly 48-hour ... it's unlikely that all the drives would fail at once.

    The advantages? I can use the US$780 I save for something else and I don't have to worry about shelling out another US$1000 every four years just to scale to "current" requirements. I don't know what the upper limit of an IDE drive is these days (i.e. what can the ATAPI bus handle) but even 200GB is pretty big for me right now.

    Anyway, just a few thoughts. The basic thing is lower cost for nearly the same risk ... tapes fail too, you know. Remember, too, that this story would be very different if I had to handle 50 machines instead of 2.
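The totals quoted in this comment can be reproduced as follows. The prices are the poster's own; the helper name `rotation_cost` is just an illustrative choice.

```python
def rotation_cost(fixed_costs, media_unit_cost, media_count=3):
    """Total cost of a rotating backup: one-off hardware plus N pieces of media."""
    return sum(fixed_costs) + media_unit_cost * media_count

# DDS-4 route: US$1000 drive, three tapes at US$20 each.
tape_total = rotation_cost([1000], 20)

# IDE route: US$40 FireWire-IDE case, US$30 for the three enclosures,
# three 60GB drives at US$70 each.
disk_total = rotation_cost([40, 30], 70)

savings = tape_total - disk_total

print(tape_total, disk_total, savings)  # 1060 280 780
```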

    --
    --- Jason Olshefsky

    Karma: Poser (mostly affected by adding this line long after everyone else did)

  11. Tapes are NOT a long term archival medium. by silentbozo · · Score: 5, Insightful

    Tapes are fine for backups, but I never expect to pull complete and usable data off of them after 6 months. Why? Tape degrades - it's nothing more than rust on plastic. As humidity and temperature change, you can end up with a solid roll which will stick to your tape drive heads and result in whole patches of magnetic coating coming off. I worked on a project restoring data from 10+ year old reel-to-reel tape, and it was a nightmare. 1 out of 4 tapes was completely unusable.

    Even worse, tape drive formats keep changing - and since tape drives are guaranteed to wear out, where are you going to get a working tape drive to restore data 5, 10, 15 years from now? I've gone through 3 tape drives in the last 8 years - thank god I got a CD burner early, that data I can still read (although it's about time to start recopying stuff from 1996.)

    Basically, if you entrust your data to tape long term, you have to continuously copy that data to new tapes and/or new tape formats. Where tape has traditionally shined is as a short-term backup format, although with the drop in price of DVD-burner drives/media, and the high cost of high-capacity tape drives/media, this may no longer be the case (assuming you get some peon to do the big backup on DVDs, and you get to do daily diffs - otherwise, having a bank of tape drives is cheaper on staff time.)

  12. Re:Ask who's actually doing it. by DJPenguin · · Score: 5, Informative

    Well, don't know about LucasFilm, but Pixar use massive tape libraries (we are talking robots with 100+ drives and tens of thousands of slots.)

    Incremental backups every HOUR, tape drives spinning all the time. They are a customer of the company I work for. (Veritas)

  13. Re:Tapes *is* the right medium for long term backu by Drakantus · · Score: 5, Insightful


    "I have $500 to spend on a backup solution for my 220GB data pool, and I was thinking of buying 4 120GB IDE drives along with an IDE RAID1 card and useing the array for backups, anyone have other ideas?"

    "No way, you are insane. IDE is horribly unreliable and you will surely lose your data. You need a $6000 tape drive, if you can't afford it you are better off with no backups at all"

    --
    I love going down to the elementary school, watching all the kids jump and shout, but they dont know I'm using blanks.
  14. Re:Tapes *is* the right medium for long term backu by glesga_kiss · · Score: 5, Informative
    I've had numerous IDE hard disk failures, yet not a single data tape failure so far.

    You speak of not having tape failures, but you omit one important fact; how many times have you successfully retrieved data from tape?

    IDE disks will fail from continual use, and that failure will generally be obvious, but what way do you have of knowing that you genuinely don't have any tape failures, if all you are doing is rewriting over the same tapes?

  15. Perfect Storage Medium by techsoldaten · · Score: 5, Funny

    For my clients, I always suggest the use of stone and / or clay tablets for all mission critical data archive projects, regardless of size or scope. Babylonian and Greek models of data retention from as far back as 4,500 years ago are (in many cases) superior to the models we commonly use today, with much of the physical media having survived electrical storms, tornadoes, floods, fires, and wars on every scale imaginable with a data corruption rate of zero and without the benefit of a climate controlled room, dedicated security staff, or even a closet for media storage. Imagine the elegance of a 84'3/4 STROM (Stone Tablet Read Only Memory) machine hooked up to your Slackware Archive server for performing restorations, and the ST Binary Writer you have networked to your backup systems and kept physically over by the quarry... nice! The TCO for slab is far less than that of tape archives, considering you can store the media in a pile of mud and hose it down when you are ready for a restoration.

    M

  16. Five Points About Archiving by maggard · · Score: 5, Insightful
    1. Accept that you can't just stick magnetic media on a shelf (in a vault, even climate-controlled) and expect it to last forever.

      Bits rot. Under the most perfectly controlled environment the damn stuff still goes bad. Be realistic, anticipate this, do everything you can to slow it down, but plan for it and make provisions when you first put your archiving strategy in place. Tapes are likely more robust than platters as there are fewer critical parts to go wrong, but nothing is perfect.

    2. Accept that CD & DVD don't have 100-year lifespans, mebbe not 10 year, and possibly far less.

      Yes they're cheap, but we've far less experience with these media than we do with tape, and studies are showing that the dyes may not be as stable as first thought. Heck, there's even a bug out there that eats some of these. There's also the question of long-term standards in some cases like DVDs.

    3. Checksums and multiple-backups (that reinforce each other) are a necessity.

      Nothing's worse than losing one part of an archive at one site, another part at a different site, and being unable to easily reconcile the two to get a good whole set. Make sure that however you archive things, same media or different media, that partial archives can be reconciled.

    4. Everything evolves - Keep updating backups.

      Years ago there was a big scramble to recover the US Govt's 1950 Census. It had been stored on steel tape and the required Unisys readers were no longer available. (Much of the data was available but the entire raw set wasn't.) Eventually a working one was built from cannibalized parts in museum and private collections but the lesson was clear: Don't depend on the readers. The same goes for the recent BBC Domesday Book debacle - nobody could read the optical disks. Any good archive scheme will call for the material to be re-read and re-transcribed regularly in order to ensure the entire recovery-chain still works: Hardware, software, OS's, etc. If recovery becomes difficult, migrate the material.

    5. Be pragmatic about what you archive.

      All too often folks archive everything 'cause they're too lazy to determine what is actually necessary and what isn't. Combine this with the difficulty of later having someone unfamiliar try to winnow down the material and this becomes a real problem. Even worse is later trying to find the useful material among all of the dross. Establish clear policies of what can be archived and make folks justify their material. Just as importantly make sure the costs are clear up front, even to the point of charging them a rate covering several years of storage initially. Suddenly some pack-rat deciding EVERYTHING they've ever typed is potentially a goldmine isn't so funny. Lastly, run everything past Legal: Some of this they don't want hanging around any longer than necessary.
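Point 3 above (checksums plus copies that can be reconciled) could look something like the following. This is a minimal sketch under stated assumptions: SHA-256 is one reasonable choice of checksum, and `build_manifest` and `reconcile` are hypothetical helper names, not part of any existing backup tool.

```python
import hashlib
import os

def file_sha256(path, chunk_size=1 << 20):
    """SHA-256 hex digest of a file, read in chunks to keep memory use flat."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_sha256(full)
    return manifest

def reconcile(reference, copy_a_root, copy_b_root):
    """For each file in the reference manifest, report which copy is still good.

    Returns {relative_path: "a" | "b" | None}; None means neither copy
    holds a file matching the recorded digest.
    """
    result = {}
    for rel, digest in reference.items():
        result[rel] = None
        for label, root in (("a", copy_a_root), ("b", copy_b_root)):
            candidate = os.path.join(root, rel)
            if os.path.isfile(candidate) and file_sha256(candidate) == digest:
                result[rel] = label
                break
    return result
```

The idea is to build the manifest when the archive is created, store it alongside every copy, and run `reconcile` whenever the archives are re-read. A good whole set exists as long as no path maps to None.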

    --
    I don't read ACs: If a post isn't worth so much as a nom de plume to its author then I wont bother either.
  17. Re:Non-volatile: no such thing by Alien+Being · · Score: 5, Funny

    "accidentally drop it and have it shatter"

    Moses: I bring to you these fifteen [crash], ten, ten commandments.

  18. You could always ... by Greedo · · Score: 5, Funny

    Steganographize your data and hide it in an amateur pr0n video.

    To restore from backup, search with Kazaa.

    --
    Tuus crepidae innexilis sunt.
  19. Re:Print! by alexburke · · Score: 5, Funny

    Have you ever tried to grep three boxes of greenstripe?

    Not a pretty sight, let me tell you...

  20. Re:the absolute surefire way to back something up. by BeBoxer · · Score: 5, Insightful

    But is printing a whole character per bit, or even byte, efficient? I'm curious how much data a laser printer could store on a piece of paper. Is it realistic to expect individual bits printed at 300dpi to actually be retrievable? Perhaps on a good 600dpi or 1200dpi printer.

    300dpi gives us almost 11KBytes per square inch. Figure 70 square inches on a letter page with 1/2" margins. That's 770KB. Print full duplex and you're looking at 1.5MB per page, or roughly a floppy disk (coincidence?) You wouldn't want to back up your MP3 collection, but for an archival method that is likely to last 100 years it's not too bad. Factor in compression and you are probably getting a 100x increase in storage density over plain text. Kind of a neat thought.
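The estimate above works out as follows, assuming one bit per printed dot and roughly 70 printable square inches per side, as the comment does. The function name is illustrative, and the result is a raw upper bound with no allowance for error correction or scanner alignment.

```python
def page_capacity_bytes(dpi, printable_sq_in=70, sides=1):
    """Raw capacity of a printed page at one bit per dot."""
    bits_per_sq_in = dpi * dpi
    return bits_per_sq_in * printable_sq_in * sides // 8

per_sq_inch = 300 * 300 // 8                 # bytes per square inch at 300dpi
one_side = page_capacity_bytes(300)          # single-sided page
duplex = page_capacity_bytes(300, sides=2)   # printed on both sides

print(per_sq_inch)  # 11250, "almost 11KBytes"
print(one_side)     # 787500, close to the 770KB quoted after rounding
print(duplex)       # 1575000, roughly 1.5MB
```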

  21. The cheapest, and most long lasting backup. by teamhasnoi · · Score: 5, Funny
    The oral tradition! Have many children, give them each 10 pages to memorize. To make things easier, you can name them Sector 237, Cylinder 13004 and such.

    As disk space grows, so does your family/backup.

    To see examples of how this works see: Mad Max - Thunderdome, The Bible, American Indians, The Fellowship of the Ring, Aesop's Fables, and the Legend of How the Great Nog Vomited the Earth and Heavens in Ancient Times, Before the Oceans Drank Atlantis.

    I have heard rumors that this is how Google archives.

  22. Re:the absolute surefire way to back something up. by 2nd+Post! · · Score: 5, Interesting
    But each of your 20k per page can easily encode a unicode value, which means you can cram 2 bytes per spot, or only 50 tons per terabyte.

    But how about a 600dpi laser printer, 8"x10"?

    For good readability, we can use:
    ***
    **
    *
    *
    **
    ***
    For (1,0) which gives us 3 dots per bit, or 200 bits per inch. A square inch would then give us 40,000 bits, or 5,000 bytes. A sheet of 8x10 then gives us 400,000 bytes. Or if you tweak the margins, 400k per page. So that's already 20 times your density. Increase the resolution to 1200dpi, and you can increase the data density to 1600k per page.

    We can also use different encodings: Right now we use 9 bits to encode 1 bit of information (really, really, redundant). We can probably safely use the following encoding to double our data density:
    ***

    ***

    *
    *
    *
    *
    *
    *
    So this further gives us 2 bits of information in the same 3x3 square, which increases our data density another 2fold: 800k or 3200k per page. At 1200dpi, that's 3mb per page, so that 1gb == 333 pages, and 1tb == 333k pages. 67 boxes, or 134 pounds per terabyte.

    There are more variations of course. We can increase density to 4 bits per 3x3 square. With a bit of thought, we can also increase the density up to the theoretical limit of 2^9 values in a 3x3 square, but we want to include some leeway for data redundancy...

    So by doubling to 4 bits per square, we require only 70 pounds per terabyte. By doubling again to 8 bits per square, That's down to 35 pounds.

    That much (little) paper... is actually lighter than a terabyte of digital storage!
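The per-page figures in this comment follow from treating each 3x3 block of dots as one cell. The sketch below assumes an 8"x10" printable area and decimal units; `page_bytes` is a hypothetical helper, and only the byte and page counts are computed here.

```python
def page_bytes(dpi, bits_per_cell, cell_dots=3, width_in=8, height_in=10):
    """Bytes per page when each cell_dots x cell_dots block stores bits_per_cell bits."""
    cells_per_inch = dpi // cell_dots
    cells = (cells_per_inch * width_in) * (cells_per_inch * height_in)
    return cells * bits_per_cell // 8

print(page_bytes(600, 1))    # 400000   (the 400k per page figure)
print(page_bytes(1200, 1))   # 1600000  (1600k per page)
print(page_bytes(1200, 2))   # 3200000  (3200k per page)

# Pages needed for one terabyte (10**12 bytes) at 1200dpi, 2 bits per cell.
pages_per_tb = -(-10**12 // page_bytes(1200, 2))  # ceiling division
print(pages_per_tb)          # 312500
```

The comment rounds 3200k down to 3mb per page, which is why it arrives at about 333k pages rather than 312,500.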
  23. Re:the absolute surefire way to back something up. by schmink182 · · Score: 5, Interesting
    To take this a little farther, a helpful reference tells us some useful information.

    2000 sheets of 8-1/2 x 11, 20# laserwriter paper weighs 20 lbs.
    First of all, this changes your estimate of weight from 100 tons to 250 tons.

    Typical yield of paper: 125 lbs per tree
    250 tons (500000 lbs) divided by 125 lbs per tree gives us 4000 trees.

    440 trees per acre
    This, after division, gives us 9 acres of trees destroyed for backing up 1 TB of data. Seem worth it? :)
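The chain of conversions above is short enough to verify directly. All inputs are the figures quoted in the comment; a ton is taken as 2000 lbs, matching the comment's 500000 lbs.

```python
LBS_PER_TON = 2000          # short ton, as the comment's 500000 lbs implies

paper_tons = 250            # revised weight estimate for 1 TB on paper
lbs_per_tree = 125          # typical paper yield per tree
trees_per_acre = 440

paper_lbs = paper_tons * LBS_PER_TON
trees = paper_lbs / lbs_per_tree
acres = trees / trees_per_acre

print(paper_lbs)        # 500000
print(trees)            # 4000.0
print(round(acres, 1))  # 9.1
```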

  24. Ten year old data by Eric+Green · · Score: 5, Insightful
    I actually have a lot of data that is now 16 years old, including the source code (6502 assembly language) for a BBS program that I wrote as a kid. The secret: Regular migration of data to newer/larger media. From 1541 floppy to Amiga via serial port and xmodem, from Amiga to Linux via serial port and uucp, and on Linux, periodic moving of the data to newer hard drives as I upgrade my systems. I also now maintain a copy of my data in CVS, so that if something gets accidentally erased or changed, I can retrieve a copy. My CVS archive, too, periodically gets moved to newer/larger/faster hard drives.

    And to top it all off, I back it all up to a DDS-4 DAT autochanger. Yes, those six tapes will only hold 120gb, but the amount of important data on my disk drive is far less than 120gb (it is actually less than 20gb, including the original 44.1khz .wav recordings of all my original songs, and fits onto one tape easily).

    Do you *REALLY* need a backup of your .mp3 collection?! Probably not. Do you *REALLY* need a backup of all those ISO CDROM images that you downloaded for fifty versions of Linux and a half dozen versions of FreeBSD? Probably not. But those are the sorts of things that are taking up 80gb plus on my hard drives -- i.e., utterly disposable cruft. Which is true for most personal computers.

    --
    Send mail here if you want to reach me.