Slashdot Mirror


Large IDE Drives as Long-Term Archival Media?

PlatterMan asks: "The question of how to cope with backing up disk drives which are rapidly increasing in size, onto tape and other backup devices which aren't scaling in size as quickly, isn't new to Slashdot. Neither is the use of single, RAIDed, and removable disks as backup devices; this has been covered numerous times on Slashdot, e.g. here and here. One thing I haven't really seen discussed, however, is the feasibility of disk drives as medium to long-term archival media, say 5 to 10 years. Like many people, I'm in the position of now having multiple machines with a combined data pool of about 220 Gig, and backing these up onto DDS or DLT tapes is slow, manual, and expensive in tape costs. So I'm looking to add a removable drive bay to my primary backup machine and pick up a bunch of large IDE drives, so that I can do regular disk to disk backups over 100 Meg Ethernet (and for my machines which are in cages, over the Net), pulling out and alternating the backup drives on a 3-way backup cycle."

"Backups are of no use without offsite archival copies so I plan to take one set of disks out of the pool, and archive them offsite on a quarterly basis.

However, I've heard horror stories about the data retention and usability of older disks which have been shelved for archival, for example disk stiction, where people try to restore data off a 4 to 5 year old drive only to find that the disk won't spin up due to solidification of lubricants, or that they've experienced data degradation.

I'd be interested in the Slashdot crowd's opinion on using large IDE drives as archival media. Clearly one possible problem is being able to get hold of a machine in the future with a suitable IDE interface to plug them into for restoration, but I can't see IDE disappearing within 5 years (maybe 10 though). I'm more interested in experiences and opinions on the suitability of the disks themselves for long-term archival.


  • Is stiction still likely to occur on newer makes of IDE drives, or have manufacturers beaten the problems which caused this in the past?
  • Likewise how likely is bit drop-out and general data degradation over say a 5 year and 10 year period, and what do people think would be the likely maximum feasible time that a shelved drive would be usable for?
  • Any suggestions as to how I should store drives in order to minimize these types of problems and maximise their feasible life as archival media?
Thanks!"

6 of 710 comments

  1. Re:rock and chisel by nsample · · Score: 5, Interesting


    I know this parent was modded up as +Funny, but it's actually +Informative. "Rock and chisel" are the best thing we have, and there's a real trend toward using it more. Take a look at Norsam's HD-Rosetta. It's an etched nickel plate designed to last for thousands of years. Vive la Rock & Chisel!

  2. Re:GraniteDigital is what I use by coyote-san · · Score: 5, Interesting

    At the least, toss the media into freezer-weight ziplock bags. Better yet is double-bagging it: put the media in a smaller bag, and then in a larger bag with the smaller bag's opening on the 'far' side.

    Paper-rated "fire safes" work by using a material that undergoes a phase change at high temperatures, releasing steam in the process. (Think of the latent heat involved in freezing and melting ice; the same theory is used to keep the interior of the safe at a reasonable temperature.)

    The only problem is that paper tolerates steam fairly well. Ditto the smoke that can make its way into the safe. The paper may be damaged, but it is still readable. Computer media will be destroyed. Fortunately freezer-weight plastic is more than adequate to block the steam, leaving only small openings in the seal. Even this is modest, and the second bag is mostly to allow you to avoid smearing soot onto the media as you remove it from the bag.

    --
    For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
  3. Crappy backups better than nothing by jolshefsky · · Score: 5, Interesting
    I don't know how "pro" you want to go with this, but I ran into a similar situation and resigned myself to the same solution. My DDS2 SCSI tape drive is getting to be too small at 4/8GB. I would like to have a tape solution, but it's too expensive for my purpose. I get drives as pulls and last year's models so I only spent US$150, but with tapes at US$10, even 8GB is absurdly small. If I were to go with new equipment and step up to DDS-4, I'd be out about US$1000 for the drive and another US$20 for each 20-40GB tape. Total cost for a basic 3-tape rotating backup: US$1060.

    On the other hand, I could spend (as I have) US$40 on a basic (a.k.a. el-cheapo) FireWire-IDE case, US$30 for 3 removable IDE enclosures, and (eventually) about US$70 each for 3 60GB IDE drives. Total cost: US$280.

    What do I sacrifice? Not much ... one of the drives might fail. At that point I'd just replace it with another US$70 capacity drive (which would probably be larger.) If I needed to restore something from backup, I'm already looking at up-to 24-hour old data, and if that drive happened to die, possibly 48-hour ... it's unlikely that all the drives would fail at once.

    The advantages? I can use the US$780 I save for something else and I don't have to worry about shelling out another US$1000 every four years just to scale to "current" requirements. I don't know what the upper limit of an IDE drive is these days (i.e. what can the ATAPI bus handle) but even 200GB is pretty big for me right now.

    Anyway, just a few thoughts. The basic thing is lower cost for nearly the same risk ... tapes fail too, you know. Remember, too, that this story would be very different if I had to handle 50 machines instead of 2.
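
    As a rough check of the totals quoted above, here is a minimal Python sketch. The constant and function names are made up for illustration, the prices are the ones given in this comment, and the US$30 is assumed to cover all three enclosures (which is what the US$280 total implies).

```python
# Prices in US$ as quoted in the comment; all names here are illustrative.
DDS4_DRIVE = 1000        # new DDS-4 tape drive
DDS4_TAPE = 20           # one 20-40GB tape
FIREWIRE_CASE = 40       # FireWire-IDE case
ENCLOSURES_TOTAL = 30    # assumed: total for 3 removable enclosures
IDE_DRIVE = 70           # one 60GB IDE drive
ROTATION_SETS = 3        # 3-way rotating backup


def tape_cost(sets=ROTATION_SETS):
    """Up-front cost of the tape setup: one drive plus one tape per set."""
    return DDS4_DRIVE + sets * DDS4_TAPE


def disk_cost(sets=ROTATION_SETS):
    """Up-front cost of the disk setup: case, enclosures, one drive per set."""
    return FIREWIRE_CASE + ENCLOSURES_TOTAL + sets * IDE_DRIVE


if __name__ == "__main__":
    print("tape:", tape_cost())
    print("disk:", disk_cost())
    print("saved:", tape_cost() - disk_cost())
```

    This only compares up-front cost; it says nothing about failure rates or media lifetime.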

    --
    --- Jason Olshefsky

    Karma: Poser (mostly affected by adding this line long after everyone else did)

  4. Re:the absolute surefire way to back something up. by 2nd+Post! · · Score: 5, Interesting
    But each of your 20k per page can easily encode a unicode value, which means you can cram 2 bytes per spot, or only 50 tons per terabyte.

    But how about a 600dpi laser printer, 8"x10"?

    For good readability, we can use:
    ***
    **
    *
    *
    **
    ***
    For (1,0) which gives us 3 dots per bit, or 200 bits per inch. A square inch would then give us 40,000 bits, or 5,000 bytes. A sheet of 8x10 then gives us 400,000 bytes. Or if you tweak the margins, 400k per page. So that's already 20 times your density. Increase the resolution to 1200dpi, and you can increase the data density to 1600k per page.

    We can also use different encodings: Right now we use 9 bits to encode 1 bit of information (really, really, redundant). We can probably safely use the following encoding to double our data density:
    ***

    ***

    *
    *
    *
    *
    *
    *
    So this further gives us 2 bits of information in the same 3x3 square, which increases our data density another 2fold: 800k or 3200k per page. At 1200dpi, that's 3mb per page, so that 1gb == 333 pages, and 1tb == 333k pages. 67 boxes, or 134 pounds per terabyte.

    There are more variations of course. We can increase density to 4 bits per 3x3 square. With a bit of thought, we can also increase the density up to the theoretical limit of 2^9 values in a 3x3 square, but we want to include some leeway for data redundancy...
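
    The density figures above can be reproduced with a short Python sketch. The function names are hypothetical, and it assumes an 8"x10" printable area, 3x3-dot cells, no error correction, and 1 TB = 10^12 bytes. The comment rounds 3.2 MB per page down to 3 MB, which is why it arrives at about 333k pages rather than the 312,500 computed here.

```python
def bytes_per_page(dpi, bits_per_cell=1, cell_dots=3, width_in=8, height_in=10):
    """Bytes that fit on one page when each cell_dots x cell_dots block of
    printer dots carries bits_per_cell bits of data."""
    cells_per_inch = dpi // cell_dots
    cells_per_page = cells_per_inch ** 2 * width_in * height_in
    return cells_per_page * bits_per_cell // 8


def pages_per_terabyte(page_bytes, terabyte=10**12):
    """Pages needed for one terabyte, rounded up to a whole page."""
    return -(-terabyte // page_bytes)


if __name__ == "__main__":
    for dpi in (600, 1200):
        for bits in (1, 2, 4, 8):
            size = bytes_per_page(dpi, bits)
            print(dpi, "dpi,", bits, "bits/cell:", size, "bytes/page,",
                  pages_per_terabyte(size), "pages/TB")
```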

    So by doubling to 4 bits per square, we require only 70 pounds per terabyte. By doubling again to 8 bits per square, that's down to 35 pounds.

    That much (little) paper... is actually lighter than a terabyte of digital storage!
  5. Re:the absolute surefire way to back something up. by schmink182 · · Score: 5, Interesting
    To take this a little further, a helpful reference tells us some useful information.

    2000 sheets of 8-1/2 x 11, 20# laserwriter paper weighs 20 lbs.
    First of all, this changes your estimate of weight from 100 tons to 250 tons.

    Typical yield of paper: 125 lbs per tree
    250 tons (500000 lbs) divided by 125 lbs per tree gives us 4000 trees.

    440 trees per acre
    This, after division, gives us 9 acres of trees destroyed for backing up 1 TB of data. Seem worth it? :)
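
    The chain of divisions above, as a small Python sketch. The function name is made up, 1 TB is assumed to be 10^12 bytes, a ton is taken as 2000 lbs, and the 20k-per-page figure is the one from the post being replied to.

```python
def paper_footprint(total_bytes=10**12, bytes_per_page=20_000):
    """Sheets, weight, trees and acres needed to print total_bytes,
    using the reference figures quoted in the comment."""
    sheets = total_bytes // bytes_per_page
    weight_lbs = sheets * 20 // 2000      # 2000 sheets weigh 20 lbs
    tons = weight_lbs // 2000             # assumed: short tons of 2000 lbs
    trees = weight_lbs // 125             # 125 lbs of paper per tree
    acres = trees / 440                   # 440 trees per acre
    return {"sheets": sheets, "weight_lbs": weight_lbs,
            "tons": tons, "trees": trees, "acres": acres}


if __name__ == "__main__":
    for name, value in paper_footprint().items():
        print(name, value)
```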

  6. Re:warranty period by Kaa · · Score: 5, Interesting

    Who the fuck has 220GB of personal data?

    And what's so weird about it?

    A scan of a single frame of a 35mm film, on a high-end consumer film scanner will create a file... let's see:

    The scanner is 4000dpi, so the resulting image is about 4000x6000 pixels. We are working in 16-bit-per-color-channel mode, so that's 6 bytes per single pixel. A bit of multiplication gets you 144MB. As a practical matter, the film frame is slightly smaller so your output TIFF file is about 120MB in size. That is for a single 35mm film frame.

    So raw scans of slightly under 2000 film frames will already hit the 220GB figure.

    Still think it's a ridiculous number?
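
    The multiplication above, sketched in Python. The function names are illustrative, three color channels are assumed, and decimal units are used (1 MB = 10^6 bytes, 1 GB = 10^9 bytes).

```python
def scan_bytes(width_px=4000, height_px=6000, channels=3, bits_per_channel=16):
    """Uncompressed size in bytes of one scanned frame."""
    return width_px * height_px * channels * bits_per_channel // 8


def frames_to_fill(total_bytes, frame_bytes):
    """Whole frames that fit into total_bytes."""
    return total_bytes // frame_bytes


if __name__ == "__main__":
    print("full 4000x6000 frame:", scan_bytes(), "bytes")
    # Using the roughly 120 million byte practical file size from the comment.
    print("frames in 220 GB:", frames_to_fill(220 * 10**9, 120 * 10**6))
```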

    --

    Kaa
    Kaa's Law: In any sufficiently large group of people most are idiots.