Slashdot Mirror


Long-Term Storage of Moderately Large Datasets?

hawkeyeMI writes "I have a small scientific services company, and we end up generating fairly large datasets (2-3 TB) for each customer. We don't have to ship all of that, but we do need to keep some compressed archives. The best I can come up with right now is to buy some large hard drives, use software RAID in Linux to make a RAID5 set out of them, and store them in a safe deposit box. I feel like there must be a better way for a small business, but despite some research into Blu-ray, I've not been able to find a good, cost-effective alternative. A tape library would be impractical at the present time. What do you recommend?"

7 of 411 comments

  1. bzip2 by Colin Smith · · Score: 5, Funny

    And optar:

    http://ronja.twibright.com/optar/

    You know it makes sense.

    --
    Deleted
  2. Tape is your friend by chill · · Score: 5, Informative

    LTO tape, properly stored, will outlast burned optical media and hard drives. Great stuff and designed specifically for what you're talking about.

    http://en.wikipedia.org/wiki/Linear_Tape-Open

    --
    Learning HOW to think is more important than learning WHAT to think.
    1. Re:Tape is your friend by Saint Aardvark · · Score: 5, Informative

      Couldn't agree more. A tape library (as in autochanger) might be out of your budget, but a simple tape drive wouldn't be too much -- say $5000 for an LTO4. Media is $50-$100 or so depending on where you shop. Seriously, you're not going to find a reasonable way of storing that much data anywhere else.

      BTW, if you're not a member of LOPSA, you may want to seriously consider it. Even if you're not a sysadmin, this is definitely a sysadmin-type question, and their mailing lists are second to none. It's an excellent resource.
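A rough sketch of the media arithmetic behind the figures in this comment. The $50-$100 per-tape range comes from the comment; the 800 GB native LTO-4 capacity and the `copies` parameter are assumptions added here, and the function names are made up for illustration. It treats the full 2-3 TB dataset as an upper bound, since the submitter only needs to keep some compressed archives.

```python
import math

# Assumption: LTO-4 native (uncompressed) capacity; not stated in the thread.
LTO4_NATIVE_GB = 800


def tapes_needed(dataset_tb, copies=1, capacity_gb=LTO4_NATIVE_GB):
    """Number of cartridges for one dataset, rounding up per copy."""
    per_copy = math.ceil(dataset_tb * 1000 / capacity_gb)
    return per_copy * copies


def media_cost_range(dataset_tb, copies=1, low_usd=50, high_usd=100):
    """(low, high) media cost in USD using the per-tape range quoted above."""
    count = tapes_needed(dataset_tb, copies)
    return count * low_usd, count * high_usd


if __name__ == "__main__":
    for size_tb in (2, 3):
        print(size_tb, "TB:", tapes_needed(size_tb), "tapes,",
              media_cost_range(size_tb), "USD per copy")
```

The drive itself is a one-time cost on top of this, so the per-customer figure is dominated by media once the drive is paid for.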

  3. Re:Exactly what you're doing by forgottenusername · · Score: 5, Interesting

    I don't think it's a great solution. You're storing relatively fragile hard drives in a RAID5 configuration in a lockbox? It's not like you can tell if one of the drives goes bad and needs to be replaced when it's sitting in a box. You'd have to regularly pull the data sets out, fire them up, and make sure everything is still functional.

    I'd at least want to do 2 complete sets of mirrored drives.

    Tape storage does store better.

    Depending on how important the data is, I might do something like a local mirrored drive set in storage and an online copy at something like rsync.net. Stay away from S3; it's not designed to protect data, despite what AWS fans may say.
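The "pull the data sets out and make sure everything is still functional" step above could look something like the following. This is only a minimal sketch, not anything the commenter described: it assumes the archives sit as ordinary files under one directory, and the function names and the manifest file are hypothetical. It writes a SHA-256 manifest when the drives go into storage, then re-checks it at each periodic pull.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(archive_dir, manifest_path):
    """Record one 'hash  relative/path' line per file under archive_dir."""
    root = Path(archive_dir)
    manifest = Path(manifest_path).resolve()
    lines = []
    for path in sorted(root.rglob("*")):
        # Skip the manifest itself in case it is stored inside archive_dir.
        if path.is_file() and path.resolve() != manifest:
            rel = path.relative_to(root).as_posix()
            lines.append(f"{sha256_of(path)}  {rel}")
    manifest.write_text("\n".join(lines) + "\n")


def verify_manifest(archive_dir, manifest_path):
    """Return relative paths that are missing or whose hash has changed."""
    root = Path(archive_dir)
    bad = []
    for line in Path(manifest_path).read_text().splitlines():
        if not line.strip():
            continue
        expected, rel = line.split("  ", 1)
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad
```

Keeping a copy of the manifest somewhere other than the stored drives means a dead drive can't take the checksums down with it. A non-empty result from `verify_manifest` is the signal to restore that file from the second copy.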

  4. Re:Exactly. by TooMuchToDo · · Score: 5, Informative

    Because Amazon can be *expensive* compared to doing it yourself ($$$ for data in, $$$ for data out, $$$ for monthly storage). But heh, what do I know. I just manage the storage for one of the LHC detectors (5PB spinning disk, 17PB tape). Amazon is good when you've got VC money or have no IT folks.

  5. I'd encrypt the data and... by Rivalz · · Score: 5, Funny

    Label it something like "Complete American Idol Blu-ray Collection" and upload it on P2P to The Pirate Bay. Every couple of years, rename it to some other horrible popular TV series. It will be a self-sustaining form of storage with an infinite number of redundant hosts.

  6. Re:Exactly. by Anonymous Coward · · Score: 5, Insightful

    Ok, yes, we see you know a lot about this.

    So what's your recommendation?