Slashdot Mirror


Bulk Data Storage For The Common Man?

Vigyaan writes "Lately, I have been looking into different bulk data storage options available to a common man. My work depends on generating, storing and analyzing a large amount of data -- averaging about 1 TB per month. I would like to have a storage system which is automated, fast, reliable and most importantly does not cost the price of an eye. Right now, I have a 4 node Linux cluster with 10 large hard disks (total capacity 1.6 TB); data storage roughly costs about $0.60/GB (excluding the cost of PC hardware). But long term storage is painful -- DVDs cost about $0.10-$0.15/GB but take too much human time, and leaving data on hard disks makes me nervous because of possible failures. RAID is a possibility, but it increases the cost significantly. I was wondering if Slashdot readers have any recommendations for a cheap automated way to store and retrieve data."
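
To put the submitter's figures side by side, here is a rough sketch of what 1 TB per month means for each medium. The per-GB prices come from the post; the 4.7 GB disc size is an assumption (a single-layer DVD), and the function names are made up for illustration.

```python
import math

DVD_GB = 4.7  # assumption: single-layer DVD capacity, not stated in the post


def monthly_media_cost(data_gb, dollars_per_gb):
    """Media cost for one month of data at a flat per-GB price."""
    return data_gb * dollars_per_gb


def discs_needed(data_gb, disc_gb=DVD_GB):
    """Whole discs required to hold data_gb (no spanning overhead assumed)."""
    return math.ceil(data_gb / disc_gb)


monthly_gb = 1000  # "about 1 TB per month", taken as 1000 GB
print("hard disk:", monthly_media_cost(monthly_gb, 0.60), "dollars/month")
print("DVD, low :", monthly_media_cost(monthly_gb, 0.10), "dollars/month")
print("DVD, high:", monthly_media_cost(monthly_gb, 0.15), "dollars/month")
print("discs to burn per month:", discs_needed(monthly_gb))
```

Under these assumptions the DVDs are several times cheaper per month, but someone has to burn a couple of hundred discs to get there, which is the "human time" problem in numbers.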

112 of 483 comments

  1. Finally a use for my 1GB Gmail invites... by anakin357 · · Score: 4, Funny

    I'll send you a couple.

    --
    http://www.fsckin.com/
    1. Re:Finally a use for my 1GB Gmail invites... by Anonymous Coward · · Score: 2, Funny

      That's auhthesis@yahoo.com ? Just wanted to make sure I got it right. auhthesis@yahoo.com . I'm guessing you'd want to have auhthesis@gmail.com as your e-mail address? Invitation is on its way!

    2. Re:Finally a use for my 1GB Gmail invites... by EvilTwinSkippy · · Score: 4, Funny

      Nah, just tarball your backup into 1 or 2 GB file sizes, name it "PR0N XXX TEEN SEX DONKEY LOVE - MILITANT ISLAMIC BUKAKKE KITTEN.MPG.AVI.WMV" and share it on Gnutella.

      --
      "Learning is not compulsory... neither is survival."
      --Dr.W.Edwards Deming
    3. Re:Finally a use for my 1GB Gmail invites... by anakin357 · · Score: 3, Informative

      An invitation has been emailed to your friend.

      Yee. Sent an invite to someone else who replied, too.

      Mod this up and I'll send you one too. :P

      --
      http://www.fsckin.com/
    4. Re:Finally a use for my 1GB Gmail invites... by e133tc1pher · · Score: 2, Interesting

      Actually, you may have something here... Let's say you want to back up your precious config files in your /etc/ directory: just take a real porn picture and use a steganography program to hide your config files in it. Hell, if it's a really good picture, you can probably get away with it being a couple of megs. Just share them on your favorite file sharing service; if they support chat, hype them up. I wouldn't use this even as a backup to your real backup, but if all hell breaks loose, you know where to go : )

    5. Re:Finally a use for my 1GB Gmail invites... by superpulpsicle · · Score: 2, Informative

      People are so hyped up about Gmail. Did you know that www.hotmail.com is NOT effectively backed up as of 2002 (cough, inside source, hmm)?

      Considering how long Hotmail and M$ have been around, they have still failed to back up Hotmail, even with their infinite Windows licenses. What makes you think your 1 gig will be backed up by Google?

    6. Re:Finally a use for my 1GB Gmail invites... by __aafkqj3628 · · Score: 2, Informative

      (cough, inside source, hmm)

      Inside source? Just call them up and ask! It's not hidden knowledge.

    7. Re:Finally a use for my 1GB Gmail invites... by EvilTwinSkippy · · Score: 2, Interesting
      Yes, you do have to back up your backup. This is why I abhor backing up to disk. (Or at least a single disk.) The advantage of tapes, for all their warts, is that you have several copies of them going back through time. When a user shows up at my desk looking for a file that MAY have been on the array a week ago, something that gets mirrored (and only mirrored) once a night isn't going to cover it.

      You always need at least 3 generations of backup. The Current backup, the "father", and the "Grandfather." These are complete backups, not incremental. And you need them in case you run into a media error. In our case we keep the last week of tapes, a weekly backup from the last 30 days, a monthly backup from the last year, and a yearly backup starting at the dawn of time.

      If the data isn't worth backing up properly, you might as well not bother backing it up at all.

      --
      "Learning is not compulsory... neither is survival."
      --Dr.W.Edwards Deming
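
The retention schedule described above (the last week of tapes, a weekly from the last 30 days, a monthly from the last year, a yearly forever) can be sketched as a small selection function. This is only one reading of that schedule: `tapes_to_keep` is a hypothetical helper, and choosing the first backup of each week, month, and year as the one to retain is an assumption.

```python
from datetime import date, timedelta


def tapes_to_keep(backup_dates, today):
    """Return the backup dates a grandfather-father-son style scheme retains."""
    keep = set()
    weekly, monthly, yearly = {}, {}, {}
    for d in sorted(backup_dates):
        age = (today - d).days
        if age < 0:
            continue  # ignore anything dated in the future
        if age < 7:
            keep.add(d)  # every tape from the last week
        if age <= 30:
            # first tape of each ISO (year, week) within the last 30 days
            weekly.setdefault(tuple(d.isocalendar())[:2], d)
        if age <= 365:
            monthly.setdefault((d.year, d.month), d)  # first tape of each month
        yearly.setdefault(d.year, d)  # first tape of each year, kept forever
    keep.update(weekly.values(), monthly.values(), yearly.values())
    return sorted(keep)


# Hypothetical history: one full backup per day since the start of 2003.
today = date(2004, 6, 30)
start = date(2003, 1, 1)
history = [start + timedelta(days=n) for n in range((today - start).days + 1)]
kept = tapes_to_keep(history, today)
print(len(history), "backups taken,", len(kept), "tapes retained")
```

The point of the sketch is that the tape count grows slowly even though a full backup is taken every day, while any single bad tape still leaves older generations to fall back on.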
  2. Hard disks by ConsumedByTV · · Score: 5, Informative

    You're always going to get a better rate with Hard drives but you're going to be prone to failure.

    If you buy them in bulk you can save.

    Burning DVDs is going to take you forever and drive you nuts.

    Find a hotswappable set of drives and use that for your offline backups. Use a RAID array for your current backups.

    --


    "Not my manner of thinking but the manner of thinking of others has been the source of my unhappiness." - M
    1. Re:Hard disks by littlerubberfeet · · Score: 4, Informative

      hard disks are good.

      If you want one of those nifty things with robotic arms and whatnot, plan on spending upwards of $3500. The AIT Automated Tape Library goes for that much and holds only 15 tapes. Plan on spending tens of thousands for something like Ampex's DIS 914 for 30 Terabytes.

      Your friend is right: tapes or cheap. The equipment needed to support them is expensive, slow and error prone. It gets cost effective once you have enough money for a new Porsche though...

      --
      Sig (appended to the end of comments you post, 120 chars)
    2. Re:Hard disks by ConsumedByTV · · Score: 2, Interesting

      I personally use a firewire enclosure, it's fast, it's hotpluggable and it's easy to swap the internal disk.

      --


      "Not my manner of thinking but the manner of thinking of others has been the source of my unhappiness." - M
    3. Re:Hard disks by DetrimentalFiend · · Score: 5, Interesting

      We're dealing with storage issues right now at work, and what we're doing is buying a server with 8x250 GB SATA drives. We then run the drives in raid 5, so we have 1.75TB of storage space (unformatted). Including computer costs, it's running us about $2.50 per GB, but it's a very beefy 3u server. For backup, we're currently backing up to tape. That costs us under $0.50 per GB with ultrium tapes. For some of our data, we've been backing up to DVD's, but we've pretty much given up on that. In the long run, it's not worth it.
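
The 1.75 TB figure follows from the usual RAID 5 rule that one drive's worth of space goes to parity. A minimal sketch of that arithmetic; the function name is made up, and applying the quoted $2.50/GB to usable (rather than raw) capacity is an assumption.

```python
def raid5_usable_gb(num_drives, drive_gb):
    """Usable capacity of a RAID 5 array: one drive's worth is lost to parity."""
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (num_drives - 1) * drive_gb


usable = raid5_usable_gb(8, 250)   # 8 x 250 GB SATA drives
print(usable, "GB usable (unformatted)")

# Assumption: the quoted $2.50/GB is per usable GB, server included.
print("implied system cost: $", usable * 2.50)
```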

    4. Re:Hard disks by silas_moeckel · · Score: 2, Informative

      If you're just going to plug in, back up, and swap, try the USB 2.0 to IDE backup boxes. Pretty much it's a power brick and a USB-to-IDE chipset in a plastic case with a 40-pin IDE connector on it. You plug in the drive and you're good to go. No cases or hot-swap caddies to deal with. And 5400 RPM drives don't get hot to the touch sitting on the desk. It's not pretty, but if you're just running backups, keep on buying $100 IDE disks (generally the best cost per GB).

      --
      No sir I dont like it.
    5. Re:Hard disks by eric76 · · Score: 3, Informative

      Tapes can be pretty dependable, but you need a better quality tape system than that typically sold for PC backups. The 20 GB tapes are just not that dependable.

      If I had the money, at a minimum I'd get a tape drive that could handle the 200 GB (uncompressed) tapes. Something like IBM's LTO Gen-2 Tape Library. That should run a bit less than $6,000.

      For that matter, if I won the lottery, my first purchase would probably be a top of the line tape backup system instead of the usual new car.

      Since I can't afford it, I use DVDs and CDs for backups. They are a pain in the neck and are not that dependable, but I keep backups up to a year on DVD+RW so if one fails, hopefully the others will have the data.

      Instead of writing directly to the DVD writer, I write the backups to disk and then copy the backup sets to the DVDs.

      I also keep a complete current backup of nearly everything important on a separate computer.

    6. Re:Hard disks by Naffer · · Score: 2, Funny

      I think you're absolutely right.
      Tapes or cheap.

    7. Re:Hard disks by eric76 · · Score: 2, Informative

      Use RAID to increase your on-line availability.

      RAID does not a backup system make. You still need backups.

      For increased on-line availability, how about a good distributed file system with several servers? And, of course, back everything up anyway.

  3. Waiting for ... by Entropy · · Score: 3, Interesting

    Blu ray based dvd burners.

    Those will be sweet =)

    --
    The sea changes color, but the sea does not change.
  4. good luck by Madcapjack · · Score: 4, Funny

    PRINTSCREEN should do the trick.

    1. Re:good luck by mlk · · Score: 2, Funny

      please dont joke about that, not too long ago I received an email, please help me open this word document. Complete with a BMP, natrally I assumed this was the error message. Nope, it was a "print screen" of its Icon. And yes, while I had said picture up, the user did try double clicking.

      Poor luser.

      --
      Wow, I should not post when knackered.
    2. Re:good luck by schtum · · Score: 3, Funny

      Translation:

      Please don't joke about that. Not too long ago, I received an email asking for help opening a Word document. Attached was a bitmap image which I naturally assumed was an error message. Instead, it was a screen-capture of the document's icon! The user was double-clicking on the image!

      So I shot her.

  5. Wirewire drives? by NanoGator · · Score: 5, Interesting

    For long term storage, how do you feel about firewire drives? Maybe not as cheap as you'd like, but you can get them in >160 gig flavors, plus you can hook them up to just about anything. Once you do the backup, which'd be a simple copy and paste, you can just unplug the drive and store it in a safe or something.

    Again, I'm not sure if that's as cheap as you'd want, but that's a solution I came up with for a similar problem. My company's going to be 3D rendering some stuff that could end up eating 50 megabytes a frame. (Extra data is stored for future refinement... I can go into detail if I've piqued anybody's curiosity.) We can't afford to lose this data, so the Firewire drive approach is what we're considering right now.

    --
    "Derp de derp."
    1. Re:Wirewire drives? by Rik+van+Riel · · Score: 3, Informative
      For long term storage, how do you feel about firewire drives? Maybe not as cheap as you'd like,

      Oh, but they are cheap. Just buy a large IDE disk and a $30 firewire/fast-usb enclosure.

      I'm just not sure about the "long term", though. I have no idea what the shelf life of a hard disk is.

    2. Re:Wirewire drives? by littlerubberfeet · · Score: 5, Informative

      Lemme address the firewire thing: I work in a sound studio, and we generate about 5-8 gigs of data a month, mostly music for TV. This isn't a huge amount, but we rely on multiple sets of Firewire drives for backup and then internal hard drives for current projects. This means we have all 400 or so projects at our fingertips. Given how fast we do things, this is important.

      Lacie makes a 1 terabyte firewire drive (943 gigabytes formatted). We get them for $1,080 a drive (Macmall matched Provantage's price). This is more than the article author spends now per gig, but these drives have done quite well in the studio. You can find cheaper firewire though.

      We are at the point where hard drives give the best bang for the buck. The only fault of firewire is that my bosses have burned out several bridges, so ground yourself before unplugging the drives. The bridges were cheap though. In any case, hard drives are probably the most failsafe and cost effective solution, with firewire being the easiest interface to use those drives with.

      --
      Sig (appended to the end of comments you post, 120 chars)
    3. Re:Wirewire drives? by Anonymous Coward · · Score: 2, Insightful

      USB2 is *not* faster than Firewire 400
      USB2 is *burstable* up to 480 Mb/s transfer
      Firewire can *sustain* 400 Mb/s transfer
      In almost all cases, you'll find Firewire much faster.

    4. Re:Wirewire drives? by SlamMan · · Score: 4, Informative

      USB makes the computer actually do work, while firewire ports handle it themselves. For a normal user, not much of an issue, but over a couple drives, you'd notice.

      --
      Mod point free since 2001
    5. Re:Wirewire drives? by insert+3+letters · · Score: 2, Informative

      Exactly. I've run the exact same drive (WD 120 SE) in a usb2 enclosure and a firewire one. On benchmarks, the firewire was generally about 30% faster. More reliable connection too.

    6. Re:Wirewire drives? by NanoGator · · Score: 2, Funny

      "There are no external firewire drives. There are (to my knowledge) no Firewire drives at all.
      People take standard ATA/IDE drives and use an ATA/Firewire bridge to connect them up externally and bypass the extremely limited cable length of ATA."


      Well that totally blows my point out of the water!

      --
      "Derp de derp."
  6. Personally I prefer something in a blonde by kfg · · Score: 4, Insightful

    I was wondering, if Slashdot readers have any recommendations for a cheap automated way to store and retrieve data."

    Although the good ones don't come cheap. I guess this is another case of "pick any two."

    KFG

    1. Re:Personally I prefer something in a blonde by bheerssen · · Score: 2, Informative

      mods, this is not off-topic.

      KFG meant to say "You can have fast, good, or cheap. Pick two."

      It's an old software design maxim that applies surprisingly well to this subject.

      --
      (Score: -1, Stupid)
    2. Re:Personally I prefer something in a blonde by mcrbids · · Score: 2, Interesting

      KFG meant to say "You can have fast, good, or cheap. Pick two."

      It's an old software design maxim that applies surprisingly well to this subject.


      ...and to many things, particularly if you replace "fast" with "convenient". Just for kicks, think about it.

      Food? Check. Clothing? Check. Beer? Check. Housing construction? Check.

      Pretty much anything that involves the exchange of money for goods and services follows this maxim.

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
  7. 1TB a month?!? by stinkydog · · Score: 5, Funny

    Short of launching his own space probe, the only way for this guy to consume a TB a month of storage is a serious porn habit. Just post your 'content' on Edonkey and it will be available when you 'need' it. You likely only watch them once anyway.

    SD

    --
    "Who knew something as harmless as willful ignorance could end up having real consequences?"
    1. Re:1TB a month?!? by grasshoppa · · Score: 5, Funny

      Using this method, I have achieved my life long dream of tapeless ( well, everything-less ) backups.

      I simply make a tar.bz2 file with all my important files, filter it through gpg, then post it on edonkey, usually titled, "Olsen twins getting it on", and then usually the date.

      Voila, instant backup that is available to me wherever I may go.

      --
      Mod me down with all of your hatred and your journey towards the dark side will be complete!
    2. Re:1TB a month?!? by gl4ss · · Score: 5, Funny

      ** You have obviously never heard of fMRI studies, have you?**

      oh shit! I totally missed the part of history where fMRI scanners became commonplace for men.

      oh wait, the whole ask slashdot blurb is twisted: the headline implies asking for data storage possibilities for the common man, yet one of the first things mentioned is that he needs it for his special job that generates TBs of data per month. by that definition he is not a common man, except that he hopes to have a miracle solution - that is quite common.

      still, a common man would choose whatever possibility gave the cheapest price per GB (probably hard drives). with dvd-r's he would end up burning multiple dvd-r's per day, and it's kind of implied that the data would need to be retrievable, so he would have to burn the same disc multiple times, and even then it wouldn't be a sure thing.

      his needs are still quite big though, big enough to warrant professional help, since he's likely going to be spending quite a bit of money on the thing.

      --
      world was created 5 seconds before this post as it is.
    3. Re:1TB a month?!? by Zone-MR · · Score: 5, Funny

      Oh, so that explains why that "Olsen Twins Getting it on - 12 Mar 2003.avi" file I downloaded last week contained a zipped tar archive full of boring spreadsheets and a lot of donkey porn.

    4. Re:1TB a month?!? by EvilTwinSkippy · · Score: 3, Funny
      Well if you are the one behind all the "Islamic Militant Bukakke Kitten" porn, you are one sick bastard.

      And you really need to fire your accountant. Your Cayman Islands bank account was overdrawn twice in a month.

      --
      "Learning is not compulsory... neither is survival."
      --Dr.W.Edwards Deming
  8. Cheap solution by codeguru73 · · Score: 4, Insightful

    Buy some inexpensive IDE drives with high storage capacity and use a software raid solution. What kind of budget do you have anyway?

    1. Re:Cheap solution by Wudbaer · · Score: 3, Insightful

      Repeat after me:

      RAID is not backup !
      RAID is not backup !
      RAID is not backup !
      [..]

    2. Re:Cheap solution by EvilTwinSkippy · · Score: 2, Interesting
      RAID is not backup!

      Indeed, I don't believe in any backup that doesn't have multiple copies that can be stored offsite. Fire really doesn't care what was on your hard drive, nor do thieves, or axe-wielding maniacs.

      And anyone who has been in IT long enough can tell you one of the above stories first hand.

      --
      "Learning is not compulsory... neither is survival."
      --Dr.W.Edwards Deming
  9. Bulk storage? by mikers · · Score: 3, Funny

    I got a couple of drawers of old floppy disks. $10 takes 'em. Plenty of bulk.

    The Sony "lifetime" warranty may still be good on them too!

  10. age old problem... by Lumpy · · Score: 5, Insightful

    Ahh the large amount of data that has X value versus a storage solution...

    If your data is worth $20,000.00 then a $2000.00 solution is dirt cheap.

    what is your data worth? that is where you need to start, and then look at 10-30% of the data's value when deciding how much to spend on its storage.

    If 1 month's data was lost forever, how much money would it cost the company? that is your actual $ amount that you should be shopping at.

    and that is how I got the company to buy a $20,000.00 1000 tape DLT jukebox.

    my data is worth over $100,000 a month and is much smaller than yours in size.

    That is where you need to start. Justify your storage costs by figuring out what the data is worth to begin with.

    --
    Do not look at laser with remaining good eye.
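
The rule of thumb in the comment above (spend roughly 10-30% of what the data is worth on storing it) is easy to turn into a budget range. A minimal sketch; the function name is hypothetical and the percentages are simply the ones quoted.

```python
def storage_budget_range(data_value_dollars, low_pct=10, high_pct=30):
    """Return (low, high) spending bounds as a percentage of the data's value."""
    low = data_value_dollars * low_pct / 100
    high = data_value_dollars * high_pct / 100
    return low, high


# The two values mentioned in the comment.
for value in (20_000, 100_000):
    low, high = storage_budget_range(value)
    print(f"data worth ${value:,}: budget ${low:,.0f} to ${high:,.0f}")
```

By this yardstick the $2,000 solution for $20,000 of data sits at the bottom of the range, and a $20,000 jukebox fits inside the range for data worth $100,000 a month.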
    1. Re:age old problem... by D-Cypell · · Score: 4, Funny

      my data is worth over $100,000 a month

      This 'data' doesn't happen to be a large collection of email addresses, does it?

    2. Re:age old problem... by Servo · · Score: 3, Informative

      First of all, don't bother with DLT. It is slow, and increasingly unreliable as DLT is phased out of production and replacement parts are actually refurbs.

      --
      A slip of the foot you may soon recover, but a slip of the tongue you may never get over. -Benjamin Franklin
    3. Re:age old problem... by TinyManCan · · Score: 2, Informative

      LTO 2 drives are the current trend in large enterprise storage. LTO is the new hotness, DLT is old and busted.

  11. Well... by stonecypher · · Score: 3, Interesting

    Depending on your budget, the appropriate thing to do may be to get an automated DVD burning system to do scheduled incremental backups in duplicate. We used to do that with CDs at an ISP I used to work at. It's unfortunately difficult to search for while not getting people pirating movies, but this is the first thing I found on Google; doubtless there's better out there.

    --
    StoneCypher is Full of BS
  12. Tape? by darkjedi521 · · Score: 2, Interesting

    I haven't seen anyone mention magnetic tape yet. I'm sure it has its drawbacks too, but considering it's still widely used for backup purposes in commercial environments, it can't be too bad. Especially depending on how much a cartridge can hold. It's not the cheapest, but it might be something to look into.

  13. Buy an older tape drive by Apparition-X · · Score: 5, Informative

    Look for an LTO gen 1 or SDLT220/320 on ebay, with a SCSI connection (some of them are fibre, and I assume you don't want to go there!). Don't forget to pick up some tapes. In general, this sounds like it would work if you plan on doing this for a while, and can leverage the initial investment over months or years.

    Capacities are (for the cost of a sub $50 tape):
    - LTO1: 100 GB uncompressed
    - LTO2: 200 GB uncompressed
    - SDLT220: 110 GB uncompressed
    - SDLT320: 160 GB uncompressed

    If your data is particularly amenable to compression (e.g. database data), you could easily get 3 or 4 to 1 compression with these drives without sinking your CPU utilization.
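
Those capacities, together with the "sub $50" tape price, give an upper bound on media cost per GB and a tape count for the submitter's 1 TB per month. A rough sketch: the $50 figure is used as a ceiling, 1 TB is taken as 1000 GB, and the helper names are made up.

```python
import math

# Native (uncompressed) capacities in GB, as listed in the comment.
NATIVE_GB = {"LTO1": 100, "LTO2": 200, "SDLT220": 110, "SDLT320": 160}
TAPE_PRICE_CEILING = 50  # dollars; the comment says "sub $50", so this is an upper bound


def max_dollars_per_gb(native_gb, compression=1.0):
    """Worst-case media cost per GB for a tape priced at the ceiling."""
    return TAPE_PRICE_CEILING / (native_gb * compression)


def tapes_for(data_gb, native_gb, compression=1.0):
    """Whole tapes needed for data_gb at the given compression ratio."""
    return math.ceil(data_gb / (native_gb * compression))


for name, gb in NATIVE_GB.items():
    print(f"{name}: at most ${max_dollars_per_gb(gb):.2f}/GB, "
          f"{tapes_for(1000, gb)} tapes per TB uncompressed")
```

With 3:1 compression the same functions take a `compression=3.0` argument, which cuts both the cost per GB and the tape count to about a third.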

  14. Oh, I see... by spinkham · · Score: 3, Funny

    You want it fast, cheap, reliable, easy, and now, eh? Good luck with that.... Sounds like a request from the PHB...

    --
    Blessed are the pessimists, for they have made backups.
  15. 1 TB/Month by nikonius · · Score: 2, Insightful

    It does not sound like your needs are anywhere near that of the 'common man'. You sound more like a power user to me. Sometimes you have to pay for heavy-duty storage as the cost of doing business.

  16. Cheap and Big by guamman · · Score: 2, Insightful

    Tape Drives - Probably the cheapest way to store large amounts of information. The only drawback is that they aren't fast. However, If your harddrives are large enough to hold the data you are currently working on and tapes are used exclusively for backup then a speed problem shouldn't be . . . a problem.

  17. Re:!RAID by ecalkin · · Score: 4, Insightful

    because it protects against device failure, not *user* error. if you delete a file from a raid array, it's gone. that's part of what offline is all about.

    eric

  18. eh by maxbang · · Score: 2, Interesting

    If you don't have the patience for DVD backups (neither do I), then you're pretty much stuck with RAID. So buck up, spend the extra cash, and set up a storage box or two on the network with one or two terabytes in each. I have a branch of my network set up on gigabit: one box has 250 GB of storage on RAID 1 across two 250 GB drives (this one's for video projects), the other has 160 GB in RAID 0 (my learning system). Works fine and is easy as hell to set up. If I need to add storage I can either add some drives or just add another box. I've thought about using GFS, but I don't know enough about it to implement it yet. Anyone here currently using GFS?

    --
    I also reply below your current threshold.
  19. compression by Suppafly · · Score: 2, Informative

    First off, if you aren't already compressing that data, start. You may be able to cut the size down dramatically using compression.

    Then backup using tapes just like every other place that has to do backups. Generally do full backups once a week and incremental ones nightly or whatever is necessary based on the data you are working with.

  20. spongedrive is best by cubyrop · · Score: 5, Funny

    i am responsible for providing storage solutions for a mid-sized content creation company which, through version archiving, accumulates near 1-200 GB per day. they require access to their media backups on a rolling basis, so tapes are not an option.

    i have found that a Teutonium cluster of 6.5 TB Spongedrives (either Cray or SecreTech are fine) fits the bill nicely. housed in a 15-unit rack server, the amoeba-shaped drives utilize BioLas technology to store data on 6-dimensional Moebius Cilia for a slick seek time of 0.00 ms.

    a cluster costs about $45,000 USD but the price should come down in 2004 Q4 when SecreTech launches their new 40-platter blackholium SCSI's.

    --
    If I could make this sig kill you, I would.
  21. Drawbacks, what are you willing to put up with? by Anonymous Coward · · Score: 5, Informative

    All forms of media/backups have their own drawbacks... but some aren't as bad as others, and the others often are more accessible.

    Tape: Tapes break, they wear, they have dropouts, take a while to back everything up, can't always access files if you just want to restore something (Different methods vary, folks)... but ultimately, it's cheap when you use DAT because they're a common media. Swap the tapes twice as often (and throw old ones out) if you're paranoid about tape related failures.

    Hard Drive: Most common form of backup I see now, mainly for the 1:1 size factor. Yeah, drives fail, too. Sometimes you have a pretty good warning when this is going to happen, sometimes you don't. (My 13GB Maxtor and 40GB IBM Deathstar drives both went *pfft* on reboot.) Get enough of them at once, you could swap out the logic boards if one does fry out. Ultimately, RAID or just simple 1:1 mirroring is probably the most efficient and easy method. Accessing bits and pieces is also easiest under this method. I personally just use an external USB2 case with a 120GB drive in it. Everything I want to back up goes on that drive, and then eventually... DVDRs. I turn off the drive when I don't need it... hopefully prolonging the life of it when I need it most.

    DVDR: Not anymore. If we had these new-fangled DVDR discs (+ or -) say... when 2 to 6GB drives were common.... sure... But in addition to hard drives, recovering selective files is easy under this method too... Unless you use a backup program that crunches everything together on the disc in some spanning format. Burn times can be tedious... but it's not bad if you consider the overall amount of data you're putting on the disc. Cheaper than quality-brand name CDRs, though, in terms of price per mega/gigabyte. Only an idiot would trust $0.01-per-disc spindles for long-term backups. Even the longevity of DVDR has yet to be seen...

    CDR: I'm not going to bother.

    Network: Well, still relies on hard drives and other components... but good if you don't want to saddle one room with a ton of boxes. Simply for space and efficiency... external drive is probably better anyway.

    Old fashioned method: Print everything out and keep it in a filing cabinet somewhere. You could always OCR the stuff later. ;-)

    1. Re:Drawbacks, what are you willing to put up with? by wik · · Score: 2, Insightful

      Pray tell, what are you going to do with just a network? Store bits on the wire and network card rx/tx buffers? There's gotta be something big on the other end of the cable, dude.

      --
      / \
      \ / ASCII ribbon campaign for peace
      x
      / \
  22. LTO Ultrium 2 Tape Drive by jeffgeno · · Score: 4, Informative

    The drive will run about $4000, but the tapes are only around $0.20/GB assuming a 1.5:1 compression ratio. And keeping that assumption, 1 TB of data should only take three to four 200 GB native tapes per month, so swapping wouldn't be so bad with the single tape drive. An autoloading library would be significantly more expensive, but if you really need automation, that's the way to go.
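
A quick check of that estimate, as a sketch. The tape count uses the figures in the comment (200 GB native, 1.5:1 compression). The break-even comparison against hard disks at the submitter's $0.60/GB is my own framing, not something the commenter claims, and the function names are hypothetical.

```python
import math


def tapes_per_month(data_gb, native_gb=200, compression=1.5):
    """Whole cartridges needed each month at the assumed compression ratio."""
    return math.ceil(data_gb / (native_gb * compression))


def breakeven_months(drive_dollars, data_gb_per_month, disk_cents_per_gb, tape_cents_per_gb):
    """Months until the drive's price is recovered through cheaper media."""
    saving = data_gb_per_month * (disk_cents_per_gb - tape_cents_per_gb) / 100
    if saving <= 0:
        raise ValueError("tape media must be cheaper than disk for a break-even to exist")
    return math.ceil(drive_dollars / saving)


print("cartridges per month:", tapes_per_month(1000))
# Prices in cents per GB to keep the arithmetic exact.
print("months to recover the $4000 drive:", breakeven_months(4000, 1000, 60, 20))
```

At 300 GB of effective capacity per cartridge, 1 TB comes to a little over three tapes' worth, so four cartridges in practice, with the last one mostly empty.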

  23. The "Common Man" doesn't need bulk data storage... by syberanarchy · · Score: 2, Insightful
    The best idea I could give you is to just create a sister system, where you mirror all your data. Not cheap, but cheaper than getting a pro-grade solution.

    The reason you won't find such things on the cheap is because the average person with a PC doesn't even know what a GB is. He simply goes into the store, the sly salesman says "oh, what do you need it for," and then says "well 60-80 gb should be all you ever need."

    Now, contrast that to me - my friends shit when they hear I have a 250 gb drive and a 120 gb drive, as well as an extra 60 gb on a networked machine. They can't fathom ever needing that much space. I know that's probably a pittance by Slashdot standards, but it's true :(

  24. Why not tape drives? by Silent8ob · · Score: 2, Insightful

    Look at what the rest of the corporate world uses for large scale storage management. It is still ruled by Tape drives.

    I don't know how much an eye goes for at the moment, but if you can spring for a Super DLT drive you'll get up to 320GB (compressed) on each tape.

    It all comes down to the Quality:Cost:Time triangle.

  25. Easy by Pedrito · · Score: 4, Funny

    I use bioneural gel packs at a cost of $0.04 per teraquad. What is this hard drive of which you speak?

  26. Hijack Cassini by Anonymous Coward · · Score: 5, Funny

    ... and program it as a repeater.

    It's about 90 minutes away, so at 250 Kbps that's over one gigabit in storage on the way out there, and another gigabit on the way back.

    Worst-case access latency is about three hours, though. Maybe the hard disks are a better idea.

    If you send your probe^H^H^H^H^H repeater to Alpha Centauri, you'll get more than 20,000 times the storage capacity.
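
The delay-line arithmetic behind the joke is just link rate times one-way travel time. A small sketch using the comment's figures (90 minutes, 250 Kbps); the 4.37 light-year distance to Alpha Centauri is an assumption added here, and the function name is made up.

```python
def bits_in_flight(one_way_minutes, bits_per_second):
    """Bits 'stored' in transit on one leg of the trip."""
    return one_way_minutes * 60 * bits_per_second


cassini_minutes = 90
link_bps = 250_000  # 250 Kbps

one_way = bits_in_flight(cassini_minutes, link_bps)
print(f"one way: {one_way / 1e9:.2f} gigabits")
print(f"worst-case latency: {2 * cassini_minutes / 60:.0f} hours")

# Assumption: Alpha Centauri is about 4.37 light-years away.
alpha_centauri_minutes = 4.37 * 365.25 * 24 * 60
ratio = alpha_centauri_minutes / cassini_minutes
print(f"Alpha Centauri holds about {ratio:,.0f} times as much")
```

With those figures the capacity comes to roughly 1.35 gigabits each way, and the Alpha Centauri ratio does land above 20,000.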

    1. Re:Hijack Cassini by mangastudent · · Score: 2, Interesting
      ... and program it as a repeater.

      It's about 90 minutes away, so at 250 Kbps that's over one gigabit in storage on the way out there, and another gigabit on the way back.

      You laugh, but before Project Whirlwind (which created the physical modern computer as we know it) settled for 16-64 bit CRTs (DRAMs, I suppose? :-) and then invented 3D core memory, they seriously considered leasing a microwave line to do exactly that.

      At the very beginning, people were really hard up for memory solutions; e.g. the first Univac model used mercury delay lines, a variant on this concept.

      (The out of print Project Whirlwind: The History of a Pioneer Computer by Kent C. Redmond is recommended if you're interested in this area of history.)

  27. Use those HDDs! by jsm008us · · Score: 2, Interesting

    You can get cheap computers from the trash, donations, bulk, etc. You can use that cluster to mirror your data once or twice. I don't know what data you have, but if you have the same data on more than one hard drive, you can rest assured it will be fine. Or you can just print it all!

    The stock market is backed up to three (or more?) separate locations. Look into NVRAM (e.g. flash media) or a cluster with all those hard drives linked together, with a constant backup. With the built-in IDE controller on most motherboards, you can hook up to 4 hard disk drives. If you add SATA, RAID, SCSI, and IDE, you can have lots of hard drives on one machine!

    You could also rotate hard drives, so they aren't constantly used (making the whole system last a LOT longer!) or replace the drives that are about to fail (which would be at least in 3 years!). Most hard drives could probably handle 5 to 10 years no prob (maybe even 20 if they are rotated!).

    It all depends on what you have and what you want to do!

    --

    mysql>SELECT * FROM users WHERE clue > 0
    0 Rows Returned
    1. Re:Use those HDDs! by Cecil · · Score: 4, Funny

      I'm just curious, do you have any idea how much data 1 Terabyte is? Are you suggesting that he PRINT it?

      Let's say for the sake of argument that all 256 byte values can be printed as visibly distinguishable characters, or that he's got 1TB of plaintext. Also assume you can fit 10,000 characters on an 8 1/2 by 11 page.

      You can fit 10^4 bytes per page, and you need to print 10^12 bytes (I know, it's actually 2^40, but that needlessly complicates the math, so shush)

      That means you will need 10^12 bytes / 10^4 bytes/page = 10^8 pages.

      One hundred million pages. Assuming he has a good laser printer with infinite toner, let's say he can print 60 ppm or one page per second. It would take one hundred million seconds to print the data, which is 1157 days, or a little over 3 years.

      Given that he generates 1TB per month, I think this backup plan would probably become the top agenda item of most of the anti-deforestation groups out there.
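The arithmetic above can be checked with a short Python sketch. Every input is one of the round numbers assumed in the comment (10,000 characters per page, one page per second), not a measurement.

```python
# Back-of-the-envelope check of the printing estimate.
# All inputs are the round numbers assumed in the comment above.

BYTES_TO_PRINT = 10**12   # 1 TB, decimal approximation
BYTES_PER_PAGE = 10**4    # assumed 10,000 characters per page
PAGES_PER_SECOND = 1      # assumed 60 ppm laser printer

pages = BYTES_TO_PRINT // BYTES_PER_PAGE
seconds = pages / PAGES_PER_SECOND
days = seconds / 86_400   # seconds in a day
years = days / 365.25

print(f"{pages:,} pages, {days:,.0f} days, {years:.2f} years")
```

Changing `BYTES_PER_PAGE` is the quickest way to see how much a denser encoding would help.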

    2. Re:Use those HDDs! by deranged+unix+nut · · Score: 3, Interesting

      Depending on how you do it, you can get a lot more than the density that you assume. Check out www.paperdisk.com.

      That said, this method would still be more than twice as expensive as storing data on hard drives, would still require a million pages, but would take a little under 2 weeks to print.

      It still doesn't seem like a feasible option.

      The up-side is that, if stored properly, the data would likely be safe potentially for many hundreds of years.

    3. Re:Use those HDDs! by kava_kicks · · Score: 2, Informative

      That Paper Disk idea is pretty interesting. I read ages ago that a large Australian government agency (whose data storage goal was LONGEVITY rather than short-term backup) - might have been the Bureau of Statistics - chose to store/archive all of their data onto microfilm ...

      "Microfilm!?" I hear you say, "But this is 2004!". I was surprised too until I heard their reasoning: the only thing you need to read microfilm is a magnifying glass and a light source.

      And that, ladies and gentlemen, makes it virtually future-proof. Try finding a drive for your old 5 1/4 C64 floppy disks (or better yet, those bloody tapes that took 30 minutes only to fail to load).

  28. Do what Google does by glinden · · Score: 5, Insightful

    Build yourself a cluster of cheap boxes with cheap IDE disks and replicate your data across them. Because the data is replicated across your cluster, no need for backups or RAID.

    1. Re:Do what Google does by swb · · Score: 3, Insightful

      It's a great idea, but one of the problems is what happens when your data goes bad before you realize it and it gets replicated. Then you want what you had yesterday, and that means tape.

      You can solve this by ensuring some kind of in-process backup (like a SQL maintenance schedule, where it replicates itself), but then you're loading your replication process with a bunch of data that doesn't really need to be online, it needs to be in a vault someplace.

      Besides, Sarbanes-Oxley and the IRS want you to keep backups 5+ years anyway, so this replication-only model is only good for data whose internal integrity isn't meaningful to anyone but the owner.
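The difference between replication and backup can be shown with a toy Python sketch. The `Mirror` and `DailySnapshots` classes are hypothetical stand-ins invented for illustration; they model the idea only and are not any real replication or backup tool.

```python
import copy


class Mirror:
    """Keeps one replica that always matches the live data."""

    def __init__(self):
        self.replica = {}

    def sync(self, live):
        # Replication copies whatever is live, good or bad.
        self.replica = copy.deepcopy(live)


class DailySnapshots:
    """Keeps one independent copy per day."""

    def __init__(self):
        self.snapshots = {}

    def take(self, day, live):
        self.snapshots[day] = copy.deepcopy(live)

    def restore(self, day):
        return copy.deepcopy(self.snapshots[day])


live = {"results.dat": "good data"}
mirror, snaps = Mirror(), DailySnapshots()

mirror.sync(live)
snaps.take("monday", live)

live["results.dat"] = "corrupted"   # damage goes unnoticed...
mirror.sync(live)                   # ...and is faithfully replicated
snaps.take("tuesday", live)

print(mirror.replica["results.dat"])           # the mirror holds the bad copy
print(snaps.restore("monday")["results.dat"])  # yesterday's copy survives
```

The mirror protects against a dead disk; only the dated copies protect against bad data.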

  29. Vital information left out by dfghjk · · Score: 3, Insightful

    How many months at 1TB/month do you require access to online? After you are done with data can you discard it or do you need it archived? What is the cost of losing your data set at any given time? In what manner do you expect to access it (read/write mixture and sizes plus aggregate throughput and number of client connections)? The answers to these questions could cause the cost of a solution to vary by a couple of orders of magnitude.

  30. options options, what is your time and data worth? by segfaultcoredump · · Score: 5, Insightful
    Let's see... hard drives are running about $0.50 per GB, DVDs are running about $0.06 per GB (100 pack, "house brand", not something I'd put my data on but this is Slashdot, and there are idiots out there who think that it is a good idea), and tapes are running about $0.20 to $0.50 per GB (for the DLT/AIT/LTO type, the ones that have enough capacity to not drive you nuts).

    So, you can put your data on 4-5 HD's, 10 tapes or 232 DVD's per month. The Cost of doing so will be about $500 per month for the tapes or HD's and $50 for the DVD's (assuming your time cost $0)
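A rough Python sketch of the monthly media bill, using the per-GB prices quoted above. The unit capacities (250 GB drives, 100 GB tapes, 4.7 GB DVDs) are assumptions for illustration and are not stated in the comment.

```python
import math

MONTHLY_GB = 1000  # 1 TB per month, decimal approximation


def monthly_cost(price_per_gb, monthly_gb=MONTHLY_GB):
    """Media cost in dollars for one month of data."""
    return price_per_gb * monthly_gb


def units_needed(unit_capacity_gb, monthly_gb=MONTHLY_GB):
    """Whole units of media needed to hold one month of data."""
    return math.ceil(monthly_gb / unit_capacity_gb)


# Prices per GB as quoted in the comment.
prices = {"hard drive": 0.50, "DVD": 0.06, "tape (high end)": 0.50}
for media, price in prices.items():
    print(f"{media}: ${monthly_cost(price):.0f}/month")

# Unit capacities below are assumed, not taken from the comment.
print(units_needed(250), "x 250 GB hard drives")
print(units_needed(100), "x 100 GB tapes")
print(units_needed(4.7), "x 4.7 GB DVDs")
```

With a nominal 4.7 GB per disc this gives 213 DVDs, a little under the 232 quoted, which presumably allows for less usable space per disc.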

    At work, we had a need to keep a few TB of data online permanently, so we purchased a few NexSAN ATABeasts. At $50,000 for 10TB of usable storage ($5/GB), they may be a bit out of your price range. The advantage is that you can hold almost a year's worth of data and it is protected by RAID5. It also makes management a lot easier, since it is very difficult to mount 42 300G drives in a single chassis (and it takes only 4U of rack space).

    On the low end, NexSAN has the ATABoy2 or ATABaby (2TB or 1TB) in the $8-$15K range. This will let you hold a month's worth of data.

    On the high end, you have EMC disk arrays (think upwards of $20+/GB for the 'cheap' stuff from them).

    Overall, if you have 1TB per month, you need to either a) get a grant to fund your work, b) hire somebody to swap DVDs for you or c) seriously rethink your data generation.

    Any of the "cheap" storage methods have serious drawbacks, and the low cost ones are, well, not so low cost if $15,000 sounds like a lot of money to you.

    otherwise, good luck

  31. If it's volume you want by TheUncleBob · · Score: 5, Informative

    If you are more interested in volume than speed, then the emphasis should be on the 'ID' part of RAID: Inexpensive Disks. Use 160GB drives, which appear to have the best bang for your buck at the moment, and put 6 (yes 6!) in a PC. Just use any old cheap PC (I use a 200-400MHz PII).

    Run the disks as RAID 5 and you will get about 800GB of storage for $600. Now get two cheap ATA100 cards so you have a total of 6 channels, and mount each drive as a master on each channel. Build a 2GB root partition on the first disk (mirror it if you want) and then set the rest of the space up as a huge RAID 5 array.

    Et Voila cheap, big server. To archive data, turn off pc, and throw into attic :-)
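The 800GB figure follows from RAID 5 giving up one drive's worth of space to parity. A minimal Python sketch, assuming equal-sized drives; the function name is made up for illustration, and the $600 price is the one quoted above.

```python
def raid5_usable_gb(num_drives, drive_gb):
    """Usable capacity of a RAID 5 set built from equal-sized drives.

    One drive's worth of space holds parity, so N drives give
    (N - 1) drives of usable capacity.
    """
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (num_drives - 1) * drive_gb


usable = raid5_usable_gb(6, 160)
print(usable, "GB usable")
print(f"${600 / usable:.2f} per usable GB")
```

Carving a small root partition off the first disk shrinks the array slightly, which is why the comment says "about" 800GB.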

  32. Re:Give Up Now by Zone-MR · · Score: 5, Informative

    No figures, but I think the opposite. I've had several DVD-R disks which I've written backups to only to discover that they are unreadable a year later. My personal experience has been that HD's are unreliable, but less unreliable than writable DVDs.

    Of course higher quality media might be better, but then you can no longer quote the $0.10/GB figure.

  33. CD Changer by andrebsd · · Score: 2, Interesting

    Well, I have a CD changer for computers made by NSM... It's SCSI (it originally comes with a 2x reader), so all you gotta do is find a SCSI DVD burner (or a long enough IDE cable and convert it, since the motors are all powered by a COM port anyway) and replace the drive (or, like in my case, fit a CD-RW; I've had the drive for a while, and at the time a DVD burner would have cost too much). Then you have 100 DVDs you can burn data to automatically, and when those are full just swap them out for new ones.

    Now the problem is, you can only get 430 gigs out of one changer using single layer DVDs... Double layer would bring you to 970 gigs per changer.

    Assuming you can get the unit for 100 bucks or so, and the DVD drive costs 100 (69 bucks at Fry's), then you have a 200 dollar backup unit that can store 430 gigs of information on DVDs.

  34. 4*400GB SATA on RAID 5 by da5idnetlimit.com · · Score: 5, Informative

    Depending on the value of your data, you should try having a nice 4*400GB SATA in RAID 5 *2, possibly using a distributed file system for redundancy...

    Not the cheapest, but fast, simple and saves you the unholy pleasure of having 2-3 DLT boxes to archive/cycle each month...

    You already have a Linux cluster, so implementing a distributed file system should be within reach, or even simply a nightly incremental mirror to the target server if you can afford losing one day of work/computation...

    It would help if you told us what sort of data you work with... from databases and to automated telescope tracking system, both need large amount of storage, but you won't need the same system array for each...

    I seem to remember a /. story on a rackable Petabyte storage system

    You don't need to go to the Petabyte capacity but you will find some interesting comments on filesystems, disk virtualisation, 1U rack providers and so on... so a 1 Terabyte rack server is definitely possible...

    Good luck...

    --
    It takes 40+ muscles to frown, but only four to extend your arm and bitchslap the motherfucker
  35. DLT is the way to go by pastpolls · · Score: 3, Informative

    I actually use a DLT with autoloader I got off ebay for under $200. I then bought a lot of used DLT tapes (100) and use them to backup my Video and DVD projects. It is great because when I fill my offline storage (about 1TB) I just fire up the backup software and get the old DLT going overnight. It is done by morning and the shelf life for those tapes is about 20 years.

  36. Ultrium by 7vEn_T_7vEn · · Score: 3, Informative

    I'm not sure what your budget is, but if you're like me you want something that complies with standards so it will be around, and is cheap and effective. For this I would have to recommend an Ultrium tape backup drive. The drive is standards based (google it) and the tapes are dirt cheap: a 200/400 GB tape goes for $55. If you figure (with hardware compression) 250 GB of storage per tape, then it will cost just $.22/gigabyte. The problem is that the drive itself is listing for about $2600, not exactly cheap, but it's guaranteed to be backwards compatible with future LTO standards and the media is as cheap as you could possibly ask for. One more thought: look into an LTO Gen 1 solution (100/200) for a cheap drive. Cost per gigabyte is roughly the same, it will just take more swapping.
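The $.22/GB figure covers media only. A hedged Python sketch of how the drive price amortizes over time, using the prices quoted above; the 1 TB/month volume comes from the original question, and the function name is invented for illustration.

```python
import math


def tape_cost_per_gb(months, drive_price=2600.0, tape_price=55.0,
                     tape_gb=250, monthly_gb=1000):
    """Total cost per GB (drive plus tapes) after a number of months.

    Defaults are the figures quoted in the comment; tapes are never
    reused, so the tape count grows with the archive.
    """
    tapes = math.ceil(months * monthly_gb / tape_gb)
    total = drive_price + tapes * tape_price
    return total / (months * monthly_gb)


for months in (1, 12, 60):
    print(f"{months:>2} months: ${tape_cost_per_gb(months):.2f}/GB")
```

Under these assumptions the cost falls from about $2.82/GB in the first month to about $0.44/GB after a year, approaching the media-only price the longer the drive stays in service.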

  37. Consider Online Backup by jp10558 · · Score: 2, Interesting

    One company that provides massive online backup and storage at reasonable prices is Streamload. You might want to check them out.

    --
    Opera, Proxomitron-Grypen,GPG 0x0A1C6EE3
  38. Re:I have one by Chess_the_cat · · Score: 2, Funny

    Just do what I do: memorize it all. No need for binders, cards, floppies, HD, DVD.

    --
    Support the First Amendment. Read at -1
  39. Only on slashdot by gexen · · Score: 5, Funny

    Only on Slashdot would they start talking about huge storage arrays and title it "for the common man"

  40. carousel for everyman by Doc+Ruby · · Score: 3, Informative

    A DVD-R jukebox can give you 200 DVDs at once. That's $3600 (drive/changer) + $268 (1000 DVD-Rs), for (1000*4.7GB) 4.7TB@$4000, or $1.18:GB. That's almost double your HD cost, but you'd need at least another host PC, and multiple controllers for the 16HD RAID, which is probably another $1000. And another $268 buys you another 4-5 months storage, so by next April you're down to $0.14:GB; in a year you're at $0.12:GB. A shelf of 200-disc "CD" books will hold your archives, 1 book per carousel for "fast" retrieval. Backup all your DVDs offsite at $0.27:GB. As DVD-R prices fall over time, you're probably looking at something like $0.05:GB, probably less than even plummeting HD prices. And the DVDs (especially with the cheap backups) are much more reliable, especially over 10 years, than the HDs. If you are looking at 10 year archive, at $80:month in DVDs, for 29% more money you can add a second host PC/changer set, left in their boxes, in case the original PC/changers fail.

    --
    make install -not war

  41. Cheap hardrives with firewire enclosure by ippearx · · Score: 2, Informative
    http://www.wiebetech.com/products/ComboDock.html

    Makes a cheap, fast way to put lots of data onto lots of hard drives. Using one of these bad boys means no extra money is spent on drive enclosures, cases etc. You only buy raw standard hard drives. Excellent if it's only backup, and you do not need lots of access. This solution is not automated, however.

    Hard drives are prone to failure. I was thinking of buying at least 2 drives of different brands to mirror, storing them in separate locations in sealed, air tight containers at just the right humidity/temperature. Also I think a disk check every 6 months or year would be necessary, and if any problems are found, replace the disk with another.

    One beauty with this method is you only need to pay for disk space as you need it, and hard drives may still get much bigger. I was going to buy drives at the lowest cost/megabyte which at the moment is 160GB drives.

    I would love to find more information on the physical storage of hard drives, especially how long they would be expected to last without use - months? years?

  42. Re:1024 GMail Accounts by Chess_the_cat · · Score: 2, Informative

    Too bad the maximum attachment size is 10MB.

    --
    Support the First Amendment. Read at -1
  43. Um, 1TB a month in IDE drives is cheap . . . by millisa · · Score: 2, Informative

    I would hope that if you are working with a TB of data, the value of that data is pretty high . . .

    Promise SX6000 = $255.95. (6) 200GB IDE drives in a Raid 5 = $624.95

    If you had a separate boot drive from the SX6000, you could just bring the system down for a couple hour maintenance once a month and slam all the drives out and put fresh ones in.

    Just keep buying new 200GB drives and shelve the old ones (or if it's *really* valuable and your home firesafe isn't enough, pay Iron Mountain or someone to keep it).

    There aren't hidden labor costs outside of those two hours it takes to setup a new array every month (DVD's are about 60 bucks a month for a TB, with a hundred or so for a drive (which *will* need to be replaced occasionally if you are burning that much) but you'll spend hours and hours just dealing with the swap outs and breaking up your data . .. )

    If you don't have to keep the TB of data after a month or three, then your price gets even cheaper after you invest in your initial hard drive media sets . . . and you can put all the drives in hot swap chassis to further minimize your time dealing with the issue.

    Of course this is all moot if your 1TB of data isn't valuable enough to invest 600 a month in . . .

  44. harddrives by Anonymous Coward · · Score: 2, Insightful

    For cheapest backup possible, just use harddrives. Create a software raid5, backup to it, then powerdown and remove the drives to someplace safe. You'll also be able to recover the drives on any machine that can boot linux.

    Hotswap or removable drive cages can be pricey, and aren't designed for lots of swap-ins and outs, so I'd just buy new IDE or SATA cables every few backups. If you're using the same set of drives multiple times, then leave the cables connected so as not to wear out the drive's pins.

    Eventually you'll wear out the ide connectors on the motherboard, so use one of those cheap ide adapter cards and replace as needed. Or use a cheap motherboard.

    It's too labor intensive to be in the same realm of solutions as a nightly tape backup, but not nearly so much as CD or DVD backups. It's easy enough to do once or twice a month.

    If you're cheap, you're not after disaster recovery, you want disaster mitigation.

  45. DVD Autoloader by SuperJason · · Score: 3, Insightful

    Explain this to me, I can buy a 200 disc cd changer for $100 bucks, but the same thing with a burner (cd/dvd) runs thousands of dollars. Isn't there any company out there that can do it cheaper?

    Heck, I remember a slashdot article about a guy who built one out of WOOD!

    This would be a great solution for short term recovery storage. Just keep a stack of CD's or DVD's ready, and it will load them in and burn them all automatically.

    On a side note, it would be great for converting a 400 disc CD collection into MP3s.

  46. Hmmm ... data is what brings in your money by hattig · · Score: 2, Insightful

    So spend some time and money on making sure it is safe!

    Even if you had a Bluray DVD burner, that would be 20 discs you'd have to burn to backup 1TB. So that is out of the question.

    Really what I'd set up is:

    1) Local: 1TB of hard drive space on IDE RAID (mirrored). An 8-port SATA controller would do, with 8 250GB SATA drives.

    2) GigE ethernet to somewhere else (got a separate garage?), or something faster if affordable

    3) A file server there with the same config for "off-site" backup. Should your PC catch fire and melt, you'll still have your data. Yeah, backing up 1TB of data over GigE will take around 15000 seconds a go, or 4 hours or so. That's okay overnight, and better than swapping 20 Blu-ray discs or tapes and then carrying them out there.
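A quick Python sketch of the transfer-time estimate. The 125 MB/s figure is the theoretical gigabit line rate; the 67 MB/s sustained rate is an assumption chosen because it roughly reproduces the 15000-second figure quoted above.

```python
def transfer_hours(data_gb, throughput_mb_per_s):
    """Hours needed to move data_gb at a sustained throughput."""
    seconds = data_gb * 1000 / throughput_mb_per_s  # 1 GB = 1000 MB
    return seconds / 3600


# 125 MB/s is the theoretical gigabit line rate.
# 67 MB/s is an assumed real-world sustained rate.
for rate in (125, 67):
    print(f"{rate} MB/s: {transfer_hours(1000, rate):.1f} hours")
```

That works out to roughly 2.2 hours at line rate and about 4.1 hours at the assumed sustained rate, so an overnight window is comfortable either way.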

  47. Exabyte by HonkyLips · · Score: 2, Informative

    Try Exabyte - for hardcore tape storage.

    http://www.exabyte.com/products/prodviews.cfm

    I think you can store about 1.6 TB on a single tape or similar, but check them out. Tape drives have come a long way from old SCSI DATs transferring 20meg a minute. And they're fully automated and although there's an outlay cost for the tape drive, over time the cost per gigabyte for storage will be lower than hard drives.
    If you have a security company do patrols of your office you can get them to take the tape offsite with them after nightly backups for added security... etc etc.

    --
    Putting syrup in coffee is some form of blasphemy.
  48. Re:Give Up Now by ePhil_One · · Score: 4, Insightful

    Now call me crazy, but have folks completely forgotten the age old solution, TAPE? An SDLT tape goes for about $50 and holds about 320GB, LTO holds even more, and I believe Quantum has just released the latest generation of SDLT. While it's not "cheap", an autoloader can be had for about $15,000 that can back up many TB hands off. Might be a bit much initially, but it's the best solution long term.

    --
    You are in a maze of twisted little posts, all alike.
  49. The simple economics of your "work" by swb · · Score: 3, Insightful

    If your "work" (as in food, housing and income) requires this kind of storage, you should be charging the kind of money that can make the economics of such data storage actually viable. I'm assuming that some of the really high-end storage devices from EMC, Hitachi, et al could handle your data generation/replication/backup needs effortlessly.

    If that's too expensive (and it usually is), you can kludge your own system using low-end stuff from Hpaq/IBM/Dell's x86-server-oriented product lines. LTO1 drives are pretty cheap and we've found them to be very reliable over the past 3+ years, as well as offering 100 gig native per tape.

    If even that's too expensive, then I seriously think you need to re-think the economics of your work situation. If your work doesn't cover your capital costs, you're not charging enough. If the work and data are business valuable enough, cutting your storage bill to the bone by building Linux clusters crammed with IDE HDDs is just a bad business decision.

    If this is just your hobby-type work, then you need a cheaper hobby, like heroin addiction or something affordable. Physical space and electricity aren't cheap enough in a metropolitan area to burn through 1TB of storage per month, let alone reliable data storage.

  50. Bad idea by Anonymous Coward · · Score: 5, Informative

    Google stores data for fast access, not for reliable storage. They don't care if they lose a few hundred gigs when a handful of disks die, they'll just re-spider it in a few days when the Googlebot hits the sites which were lost. Their solution is NOT optimized for reliable storage and it's not suited in the slightest to this guy's problem.

    1. Re:Bad idea by Anonymous Coward · · Score: 4, Informative

      This is incorrect. GFS (Google File System) keeps the same data on many systems: the nodes hold 3 copies of each data slice. If one server fails, the other two mirrors re-copy the data. If two fail, the remaining server mirrors the data to ensure it is never lost.

      Google does not want ANY data to be lost. They have many mirrors of all data.

  51. Firewire hub with hardware RAID by swb · · Score: 2, Interesting

    I'd love to see a Firewire hub that could act as a hardware RAID controller. A program on the computer would enable management of the RAID controller, and once formatted, the logical volumes would be presented to the host computer as standard disk volumes, eliminating the need for any special drivers on the host computer, as well as enabling the entire array to be portable to other platforms.

    How expensive could something like this really be? $300-400 at most, I'd have to guess considering what most places are charging for SATA RAID cards.

    1. Re:Firewire hub with hardware RAID by rhizome · · Score: 2, Informative

      > I'd love to see a Firewire hub that could act as a hardware RAID controller.

      Firewire drives can be daisychained, and in fact OS X allows you to set up software RAID on multiple firewire drives attached to the system. You can't move them to another system and get access, but that's about the only limitation that I've found and it's more than decent for local high-density storage..

      --
      When I was a kid, we only had one Darth.
  52. I'm not a "Common Man" then by macdaddy · · Score: 2, Informative
    I have a dual MP2400 with 4 x 120GB WD 1200JB drives. I have a single XP2800 machine with 4 x 120GB WD 1200JB drives, 2 x 200GB Maxtor 6Y200P0, and 2 x maxtor 7Y250P0 drives. I have a dual Xeon 2.8Ghz machine with 4 x 120GB Maxtor 6Y120M0 drives. That accounts for all my regularly used machines. I guess I'm not a common man. :-) Not to brag...

    I have to disagree with the sister system though. For most geeks like you and I a sister system would be fairly adequate. It would be better with an occasional off-site backup. However it really sounds like this guy's data is far too valuable to have only one copy of it and to have all copies be at one physical location. He really needs an off-site backup somewhere. Imagine for a moment if his home (I'm guessing he works from home, but this still applies to a real store-front business) was robbed. The crooks didn't know what they were taking. They saw two shiny computers in an office and figured they could hawk them on the street. There goes all his data, both copies. D'oh! So in short a sister system is a good idea but it probably won't do this guy much of any good. It would be a good local solution for a short term live mirror (ie, data is archived that night but the sister machine gives you a backup for that one day's work).

  53. Do It Yourself CD Changer by Darth+Yoshi · · Score: 2, Interesting
    But long term storage is painful -- DVDs cost about $0.10-$0.15/GB but takes too much human time...

    Do It Yourself CD Changer

    --
    // TODO: fix sig
  54. Re:Give Up Now by tchuladdiass · · Score: 4, Informative

    Come on, this is Slashdot. A tape changer doesn't have to cost that much money if it's made of Lego (shamelessly pulled from an earlier Slashdot story which I can't find at the moment).

  55. You're NOT using RAID??!!!! by swordgeek · · Score: 2, Insightful

    Let me get this straight: You have a four-node cluster, you have 1.6TB of online storage, and you need some sort of permanence; and you're not using RAID of any form?

    This is utter insanity! Without RAID, your only hope of safety is in your backups--which you're only asking about now!

    RAID your data ASAP, and then start looking for backup systems. Take a look at some of the DLT4000 replacements.

    --

    "People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
  56. Another perspective by v1 · · Score: 2, Interesting


    Just tossing out another point of view - similar but different than some of the others previously discussed. First off, examine the data you are keeping - do you really need that much? Nowadays it's common to be able to acquire data faster than it can be processed, and if you never stop gathering data, well, you never will catch up, only fall farther and farther behind.

    If you DO need this data, and you are going to need it for a while (a year or more), I'd recommend cheap HDs. They also have the advantage of being easily cataloged, and are untouchable when compared with the access time of tapes. Don't go RAID5 though, as it is not "catastrophe-proof" (flood, fire, tornado, etc). For catastrophe protection, mirror your drives. When you have them loaded up with data, pull the FW cables and swap drives in the enclosures with fresh empty drives. Label them well, and then take each half of the mirror to DIFFERENT LOCATIONS. It's OK to keep one set on-site, but the other set must be somewhere else, preferably in another zip code. This will allow you near instant access to your data (since it's onsite), will protect your data from mechanical failure (through mirroring) and will protect you against catastrophe. (You WILL need to acquire new FireWire boxes etc if your office gets leveled... don't forget this detail in general - the data is of no value if you lack the equipment (tape drives etc) to read it back in with.) I know you can get compression and fit more on a tape etc by using archiving software, but it may be worth the extra cost to obey the KISS rule and just simply drag and drop the data to the formatted HDs. This will make data recovery MUCH SIMPLER, and if there are errors on the HD when you need to recover, this will ensure you can actually recover most/all of the information. Archive streams and tapes are notorious for losing 100% of the data that follows a corruption point in the stream.

    Once you know you no longer need a specific set, drop it back into the pool of usable drives. Buy them by the case, it's much cheaper this way. It also is advisable to buy the same make/model every time you have to get more drives, even if there are newer, larger, cheaper models out, because having all the same drives means one less complication to worry about in times of crisis.
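Before the two halves of a mirror go to different locations, it may be worth confirming they really match. The comment does not describe such a check; this is a minimal Python sketch of one possible approach, with hypothetical function names, that hashes every file on both drives using the standard `hashlib` module.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def mirror_mismatches(primary, mirror):
    """Relative paths that are missing or differ between two trees."""
    a, b = tree_digests(primary), tree_digests(mirror)
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))
```

Calling `mirror_mismatches` with the mount points of the two drives returns an empty list when they agree. Because the files are plain drag-and-drop copies, any mismatch points at a specific file rather than a whole archive.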

    --
    I work for the Department of Redundancy Department.
  57. This is why I hate Ask Slashdot by Propaganda13 · · Score: 3, Insightful

    Not enough feedback or information!

    OK, 1TB/month that doesn't say much.
    Always look at different levels of case scenarios and work from there. I usually start with loss of building by fire and work down through limited hardware failure or data corruption.

    There are several factors that determine how often you should backup. Here's just a couple of questions to answer.

    How much is the data worth?
    How much is your time worth? If you lost a day or week of processing time.
    Is your work time dependent? (deadlines)
    If you lost the data, did you lose the data completely or just lost processing/analyzing time on the data that you can get from your clients again?

    How long do you have to store the data, and have it retrievable? One month compared to several years really changes your options.

    How financially responsible are you for the data?

    Multiple backups (daily, weekly, monthly; full and incremental) in multiple locations are key to successful backups.
    RAID is for redundancy or performance, not backups.

  58. Re:!RAID by MasTRE · · Score: 2, Insightful

    > because it protects against device failure, not *user* error. if you delete a file from a raid array, it's gone. that's part of what offline is all about.

    You can add to that getting hacked. They can't hack your off-line data.

    --
    Must-not-watch TV!
  59. What about aggregate speed? by Bananas · · Score: 2, Informative
    Something I've noticed in all posts, the price is a prime factor, but the original poster also seems concerned about access times (hence DVDs not being an option, due to the time it takes to retrieve data).

    For simplicity, I'm not going to go into RAID tradeoffs, etc. and just stick with "striped data", which gives you maximum bang for the buck. You should draw up a simple spreadsheet with the following headings:

    1. Media Type: Enter the type of media, its manufacturer, etc. here. Example: HDD, Hitachi, Model XYZ, 320 GB.
    2. Size: How much raw data the medium will store. All figures should be in GB.
    3. Speed: The expected data transfer rate of a single unit of storage. IDE drives vary and can range from 5 MB/sec to 30 MB/sec; tape also varies. All figures in MB/sec.
    4. Watts per Unit: How much power does each media unit draw? Tape drives will be difficult here, but HDD units are typically around 20-30 watts. Go conservative and plan on allocations of 30 watts for HDD units.
    5. Cost per Unit: How much does it cost for 1 HDD/Tape/whatever?
    6. Cost per GB: [Cost per Unit] divided by [Size].
    7. Units Needed: Given a target of 1024 GB (1 Terabyte), how many units of storage are needed to reach that size, assuming data striping and no RAID-5? Formula is '1024' divided by [Size], then round any fraction up to the next whole number.
    8. Expected Size: Take [Units Needed] times [Size]. If you have 4 units of 320, you'll end up with 1280 GB (sans redundancy).
    9. Total Cost: Cost of non-redundant array. [Cost per Unit] times [Units Needed].
    10. Aggregate Speed: Assuming a 1:1 ratio of controllers to units, what kind of speed can we expect? [Units Needed] times [Speed]. All figures in MB/sec. Note: a huge array of 1+ TB can be made unusable if you can only process 10 megabytes a second.
    11. Power Consumed: [Units Needed] times [Watts per Unit]. Important - your power supply should be rated at about 120% of this figure to make everything work reliably. Also, if you're going above 400-500 watts, then plan on some additional cooling - there will be an increase in BTUs.

    It's not exactly a great spreadsheet layout, but it should be enough to enter everything in and start seeing what is practical and what isn't. I'm sure that someone else would be able to enhance this a little further - any takers?
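As one possible take on that spreadsheet, here is a Python sketch in which each derived column is a property computed from the five inputs. Cost per GB is taken as unit cost divided by unit size. The example drive and its price, speed and wattage are hypothetical values for illustration only.

```python
import math
from dataclasses import dataclass

TARGET_GB = 1024  # 1 TB target, as in the comment


@dataclass
class Media:
    name: str          # 1. media type
    size_gb: float     # 2. size per unit
    speed_mb_s: float  # 3. transfer rate per unit
    watts: float       # 4. power draw per unit
    cost: float        # 5. cost per unit

    @property
    def cost_per_gb(self):           # 6
        return self.cost / self.size_gb

    @property
    def units_needed(self):          # 7, rounded up
        return math.ceil(TARGET_GB / self.size_gb)

    @property
    def expected_size_gb(self):      # 8
        return self.units_needed * self.size_gb

    @property
    def total_cost(self):            # 9
        return self.units_needed * self.cost

    @property
    def aggregate_speed_mb_s(self):  # 10, one controller per unit
        return self.units_needed * self.speed_mb_s

    @property
    def power_watts(self):           # 11
        return self.units_needed * self.watts

    @property
    def psu_watts(self):             # supply rated at 120% of the draw
        return 1.2 * self.power_watts


# Hypothetical example row; the price, speed and wattage are made up.
hdd = Media("HDD, 320 GB", size_gb=320, speed_mb_s=30, watts=30, cost=160.0)
print(hdd.units_needed, "units,", hdd.expected_size_gb, "GB,",
      f"${hdd.total_cost:.0f} total, {hdd.aggregate_speed_mb_s} MB/s,",
      f"{hdd.psu_watts:.0f} W supply")
```

Adding one `Media` row per candidate (tape, DVD, other drive sizes) and sorting by `total_cost` or `aggregate_speed_mb_s` gives the side-by-side comparison the list describes.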

    By the way, you really should think about RAID-5 at the very least. All it would take is just one drive to hose your data completely. Besides, as the array grows in size, the price tradeoff becomes smaller and smaller, to the point where it's really not worth your time to stripe all of your data without redundancy. I believe that the md drivers in linux support up to 32 devices per RAID set. That takes your overhead from 1/5 of your array (in a 5-drive setup) down to 1/32 of your array.

    A SAN-style setup lends itself well to this, but the price is very prohibitive to "the common man", as it requires very expensive hardware. You can emulate something like this via GFS support in Linux, which (theoretically) would allow you to aggregate your data.

    If there is a requirement to keep the data online at all times, you'll need to spend more on some PC cases, as well as some networking to string the units together. Pick a reasonably-priced case that will house all the media units, have adequate power (at least 250 Watts, 300+ would be ideal) and keep them cool. Use a motherboard that is reliable, and can adapt to several different clock speeds for a given CPU; you'll want something that can be thrown out for less than $99.00 if it should go bonkers on you, but if the CPU burns up, you should be able to still get parts off the shelf and get the motherboard running again. Stick with the "commodity" or low-end CPUs, as (a) they tend to be cheaper, and (b) having been through a complete lifecycle, any bugs or issues with the CPUs will be well-known by now. Don't worry about the speed of the board or CPU at this point, as most "modern"

  60. Xraid by Phrack · · Score: 2, Insightful

    I haven't tried it myself, but Apple's Xraid appears to be gaining in popularity as a reasonably priced bulk data storage solution. It reportedly works with Linux, Windows, Netware and, of course, Macs.

    If that doesn't suit ya, and it's bulk storage without necessarily speed you're looking for, check into the ATABoy line from Nexsan.

    --
    Dump the IRS - http://www.fairtax.org
    1. Re:Xraid by Lukey+Boy · · Score: 2, Funny

      I do believe that this is the first time I've ever seen the word Apple used in the same sentence as "reasonably priced".

  61. What are your near- and long-term requirements by TBone · · Score: 4, Informative

    I looked through some of the answers here, and as near as I can tell, you've got a bunch of home hobbyists telling you how to back up your home computers. Perhaps all your needs entail is a computer with an external IDE drive array and 4-10 200G SATA drives in it. But from your initial post, it's not clear what you need your offline storage _for_.

    First of all, you mention that you generate and use 1 TB of data a month. What happens at the end of that month? Does all of the data become useless? Is some of it carried through? Is it useful for historical processing for some time after it's not "live" any more? The disposition of that offline data is important; you can't determine how you can most effectively back up your data until you know what you need to do with that data once it's backed up.

    Since no one cares about backing up old data that they never use any more, I'm going to assume you need this data in some form in the future. I'm also assuming that your data ages out completely every month.

    Realistically, you have two options: Large redundant disk arrays, or tape. Various factors give credence to one or the other.

    First of all, get off of the SATA hacks, and realize you're going to need to go to SCSI, whether you end up with disk or tape. You're backing up data, you're going to want it to be reliably written out, and SCSI is the de facto standard for backup architecture. Yes, you pay more for it, but there's a reason for it: the SCSI equipment I manage at work fails a fraction as often as the various IDE/ATA systems do. While SATA is marketed as a consumer technology, it will never meet the rigors of being a reliable backup methodology.

    • Media Cost: Tape wins over disk here. LTO tape is running, at a quick check, for about $75 retail for 200/100 GB (compressed/native) tapes. Even assuming only reasonable compression, you're looking at 150 GB for $75. And that is single-cart pricing; tape pricing quickly drops if you're ordering in bulk (typically in packs of 10, then at the 3-packs level, then more, check with your preferred media vendor).
    • Hardware Cost: Disk wins, but it's a double-edged sword - every disk you own has electrical and mechanical failure chances. The more disks you have, the more likely you are to lose one of them. The more you're storing on disk, the more you open yourself to a catastrophic failure of those disks themselves. High-end fast tape drives and libraries are expensive, but they just _work_. You plug them in, load your preferred tape management software (hell, run mtx for that matter), and start backing stuff up. No formatting, setting up arrays, hot-swap schedules, anything like that. But you pay through the nose for it - expect to spend into the $10K range for a large-scale tape storage solution that you could match (in short-term storage duration) for a couple of thousand dollars with a disk-based solution.
    • Hosting Space: Try to store 10TB of disk, and you'll need an air conditioner in that room just to cool down the disk cabinet and controllers. 10TB of tape just sits there though; you can store 4TB of tape online in a small 3U (about 6 inches) tape library - that's 24 tapes, and such libraries typically also support two drives. Go to 5-6U, and you can get 4 drives and over 50 tapes. If those were 200GB LTO tapes, you'd be looking at up to 10TB of storage available online, or easily offline and off-siteable. In addition, tape is easily expandable. Need more storage space? Buy another tape. No new hardware needed, no power concerns, just drop it in the drive or library and go.
    • Speed: Disk definitely has an edge. Set up a decent SCSI RAID5 array (real hardware RAID across multiple disks on separate physical controllers, not this playtime software 0+1 homebrew IDE RAID crap) and watch your write speeds triple. If you need to back up that 1 TB overnight, you don't have much of a choice but to go to disk in some form. But again, you pay a price for it. The speed you save in the
    --

    This space for rent. Call 1-800-STEAK4U

  62. Re:options options, what is your time and data wor by wik · · Score: 2, Informative

    There's no reason why you couldn't read each of the DVDs in serially and incrementally rebuild the lost DVD. On recovery, you should only need enough space to hold a single DVD to rebuild the lost disc.

    A disadvantage is that the data cannot change while you write all N+1 DVDs and restoring would require lots of DVD swaps (regardless of whether you've lost a DVD or not) and the ability to incrementally write files with gaps in them (not an issue with most filesystems).
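
    A minimal Python sketch of that serial rebuild, assuming the extra disc holds a plain byte-wise XOR parity of the N data discs (the simplest possible N+1 scheme; real tools such as PAR use stronger codes). The function names are made up for illustration, and byte strings stand in for disc images.

```python
from typing import List


def xor_into(acc: bytearray, data: bytes) -> None:
    """XOR data into acc in place; data may be shorter than acc."""
    for i, b in enumerate(data):
        acc[i] ^= b


def make_parity(discs: List[bytes]) -> bytes:
    """Build the parity image; shorter discs are treated as zero-padded."""
    acc = bytearray(max(len(d) for d in discs))
    for d in discs:
        xor_into(acc, d)
    return bytes(acc)


def rebuild(surviving: List[bytes], parity: bytes, lost_len: int) -> bytes:
    """Recover the one missing disc using a single disc-sized buffer.

    Each surviving disc is folded in one at a time, which is the
    "read serially, rebuild incrementally" idea from the comment above.
    """
    acc = bytearray(parity)
    for d in surviving:
        xor_into(acc, d)
    return bytes(acc[:lost_len])


if __name__ == "__main__":
    discs = [b"alpha", b"bravo!!", b"c"]
    parity = make_parity(discs)
    print(rebuild([discs[0], discs[2]], parity, len(discs[1])))
```

    Note that you have to know the lost disc's original length (or accept trailing zero padding), and that a single parity disc only survives the loss of one disc per set.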

    --
    / \
    \ / ASCII ribbon campaign for peace
    x
    / \
  63. ICI Optical Tape / CREO Vancouver by anubi · · Score: 2, Informative
    I haven't seen any mention of these guys here, but a few years ago, I remember a company, CREO, working on a data recorder which used a spoolable optical tape. I believe this tape was made by ICI over in Europe.

    There were several packaging options for this tape.. including reels of 2" wide tape and cartridges.

    I've lost track of what happened to it. All I remember is that this tape existed at one time and some research was being done to make data recorders of phenomenal storage capacity.

    Back in the early 90's, there was one company in Campbell, California, known as "LaserTape" which was trying to design a tape drive for the PC which used cartridges of this tape. I have lost track of whatever happened to the company.

    --
    "Prove all things; hold fast that which is good." [KJV: I Thessalonians 5:21]

  64. Re:Firewire drives? by silentbozo · · Score: 2, Informative

    Word of warning. Don't cheap out on Firewire hardware - it's touted as being bulletproof, but in practice I've found SCSI-voodoo-like interactions between cheap cards/cases, and questionable power supplies. I've pretty much given up using Firewire for applications where I need to swap drives a lot, as weird crap happens, just at the worst possible moment.

    In those applications, I've gone with dedicated ATA/133 cards with a nice roomy case with a bunch of removable drive bays. It's a pain to have to shutdown to swap drives, but less of a pain than Windows bluescreening, rebooting, and "fixing" your attached Firewire drive, scrambling all of the data on it, and making it impossible to run a recovery (no, I didn't have a backup of that data...)

    I've had weird crap happen with my Macs as well - some hardware doesn't show up unless you have it plugged in on startup. In theory it was a great idea (mix Firewire cases with removable drive bays); in practice, you're asking for trouble if you're using cheap parts (i.e., bargain-basement cases with cheap cables).

  65. A cheap, simple solution by Cabeiroi · · Score: 2, Informative

    I'm not sure what your price range is, but one method I've had success with is a Promise SATA add-in card and removable hard drive enclosures. SATA is hot-swappable, so combine a cheap hard drive enclosure ($10-$30+) with any SATA hard drive of your choice and you have a relatively cheap solution.

  66. How about abusing physics laws? by OlivierB · · Score: 2, Interesting

    I remember reading one day about some research somebody did on abusing network capacity to store data. Basically he would send mail to himself via a third-party SMTP server. He would tell his destination server to ignore his messages until a set date, then refuse them, so they would be bounced back to his originating email account. By keeping this rolling he achieved some pretty amazing storage for FREE, with ultra-reliable ISP-grade mail backup!

    Now apply the same principle to space. Say you have a server on Mars. You could finish transmitting the data to Mars before Mars even started receiving it. When Mars received the data it would immediately send it back, not even waiting for the message to be completely received, so the data would not use any storage on Mars either. At this point you have achieved media-less storage and have abused the network capacity of space. Talk about the geek factor in that!

    I don't really want to model this network's capacity, but everybody here understands that it is a function of the transmission rate, the propagation speed, and the distance to the "relay" server. Of course, there is an amount of data for which you will start needing some sort of storage on both servers. This will only happen if the data can make the round trip to and from the relay server in less time than it takes to transmit the data in full. Improve the transmission rate and your network "memory capacity" multiplies.
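
    For what it's worth, the "capacity" of such a link is just the transmission rate times the round-trip time (the bandwidth-delay product). A rough Python sketch, assuming the signal travels at the speed of light and the relay echoes with zero delay; the 225 million km distance and 1 Mbit/s rate are illustrative guesses, not real link figures.

```python
SPEED_OF_LIGHT_M_S = 299_792_458  # metres per second, in vacuum


def in_flight_bytes(rate_bits_per_s: float, distance_m: float) -> float:
    """Bytes 'stored' in transit on a link that echoes data straight back.

    Bandwidth-delay product: rate multiplied by the round-trip time.
    """
    round_trip_s = 2 * distance_m / SPEED_OF_LIGHT_M_S
    return rate_bits_per_s * round_trip_s / 8


if __name__ == "__main__":
    distance_m = 225e9   # assumed Earth-Mars distance; it varies a lot
    rate = 1_000_000     # assumed 1 Mbit/s link
    print(f"{in_flight_bytes(rate, distance_m) / 1e6:.1f} MB in flight")
```

    Doubling the rate or the distance doubles the capacity, which is the "multiplies" part; anything larger than this has to sit in a buffer at one end.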

    --
    Artificial intelligence is no match for natural stupidity
  67. Work Demands should be realistic. by Linus+Sixpack · · Score: 2

    My work demands 1TB a month ....

    It sounds like you need a good cost-benefit analysis and an idea of a budget.

    First, RAID your existing data.

    Second, replicate any working solution you have now identically for next month and for backup hardware.

    Have a serious talk with work as to what is expensive and what you can afford. What happens if a data set is lost? How much damage/cost would that incur? I would look into AIT drives from Sony.

    It sounds like you are in a frame of mind where you see everything as expensive. This will heavily influence your decision. Walk through a data disaster scenario with your backers and examine your costs in that light.

    ls

  68. Thanks to all for lots of useful suggestions by Vigyaan · · Score: 2, Informative
    A few points are emerging (I am still reading the responses, so pardon me if your comment didn't make it here):

    Hard-drive-based solutions seem to be (currently at least) cheap and easy for immediate use

    DVDs are cheaper for long term storage but automation devices are still not commonly available (plus capacity per DVD is small)

    Software RAID is slow for writing but okay for reading data

    Tapes may be a viable alternative for long term data storage but the tape drives require an initial investment

    Some readers have mentioned "LaCie Bigger Disk". $1199 for 1 TB disk space ... a price to pay for convenience.
    Since a lot of you have asked me, now I will explain the nature of my data, its storage and analysis.

    I am a scientific researcher working in computational biology. I do atomistic modeling and generate snapshots of protein conformations along MD trajectories. The data is analyzed several times to calculate different quantities. Those of you who do this kind of stuff know that we can collect and store only a fraction of data we want. I generate this data on supercomputers and then compress it (using bzip2 and gzip) to store temporarily and permanently.

    The data generated is usually analyzed within the next month. A good fraction of the data (about 70-80%) needs to be analyzed again several times for different quantities in the next few months. In my experience 80% of the data is usually discarded after a year or so. Therefore about 2.4 TB/year needs permanent storage.

    So 500 GB at least is required for "daily use", 1-2 TB would be nice to have for intermediate use and over 2 TB will need "permanent" storage.
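
    The arithmetic behind the 2.4 TB/year figure, as a small Python sketch. It assumes a steady 1 TB per month, a flat 80% discard rate, decimal terabytes (1 TB = 1000 GB), and the $0.60/GB hard disk price from my original post; the function names are just for illustration.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes


def permanent_tb_per_year(tb_per_month: float, discard_fraction: float) -> float:
    """TB per year that survive the cull and need permanent storage."""
    return tb_per_month * 12 * (1.0 - discard_fraction)


def disk_cost_usd(tb: float, usd_per_gb: float) -> float:
    """Cost of holding that much data on disk at a flat per-GB price."""
    return tb * GB_PER_TB * usd_per_gb


if __name__ == "__main__":
    keep = permanent_tb_per_year(1.0, 0.80)
    print(f"permanent storage: {keep:.1f} TB/year")
    print(f"on hard disks at $0.60/GB: ${disk_cost_usd(keep, 0.60):,.0f}/year")
```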

  69. Re:Give Up Now by WuphonsReach · · Score: 2, Interesting

    The problem is most of us have 8mm tapes sitting around from a previous time we did a backup of something important to tape, only to find that the tape-drive-vendor's long dead, and the tape device is long dead too.

    Sounds trite, but eBay to the rescue.

    I started with a single Exabyte 8mm backup drive and picked up 2 more on eBay for around $150. (This was a few years ago even and the original drive had been given to me with 50-60 used/new tapes.) Now that I have a DVD-burner, those drives don't do me much good (too small a capacity, and way too slow compared to DVD).

    Rule of thumb for corporate / business use is that you always buy 2 or 3 of any mission-critical backup hardware. That way, if one of the units breaks, you still have the other to rely on while the first is either fixed or replaced.

    Having identical backup hardware and software at another location is also a great idea.

    --
    Wolde you bothe eate your cake, and have your cake?
  70. Re:Give Up Now by WuphonsReach · · Score: 2, Informative

    First rule of archiving data on optical media:

    It will get scratched and damaged.

    Which means that unless you're adding recovery data (using QuickPar) or burning 2 copies, you will lose at least some data on the media within a few years. (Cheap media sometimes only lasts a few months if not stored in dark and climate-controlled conditions.)

    QuickPar is nice because you can pick how much redundancy you want on the disc. I find that 5-10% is plenty for most uses and guards against all but catastrophic damage to the disc.

    (The guideline for redundancy is based on how often you check the media vs how fast the media degrades or is damaged. If the media degrades at a rate of 1% per month and you only verify the disc annually, you'll want at least 12% redundancy but more like 18% redundancy.)
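
    That guideline as a small Python sketch. It assumes damage accumulates linearly between checks, and the 1.5x safety factor is only inferred from the 12% versus 18% figures above; it is not a QuickPar setting, and the function name is made up.

```python
from typing import Tuple


def recommended_redundancy(degrade_pct_per_month: float,
                           months_between_checks: float,
                           safety_factor: float = 1.5) -> Tuple[float, float]:
    """Return (minimum, padded) recovery-data percentages.

    minimum covers the damage expected between two verifications,
    assuming linear degradation; padded adds the safety margin.
    """
    minimum = degrade_pct_per_month * months_between_checks
    return minimum, minimum * safety_factor


if __name__ == "__main__":
    low, padded = recommended_redundancy(1, 12)
    print(f"at least {low}% redundancy, preferably around {padded}%")
```

    Checking the discs more often lets you get away with less recovery data, which is why the 5-10% range works if you verify every few months.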

    --
    Wolde you bothe eate your cake, and have your cake?