How Do You Backup 20TB of Data?
Sean0michael writes "Recently I had a friend lose their entire electronic collection of music and movies by erasing a RAID array on their home server. He had 20TB of data on his rack at home that had survived a dozen hard drive failures over the years. But he didn't have a good way to backup that much data, so he never took one. Now he wishes he had.
Asking around among our tech-savvy friends though, no one has a good answer to the question, 'how would you backup 20TB of data?'. It's not like you could just plug in an external drive, and using any cloud service would be terribly expensive. Blu-Ray discs can hold a lot of data, but that's a lot of time (and money) spent burning discs that you likely will never need. Tape drives are another possibility, but are they right for this kind of problem? I don't know. There might be something else out there, but I still have no feasible solution.
So I ask fellow slashdotters: for a home user, how do you backup 20TB of Data?" Even Amazon Glacier is pretty pricey for that much data.
I would say use floppies, but I'm kind of old and out of touch now.
At home, I didn't feel like paying for 2 large arrays to store my data, so if I rip any media, I always rip it to DIVX. 800 MB for a DVD or even bluray rip is a great economy, saves me money on primary storage and also enables me to back it up. I accept the loss of quality as I can always reference the original media if I want.
Another option in the future may be subscription services which have HD content, thus eliminating my need to roll my own. We'll see what happens there.
Crashplan has unlimited storage. I use their home plan; it's unlimited for up to 10 machines. I think I am backing up about 6TB there now.
Why buy one array when you can have two at twice the price?
In related news: rm -ra * should be used with caution. ;)
Use a ZFS pool using a combination of a mirror, a raidz3 & spares. Add new disks as hot spares when money can be allocated. Easy, somewhat affordable & allows for failure.
You don't, at least not for cheap.
The only way you would be able to do something like this effectively would be to run another RAID alongside your working RAID and back up to that.
Still, holding onto 20 TB of data is overkill.
I would recommend bringing that down to a more manageable size for a home user and only backing up the items that are not easily replaceable, like personal photos, movies and documents. Backing up your movie collection or porn collection is just a waste.
"Voices In My Head" The Unauthorized Biography
I really doubt anyone actually uses 20TB of movies & music. It just sits there.
I have a 16 TB media collection at home that I just back up on more hard drives.
External hard drives in USB cases + Robocopy works great for me.
I don't respond to AC's.
Shouldn't this be titled 'Ask Slashdot: How Do You Backup 20TB of Data?'?
Not really a hard question to answer.
But you need real backup software. As you fill up each drive, you replace it and continue the backup until you have a full backup. This way you can take the drives off site. As with any other backup solution, test the drives every few months to make sure your data is not corrupt and you don't have a failed drive.
Tape.
> It's not like you could just plug in an external drive [...]
Why not? Maybe not one, but 10 or 20 of them.
Most businesses would have two. Then just sync them. If it's not worth it to you, the data must have been worthless.
Use tape.
No need to back it up because he owned all the original CDs.
Oh . . . wait.
Unlimited backups, $5 a month...
you need a BIG connection though...
Some tapes and done... It should not even be that expensive; everything else will cost more!
What saddens me is that you could not have thought of this yourself...
redirect all the backups to /dev/null :P
I have a similar situation: an 18.6 TB RAID-Z at home (8 3TB drives) using FreeNAS. With the new update it shows it was initially set up using a non-native block size (I was a bit naive regarding the settings when I first set it up), and I'd like to rebuild it, but I have no way to back up 14+ TB. Also, I would like to have a backup in case more than one drive dies (1 parity works well but I could still suffer a catastrophic failure). I've looked into tape backup, but anything that seems like it'd have enough storage to be practical (1+ TB per tape) seems excessively expensive, and the 100GB tapes seem like they'd be unmanageable.
-SaNo
If your ISP doesn't have data caps, look at Backblaze ( http://www.backblaze.com/ ). $5 / month for unlimited storage for one computer. Only available for Mac and Windows, but I'm sure a virtual instance of Windows if you're using a Linux box would work... These are the folks that opensourced their hardware design for their storage "pods." http://blog.backblaze.com/2011...
I ran into this problem with software raid back when i was younger. My fix was mirroring to multiple drives and of course backing up off-site (parents house).
And just download new porn.
Why not store it all in 20000 github repositories?
If his data is legitimate, legally acquired media, he has hard copies anyway and only uses the digital copies for his home media? As to your question: a second server seems the only feasible option to me with that amount of data, although it all depends on the monetary value you place on your data. My motto is: if it's freely acquired and easy to replace, why bother? Regard it as temporary. Whereas if it's important or work related, back it up.
BackBlaze offers unlimited backup storage for home users for around $5/mo - encrypted with asymmetric keys. I've got about 750 GB on there myself, works great. Although they may not *like* you backing up 20 TB of stuff, they should accept it. And, if they don't, you're out about five bucks. Probably worth a try.
"My friend (read I) lost 20TB of pirated content! What should my friend have done different?"
How about, ask yourself, how much of that content were you intending to ever consume again. Yeah, you can most likely delete 95% of it, that's 1TB of content that you might use again.
Hoarders! *lol*
Tape, still cheapest per byte when random access is not needed, including backups. Much of the speed disadvantage can be handled by putting a VTS in front.
Re-rip from the original media.
I have about 7TB. I built 2 RAID devices, and back one up to the other.
It really depends on how often you plan on backing up, whether automation is required, and what your budget is.
The most expensive (but most automated and up-to-date) is a second RAID array that's made of cheaper disks and mirrored. Doesn't have to be fast, just made up of cheap 2TB/3TB disks.
Tape backup is also going to be an expensive solution, but also one that is much more automated.
The cost of Blu-Ray disks can't really be much compared to the other two options - if it's a media/file server, just do monthly incremental backups and save to Blu-Ray, the majority of the backups won't be substantial in size.
The best solution might be to tone down the digital packrattery - 20TB is a LOT of space for media that will probably never be played again.
Totally not a "backup" solution but raidz2 to protect the data from many types of failures, and hourly snapshots to protect the data from the operator....
Now if your box catches fire, floods, etc. you are in trouble, but I agree the problem is not easy to fix.
You either spend a ton of time (and money) writing Blu-rays, or go for expensive tape solutions, etc.
At the end of the day you might find it is cheaper to just have two boxes with separate raidz2 pools and sync them.
Heck, you can even use the snapshots to support offline replication where you power up the second box and dump the snapshots across and power it down again.
I've been using Crash Plan (http://www.code42.com/crashplan/) and it's pretty good and relatively cheap ($4-$5/month depending on contract length).
For one price you get "unlimited" storage. I only have 60GB or so of data that I back up, though. I wonder if they would have an issue with 20TB.
The big problem would be the initial upload. If you have a bandwidth cap issue, you can rate-limit your upload. You can also send them a seed drive, but that may be pretty tricky with 20TB of data.
Just a thought.
A quick check at one service which lists such large amounts, you would be looking at almost $20k/year to keep a single offsite copy of that. That is the posted price however, I imagine that is enough that you could shop around and find a deal, but, a deal is still going to be prohibitive for most people.
At 20 TB I would start thinking about one of two things: Tape, and/or git-annex.
Unless prices have changed since I last looked and the scales tipped, tape has the advantage of being cheap. Of course, you will need to test your tapes occasionally and likely want 2 copies just in case, but, at that point you are invested in tape, may as well.
The other possibility is git-annex and lots of drives, but you can mix types. That way you can keep a catalog of your library and information on where it all is, and how many copies of each thing you have.
Of course, any way you slice it, each physical piece of media is something that can fail so you have to occasionally test to ensure redundancy.
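For what "occasionally test" could look like in practice, here is a minimal sketch in Python. It is not how git-annex or any backup product does it; it only assumes you keep a checksum manifest next to each copy and re-check it now and then. The function names (`build_manifest`, `verify`) are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large media files never sit fully in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad
```

Build the manifest once when the copy is written, store it somewhere other than the drive being tested, and run `verify` every few months. An empty list means every file still reads back with the checksum it had when the manifest was built.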
"I opened my eyes, and everything went dark again"
Glacier at $20 per month for 20TB is ridiculously cheap by today's standards. And at those sizes, you'd want to ship those drives to Amazon instead of uploading. We do this all the time and it's not that hard.
The price of TBs of storage of course will come down without question. But by today's standards $20/month for a medium that won't "bit rot" on you is an amazing deal.
It sounds like your friend had 20TB of movies and music that he might - or might not - have had legitimate copies of elsewhere. I'm not defending the RIAA or anything here but it sounds to me that all your friend is out is time. If he had any legitimate data - of his own creation - he should have already been backing it up somewhere else.
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
Any good 8 bay NAS with 4TB drives
Raid 6 on a separate GIGABIT sub-net in a separate location in the house with battery backup.
Qnap, Synology or Thecus all make models that should do.
I personally would go with a Qnap ts-870-pro and HGST drives.
He could have always bought a sufficiently large tape-library from ebay - but I guess the data wasn't worth that much.
That's always the first pair of questions to ask: how much is it worth and how much would it cost to recreate?
If the answer is somewhere between "I don't know" and "Well, it's not that much", then he just should stop hoarding that much stuff.
He could have built a filer with ZFS and sent daily snapshots to a 2nd filer - but that wouldn't have helped him if the house burnt down...
Windows 2000 - from the guys who brought us edlin
Who the hell even listens to or watches that much freely downloaded content anyways? Seriously, you're nothing more than a digital hoarder who gets emotionally attached to data.
If you want to back up 20TB of data, you have to pay for it.
Build another server and rsync hourly.
If the data costs more than $1200-1800 to replace, it's worth paying to have a backup.
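The "second server plus periodic sync" idea can be sketched in a few lines of Python. To be clear, this is not rsync and does none of its delta transfer; it is just a rough illustration of a one-way mirror that copies files that are new or look changed (different size or newer modification time). The function names are hypothetical.

```python
import shutil
from pathlib import Path


def needs_copy(src: Path, dst: Path) -> bool:
    """True if dst is missing, differs in size, or is older than src."""
    if not dst.exists():
        return True
    s, d = src.stat(), dst.stat()
    return s.st_size != d.st_size or int(s.st_mtime) > int(d.st_mtime)


def mirror(source: Path, backup: Path) -> list[str]:
    """Copy new or changed files from source to backup; return what was copied."""
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = backup / rel
        if needs_copy(src, dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves the modification time
            copied.append(str(rel))
    return copied
```

Note the design choice: this sketch never deletes anything from the backup side. A sync that mirrors deletions would happily repeat an accidental wipe of the primary array on the next run, which is exactly the failure in the story.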
Amazon Glacier is $0.01 per GB, or $205/month for 20TB. Clearly that becomes uneconomical fast.
A 20TB backup can be had in six 4TB drives via RAID 5 (don't want a backup drive failing, do we?).
A Drobo B800FS* is small enough (14x12x5) to be easily stored offsite and is a self-contained network appliance.
Since it has 8 bays total, there's also room for expansion to 28TB.
Total cost: 6*150 + 850 = $1750.
*You can shave off about $500 by using an off-brand RAID enclosure, but I think the software and support for the Drobo is worth it.
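The numbers in this comment are easy to re-run. A small Python sketch, using only the prices quoted above ($0.01 per GB per month, $150 per 4TB drive, $850 for the enclosure); the break-even figure is my own addition and ignores electricity, drive replacements, and Glacier retrieval fees.

```python
# Figures taken from the comment above; everything else is plain arithmetic.
GLACIER_USD_PER_GB_MONTH = 0.01
GB_PER_TB = 1024          # binary terabytes, which is what yields ~$205
DRIVE_TB = 4
DRIVE_USD = 150
ENCLOSURE_USD = 850


def glacier_monthly_usd(data_tb: float) -> float:
    """Monthly storage cost only; retrieval and transfer fees are ignored."""
    return data_tb * GB_PER_TB * GLACIER_USD_PER_GB_MONTH


def raid5_usable_tb(drive_count: int, drive_tb: float = DRIVE_TB) -> float:
    """RAID 5 gives up one drive's worth of capacity to parity."""
    return (drive_count - 1) * drive_tb


def array_cost_usd(drive_count: int) -> int:
    """Drives plus enclosure, at the prices quoted above."""
    return drive_count * DRIVE_USD + ENCLOSURE_USD


def break_even_months(drive_count: int, data_tb: float) -> float:
    """Months of Glacier storage that cost as much as buying the array."""
    return array_cost_usd(drive_count) / glacier_monthly_usd(data_tb)


if __name__ == "__main__":
    print(f"Glacier for 20TB: ${glacier_monthly_usd(20):.2f}/month")
    print(f"6 drives, RAID 5: {raid5_usable_tb(6)} TB usable for ${array_cost_usd(6)}")
    print(f"8 drives, RAID 5: {raid5_usable_tb(8)} TB usable")
    print(f"Break-even: {break_even_months(6, 20):.1f} months")
```

On those assumptions the array pays for itself in under nine months of Glacier storage fees.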
Figure out the theory of everything.
Then you can always recompute your data from scratch.
If Pandora's box is destined to be opened, *I* want to be the one to open it.
With a second array, or tape backup. The second array is going to be the easiest solution, but tape backup provides you the option of storing the tapes off-site, which is important for any real backup plan. After all, your friend could just as easily wipe out the 2nd array by mistake, or a disaster could wipe out the physical location. LTO-6 tapes are cheap and can hold 2.5-6.25TB of data depending on compression. Tape drives are perfect for backup, so why even ask if it's right?
That's not very much data. I have at least 75TB worth of hard drive space and it just keeps getting cheaper and cheaper. To be fair though, I run a repair shop lol, so I get a lot of extras people don't want or want to trade, etc.
http://www.amazon.com/s/ref=sr_nr_p_n_feature_five_bro_0?rh=n%3A172282%2Cn%3A!493964%2Cn%3A541966%2Cn%3A1292110011%2Cn%3A595048%2Cp_n_feature_two_browse-bin%3A5446812011|5446813011|7817230011%2Cp_n_feature_five_browse-bin%3A2419644011&bbn=595048&sort=price-asc-rank&ie=UTF8&qid=1394631960&rnid=2419643011
See what I mean.
As you noted, Blu-ray holds a lot of data, but would take some time. Since it's audio/video media, odds are most of it is pretty stagnant. I'd do an initial rsync job to write out to Blu-ray... then once a month or so repeat the job, but now rsync will only get what's changed. Depending on the media type and age, you could also look at dedup'ing it, and if the dedup'd copy is significantly smaller than the source you might be able to put that onto, say, one or two 3-4TB drives.
Music is ephemeral. Use streamtuner. There is no need to save copies of muzak.
Bitcasa
I use ZFS on NAS4Free at home and have two 48TB arrays. The second array is at a neighbor's house; I am using MikroTik SXT PTP links, and in trade for keeping my secondary server at his house he gets internet and access to the movie storage/backup array. With ZFS I am not worried about a RAID failure: I just had a controller card fail and kill two drives in each of my pools. I didn't have any problems rebuilding the array, and had I, I would have just pulled from the backup server. Also, with ZFS you get RAID-Z2 along with snapshots, which have been very handy at client locations for recovering deleted documents (more than once from a disgruntled employee). All of our machines also back up to a backup area; the kids have more than once gotten malware, and restoring from a snapshot is easy!
Cheers!
I have a LaCie 5Big for my Mac. Using RAID0, it gives me a full 20TB of storage. I'm using it with two drives in RAID1 configuration, and three drives in RAID0, giving me a total of 16TB mounted as two volumes. Cost was just under $2k US.
Delete the porn you don't actually watch.
Don't RAID the storage. Use (and connect) drives individually and keep a sane index in order to quickly search and access the proper drive. Yes, it's a bit more hassle, but so is losing 20TB in a RAID accident. I personally don't even see why one wouldn't consider the problems and dangers with RAID of this scale instead of just being a bit more careful, but then again I'm the pragmatic type, not the over-engineering nerd type.
Ironic since from your description it would appear the RAID architecture served sufficiently well here (as it should have). It would appear you are seeking a solution to operator error, not equipment failure or other acts of God. Good luck.
You could always just call up the NSA and ask them to restore the data. Odds are good they have a copy of it...
Same as always.
One byte at a time.
Whenever you buy storage, you should buy the necessary backup capacity at the same time. You should never buy storage without buying backup capacity. Budget for it right from the start. If you can't afford the backup, you can't afford the storage. This may mean getting half as much storage as you'd like, but that's just the way it has to be. You probably wouldn't buy a car without an engine. It wouldn't do its job. So don't buy storage without backup. If you do, you have a storage system that can't do its job.
AC, please point us mere mortals in the direction in which we may find these DVDs with a storage capacity of 4TB...
I would consider: http://www.makeuseof.com/tag/b... ... and then plug a big array into the pi. Then host that in someone else's home (someone I trust of course!).
I agree. I've been using Crashplan for three years and the unlimited space is really great, BUT... I'm not sure about the bandwidth they provide: how long will it take to upload 20 TB?
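A back-of-the-envelope answer to the upload question, as a Python sketch. The uplink speeds in the loop are assumptions picked for illustration, not anything Crashplan publishes, and the calculation uses decimal terabytes and ignores protocol overhead and throttling on either end.

```python
def upload_days(data_tb: float, uplink_mbit_per_s: float, efficiency: float = 1.0) -> float:
    """Days needed to push data_tb (decimal terabytes) through the uplink.

    efficiency is the fraction of the nominal line rate you actually get.
    """
    bits = data_tb * 1e12 * 8
    seconds = bits / (uplink_mbit_per_s * 1e6 * efficiency)
    return seconds / 86_400


if __name__ == "__main__":
    # Hypothetical home uplink speeds, in Mbit/s.
    for mbit in (1, 5, 20, 100):
        print(f"{mbit:>4} Mbit/s: {upload_days(20, mbit):8.1f} days")
```

Even at a sustained 100 Mbit/s the first full upload of 20 TB takes between two and three weeks; at 1 Mbit/s it is measured in years, which is why seed drives keep coming up in this thread.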
Anyway, I don't see what's the problem in using external drives for backup. Here in my lab I've realized that the best way to backup X Terabytes is to have another storage with X Terabytes...
That would be more like 5000 DVDs.
ok, but it's not like he lost anything he owned. I think what he's worried about is the 40TB of bandwidth (assuming 1 up 1 down sharing) required to replenish that.
I'd like to plug BackBlaze. I've been using it for a while now and it's fantastic. It saved me from going down the whole RAID NAS / DAS route which isn't really backup because of fire / theft.
Here's my brief blog post on why I chose it...
http://www.bentristem.com/1/post/2014/03/offsite-backup-thats-easy-to-setup-geek-proof-and-secure.html
You can even make copies and leave them in a safety deposit box! Of course the real question is what data that was lost can be replaced (personal collections of movies can always be "found" again). Perhaps the backup plan for 20TB is unnecessary when smaller storage could do.
If you set up a RAID6 system and keep tabs on it, replacing drives as they come and go, then you'll probably be OK unless you do something stupid, or have a fire, flood, or whatever. The ONLY way to really have backup of important information is geographically separate redundant copies.
Google Mail offers 15GB of storage space.
So hjsplit everything into 25MB chunks and upload it.
You'll need 1366 different accounts, though.
From my own (painful) experience: if you don't plan for it up front, you are always fighting fires (playing catch-up). Organizing your data can help a LOT! If it is media, arrange it by genre (e.g. video animation or video classical or whatever) to keep a particular grouping small enough to backup easily. If it is data, arrange by some category that works for you (e.g. current financial projects or past analytic projects).
The most useful guide I have found for resources allocated to backup: how much is it worth to me to re-create this resource? ("Worth" can be money, time, sentiment, or any other measure(s) or combination you chose.)
My current feelings: disk is the most versatile and cost effective.
Obviously you get 5,000 women pregnant, and ask each one of them to backup just one DVD-R!
It's not like you could just plug in an external drive, and using any cloud service would be terribly expensive. Blu-Ray discs can hold a lot of data, but that's a lot of time (and money) spent burning discs that you likely will never need. Tape drives are another possibility, but are they right for this kind of problem? I don't know. There might be something else out there, but I still have no feasible solution.
Let's start from the top: You *can* plug in an external drive; it's called a complete hardware duplicate of your array (or perhaps, for space/cost considerations, a single disk-based copy held offline and synced regularly). Not hard and not terribly expensive (I would go with this solution personally). Cloud? Yep, the bandwidth and storage even on something like Amazon Glacier would be prohibitive to all but the most financially independent geeks. Blu-ray doesn't hold enough (even at 50GB/disc you need 400 of them, groan). So, tapes? You bet your ass tapes are designed to do exactly this task; why do you think they are still in use? You can get individual tapes at 1/1.5TB, but for a one-man operation they are probably going to cost you more than the first solution (offline spinning disks), and they are a pain to manage properly.
Now what is this doing on ask slashdot? A pencil, some scratch paper, and 15 minutes between amazon.com and newegg.com would tell you the prices of every solution. Oh, right, they need a chance to tee up some targeted ads for Carbonite, Mozy, Crashplan, etc.
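The pencil-and-scratch-paper part can be sketched in Python. The 50GB disc and the 1TB and 1.5TB tape capacities come from the comment above; the 4TB hard drive is an assumption borrowed from other comments in this thread. Prices are left out on purpose, since none are given here.

```python
def units_needed(data_gb: int, unit_capacity_gb: int) -> int:
    """Ceiling division: a partly filled disc or tape still counts as one."""
    return -(-data_gb // unit_capacity_gb)


DATA_GB = 20_000  # 20TB in decimal gigabytes

MEDIA_GB = {
    "Blu-ray, dual layer (50GB)": 50,
    "tape (1TB)": 1_000,
    "tape (1.5TB)": 1_500,
    "hard drive (4TB, assumed)": 4_000,
}

if __name__ == "__main__":
    for name, capacity_gb in MEDIA_GB.items():
        print(f"{name}: {units_needed(DATA_GB, capacity_gb)}")
```

Multiply each count by whatever the media costs this week and the comparison is done.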
How about backing up only the crown jewels of the collection?
Make a directory like /entertainment/premium and put the best stuff there, with a 4 TB limit. Rotate two external 4 TB HDDs and copy the stuff over periodically. Put a little sticker or some other mark on the newest, so you remember which one it is. If your main RAID array fails, build a new one, and restore the premium stuff from the most recent one of the two external disks.
What, no one mentioned duplicity yet?
http://duplicity.nongnu.org/
Why, yes! I AM new here.
You can get 10, yes 10TB of free cloud storage from a Chinese organisation
http://www.weiyun.com/
Create two accounts, job done.
It's quite a complicated signup - but there are guides. You need to create a login on the QQ site and activate it with a smartphone app - something like that.
You'll have the Chinese authorities looking at your data instead of the NSA!
http://cdn.crushable.com/files...
.
These "unlimited" claims always turn out to be lies. When will we learn?
My friend paid for an "unlimited" account from JustCloud for backup. He stored 1.8 TB on it and then they "fair use"'d his ass and canceled his account. They didn't even give him a refund for the rest of the money he prepaid.
Simply compress and encrypt your backup data, then post it on a torrent tracker as "New Julian Assange insurance file, decryption key to be released if extradited". Thousands of other people will make backup copies for you.
Well, Double layer blu ray can be a way of archiving for at least once without continuous updates.
It will take about 400 double layer discs (at 50GB each). Not sure if it will survive 50 years, but I'm sure it will be enough until a new magnetic or optical technology comes to replace it in, let's say, 5 or 8 years.
I use Glacier and it's great. 20 TB is about $200 a month, which to me does not seem like all that much money for backing up that much data. The biggest problem from a home user's perspective is getting all of that data to Amazon. Hopefully he lives somewhere where fiber is available to his house.
md prnt dwn
http://www.mdisc.com/what-is-mdisc/
Connect a raspberry pi and configure it as a backup server and let it copy all to /dev/null... ...
Then put aside the money you would have invested in a "better" solution, put it in a safe bank (under your mattress)
and wait until you need to restore something..
Most probably you'll enjoy the money more
I can sort of feel for your friend. I once lost about 20GB of music due to not having a backup -- the ex-wife had nearly all of the original CDs, so there was no way to get it back.
However.
Let's say the entire collection was movies. At .7GB per movie, that would be about 28,000 movies. At an average cost of about $15 per movie that's $420,000 worth of movies. So, your friend has spent $420,000 on movies in his lifetime? That seems a little far-fetched.
If the entire collection were music, at 3MB per song, that's about 6,666,666 songs. At $1 per song, your friend spent over $6 million on music in his lifetime? Again, that seems a little fantastic.
If my numbers are wrong someone please correct me.
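For anyone who wants to check, here is the same arithmetic as a short Python sketch. It assumes decimal units (1TB = 10^12 bytes) and uses the per-item sizes and prices from the post; nothing else is added.

```python
TB = 10**12
GB = 10**9
MB = 10**6


def item_count(collection_bytes: int, bytes_per_item: int) -> int:
    """How many whole items of a given size fit in the collection."""
    return collection_bytes // bytes_per_item


COLLECTION = 20 * TB

if __name__ == "__main__":
    movies = item_count(COLLECTION, 700 * MB)   # 0.7GB per movie
    songs = item_count(COLLECTION, 3 * MB)      # 3MB per song
    print(f"{movies:,} movies, about ${movies * 15:,} at $15 each")
    print(f"{songs:,} songs, about ${songs * 1:,} at $1 each")
```

The exact figures come out slightly above the rounded ones in the post (28,571 movies rather than 28,000), so the estimates hold up.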
I know just about everyone has SOME illegitimate media on their systems, but perhaps your friend's real problem isn't a lack of backup capability, but a lack of karma. :-)
-=Skip
When you have that much data your only real viable option is to have a second storage array just as large. RAID 5/6 isn't backup. It provides fault tolerance which means you can still access (read/write) the array as you normally would if a disk fails.
It's almost a no-duh solution in lieu of tape or other cumbersome removable storage options. Even backing up to 50GB or 100GB Blu-ray discs would be rather pointless, as the cost of a single disc is the cost of buying a movie on Blu-ray. Even if you could fit 4+ movies on a 100GB BD-ROM, is it worth the hassle and cost?
Why is another array better? It's quite simple. You don't have to shuffle discs or tapes to make backup sets. You also aren't stuck with a format that could become obsolete. That LTO tape drive might look good, but what if it fails? Can you find another for a reasonable cost? Will you periodically test your backup tapes or disks for bit-rot? A tape or BD-ROM rotting in a safe deposit box, safe or shoe box under your bed is useless. If a disk fails you can replace it quite easily, as almost every disk supports SATA or SAS and controllers are found on every motherboard. A failure of a disk will be reported and you can handle it.
Here is another question that is kind of burning in my head. If your friend legally purchased all of his movies and music, wouldn't he have the original sources? If not (and I'm not judging; I have a collection of both music and movies that I pilfered over the years), you have to be smart and make copies. My collection is just about 2TB, mostly some hard to come by movies and TV shows (entire MST3K library). My home server used to be on 24/7 but it was a waste of power. I almost lost that array of 5 500GB disks until I made copies to 1TB drives. I now have a few 2TB drives that have copies of the server data on them. One drive is even at my place of work in a USB box. Even if my home burns down I still have a copy somewhere.
15 million of them.
Troll is not a replacement for I disagree.
Even though he has 20 TB of data, there's no chance all that data is irreplaceable.
That's the only thing backup is for, irreplaceable data. When you have large amounts of data, you start evaluating what needs backup and not.
Recently crashed a 2 TB disk myself. Barely shrugged from it, even with only a few hundred MB worth of backup (private photos, videos and documents).
Why? Because most of the data was either easily replaceable or not important enough to warrant a backup.
I actually considered the disk crash nothing worse than spring cleaning. New disk, restored what was irreplaceable.
Tape backups are the cheapest way to go as far as media, and surprisingly are making a comeback due to high storage requirements. They can be expensive as far as hardware and software, depending on what you buy. We back up about the same amount of data in our production environment for offsite storage. The latest tapes can hold 4 TB per tape.
Catalogue the contents and when you lose it all you can spend 10 minutes searching for the 2% of the content you really want to download again and feel good that you now have 98% of your storage space back to start filling with more crap :D
Might be big enough for that.
lot of time (and money) spent burning discs that you likely will never need
If you have any data, over a long enough period of time you WILL need a backup. Saying "I will likely never need this backup" is a non-sensical statement, because (a) you probably will, and (b) the cost of NOT having the backup is essentially infinite in pain and grief.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
Even if 100% of that 20TB is legally owned content, recovery is a huge process: re-ripping hard media is still awfully slow -- if you can even find where you stashed it (I think a few CDs have walked off the reservation).
Purchased digital media is no better: you've got many sources to find it from, and it may disappear: preview tracks, live tracks, etc. may disappear when they stop updating their MySpace, or a local distributor goes belly-up. That's also assuming you're still using the same providers: if you had download privs on some of the music servers of the 2000's, you'd have 'ownership' of that media, but you may not be able to get it again if you aren't still paying for the account.
The most economical and reliable is probably a mirror RAID array. It sounds like this guy accidentally issued a command to erase the content, rather than a RAID failure. Ordinarily, the RAID should be good for most stupidities, but this falls a little outside that. The question is, if you have mirroring software, how frequently does it try to match, and would it clean off the mirror too?
Design for Use, not Construction!
20TB of music and movies? How many of those could be downloaded again tomorrow? My guess is "most." The only thing that's really "mine" on my computers, and not backed up, is my own pictures. I upload those to image sharing sites on the internet. Most docs are done on Google Docs for portability reasons, and other things I've created are already on Dropbox.
I experienced a catastrophic hard drive failure a year or so ago. After replacing the hard drive, and about one day of downloading and installing the programs I needed, I was up and running again. It took 24 hours to download enough of the series that I was watching to pick up where I left off again. And if I ever get a hankering for watching something I've seen before, well, I can get it from the internet again in a matter of hours or days.
It may look like I'm doing nothing, but I'm actively waiting for my problems to go away.
--Scott Adams
Very very slowly. Who pulled the whelps?!
I was going down the route of buying an expensive RAID NAS / DAS, but then I remembered when I got broken into in the Canary Islands and the thieves took both of my backup drives from two separate rooms. I'm now settled on a simple external drive, with the whole lot backed up offsite. I was looking for... + Unlimited backup, so I don't need to think +The ability to backup attached drives (NAS, DAS, USB, etc) + To feel that my data is safe with a 2nd layer of encryption You can try it free here: http://bit.ly/1bRNax1 My blog post about this: http://www.bentristem.com/1/po... Enjoy!
Ben Tristem I'd love to know more about you in this short survey... http://bit.ly/1oM7Fvl
No one will ever see this anonymous post, but a cheap robot changer (used) on eBay can be had for between a few hundred and a few thousand dollars. Most of us are geeks and love technology. I use two such devices, couldn't imagine life without them. LTO4 is still the sweet spot in storage cost (media) and capacity. The tapes hold 800GB and can be purchased for around $22 each.
"Blu-Ray discs can hold a lot of data, but that's a lot of time (and money) spent burning discs that you likely will never need."
Any form of backup is there in case the very worst case scenario happens, so you have answered the question: if you want to save it, is it worth the effort?
If it's 20 TB of movies, TV shows and music then maybe he purchased hard copies legitimately. That is your backup.
1- if you need to backup 20 TB today, you need to budget for 40TB in the medium term.
2- a backup is off-line, off-site, tested, and multiple. The "multiple" part is pricey, and the other 3 you can get cheapest with a PC filled with HDs. Or two (I'm making do with one). $200 for the PC, $150 per 4TB HD x 5 = $950. Hide that backup in a place safe from theft, floods, fire...
The Cloud - because you don't care if your apps and data are up in the air.
Call the NSA and see if they'll do a restore on your data.
Tell him to stop hoarding. Streaming is good!
The only irreplaceable data that I possess is digital home videos and digital photos of the family/kids. I have a lot, but it's super easy to back up to an external HDD that I keep locked up at work. To protect against HDD failure between those monthly off-site backups, I replicate the data partition on our main PC to a second PC in my house using a scheduled Sync Toy job.
Cheap, very effective.
Safety deposit box, Iron Mountain, your trustworthy friend who lives far away but not so far you couldn't drive there and back in a day, and so on.
If there's a twister with your home address written on it, all the onsite backups in the world won't help.
The same way they got the music: seed it to the internet.
Why would it be a problem? If he owns all the music/videos, the worst he has to do is re-rip all of it. That's just time; it sucks, but that's life. At 20TB of data, there is no good backup solution, not one that is affordable or reasonable.
just label it as terrorist plans and they'll back it all up from you for free!
First "LOL", sorry had to get that out of the way.
For that much data, and to be reliable, the 1st choice is tape; the 2nd choice is multiple hard drives, and by that I mean multiple backups, not just one; the 3rd choice is Blu-ray discs. After that, backing up starts using bandwidth: Amazon Glacier is cheap to upload to and store on, but they get you when you want it back.
Or you could sign up for Mega at 50 GB a pop.
20 Mega accounts = 1 TB, yes go ahead and laugh but I have a terabyte of storage online for free. (use their sync app)
I used a Gmail account to sign up, so let's say your Gmail is "turtle@gmail.com". Well, did you know you can use "turtle+01@gmail.com" and it will go to "turtle@gmail.com"?
Yeah it will, so my Mega sign up scheme is turtle+01@, turtle+02@, etc, and I can manage it all from "turtle@gmail.com"
Finally I would like to offer an apology to "turtle@gmail.com", nothing personal it just popped into my head.
"If any question why we died, Tell them because our fathers lied."
20TB of backup drives.
Not that hard, although if it was really important I suggest 40TB of backup drives and 2 full backups. I'm a fan of tape for large capacity. SDLT600 would be his best solution, with a small 10-tape carousel. Do a full backup monthly and then incrementals every week.
20TB means he has to spend money. If someone freaks out at the cost of a real backup solution, then the data was worth less than the backup.
So: manually back up to 20 2TB hard drives on an eSATA interface, or use an automated tape system. Either way it is not going to be cheap.
Do not look at laser with remaining good eye.
There are many > 1TB tape back up systems, many with very high speeds, assuming you can feed it data fast enough.
I have to wonder though.. 20TB for a single person? I'm not gonna do the math but that sounds like so much stuff to be impossible to listen/watch all of it.
But at least he has proven once again, RAID is not a backup. RAID will merrily do what ever you wish, including copying drive corruption.
ever hear of punch cards!
Most data that people have on their hard drives can be readily re-obtained via BitTorrent or in other ways. The simple and probably best strategy is to figure out the 500 GB or less that is actually irreplaceable, and make several copies of that. I have three or more copies of my most important data.
Or, looking at the problem another way, 4 TB hard drives are selling for $160 right now including shipping. A complete insurance policy would cost $800 plus your time. What I would have done if I just had to save everything would be to simply copy all of the data in 4 TB hunks, and put each hard drive one by one into a fireproof safe, or in a safe-deposit box at the bank. A second RAID would be complete overkill, unless time to recovery is of the essence or the data churn rate is high. More than 90% of my data simply accretes over the years, and I'm sure that is true for most people.
$800 is a small price to pay for your data. I seem to recall that it cost a company I worked for over $1,000 to recover a 9 GB IBM hard drive that failed about 15 years ago.
According to this article, Seagate is promising 20 TB hard drives by 2020:
http://www.computerworld.com/s...
Solution: Find a friend who has a collection very similar to yours. Make a gentleman's pact and rsync them, serving as an active backup for each other.
I don't know about you, but I'm pretty sure my ISP would flip out if I tried to transfer even 1TB in a month. Even if they didn't care about the amount of data being backed up, it would still take me around 231 days to upload that much. Any kind of online backup would be infeasible for the initial dataset, but it's also probably not a great option to ship in a box of hard drives.
Let's be honest: any large dataset like that is going to cost some serious coin to backup. You can probably "cheat" by incrementally backing stuff up to Crashplan (with its "unlimited" storage), but it'll take so long to seed that initial dataset that you're likely to experience some kind of data loss before it's done.
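For anyone who wants to sanity-check upload estimates like the 231-day figure above, here is a rough Python sketch. It assumes "that much" means the full 20 TB, decimal units (1 TB = 10^12 bytes), and a link that runs flat out with no protocol overhead. The 8 Mbit/s rate is purely a hypothetical value picked because it lands near 231 days; it is not something the poster stated.

```python
def transfer_days(n_bytes, megabits_per_second):
    """Days needed to move n_bytes over a link sustaining the given rate."""
    seconds = n_bytes * 8 / (megabits_per_second * 1_000_000)
    return seconds / 86_400


if __name__ == "__main__":
    twenty_tb = 20e12  # 20 TB in bytes (decimal units, an assumption)
    for rate in (1, 8, 100):  # hypothetical upstream rates in Mbit/s
        print(f"{rate:>4} Mbit/s: {transfer_days(twenty_tb, rate):8.1f} days")
```

Real uploads will be slower than this, since it ignores throttling, caps, and the connection being used for anything else.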
There is a difference between "insightful" and "inciteful" other than spelling.
The first thing you need to do is to categorize your data both by type and by common content groups (e.g. audio tracks by artist, photographs by family member, etc.).
Next, start backing up your data one group at a time with only one group per backup device -- hardware is cheap.
When you are done you will have a segmented backup that should be stored off site in order to protect you from a localized catastrophe.
Every night you should back up any new or altered content to removable media which you can take with you.
Periodically bring back all of your media and add the content of the incremental backup to the appropriate media.
Now, as to media. My own preference would be for external hard drives because they are inexpensive, reliable, and readily available.
By the way, with 20TB of data it would be a good idea to build a small locator database so that you can easily locate and retrieve specific content.
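A locator database like the one suggested above could be as small as one SQLite table. This is only a sketch: the table layout, the function names `build_locator` and `locate`, and the idea of giving each backup drive a volume label are my own assumptions, not anything specified in the post.

```python
import os
import sqlite3


def build_locator(db_path, volume_label, root):
    """Record every file under root as living on the backup volume volume_label."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "volume TEXT, path TEXT, size INTEGER, PRIMARY KEY (volume, path))"
    )
    rows = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            rows.append((volume_label, os.path.relpath(full, root), os.path.getsize(full)))
    # Re-indexing the same volume simply refreshes its rows.
    conn.executemany("INSERT OR REPLACE INTO files VALUES (?, ?, ?)", rows)
    conn.commit()
    conn.close()
    return len(rows)


def locate(db_path, pattern):
    """Return (volume, path) pairs whose path contains pattern."""
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(
            "SELECT volume, path FROM files WHERE path LIKE ? ORDER BY volume, path",
            (f"%{pattern}%",),
        )
        return cursor.fetchall()
    finally:
        conn.close()
```

Run `build_locator` once per drive right after filling it, and keep the database file somewhere other than on the backup drives themselves.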
Have fun.
Jerry
As in "How Do You Back Up 20TB of Data?" "Backup" is a noun. Verb conjugations are "back up", "backs up", "backed up", etc.
It would be relatively easy to back this up on "only" four 4TB disks. Each could be in its own USB3 enclosure, or they could go in an outdated PC (Pentium 4 or something) that is turned on for backups only, whatever.
A simple mechanism to make them appear as a single ~16TB volume or directory would be nice. Or perhaps optional. Or just use some real backup software.
Maybe the backup will be so painfully long (days?) that a drive failure may be a concern.
On another note, I'd like a very easy and nice-to-use program that simply backs up the file names, etc. I can more easily afford to lose music/movies if I have a list of what I actually had, so the good, easy-to-find stuff can be found again and reconstituted.
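A names-only backup like that is a few lines of Python. This is a minimal sketch, not an existing tool: `write_manifest` and `missing_files` are made-up names, and the tab-separated "size, relative path" format is just one possible choice. Keep the manifest file outside the tree it describes.

```python
import os


def write_manifest(root, manifest_path):
    """Write one 'size<TAB>relative path' line per file under root."""
    count = 0
    with open(manifest_path, "w", encoding="utf-8", errors="surrogateescape") as out:
        for dirpath, dirs, names in os.walk(root):
            dirs.sort()  # walk subdirectories in a stable order
            for name in sorted(names):
                full = os.path.join(dirpath, name)
                out.write(f"{os.path.getsize(full)}\t{os.path.relpath(full, root)}\n")
                count += 1
    return count


def missing_files(root, manifest_path):
    """List manifest entries that no longer exist under root."""
    missing = []
    with open(manifest_path, encoding="utf-8", errors="surrogateescape") as f:
        for line in f:
            _size, rel = line.rstrip("\n").split("\t", 1)
            if not os.path.exists(os.path.join(root, rel)):
                missing.append(rel)
    return missing
```

After a loss, `missing_files` gives you the shopping list of what to find again. File names containing newlines would confuse this simple format.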
I've used Crashplan for years at clients, with friends, and personally, and it's generally been good. They have 2 options that may work here.
The first is their all-you-can-eat backup service, but they may well balk when you tell them it's 20TB; they might shove you to a $120/year business plan.
The other is buying a pack of Crashplan ProE licenses, which let you host your own cloud backup service. You can use any PC as the "server" (just make sure it's reliable and on 24/7) and it handles diffs like a champ. It also verifies backups to avoid bit rot.
If he possesses 20TB of digital copies of legitimately backed-up media ripped from CDs, DVDs and Blu-ray discs, it shouldn't be that big a deal, should it? Just start ripping again as he needs them.
Just assuming that your friend had a fully legal collection, I would think that all he needs to do is ask the media companies for a new copy. Because the media industry tells us that we do not buy music, we buy licenses, right? So even if we lose the bits and bytes, which are easy to replace, we still hold a license, and the media companies should make sure your friend can exercise his licensed rights.
[/sarcasm]
To Terminate, or not to Terminate, that's the question - SCSIROB
he didn't have a good way to backup that much data
But he did. Another RAID array of the same size would have sufficed. Oh, now I see what you mean. He didn't want to spend the money on a good way to backup that much data.
Another issue entirely :-)
8 of 13 people found this answer helpful. Did you?
Having a similar setup myself, and having looked into exactly your question, you have exactly one realistic answer:
You back it up to an identical (or larger) disk array.
If possible (though not necessary), you'll want to do the initial backup with both arrays directly connected to the same host; but after that, just rsync --link-dest (to make hardlinked differential snapshots) them on a nightly basis.
For a media server, where the typical use case consists of adding large files slowly over time and little ever changes, your backup shouldn't take up much more room than the primary storage.
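To show what the hardlinked snapshots buy you, here is a pure-Python sketch of the same idea. It is not rsync and not how rsync is implemented: the function name `snapshot` is made up, it uses `filecmp` to decide whether a file is unchanged, and it assumes each snapshot goes into a fresh directory on the same filesystem as the previous one (hardlinks cannot cross filesystems).

```python
import filecmp
import os
import shutil


def snapshot(source, dest, previous=None):
    """Copy source into dest, hardlinking files that are unchanged since previous.

    Returns (linked, copied) counts.
    """
    linked = copied = 0
    for dirpath, _dirs, names in os.walk(source):
        rel_dir = os.path.relpath(dirpath, source)
        os.makedirs(os.path.join(dest, rel_dir), exist_ok=True)
        for name in names:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dest, rel_dir, name)
            old = os.path.join(previous, rel_dir, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=True):
                os.link(old, dst)  # unchanged: share the data already on disk
                linked += 1
            else:
                shutil.copy2(src, dst)  # new or changed: store a real copy
                copied += 1
    return linked, copied
```

Each snapshot directory looks like a full copy, but an unchanged file only takes up space once, which is why a slowly growing media collection costs little more than one copy's worth of disk.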
Seriously, who needs 20TB of data at home? This is like a digital version of "Hoarders" or something. Time to clean house and organize.
First, it's time to TAKE OUT THE TRASH. I'll bet the large majority of this data is stuff you never use, don't know you have or is simply out of date and unnecessary. Toss it.
Second, de-duplicate what's left as best you can. No need to have multiple copies of the same pictures at different resolutions, or the same video encoded multiple ways in your backups. Keep the best resolution stuff in your backup, forget the rest. Don't back up anything you can re-rip from the original media (i.e. that DVD collection. Oh, don't have the DVDs anymore? Turn yourself in to the MPAA...)
Third, Compress what's left.
If you find that 20TB is what you need to keep, then stop asking Slashdot for advice and go buy yourself a professional tape drive and some brand new tapes and start doing backups like a professional. If this is too expensive, start over at step 1 and really take out the trash this time.
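For the de-duplication step, a rough Python sketch of the easy half of the job: finding byte-identical files by size and SHA-256. The function names are made up for illustration, and this will not catch the harder cases mentioned above (the same picture at different resolutions, or the same video encoded two ways); those need content-aware tools.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large videos don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return groups (lists of paths) of byte-identical files under root."""
    by_size = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) > 1:  # only hash files that share a size with another file
            for path in paths:
                by_hash[file_digest(path)].append(path)
    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Grouping by size first means most files never get hashed at all, which matters when the collection runs to terabytes.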
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
If IBM punch cards were used, 1 GB equals approximately 47 cubic yards (assuming 80 bytes per card and 187 x 86 x 0.18 mm per card) and about 70,000 lbs (at 2.42 g per card), so one standard railroad boxcar (limited by both cubic capacity and weight) could hold about 3 GB. 20 TB would need over 6000 boxcars of punch cards; at 60 feet per boxcar, that's a freight train about 70 miles long.
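The punch card arithmetic above checks out. Here is a small Python sketch that redoes it from the same inputs (80 bytes, 187 x 86 x 0.18 mm and 2.42 g per card, about 3 GB per boxcar, 60 feet per boxcar), assuming decimal units for GB and TB. With unrounded intermediate values it comes to roughly 47 cubic yards and 67,000 lbs per GB, and 6,667 boxcars or about 76 miles of train for 20 TB, close to the rounded figures in the post.

```python
import math

BYTES_PER_CARD = 80
CARD_VOLUME_M3 = 0.187 * 0.086 * 0.00018  # 187 x 86 x 0.18 mm per card
CARD_MASS_KG = 0.00242                    # 2.42 g per card
CUBIC_YARD_M3 = 0.9144 ** 3
POUND_KG = 0.45359237


def punch_card_stats(n_bytes):
    """Card count, volume in cubic yards and weight in pounds for n_bytes."""
    cards = n_bytes / BYTES_PER_CARD
    return {
        "cards": cards,
        "cubic_yards": cards * CARD_VOLUME_M3 / CUBIC_YARD_M3,
        "pounds": cards * CARD_MASS_KG / POUND_KG,
    }


def train_for(n_bytes, bytes_per_boxcar=3e9, boxcar_feet=60):
    """Number of boxcars and train length in miles needed to hold n_bytes."""
    boxcars = math.ceil(n_bytes / bytes_per_boxcar)
    return boxcars, boxcars * boxcar_feet / 5280


if __name__ == "__main__":
    print(punch_card_stats(1e9))  # figures for 1 GB
    print(train_for(20e12))       # boxcars and miles for 20 TB
```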
With four Drobo B1200i units, 2x local and 2x offsite, silly goose!
70 TB tape backup; http://www.zdnet.com/blog/stor...
>>"ad space available -- low rates!!!"
You would be looking at either 20 TB of distributed storage across multiple hard drives (not recommended, but a valid option for those of us who don't own a RAID server yet), or secondary storage (like a second RAID array server or Drobo box that gets backed up to once a month).
I'm nowhere close to 20 TB, but even without a RAID array, if I had a hard drive fail, I have all of my data either backed up on unplugged hard drives (in case of a power surge), on its original disks (optical media), or in the cloud (Google Music / Docs / OneDrive / etc.).
The bigger issue here is versioning. It sounds like your friend has a decent setup but should set up some sort of redundant copies of files (like what Time Machine and Windows Backup kind of offer) so that if you somehow deleted your files, you could just point to a specific backup and press restore (you would obviously want more space). That way, if your drive somehow got wiped (other than by a complete format of the disk), you should be able to recover, since it would be on its own hidden partition.
Take the storage media he has, and then duplicate it.
Really, was that hard?
The Kruger Dunning explains most post on
NAS
First, get organized - what can you absolutely not lose, what would be inconvenient to lose, and what do you just not care about...
Things that can absolutely not be lost (email, accounting and tax stuff, projects documentation source materials and deliverables, photo library, etc) get backed up 2x with one or both backups offsite.
Things that would be annoying to lose but could be re-built (system software, settings, VMs that are doing something, etc) get backed up only once (twice if you have business continuity needs but I was assuming a home user)
Things that you don't care about (test VMs, non-keeper photos, downloaded videos, etc) are not backed up.
Then you can reduce backup times by doing stuff like making sure your email is in Maildir format, not mbox or PST files (which use one big file for everything, and this file changes every time you look at an email).
1 Synology DS1813+ - $1000
6 4TB drives (RAID 5) - $1000
Done for $2000, and you have 2 more drive slots to expand into. If the stuff you're backing up isn't worth $2000, just get 5 4TB external drives and use them. Bonus if after the initial backup you put the Synology at a friend's house & use it remotely.
This is where the "physical media" is obsolete BS hits the wall. I keep all of my video on a disk array for convenience. For emergencies I have the originals on disk. Gasp! Yes I purchased them on actual physical media. Since I know nobody here pirates their video, I'm sure this will work for them too. If you download all of your paid video, then I guess you should budget for a decent backup too. Pretty much eliminates any cost savings though.
This is Slashdot, until someone makes a Beowulf cluster of punch-card processing machines, we can't call ourselves geeks!
And, someone needs to compute how many punchcards it would take to back up Google. Oh yeah, I'll just google that:
Let's assume Google has a storage capacity of 15 exabytes, or 15,000,000,000,000,000,000 bytes. A punch card can hold about 80 characters, and a box of cards holds 2000 cards. 15 exabytes of punch cards would be enough to cover my home region, New England, to a depth of about 4.5 kilometers. That's three times deeper than the ice sheets that covered the region during the last advance of the glaciers.
http://gizmodo.com/if-data-was...
I think it's important to know that Google's data would be three times thicker than the glaciers during the ice age. It's strangely comforting.
>>"ad space available -- low rates!!!"
20 TB is an awful lot of data for backing up over the net.
What I do is back up over the net to my brother's NAS. (He lives in another country.) I use rsync and it works like a charm. It is a bit of a bother when I have been taking a lot of pictures but as it works in the background and is traffic shaped with low priority, it is manageable. I've got a fairly slow 1Mbps/6Mbps connection, so it takes some time. 20 TB would take the better part of a year, but since I do it incrementally as I get the data, it has been manageable so far. The Raspberry Pi server at my brother's replicates it to a friend's NAS as they both have 10/10 Mbps lines.
I keep a local copy on a Raspberry Pi with a couple of USB drives, just for the fun of it.
Worst case scenario that my house burns down or similar total catastrophe: My brother copies my data to an external disk and sends that by courier to me. Downtime around 24 hours.
And, obviously it is fairly easy to restore individual files over the net.
Set up a FreeNAS box with enough space to store your data, have your server replicate to the FreeNAS box, then take that FreeNAS box off-site.
What does it mean that he didn't have "a good way to backup that much data, so he never took one"?
The concepts behind backing up data have not changed. You need to manage the size of your data to redundantly fit into the storage of your system. So either pony up the cash and time to properly store your files, stop collecting TBs of crap, or stop complaining about losing it when your system crashes.
It's frustrating to see people continuously complaining about how they have too much data to back up cheaply and conveniently. It's even more frustrating to see them complaining about losing all of their data because they didn't back it up properly.
I think that the main issue is that most people do not realistically or conservatively plan their actual storage capability. For example, it seems like 90% of computer users believe that having 4 TB of hard drive space means that they can safely store 4 TB of data.
After a conversation about scratch space, redundant drives, and timestamped backups, they then will grudgingly agree to allocate 25% of their available storage to RAID/Backup space, which obviously does not get the job done! Very few are willing to accept using 66% of their available hard drive space for RAID and Backups, which is really the minimum metric for any sort of storage longevity.
20 TB is an awkward amount of data for a non-corporate individual to be storing. It's more data than most people actually need for their media and it is getting into a very expensive price range to backup for basic music/movie content. (By expensive, I mean that it would be cheaper to just re-purchase the media rather than back it up.)
Hey, look how big my collection of movies is that I never watch and did not purchase! You don't need a home backup strategy since your usenet provider or torrent tracker already has it waiting for you in most cases...
To /.ers saying that 1TB+ tapes would be a good idea to do this backup, please:
Add some references and price of such hardware and media that would suit best home usage.
Just because the player is integrated to the spinning discs doesn't make it more electronic than tape or CDs. Even record players have been electronic for what, 70 years now? How old is subby?
Why bother? It's rarely-used, practically useless bits anyway. A quote from John Nash: "Facts are available where direct memory fails in many circumstances." In this context, that could mean use Spotify and Netflix to stream your dumb music and movies, rather than saving them indefinitely.
More media than one could actually consume in one's lifetime. Unless you're some kind of weirdo that actually likes 4GB-sized lossless encoded films.
How do you listen to 20 terabytes of music? You won't repeat a song for at least a year, at a guess.
>>"ad space available -- low rates!!!"
He should at first assess his needs for backing up files. What kind of data for home use could possibly fill 20TB? Does he need to keep a backup of everything? Either way, you could always get a second (third?) NAS server and HDDs and setup automatic backups. A NAS with 2 bays will set you back 100$ or less. Add 2x 4TB for 400$. Better yet, build a custom file server for under 200$ with a nice case with plenty of room for disks. Be sure to choose a motherboard with 4 or more SATA ports and fill it with HDDs. FreeNAS is a great OS with many options. Five 4TB (=20TB) would cost under 1000$, so for 1200$ you have a nice long term solution with great flexibility.
Seriously, where do you get 20TB from? I mean, if you rip 250 DVDs at home, you've got 1TB. So for 20, you'd have to rip 5000 (or a bit less), without further compression. If you compress them a bit, 20TB would store over 20.000 movies. So, what's in this 20TB and where is it from?
Anyway, simplest solution: 10 $99 2TB disk units and some time.
There are a few things I thoroughly enjoyed watching and would consider watching again.
Here is a list of things I would consider having stored.
The Wire
Drive
Oz
The Matrix
Here are a few things I wouldn't consider storing
Sitcoms
The Simpsons etc.
Uwe Boll movies
Things I'm on the fence about
Game of Thrones
Breaking Bad
Now, here is another thing that makes this discussion rather useless. If you like it that much, just bite the bullet and buy a copy. I know it seems wrong, but you are most likely not a starving college student anymore.
Part of your choice of solutions will depend on the nature of your data. Is it changing often? At all?
I use Bacula for my backups. My wife has a photography business and her collection of images is about 6TB and is being added to constantly and occasionally edited. The Archive is about 5TB and is stuff that is unlikely to change. Then there's the Working array, which is 1.5TB (max) and generally clocks in around 700GB. This is current work that hasn't been delivered to the client yet (RAW files from recent weddings/portraits, JPEGs where the client is still picking out what they want, PSD files for current album designs, etc). Both are on RAID5 arrays, and the Working array has a hot spare. At the end of each month, the Working folders are gone over for bodies of work that have been delivered to the client and are unlikely to change. This work is then moved to the Archive, backups are burned to DVD and also copied to an external hard drive.
For the Working backups, I have JBOD on another server. I think it's 6 1TB disks. These are set aside for different Bacula pools of volumes. There's two full backup pools, two differential backup pools, and two incremental backup pools.
Every night at 5AM a Bacula job kicks off. On the first Sunday of the month, a full backup of the Working array is dumped to the JBOD. Bacula makes a backup copy of everything (~700 GB). On the other Sundays of the month, a differential backup dumps everything that's changed since the previous full backup to another set of volumes on the JBOD (~80GB). On every non-Sunday of the month, an incremental backup copies over everything that's changed since the previous incremental backup (or differential or full backup if it's a Monday). I have two sets of these pools. On odd months, it uses Full-Pool1, Diff-Pool1, and Inc-Pool1. On even months it uses Full-Pool2, Diff-Pool2, and Inc-Pool2. This way I have two sets of backup copies of everything so I don't have to delete last month's full backup to make this month's full backup.
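The rotation described above boils down to a small decision rule. This Python sketch is only an illustration of that rule, not Bacula configuration (Bacula expresses this in its own Schedule and Pool resources); the function name `backup_plan` is made up, and the pool names are the ones from the post.

```python
import datetime


def backup_plan(day):
    """Return (level, pool) for the nightly job on the given date."""
    pool_set = 1 if day.month % 2 == 1 else 2  # odd months use set 1, even months set 2
    if day.weekday() == 6:  # Sunday
        # The first Sunday always falls on day 1-7 of the month.
        level = "Full" if day.day <= 7 else "Diff"
    else:
        level = "Inc"
    return level, f"{level}-Pool{pool_set}"


if __name__ == "__main__":
    for d in (datetime.date(2014, 3, 2), datetime.date(2014, 3, 9), datetime.date(2014, 3, 10)):
        print(d, backup_plan(d))
```

One wrinkle the sketch makes visible: the days of a month before its first Sunday already write to that month's pool set, while the full backup they build on still sits in the previous month's set.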
It works pretty well, and every morning I get an email telling me that all the backups worked fine and the arrays are stable. I know it's a little anal, but well, I couldn't imagine having to tell a bride "Hey we lost your wedding photos. Hard drive crash. Too bad." With the system I've got, unless the house burns to the ground I'm fine. And if the house burns to the ground, I've got bigger problems. I wouldn't mind an off-site solution, but I don't see how I can transfer the several TB of backup data I have at any given time someplace else, except by carrying hard drives out of the house every day, and I don't think that's something I'd be able to stick with for very long.
We don't have a state-run media we have a media-run state.
If you put it in the cloud and distribute it to everyone odds are it will remain available until the end of the world.
First of all, don't run RAID at home for data storage. RAID systems are for corporate high availability. They are inherently dangerous the moment you have to touch the config and only worth it if you need a drive system available 24/7 with hot spare. Truly stable RAID systems are also huge power hogs and heat sources. You can build highly redundant file systems for a fraction of the cost, with a small fraction of the power.
This is easily the 15th time I've heard of someone losing a huge amount of personal data to RAID. The last one I heard of lost everything, the poor fool: wedding pictures, kids' pictures, etc...
Beyond that, I have about 20TB myself. I use DFSR to keep it highly available, then a one way rsync job with no purge. That way if I mess up one of my replicas, it won't get purged from the rsync target. I take an encrypted version of the rsync target to a friend's house regularly so there's no chance of massive loss. I also back up limited encrypted data to the cloud, but only documents, code, and pictures.
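The point of the "no purge" part is easy to show in code. This is a Python sketch of a one-way copy that never deletes anything on the target; it is not rsync, and `sync_no_purge` is a made-up name. It assumes plain files and directories and uses `filecmp` to decide what has changed.

```python
import filecmp
import os
import shutil


def sync_no_purge(source, target):
    """Copy new or changed files from source to target; never delete from target."""
    copied = 0
    for dirpath, _dirs, names in os.walk(source):
        rel_dir = os.path.relpath(dirpath, source)
        os.makedirs(os.path.join(target, rel_dir), exist_ok=True)
        for name in names:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target, rel_dir, name)
            if not os.path.exists(dst) or not filecmp.cmp(src, dst, shallow=True):
                shutil.copy2(src, dst)
                copied += 1
    # Files that vanished from source are deliberately left alone on target.
    return copied
```

A file deleted by mistake on the source side survives on the target. A file that gets corrupted or overwritten on the source will still be copied over the good one, though, so this is no substitute for versioned snapshots.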
Don't add complication where you don't need it.
overthinking the joke times pi
Buy a Drobo D800FS for $899, which has 8 drive bays. Drop in 8 4TB drives @ $186 each for a total of 32TB... for $2,400 total. If your data isn't worth $2,400 to you, then forget about it.
Sure the information density is pretty low, but it lasts forever!
Ken
Choose Cheap, Quick and Correct, but you can only choose two.
Sent from my TARDIS
It likely didn't cost him a dime to build up that collection...
Ken
I thought I was bad with my media consumption. 5TB of visi media, 3TB of audio, 2TB of um... not porn. I really only back up the audio. Do you really expect to rewatch all of the other stuff? If you did how hard would it be to reacquire?
Back up all your data to stone tablets.
I do it easily every night.
If you have a friend with a large server in their basement, you can back each other up securely and quickly. You might have to physically transport one of the servers to the other for the initial sync, after that you can easily keep up with each other over a FIOS home Internet connection.
Of course, my data's stuff I actually own, which makes a pretty big difference. You might not have many friends willing to incriminate themselves by storing your malware-riddled piles of pirated music and pr0n torrents on their property.
The first step is to classify the data in two groups: what you would not want to lose at any cost, and the redundant data (movies, music, etc) that you could survive without. This is the most important step
The second step is to backup the important data using an external 1 TB drive, tape or similar.
Optionally, the third step is to delete the remaining 19 TB.
After a certain point, you have to go big or get out.
A tape drive able to handle 20 TB is going to be $3k+.
Online backup is out of the question. If it takes two weeks to backup 300 GB to Crashplan or Amazon Glacier, it'll take two and a half years for the 20 TB.
Being a Jottacloud customer for a long time, I really like their backup. Unlimited storage is 6$ per month. You can specify when to back up, and you can exclude subfolders from sync, and you can limit the bandwidth used.
I guess it's not very well known in the US, but it's been around for several years in Europe. All servers are located in Norway.
Unlimited is limited to one computer.
Jottacloud.com: Jottacloud.
(I am in no way affiliated to jottacloud)
I solved this problem by simply making copies for all my family members for their media centers. If one of my drives fails then I could simply call one of them up and do an RSYNC to recover the missing data.
Do you need to back up all 20 TB? Or is half of it crap you got from usenet/torrents?
I run a 24 TB usable ZFS array that I snapshot regularly so I can recover from an event like me being a dumbass and doing an rm -rf /Array/.
As far as backups I separate my content into 3 major categories.
original content - this stuff I back up regularly to 2 locations. It contains things like home movies, pictures, documents, etc. I copy it to a USB drive and to a cloud backup service (I use crash plan). It's stuff I cannot replace and would be devastated to lose.
rare content - stuff that's hard to find. I back this up too, but only to 1 location. It's mostly static; it consists of things that took a lot of time and effort to find but are probably still replaceable. I back it up to the cloud only.
replaceable content - stuff that's backed up already on the BitTorrent network. It's mostly media I just hoard that I download off of usenet. If I lose it, it's not a big deal.
Just mail them to get your data back? I tried it and it worked like a charm. The next day the latest copy of my document was in my mailbox.
They even had gone through the trouble of correcting a few spelling errors, a misspelled name and a glitch in the layout.
They did censor the part about privacy though.
20 TB is an awkward amount of data for a non-corporate individual to be storing.
4K movies will be out shortly. We will be looking at 50-100GB per movie. Some people want to backup their disks and have them accessible for their HTPC because its significantly more convenient.
And before you suggest, no I do not want to compress my movies to a lesser quality... That's why I got the BluRay in the first place. Because I want high quality. I watch them on a 136" projection screen. I can tell when it's been compressed...
Redundancy is the way you do it.
You have an identical system as the first one, with 30TB of drive space, and every night you copy the data over to the other system. That will cover you for anything short of a house fire/tornado/earthquake/flood.
4TB drives sell for $165 right now on Newegg. So five of those would cost $825. Your friend could stick them all into a large PC with multiple bays and create an enormous RAID-0 array of 20TB. Then he could use FreeFileSync to copy those files. Or he could set up another NAS with those 5 4TB drives and just do a copy/sync. It'd take days for the initial load, but it would be backed up. The problem with having that much data on CrashPlan (also my cloud backup of choice) is that it would take so long to restore it -- too long, I'd think. You'd blow all sorts of bandwidth limit to do so. And you can't use their restore-to-door plan for backups greater than 3.5TB. Until we have Google Fiber running everywhere, bandwidth just doesn't make it feasible to push all that data to the cloud.
First question is how much data actually changes on a given day? I have 20TB of data... 50% are videos that will never change. I think I only average 10GB of changed files a day. So, every physical drive I buy also has a USB external drive for backup. I keep the drives online and test the backup weekly for the content that is static. Critical drives are mirrored daily (usually to flash AND usb hdd). Less critical is on a 72 hour mirror (or on demand if i know i made many changes) I've lost about 7 drives over time. I just pull out a spare drive, start an immediate mirror and order another replacement drive. I stopped using real-time mirrors to avoid accidental data deletion.
The cost of 2 independent sets of five 5TB hard drives is NOT enough to worry about compared to the cost of obtaining the 20TB of data.
You can price the cost of this in a few minutes, period, end of discussion.
If you are half way clued in, take those backup/clones over to two different physical locations.
Don't have that much data. See, it really was that easy.
http://www.backblaze.com/
Unlimited storage $5 a month. You're welcome
Tubby or not tubby. Fat is the question
Multiple 4TB drives. Best you can do.
http://hubic.com/
Solved.
-- /. is now http://soylentnews.org/
My
1. purchase a second machine (similar spec, but not necessarily)
2. find a friend with a similar situation
3. build initial copy of data onto second machine on local LAN
4. go to friend's house and maintain a BT sync copy remotely
5. allow friend to use your bandwidth as you're using theirs (or alternatively pay for another high speed drop at friend's place)
That's how I'm doing it (only 6TB, but still)
BT sync, someone above mentioned Crash Plan, rsync ... pick your poison
There are plenty of people who do 1:1 backups of movies and music. It's extremely convenient. I don't handle any physical media more than once. It keeps the house tidy and the disks in pristine shape if I ever need to re-rip.
Around 6 months ago I had a similar problem to the story. My media drive died a sudden death (Seagate drive, never again). I had all of my family pictures, home movies, music, and movies on that drive. I had done backups and stored them remotely and was able to recover most of what I had. A few re-rips of some movies and I was done.
The time investment necessary to rip a 1:1 copy of a large collection is not insignificant. I probably should set up RAID with parity at some point, but right now I'm only doing a clone of my stuff. I don't have the bandwidth capacity at home to use any sort of cloud storage.
It's only 16TB today, but when 5TB drives arrive in the market (Real Soon Now), it'll be a 20TB solution.
I use the Mediasonic 4-Bay USB3+eSATA enclosure -- small, portable, easy to just "plug in".
Then Linux + mhddfs to bond the four drives into a single "meta" filesystem of full capacity.
Simple.
What if his 20TB is mostly self-shot video files? Or pictures? Or studio recording session audio? Some people do produce their own content.
A slowish 4 TB Seagate hard disk can be had for about $160ish if you look around. Five of them are $800. An inexpensive JBOD tower such as a TowerRAID 4 Bay eSATA RAID runs about $150. Get two of them.
Total cost is around $1100 and the solution is expandable .
----- In Your Cubicle No One Can Hear You Scream...
A couple of things. First, do you really need to back up all 20 TB of data? How much of that can be recovered by other means? For instance, is it reasonable to back up the OS if you would probably just reinstall anyway? How much of your content did you acquire electronically? Would it be easier to go back to the source?
Thing two: If you really have to back up all 20 TB, the only really practical, cost-effective way to back up that much data is to another set of hard drives. Build up a second array, replicate, and then turn the backup array off. Leave it off except for periodic backups.
For incremental backups, dedicate one removable SATA slot. (I use one of those "hard drive toasters" that plug into a USB slot and allow you to hot-plug a SATA drive.) Plug in a drive on a regular schedule, and copy over the files that have changed recently. Mark it with a sharpie and put it in a safe place.
The idea is to (a) back up only what you couldn't easily recover through other means, (b) back up to the cheapest and fastest medium per byte, which is currently other hard disks, (c) keep your backup disks turned off when not in use, and (d) figure out a schedule that suits you. For me, it was replicating the entire array only a couple times a year, supplementing with incremental backups to individual drives every week or so. Yes, you could still lose data, but not nearly as much as if you did nothing. Don't choose a solution so ambitious that you would later tire of it and stop doing it.
Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
So, guy spent around 10 x $100 (2TB drives), maybe more since you mentioned redundancy, for a total of ~$1000. Guy kept drives probably up 24-7, spending a lot on the electricity bill, I would say something along the lines of $150/month. Guy also had to manually maintain the complex disk array, prone to failure. Guy failed at it and lost invaluable amounts of (mostly) unrecoverable data (good luck getting that TV show from the 90's that now has 0 seeds on TPB, your family event pics and videos, or your college papers).
Now tell me, how can ~$250/month be expensive for 20TB in Amazon Glacier? They will give you transparent redundancy (if they lose the data you have reasons to sue for MILLIONS, you know, those numbers with 7 figures instead of 3). They will pay the electricity bill. They will buy the hardware. They will maintain the hardware too, so no need to replace drives. Your ISP is shaping traffic to AG? Sue them or change provider. Last time I checked it was a lot easier than doing ANYTHING on your 4TB+ RAID array, especially since it's for home use and will return you absolutely nothing besides self-complacency.
Just sayin'
In all seriousness, most people don't have that much data to back up. I can see that it might be possible, but it isn't necessarily going to be cheap. Assuming that this data is a positively must KEEP, then using the 3-2-1 rule of backup, here is what I would suggest.
1. You need to have a second synced copy. So you are going to have to purchase some kind of NAS or large storage device. You can go your own DIY route (FreeNAS) or BackBlaze storage Pod 3.0, or something like Drobo. Plenty of lower cost options out there. But it will cost some money to do it.
2. Use BackBlaze or CrashPlan for an offsite replication. There are no limits! I use BackBlaze for mine and have about 2 TB backed up there. It took about a week to get it all there because there are upload limitations by your ISP and by them, but it will eventually get it all. For $60 a year, you can't beat it!
3. Writable media (Blu-ray or DVD) is a viable option, as it is cheap, but it complicates recovery. And it has longevity issues. It should not be thrown out if keeping cost low is a priority. Also, if the data is so rarely used, then this would be a better solution than paying for the energy and cost of hard drives.
Other considerations:
1. Like any filing system, physical or digital, it needs to be checked, purged and arranged on some kind of annual or semi-annual schedule: to get rid of stuff no longer needed, to make sure you do not have duplicates, and to see if you are going to need more space this year. I simply have an internal 4TB drive that I use to sync data, a second drive for image backups of the computer, then I use backblaze for offsite storage. I know, I have 4 copies, but it makes me feel safe.
2. It seems like priorities haven't been established when it comes to retrieval. At times it appears cost is your highest priority, then at others convenience. You won't be able to have an extremely convenient cheap solution. You need to decide which is the highest priority, and then the next and then the next.
Services like CrashPlan cost pennies a day and would have backed it all up. If they could afford 20TB of media and the storage to host it, there is no reason they could not afford to back it up.
I think buying 5 x 4TB hard drives would be the best solution but if you have a decent upload speed, there are online back-up solutions for $4-$5/month (usually you have to pay the whole year in advance). I've also seen people back-up on usenet: create your own alt.binaries. sub and upload everything there. Obviously, don't upload the personal files even encrypted since anyone can download them.
If you want to go the cloud route you should look into CrashPlan. For around $4 a month you can store unlimited data on the cloud--and they actually mean unlimited. I know of someone who has 61 TB stored on their cloud. Although, restoring that much data would take a long time and a load of bandwidth. Restoring that much from tape would take loads of time too...
Everyone here seems retarded. The first question everyone should ask is what RAID level he was using: 0, 1, 5, etc. I'd suggest he use a form of RAID 10.
Wasn't it?
All my music and movies are stored and backed up on TPB. On my machine at home I have a 100GB HDD that is approximately 10% full, and that includes the OS.
If you have a PEBKAC error torch your array, you can have a PEBKAC error with your backups.
I want to delete my account but Slashdot doesn't allow it.
Wherever you back up to, consider the time it takes to create a full / incremental backup. That basically rules out any online backup service, as even with 100MBit/s fibre it would take you about three weeks at full line rate (and longer in practice) to create a full backup. If you want a full backup to complete within 24 hours, that means you need something with nearly 2GBit/s of bandwidth. While SATA 2.0 and 3.0 can do that, you will need to write this to a medium. Current hard drives perform at about 1.5GBit/s. So you will need to back up to a RAID level that combines the bandwidth of several disks (RAID-0, 5 or 6).
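For anyone who wants to plug in their own link speed, here is a rough sketch of the arithmetic behind those figures. It assumes decimal units (1 TB = 10^12 bytes) and a link running at full line rate with no protocol overhead, so real transfers will be slower. The function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_bytes, link_bits_per_s):
    """Days needed to move size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_s / SECONDS_PER_DAY


def required_bits_per_s(size_bytes, window_s):
    """Sustained bit rate needed to move size_bytes within window_s seconds."""
    return size_bytes * 8 / window_s


if __name__ == "__main__":
    data = 20e12  # 20 TB, decimal units (assumption)
    print(f"{transfer_days(data, 100e6):.1f} days at 100 Mbit/s")
    rate = required_bits_per_s(data, SECONDS_PER_DAY)
    print(f"{rate / 1e9:.2f} Gbit/s needed for a 24-hour full backup")
```

The same two functions cover the slower home uplinks mentioned elsewhere in the thread; just change the second argument.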
Hi,
just use backblaze, at 5$/month unlimited storage...
"Failure is not an option, it come bundled with the software"
The cheapest solution where you retain a decent amount of control is basically to replicate what Amazon or whoever would do - create an array of the cheapest high-cap disks you can buy and put the data on it. Your net cost will be about $1000 plus ongoing electricity cost.
Anyone who's charging you less than that (with the $1000 amortized over 3-5 years) is running at a loss and likely won't be around when you want your data back.
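To put a monthly number on that comparison, a small sketch of the amortization. The $1000 and the 3-5 year range come from the comment above; the wattage and electricity price in the usage lines are placeholders I picked, not measurements.

```python
def monthly_cost(hardware_usd, lifetime_years, watts=0.0, usd_per_kwh=0.0):
    """Hardware cost spread evenly over its lifetime, plus electricity.

    Assumes the box runs 24 hours a day and a 30-day month.
    """
    hardware = hardware_usd / (lifetime_years * 12)
    energy_kwh = watts / 1000 * 24 * 30
    return hardware + energy_kwh * usd_per_kwh


if __name__ == "__main__":
    for years in (3, 5):
        print(f"{years} years, hardware only: ${monthly_cost(1000, years):.2f}/month")
    # Hypothetical 50 W array at $0.12 per kWh
    print(f"5 years, with power: ${monthly_cost(1000, 5, watts=50, usd_per_kwh=0.12):.2f}/month")
```

Any hosted offer well below the hardware-only figure for the same capacity is the kind of pricing the comment is warning about.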
Either buy double the storage and periodically do a differential backup or use a cloud service. A Google search for 'unlimited cloud backup' yields tons of results.
If he has 20TB of music and movies, why even back it up at all? The majority of that content is available on BitTorrent. The idea of backup is that you only backup unique data that can't be replaced.
FWIW - the moment your data leaves your person/computer/home, searching the data no longer requires a search warrant. Just a subpoena, which is simple to get.
http://www.zdnet.com/u-s-attorney-general-government-should-get-a-warrant-before-email-cloud-storage-snooping-7000015493/
So my first thought is that 20TB is excessive. But if he and you are certain that the 20TB is all necessary then it is going to be expensive(ish). Buy a computer with a Perc controller and a used DAS/MD1200 from some supplier. I just bought one with 30TB of storage for 3K, with less than a few months of use on it. Take that and set up Syncback Pro on it to monitor for changes and set it to back up the new files/changes into the DAS backup folders.
20 TB is no small amount of data to accumulate. If it is precious and valuable and needs backing up then your friend needs to be prepared to accept the costs associated with protecting such a large quantity of data. If he balks at it, then ask him if the roughly 3.5 k would replace the lost files. People who have serious photography/lightroom habits are in a similar position. I spent about 40 hours trying to rescue and restructure the un-maintained mess of someone who couldn't be bothered to understand file folders and naming methods. When their primary drive failed, it was some effort to piece it all together from recoverable portions of their drive and files located across many folders on many different drives. Lightroom confused him more than manual placement would have.
A compact 16TB cube:
Total: $759.95
Then for the last 4TB, throw on a $149.99 Seagate Backup Plus 4TB USB 3.0 3.5" Desktop Hard Drive STCA4000100 Black
I have a similar set up. Between music, movies and photos I'm close to the 15TB range. I'm selective as to what I back up however.
I don't back up commercial movies or music. I have the CDs/DVDs/Blurays that I ripped. If something were to happen to the NASes that's holding that media I can always re-rip. For movies/tv shows, I find myself only watching them once or twice, so if something were to happen I probably wouldn't be re-ripping most of my collection. What would probably need to be re-ripped right away would be the Barney/Dora/Thomas DVDs for the kids. For music it's fairly quick to rip (and even faster to download :P ).
The only things I back up are home movies and photos. For home movies I backup the uncompressed files, but for photos I don't back up my RAW files, only the jpegs. Those are backed up to external hard drives that I keep either at my desk at work or at my parents' place. If by some weird coincidence I would lose those as well, a great deal of my home movies were uploaded to Youtube (private) and selected important pictures to Flickr.
With that much data, what it comes down to for me is what I absolutely do not want to lose or can't afford to lose.
It's better to burn out than to fade away
I have close to 10TB of data on my home server, and the most cost effective solution I could come up with was to build two servers; the second one mirrors the data of the primary using DFS (Microsoft Distributed File System). Neither server uses RAID for redundancy, just spanned disks. This makes it more cost effective because I'm not wasting a drive on each server for parity. If I lose a drive I can replace it and restore the data from the mirrored copy on the other server. I probably have a more elaborate hardware setup than most because I tend to do a lot of testing using Hyper-V and VMware. The backup server is a small and efficient Mini-ITX system in a Chenbro SR30169 compact server case with 4 hot-swap bays. Even with 16GB of RAM, a Core i5 4570S CPU, a 120GB SSD, and 3 SATA drives it only draws 40 watts from the wall. The primary server is a more powerful ATX system with an 8 core Xeon CPU, 64GB RAM, SSD, and 3 SATA drives, but it still only draws 60 watts at idle; using a gold rated power supply helps.
Are you fucking stupid? Either you care about the data, or you don't. If you care about the data, buy a tape drive. This problem has been solved for decades.
If tape is too expensive, or it takes too long to back up and is too much effort ... then you don't care about the data. Simple.
What a fucking waste of time.
Easy, make a torrent and name it as porn. Soon enough you will get a few hundred seeds, and it will be a distributed backup that is easy to download.
Use an old computer box and operating system like Win XP w/3GB memory, and put 5 or 6ea - 4TB drives in there and back up across the network. Simple and only ONE time up front cost instead of a monthly $$$ hit.
Rsync
Offer to keep backups of their data in exchange for them keeping backups of yours.
Each party must supply the other with storage hardware. If it is just next door you could run direct cable/wifi to your backup.
Ultimately (though I am not aware of any current projects) I could imagine the need for an open-source project that offers a global distributed encrypted storage system. It would work kind of like bit-torrent/tor/bitcoin all rolled into one: you would give some drive space and bandwidth to others so you can replicate your own data into the system. You would get a storage key (like a bitcoin wallet) for your data. The data would always be replicated across multiple nodes for safety in case of node drops. You could also restrict access to your own node for certain time periods (like when you are sleeping). That would allow one half of the planet to back up the other half's data. IMHO if done right and with enough uptake it would be more reliable than anything any corporation could offer; it would turn our planet into a single backup system.
All joking aside... Seriously, get 20 tb of additional storage. I don't have 20 tb of data but I do have 3 tb. I did an initial "copy" at home, it took a while. Then I took that "copy" to work. I now use robocopy to copy the differential daily over vpn. Most of the time the job takes a min or less, if there are new items to copy, it takes whatever time is needed to copy over the new data. I am able to copy over 3 gigs in about 20 or so min. If your 20 tb is changing daily, yea, you're screwed. But if you only have a couple or few gigs a day that change, this will work. Granted, 20 tb of drives off site may be an issue. I just have one 3 tb drive and before I had that, I had 3 1 tb drives. Also, my upload speed at home is between 5 and 9 mbps, that's the bottleneck
The problem with a "solution" here is there's no way to know how the data is organized.
I'd say any relatively hack-free solution will involve a commercial backup application and a storage array of sufficient size to handle at least one full backup and some chain of incrementals.
Ideally the backup array would be of sufficient size and disk count that you could gain some small protection by creating independent disk groups, each capable of holding an independent file system for a full backup plus its incremental chain. I say this having supported large backup arrays where monolithic file systems were created only to corrupt, causing the entire backup to be useless. It doesn't protect against failures caused by faulty array controllers or enclosure failure, but nothing does except multiple complete arrays.
Decent commercial backup software will make the job simpler with compression, deduplication, intelligent incremental management, cataloging, etc.
CDW says $9,000 will get you a Netgear ReadyNAS with 12x4TB disk. In RAID-10, you'd have 24TB to work with. Combined with decent backup software this would result in a fairly painless way to backup that much data and manage it.
If you had nothing but time on your hands, you could roll your own solution with rsync, de-duped ZFS, etc but the hardware piece is still not cheap and rolling your own is nearly as expensive with a lot more headache.
^^
https://www.youtube.com/watch?... It's super easy, I will gladly send you the schematics.
Liberty - Security - Laziness - Pick any two.
You are just 3 orders of magnitude off... You need 5000 DVDs for 20 TB...
Backblaze. $5 a month, unlimited.
I bought a Syba 5.25-Inch Dual Bay Mobile Rack for both 2.5-Inch and 3.5-Inch SATA HDD Plus 2 USB 3.0 Ports SY-MRA55006 for my latest desktop build. You could then buy 7 or 8 3TB drives, back things up, then store them someplace. After the first full, you could take incremental backups for a while. You would have to refresh it every so often but my thought is that the backup should be good for at least a year. Just make sure that the drives aren't stored next to the microwave...
Of course, the enterprise solution would be to buy a SAN or NAS, fill it with storage, and use data duplication software.....
"Blu-Ray discs can hold a lot of data, but that's a lot of time (and money) spent burning discs that you likely will never need."
You have your answer, you're just not willing to put in the effort and despite actually needing them you still manage to avoid the work by saying you don't need them.
Get a 4TB HDD, back up changes to that, flush changes to BD periodically. Not all changes that make the HDD need to go to BD since you're not keeping a full history there, just a recovery point. Example, you add a file and then delete the file this month, it doesn't make it to BD next month.
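A rough sketch of the "recovery point, not full history" idea: compare the file listing from the last burn with the listing today and burn only the net difference. It assumes you can produce a {path: checksum} mapping for each point in time; how you build that mapping is up to you, and the function name is made up.

```python
def net_changes(last_burned, current):
    """Compare two {path: checksum} snapshots.

    Returns the files that need to go onto the next disc and the files that
    have vanished since the last burn. A file that was created and deleted
    between the two snapshots appears in neither mapping, so it is never burned.
    """
    added = sorted(p for p in current if p not in last_burned)
    modified = sorted(
        p for p in current if p in last_burned and current[p] != last_burned[p]
    )
    deleted = sorted(p for p in last_burned if p not in current)
    return {"burn": added + modified, "deleted": deleted}


if __name__ == "__main__":
    last_month = {"a.mkv": "111", "b.mkv": "222"}
    # tmp.mkv was added and removed during the month, so it is in neither snapshot
    today = {"a.mkv": "111", "b.mkv": "333", "c.mkv": "444"}
    print(net_changes(last_month, today))
```

The intermediate HDD still sees every change; only the disc set skips the short-lived files.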
Get a friend to buy a similar disk array to yours. Bring it to your house and copy all the data over (pre seed). Setup the array at your friends house and setup a continuous sync between the two arrays. BOOM! Personal cloud.
There are two decent approaches: backup or mirror your setup offsite OR archive the previous generation intact and do incrementals starting from that point. I'm assuming that a home user isn't going to be picking up a $2000+ LTO-6 tape drive and swapping in 8+ $65 tapes for each full backup.
The first is to have your own offsite storage that you back up to, where the backup is (at least) as large as the original. Multiple people have recommended Crashplan, and that's certainly a viable option. There are undoubtedly other options that could do similar things depending on how down into the weeds you want to get - rsync, the various rsync-based versioning backup solutions, git-annex as mentioned by someone else though that one's new to me. I'll note that from experience with Crashplan's Enterprise product on some older 32-bit servers, the client software can chew some fairly significant memory when you have a lot of files or data.
The other and probably simpler option is that when you start to near capacity on the storage system, don't upgrade it. Build the new system, copy the data over to it, then shut the old one down and store it, preferably not in the same (not-yet-burning) building. After you shut the old one down, keep backups of anything you've changed since that "checkpoint" system; hopefully your data isn't changing that rapidly - 20 TB seems to me almost guaranteed to be mostly static.
fencepost
just a little off
I would definitely say external drives for the irreplaceable data (photos, home video, scanned images, voice clips, documents, etc.). The rest is already *cough torrents cough* backed up for you. Yes, it would take a while to rebuild, but ultimately it's available.
I would also perhaps back up any older or hard-to-find collections to the hard drive, or any particularly cherished movies (kids movie collection, perhaps). Personally, I back up everything to three 4TB external drives because I have the ports available on my server, but if you don't then back up what's important and don't worry about the rest...
Your only other option, really, is to get a 6-bay NAS and some hard drives to fill it. This setup would run you around $2,000, but then you'd be able to back up all of teh things...until your data grows beyond 20 TB (assuming you'd put the NAS into Raid 5 at least :)
"I love animals! Some are cute, others are tasty, what's not to like?" - Betsy Schroeder, Jeopardy contestant
I just did this with a friend and we don't have 20tb but we both have about 8 so 16 total. Use rsync, or btrfs/zfs send with snapshots. If you have the space, keeping multiple snapshots should protect you from accidental deletes. Do the first backup on site, then set a cron job weekly or daily. This is easy to set up on any low power cheap Linux box with a couple USB 3 4-bay enclosures. If using ZFS make sure you have enough memory. You can run encryption on each user's space so only you have access to your data. Just a thought.
Bitcasa.com might be a good one to look into, set yourself up for a long first backup process, then get happy and keep on keeping on.
Duplicate your existing hardware storage setup and then send it to a colocation datacenter. Then you'll have one offsite copy at a fixed price. Downside: you only have one copy, but that's better than none.
To back up one RAID that large, you really need another RAID, maybe half again as large. Removable media just doesn't cut it.
Then you use something like an rsync wrapper (fast, deduplicates some) or backshift (deduplicates better and compresses).
http://stromberg.dnsalias.org/~strombrg/backshift/documentation/comparison/index.html
Wait-- you have 20 TB of data, yet are complaining about expense? That's far above the media requirements for home storage. You're in enterprise territory. Fast, reliable, or cheap: pick any two. Since reliability is not negotiable for backups, you have two options.
Buy an LTO autoloader and tapes. This will cost about $3,500-4,000. You may also need to buy backup software for another few hundred dollars. You'll be able to back it all up within a day, and backup new files in minutes.
Buy the Crashplan unlimited home service, buy the seed drive service to get the first few hundred GB started, and you'll be set in a few days to a week.
Gamingmuseum.com: Give your 3D accelerator a rest.
I use a Synology 1512+ with 4TB drives in RAID5 configuration. The entire system cost under $2000 to build (of course, you may need a slightly bigger system or a different configuration to put all 20TBs in one system).
I replicate some of my critical data to a friend's Synology for an offsite copy. I return the favor by letting him use part of my array. So far, things have worked great and while I haven't had the need to do a full recovery of my data, I feel fairly confident in this solution. Synology works as an iTunes server and a target for Time Machine backups for Macs.
Do you really need to back-up that much data?
I'm just speaking generally here, there are certainly cases where someone would need to back up this much data, but for your home media library? If we're talking movies, 20 TB is roughly 20,000 movies (for sake of argument, I'm not considering music). At what point is this just digital hoarding? I used to keep a large collection of movies, mostly pirated, and eventually realized that:
a) I was spending more time and money managing the collection than I wanted to.
b) I rarely watched many of the items in my library.
c) I was placing myself in legal jeopardy by storing so many illegal copies.
d) Anything I did want to re-watch I could get from Netflix, the public library, or download.
Music would be slightly different, as I could see where music is in some kind of constant rotation, but again, how much of it are you actively using? I'm just playing devil's advocate here, but I think this kind of collecting/hoarding is a byproduct of pre-internet scarcity.
The free version of their home service only backs up locally. To backup to the cloud you need to pay the (completely reasonable) $5 a month.
ZFS snapshots work well. I have 12TB on a server in my garage, and another 12TB on a different server at a friend's house. I occasionally "zfs send" incremental snapshots to an external drive from the main server, drive over to the other server, and "zfs recv" the snapshots to keep my two servers in sync. USB was pretty slow, so now I just have an extra SATA data and power cable hanging out of each server, and carry a bare 2TB hard drive in a plastic box. That makes 1TB incrementals almost painless.
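For the curious, the sneakernet round trip looks roughly like the two commands this sketch prints. The dataset, snapshot and file names are invented for the example, and it assumes the previous snapshot already exists on both servers, which is what an incremental stream needs.

```python
import shlex


def incremental_transfer(source_dataset, target_dataset, prev_snap, new_snap, stream_file):
    """Build the send and receive command lines for one incremental hop.

    The first string is run on the main server with the carry drive mounted,
    the second on the remote server after driving the disk over.
    """
    older = shlex.quote(f"{source_dataset}@{prev_snap}")
    newer = shlex.quote(f"{source_dataset}@{new_snap}")
    path = shlex.quote(stream_file)
    send = f"zfs send -i {older} {newer} > {path}"
    recv = f"zfs recv {shlex.quote(target_dataset)} < {path}"
    return send, recv


if __name__ == "__main__":
    send_cmd, recv_cmd = incremental_transfer(
        "tank/media", "backup/media", "2014-01", "2014-02", "/mnt/carry/media.zfs"
    )
    print(send_cmd)
    print(recv_cmd)
```

The script only prints the commands; running them is left to you, since a mistyped dataset name is not something to automate blindly.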
You really just have 2 options.
1. Tape: LTO6, which will run you about $3000 plus another $250 for tapes. But the data can be backed up often and recovered very fast.
2. Offsite: I have seen a lot of suggestions, but I didn't see Backblaze on the list. Backblaze is $5/m all you can eat. The problem is, recovering those 20TB might take you a few months as they cap the speed at which the service operates. I think there are options to have a drive sent to you for like $250 plus an hourly rate. All that said, still cheaper than tape, just not going to get your data fast.
I personally go the tape route. Tapes will archive for 30 years if kept in a cool clean environment, so you don't have to worry about bit rot as much as you do with just keeping stuff on a NAS.
Just like having piles of stuff taking up all available space in your home is a problem having a 20 TB media collection could be a sign of a larger problem. I'd recommend going through it and getting rid of the things that will never be watched / listened to again or can just be downloaded again. Then worry about backing up what's left.
Just sit tight until these are perfected, and then buy a couple of dozen-
Next-gen “Archival Disc” will squeeze 1TB of data onto optical discs
http://arstechnica.com/gadgets...
Either that, or install a duplicate RAID to back up the first one....
Differences between how you act when some one is watching, and how you act when no one is watching, define who you are
I figure that's good enough.
---- The above post was generated by the Turing Institute. Maybe.
Actually the free plan lets you backup to an offsite location. But not to the cloud.
Let's suppose you have 200+ movies. I very much doubt that you need to have instant access to them, since you'll probably watch them only once.
Why don't you just burn what has not been used since a long time on DVD, and then catalog your DVDs ?
If you have 4GB DVD, simply subdivide your data in 4GB folders, and burn at least one every day.
If you fear that your DVDs vanish, burn everything twice and store them at different places.
Benefits:
1) you can probably reduce the 20TB to less than 5 TB that you need at any moment. Use the saved space to mirror your data
2) doing backups frequently is a good habit that'll be useful in the future
3) doing some cleaning will help you categorize your collection
Tell him to stop storing all those movies and simply stream them using Navi-X for XBMC.
My backup strategy is to keep the old drives from my previous array and put them into a second server, then back up to it weekly. I use a linux software raid 5 setup for backup, with the drives powered off unless the backup is running. I have a script that spins them up, starts up the raid, mounts the filesystem, performs the backup using rsync, then unmounts and powers down the drives. I only can back up about 1/3rd of my main array, so I have to be choosy, but a large amount of what I have stored is replaceable non-original content that I'm content to simply have one raided copy of, so I just exclude the right folders and I'm good.
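A minimal sketch of what such a wrapper script might look like, written as a command plan so it can be inspected before anything runs. The device names, mount point and exclude pattern are all hypothetical, the array members are assumed to be whole disks, and this is not the poster's actual script. The drives spin up on their own when mdadm first touches them.

```python
import subprocess


def backup_plan(array, members, mountpoint, source, excludes=()):
    """Return the ordered commands for one powered-down-between-runs backup."""
    rsync = ["rsync", "-a", "--delete"]
    rsync += [f"--exclude={pattern}" for pattern in excludes]
    # Trailing slashes make rsync copy the directory contents, not the directory
    rsync += [source.rstrip("/") + "/", mountpoint.rstrip("/") + "/"]
    return [
        ["mdadm", "--assemble", array, *members],  # start the software RAID
        ["mount", array, mountpoint],
        rsync,
        ["umount", mountpoint],
        ["mdadm", "--stop", array],
        *[["hdparm", "-y", disk] for disk in members],  # put each disk in standby
    ]


def run_plan(plan, dry_run=True):
    """Print each command and, unless dry_run, execute it.

    Stops at the first failing command, which can leave the array mounted;
    a real script would want cleanup handling here.
    """
    for argv in plan:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    plan = backup_plan(
        "/dev/md0", ["/dev/sdb", "/dev/sdc", "/dev/sdd"],
        "/mnt/backup", "/srv/data", excludes=["replaceable/"],
    )
    run_plan(plan)  # dry run: only prints the commands
```

Since rsync runs with --delete here, a mistake on the source gets mirrored to the backup, which is where snapshots on the backup side would earn their keep.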
The servers are currently in the same room, which makes me uncomfortable, so I've long considered creating a mini-server for a relative and setting it up in their home as an offline backup. Using a commercial service would probably make more sense, but I'm not sure I'm comfortable with that yet.
Another thing I'm considering for my next setup is using ZFS for the backup filesystem and keeping snapshots as long as I can for a combination backup/version control. I'm interested in how efficient that would be with vm disk images where the file changes every time, but only small parts of it. Would it detect the unchanging portions, even if rsync re-writes parts that didn't change, or would that cause duplicated space usage? Does anyone have experience with this?
set softtabstop=4 shiftwidth=4 expandtab nocp worlddomination
NSA does it for you so you don't have to.
There's plenty of them.
Best bet, IMO.
LaCie 5big network servers could hold that much data. Save doubling your storage by using RAID 5. Better speed and you can save all your data if a drive fails. Some of these options are affordable but 20TB will probably cost you a bit. You could also use the drives you already have in a barebones setup.
...One Byte at a time!
Thank you, thank you. I'll be here all week.
archival quality optical media in a robotic silo. 100 year guarantee on your data. Storage space only limited by the size of your silo.
...that I never got around to implementing my Redundant Array of Free Email Accounts virtual drive idea.
In January of 2013 (one year ago), Kingston announced 1 TB thumb drives. They are pricey ($2000). But 20 of them would back up 20 TB. Cost: $40,000 (sell the Lexus as used and you are good to go). Choice 2: I bought a 2 TB drive last week for $69. 10 of these will cost you $690 and will get you there. Choice 3: A BD writer (Blu-ray(tm) disc writer) will cost you $150, plus a 50 pack of blank discs will cost you $130. 50 blank discs will give you 2.5TB, so you will need 8 of these packs, so $150+8x$130=$1190. ...I don't have any good options on hand for tape storage. Personally I use a NAS for backing up software and data on the computer, and archive movies on DVD. It's widely available, cheap, and for me 'good enough'. I had a drive die about a week ago, and it's *very* annoying, but it seems that I've been through the pain of it often enough that I can pull out data recovery tools and install new disks almost in my sleep. Spinning disks of rust are cheaper than spinning disks of pricey plastic, and that is wildly cheaper than solid state chips.
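The three choices above boil down to the same sum: round the capacity up to whole units, multiply by the unit price, add any one-time hardware. A quick sketch using only the prices quoted in the comment; the function name is made up.

```python
import math


def media_cost(total_tb, unit_tb, unit_usd, fixed_usd=0.0):
    """Return (units needed, total cost) to hold total_tb on units of unit_tb."""
    units = math.ceil(total_tb / unit_tb)
    return units, fixed_usd + units * unit_usd


if __name__ == "__main__":
    options = {
        "1 TB thumb drives": media_cost(20, 1, 2000),
        "2 TB hard drives": media_cost(20, 2, 69),
        "50-packs of BD + writer": media_cost(20, 2.5, 130, fixed_usd=150),
    }
    for name, (units, cost) in options.items():
        print(f"{name}: {units} units, ${cost:,.0f}")
```

Swap in current prices and the ranking is easy to re-check; the sum ignores burn time and the redundancy you would want on any of these.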
really, there are 3 options
1) a second array
2) tape (biggest are 5TB right now...)
3) online
1* RAID6 (or raid5+hot spare) would be 7x 4TB drives and could be built for about $1200 using a cheap workstation and external drives w/ freenas
2* tapes would be expensive and cumbersome IMHO. Also expensive!
3* I say this is an option but it's not realistic. If you have a typical 4Mbps upload from Cox/Charter/etc then the initial seed would take well over a year!
Who is going to revisit 20,000+ hours of video and music?
Careful with this.
If a fumble fingering kills the data, rsync will happily duplicate that fumble and delete your backup.
Either enable some sort of version control, or set up rsync so that it won't sync a massive change (but will email you to do it manually).
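One way to sketch the "won't sync a massive change" check is to compare file listings before letting the sync run. The 5% threshold is an arbitrary number I picked, and the function name is made up. rsync itself also has --max-delete=NUM and --dry-run, which can serve a similar purpose without any wrapper.

```python
def safe_to_sync(source_files, backup_files, max_delete_fraction=0.05):
    """Return True if syncing would delete only a small share of the backup.

    source_files and backup_files are sets of relative paths. Files present
    in the backup but missing from the source are what a mirroring sync with
    deletion enabled would remove.
    """
    if not backup_files:
        return True  # nothing to lose on a first run
    would_delete = backup_files - source_files
    return len(would_delete) / len(backup_files) <= max_delete_fraction


if __name__ == "__main__":
    backup = {f"movies/{i}.mkv" for i in range(100)}
    after_cleanup = backup - {"movies/1.mkv", "movies/2.mkv"}
    after_fumble = set()  # the array was wiped by mistake
    print(safe_to_sync(after_cleanup, backup))
    print(safe_to_sync(after_fumble, backup))
```

When the check fails, the wrapper would skip the sync and send the email instead, leaving the decision to a human.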
My dad is a bit of a hoarder. I tell him to "store" his broken toaster collection at Goodwill. They will have one when he needs one.
Unless your lab burns down.
20 TB backup to the "cloud"?!?!?! What the hell kinda internet connection do you have? Mine is 1 megabit/second upload. That would take over 5 years to complete the backup.
Agreed. If you want backup, you have to pony up. You have to either buy twice the disks, an expensive tape drive (or a cheaper tape drive and a lot of tapes) or pay for bandwidth and off-site storage.
Competition Good, Monopoly Bad.
The most practical solution, considering the type of data: /dev/null
just copy everything to
I split my data into two.
Important stuff I can't recreate easily. (email, pictures, music, around 200G)
This data is fired out into the cloud on a fairly regular basis.
The rest(~11TB) It'd be a pain to get back but it can be gotten back. I threw all my old drives into a JBOD and rsync to that every few months.
No "this happened to a friend" story should be accepted on Slashdot, and everybody submitting one should be banned for life. Especially when the question is needlessly contrived and every reasonable solution has been summarily excluded. I'll even grant videos can take a lot more space than that, but it's perhaps worth mentioning 20 terabytes can hold a quarter million albums, or well over 30 YEARS worth of music. Based on that, one might assume this guy is trying an experiment where he's recording every sound around him for his whole life.
In any case, if he cares enough about Slashdot users' opinion and is likely to consider advice from Slashdot, he should spare 5 minutes to ask the question himself. Unless he already did, in which case the only thing we can conclude is the person asking the question is lying about details of the problem.
Don't rm -r your RAID, fool.
Drobo 5N 5-Bay NAS Gigabit Ethernet Storage Array and 5 x 4TB Hard Drives providing the necessary storage:
http://www.bhphotovideo.com/c/product/907157-REG/Drobo_Drobo_5N_20TB_5X4Tb.html
Your friend is clearly a hoarder. 20TB is absurd for a private collection of unimportant media. Instead of looking for ways to back up the Library of Congress, your friend could seek counseling to work through his attachment issues.
"Asking around among our tech-savvy friends though, no one has a good answer to the question, 'how would you backup 20TB of data?'. It's not like you could just plug in an external drive, " tells me that you have NO "tech-savvy friends". None. Zip.
Right now, I'm on my biweekly offline backup - that's where we rsync from the online backups to offline backups. This is the 10th 3TB drive, if you're interested, out of 13.
Now, if you actually had any "tech-savvy friends", as opposed to people who think they're "power users", they'd have pointed out, first, that what your tech-savvy-friendless friend had was *not* a 20TB file, but many, many files. It's certainly not any kind of problem to partition them - y'know, divvy up the RAID and have movies and music subdirectories, and break that up by moving all the movies whose title starts with "A" under /movies/A.... and then rsync (or however you prefer) copy enough to come close to filling one drive, then swap drives....
Oh, and why can't you do it in an external drive? Certainly, that's what I'm doing *right* *now* as I type with those 13 3TB drives.
mark
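The "fill one drive, then swap" routine above can be planned ahead of time. This is only a sketch: it assumes you already know the size of each subdirectory, and the names and sizes in the example are invented.

```python
def plan_drives(dir_sizes, drive_capacity):
    """Walk directories in name order, filling one drive before starting the next.

    dir_sizes maps directory name -> size; drive_capacity uses the same unit.
    Returns a list of lists of directory names, one inner list per drive.
    """
    drives = [[]]
    free = drive_capacity
    for name in sorted(dir_sizes):
        size = dir_sizes[name]
        if size > drive_capacity:
            raise ValueError(f"{name} does not fit on a single drive")
        if size > free:
            # Current drive is full enough: swap in an empty one.
            drives.append([])
            free = drive_capacity
        drives[-1].append(name)
        free -= size
    return drives


if __name__ == "__main__":
    sizes_gb = {"/movies/A": 1500, "/movies/B": 1000, "/movies/C": 2000, "/music": 400}
    for number, dirs in enumerate(plan_drives(sizes_gb, 3000), start=1):
        print(f"drive {number}: {', '.join(dirs)}")
```

Each drive then holds whole directories only, so any single drive can be read on its own without the others.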
What did he use to store that 20Tb in the first place? I'm assuming we're talking a large RAID array. I doubt, from the sound of it, that we're talking software RAID. So, at a minimum, it'll cost AT LEAST the same as building that RAID again to have a backup, no matter what the medium.
Yeah, you're throwing 50% of your money away - on nothing but backups of shit you've probably downloaded or taken off discs you own anyway. So now he probably sees quite what that data was "worth" anyway.
To be honest, nowadays, for home use, just build another RAID the same size and mirror the data across.
Oh, and if you're that daft with 20TB of data that you press the wrong button and wipe out an array that you have recovered several times over, you shouldn't be let near the low-level storage. Use a filesystem, or even just an access layer, with some kind of snapshotting / rollback.
Buy a cheap NAS, or just build yourself a new RAID from scratch and use the "old" array as a redundant copy of the data. Keep it powered off and somewhere else except once a month or whatever when you mirror across.
Backup speed will be good (the speed at which you can interconnect the two computers, basically, probably Gigabit Ethernet for the cheapest scenario), restore speed will be the same, media will be cheap, no fancy software or hardware required, you can re-use your old setups and just buy a new one when it starts getting full and your backups are literally working copies with no further action required.
If that turns out to not be good enough for your needs, that's when you can look at tape and other stuff. To be honest, tape is dying. The places I've seen have weaned themselves off them and just replicate to as many places as possible (including an occasional "offline" copy to prevent automated spreading of bad/corrupt data to the backups).
Sometimes I do hoard some data, usually after a hard disk change or something, in which case I get a folder named olddisk with the contents of the old disk. This method does lead to having a bunch of such folders, like olddisk1, olddisk2 and so on. I usually take far too many photos on my trips, so I end up with gigs of photos; I can imagine a person who takes videos of everything would fill 20TB in a couple of years... Anyway, I'm trying to lead this to a scenario that isn't all pirated music and torrented video. What I do (bear in mind that I only have 2TB of stuff) is run a duplicate finder program; maybe you (oops, I meant your friend) have too much duplicate stuff. If there are many videos, are they in the appropriate quality and sampling rate? Do your VHS tapes really need to be in 1080p?
:) I really can't visualise getting 20TB of data for a home user...
then
1) Use a RAID station to back up. I have a Synology I use for backup.
2) Keep the SD cards after they are full. I do not erase SD cards from my camera, I just store them remotely: cheap backup for some of the most precious and impossible-to-reproduce files.
3) Get ten 2TB hard disks to copy your stuff and store it.
jc
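For the duplicate-finder step mentioned above, a bare-bones version might look like this. It is a sketch, not a replacement for a real tool: it groups files by size first and only hashes the ones that share a size, and it treats files as duplicates when their SHA-256 digests match.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large videos don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return groups (sorted lists of paths) of files with identical content."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Pointing it at the directory that holds the olddisk1, olddisk2, ... folders would list every file that exists more than once; deciding which copy to keep is still up to you.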
I remember a puzzle in OMNI some decades ago, where an alien had to transport the knowledge of the Encyclopedia Britannica in its space ship away from Earth without carrying any additional weight.
The solution was to transform all the data into a single rational number between 0 and 1 and to etch a scratch on the surface of the Alien's space ship, where the size of the scratch would correspond with the single rational number (say in inches or some comparable measuring units). It was apparently possible for aliens to etch and subsequently measure distances at the subatomic scale.
Not to mention that I've been going through CD's and DVD's that I burned back in 2001-2003, had stored in a zipped closed CD/DVD 150-disc binder for 10 years, and am copying them off to a NAS... and am finding that about 1 in every 25 has corruption and can't recover the files. Literal bit-rot due to dye break-down (not from light... but from age and ambient air.) If you advocate storage on optical media, this is a factor you need to contend with.
I myself have a 20TB NAS/iSCSI SAN at home using FreeNAS, 16GB ECC RAM, and 8 4TB HDDs in a RAID-Z2 array connected to my home lab with dual 10Gb NICs. It cost me under $2000 to build and when combining ECC RAM with ZFS you have self-healing, double-parity fault-tolerant data storage. I have it filled with VM Images and backups of 4 PCs. 20TB, contrary to what many might say, is not an excessive amount of data. That is about average for someone in IT, Digital Art/Photography, a Musician, or a household of 4+ people.
(And yes, I can empathize that there is a major time investment in ripping a large CD/DVD/Blu-Ray library to HDD. It took me two years at 5-8 hours a day to rip my 1400 CDs to FLAC, 1000 DVDs, and 250 Blu-Rays to disk. I really don't want to ever do that again, thank you very much!)
Just prior to this I had lost everything digital I had accumulated over the course of 20 years after simultaneously losing both my PC's HDDs (Raid 5) and losing > 2 drives in my Raid 5 backup array (including 10 books I had written but not yet published, 10000+ pages of research, and all the pictures of my child from birth to her teenage years, and decades of correspondence that was irrecoverable). I realized the hard way that one on-site backup is not enough. If your data is critical to you then you simply need redundancy. So, when I built my NAS/iSCSI SAN I ponied up and built a second one to store off-site and use rsync to nightly mirror the one at home, although I could just as easily do snapshots too (I work as a SysAdmin for a Data Center so Co-Lo doesn't cost me a dime, but one could easily do Co-Lo for free by finding a friend who is willing to host your backup NAS in exchange for you hosting theirs. If you build your NAS right it should be a small ITX form factor, silent, and use less than 40W).
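As a quick sanity check on how eight 4TB drives end up as a "20TB" box, here is the back-of-the-envelope version. It only subtracts the parity drives and converts decimal TB to binary TiB; real ZFS overhead (metadata, reserved space) is ignored, so treat the result as an upper bound.

```python
def usable_tb(drive_count, drive_tb, parity_drives):
    """Raw capacity left after setting aside parity, in decimal TB."""
    if parity_drives >= drive_count:
        raise ValueError("need more drives than parity drives")
    return (drive_count - parity_drives) * drive_tb


def tb_to_tib(tb):
    """Convert decimal terabytes to binary tebibytes."""
    return tb * 10**12 / 2**40


if __name__ == "__main__":
    raw = usable_tb(8, 4, 2)  # double parity, as in RAID-Z2 or RAID 6
    print(f"{raw} TB usable, about {tb_to_tib(raw):.1f} TiB before filesystem overhead")
```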
You can also use Crashplan with FreeNAS for a remote tertiary backup of all or just the most critical directories...but personally I wouldn't trust a Third-Party Vendor for your only backup. Companies come and go, or may not hold themselves responsible or liable for the integrity of your data. It's too much of a risk for a primary backup, however for a redundant secondary or tertiary backup it is a wise idea.
Steps:
1 - Get a RAID similar to your main storage to use as backup.
2 - Put the second RAID in a relative's house, where you can get access to it.
3 - Have this backup run an rsync over ssh once a week/month, pointing at your main storage array.
With proper ssh key exchange set up ahead of time and using an ssh username and port that are non-obvious (with ssh on your main system only allowing known keys and not username/password combinations), you'll do pretty well against everyone except a malignant government entity.
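Step 3 could be driven by a small script on the backup box. The sketch below only builds the rsync command line; the user, host, port, key path and directories are placeholders you would replace with your own. It deliberately leaves out `--delete`, so an accidental wipe of the main array is not mirrored onto the backup.

```python
import shlex


def build_pull_command(user, host, port, key_path, remote_dir, local_dir):
    """rsync command for the backup box to pull from the main array over ssh."""
    # Key-only login on a non-default port, as described in the steps above.
    ssh = f"ssh -p {port} -i {shlex.quote(key_path)} -o PasswordAuthentication=no"
    return [
        "rsync", "-a", "--partial",
        "-e", ssh,
        f"{user}@{host}:{remote_dir.rstrip('/')}/",
        local_dir,
    ]


if __name__ == "__main__":
    cmd = build_pull_command(
        "bkp", "example.org", 2222,
        "/home/bkp/.ssh/id_backup", "/tank/media", "/backup/media",
    )
    print(" ".join(shlex.quote(part) for part in cmd))
```

To actually run it, hand the list to `subprocess.run(cmd, check=True)` from a weekly or monthly cron job.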
Help! I'm a slashdot refugee.
$50 per year for unlimited data, and you can use your own encryption keys to encrypt prior to upload. Will take a loooong time to back up that much data initially, but incremental updates are pretty quick (depending on how quickly you add new media).
Note: Not affiliated with altdrive, just a happy customer. altdrive.com
Dan
Write to some friends that you have 20GB of Al-Qaeda training footage and the NSA will do the backup for you. (PS: Use another set of hard drives to backup and never have the original set up as a RAID array)
I've got a large collection of movies (12TB). My backups are the physical DVD/BluRay/CD media. It does take a bit of time to restore a 4TB drive, rips are typically about 1GB/minute for BluRay or DVD.
My recommendation: don't store the collection as a single RAID array. That way, when you lose the array (which will happen), you don't lose the entire collection.
Personally, I'm too cheap to pay for the extra drives to implement mirroring, so I just use JBOD.
Your friend should consider dating for a while.
To me, the only feasible backup strategy for a home user (like myself) for LARGE volumes of data (I have 2TB, not 20, YMMV) is to keep two copies. One being your working copy, that you have in an active server, the other copy you should keep in a safe and rsync to bring it up to date every few months.
If your volume is really 20TB, which seems extreme to me (do you really need all your DVD's and Bluray's on a media server? With netflix and other online streaming services?) then I guess you're gunna need a tape backup of enterprise-level quality. Expect to pay for it. My personal massed music collection after 20 years of collecting music is like.. 30GB. That's a lot of music too. So I think you should look at what you're storing, and reduce it to the stuff that's truly irreplaceable.
But bottom line, to me, mirroring your drive(s) and sticking copies of them in a safe is the best backup strategy for a home user.
Simply going for multiple USB HDDs seems to be the obvious option (cheap, extendable, can be stored offsite and offline, etc.). However what would be some good Free Software to actually handle the backup? Common solutions such as duplicity, rsync, rdiff-backup, etc. all seem to assume that your backup target directory can hold the whole backup all at once and that the whole backup is online at the same time. While one can probably hack something together with union mounts to accomplish that, it seems like a very cumbersome and fragile solution.
Is there anything that allows you to just copy the data to a HDD and then plug in a new one when the old one is full? Preferably in a data format that is robust enough to handle some backup HDDs dying without destroying the data on the other drives (i.e. no incremental changes across HDDs).
I would suggest: LTO6 + Autoloader. It's not pretty, but it will get the job done.
A PC with 5-6 4TB drives, a tower case, a mainboard with 6 SATA headers, and a big power supply.
Then I would suggest CentOS and some scripting, or Openfiler if you would like a GUI.
That is it. No other choices. I think you could pull together a box new for under $1000, or buy a used server online with 6-8 bays for $300-$400 and pick up the drives separate.
Buy a basic PC chassis and a MB that has multiple SATA ports, with a raid bios. Add 5 3T or 4T drives in a simple raid5 config, and use a dedupe program and some basic backup / sync software to run an incremental backup. It will take a while to initially get it all into the baseline, but a job will pull whatever has changed (at the file level, but that isn't too bad for this) and any decent dedupe application should get the files to under 50% and leave plenty of space for the offline de-dupe to work. Given the deals on drives you could run this pretty reasonably for under $600 or so with a little careful shopping. Set up the machine bios with a wake time and power down time to minimize power demand, or just leave it running. Not free, but compared to the cost of replacing 20T of files, music and pictures, a lot better than a poke in the eye with a sharp stick.
I've been playing with Crashplan in San Jose and the ISPs in my neighborhood. I'm getting about 400-500 kbps sustained through Comcast, and was getting a similar number via AT&T Uverse, in both cases with plans that should have in theory been able to outstrip that by a good margin. As near as I can tell, there's a bottleneck either between here and CrashPlan, or within Crashplan itself.
I'm going to put a couple of disks in a friend's rack system and handle it myself, but I would strongly recommend against using CP for anything more than about 1TB.
It was about 19.5TB of music more than he ever listened to. People are packrats - instead of figuring out what they use and what they don't use, they save everything. Even those things that are not worth saving.
$5/mo
No; you're contending one must buy a second engine.
What are you paying to maintain a 20TB RAID array, plus backup for this? Wouldn't it be cheaper to subscribe to a $5/mo music service and pay $3 per movie when you decide to watch?
Backup your personal videos, photos, data. I don't see the value in backing up commercial content.
A worthy adversary to rival my porn collection!
Set up another 20TB of storage and rsync daily
With 2TB and even 3TB spindles being pretty commonplace these days, why not fill up an external drive cabinet, make the entire thing into a RAID5 device and backup using rsync? May be a little pricey but how much time and effort went into creating a 20TB collection of data? I have a friend who did something like that (but using smaller Buffalo devices) for his small business by having several systems shuffle files around using rsync. In the event of one computer's storage failing there'd still be 2-3 others on the network with a copy of the data. And, if memory serves, he had one system that had a couple of arrays that would be rotated in/out and one of them kept offsite just in case.
I'm still trying to figure out how much time it would take ripping CDs and converting from WAV to wind up with 20TB of MP3 files. Based on what Amarok is telling me about my music collection, a quick calculation tells me that that 20TB would amount to about 30 years worth of continuous music playback. I'd better get that ripping and converting started now if I want to have that much music for my great grandkids to listen to; it's probably already too late to get that done for my kids or even grandkids to enjoy.
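The playback figure depends entirely on the bitrate, which isn't stated above. A small sketch, assuming constant-bitrate MP3 and decimal terabytes:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def playback_years(size_bytes, bitrate_kbps):
    """Years of non-stop playback for size_bytes of audio at a constant bitrate."""
    seconds = size_bytes * 8 / (bitrate_kbps * 1000)
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    for kbps in (128, 192, 320):
        print(f"{kbps} kbit/s: {playback_years(20 * 10**12, kbps):.0f} years")
```

At 128 kbit/s, 20TB is close to 40 years of continuous listening; at 192 kbit/s it is still more than 26.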
CUR ALLOC 20195.....5804M
20 TB worth of content in the first place should easily be able to afford a backup system for it. He did come into that 20TB of content by legitimate means, right? You can't legally transfer a digital copy of a Blu-ray disc to HDD, so it must be UltraViolet copies, so he must have the original Blu-ray discs...
I recall a lawsuit that the RIAA brought against someone several years ago in which the defendant used an interesting argument to defend his having tons of illegally acquired music files on his computer/iPod. I may have some of the details wrong, but the argument was essentially that since songs cost $.99 each on iTunes, and an iPOD (at the time) could hold 8GB (or was it 16GB) equivalent to >$20K worth of music that no one in their right mind would ever pay for music to fill up an iPod. Therefore, Apple was encouraging people to get music illegally by providing a device to keep and play more of it than any sane person would ever buy.
I don't think the guy won with that argument, but it does make one think about the huge HDD capacities that are available for very low cost. What would people ever have to keep that takes 3TB (a single HDD), if not a bunch of movies, TV, etc., the majority of which has been acquired illegally? I'd bet the number of people who could legitimately fill that sort of space (home movies?), let alone 20TB, is very small.
I really did get a kick out of some of these responses. I sell data protection products for a living and 20TB is what I would consider an average small/medium customer. Every business these days has tens of terabytes of data. Of course they all need to backup their data, so there is nothing novel here. We have plenty of customers backing up hundreds of petabytes of data. Every dataset just needs a plan for backup, pretty simple.
The way I see it, this guy has a few options. One option is to just get more disk and make a redundant copy. This would have saved him in this case of the mistakenly erased raid, depending on how smart his sync script is. But a redundant copy is not a valid genuine backup plan. So many types of failures will show the holes of the dumb redundant copy.
The other option for a home user who's not looking to spend a bunch of money, is LTO6. They hold a sufficiently large amount of data, so only a handful of tapes will be needed. LTO6 drives are cheap enough, they won't break the bank. Since the data is on tape, you can shuttle the tapes to an off site location. Seems pretty simple.
I use Backblaze who only offer very reasonably priced unlimited plans. I currently backup about a TB. They will send you a disk to speed up the restore process for a price. I’ve been impressed with them so far.
http://www.backblaze.com
For purely backup/recovery purposes LTO6 internal drive will use about 10 tapes
http://en.wikipedia.org/wiki/Linear_Tape-Open#Generations
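The tape count is simple division. LTO-6 is specified at 2.5 TB native per cartridge and 6.25 TB at the advertised 2.5:1 compression; already-compressed music and video will not compress further, so the native figure is the one to plan with. A sketch:

```python
import math

LTO6_NATIVE_TB = 2.5       # native capacity per LTO-6 cartridge
LTO6_COMPRESSED_TB = 6.25  # advertised capacity at 2.5:1 compression


def tapes_needed(data_tb, tape_tb=LTO6_NATIVE_TB):
    """Minimum number of cartridges needed to hold data_tb."""
    return math.ceil(data_tb / tape_tb)


if __name__ == "__main__":
    print(tapes_needed(20), "tapes at native capacity")
    print(tapes_needed(20, LTO6_COMPRESSED_TB), "tapes if the data really compressed 2.5:1")
```

Eight cartridges is the floor for 20 TB, so budgeting for about ten leaves a couple of spares.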
Here's what you need if you have the time and money to burn:
2 boxes with quad core processors and 16 GB RAM
12 x 4 Tbyte hard drives
2 x 250 Gbyte hard drives
2 x 256 Gbyte SSD drives
FreeNAS
For each box, install FreeNAS on the 250 Gbyte drive (optionally, you can use an SSD or raid 2 drives together). I recommend using a separate drive from your share volumes so you can at least bring up the box if you lose the shares. Format 6 of the 4 Tbyte drives and set them up as a zpool using ZFS. Determine the level of failure you want to maintain using RAID-Z. Set up the 256 Gbyte SSD as the L2ARC.
The first machine is set up to always stay on. At night, the second machine will boot itself, run the appropriate disk utilities, take a snapshot of the first machine's ZFS filesystem and copy the diffs, and then shut down. If you want to be really safe, host the second box at a colocation, such as your neighbor's basement. This works fine, especially if you can string a network cable or connect to your wireless SSID from that location.
If you have a drive fail in your first box, you can add a new drive and the filesystem will rebuild parity. If your whole server dies, you can boot the second box and have all the data available from the night before while you work on bringing the first box back to life.
So I don't see the problem.
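The nightly "snapshot and copy the diffs" job on the second box could boil down to two shell commands. The sketch below just builds them as strings; the host name, dataset names and snapshot names are placeholders, and it assumes an initial full `zfs send` has already seeded the backup pool.

```python
import shlex


def replication_commands(primary_host, dataset, prev_snap, new_snap, backup_dataset):
    """Shell commands for the backup box: snapshot the primary, then pull the delta."""
    host = shlex.quote(primary_host)
    old = shlex.quote(f"{dataset}@{prev_snap}")
    new = shlex.quote(f"{dataset}@{new_snap}")
    take = f"ssh {host} zfs snapshot {new}"
    # 'zfs send -i' emits only the blocks that changed between the two snapshots.
    pull = f"ssh {host} zfs send -i {old} {new} | zfs receive {shlex.quote(backup_dataset)}"
    return [take, pull]


if __name__ == "__main__":
    for line in replication_commands(
        "primary.lan", "tank/media", "2024-01-01", "2024-01-02", "backup/media"
    ):
        print(line)
```

Because the old snapshots stay on both boxes until you destroy them, an accidental delete on the first machine can still be rolled back.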
"I believe in Karma. That means I can do bad things to people all day long and I assume they deserve it." : Dogbert
The first thing to do is to not create a single-volume RAID that spans several drives. Each drive should be able to stand on its own. Especially with not-quite-essential data like ripped DVDs. This way if one drive fails, you only have to re-rip one drive of DVDs. But most importantly, you can't erase them all with one command. I'm not sure how submitter's friend happened to do that, but it's exactly the kind of failure that RAID does not protect against!
Sure, it's nice to have one big volume and not have to worry about switching over as they fill up, but unless you have some kind of advanced volume management that can deal with drives disappearing and let you easily add or remove drives of arbitrary sizes, it can come back to bite you.
If you really want redundancy, use mirrored drives, or sync to a mirror volume, or whatever, just don't use RAID 5. Parity RAID seemed like a good idea at the time, but it's just begging for two drives (usually from the same manufacturing lot) to fail at the same time. And the system is loaded way more when you're trying to do recovery, which could cause another drive to fail from the extra stress. Even worse is that the size of modern drives means the sensitive recovery period is going to last longer.
This advice is specifically toward storing large A/V libraries. The really important stuff (financial data, family photos) is going to be smaller. Keep it separated from the big non-essential A/V files and it should be easy to use multiple backup strategies like removable storage and cloud backup.
Of course, he didn't lose anything because he had all the original discs for his music and movie collection, right? Certainly, he wasn't downloading illegal copies of stuff...
As for the question, the way I handle this exact situation is I run 2 volume groups (I use LVM instead of RAID), one of the volume groups serves as the main archive, and the 2nd volume group serves as the backup. They are identical in size, 4 hard drives each, currently at 9TB each. So long as I don't lose a drive from both volume groups at the same time then I won't lose anything. However, I actually do have every physical VHS/CD/DVD/Blu-ray that I have backed up to the server. I accept the risk over the cost of hosted solutions or redundant servers, etc... The one thing I do invest in and recommend everyone do so as well is use a good UPS to provide clean power and protection against sudden power loss.
To mitigate the risk of drive failures, I replace drives every 2-3 years before they have a chance to fail. In 20 years I have not had a single hard drive fail on me that was in active use. I just swapped out the 2 oldest drives in my server, for example. They were working fine, but I knew they were both more than 3 years old and overdue for replacement. I'm going to replace 2 more in the next month that are just over 2 years old. That's about $1,000 every 2-3 years to replace all 8 drives; if I did that every 2 years it would be less than $50 a month. Far less than any "cloud" service for that much data.
If I were going to keep an off-line copy of the archive, I would use Blu-Rays and a program that could keep track of what had been stored already so that incremental backups could be done on a monthly basis. I don't feel like putting that much effort into it though, I would rather just re-rip everything in whatever new whiz-bang format the kids are using.
If you can afford a 20TB RAID *and* have enough data of value to warrant *retaining* 20TB, then you can certainly justify the expense of a tape drive and corresponding tapes to back it all up.
Tape is not dead, contrary to more than 3 decades of claims otherwise. It is, in fact, perfectly alive and healthy, and well worth using (with a proper backup/rotation scheme) when you have that kind of data volume to store.
I've worked for Arcus/Iron Mountain and Recall both, and I can't tell you how many times over my years with those companies I've heard someone say "We don't need off-site backups" or "We don't need tape, we just have the IT guy take the hotswap drives home every day", only to have them come crawling back in tears weeks, months or years later when they've lost everything.
"Inveniemus Viam Aut Faciemus" 'We will find a way... Or we will make one!' --Hannibal of Carthage
Easy. Get a 6 or 8 bay NAS and a bunch of 4TB drives to fill it. Set it up in JBOD. Only local onsite backup solution that's feasible. Keep it powered down and unplugged except when you make a periodic backup. Offsite backup is more complicated: unfortunately you will have to shell out a lot for it, and it may not be feasible over a throttled home upload connection. Around these parts in the US most ISPs have 30mbit down, but only 3mbit or 4mbit upload. I'm being "Upgraded" to 60mbit down / 4mbit up next week. The upload to download proportion is ridiculous.
I'm backing up 8TB at home, by rsyncing to another 8TB of disk space. It's been working reliably for years, starting back when a TB was a lot and adding/replacing disks over time.
A 4TB hard disk is pretty cheap these days, so he just needs to get six of 'em and make another RAID array. Once you've done the initial rsync, I presume that subsequent changes will be relatively small, so transfer speed doesn't matter much, so he could hang them off a USB port in one of those USB-to-SATA dock things.
"Blu-Ray discs can hold a lot of data, but that's a lot of time (and money) spent burning discs that you likely will never need."
That is what all backups are. Spending money and time making copies you likely will never need.
You can't avoid spending money and time making copies you will likely never need, unless you choose to simply not backup.
When you reach that point, you have to look at solutions that fit. Offsiting is not gonna work unless your bandwidth is amazing (I'm guessing no). Really, it's going to have to be onsite, but you can do some things to reduce the size. It's most likely going to be a disk backup, since even older tape drives (LTO3 or 4) can cost a bundle when it comes to that many cartridges. You're better off buying the disk to meet the growth of the data, and backing up to some sort of deduplication mechanism (zfs from Oracle, etc.). Some are free, some aren't, you need to research what makes the most sense and if you want a backup management app (like BackupExec or Zmanda), and use those to write to the device and manage the images.
Regardless, it's the same old problem: good, fast, cheap; pick two...
You're better off building a second server.
Then use one server as the live server (the one you access from the network to work),
and the other as a backup server.
- Doing rsync and directory rotation [either ZFS/BTRFS/etc. snapshotting, or plain old rsync+hardlinks and directories] should work, especially since (unless you work in the video editing business) chances are that not much of the 18 TB changes a lot. So you could invest in 24 TB of RAID-6 or RAID-Z2 and afford to keep a few daily, a few weekly, and a couple of monthly and yearly snapshots.
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
All your friend's music and movies are already there.
To put it quite simply, it's their fault. RAID is dead and cannot support large data sets well. To make up for the shortcomings of RAID you need to store multiple copies or use erasure coding. Storage is cheap so buying a bunch of drives to hold 20TB (40TB raw for 2 copies) of data will not break the bank. There are many solutions out there and quite a few open source projects that one can leverage.
And unless the question's asker is working in the video editing industry, chances are that not much of these 20tb change on a regular basis.
It should be possible to build a 24TB or 28TB RAID-6(*) backup server that could still hold quite a few daily/weekly/monthly/yearly backups, provided it uses a space-efficient snapshot rotation system. (Not actually keeping separate copies, but either using a file system's copy-on-write snapshots like BTRFS's or whatever the ZFS equivalent is, or using the old classic rsync+hardlinks.)
The only thing that you don't solve is disaster resilience (you'll need an offsite replicate for *that*).
(*) At this size, hardware failures are going to be a certainty. RAID-6 (or ZFS's RAID-Z2) is the best solution against bitrot and for resilience against dead drives.
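To show why the rsync+hardlinks trick is space-efficient, here is a toy version in plain Python. It is not rsync (whose `--link-dest` option does this job for real); it just illustrates the idea that each snapshot directory looks complete while unchanged files are hard links into the previous snapshot. A file counts as "unchanged" here if size and modification time match, which is an assumption of this sketch.

```python
import os
import shutil


def _unchanged(src_path, prev_path):
    """True if prev_path exists with the same size and mtime as src_path."""
    if not os.path.isfile(prev_path):
        return False
    a, b = os.stat(src_path), os.stat(prev_path)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def make_snapshot(source, previous, destination):
    """Build destination as a full-looking copy of source.

    Files unchanged since the previous snapshot are hard-linked to it
    (taking no extra space); new or changed files are copied.
    Pass previous=None for the very first snapshot.
    Returns (linked_count, copied_count).
    """
    linked = copied = 0
    for dirpath, _dirs, files in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        os.makedirs(os.path.join(destination, rel), exist_ok=True)
        for name in files:
            src = os.path.join(dirpath, name)
            dst = os.path.join(destination, rel, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and _unchanged(src, old):
                os.link(old, dst)
                linked += 1
            else:
                shutil.copy2(src, dst)  # copy2 keeps the mtime for later comparisons
                copied += 1
    return linked, copied
```

With a mostly static media library, each new daily or weekly snapshot costs only the space of whatever was added or changed, and deleting an old snapshot directory never damages the newer ones.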
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
Greetings, dear reader-
I've been pondering the same thing. Thanks to Usenet, I've managed to amass a collection of over 13000 albums of music. It's in a lossless FLAC format, and I hope to live long enough to listen to it all. I think of my 13-year-old self, having built my first decent stereo system, having the system and the collection I have today placed before him. He'd think he was in heaven, and his current self feels similarly. Anywho, I'm thrilled to have this collection, but.... It's currently on two 3TB drives, in an external box, the drives set up as JBOD for simplicity, and to get the best use of space. Yes, I'm acutely aware that I'd be better off were I to set it up as some level of RAID. The thing is, finances are an issue, I just couldn't afford to do that. I've got 'em set up now in JBOD so it's just one big collection, not broken into separate drives, and I prefer it that way. Sadly there's no 6-8TB drives yet.
Well, so I'm aware that these drives are now over two years old, thus they're approaching the point where they're more likely to fail. I've had great success with drives over the years, some I've had for >ten years now, but what's on those isn't critical, wouldn't kill me to lose 'em. The music collection, OTOH, that would really really suck to find one day that it'd failed.
So, like the OP, I have a lot of data that I'd like to back up, and to do it as easily and cheaply as possible. Yeah, I know, probly not gonna be easy, cheap or fun, but it's either that or lose the results of all my time and effort amassing this terrific collection of ear candy.
I'm glad I don't have to find a way to save 20TB of stuff, at least. I keep adding to the collection too, an average of 5 new albums/day, It's gonna completely fill the 6Tb of space I have now, so it'll be time to do something new. So, my house is always filled with happy tunes. I'll be watching this thread. Thanks to the OP for posting the Q.
Peace
Option A: (I wish I had the bucks for a NAS)
(1) Buy a NAS that holds more than 20TB. For example, there is a 5x5TB NAS being advertised.
(2) Mount the NAS as a local drive letter.
(3) Spend a lot of time exploring which file-copy programs update 20TB "quickly."
(4) Disconnect the NAS and store it somewhere else.
(5) Put a P-Touch label on the original giving the last time it was backed up.
Option B: (This is what I use...with a bunch of 1TB to 3TB USB hard disks)
(1) Buy a bunch of large USB hard disks.
(2) Break your data up into directories. (It is probably already broken up into directories.)
(3) Copy the directories, one at a time, to the hard disks. Sometimes an Excel Spreadsheet of how much is in each directory helps. LEAVE LOTS OF EXTRA ROOM for future additions.
(4) Spend a lot of time exploring which file-copy programs update directories "quickly." I use: http://sourceforge.net/projects/freefilesync/
(5) Disconnect the first USB hard disk and store it somewhere else.
(6) Proceed to the next directory and the next USB hard disk.
(7) Put a P-Touch label on the original giving the last time it was backed up.
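The "spreadsheet of how much is in each directory" in Option B can be generated rather than typed. A minimal sketch, where the 25% headroom default is an arbitrary stand-in for "LOTS OF EXTRA ROOM":

```python
import os


def directory_sizes(root):
    """Total bytes stored under each immediate subdirectory of root."""
    sizes = {}
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        for dirpath, _dirs, files in os.walk(entry.path):
            for name in files:
                path = os.path.join(dirpath, name)
                if not os.path.islink(path):
                    total += os.path.getsize(path)
        sizes[entry.name] = total
    return sizes


def fits_with_headroom(size_bytes, disk_bytes, headroom=0.25):
    """True if the data fits while leaving the given fraction of the disk free."""
    return size_bytes <= disk_bytes * (1 - headroom)
```

Printing `directory_sizes(...)` as comma-separated lines gives something Excel will open directly, and `fits_with_headroom` tells you whether a directory should go on the current USB disk or wait for the next one.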
rsync is not intended for backup.
Name slips me, however some company recently shut down because their backup was rsync based. Original corrupted and rsync copied the corruption to the backup.
Don't use rsync.
agree that you backup my shit and I'll backup your shit.
What about 4 5TB mirrored drives in a fireproof safe?
All y'all losers posting things like "it must be all stolen", and "nobody ever needs 20tb" are just filling this thread with noise. It doesn't matter if it's porn, pirated movies, or your own damn sequenced genome, there's nothing wrong with wanting to have 20tb and there's nothing wrong with asking how to manage it. It's none of your damn business if he "should" or shouldn't" have something!
I have 14 x 2TB of storage and 25TB of data. My solution is to have a backup system. I do a refresh every week or so. It isn't perfect. I have had JPEG images get corrupted, and the backup process copies the fault over. Some videos get damaged, but knowing that is hard, as you can only really see it when you watch the 90-minute or 2-hour film. One way would be to do a CRC check and repeat that monthly: very time consuming. The backup system saves me from some damage and from accidental deletion, but both systems are in the same room. The next stage is a third system that is swapped out with the first backup system every month and stored somewhere else. If the material is commercial (ebooks, transfers of DVDs and CDs for streaming) then it could be built up again. All the family home movies going back 30 years and the photographs from negatives and digital cameras can only be done again if the original tape or negative is still around, along with the technology to extract it. Some of my early video tapes can be used again, but it would be difficult to get a good image from them; the digital copy done 15 years ago is a better copy. I have hard drives still running from '98 but a pile of broken ones from only three years ago. These large 2-4TB drives need checking often. RAID would help, but that would mean 28 drives for each system. Even with prices coming down, the costs soon mount.
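The monthly check described above could be automated with a checksum manifest. This sketch uses SHA-256 rather than a CRC (a swap on my part, not what the comment specifies) and assumes archived files are not supposed to change, so any mismatch is treated as damage.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = sha256_of(path)
    return manifest


def find_damage(root, manifest):
    """Relative paths that are missing or no longer match the manifest."""
    bad = []
    for rel, digest in manifest.items():
        path = os.path.join(root, rel)
        if not os.path.isfile(path) or sha256_of(path) != digest:
            bad.append(rel)
    return sorted(bad)
```

Build the manifest once when the files are known to be good, store a copy of it alongside each backup, and run `find_damage` before every refresh so a corrupted original is caught before it overwrites the good copy.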
Holograms with images of QR codes. How many holographic images can you put on a small glass cube? With Holographs, making duplicates would be a snap... Get it?
you need RAIRANASD...
Redundant Array of Inexpensive Raid Array Network Area Storage Devices
----------------------------
Esobofh - Currently drinking fresh mango juice.
Is to simply call the NSA or subpoena them since they probably have a copy or two of your data somewhere.
Wait, he "erased" his 20TB RAID array? What, with a giant electro-magnet or something? Did he Select-All > Delete and then go to bed thinking all was chugging along ok? Run a script that secretly had rm -rf * tucked away in it that he left running overnight? Cripes. Well,.. bum luck to that then.
Yeah, LTO5 or 6 cassettes are your best option, really, since you can additionally get those off-site, avoiding the catastrophe of a fire or flooding taking out your next 20TB array.
Better, though, is to PRIORITIZE: Identify which 5-8TB of data is "most critical" and make sure at least *that* is backed up, (onto removable HDs?). You can get to the other 12-15TB as time and expenses allow, or just let it be at risk.
Of course at this point he has 0TB of data, so he could start small with a cloud service, and then scale the backup as his hoarding expands again and takes over his life.
I guess this proves the maxim: "If your data is not in two places, it's already gone."
The OP said that this friend had 20TB of music and movies. It's not like it was a bunch of code for a game he was developing, or original documents, etc.
Unless he created that content himself, then nothing has been "lost" here.
It's still all available to re-download or re-purchase or re-rip.
If my media raid ever died... would I really be that dismayed? Not really. I only ever re-watch maybe 20% of it, and it can be re-acquired.
How about just finding a friend to share your media library with? You both build identical 20TB systems, and sync with each other periodically over a VPN or similar. So if one person loses their crap, 99% of it is still with your friend.
25 X DLT-S4A (0,8 TB) each.
Since he owned the 20 TB, he could re-rip them
It's obvious that at 20 TB, he only was moving digital copies of material that he personally owned - so it should be a simple matter of his re-ripping the material.
For something that large, and presumably something you may not want certain organizations with 4-letter acronyms that end in 'AA' to be able to subpoena a 3rd party and gain access to without your knowledge, build your own redundancy. It may cost more upfront, but ultimately building a second raid array on separate hardware and using an automated process like DRBD to keep them in sync seems like the most sane approach.
I'm waiting for the "Archival Disc" that is being proposed. At 300GB per disc it would take quite a few discs, but I would think storing data on a non-mechanical system would be best.
http://www.ign.com/articles/2014/03/11/sony-and-panasonic-announce-300gb-archival-disc
First off, compress the hell out of the data. Using 7zip or rar or something at its greatest compression ratio could probably cut the size of the data down to 10TB.
http://www.ebay.com/bhp/1000-blank-dvd
In corporations, there are a lot of junk files that users put up there; that's why administrators use this thing called disk quotas.
Under Windows NT 4 and 2000 you didn't have such a thing, and we had 12 gig of storage (back when PCs had 250 meg drives).
90% of it was junk, cute cat pictures etc.
You should limit yourself also...
There's no reason to be backing up 20TB of data if you're not Amazon or Google. Separate out the essential data that you can't live without. Your music collection, your work files and your photos. The rest is disposable. Then go get yourself 2 nice identical hardware RAID cards, set up a 4+ drive RAID5 fileserver using 1.5 or 2TB drives, with 3 active drives and one drive as a hot spare. Buy at least one extra drive for when you need to put your hot spare into action and replace a dead drive. Put all your important data on that raid, put it in a closet somewhere, put in a ventilation fan of some sort, set up email alerts to tell you when there's SMART errors or the raid is degraded, then check your raid status software once a month just to be sure it's all good. Then get another cheap external RAID enclosure with built-in raid5, (something like a StarTech SAT3540U3ER - which is iffy, but works for me) and fill it with 3TB or 4TB drives (plus another spare for when THAT raid fails) and use that to back up your first raid. The backup raid should be large enough to hold at least 2 full backups of the first raid -- choose your drive sizes accordingly. Then back up your first raid onto the 2nd and smile because you've finally achieved relative safety for your important data. Then take a deep breath and say to yourself "I can live without all those videos if something goes wrong. It will suck, but it won't be life-ending, after all that's what bittorrent is for". If you know where to go, you can find almost any movie or tv show and download it in under an hour via bittorrent. Hell, you can download 15 seasons of South Park in h264 format in under an hour. Most HD movies in 1080p format take 30 minutes or less over a decent connection. And if you really care about saving your videos, make an offline library and burn them to DVDs. 
Not really feasible if you've waited until you filled 20TB of drive space with movies and tv shows, but I have a series of 4 filing cabinets with DVD-sized drawers full of around 2000 CD-Rs and DVD+Rs (and a few BD-Rs for my 170GB collection of BBC Horizon documentaries) I've burned since about the year 2000 when DIVX and XVID format movies started to appear en masse. Every few months I'd spend an evening or two burning my latest batch of movies to DVD and then removing them from my hard drives. But I've found that most of the time these days I don't even touch my archives when I want to watch one of my movies because it's easier just to download a fresh copy in 1080p which is generally better than the archived version I downloaded years earlier. I expect that trend will continue, which is why I've recently stopped burning my movies altogether and now I just add another shared hard drive to one of my HTPCs as they fill up, or delete stuff I know I'll never watch again. And last, but not least, I'd be remiss if I didn't mention unRAID. Though I don't use it myself because I've long since gone down the path of the aforementioned RAID5 setup, unRAID could be the best option if you're starting from scratch. unRAID gives you RAID5-like redundancy, but with arbitrary disks, and with the benefit of only losing the data from specific failed drives in the rare event that it can't be rebuilt from parity data. If you want to know more about unRAID, google it. And forget about backing up 20TB of data for at least another 5 years. No one has that kind of time.
I was editing some off-air material and deleted the sub-directory on the wrong computer. Some of these were new files and not yet backed up, and the source for the edits had already been overwritten. I have Microsoft Home Server. I took the drives out of the computer and the external 4-drive housing and used RecoverMyFiles on a desktop PC to look for the files. The videos had already been spread across three drives, but I was able to recover them all. The software cost more than the data was worth - a DVD will be released this year. My existing version of the software could not find these files, but at least the latest version does work and will be useful again in the future. I have also used RecoverMyFiles to search through a dying drive that wouldn't allow access to the directories. I needed to check what I had lost and not backed up. I could recover files, particularly documents and databases, but not video. I was able to list what I had lost and redo the work, and to check that I had not lost anything important.
It's around $30 a year and you can access it from every device with access to the Net.
I'm backing up my 40TB music library on Jacquard loom punch cards.
Added bonus: You can use the punched cards to make fabric. ...as a sweater!
Right now I'm wearing Justin Bieber's "Love Me"
https://web.duke.edu/isis/gess...
The reasoning for not using a DVD or Blu-ray writer is pretty flimsy. They might be more expensive per TB than a hard disk, but they'll last 6 times longer. The majority of the data sounds like it might not even change. Movies, songs... it's all static data. Get a good incremental backup solution in place and it won't be hard to make sure everything is backed up.
As for the whole "you'll probably never need it" argument... well, your friend just needed it.
No matter what, you will need a backup. At some point in time you will not be able to buy replacement raid array cards that will work with the volumes you've created. Hardware will be obsoleted, and you'll have to replace it all... that means your backups will need replaced too. If you want it to last 50 years, then that's what it takes!
their entire electronic collection of music and movies
Explain to me how this 20TB digital collection isn't already a backup?
I'm a bit amazed that no one has mentioned 5 x 4TB external hard drives.
Simple solution for backing up digital media is unRAID.
Overall simple setup, extremely easy to maintain, and expandable.
Nothing comes cheap, but unRAID is by far the most cost-effective solution you can take when considering the complexities of a typical RAID setup.
I currently store roughly half, 10TB array and have absolutely no issues and have been running so for the past year and a half.
Several others in the well supported community have plenty of tenure and experience with the system as well.
You can use the grandfather / father / son method, onto tape or DVDs. It would be a lot of them, but not undoable.
1. Buy LTO6 drive (for $2k)
2. Back up to 8 tapes or less using LTFS or Windows' NTBackup ($500)
3. Sell LTO6 drive on eBay (may lose a couple hundred, but your drive is basically new with a few hours on it)
4. Repurchase drive at much lower cost if/when future recovery is ever needed (or send to tape restoration company, usually couple hundred per tape)
5. Back up incremental additions to 4TB USB HD (optional)
SyQuest 44. I heard there is a new 88MB cartridge coming soon.
Compress with ZeoSync (100:1 lossless compression of random data). Repeat until compressed file is small enough to backup.
If you need to backup 20tb of data, you can use symform. All you need is another 40tb space to be available for symform backup use.
20TB is five 4TB external hard drives. If you want better backup speed, use drives in removable SATA caddies. Just think of hard drives like media -- "removable storage cartridges". They are cheaper than any other alternative for non-datacenter applications.
"by erasing a RAID array on their home server" - don't use a RAID array. If it fails or you make a mistake, you lose everything. unraid is ideal for this. It isn't fast but it has disk level parity and a thriving support community to help. If you lose a disk, you can rebuild the array, and if you lose 2 or more disks, you only lose 1 or more disks of data - the rest of the disks are readable so most fo your data will survive.
'how would you backup 20TB of data?' - wrong question. Think instead how to replicate it off-site. My own setup has 16TB of films/music etc, and I have two 16TB copies at different sites which are loaded from a 1TB portable drive I move around. The bonus is whoever has the other servers gets to share your collection too.
Cost? - seriously, get over it. 3TB drives are mainstream and getting cheaper. HP's microservers are around £100 each and will take 5 drives.
1. Upload (encrypted) files to alt.binaries.backup.
2. Repeat #1 every 3 years.
You don't even need to keep your $10/month account if you don't want to.
Don't forget your parity files...
You could set up another server with 20TB (or more, for versions) and ssh rsync with a shell script & cron (or whatever) job. (Linux box, obviously). I did this for years, using keychain for authentication between servers.
Now we've switched to Windows Server, and I've got to find a way to replicate it using Win. Slow going.
It's not like you could just plug in an external drive,...
Actually, it's just like that. There are plenty of multi-drive external enclosures with USB 3.0 and/or eSATA ports on them. I'm using a Vantec HX4R with 4 x 4TB drives in it plugged in with eSATA, then just back up to it using rsync. If the data's important enough to you it's trivial to buy a second one and just swap them over periodically for off-site backups.
Then I realized that Blurays, DVDs, CDs and such are covered by homeowners insurance. With the exception of rare copies, you can buy a new version in nearly all cases. Ripping to disk really only needs you to rip the things you want to watch at an acceptable quality.
And really, I don't bother to back up that stuff unless all the rest of my things are backed up.
I have a Mac Mini server set up hosting about 8 TB of primary data on mirrored USB3 drives. I then have it running Time Machine on all of that to a 16 TB RAID5 array on a NAS. Total cost (not including the server itself)? About $1,000... and that's for two sets of backups, one for drive errors (primarily) and one that has an always-available actual backup.
-Daniel
Then that is your backup. If you are pirating, then serves you right.
The only cost-effective solution that I know of is to purchase a Synology Network Attached Storage (NAS) unit and back it up to Amazon S3 (backup to AWS S3 is built into the Synology), then set up the S3 bucket to migrate the files to Amazon Glacier after some period of time (maybe 14 days). A quick calculation on AWS costs would be about $250.00/month for 20TB of Glacier storage and there would be a small amount for S3 storage before migration (new files for a few days). A Synology DS-1813+ with 8 x 4TB hard drives (in RAID 5) would result in 28TB of storage and can be expanded to 68TB quite easily. Cost of NAS would be approximately $1750.00. The beauty of this configuration is that it is all managed by Synology's very easy to use DSM web management interface and could be running in a few hours. Now it would take weeks to upload the information the first time, but after that a few hours per night. I have a smaller configuration of this running on Synology DS412+ (4x4TB) and it works like a charm.
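For anyone who wants to redo the cost estimate above with their own numbers, here is a rough sketch of the arithmetic. The $0.01 per GB per month rate is an assumption made for the example, not a quoted price, and the helper names are made up; the $1750 figure is the NAS price mentioned above.

```python
def monthly_storage_cost(size_tb, price_per_gb_month):
    """Monthly bill in dollars for size_tb terabytes (1 TB = 1024 GB here)."""
    return size_tb * 1024 * price_per_gb_month


def months_to_match(hardware_cost, monthly_cost):
    """Months of cloud fees that add up to a one-off hardware purchase."""
    return hardware_cost / monthly_cost


# Assumed rate, not a quoted one: $0.01 per GB per month.
cost = monthly_storage_cost(20, 0.01)
print(f"20TB: ${cost:.2f}/month")
print(f"Matches a $1750 NAS after {months_to_match(1750, cost):.1f} months")
```

Retrieval and request fees are left out on purpose, so the real bill would be somewhat higher than this storage-only figure.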
If you think you can afford to build a 30TB NAS, then you can really only afford a 10TB NAS plus 2 identical backups, with one kept off site and only one of the backups in a writeable state for updating at a time. The remote copy only provides checksums and is then swapped with the local copy.
Keep data moving and checksum test each replication then it will survive intact. Look at how life manages the data in DNA, not perfect but still usable after billions of years.
I don't really understand the dilemma. If you have paid for 20TB in storage and your data is important spend the cash on another 20TB array or PC with DAS and back it up to disk. It's really not very complicated.
ZFS would be a great way to hold onto your data in the event you as a user somehow deleted your own files. Simply restore from a snapshot. :)
Sadly, to make a backup of 20TB, it would just make sense to have another server with a mirror image of your data off site. Keeping both in the same location would defeat the purpose. Now partner up with a good buddy and split the costs and your media collection. Win-win situation
One option is to store your files as metadata in a huge number of "small" JPEGs. You can also use steganography if you like. Then upload them to sites that allow an unlimited number of pictures (Facebook, Picasa, etc). Or simply get five 4TB HDs ($165 each).
Good day. If you recreate the RAID array exactly as it was then the data is there and not deleted so you are effectively recovering it. That is if you actually only deleted the raid array definition in the Raid software/Firmware. I have done this with success.
I use it to keep 60tb backed up that way. Works without a problem. Tips: you need a decent internet connection and split up into smaller jobs to keep manageable.
Well, to be fair, that's not a good analogy. An Egyptian gov't might only last 4-5 months anyway... ;-)
I have a 3x2TB RAID 5 array (4TB total) and I'm constantly bumping up on my limits. It's like 3/4 downloaded movies, some music, some software, and some personal pictures and data. Honestly, the personal stuff, the stuff I couldn't just re-download, might take up a whole 10-15 gigs. That stuff is backed up multiple times over via Dropbox, Google Drive, and scattered Blu-rays and flash drives.
They actually charge little to back up, but they charge an arm and a leg to get at the data. Glacier is really only for a rainy day. Also, $200 a month isn't really economical when you consider after half a year you could have purchased a 20TB backup solution...
http://www.flexraid.com/
http://lime-technology.com/ (UnRaid)
Best solution for big media collections.
All data is stored separately on each drive, and 1 separate parity drive can protect up to 21 drives (as long as it's as big as or bigger than any 1 of those 21 drives).
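To see why one parity drive can cover many independent data drives, here is a toy sketch of plain XOR parity. It is a generic illustration under the assumption of simple single-parity XOR, not a description of how unRAID or FlexRAID are actually implemented, and the names `xor_blocks` and `rebuild_missing` are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors plus parity."""
    return xor_blocks(surviving_blocks + [parity])


# Three toy "drives" holding independent data, plus one parity block.
drives = [b"movie001", b"music002", b"photo003"]
parity = xor_blocks(drives)

# Pretend drive 1 died; the other two stay readable on their own.
recovered = rebuild_missing([drives[0], drives[2]], parity)
```

In a scheme like this, a shorter drive would be treated as padded with zeros, which fits the rule that the parity drive has to be at least as large as the biggest data drive.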
I love my cigar too, but I take it out of my mouth once in a while.
1. You put it on a filesystem that has versioning/snapshots and you replicate it.
2. You buy a tape library and back it up to that.
My library is around 12tb and I use a dock and bare drives. I have a series of unix scripts that backup the library in parts. Here are my scripts.
rsync --verbose -r --delete-before --ignore-existing --include='[A-Ma-m0-9]*' --exclude='*' /Volumes/Drobo/iTunes/Movies/ /Volumes/Movie\ Backup\ 1/
rsync --verbose -r --delete-before --ignore-existing --include='[N-Zn-z]*' --exclude='*' /Volumes/Drobo/iTunes/Movies/ /Volumes/Movie\ Backup\ 2/
rsync --verbose -L -r --delete-before --ignore-existing --include='[A-Da-d0-9]*' --include='*.mp4' --include='*.m4v' --exclude='*' /Volumes/Drobo/iTunes/TV\ Shows/ /Volumes/TvBackup1/
rsync --verbose -L -r --delete-before --ignore-existing --include='[E-Re-r]*' --include='*.mp4' --include='*.m4v' --exclude='*' /Volumes/Drobo/iTunes/TV\ Shows/ /Volumes/TvBackup1a/
rsync --verbose -L -r --delete-before --ignore-existing --include='[S-Zs-z]*' --include='*.mp4' --include='*.m4v' --exclude='*' /Volumes/Drobo/iTunes/TV\ Shows/ /Volumes/TvBackup2/
rsync --verbose -r --delete-before --ignore-existing --exclude='Mobile Applications/' --exclude='iPod Games/' /Volumes/Macintosh\ HD/Users/dave/Music/iTunes/ /Volumes/Macintosh\ HD/Users/dave/Dropbox/Backup/iTunes/
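The scripts above split the library across bare drives by the first letter of the folder name. If the letter ranges stop fitting, one alternative is to assign folders to drives by size. The sketch below is a simple first-fit-decreasing packer and only an illustration; the function name and the folder sizes in the usage note are hypothetical.

```python
def pack_first_fit(folder_sizes, drive_capacity):
    """Assign folders to drives, largest first, using the first drive with room.

    folder_sizes maps folder name -> size; drive_capacity uses the same unit.
    Returns a list of drives, each a list of folder names.
    """
    drives = []  # folder names per drive
    free = []    # remaining capacity per drive
    ordered = sorted(folder_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, size in ordered:
        if size > drive_capacity:
            raise ValueError(f"{name} does not fit on a single drive")
        for i, room in enumerate(free):
            if size <= room:
                drives[i].append(name)
                free[i] -= size
                break
        else:
            # No existing drive has room, so start a new one.
            drives.append([name])
            free.append(drive_capacity - size)
    return drives
```

For example, `pack_first_fit({"Movies A-M": 3000, "Movies N-Z": 2500, "TV": 1500, "Music": 900}, 4000)` (sizes in GB) puts everything on two drives. The resulting lists could then drive one rsync call per drive.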
Even with Windows you can run rsync if you install Cygwin and its sshd.
If you can put the second machine in a distant room (garden shed, detached garage) that's unlikely to go up in the fire, that's better.
"It if was easy to do, we'd find someone cheaper than you to do it."
Considering that you've got to be running something larger than your average desktop PC to hold that much data, I'd consider looking at a tape library like this:
http://www.tigerdirect.com/app... ($3750)
8 slots for Ultrium 6 tapes, non-compressed will hold 20TB, 50TB if you can get decent compression...which I'm guessing you might not. I think tapes can be found for just under $65 each, depending on how you shop them.
I guess it depends on how many tapes you want to back up to after that.
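The tape count above is easy to redo for other library sizes. This is a small sketch of the arithmetic, using the 2.5TB native capacity of an Ultrium 6 (LTO-6) tape and the roughly $65 per tape price quoted above; the helper names are made up for the example.

```python
import math


def tapes_needed(data_tb, tape_capacity_tb):
    """Whole tapes required to hold data_tb terabytes."""
    return math.ceil(data_tb / tape_capacity_tb)


def media_cost(data_tb, tape_capacity_tb, price_per_tape, full_copies=1):
    """Cost of the tapes alone for a number of complete backup sets."""
    return tapes_needed(data_tb, tape_capacity_tb) * price_per_tape * full_copies


# 20TB of already-compressed video, so assume native capacity only.
one_set = media_cost(20, 2.5, 65)
two_sets = media_cost(20, 2.5, 65, full_copies=2)
```

One full set exactly fills the 8 slots, so a second set for off-site rotation means swapping cartridges by hand.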
Awk! Pieces of eight. Pieces of eight. Pieces of seven... ERROR: General Protection Fault. [Paroty Error.]
I don't think there's that much enjoyable content out there. Perhaps learn to delete crap media you don't ever really want, and then your backups will be much more manageable.
You'll only need about 14000 reams of paper...
www.ollydbg.de/Paperbak/index.html
You could laminate them for extra data integrity!
If you were a photographer you wouldn't be doing what you are doing.
That's from XKCD. No need to use an aggregator site.
Well, I might have a way, but it only works on a semi spherical planet in a vacuum.
Assuming you don't need RAID on the backup device itself, then a cheap desktop PC (usually from a custom white box builder - most OEM PCs don't come with enough SATA connectors/hard drive bays) with 5 or 6 4TB SATA hard drives does the trick. Sure, it'll cost you a fair amount for the hardware (in the UK, probably around 1,000 pounds or so), but it might be the most flexible solution (e.g. could be located offsite if you're paranoid, though you'd need a fast connection to it - at least 100 Mbits/sec I'd have thought - for that amount of data).
Of course, if you then want to keep multiple archive copies, then you'd have to look at compressing the backups and/or perhaps using backup software that does incrementals (e.g. Amanda on Linux or whatever). Another much pricier alternative is multiple spanning Ultrium 5 tapes in a 24-slot autoloader attached to a machine with little local storage (1-2 TB free for holding space), but we're talking 5,000 pounds or so for this solution.
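To put the "at least 100 Mbits/sec" remark above in perspective, here is a back-of-the-envelope sketch of how long a first full copy would take. It assumes decimal terabytes and a link that is fully used the whole time; the `efficiency` parameter is a made-up knob for protocol overhead.

```python
def transfer_days(size_tb, link_mbit_per_s, efficiency=1.0):
    """Days needed to move size_tb terabytes over a link of the given speed."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_mbit_per_s * 1e6 * efficiency)
    return seconds / 86400


full_copy_100mbit = transfer_days(20, 100)
full_copy_gigabit = transfer_days(20, 1000)
```

At 100 Mbit/s the initial 20TB copy works out to roughly two and a half weeks, which is why seeding the offsite machine locally first and only sending changes afterwards is attractive.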
My storage array is only 8TB, but I doubled down and built another 8TB array in a small Lian Li case. I do a mirror backup to it once a week or so, and carry it to work with me. That way, it's offsite in case something physically happens to my home.
I'm in a similar situation and I actually have planned for a worst-case scenario. However, my storage needs are slightly more modest at about 5TB (give or take).
My main, active archive exists on my primary desktop and is the location that will get the most changes. That, in turn, is backed up to a dedicated NAS server (currently an 8-bay Synology unit packed with 3TB disks) in my home. THAT, in turn, is backed up, off-site to a friend's NAS units of similar construction and capacity via CrashPlan. The free version offers "backup to a friend's computer" as an option, though the paid subscription offers to store data on CrashPlan's servers, instead. The cost is fairly reasonable for that option if none of your friends has enough storage for you.
One other last point - it might not make sense to back up EVERYTHING you have. Photos, critical documents, etc. (things you can't easily replace) should absolutely be backed up. Copies of game files, software installations, etc. (things that can be replaced relatively easily from the original media) should probably be left out of the backup set. That limits the amount of remote storage required as well as the time it takes to back up those items in the first place.
My sources are unreliable, but their information is fascinating. -- Ashleigh Brilliant
Pogoplug Cloud storage is unlimited for $5/month. Or at least they claim it is; backing up this amount of data might be a good way to find out.
How about CrashPlan? It is an online backup that offers unlimited space.
http://www.symantec.com/connec...
I too have that sort of magnitude of digital stuff on a home server.
I have flagged for backup only those files and folders that I would not be able to restore from elsewhere - so, no music, films, or other external content - either I have the original CD's and DVDs (I rip because most of the films are Zone 2 and I'm not buying the same damned stuff again because I now live and operate in Zone 1), or the material is available elsewhere online: where content is available via a stable url, I keep, flag and backup the URL's; Thus, only material that I have produced and which is "original" content is backed-up.
Result (surprise, surprise)? Less than 1TB and with an annual growth rate over the last 10 years of no more than 10% year-on-year
Bernoulli drives.... Like an IOMEGA 20+20. Dual 40MB 12 inch floppies should back that puppy up in no time! There's one on ebay for $64 that's 1% of the cost in 1986!
I did some consulting work for a company that wanted to store that much data, a kind of one-shot archival deal...
We looked around at options, and there was really only one cost and time/speed-effective one.
So I bought several SATA drives, backed it up on them, made an index, labeled them, then unplugged them and put them in a safe...
I know it probably seems dumb, we bought 14 3 TB SATA drives at 135 bucks each, so it ran over 2k including shipping and taxes.
They don't anticipate needing the data, but if at some point they ever do need it they can relatively easily locate and retrieve it.
They have a pretty big vault to store stuff like that in long term, so I understand it's not a perfect solution for just anybody. I would have preferred blu-ray, but I didn't want to be the one to burn all that data to discs that might not even be readable in 10 years.
I have my media computer and then I robocopy the data to a set of hard drives with the exact same sizes onto my server every night - incremental updates. So much data is worth the few hundred bucks you would have to pay. If I had to re-rip all of those blurays and DVDs it would take me months and costs me thousands in my own personal time.
Always ask yourself how much your time is worth.
I do like the idea of tape in theory due to the ability to back up and store offsite. At home I have different buildings, so I am creating a data center in a garage that I will be doing incremental backups to, but thinking about how to do this in a city got me going. The reason I don't go directly to tape is because it is slow, not automatic (you have to physically move the tapes) and more expensive than setting up a duplicate drive system.
How about making a deal with a neighbor that you can either get a direct network line to, or wireless at the least, and store backups at each other's houses? Encrypt them and do incremental backups nightly. That should be sufficient, and if your house burns down just pull the drives from your neighbor's. That might give you the best of both worlds.
ennalta
Backup only the data that needs to be backed up; cannot be recreated or redownloaded. Hard to believe all 20tb is critical.
Or host a replica server at another location after seeding the data. Then replicate the changes to keep the 2nd copy synced. Be sure to use encryption.
Who cares, really? There's no way this guy legally purchased that much A/V media over the years. Easy come...easy go...
hmmm... 20 TB data.
Buy twenty 1TB external hard disks, fill each one up to 1TB, and write down in a notebook each disk's serial number and what you stored on it.
Pro: your data is backed up and organized, and you can pull the respective hard disk to get your data.
Cons: hmm, 20 hard disks... maybe you'd get a discount if you walk up to a vendor and shock him by telling him you want to buy 20 1TB external hard disks.
Why do you ALWAYS RUN from these -> http://tech.slashdot.org/comments.pl?sid=4885825&cid=46474817
APK
P.S.=> Is it since you're a undereducated WANNABE in the art & science of computing? Yes... apk
That shows how STUPID a little troll like you are vs. this -> http://tech.slashdot.org/comments.pl?sid=4885825&cid=46474817
APK
P.S.=> Why do you always "RUN" forrest? Is it because you're nothing more than a useless waste of life TROLL? Absolutely, lol... apk
Running away from a fair challenge to you, again? See here http://tech.slashdot.org/comments.pl?sid=4885825&cid=46474817
APK
P.S.=> You're nothing but a TROLLING waste of life I am going to ENJOY destroying, once again, on the same thing (you being unable to disprove my points in favor of custom hosts files value in added speed, security, reliability, & even added anonymity online)... apk
http://en.wikipedia.org/wiki/Occam's_razor
Get another raid.
www.pricewatch.com
I see there are 4TB drives going for $156. I would maybe shell out another 20 or so to get a better/name brand. So for around 1.5~2k, you have a backup with far less time invested and can be rebuilt as needed. Or, get more drives with less space for better redundancy. The next smallest drive is roughly 30% cheaper.
It is a bit more expensive, but tape would be the way I would go. There are cartridge tapes that can run as fast as the hard disk transfers can take place.
Many tape backup systems rely on a base backup, and then on incremental backups. Each incremental backup was taken relative to the base backup. Once a month we would create a new base backup. (It was a business application, with daily changes).
I made sticky labels to indicate the backup date and generation number. These labels went onto the tape cartridge's plastic case.
In a very large shop, tape backups run from a feeder machine, with major automation and cataloging to allow reasonable file recovery time.
Tapes are checked a day before reuse, to ensure no lost oxide or fading. That function was part of the tape backup system.
Weekly tapes were duplicated and moved off-site.
All it takes is gelt and time.
Leslie Satenstein Montreal Quebec Canada
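The base-plus-incremental scheme described above, where every incremental is taken against the monthly base, is often called a differential backup. Picking the files for one can be sketched as below. This is a minimal illustration that trusts file modification times, which is an assumption; the function name is made up and this is not how any particular tape product works.

```python
import os


def changed_since(root, since_timestamp):
    """List files under root modified after since_timestamp (seconds since epoch)."""
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            if os.path.getmtime(full) > since_timestamp:
                changed.append(os.path.relpath(full, root))
    return sorted(changed)
```

Passing the time of the last base backup gives the set to write to today's tape. A restore then needs only the base tapes plus the most recent differential, rather than every tape in between.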
The only and best way to back it up is with another 6TB Array. You could set up your array card to write to both arrays simultaneously. Once you get that much data it becomes a major chore to keep track of the tapes, and another array would afford you real-time, up-to-the-millisecond backup availability. One thing the tapes will do for you is let you keep versions of your backup images from different dates. I would recommend something like a small Dell library that holds about 30 tapes and maybe two LTO6 tape drives. The tapes will allow you to go back in time where the 2nd array will not, but both would be good, reliable restore options in times of disaster.
Paul E. Bahre
I've got about 10TB and use an offsite NAS that I rsync to every week. Most friends will store it pretty happily if they can back their own stuff up to your server. You can split the cost and you both get your data backed up. And copies of all their stuff, too.
I'm not an expert but could he just only buy mirrored drives?
Thinking out loud- maybe he could buy 7 3TB drives to do it as well.
I wouldn't use Glacier. The cost is too high and you'd have to worry about bandwidth too.
The best way is to back up to another set of hard disks, of course. Cheap, fast, and you don't need new hardware such as a tape drive.
Are tape drives right for this kind of problem? only if you are interested in backup.
Also, you can use the "Towers of Hanoi" method to save tapes; it's the most efficient, if a little more complicated.
As a home user, I'd recommend you write out a tape change schedule and put it on the fridge. Yes, definitely leave it somewhere visible so that you don't forget a step and get confused.
http://en.m.wikipedia.org/wiki...
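The tape change schedule for the "Towers of Hanoi" method can be generated instead of written out by hand. The sketch below uses the usual rule that the tape index is the number of trailing zero bits in the session number, with the last tape picking up everything beyond that. The function names and the A-H labels are assumptions for the example, so it only handles up to 8 tapes as written.

```python
def hanoi_tape(session, num_tapes):
    """Return the 0-based tape index to use for a 1-based session number."""
    if session < 1 or num_tapes < 1:
        raise ValueError("session and num_tapes must be positive")
    # Count trailing zero bits of the session number (the "ruler" sequence).
    trailing_zeros = (session & -session).bit_length() - 1
    # The last tape absorbs every session the pattern would give a higher index.
    return min(trailing_zeros, num_tapes - 1)


def schedule(num_sessions, num_tapes, labels="ABCDEFGH"):
    """List the tape label to use for sessions 1..num_sessions."""
    return [labels[hanoi_tape(s, num_tapes)] for s in range(1, num_sessions + 1)]
```

Printing `schedule(16, 4)` gives the list for the fridge: tape A is reused every second session, B every fourth, and so on, so a few tapes reach back a long way.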
YBs are an active element of public policy. A YM is sort of the size of the universe and at that size the cosmologist types have multiple definitions of distance. But a TM is about the size of the solar system. Pluto is about .6 TM out and Voyager 1 is about 18 TM out. The next star is about a PM out.
If i buy high end SD cards and load them up with a YB then they will fit very comfortably in the space shuttle assembly building.
i notice that people are talking about TB USB sticks.
Yours was the best of the bunch (minus formatting html tags), though I enjoyed reading about the trials and tribulations of punch tape vs punch cards vs tape/DAT backup systems. The biggest problem I had many years ago was using a DAT format system that I could no longer purchase hardware for. So I had tapes, but no way to read them. That taught me a lesson: never use a medium that I might not be able to read 10 years from today. Thus I only back up on hard disks today.
I agree that to backup music, videos and other static content that has been downloaded via the internet (and not personally created) is a waste of time and space. As you pointed out, with even a throttled cable connection you can download this fairly quickly. So never waste time backing it up. Totally agree with you.
Now the one exception to video, pictures and music, are those that you create yourself. For your own personal pictures and personally created video. That needs to be backed up and I would suggest a harddrive (or multiple hard disks) for this purpose.
If you work in the video / movie industry creating content, obviously this comment does not apply to you. Check into creating your own Linux video server farm for while-you-sleep rendering and a homemade Linux SAN like the one in "Petabytes on a budget: How to build cheap cloud storage". You will have to learn some Linux to do this, but it would be well worth it if you have the need. This article should help you: "Thoughts about this DIY-Thumper and storage in general".
Just as with the industrial and union jobs of yesterday and white collar IT jobs, movie editing jobs are now being offshored to India, and when I was in LA a couple of years ago, a number of studios were relocating to Canada because it was cheaper for them. FYI.
For home users not in an industry creating massive videos, the next few paragraphs should cover you. Give some thought to what you really need and why. Don't back up anything you do not have to, like software and operating systems; focus only on the data you create.
Plan your locations for different types of data, since you can name your directories (mount points) whatever you want. You could have one for video, one for audio (music), one for images (from your digital camera) and one for everything else. If you have the need, perhaps a DB directory as well. This would look as follows:
/video/ ~ for downloaded video, not home movies, never backed up (this will be your largest directory for most)
/music/ ~ for downloaded music, not self created, never backed up (you could write this to DVD or copy it to a USB thumb drive if you want; the files are NOT that big. A 64GB thumb drive costs less than $30 on sale. Get a Micro USB adapter and only purchase micro SD cards, and get very large ones. I used to use 8GB in my Nokia N800; now my ZaReason ZT2 tablet has a 32GB micro SD card in it. Since I am using it for books, PHP development and research only, it will take a very long time to fill up.)
/myvideo/ ~ personally made video, back it up
/mymusic/ ~ personally created music, back it up
/images/ ~ digital images from your digital camera, back it up
/db/ ~ custom database stuff, back it up
/data/ ~ everything else, back it up
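The layout above can be turned into a small backup script. This is only a sketch: it assumes all seven directories sit under one root, and the names `LAYOUT`, `dirs_to_back_up` and `make_archive` are invented for the example. It uses Python's standard `tarfile` module with gzip compression.

```python
import tarfile
from pathlib import Path

# Directory names taken from the layout above; True means "back it up".
LAYOUT = {
    "video": False,    # downloaded video, never backed up
    "music": False,    # downloaded music, never backed up
    "myvideo": True,
    "mymusic": True,
    "images": True,
    "db": True,
    "data": True,
}


def dirs_to_back_up(root: Path) -> list[Path]:
    """Return the directories under root that exist and are marked for backup."""
    return [root / name for name, keep in LAYOUT.items()
            if keep and (root / name).is_dir()]


def make_archive(root: Path, archive: Path) -> list[str]:
    """Write a gzip-compressed tar of the selected directories.

    Returns the names of the directories that were archived.
    """
    selected = dirs_to_back_up(root)
    with tarfile.open(archive, "w:gz") as tar:
        for directory in selected:
            # Store paths relative to root, e.g. "images/cat.jpg".
            tar.add(directory, arcname=directory.name)
    return [directory.name for directory in selected]
```

Calling `make_archive(Path("/"), Path("/mnt/backup/home.tar.gz"))` would match the layout as written; the destination path here is hypothetical. Already-compressed files such as JPEG or MP3 will not shrink much further.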
For the majority of you reading this, everything from /myvideo/ to /data/ (five different directories) will easily fit on one 500GB drive. If you are smart and compress it when you back up, you can probably fit a month's worth of backups on that 500GB drive, if not more. Linux comes with built-in compression / backup commands, and on Windows you can use PKZIP (or another compression program) to compress your data and make your backup space go further. Even mo
Deduplication.
Tape would be best, though kind of pricey. Either that or hard drives: either cheap, slow disks or, at greater cost, a duplicate of your live setup. It's not going to be cheap for 20 TB of home data. I'm guessing most of the size comes from video and audio, which could probably be reacquired if need be. Back up your most prized data (personal documents, pictures, video, etc. that cannot be replaced) and take your chances with RAID 6 on the rest.
Great minds!
If you definitely want to keep digital information protected forever, you would have to go back in time, buy 20 tons of cards to be punched, and use the machine that IBM used for the North American census. With punch cards no electromagnetic storm can destroy my data, and I'd rest easy. The problem then becomes another one: how to store 20 tons of perforated paper? My definition of backup says there must be 3 copies, since one alone is worthless, so I need to buy a shed to store 60 tons of paper. I think I will change profession and work with paper! LOL!
One byte at a time
For a one time charge (in the high 4 digits or low 5 digits), guys like this http://pivot3.com/surveillance... have solutions that claim to use something called RAID 6. In my experience, this is a good solution if the data is more write intensive than read intensive. At first glance, good for storing movies and music for personal use, bad for streaming to multiple subscribers.
Your solution is simple: a) money permitting, get another storage server with the exact same specs and run an rsync cron job to back up the data once a night to the other system; b) this time use only half the storage, in RAID 10. I recommend FreeBSD or FreeNAS with ZFS, enabling compression as well, to get the most out of your space.
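A rough sketch of what the nightly job could run, written as a small Python wrapper around rsync. The host name `backup-host` and the paths are placeholders, and the function names are invented for the example. Only standard rsync options are used: `-a` (archive mode), `--delete` and `--dry-run`.

```python
import subprocess


def rsync_command(source: str, destination: str,
                  mirror_deletions: bool = False,
                  dry_run: bool = False) -> list[str]:
    """Build an rsync invocation that copies source onto destination."""
    command = ["rsync", "-a"]
    if mirror_deletions:
        # With --delete, files removed on the source are removed on the backup
        # too, so an accidental wipe would be copied across on the next run.
        command.append("--delete")
    if dry_run:
        # Show what would be transferred without changing anything.
        command.append("--dry-run")
    # A trailing slash on the source copies its contents, not the directory itself.
    command += [source.rstrip("/") + "/", destination]
    return command


def run_backup(source: str, destination: str) -> int:
    """Run the copy and return rsync's exit code (0 means success)."""
    return subprocess.run(rsync_command(source, destination), check=False).returncode


if __name__ == "__main__":
    # Placeholder paths: adjust to the real pool and the second server.
    raise SystemExit(run_backup("/tank/media", "backup-host:/tank/media"))
```

Leaving `mirror_deletions` off by default is a deliberate choice here: the story in the summary was an erased array, and a strict mirror would have faithfully erased the backup the following night. A crontab schedule of `0 2 * * *` in front of the script path would run it at 02:00 every night.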
I just can't see why you haven't thought of grabbing 10 or so 2TB or 3TB drives, copying a segment of the data to them and putting them back in their anti-static bags on the shelf somewhere. SATA drive docks are cheap, or plug it into your eSATA port to access the drive. Perform updates once per week on only the lead drive. I think the problem that remains is software to index it all and know which files need to be backed up and which are on drive X of Y.
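The indexing problem mentioned above does not need much software. As a sketch (all names here are invented, and the index is assumed to be a plain dictionary you would save to a file between runs), it is enough to record which shelf drive holds each file along with its size and modification time, then compare against what is on the server now.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    drive: str     # label written on the shelf drive, e.g. "drive 3 of 10"
    size: int      # size in bytes when the copy was made
    mtime: float   # modification time when the copy was made


def needs_backup(current: dict[str, tuple[int, float]],
                 index: dict[str, Entry]) -> list[str]:
    """Return paths that are new, or whose size or mtime differ from the index."""
    stale = []
    for path, (size, mtime) in sorted(current.items()):
        entry = index.get(path)
        if entry is None or entry.size != size or entry.mtime != mtime:
            stale.append(path)
    return stale


def locate(path: str, index: dict[str, Entry]) -> Optional[str]:
    """Return the label of the drive holding a file, or None if it was never copied."""
    entry = index.get(path)
    return entry.drive if entry else None
```

The `current` mapping would come from walking the live array (for example with `os.walk` and `os.stat`); only the files returned by `needs_backup` have to go onto the lead drive during the weekly update.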
Sadly, a Libertarian cannot force his views on another, and freedom cannot spread as does the cancer known as religion.
gosgog:
Hunt for an old Intergraph machine (photogrammetric companies somewhere may still have one), then back it up on the huge tapes.
It's not the amount of data (20TB) that's important; it's the "worth" of the data.
If the data is important enough, it is worth the cost of a tape backup device & media.
If it's not worth it, don't bother.
If only part of the data is important, match the cost of the backup device and media with the importance of the data.
I back up the most important documents online (dropbox), the unrepeatable but large (pictures) online in the home network + offline copy at the office,
the replaceable (music) only online in the home network, and the rest is not backed up.
Set up a torrent, let N people download it, and redownload in event of disaster.
Sharing: it's not stealing, it's data conservation!
They are sure to have a copy, just ask nicely.
First of all, I don't know why I'm posting as Anonymous Coward, but would happily change that to ArtfulOne, aka Arthur Fuller. If anyone can tell me how to change that, I'd be appreciative.
I have tracked down the way to do this, but as others have posted, the cost is non-trivial. About the cheapest way I can think of is to score a dozen or so 4TB drives (they're quite cheap now) and chain them; but the really (cost no object) effective way to go is with IBM's fancy drive, more about which may be read by visiting http://www.technologyreview.com/news/425237/ibm-builds-biggest-data-drive-ever/.
As for me, my demands are far more modest. I recently purchased a Blu-ray burner and backed up everything and it only took 8 discs, nowhere near even a couple of TB, but mind you, I'm just one semi-retired guy and most of the stuff such as DVD movies and CD music has already been burned to disc.
Arthur
Buy a cheap large USB disk.
Sort the files by creation time, remove the ones in your list of files already backed up (empty at start) and fill your USB disk using the oldest files first.
Add list of backed up files to your collection of disk indexes, (excluding files that you want the backup to process again).
Label the disk as "files from xxx to yyy"
Repeat until you're duplicating current files into a current disk.
If the data is sensitive then TrueCrypt the disks before writing the data, bearing in mind that losing your password would also lose your backup, and be prepared to wait days while the encrypted container is built.
I wrote an awk script to do this, though at the time I was writing 4GB DVDs rather than multi-terabyte USB disks.
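The awk script itself is not shown, so the following is not it, just a Python sketch of the same steps under a few assumptions: each file is given as a (path, size, creation time) tuple, every disk has the same capacity, and a disk is closed as soon as the next-oldest file does not fit, so that each disk covers one unbroken time range.

```python
def plan_disks(files, capacity, already_backed_up=frozenset()):
    """Assign files to disks, oldest first.

    files: iterable of (path, size_in_bytes, creation_time) tuples.
    Returns a list of disks, each a list of (path, creation_time) pairs.
    """
    pending = sorted((f for f in files if f[0] not in already_backed_up),
                     key=lambda f: f[2])
    disks, current, used = [], [], 0
    for path, size, ctime in pending:
        if size > capacity:
            raise ValueError(f"{path} is larger than a whole disk")
        if used + size > capacity:
            # Current disk is full: close it and start the next one.
            disks.append(current)
            current, used = [], 0
        current.append((path, ctime))
        used += size
    if current:
        disks.append(current)
    return disks


def label(disk):
    """Build the "files from xxx to yyy" label for one planned disk."""
    return f"files from {disk[0][1]} to {disk[-1][1]}"
```

After each disk is written, its paths get added to `already_backed_up`, minus any files you want the next run to pick up again.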
At some point in the future, when disk sizes have grown, you can copy multiple old smaller disks onto larger ones to avoid being stuck with old technology. I expect after the submitter has moved to a 100TB NAS in 2018 they'll be using the 20 as a backup.
-- Don't believe everything you read, hear or think
Bit by bit ... a little bit here, a little bit there.
Sure enough, the cow costume was hanging up next to the superhero outfit and sailors uniform. (S,Spud)
Why do you need that much data?
low cost country suppliers
What I would do is build a second NAS.
Do an initial backup to it, move it to a friend's basement/rack, and sync it up every week or so.
You can use my wife, I'm pretty sure her ass can hold a few 100.
"Recently I had a friend lose their entire electronic collection of music and movies." You can stream almost everything for free, and if you want the files back then apparently there is this new thing they invented called BitTorrent.
Most films are so crap that they really can only be watched once anyway, if at all. How many times a year was he watching the same movies? I doubt that in 1 year he would watch 20TB of movies.
Back it up on five 4TB drives which, as of 11pm last night when I ordered one, cost $150 each.
20TB for a home user is likely to be media data. It doesn't change much and it's usually possible to recover (re-rip, re-download, etc.), so it's probably OK to live with a higher risk of loss than a business would need for its backup of 20TB of data. With those assumptions, I'd focus on minimizing the risk of loss and opt for snapshot RAID. (If he wants true backup, backup to disk is my preferred option, using a decent backup program. If offsite is needed, carry the data that doesn't change offsite and arrange to send the stuff that changes daily over the web.)
True RAID 5/6 seems like a good way to minimize the risk of loss, but it's not nearly as good as snapshot RAID for media data. True RAID is too likely to fail during a recovery, when the disks and controller are heavily stressed. I've lost RAID arrays to both controller failure and multiple disk failure. Plus, a loss beyond the redundancy level loses everything with true RAID.
Snapshot RAID with pooling software is much better for media backup. If more drives fail than the parity level covers, you only lose the data on the disks that actually died. You can add additional parity drives at any time to increase redundancy. For Windows or *nix systems, SnapRAID is a free snapshot RAID option that works great. It comes with pooling capabilities that will make the entire array look like a single drive or, for more advanced pooling needs, there are multiple 3rd-party options. Liquesce is free for Windows and there are even more free options for *nix systems that need pooling.
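To see why a parity drive lets you rebuild a dead disk, here is a toy single-parity example using XOR. It is only an illustration of the idea, not SnapRAID's actual on-disk format or algorithm, and it assumes shorter disks are treated as if padded with zero bytes.

```python
def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR byte strings together, treating shorter ones as zero-padded."""
    result = bytearray(max(len(block) for block in blocks))
    for block in blocks:
        for position, byte in enumerate(block):
            result[position] ^= byte
    return bytes(result)


def recover(surviving: list[bytes], parity: bytes, lost_length: int) -> bytes:
    """Rebuild the one missing disk from the survivors plus the parity."""
    # XOR-ing the parity with every surviving disk cancels them out,
    # leaving only the contents of the missing disk (plus zero padding).
    return xor_blocks(surviving + [parity])[:lost_length]


if __name__ == "__main__":
    disks = [b"movie-one", b"song", b"photo-album"]
    parity = xor_blocks(disks)          # computed at snapshot time
    rebuilt = recover([disks[0], disks[2]], parity, len(disks[1]))
    print(rebuilt == disks[1])
```

Because each data disk keeps its own ordinary filesystem, losing two disks with one parity still leaves the third disk fully readable, which is the difference from striped RAID 5/6 described above.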
Currently, SD cards or memory sticks can be had for slightly less than $1 per GB. Suitable for storing the most precious data... A few years from now, this technology will hopefully be more economical than tape.
Know your pads. One time pad: good for cryptography. Two timing pad: where to take your mistress.
Curiously, in the labs there is this highly redundant DNA storage scheme. Sort of refrigerator size. Currently a YB of SD sticks at retail, at about $1 per GB, would be 10**24/10**9 dollars, or a thousand T$. Better called a peta $. Or a quadrillion bucks.
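The arithmetic is easy to check, taking a yottabyte as 10**24 bytes and assuming a flat $1 per GB (a round-number assumption, with no bulk discount):

```python
GIGABYTE = 10**9
TERABYTE = 10**12
YOTTABYTE = 10**24
PRICE_PER_GB = 1  # dollars; assumed round figure for retail flash


def cost_in_dollars(num_bytes: int, price_per_gb: int = PRICE_PER_GB) -> int:
    """Cost of storing num_bytes at a flat price per (decimal) gigabyte."""
    return num_bytes // GIGABYTE * price_per_gb


yottabyte_cost = cost_in_dollars(YOTTABYTE)        # 10**15 dollars
twenty_tb_cost = cost_in_dollars(20 * TERABYTE)    # the submitter's 20TB

print(f"1 YB:  ${yottabyte_cost:,}")
print(f"20 TB: ${twenty_tb_cost:,}")
```

At the same price the original 20TB comes to $20,000 in flash, which is why nobody in the thread is seriously proposing it.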