Ask Slashdot: How Do You Manage Your Personal Data?
New submitter multimediavt writes "Ok, here's my problem. I have a lot of personal data! (And, no, it's not pr0n, warez, or anything the MPAA or RIAA would be concerned about.) I am realizing that I need to keep at least one spare drive the same size as my largest drive around in case of failure, or the need to reformat a drive due to corrupt file system issues. In my particular case I have a few external drives ranging in size from 200 GB to 2 TB (none with any more than 15 available), and the 2 TB drive is giving me fits at the moment so I need to move the data off and reformat the drive to see if it's just a file system issue or a component issue. I don't have 1.6 TB of free space anywhere and came to the above realization that an empty spare drive the size of my largest drive was needed. If I had a RAID I would have the same needs should a drive fail for some reason and the file system needed rebuilding. I am hitting a wall, and I am guessing that I am not the only one reaching this conclusion. This is my personal data and it is starting to become unbelievably unruly to deal with as far as data integrity and security are concerned. This problem is only going to get worse, and I'm sorry 'The Cloud' is not an acceptable nor practical solution. Tape for an individual as a backup mechanism is economically not feasible. Blu-ray Disc only holds 50 GB at best case and takes forever to backup any large amount of data, along with a great deal of human intervention in the process. So, as an individual with a large data collection and not a large budget, what do you see as options for now (other than keeping a spare blank drive around), and what do you see down the road that might help us deal with issues like this?"
I think you already have the answer
1. Buy hard drive from brand A
2. Buy hard drive from brand B
3. Put in separate eSATA enclosures
4. Back up to both drives.
I run a RAID5 array on a spare box for backups, totaling 8TB before file system and RAID takes out its chunk. It's only turned on during backups, and is a fairly cheap solution for lots of storage if you look for sales on drives.
Sounds like you have digital Diogenes syndrome. How about you tidy up your room... I mean, data?
"I'm sorry 'The Cloud' is not an acceptable nor practical solution." Not sure what brand tin-foil hat you're wearing, but there are cloud backup solutions that encrypt your data *before* it leaves the machine. I use CrashPlan (I can't speak for others) and I've verified the encryption myself by capturing the traffic leaving my machine, even when CrashPlan was backing up to other machines on my own private network. Even the data it writes to locally-attached hard drives is encrypted. So there's at least one company who gets it right.
One way to save a bit of cash is to buy a USB/eSATA drive dock (single or double) and some bare SATA drives. This cuts the enclosure out and lets you buy bare drives, which are often cheaper than enclosed drives.
You could also consider Drobo or one of the Wiebetech multi-drive RAID containers. But encryption + cloud isn't all bad.
"large data collection and not a large budget"
This is your problem right there. You can't enter into a situation like this without planning a budget for the inevitable failures. I suggest purchasing a new larger drive (3TB are common now) and migrating the data from the problematic drive. Then migrate the data from several older smaller drives. This will reduce the component count (points of failure), save you power (cost in the long run) and keep you ahead of failures. You should plan on doing this periodically to maintain the integrity of the data.
Can You Say Linux? I Knew That You Could.
Of course you have to have enough drive space, but spinning storage is fairly cheap (modulo the hopefully temporary price bubble from the flood).
I use rsync (try "man rsync") from my main machine to mirror all my data to another machine nightly, and yet another machine weekly. This only copies the incremental changes, so it's fast. E.g., if you check something new into an SVN repo, only the last day/week of checkins will be copied by rsync.
For the *really* critical stuff, which is much smaller than everything, I also rsync it to a rolling set of several USB sticks, at least one of which is always off site.
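The nightly mirror described above might be sketched as the shell function below. The name `mirror_dir` and any paths you pass to it are illustrative assumptions, not the poster's actual setup; only rsync's standard `-a` and `--delete` options are used.

```shell
# mirror_dir SRC DEST: make DEST an incremental mirror of SRC using rsync.
# Hypothetical helper; the name and arguments are assumptions for illustration.
mirror_dir() {
    src=$1
    dest=$2
    # -a recurses and preserves permissions, times, and symlinks.
    # --delete removes files from DEST that no longer exist in SRC (a true
    # mirror), so double-check DEST before running this on real data.
    # The trailing slash on SRC copies its contents, not the directory itself.
    rsync -a --delete "$src"/ "$dest"/
}
```

On later runs rsync skips files whose size and modification time are unchanged, which is why a nightly run only moves the day's changes. DEST can also be a remote target such as `host:/path` when rsync is available on both ends.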
I have a solution I call the "Buddy NAS". Go out and get two cheap computers. It could be a PC or a mini-NAS or a low-end server. Anything that will hold multiple hard drives. You jam both full of hard disks and use them as a backup/NAS server. One PC is kept at your place, the other at your friend's house.
Both computers have an account for you and an account for your friend (it helps if your friend is nerdy and "gets" backup solutions). Both of you now have a backup solution in your own home and a remote backup server at a friend's place. Two copies of your data, one remote. Basically it's like having local and cloud storage for you and your friend and it'll cost less than a grand if you shop around. If neither of you has a static IP, you can use dyndns.org to connect to the remote boxes. Bandwidth shouldn't be an issue if you use rsync to back up changed files nightly.
I have both personal and business data that I can't afford to lose. The primary storage location for this is a 2TB data drive in my main machine.
I bought a Synology DS209 and installed a couple of 2TB drives in RAID 1. I have a scheduled rsync job that backs up the important local files to the NAS. Since the NAS is RAID1 I actually have 3 copies should one drive fail in some manner. Luckily I bought this last year before the flooding drove HD prices up, and the thing is actually worth more now than when I bought it :)
Bonus side benefits of the Synology include the fact that it runs on Linux and is highly configurable. With lots of built-in services, it can stream my music to me through a browser window or act as a DLNA device. The file system is also directly accessible from my network, which means I can easily access anything on there from my laptop while working in the garage, etc., and can keep my main machine tightly secured.
I have invested in USB backup drives of about the same total capacity as my primary storage drives. Yes, that's a lot of hard drive space for backups, but it's really the only practical solution that I have found. Just think of it as the cost of not losing all that data to the inevitable drive failure.
An external eSATA drive dock and a stack of 2TB drives might be a somewhat better way to go about it; at least backups and restores would be faster than with the USB drives.
I can offer a couple of suggestions... What I did was buy a used Dell PowerEdge 2950 on eBay for about $500 shipped. I added 4 x 1 TB SATA drives to it and run a RAID 5 setup with 3 TB of usable space across the four drives. This solution cost me less than $1000, and I have a nice playground to experiment with VMware ESXi.
I know that's not exactly budget-conscious, but it works great for me.
If I were on a tight budget I would just buy a 2 TB USB drive from Newegg or somewhere similar. It looks like you can buy a name brand for about $130.
If you have a little bit more money to spend you could always buy a couple of 2 TB internal SATA drives and run RAID-1 mirroring on them. You could put these into an old computer and make a little Linux NAS server...
If you're saying you have no money to spend, then maybe you need to consider cleaning up your data. Oftentimes all those "personal files" that you think you need to keep really aren't required. Just my 2 cents, but this problem is very solvable.
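The capacity figures in this comment follow from simple arithmetic: RAID 5 gives up one drive's worth of space to parity, and a RAID 1 mirror stores one drive's worth of data. A rough sketch, assuming equal-sized drives and ignoring file system overhead (the function names are made up for illustration):

```shell
# raid5_usable DRIVES SIZE_TB: usable space of a RAID 5 array of equal drives.
# One drive's worth of capacity goes to parity.
raid5_usable() {
    echo $(( ($1 - 1) * $2 ))
}

# raid1_usable DRIVES SIZE_TB: a mirror holds one drive's worth of data,
# however many copies there are.
raid1_usable() {
    echo "$2"
}

raid5_usable 4 1   # prints 3 (the four 1 TB drives above)
raid1_usable 2 2   # prints 2 (a pair of mirrored 2 TB drives)
```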
Seriously. How much crap do you really need to keep around?
Cheapo used market PC, invest in some large drives and a couple of drive docks, install FreeNAS.
Take a weekend to organize your data however it makes sense (by year, subject, file type, whatever), and store it on a particular drive. Rinse wash and repeat. Depending on how important the data is, store in a fireproof safe onsite, or offsite. When (read: if) you need the data again, dock the drive and retrieve.
Personally, I'm about to liberate myself from years of data. I'm tired of all these bloody drives and the annual, "I really want to look at _______ again". It's amazing how much of that crap has zero personal value anymore. (This isn't a comment on your data, but mine.)
#SickNotWeak
I put it onto Facebook.
Your "personal data" has all of one spindle of redundancy, and you're worried about how to copy data off in case the drive is currently dying? IT SOUNDS like you wouldn't be sad if that data disappeared tomorrow, especially the stuff on external drives that will happily fall a few feet and be rendered useless.
I suggest you start by augmenting your storage with *a few* additional 2 or 3 TB drives, and learning about Unison. I've been on a *bootstrapping* budget for a while, and yet my *music* is synced between two 2TB drives, and backed up to a mostly-off hodgepodge RAID array of old 300GB drives. My important personal files (32G) are synced across a flash drive, two laptops, a server, and those two music drives.
To start with you could decide what data you actually care about. Start with things of your own creation - personally I use online and backup drive copies of my source code repository, original music, etc and perform backups regularly. Just grab it with a script and it's done. Apart from that, who cares? How much information could you actually be creating that isn't available elsewhere? I wouldn't be happy to re-rip tons of CDs and re-install dozens of tools, but it would be more of an annoyance than a loss. If all you're doing with the data is wrestling to keep it anyway, it may be time to downsize.
One, you can't have enough backup images as something always seems to go wrong. Should include at least one offline unplugged "safe" unit. I know, it's a hassle to keep them up to date.
Two, the longer you wait the less all this backup space costs, so don't buy too much too soon.
So your disks are full and possibly broken. You don't want to have more disks, you don't want tape or optical media, and a storage provider (aka The Cloud) is not an option... Then you have three solutions "down the road":
1) Delete stuff
2) Invent a new compression algorithm that will allow you to reuse the same disks forever without losing data
3) Rely on magic*
*might overlap with solution #2
lucm, indeed.
It depends on the data, but many formats compress really well with WinRAR. Many of my files, for example, reach nearly 10:1 compression. Unless we are in the same profession, I wouldn't set your expectations that high, but I imagine on average you could get your data usage down to 40%. If I'm right, maybe you could WinRAR several folders from the failing drive onto the smaller drives and not necessarily need to free up more space.
That said, I really do think the suggestion of buying another drive is spot on. I saw a 2 terabyte drive for $120 at Best Buy yesterday.
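Before committing to an archive-everything plan, it may be worth measuring how well a sample of the data actually compresses. The sketch below uses gzip rather than WinRAR (a substitution, made only because gzip is available on most systems), so the numbers will differ somewhat; `compressed_percent` is a hypothetical helper name.

```shell
# compressed_percent FILE: print the gzip-compressed size of FILE as a
# whole-number percentage of its original size (lower is better).
# Assumes FILE is not empty.
compressed_percent() {
    orig=$(wc -c < "$1")
    comp=$(gzip -c "$1" | wc -c)
    echo $(( comp * 100 / orig ))
}
```

A result of 40 would match the 40% guess above, in which case the 1.6 TB from the question would shrink to roughly 0.64 TB. Already-compressed formats such as JPEG photos or video typically land close to 100.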
"I like to lick butts!" by MobileTatsu-NJG (#32700246) (Score:5, Informative)
As several people have said, you already answered the question yourself: spare HDD + Blu-ray.
You can achieve what you want by also changing the way you think about your data.
How much of your personal data is live? As in, how much of it do you access constantly, and need immediate access to?
Here's what I do: I have a discrete HDD set up for each data type (not needed... but I had spare ~500 GB drives, so that's how I did it). They are broken down into Music, Projects, Video, and Photos. Each of them is synced monthly to a 2TB external drive that is spun up only to do a differential backup.
Data that I haven't accessed in 6 months (mostly photos and old closed projects) is moved to archival-grade DVD and removed from the archival HDD.
So irreplaceable things (3 decades of photos, years of work) are stored and can be accessed within a few moments, less important but commonly accessed stuff (music and instructional videos, or documents I use every day) are live and backed up on the Archive.
*A)bort, R)etry, I)nfluence with large hammer.*
You must give up some of it, or transfer it to some other, long-lived medium.
Otherwise, I suggest you face reality and invest accordingly.
Windows 2000 - from the guys who brought us edlin
and keep copying things forward onto newer technology and vaguely try to keep two or more copies of everything in
some unorganized state
until you get old enough that you realize that it's not going to matter soon anyway
Drobo -> mostly reliable local backup
BackBlaze -> mostly reliable offsite backup
You might want to substitute a ZFS-based FreeNAS for the Drobo, if you're so inclined. It's less automatic, but seems just as reliable.
All of my personal data is in my home directory and easily backed up to non-volatile media (which I do a few times a year, but not as often as I should.)
All of the project data is on SourceForge or company project servers, so there are duplicate copies of that.
I hardly think of my music or movies as "personal" data nor as irreplaceable. Were I still playing video games, I don't think I'd bother backing up game data, either.
When people talk about needing entire drives for their personal data backups, I have to wonder: WTF have you got ON there?
I do not fail; I succeed at finding out what does not work.
If there's some personal data you're missing at some point, just ask Google or the NSA ... But seriously, I've never made backups and not even bothered to copy over stuff from old PCs to newer ones when I upgraded (I keep old hard drives in a closet just in case there's something old I'm missing, but I never really do). The only personal stuff I keep safe is images on my iPhone (backed up on the PC) and email (safe-ish on the server at work). If I needed more space, I'd go with Wuala due to its relative safety (redundant storage, client side encryption) - but it's only free for 2GB or so nowadays. But ask yourself: do you really need all that data? I don't think so.
"I love my job, but I hate talking to people like you" (Freddie Mercury)
Check out RDX (http://www.rdxstorage.com/). Much easier, cheaper, and more versatile than tape for backup. It has essentially unlimited capacity and can be upgraded very easily without losing any prior investment. It can be used online or taken offline for storage.
Here's what I did:
My first iteration of off-site storage was simply using an external drive that I kept at work. I'd bring it home every so often and run a backup. However, I'm really too lazy for that. So....
2nd iteration was to buy a dirt cheap PC and a 2TB drive (enough for my needs at the moment). I put Linux on it and wrote a quick shell script to log into my home network via OpenVPN, mount my requisite NFS shares, and run an incremental backup via tar. It's currently sitting at my parents' house. Every few months I'll use the above external drive to refresh the entire backup (that would take months via broadband) on a visit to the parents. Fortunately, with the paltry upload speeds TW gives me, I don't have to worry about killing my dad's connection. And the incremental backup usually runs in a few hours. If I ever need it, they're only a few hours away.
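The poster's script isn't shown, so the following is only a guess at what "an incremental backup via tar" could look like, using GNU tar's `--listed-incremental` option. The function name and arguments are hypothetical.

```shell
# incremental_backup SRC SNAPSHOT ARCHIVE
# The first run (when SNAPSHOT does not exist yet) produces a full backup.
# Later runs that reuse the same SNAPSHOT file store only what has changed
# since the previous run. Requires GNU tar.
incremental_backup() {
    tar --create --listed-incremental="$2" --file="$3" --directory="$1" .
}
```

Each run needs its own ARCHIVE name (for example, one per date), and a restore means extracting the full archive first and then each incremental archive in order.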
Obviously you still have an issue of tracking things down on the rare occasions when you actually need some of your family photos. But you can rest assured that they're in there somewhere and weren't purged last time you needed a few GB for more webserver logs.
Maybe the first step is to de-dup the existing data. You'll still have some manual intervention to check possible duplicates, but it's a first step towards tackling the bigger problem.
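A first pass at finding duplicates could hash every file and group identical hashes, as in this sketch. It assumes `sha256sum` is available and that file names contain no newlines or backslashes; `list_duplicates` is a made-up name.

```shell
# list_duplicates DIR: print groups of files under DIR with identical content.
# Files are grouped by SHA-256 hash; a blank line separates the groups.
list_duplicates() {
    find "$1" -type f -exec sha256sum {} + |
        sort |
        awk '
            # Each line is: 64 hex chars, 2 separator chars, then the path.
            { hash = $1 ""; path = substr($0, 67) }
            hash == prev {
                if (!shown) { if (groups++) print ""; print prevpath; shown = 1 }
                print path
            }
            hash != prev { shown = 0 }
            { prev = hash; prevpath = path }
        '
}
```

The output is only a list of candidates. Deciding which copy to keep is still the manual step mentioned above.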
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
I have a large amount of personal data as well (also, no pr0n), though I realized somewhere along the line that it wasn't that important not to lose it. When I die, most of it will probably be tossed anyhow. Who is going to want to sort through it all?
With that said, I still don't want to get rid of all my data, so I have a Drobo with five 2 TB drives, and I also have a Linux box with RAID set up that backs up the Drobo. But really, think hard about how much effort and expense you want to put towards keeping data. There is probably a whole lot of it that can just be tossed because you will never look at it again. It will probably save you a bunch of time and money to go through all the data and get rid of the stuff you'll never actually use. This actually works well for garages too.
My fileserver uses RAID and makes a separate (encrypted) backup to an external USB hard drive. Fortunately, my data hasn't grown faster than hard drive sizes, so I can fit it all on a single 2TB drive. To ensure file integrity, I periodically have rsync verify file checksums.
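The poster does this check with rsync. As a different way to get a similar integrity check without rsync, the sketch below records a checksum manifest and re-verifies it later. It assumes `sha256sum` is installed and that the manifest is stored outside the directory being hashed; both function names are hypothetical.

```shell
# write_manifest DIR MANIFEST: record a SHA-256 checksum for every file in DIR.
# MANIFEST should live outside DIR so it does not end up hashing itself.
write_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + ) > "$2"
}

# verify_manifest DIR MANIFEST: re-hash the files in DIR and compare them with
# the manifest. Returns success only if every listed file is present and
# unchanged; prints one OK or FAILED line per file.
verify_manifest() {
    ( cd "$1" && sha256sum -c - ) < "$2"
}
```

Writing the manifest from the primary copy and verifying it against the backup copy confirms that the two match. Re-running the verification months later can catch silent corruption on a drive that otherwise looks healthy.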
As a secondary backup, I use a 1TB notebook drive locked in a USB enabled fire safe:
http://www.amazon.com/SentrySafe-QA0121-Fire-Safe-Waterproof-Storage/dp/B00166187Q/ref=cm_cr_pr_product_top
I used metal straps to tie it down and lock it to my computer desk in the hope that if someone comes in to steal the computer, they'll just grab it and run without prying off the data safe. The safe is only rated for 30 minutes @ 1500 degrees so it's not a perfect solution, but better than nothing.
For my really important data (old tax returns, scanned in records and receipts, etc) I back them up to the cloud. For photos, I keep the full-size image locally (some TIFFs, mostly JPG's), but keep a lower res lower quality image in the cloud. All of this is less than 20GB.
Most of my big data is DVD's that I've ripped and I'll count on insurance to replace them if they are lost - I don't even back them up to the drive in the fire safe.
I had a similar problem. I had let a friend borrow an external drive for a backup, and it came back write-protected! He was using Windows, I use Mac and Linux, and I could not figure out why or how this happened. I was at capacity on that drive as well as two others and badly needed space (I needed to back up a laptop before installing a new OS). A few other friends came to my rescue and let me borrow some of their drives until I could find a more permanent solution. The suggestion to build a RAID box is a good one. It will allow you to build up your storage capacity over time. My advice is to use multiple methods: back up locally AND send your most essential data to the cloud via an encrypted service (this will protect against fire, theft, etc., plus it's handy to have access to your files from any computer). I like Wuala personally, but SpiderOak is also a very nice solution. There are lots of good, secure solutions now that are relatively affordable.
I think it's time to admit that you're a hoarder. What exactly -is- your personal data that's so precious? I run a server just to keep my skill set up and run my side business, but I've only managed to accumulate around 600GB of data, only about 35GB of it is 'mine', the rest is client backups.
So first admit that you're a hoarder, then decide if you want to address that issue or indulge it. If you choose to indulge it, you're going to want to build a small home server. Something with a low-end 64-bit CPU (i3?), a gigabit LAN port, and lots of SATA ports and 3.5" drive bays. Buy a bunch of high-quality (WD RE4?) matching drives that fit your data needs times two (you're RAIDing space away). Once you have that, install Linux on it, build a software RAID-1 or 0+1 array (don't do RAID-5 unless you can handle days of rebuild time), and format it with something accessible (read: in the kernel, like ext4). Create a share on the array with Samba and happily access it from all your machines (don't bother with Netatalk or NFS; CIFS is great on all platforms). As your data needs grow, you can add drives in pairs or replace drives with larger ones and grow the volume. If you need backup, you'll want another array, preferably on another low-end box (an enclosure on your desktop?), but it can be built on RAID-0 or JBOD to save money.
"Sometimes, I think Trent just needs a cup of hot chocolate and a blankie." -Tori Amos on Nine Inch Nails
To back up personal data from drive C on Windows, I use a batch file that runs xxcopy to clone all the folders. These are all I would need to restore after a clean install of the operating system. These folders are routinely cloned back into the C drive of a wide range of computers. For all other major folders on other partitions I use Allway Sync and make sure I have 3 copies of all 10TB of data scattered around my six computers and six enclosures. Most publicly related major folders are carried away to other houses.
I keep absurdly huge drive images of customer machines since they seem to be able to destroy them so quickly it's become simpler to image than to re-fix....
So anyways, I have a specific machine, old case, huge, noisy, excellent airflow and it has two RAID1 arrays in it, a 1.5 tb raid, and a 2tb raid - They are backed up weekly to external 1.5, and external 2.0 tb disks, that are left unhooked except while syncing.
THUS: RAID protects me from component failure ( if one drive fails, stick in another drive! I've already done this with a 2tb disk ) allows me a centralized fileserver for my home network, and the externals protect against localized acts of god/viruses... If my RAIDs get compromised and I have lots and lots of redundant viral infection instead of my data - format 'em, and re-write from externals.
Admittedly, this many disks was not cheap, however in the long term, all I need to do is replace disks (ideally) and the prices on this size of disk will certainly keep dropping. As a side bonus, since all of my data is centralized I can share it to myself online via a wee bit of port forwarding and suchly...
TL;DR build a computer with a RAID array of largish disks, back it up to external regularly. Titty sprinkles.
PS Slashdot, FUCK YOUR CAPTCHAS.
It's that time again, is it?
http://ask.slashdot.org/comments.pl?sid=2452630&cid=37557630
Either.. ..or..
A: Buy that HDD. Yes, they're a bit more expensive right now
B: Wait a few months, prices will come down again, buy that HDD then. Yes, you may lose your data in the mean time.
Now stop asking or I'm going to pull over.
Drives die, sometimes without warning (an old statistic from IBM says 50% of the time there is no warning). You could just throw everything away, as you are going to lose it anyway, sooner or later. Or you could find the resources needed to make sure you have everything on at the very least two drives (one of which should not be connected or running). There is nothing that can replace reasonable backup.
As a side note: Common sysadmin wisdom is to have 3 independent backups in addition to the original.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
You might be on the next spinoff of Hoarders programs, a digital hoarders show.
In this show, redundancy, old versions, and files that haven't been opened in 5+ years are brought into question, and you will be embarrassed to defend them... You will attempt to justify why you still have Linksys drivers for a WRT54G you don't even have anymore. And no, the DVD ISO of the Alvin and the Chipmunks movie, that you never burned or watched, is not worth saving.... Neither are about 85% of the digital pictures you took (you know, the ones that were the 'bad shot' that you took before finally getting the good one).
Take a day or two, go through it chunk by chunk, and purge! PURGE!
Compression is your friend. So is weeding out things you don't use. Hoarding is a problem in the digital age as well.
A friend and I each have FreeNAS servers with multi-TB RAID-Z.
Some of our data we keep mirrored between them.
The servers are physically brought together occasionally for a full sync, but most of the time rsync -n is run over the internet to see what needs to be updated, and the data is transferred on a removable hard drive.
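The dry run mentioned above might look something like this. The function name and arguments are placeholders; `-n` and `-i` are standard rsync options.

```shell
# pending_changes SRC DEST: list what rsync would transfer, without copying.
pending_changes() {
    # -n (--dry-run) reports the changes but leaves DEST untouched.
    # -i (--itemize-changes) prints one line per item that differs.
    rsync -a -n -i "$1"/ "$2"/
}
```

If the list is short, the changes can simply be sent over the wire by dropping `-n`. If it is long, that is the cue to carry the removable drive over instead.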
1984 was not supposed to be an instruction manual.
I just put everything on Facebook -- hey, where's everybody going?
Floppies
More floppies
More floppies
(I can sell you some cheap!)
Chaos maximizes locally around me.
I was given a P4 OptiPlex desktop, which I am using as my fileserver. I have a pair of 1.5 TB disks and a pair of 2 TB disks in mirrors, running FreeNAS from a USB thumb drive. The OptiPlex is great because it is quiet, fast enough, and the case conveniently fits 4 drives (and it was free). I have a cheap 4-port SATA card for the drives, and I splurged on a decent Intel gigabit NIC after having no luck with a couple of other cheap ones (Asus and D-Link, I believe). If you do the thumb drive thing, make sure to buy two and dd the contents over; one failed on me and now I have to manually manage my RAID via the CLI. Of course this does not protect me from myself, but I haven't lost too much due to my own stupidity. I used to have a larger fileserver with all sorts of little drives, but I replaced them with a couple of big ones and I'm much happier for it. Lots of people swear by Drobos. One person I know plugs a decent USB drive into his hacked router. Rsnapshot is good for automatic incremental backups.
I've learned a few things from my 45 years in the computer business. RAID arrays aren't a reliable way to prevent data loss. It's not a backup unless you have successfully restored from it. You need to have a FIFO with off site backups.
As others have noted, if you can't afford AT LEAST another drive, serious problem right off the bat. One wonders what the data is worth given this.
I'll move on, assume the data is worth AT LEAST another drive or two (we're talking a couple hundred bucks at most, come on):
1. drbd: raid to a low cost, remote machine with similar sized drive. Dead drive is now recoverable.
2. amanda or similar backup to drive on remote machine. No, not tape, just virtual on disk. Now have a backup history as well in case one needs that file that one deleted 6 months ago.
Yes, cost is a couple hundred and some older machines, but really, what is your data worth in the first place?
Anything is possible given time and money.
Get an older-generation LTO drive, get identical hard disks with spares, and do full drive image backups with incremental backups in between.
Mirroring disks (RAID 1 or 0+1) is good too.
All that other stuff (DVD/Blu-ray, cloud) is chickenshit. Drives die or act up.
Run Linux on one drive and do image backups from there.
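Zero-fill the empty space (run this in a directory on the drive to be imaged until the disk fills up, then delete the ZERO* files) so the free blocks compress well: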
dd if=/dev/zero | split --verbose -b 2000m -d - ZERO
compress the image:
dd if=$DEVICE | gzip -v | split --verbose -b 2000m -d - NAME
and write the chunks to tape.
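Restoring is the reverse of the pipeline above: concatenate the chunks in order, decompress, and write the result back. A minimal sketch, assuming the chunks were read back from tape into one directory and share a common prefix such as NAME; `restore_image` and its arguments are hypothetical, and pointing it at the wrong device will destroy that device's contents.

```shell
# restore_image PREFIX TARGET: rebuild TARGET (a device or an image file)
# from gzip-compressed chunks named PREFIX00, PREFIX01, ... as written by split.
restore_image() {
    # The shell expands PREFIX* in sorted order, which matches the order in
    # which split wrote the chunks. No other files may share the prefix.
    cat "$1"* | gunzip -c | dd of="$2" bs=1M
}
```

Trying this against a scratch file or spare drive before it is ever needed is a cheap way to confirm that the chunks can actually be read back.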
People have different needs. Some needs are imposed by either employers or the wonderful US Govt. for mandatory data retention. Others are your life's design work that you want to retain until you die. Other data you want to pass to your kids. If you can't afford to lose it keep multiple backups on multiple media in multiple locations. Books & pamphlets have been written on this. Transfer the data to new media once a year or two or three & keep all working drives.
No single storage device local or remote is immune from disaster. The Alexandria Library succumbed and took with it countless early human treasures. Wars have done in archives all over the world. Lightning, outages and power surges can defeat the best protections even when electronic equipment is turned off, but still plugged in (laptops are better when left unplugged, which is actually a great asset).
Backup is one thing; recovery is another and it can be GUT WRENCHING. The recovery process needs as much thought as backup.
A Clue or Two: A business partner had his MBPro backed up to 2 external HDs. Not great, but OK. Said MBPro crashed on the Lion upgrade. No way to know whether it was hardware or software and the MBPro should have at that point been off limits for use until carefully checked out. He happens to live in an area subject to lightning and outages which can affect anyone (even with a UPS). However, he reinstalled the Snow Leopard and plugged the first BU HD in an attempt to reload the data; HD became corrupted. Should have stopped, but then the 2nd HD was corrupted. Moral of the story; Recover data from a backup to an external HD running on another computer than the one that got mucked up.
The cost of 3-4 external 2-3 terabyte hard drives and a couple of cases or a RAID box is dirt cheap compared to the value of the hours you put in on your computer each year, as are Blu-ray drives & discs.
Caution: Someone on this list mentioned putting drives and disks in a "fireproof safe" or "fireproof file cabinet"; wrong! The UL approved boxes are designed only to protect "paper" for a given amount of time in a typical fire by releasing steam (212 deg. F = goodbye DVD/BR disks). Once the fireproof agent uses up its water... Fahrenheit 451 takes care of all contents... permanently. This is why multiple locations are needed.
SparkleShare looks and works like Dropbox, but is actually just a fancy automated self-hosted Git repo (which you can interact with using Git commands on a remote repo, if that is what you want to do).
The wiki explains how to encrypt things (and the encfs recipe doc'd on the wiki also works with Dropbox, etc.)
I think the project has matured really well, but still isn't really well-known, and doesn't even get mentioned much on the slashdots, although that's where I heard about it.
www.sparkleshare.org
https://github.com/hbons/SparkleShare/wiki
You can't be ahead of the curve, if you're stuck in a loop.
I'm assuming to start with that you have backups of everything in some fashion, with which you could put it all back together if your biggest drive suddenly failed spectacularly.
In that case, how important is uptime to you? Since this is personal data, I'm guessing that you could live without live access to it for a few days. And given that, I think your best bet is not to keep a spare just sitting around, but to only buy one when you need it. Hard disk prices keep going down, and the price for the same drive six months from now is almost guaranteed to be lower than it is now, barring things like the Thailand floods. The other big advantage of this method is that when you upgrade to bigger drives, you don't have to immediately upgrade your spare as well -- and if you manage to go a whole upgrade cycle without needing the spare, then you've saved yourself the purchase of an entire drive.
At the moment, because your drive is possibly failing, then yes you need to get that spare. But if the current drive is actually failing, it won't be a spare so much as a replacement, and then you're back to the same situation.
I was in a very similar situation about 3 months ago, but with 80+ gigs of data. I had pictures that don't compress well and personal documents of all sorts that I needed to save. I used to just back them up to an external drive. Then I would hear stories about something happening to people's computers and to the external drive that was left connected. I never wanted to go to the "cloud" and lose my data; besides, I had a very slow DSL connection, so uploading would be a pain. I researched the cloud solutions, went round and round, then held my nose and purchased a subscription to Mozy. Yes, it took about 4 weeks to get my data uploaded. Of course, I was using the internet connection for other things at the time as well. I figured, what is 4 weeks when I have gone for so long with very few other options? Now that everything is done, I have Mozy running in the background, and every few hours it backs up. I love it. I feel more comfortable about my data being safe, and I don't have to worry as much about it. I still use my external drives for things like an extra copy of my Quicken data, but that may change as I change my habits, since Quicken data is also backed up by Mozy. I worry about what could happen to a computer or a drive: RAID corruption, fire, flood, tornado, etc. I am paying for it, and I know some want something free or cheaper; if I wanted to spend less, I guess I could use the free version of Mozy and just do a few docs and no pictures. However, family pictures are just as important as some of my personal documents. I know some may prefer one of the other backup services, but right now I can say I am impressed with Mozy, even on my slow connection.
...is backup to a hard drive and then unplug it and put if on the shelf for a long time. it will hold the data for a while but it will not hold it. A RAID 5 is best, or a boat load of Blu-Rays.
SDLT tape drive and some tapes. If your "personal data" is not worth the $800.00 to buy a good used SDLT drive and a few tapes, then it's not worth backing up.
Just do not dink around with theoretical "backup solutions" that are not proven. and no, hard drives are not "proven" for reliable and long term backup. I have DLT tapes from 16 years ago that I know for a fact I can still read.
If your data is important, you don't screw around with consumer hard drives that are known to have a low MTBF.
Do not look at laser with remaining good eye.
I, too, have a lot of non-porn, non-illegal stuff. At first I just used an old DDS4 drive to back-up the critical data (documents, camera pictures, etc.). I know you say tape isn't economical, but it really can be. Sure LTO5 is still expensive, but you can find LTO2 & LTO3 drives (200GB / 400GB respectively) for the price of a hard drive - even some auto-loaders aren't that expensive. The controller isn't much more than a night at the pub.
Tape really is the way-to-go for must-have backups. I have tapes that are 15 years old and I can still read them. I can't say the same for CDR's or even hard drives from the same period.
It may not be the fastest solution if you back up your whole array all the time. But this way, you can back up the whole array if you want, and at the very least you can back up a lot of your critical data and have a bit of peace of mind.
My personal and family data (not including ripped DVDs etc) are about 1 TB. Mostly photographs and video with my DSLR so the files tend to get large...but I also have a ton of documents, app installs, and all sorts of misc data. I must admit I'd be curious as to what fills multiple external HDs for personal data but to each their own.
Good organization outweighs medium in my case. 2x external 2 TB HDs, primary and secondary...and then a third stored off site at my parents' that I update about 3 times a year, so if the worst happens I'm 6 months out of date, but it's usually about 4. And that's if both my primary and secondary go down. That's a cost of about $300 total and a little bit of effort.
"A little bit of effort" is defined by how you organize. Backing up manually means I don't rely on software or a service, but it requires some forethought. I break it up by data type and usually year...sometimes I go one level further by how that data was acquired (for photos I add who took the picture). This is important because I put anything new into a different folder, so I know what's new and what's not. It took me a couple of years to get to the structure I have, and I still sometimes add small tweaks. The effort or time now is fairly minuscule.
What I'm trying to get at is this : if you're prepared to put a small amount of time in every now and again, with an initial overhead, you can do this very easily and cheaply.
RAIDz2 on two machines with completely different hardware and in two geographically distinct locations, or at least on different circuits in your house. That is just to store your data and drive images. Still keep local copies on your everyday machines.
Especially if your data is worth $$$, perhaps if you do consulting, then you should be able to justify the cost, ~$2000 per machine with 5 drives each and server grade hardware with all the error checking and failover hardware you can get your hands on.
I have some really lame internet stalkers and of course the normal issues everyone else deals with. The way I deal with security is to use an anonymous laptop, a USB key with Lightweight Portable Security (http://www.spi.dod.mil/lipose.htm), and an IronKey USB key. All my important personal data (real banking and portfolio data, etc.) is on the IronKey. I only use the IronKey when I boot my anonymous laptop from the LPS USB key. I am only online 15 minutes at a time, so if some loser did detect me, the chances of them hacking my read-only Linux key in that time are remote. Even if they did, the next time I boot, I boot a clean read-only system. Pretty simple really. (Oh, and I leech off a neighbor's WLAN when I use that box :-). You are limited to the size of an IronKey, but the system is pretty secure. I used to use just an anonymous laptop booting from a Linux disk, but I like this better.
Example: I move my archivable personal data off my HD as gzipped tarballs, and regularly back up my home and root directories separately as gzipped tarballs (that way I have a history of backups too, just in case). On the other hand, I rsync my music collection separately because, despite its size, I use all of it regularly, but I don't want it weighing down every single backup of my home. So you may have to come up with your own solutions for your own specific situation, but in general: consolidate and optimize!
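For anyone who wants to try the "history of gzipped tarballs" idea, here is a minimal sketch in Python. The function name, the label, and the file naming scheme are made up for illustration; the poster did not say what tool or naming they use.

```python
import tarfile
import time
from pathlib import Path


def make_backup(source_dir, dest_dir, label):
    """Write source_dir into a new, timestamped .tar.gz under dest_dir.

    Each run creates a separate archive, so older backups are kept as history.
    The "<label>-<timestamp>.tar.gz" naming scheme is an assumption.
    """
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{label}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the source folder
        tar.add(str(source), arcname=source.name)
    return archive
```

Calling something like make_backup("/home/me", "/mnt/backup", "home") from cron would give one archive per run. Keep the destination outside the source directory, or the archive will try to include itself.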
stripe the drives, export the whole pile as a share and copy everything to the server.
Christ, this is 2012, storage is a solved problem.
I have a local Microsoft Homeserver for onsite backups. I also have an external drive encrypted with TrueCrypt (open source) where, every 90 days, I spin off my loved ones (files) just in case the house burns down. I take this external drive to work and lock it in my file cabinet. It is encrypted so that anyone who steals the drive still can't easily get at the data. I have too much data (like most people) between pictures, music and other personal data to play with uploading to the internet or writing to any optical disk. Obviously my risk is that both drive copies die at the same time.
I just bought two My Books from Western Digital, 200 USD each for the 3TB version.
One is in my home, the other in my father's home.
I run rsync on them so they keep replicating by themselves.
Cheap and easy.
Megaupload
Whether YOUR data is of concern to the CopyrightStaatsPolezei doesn't matter. The moment Hollywood sees some site as infringing, their allies in "our" government and elsewhere will bring it down. You may note that the many innocent users of Megaupload, including, if some reports are correct, some government agencies, STILL don't have their data back and may never get it back.
What you don't own, you don't control.
It's around $1500. Not too bad for 3TB of tape storage (1.5TB native). Tapes are like 100 bux.
http://www.newegg.com/Product/Product.aspx?Item=N82E16840121080
Considering you will spend that much building a NAS with a bunch of storage, just get the tape drive and be done with it.
is the nerdy way to go, I did and it kicks ass specially when the ex-gf deleted my stuff
When I started my own business, I got myself a safety deposit box. Every time I go to the bank to cash my cheques, I use an external 3.5" HD mount, back up critical files using a script, take the hard drive into the bank and remove last month's copy. It has the added benefit of being fire and earthquake proof (two things I wouldn't have if I stored them in my home).
I only have about 200GB of data I consider "critical" though, but the same thing would scale up. At ~$40/year for a small safety deposit box it's one of the cheapest and most robust mechanisms for keeping your critical data safe.
You can get a not-so-old Sony AIT3 for about $200 off of ebay. Getting cheap since they have recently been discontinued.
100GB tapes (uncompressed capacity) will run you about $20 from ebay as well and they seem plentiful.
So that's 10 tapes for a terabyte. Really not that bad and tape lasts forever.
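The arithmetic in this post generalizes easily. A rough sketch, using only the prices quoted above ($200 drive, $20 per 100GB tape); the function names are made up, and it ignores compression, spare tapes, and the controller:

```python
import math


def tapes_needed(data_gb, tape_gb):
    """Number of tapes for one full copy, rounding up to whole tapes."""
    return math.ceil(data_gb / tape_gb)


def total_cost(data_gb, drive_cost, tape_gb, tape_cost):
    """Drive plus enough media for one full copy of the data."""
    return drive_cost + tapes_needed(data_gb, tape_gb) * tape_cost


# The AIT3 numbers from the post: 1 TB on 100 GB tapes at $20 each, $200 drive.
print(tapes_needed(1000, 100))          # 10 tapes
print(total_cost(1000, 200, 100, 20))   # 400 dollars
```

Plugging in the submitter's 1.6 TB (1600 GB) gives 16 tapes, or $520 with the drive, for a single copy.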
I have a second internal drive that is automatically synced with the main hard drive that contains all the important data. (No RAID, just plain old backup.) Moreover, all important data is backed up from one machine to another and to an online server.
For both of them I'm using Crashplan with a long-term contract. It's quite affordable and the software works really well for both online and local backups. I don't understand why the "cloud" is not a solution, apart from the obvious fact that you should avoid companies that use the word "cloud" too often. I've been using Jungledisk and Crashplan without complaint so far. Then again, I've never been struck by disaster, so I don't know how easy it is to get the data back.
If you're storing a lot of illegal content like pirated movies then you'd probably be better off choosing a European backup provider, but in any case you need to have encrypted online backup if you really care about the integrity of your data. No local backup can save you from a flood or a fire.
If the data isn't worth a few $120 2TB hard drives then it's not worth keeping, is it? 3TB hard drives are around $200-$225 also.
You ruled out: hard drives, tape, discs and cloud storage. What exactly do you expect us to say here? There isn't some other magical form of storing data we've been hiding from you.
Of my four internal drives, only one has non-recoverable data on it (business and personal). The others are things like games and applications which can always be reinstalled. So a second drive is the same size as the first, and a nightly backup combination of windows backup and a few scripts copies that data from the first to the second drive. I like having recent backups fully accessible.
On the long-term side, every few weeks, or months, I copy from the second drive to a drive in the closet -- which really should be off-site, but I don't really worry about those types of things in my life.
So between the two, I can really only lose a single day of work. For anything particularly impressive, I just drag the file to the second drive when I'm done that work, to make me feel better.
But that's it. At this point, the closet has about ten drives covering the last ten years. The second drive has a good two years on it, fully compressed of course. And it takes me about two years to fill a backup drive. Of course, my business files are mostly text. If I were playing with more images than I do, I'd go through drives about ten times as quickly, which would mean one backup drive per quarter. Which is still only $300 per year these days, so that's fine.
First, let's look at your problem: You are gathering too much data. Either the data is 100% needed and irreplaceable, or it isn't. If it isn't, your first step is to treat your data just like you would physical junk that accumulates in your house.
Create Three folders.
1. Critical Keep
2. Unsure
3. Toss
Go through your data and MOVE it to one of those three folders. If it is critically important data that you would be upset to lose and that can't be recreated (wedding videos, etc.), it goes in the Critical Keep folder. If you aren't sure about it right now, but you can't declare it for folder 1, put it in 2. Anything else (old install files, backup data from a Windows 98 machine, etc.) can be deleted. Be harsh with yourself. Think of it like moving from house to house: if you haven't opened that box by your third move, just toss it in folder 3.
Repeat the process until you either have everything in your Critical Keep folder, or your delete folder.
Now, hopefully you have reduced the size of the data you are using to something marginally manageable. I'm a data hoarder, and I've managed to keep the rate of growth of my data to lag behind the general rate of growth of HDD capacity. Now for the fun stuff:
Two things you want to avoid.
1. Loss due to a dying disk
2. Loss due to a destroyed home (fire, theft, etc)
Here was my budget solution that resulted in a fire and forget backup system that is suitable for a home user and is about as minimal as you can get for cost.
3 Disk Drives.
A primary drive to run the operating system and hold installed programs and two LARGE data drives in a RAID1 configuration.
Static data files (Video, pictures, etc) get stored on the RAID1
A scheduled process (once per month for me) backs up the OS drive to a virtual HD file on the RAID1. The files on the RAID1 are then backed up to a cloud storage service (Carbonite in my case).
So, what is the result of this?
My operating environment is backed up monthly. The only thing I lose here is configuration changes or programs installed since the last backup (less than 30 days for me)
The RAID1 ensures that my personal/static data is protected from a single disk failure, and helps a bit with read performance for the static (and large) files.
Should a cataclysmic failure occur and my entire computer is lost to something like a fire, remember that I've been sending what is on the RAID1 out to the cloud (Carbonite), so when I can rebuild a computer I can just download the (very large) offsite backup from the cloud to my new machine.
The downsides I have right now:
1. I maintain the Windows backup as a VHD file because it allows me to ensure that the backup data is 'packaged'. I don't know the exact details about Windows Backup, but given that Carbonite sometimes excludes system files I didn't want to risk an important hidden/system file being missed in the backup. In addition I didn't like how it could only back up to the root folder of a drive. The downside is that the resulting 100GB file is a pain to back up, which is why I restrict the backup interval to 30 days (previously I had it back up every 3 days). This keeps it from continually uploading the VHD file to Carbonite.
2. The HDDs for the RAID1 lose half their total capacity in that configuration. I used it because it let me get by with only 2 drives and for the read performance boost. If you can afford 3 drives, go for a RAID5.
3. Most Motherboards support RAID natively now. However, I understand that you can run into issues with hardware RAID if you have to switch to a different hardware solution. I haven't tested this, but it could potentially be an issue if you use a RAID5 from hardware and your motherboard fails and you can't replace it with an exact model. The good news here though, is if you have been backing up to the cloud, typically it's done on a per file basis, and thus you don't have to worry about this. Just download your stuff ba
split(1)
And a lot of Gmail accounts. About 300, I think.
unRAID (cheap, expandable, protected data) coupled with inexpensive offsite disk backup (eg your parents house, providing you don't live there already!).
A USB 3 dock will speed cloning time dramatically. Get 1 or 2 huge drives to clone to. Finis.
Splitting the important, volatile stuff (source code, e-mail) apart from the nice-to-have stuff that doesn't change much (scans, downloads, PDFs, etc) makes backup easier.
Important stuff gets backed up regularly. I have 2 Linux boxes, and rsync important stuff to the other one overnight. I put important stuff on a 64GB USB keychain every day, and a second USB keychain a couple of times a week as a redundant backup. If something bad happens, I can be back up and running quickly.
Nice-to-have stuff is on a 1TB hard drive. I rsync it with another 1TB hard drive in an external case once a month. I also back up stuff I download when I download it to optical media. If this goes poof, it's not the end of the world since I don't change this very often, and the stuff that does change like software downloads I can re-download and would probably have to for the latest version.
I haven't gone beyond 1TB of nice-to-have stuff yet. I don't expect this system to change much when I do, since nice-to-have data doesn't change often for me.
I recommend FreeNAS with a RAID5 setup. FreeNAS can run well on an old computer (great if you have an old comp sitting in the closet). For the RAID5 you can use 3 or more hard drives of the same size that you have lying around. Good RAID5 setup tutorial here: http://www.youtube.com/watch?v=x8-DrhYKTFE
I would suggest purchasing new hard drives though if your data is important.
I do not have as much data as you do (yet), but I have a system that works reasonably well for my needs. Maybe you can adapt it for yours (which seems like merely a matter of larger disk capacity).
At home I have a laptop (probably used most frequently) which lives most of the time in the family room, I have a desktop PC in my spare-bedroom-office, a third machine in the basement-home-school room, and there is also a daughter who occasionally visits with yet another laptop. All those machines have a DOS batch script that runs rsync via ssh on top of cygwin to- and from- a cheap PC "server" outfitted with Fedora Linux.
Whenever a work session is initiated on any of the non-server hosts, the first thing done is rsync from the server. The last thing prior to terminating a work session (as well as possibly periodically during the work session) is rsync to the server. This makes for "loose coupling" between the client hosts ... that is to say, the most current changes are, naturally, on the machine currently being worked and are duplicated to the server at the end of the session. Eventually all the hosts get, via rsync from the server, all the data that may have been updated on other hosts.
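The pull-at-start, push-at-end routine above could be scripted roughly like this. The poster uses a DOS batch script under cygwin; this Python version is only a sketch of the same idea, and the host name and paths in the usage note are hypothetical. The only rsync options used are -a (archive), -z (compress), -e ssh (transport) and the optional --delete.

```python
import subprocess


def rsync_command(src, dst, delete=False):
    """Build an rsync-over-ssh command line as a list of arguments."""
    cmd = ["rsync", "-az", "-e", "ssh"]
    if delete:
        # Also remove files on the destination that no longer exist on the source.
        cmd.append("--delete")
    cmd += [src, dst]
    return cmd


def start_session(server_path, local_path, run=subprocess.run):
    """First thing in a work session: pull the latest data from the server."""
    run(rsync_command(server_path, local_path), check=True)


def end_session(local_path, server_path, run=subprocess.run):
    """Last thing in a work session: push local changes back to the server."""
    run(rsync_command(local_path, server_path), check=True)
```

Usage would look like start_session("server:/srv/data/", "/home/me/data/") before working and end_session("/home/me/data/", "server:/srv/data/") afterwards. The trailing slashes matter to rsync: they mean "the contents of this directory".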
The server also runs rsnapshot with hourly, daily, weekly, monthly, quarterly, and annual snapshot cycles. The rsnapshot repository is on an external, USB-attached hard drive.
This "eventual consistency" means I have (except for the small risk of losing to hard-disk failure the most-recent, not-yet-rsync-ed work) at least two, and up to five local copies of all my data between the various hosts and the external drive -- plus historical snapshots.
I have yet to take the additional step of employing multiple USB drives so as to enable locking a copy away in a safe deposit box, but that would add only a little, not insurmountable, complexity. However, since I maintain a hosting account anyway to support a couple of domain names, I use OpenSSL to encipher all my files and rsync those to the hosting account ... just in case.
Introducing a new host or replacing an outgrown hard disk with more capacity means a lengthy initial rsync copy session, but in ongoing regular use, the efficiency of rsync really shines, even for the home-school and visiting-daughter hosts that get pretty far out of sync with the server repository, as well as for going across the WAN connection to the hosting account.
And as bonus, with my NAT-ted router, I can maintain this loose coupling from anywhere in the world as well when I travel with the laptop.
Delete your porn
The rest of your personal data will fit on a floppy.
You could try PaperBak - http://ollydbg.de/Paperbak/
Of course, 1.6TB is going to take over 1000 reams of paper (half that if you have a duplexer), so you might just want to suck it up and buy another hard drive.
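For the curious, the ream count can be estimated as below. The 500,000 bytes per page default is an assumption (it is the best-case figure I recall PaperBack advertising for a single A4/Letter sheet, not something stated in this thread), and 500 sheets per ream is the usual definition.

```python
import math


def reams_needed(data_bytes, bytes_per_page=500_000, sheets_per_ream=500, duplex=False):
    """Estimate reams of paper needed to print data_bytes.

    bytes_per_page is an assumed best-case density; real output depends on
    printer resolution and the redundancy settings chosen.
    """
    pages = math.ceil(data_bytes / bytes_per_page)
    sheets = math.ceil(pages / 2) if duplex else pages
    return math.ceil(sheets / sheets_per_ream)


print(reams_needed(1_600_000_000_000))               # 6400
print(reams_needed(1_600_000_000_000, duplex=True))  # 3200
```

So "over 1000 reams" holds comfortably under these assumptions, and the conclusion about buying another hard drive stands.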
We're up to 2 NAS units now, with 7TB[*] of disk space between them, all backed up on schedule. The USB backup drives are rotated every few weeks with another set kept in a secure place in the garage.
[*] One NAS unit doubles up as media server, so it's got a load of movies & music in addition to user files in its 6TB. The other one is our web server and email server with only 1TB of disk space.
There is important data, and then there is IMPORTANT data.
Movies, games, music, documents are important data; that is to say I care about having it easily accessible, and at my fingertips, but I wouldn't lose more than a night's rest were it to disappear. Code is IMPORTANT data; that is to say that if it were to disappear, heads would roll.
As such, and owing to the evil overlord rule of not having one of anything important, the important stuff is in plain sight, and exists in several variant copies; it only requires a mind of uncommon origin to understand where it is, and what it's worth. Finally, I keep the general ideas behind the important stuff memorized, in case those copies disappear. To this end, I apply the general rule behind the fictional philosopher's stone-> anyone who is meant to use my stuff would understand it, and anyone who is not will not; thus, it's only the intelligence of the seeker than limits them. It has been successful in keeping the bungling burglar, the thuggish criminal, and the demanding tyrant at bay.
This all comes down to another evil overlord rule: the IMPORTANT stuff will not be clearly labeled as such.
I am John Hurt.
The term "Fireproof Safe" is used as defined by Underwriters Laboratories. It merely refers to being fire resistive for a given number of hours in a typical fire "for paper".
Forget DVDs and CDs and any hard drives surviving a fire in one of these "fireproof" devices. They are designed to release steam to keep the temperature at 212 deg F until the fireproofing material exhausts all its water, at which time the temperature goes up and, well...you can imagine what.
My solution to this problem is painfully simple: about 5 years ago I bought 5 drives 500GB each. I have put a server (made from old parts, like pentium IV and so on) in the basement (where nobody hears it, and it can be as noisy as hell). I installed debian on it and configured cron to call rsnapshot three times per day for doing automatic backups of all PCs in my family. I never touched this machine since then.
With one exception: 3 years ago I started to run out of space, so I bought 2 HDDs, 2 TB each, and reconfigured the raid6, which was extremely easy because for raid I am using mdadm, which supports such operations online. I also picked up a few more spare drives over the years and kept adding them to the array, so currently there are 9 HDDs in this PC. It is very noisy, but nobody cares about that.
It runs flawlessly, untouched for years, and nobody cares about it, except when somebody in my family accidentally loses or deletes a file. Then suddenly the backup comes in very handy.
Rsnapshot is especially good, because it keeps hardlinked copies of data from last week, 2 weeks ago, last month, and much more, depending on how you configure /etc/rsnapshot.conf. Currently I have backups dating back about 2 years, with granularity of 1 month. And it only occupies the space on HDD to reflect the changes between data, thanks to hardlinks.
So my raid6 array has total size about 4TB and still 500GB free. And I feel this will last at least a year or two. In case of problems I can start deleting copies that are more than 1 year old. While most recent snapshot uses about 2 TB or such.
Rsnapshot also can backup windows machines, so you don't need to worry about compatibility. Though I don't have windows machines and I don't test that in practice ;)
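The hardlink trick mentioned above is what makes many snapshots cheap: a file that has not changed since the previous snapshot is hard-linked instead of copied, so it takes no extra space. The following is a minimal sketch of that idea in Python; it is not how rsnapshot itself is implemented, and the function name and arguments are made up. It assumes all snapshots live on one filesystem that supports hard links.

```python
import filecmp
import os
import shutil


def snapshot(source, previous, new):
    """Copy `source` into `new`, hard-linking files unchanged since `previous`.

    `previous` is the path of the last snapshot, or None for the first run.
    """
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(new, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.normpath(os.path.join(new, rel, name))
            old = os.path.normpath(os.path.join(previous, rel, name)) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the existing data on disk
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy
```

Comparing full file contents (shallow=False) is the slow but safe choice; real tools usually compare size and modification time instead. Deleting an old snapshot directory only frees the space of files that no other snapshot still links to.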
Of course, it's with tons of other stuff that I have, but I wound up going with a dedicated computer (full size case, AMD processor/board, gigabit network) and 5 2TB drives in a RAID5 for just under 8TB of protected storage on my local network. More than enough space for any and all backups, and single disk fault tolerant. Software is FreeNAS and it works quite well and while 2TB drive prices are still a little high the entire setup was under $800 (I got the drives at $90 each right before the floods in Thailand). I've had the system running for around a year now, and have had 1 drive failure which was covered by WD (free replacement drive so I have one cold spare sitting handy).
Because it's a consumer oriented case and board, noise isn't bad, even with a bunch-o-fans keeping the drives cool.
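Since several posts in this thread quote RAID sizes, here is a small sketch of the usual raw-capacity rules for equal-sized drives. The function name is made up, and the result is before filesystem overhead, which is why 5 x 2TB in RAID5 ends up "just under 8TB" in practice.

```python
def usable_tb(level, drives, size_tb):
    """Raw usable capacity in TB for `drives` equal-sized disks of `size_tb` each."""
    if level == "raid0":
        return drives * size_tb            # striping, no redundancy
    if level == "raid1":
        if drives < 2:
            raise ValueError("RAID1 needs at least 2 drives")
        return size_tb                     # every drive holds a full copy
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID5 needs at least 3 drives")
        return (drives - 1) * size_tb      # one drive's worth of parity
    if level == "raid6":
        if drives < 4:
            raise ValueError("RAID6 needs at least 4 drives")
        return (drives - 2) * size_tb      # two drives' worth of parity
    raise ValueError("unknown RAID level: " + level)


print(usable_tb("raid5", 5, 2))  # 8
print(usable_tb("raid1", 2, 2))  # 2
```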
best way to slim down = calculate folder size then sort by size. go from biggest to smallest. osx has it built in and if you have win7 FS-Inspect does the job.
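If you would rather script it than use a GUI tool, the same "sort folders by size" step can be sketched in a few lines of Python. The function name is made up; it sums the files under each immediate subfolder of a starting directory and lists the biggest first.

```python
import os


def folder_sizes(top):
    """Return (size_in_bytes, path) for each immediate subfolder of top, largest first."""
    results = []
    for entry in os.scandir(top):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        for root, _dirs, files in os.walk(entry.path):
            for name in files:
                path = os.path.join(root, name)
                try:
                    if not os.path.islink(path):
                        total += os.path.getsize(path)
                except OSError:
                    pass  # file vanished or is unreadable; skip it
        results.append((total, entry.path))
    return sorted(results, reverse=True)
```

Run it on the top of a data drive, look at the first few entries, then repeat inside the biggest folder.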
I organize all my data into three categories, (a) stuff I need backup including old versions, (b) stuff I want backup of latest version, and (c) stuff I don't mind losing.
Two external backup drives, one is kept in another location, the other at home. Every once in a while I switch the two, so there is always an offsite copy that is at most a few months old.
Simple and stupid setup for making copies; each external drive is formatted with an encrypted filesystem. Made some scripts for copying with "rsync" for files which go into categories (a) and (b). Using some additional scripts for recursive "copying" of old versions with hard linking (on an ext filesystem) to preserve history for files in category (a).
Keeping an off-site backup is important to protect against some risks such as fire, and it also helps prevent losing data in case some malicious software or accidental "rm -rf" or the like would wipe your backup. Keeping old versions of category (a) data helps protect against a scenario where master data is lost, and an rsync operation removes the data on the backup.
If you want to be really paranoid, you could make the backup from a remote computer so that if the master you are backing up is compromised, it would not be able to wipe the backup (at least not "type a" data). Personally I found it was not worth the hassle, given the redundancy added by the offsite backup.
If this is data which is important to you, keep in mind a RAID really gives you very little protection. It only offers some protection against hardware failure, but any virus or accidental wiping will kill your data just as well as it had been a single disk. If the data is important, you need at least one external backup.
You really, really want offsite backup in addition to whatever you do at home/office. If "the cloud" works for you, fine, but for many the bandwidth issues are a big problem
One option is to have, in addition to whatever drives you run live, enough extra drives to back up your data twice. Keep one set at home, and another in a safe deposit box. Depending on your risk tolerance, you can use packaged drives, or carefully swap bare drives in eSata enclosures or the like. Backup as often as you can on the set at home. Then every few weeks/months whatever meets your needs, do a swap; put the up-to-date backups in the safe deposit box; take the other set home and use it for new fresh backups. With that as a base, you can usually use cloud-based solutions to make sure there are daily or immediate offsite backups of truly critical data that's changing often.
Trust me: disasters do happen. You can lose a lot if you haven't prepared.
Buy more hard drives. Seriously, just ask yourself how much pain it would be if you were to lose the data. It's easily worth a couple of hundred dollars/euros.
As for what strategies, incremental vs 1:1 backups etc. I have recently written two blog entries for a filmschool where I regularly teach where they have the same problem - huge amounts of data, little cash, loose organisation. Have a read if you like (the newest blog post covers incremental backups with hardlinks under windows and osx which has some nice properties but is not always the best solution) http://www.danielbachler.de/node/121
I have 2 10TB RAID 6 servers running Debian that sync every night. There is also an emergency backup on Windows that is just 4 2.5 TB drives chained together.
only one everything
news for you, pal, "the game' has nothing whatsoever to do with electronic technology. It's been going on for ages. Just as a for-instance, the Roman Catholic church together with monarchs would mindfuck their subjects with B.S. about religion and patriotism, to take the worker's wealth and have power and dominion. Old, old scam, same old shit different century.
Repeat:
Buy a new 2TB drive, copy your data from your smallest drives to that one, then sell them at eBay.
until you have all your data on new 2TB drives
At the end you should have money for an extra drive.
If you like (and use Linux) you can start a RAID5 with one drive (+1 missing), copy your data onto it, then add another drive, do a reshape, migrate your data...
And at the end add the last drive to regain redundancy..
I would suggest cleaning up unneeded data. I know it is not the kind of answer you are looking for, but I am sure there are things that you do not really need, maybe duplicates or older versions of sets of files. By doing a thorough clean-up, you can probably gain many GB, and you might then have enough space to shuffle your files around so you can reformat your flaky drive. You then have time to think about a more durable solution for when this happens again.
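Finding exact duplicates is one part of such a clean-up that is easy to automate. A minimal sketch in Python (the function name is made up): group files by size first, then hash only the files that share a size, and report groups with identical SHA-256 digests.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(top):
    """Return a list of groups (lists of paths) of byte-identical files under top."""
    by_size = defaultdict(list)
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            digest = hashlib.sha256()
            with open(path, "rb") as fh:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            by_hash[digest.hexdigest()].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

It only reports; deciding which copy to delete is left to you, which is probably wise on a drive that is already acting up.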
"Storage on S3 is ridiculously inexpensive any more.
I have about 6 TB of data that I need to keep backed up."
So you mean that 6000 GB * $0.125 per GB per month = $750/month is cheap?
Or did I miss something?
Basically here is what I do:
Incremental Ghost image of the OS to a separate drive 3 times a week, with a full image on Sunday. On the alternate days it does a file-based backup of required data, covering everything that has changed, to the same backup hard drive. The file-based backup is never deleted, but the Ghost image gets overwritten weekly.
On Monday I bring home a portable hard drive (using TrueCrypt) from work and hook it up. Overnight it does a mirror of the backup drive, which gets brought back to work on Tuesday.
I am only an individual so I really do not have to worry about having a data farm or anything else like that. I have had decent success with cloud storage. I went with SpiderOak and that hasn't been bad. It is fairly cheap (about $1 per 10 GB), has clients that will run on Mac, Linux, Windows, Android, and iOS, and your data can be accessed anywhere with an internet connection by logging into your account on their site. They claim your data is encrypted before it leaves your PC for their servers and remains encrypted. SpiderOak also claims that they do not have access to your encryption keys, so they can never look at your data. What I use is not the best service in the world; it is comparable to Dropbox but doesn't have all the features Dropbox does. Still, SpiderOak gets the job done. I just needed somewhere to keep a backup of my important stuff, so that if I need to reformat/reinstall for whatever reason I can get my important data back. However, I stumbled upon the Cyphertite project the other day and I am considering giving that a try.
The question really is how much do you value your data? A little? A lot? My solution is a dual solution (albeit still waiting for the 2nd part to arrive). Online I have a subscription to CrashPlan (although there are various other services available which will do a similar job). You can get the software for free, and it will back up your computer (or selected folders) over the internet to another computer with the software installed (e.g. your parents', if there is enough free space). If you pay a subscription you can back up your files, encrypted, to CrashPlan's servers (and I think you can even put in your own encryption key), although the initial upload can take a few days. You can even get family packs for multiple computers.
The 2nd part for which I am waiting is a networked attached storage - I am getting a Synology product, although again there are other companies making these. The model I am getting will have 2 spare bays for hard disks of your choosing, and then you can run a backup on your computer to these which will keep the discs up to date. You can also use this as a file server, as well as a media server, bittorrent client etc. (see the synology website if you are really interested). You can stuff a couple of 2TB drives in there and even implement some sort of RAID.
So you can then have an onsite and an offsite backup with a NAS and crashplan. The 3rd part of the solution probably is to trim down what you store as I can vouch I have a lot of crap that really doesn't need to be saved. Then do regular backups of the really important bits (for me this is not my itunes folder) to DVD-R.
Overall it comes down to how much is your data worth and how much are you willing to spend?
This is beginning to worry me as well. I have several machines on a network and a 500GB external drive. A lot of my files are my own work including a lot of video which I would really not like to lose. So far I have made 1 copy of everything on DVD, but I am not happy with that, especially as I now have found that DVD-R seems to have a shelf life, as discs that I could read now seem to have errors, although this could be down to drive issues.
I would love to get off site storage using an on-line system, but it would swamp my broadband just to get the current files saved, and I think BT would look askance at me maxing out my upload bandwidth for hours on end to cope with the backlog.
I recently put a 200GB drive out of a broken laptop into a USB case and I am putting some files on that, but given its history I cannot really trust it. I do wonder if some of the old tricks of using a VHS tape might be worth looking at again, at least they are cheap and I think that if proper storage conditions are available they would have decent life.
I once thought that files that came from the net didn't need to be backed up, but I am now finding that some files are no longer there, dead web sites or "new" versions which don't actually have the same content. So this is going to make things worse as every file I really like or use is going to have to be held locally.
For me this is just a matter of entertainment or hobby interest, so I cannot justify (or afford) a high cost tech fix for the problem, but perhaps we have found a niche market for a low cost / true archive quality storage device for home use.
2 > profit !
so all we need is step 1 > invent device.
nec sorte nec fato
I'm a server/SAN admin and worked with another engineer to come up with this for both my house and his. We tried a few different SAS controllers before settling on this one for price, performance, and actually working.
Here's the set-up:
- A Windows 7 PC (I have an older Dell precision with two 320 GB drives in a RAID 1)
- An LSI MegaRAID SAS 9280-8e Adapter
- A Sans Digital TowerRAID TR8X (has 8 hard drive bays and two 4-channel SATA controllers)
- 4x 3TB SATA drives in two RAID 1's (could change these later).
And now, the reason for Windows 7.....
BackBlaze unlimited backup for the whole system. Create your own password (encryption key) and your data is encrypted at rest on their end and they don't have access to it. It's how I do it. I have about 3 TB of family movies and pictures that my wife would kill me if I lost.
I also took all the pictures (about 300 GB) and sent them over to my in-laws on a separate drive.
The Windows box is now my main file server and has a few TB of data on it. It works great. The only thing to be aware of is that I didn't get "enterprise" hard drives so once in the year I've been using them, one of the arrays degraded and rebuilt itself. I forget the exact details, but it was expected in this design. I get great speeds, can scale up quite a bit and everything is backed up.
I have a 6x2TB array and a 4x320gb array in my server, the former with HD204UI drives on an Intel SASUC8I controller, and the latter with various 320GB notebook drives I've collected, in a 6-bay 5.25" rack on the motherboard's controller. There's also a 120GB SSD in that rack to run ESXi and most of the guest OSes.
We keep all of our crap on the 4x320 array, which is backed up to the other array.
For data to last it has to live. Data lives by continually replicating itself.
For quite a few years now hard drives have been the only economical and practical backup solution for mere mortals (enterprise is moving there too). The best way to work around drive failures is ... more replication.
Right now my setup involves continual backup to a RAID 1 drive with Time Machine (use whatever works on Windows/Linux), which is then rsynced to a remote server I set up at somebody else's house as burglar backup (after an initial local rsync on the source machine and sneakernetting the hard drive over). This gives my data 4 drives to live in and I'm not sure I'm happy with that.
And now there's 200 comments where the people are proud of their kilowatt server arrays which are powered 24 hours a day for their PHOTO ALBUMS? Are you people shitting me? I mean, you're putting me on, right? You don't really use up 10,000 kWh per year storing your family photos, do you?
Hey, I've just invented the electric elevator-button-pusher. I save a TON of finger wear and tear.
Sometimes, humanity makes me sad.
RAID sucks as a backup because if you accidentally delete a file off your RAID storage, it gets deleted from all the drives in the RAID. Your file is not safe as it would be on a backup.
RAID is for redundancy. So you don't have any downtime if a HDD fails. Without RAID, a HDD failure would mean downtime until you can get a new drive and restore from a backup. With RAID, your array and your business keeps chugging along as if there were no failure, and you can replace your failed HDD at your leisure.
Rebuilding a RAID array with a failed drive has been simple and automatic in my experience. Pop out the dead drive, plug in the new one, and it'll start rebuilding automatically. Your data is still accessible during the rebuild, although access times and transfer speeds may be degraded. Depending on the amount of data, a rebuild can take anywhere from a few hours to a few days. A second failure while rebuilding means all your data is gone. So you want to keep backups of everything on your RAID array.
If you just want to glom a bunch of old drives together to use as a backup drive, you want a multi-bay JBOD/RAID enclosure like this or this. Be forewarned that if you plug these in over eSATA, you need an eSATA port with port multiplication. No laptop eSATA port I've found does, so you'll need to rely on USB or built-in hardware RAID/JBOD to use these with a laptop.
If you want something which will sit on your network acting as a file server, you want a NAS like this or this or this. You can read NAS comparisons at Small Net Builder. But keep in mind what I said above - even if you get a NAS, you will still need to make backups of it.
A 2TB drive is something like $120.
Is protecting your data worth more to you than $120?
If yes, buy another drive. If not, erase a bunch of junk you've accumulated.
I use a Sans Digital 5-bay external SATA RAID array for all the pics from 2 Canon digital cameras. http://www.newegg.com/Product/Product.aspx?Item=N82E16816111172 It's not backup. It's storage. I've had it for over a year, and it's pretty damn solid. It is NOT cheap, and you'll want a reasonably hefty UPS to handle it and the computer it's on. (On the other hand, it is not expensive either, for what you get.) You MUST go to the Sans Digital website and pretty much only buy the drives they recommend. If you do research and get RAID-ready drives, you can call them and discuss your drive of choice with them, and then buy, at your own risk of course. I went this route. Bought the whole thing, external enclosure & SATA card, drives, through Newegg. It does need a slot for the card. It slows your boot down. It is fast. It's RAID 5 w/ a hot spare as configured (I think). For disaster recovery, you will still need something else. Overall, I love the thing. For backup, you might be able to get away with one of those top-loading SATA drive holders. Move stuff off to two cheapy drives and store them someplace offsite. Good luck.
Most people *think* they want bare metal backups. NO. You want data files. I have 450GB of crap on my laptop (2x500GB internals). But less than 20GB of data that is of any use (I have install DVDs and already made a restore DVD when I got the laptop).
So I stopped backing up to a 'bare metal' level, bought a USB 3 1TB drive that uses Genie Timeline for onsite immediate recovery, Carbonite for 'cloud' and 2 64GB USB sticks for monthly long term data backups.
That's about $250 for 3 layers of backup. All my data is backed up with 3x redundancy and I have 2 levels of history (the USB sticks are 30 days; Carbonite will delete a file that has been deleted locally after 60 days).
I can't see a cheaper way to get that level of protection.
scoobydo!
HDDs are kinda cheap.
Have the manufacturers recovered from last year's Thailand flood yet?
Not trying to troll here. I'm serious. Consider that you might be simply storing too much stuff.
What are you asking? The ONLY answer is buy an extra
drive(s) that match the total amount of stuff you don't want
to lose, copy it all over.
Problem solved.
Anything less than this answer is less. More is buying double
the amount of stuff you don't want to lose and keep a rotation
of your backups.
Further, break down each group of stuff you collect each month
or quarterly and burn to a DVD.
Don't forget you have to store these either offsite or in a fire-proof
storage, or best... both. Otherwise, it's all an exercise in futility.
Did you really just ask this question in Slashdot? Not sure what's
worse, plain obvious questions that have been either asked so
many times or is answered with a simple google search or obvious
promotion of advertisers stuff.
I used to be proud (and snarky of course) when I said that I read /.
Now, I don't mention it... it's irrelevant to the newest generation and
the current generation I'm sure has noticed that even as a news
aggregator /. is failing. Nothing NEW is covered here any more. It's
always old by the time it hits the front page, sometimes painfully so.
It's also obvious what stories get picked up and which ones get looked
over. And then, to break up the monotony...
"How should I back up my data that I don't want to lose?"
effin brilliant
-AI
For me, it is far better to grasp the Universe as it really is than to persist in delusion
My setup:
Currently, the setup's been running for 2 years, and I've only ever used the online backup service for testing my backups, and I've never had to hot-swap a drive. I've got almost 2 TB of data in use, and I've finished ripping my DVD collection, so growth is slow now. My TV can play all my video files straight from the QNAP, and our phones can access it to download songs/audiobooks/videos through our local Wi-Fi.
Just because you're paranoid doesn't mean there isn't an invisible demon about to eat your face
A single 3TB external USB3 drive should address your needs nicely.
Just unplug it when not actually using it (to protect against power surges, lightning, etc) so it can be considered a backup[1].
I acknowledge the unplugging/replugging thing can be a pain if you need it frequently. What I'd really like to see is a hard switch of some kind that can physically interrupt both the power and USB lines to a device. It wouldn't need to cut the mains power, just the 12V/5V after the transformer, but would still need to give enough separation to prevent arcing in the event of a lightning strike.
[1] Recall that an online backup is no backup at all.
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
"news for you, pal,"
I love the introduction's tone, chills to the bone! I'm not a "pal" by the way, especially not one of those who defends the stomping towards the NWO.
It's very interesting when spiritual/tech conspiracies are discussed, there's usually always one that jumps on the bandwagon against religion and The Catholic Church.
Newsflash, many Bible believers aren't Catholic and don't agree with criminal organizations.
You've managed to, IMO, divert the topic like many in the Cointelpro racket, now let's see you prove my post wrong without diverting away from the topic again.
I'm sure the anti-God parade will mod you up heavily just as they will defend any attempts to bash religion because they're so sure of there being no god and their belief in science when science hasn't told us yet what accounts for our entire brain's function, this world and everything in it, other worlds, etc.
Are we clear as human beings? Or am I still a "pal?"
The "Game", IMO, as we both know it, though only one admits, has everything to do with technology and spirituality. The "eye" of the "triangle" is on many right now, but given enough time, it will slowly try and destroy Christianity, too, in countries where it's still legal to believe and practice this faith.
I am making my way through the comments, and want to clarify a little. I am not talking about backup. I am asking about disaster recovery or just plain drive maintenance tasks that should be done annually. The drives are my backup. Yes, good corporate data storage practice is to have spare drives around. I am talking about home. How many have 2 TB drives sitting empty on a shelf at home, just in case? I don't know anyone, personally, and I know hundreds of geek admin types of all ages and experience levels, myself included. We usually buy storage upgrades as needed and seldom have current technology, large drives just laying around because we're using them! Other than that, great stuff so far. Thanks all.
Here's my scheme:
- USB 3.0 External drive, as big as I need to back up all my /data/ (not the OS) from all my systems. I usually buy the 'sweet spot' of most GB/$ even if it's more than I need right now. Powered off except when used.
- Once a week I rsync it all over with a single script. Use cwrsync for Windows.
- If it runs out of room, then it's time to buy a bigger one and swap out the smallest drive in one of the current systems with the old backup drive. I wipe the old small drive and put it in a pile, just in case, but they rarely get used ever again.
- A small amount of really critical stuff I save off-site (rsync again) nightly with an automated script.
I haven't lost any significant data in 20 years doing it like this (though it wasn't as convenient or fast 20 years ago), even though I have had systems crater - it's easy to restore from a drive.
Backup to optical is so annoying I end up not doing it, and tape is clumsy and expensive (and for me has a very poor record of actually being able to restore anything) but if it's just a matter of turning on the external drive and running a script I can do it religiously.
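The weekly "rsync it all over with a single script" step in the scheme above might look roughly like the sketch below. This is only an illustration, not the poster's actual script: the paths and function names are made up, and the only real pieces are rsync itself and its -a (archive) and -n (dry run) flags.

```python
import shlex

def build_rsync_command(source, dest, dry_run=False):
    """Return the rsync argument list for copying source into dest."""
    cmd = ["rsync", "-a"]  # -a: archive mode (recursive, keeps times and permissions)
    if dry_run:
        cmd.append("-n")  # -n: list what would be copied without copying anything
    # A trailing slash on the source means "copy the contents", not the directory itself.
    cmd += [source.rstrip("/") + "/", dest]
    return cmd

# Hypothetical layout: one /data tree per machine, one mount point for the USB drive.
JOBS = [
    ("/data", "/mnt/backup/desktop/data"),
]

def plan(jobs, dry_run=False):
    """Build one rsync command per (source, destination) pair."""
    return [build_rsync_command(src, dst, dry_run) for src, dst in jobs]

if __name__ == "__main__":
    for cmd in plan(JOBS, dry_run=True):
        # Printing keeps this harmless; subprocess.run(cmd, check=True) would run it.
        print(shlex.join(cmd))
```

Doing a dry run first is a cheap way to see what a job would touch before trusting it with the only copy of something.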
The bulk of my Data is Photos - although some is music which is much more static.
However my photos grow at an enormous rate because I have both an SLR and an 8 MP camera in my Galaxy Note phone.
The Music - I just keep an offsite backup at my Fathers place updated every 6 months or so... not perfect but it does... and then allows me to listen to my music when I visit.
For everything else - I use 2 TB drives. I load everything onto a 2 TB drive - and then move that onto a second, cheap PC and then sync that with another 2 TB drive
(giving me two copies onsite) and then sync a third one that I then take off site when the drive is full - I keep these about 180 miles away.
As the bulk of the data is photos I also pay for Google Storage and I upload selected pics to Picasa. Anything that wipes out all of that will wipe me out as well I reckon... but also I like the fact that wherever I am I can access the pics and show people (the modern equivalent of a '70s family slide show - muhahahahaha).
There are now 4 TB drives available but for now I'm sticking with the 2 TB drives because
1) I have lots of the power supplies of the 2 TB drives and they are the most unreliable part....
2) 2 TB takes a while to fill up anyway...
3) they are a fair bit cheaper per GB than the 4 TB drives
4) None have failed yet so it's been pretty reliable so far so I'm happy to stick with it for now....
Every 15 minutes my important data is rsynced to my colleague's house and his over to mine. We then each have a USB hard disk which rsyncs all of the local data. Works pretty well and ensures no data is lost if we have a fire/flood etc. All secured over SSH.
I have three raid 5's, 9 drives, haven't lost any data since I moved off FreeBSD. (gvinum is _finicky_, and when a good drive drops out, it's hard to be sure everything is going as expected. Yes, gvinum was a long time ago.)
When I build my raids, I partition them so that the largest partition is the size of a current biggest drive -- when I built my 1TB drive-based raids with three drives, I made two 1TB partitions on it. That way I really only need one drive to take all the data off (temporarily), format, and return. (If you're feeling particularly daring, and trust your drives, you could even use the third spare drive for this and rebuild the RAID after completion.)
If one drive dies, I have two options: turn the RAID off (take the disks out) while I RMA the dead drive, or order a drive to arrive in a couple days and rebuild. It's not like disks are expensive now, and I don't order a spare at build because I know I can save 50-70$ if I order it when I need it in 2 years, and I don't have to worry about that drive being aged, too.
Along those lines, order _different_types_ of drives for your raid: standard same batch goes bad at the same time issue, so order at least different batches. Differences between specs don't matter so much, you'll still get good speeds from them (but you're not doing this to maximize your speed, are you? I've never actually tested speeds between same disks and different.)
Ideally I'd like to move all of this to a RAID6, but I don't like the thought of wasting two drives for each raid -- may as well RAID10 or do 6 drives per raid, but I'm not doing that because I want easy rebuild ability... if I _need_ to, I can get two disks and rebuild the entire raid, as opposed to four. Maybe that just doesn't mean a lot. What I would _love_ to do is use a single extra disk for _all_ the RAID5's: 3 disk RAID5, two data, 1 parity. Three RAID5's. Now take one disk, and use that disk as a sort of "raided parity" for all three drives.. so if two of any one raid5 failed, I'd have that 4th disk shared between the three raids as a fourth backup. Along similar lines, less useful but as viable, having that extra drive counting as a hot spare for any of the RAIDs should a drive drop out. I don't know of any RAID level supporting this (people are going to say just use RAID6), but it _may_ be possible with an odd combination of LVM and RAID..
I feel like I'm headed toward RAID6. Probably with my next array I'll get 3 more drives and combine those with one of the arrays that I have now -- 6 disks, 4 usable, but two losable before a real problem (and I would hope that RAID6 gives me a sort of data integrity checking/correction, whereas RAID5 can only inform you of data mismatch).
To date my biggest problem has been with dropped good disks: power loss on external enclosure (power cable issue, failed power adapter), incomplete writes (machine freezes, RAID slightly inconsistent, subsystems (DM, mostly) holding devices open so I can't stop the raid, driver bugs.. I've run into a _lot_ of kernel bugs around RAID), and port multipliers (not much supports them, the best-supported devices tend to be PCIe 1x and suck for multiple drives, and everything else is EXPENSIVE-- go SAS, it's cheaper on ebay).
My favorite enclosures are Venus DS3r two-disk enclosures (port multipliers required..) and Sans Digital 4-bay enclosures (port multipliers... but converting to SAS, 45$/unit plus those funky power cables with activity and presence status leads.)
A filesystem tip: mkfs creates enough inodes for 4096 (1 block(?))-byte files to fill the entire drive. This is incredibly wasteful -- who creates a series of small files over the entire disk? Mine average half a meg each. So, count the total number of files on your filesystem, then sum their byte total (don't use df, that reports usage other than all your files), and divide -- that number is the average file size. Divide your disk size by the average file size, that's about how many inodes you _need_. Multiply it by 2 or 4, for scaleabi
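The arithmetic in that tip (count the files, sum their sizes, divide, then scale by disk size) can be sketched as below. The function names and the default headroom factor of 2 are my own choices for illustration. On ext2/3/4 the resulting number could then be handed to mke2fs with its -N option, but that assumes an ext filesystem, which the tip does not actually say.

```python
import os

def average_file_size(root):
    """Walk root and return (file_count, total_bytes, average_bytes)."""
    count = 0
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # count real files only, not symlinks
            try:
                total += os.path.getsize(path)
                count += 1
            except OSError:
                pass  # file vanished or is unreadable; skip it
    average = total / count if count else 0
    return count, total, average

def inodes_needed(disk_bytes, average_bytes, headroom=2):
    """Estimate the inode count: disk size / average file size, times a safety factor."""
    if average_bytes <= 0:
        raise ValueError("average file size must be positive")
    return int(disk_bytes / average_bytes * headroom)
```

For example, a 1 TB disk (10^12 bytes) holding files that average half a megabyte (500,000 bytes) needs about 2 million inodes, or 4 million with a headroom factor of 2.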
Duplicity backup software: $5/mo (donated to EFF)
FDCServers Atom + 2TB HDD: $45/mo
Comcast Internet with 20 Mbps upload: $130/mo
Running into your 250 GB transfer cap in just 24 hours: priceless.
Daniel
TB's...the fuck? I have to guess your 'personal data' is photos, videos, useless crap etc. Personal data still fits on a floppy.. tax returns, birth certificate (2048RSA). "Real work is done on paper" - Michael Scott
Unless you are on a very limited budget, don't reuse old drives for primary backup.
Drives are cheap. Buy a new 1.5TB drive as your primary backup.
If you really want to be careful, use your old drives for secondary backup
I probably have 30 drives on my shelf after years of backups.
If you buy used at least. I bought a 16 tape LTO3 library and 20 LTO 3 tapes for around $500 used on ebay about a year ago. It might be even cheaper now. And if you don't mind changing tapes every hour or so, LTO3 stand alone drive can be had for $200. Also, if you're only going to deal with only 1TB worth of data for a while, LTO2 is more than enough, and a used LTO2 autoloader can be had for under $200. Hard drives are never a proper backup solution. The data can be lost(without paying a few thousands for recovery at least) any time you plug in the hard drive. The tape solution is just so much more stable as a data storage platform, I'd say you look into getting used LTO2 autoloader at least. They really shouldn't cost more than a couple of hard drives.
http://www.bhphotovideo.com/bnh/controller/home?O=&sku=608030&Q=&is=REG&A=details
This sounds totally stupid, but this is basically a 4TB drive loaded with Linux and a gigabit ethernet port. Plug it in and you can NFS/Samba/SSH to it all you want.
With SSH you can set up complex mirroring with multiples of these; it's really just an ARM chip attached to a 4TB drive. Nice part is the drive will go into power save, which saves your drive over time.
I use this setup a lot for instant local storage server, for the cost, power savings and flexibility it's hard to beat. I move large files to local for editing then shove them back into the ethernet storage when i'm done.
I use a 3TB (4TB -1) OWC Mercury RAID 5 array that is backed up constantly via Apple Time Machine.
Time Machine rocks. By the time it starts bucketing data, it will be at the 3-year mark. I keep my eye on the disk health, and am prepared to swap out as necessary.
I keep an external drive rsynced to my main drive, so I have an immediate backup, if necessary. I have used it, on occasion (sucks when I do -my primary drive is an SSD).
That's my personal data. I have a Mini that uses a Drobo to store my Web site stuff.
I don't keep any data from my "day job" on my personal systems. My system at work is backed up very well indeed.
BTW: I use git for my personal source control, and Perforce for my day job source control.
"For every complex problem there is an answer that is clear, simple, and wrong."
-H. L. Mencken
Your "problem" is not well defined. A problem is the difference between the way things are and the way you want them to be. In your case, you have a large amount of digitally stored data and you are afraid that it might be lost. What do you need to do so you don't have that fear? (This is not a "management" problem, but I will get to that later.)
The only practical solution is to keep duplicates somewhere "safe", with safe being defined as someplace where you think you will be able to recover it in an emergency. Maybe the emergency is just a local hard drive failure, but it doesn't hurt to think of the effects of floods or a tsunami. RAID is good, but I have customers who spent a lot of money trying to recover data stored on RAID arrays because they didn't have another backup. No matter what, you are going to lose some data in an emergency. Figure out how much you can afford to lose. This establishes the timing of your backups. 1 day? Ok, differentially back up your data 3 times per day on different backup media and you will not likely have to ever recreate more than a day's worth of data. Then arrange for the media to be copied or stored offsite so local disasters don't affect it.
I predict that loss of personal data is going to be a big problem a few years from now. Even today, how many people who taped their kid's birth in 8mm film can actually find a way to view it now? What about those files, ideas and manuscripts that you saved on that Z80 running CP/M and had the MFM drive? Oh, and those old CD's that are now rotted away? Management means deciding how long you want this info to be available. Do you want your great-grandkids to see how it was to live in the USA before Communism replaced our Constitution? Better plan for it.
Management might also include retrieval. I know at least one "backup solution" that issued a version of software that couldn't restore the data. I have over 3000 books. I don't need them every day, but I do occasionally like to go back and research or review some of the stuff I've read. Digital storage should allow us to have a tremendous amount of data at our fingertips if we can only put our fingers on the right set of facts... Semantic search is not quite good enough but it's improving. In the meantime, you might do worse than to use the data-cache model for retrieving your knowledge base.
You can't get around the budget problem. You can purchase reliable solutions or you increase your risk.
Family scrapbook solution: We created a scrapbook and made 6 copies which we sent to all the siblings in my generation of kids. If my brother's house burns down we can re-create his scrapbook for him. Use the same idea for digital safety and you might have a high probability of recovering it.
"The mind works quicker than you think!"
If you're going to do RAID, seriously consider RAID6 over RAID5. Yes, the extra disk costs money and a port, but the bathtub failure probability curve suggests that after infant mortality and during the 20-30 hour RAID5 rebuild after a 1-drive failure, you have a significantly non-zero probability of a second drive failure (especially considering you'd be running it at 100% load for those 20-30 hours). My other solution is "rm -rf /*".
I have a similar problem in that I have 3 copies of data. One on a laptop, one on a server and an external drive.
I guess I could plug them all into the laptop and start going through manually, but it's a lot of work, right? Some kind of program for detecting dupes would be handy but that's only part of the solution... man it's a lot of work, perhaps I'll just delete everything apart from Docs and Pics...
a job for a digital assistant... if only I could trust them with payslips etc...
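A "program for detecting dupes" along the lines wished for above does not have to be complicated. Here is a minimal sketch, with made-up function names, that groups files by size first and only hashes the files that share a size with another one. It reports duplicates and deletes nothing.

```python
import hashlib
import os
from collections import defaultdict

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()

def find_duplicates(roots):
    """Return a list of groups, each a sorted list of paths with identical content."""
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                if os.path.isfile(path) and not os.path.islink(path):
                    by_size[os.path.getsize(path)].append(path)

    by_content = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            by_content[(size, file_digest(path))].append(path)

    return sorted(sorted(group) for group in by_content.values() if len(group) > 1)
```

Calling find_duplicates with the mount points of the laptop copy, the server copy and the external drive would give one group per duplicated file; deciding which copy to keep is still the manual part.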
I wrote an erasure coded filesystem using Jim Plank's old ecc library. Supports multiple encoding forms, e.g. I mirror processed photoshop across 6 drives, but keep original raw files in a 3 data + 2 parity erasure code. The import process automatically selects the "least-loaded" drives, so when I want more space I buy a new drive, copy the smallest drive's contents to it and put the new bigger drive in, poof, more space; I've debated writing an official rebalancer, but haven't needed it yet. Intended for that type of archival model, not intended for being able to do overwrites of existing files. Available at: https://github.com/eric-anderson/eccfs; possibly so poorly documented that it's unusable by anyone but me. I've been using it for many years, I think I have a few more fixes on the machine that actually runs it, but since I'm away from that machine now I don't know (the github copy is what I have on my laptop right now).
The same thing happens to people who can't throw away a candy wrapper.
The physical problem has already been solved. Buy x number of x TB hard drives and put them in some type of configuration where they are all accessible at once. The problem with (desktop) hard drives is that statistically for every xTB you read an unrecoverable read error will appear. I manage over 100TB right now at my job using enterprise grade systems and almost monthly we have the system reporting a URE. The solution for that is checksumming. ZFS works great as does BtrFS and it keeps your data intact or at least reports what is bad about it.
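For anyone not on ZFS or Btrfs, the checksumming idea above can be approximated at the file level. The sketch below is an assumption-laden stand-in, not how those filesystems work internally: it records a SHA-256 per file and later reports the files whose contents no longer match. Unlike a checksumming filesystem with redundancy, it can only detect damage, not repair it. The function names are invented for this example.

```python
import hashlib
import os

def sha256_of(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = sha256_of(path)
    return manifest

def verify(root, manifest):
    """Return the sorted relative paths that are missing or whose content changed."""
    bad = []
    for rel, digest in manifest.items():
        path = os.path.join(root, rel)
        if not os.path.isfile(path) or sha256_of(path) != digest:
            bad.append(rel)
    return sorted(bad)
```

The manifest would need to be stored somewhere other than the drive it describes, and a file that was edited on purpose shows up as "bad" until the manifest is rebuilt.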
"File system errors" are simply unacceptable, you either use a very shitty file system or as said, there is a bad block somewhere on the hard drive causing you issues.
Custom electronics and digital signage for your business: www.evcircuits.com
I just gave it all to Wikileaks and they take care of ensuring that I can get access to it later.
Keep all your important files in a version control system. Personally, I use Perforce (it's free for 2 users or less). That gives you: multi-revision history and checkin comments, an easy way to pull a subset of files to any computer in your house, and peace of mind that you don't need to worry about kids deleting anything important as it's all stored on the server with history. Also easy to see what has changed on any computer and check those files in. And there's a big win for data integrity checks: Perforce stores the checksum of all files (and revisions) and can easily check that every file still matches the checksum in the central database. If you have any disk corruption, you'll know about it when you run 'p4 verify -q //...'. You can store files of several gigabytes each with no problem.
On top of this, I use rsync to copy the server data onto backup drives. I'm also looking at storing backups online, but haven't taken that step yet.
I've been using this system for years and I couldn't imagine being without it. It's so easy to find and retrieve exactly what I want - my resume 5 revisions ago, my tax return, photos from 2003. Even without that, the data integrity checks give a lot of peace of mind.
I'm tackling this issue at the moment. My current strategy is ZFS + snapshots + incremental backups stored on S3:
http://japanesesoapbox.blogspot.com.au/2012/03/how-to-do-your-own-dirt-cheap-cloud_25.html
ZFS.
Read a little Stoic philosophy, such as Marcus Aurelius. There are some wise words about how to live within your means.
A few have posted the same, but since the OP asked, I figured I'd post my own setup:
I had the exact same problem, lots of data I did not want to lose and hard drives started to fail. I think backups to CD/DVD/Blu-Ray or external hard-drives are messy and troublesome, I wanted something automatic. I re-activated one of my old computer cases and bought a decent motherboard for it with a good SATA controller. I added 5 drives, 1 system drive, 3 RAID drives and 1 internal backup drive. The RAID currently has a size of 1.5TB. All drives sit in a cage that's accessible from outside and which allows me to hotswap them (while the system is running) without even requiring a screwdriver. The system sits in a closet where nobody notices the noise and is connected to my network with a gigabit controller. The OS is a standard Ubuntu installation with mdadm.
It is set up so that there is a daily incremental backup of some vital files to the internal backup drive, with weekly full backups. Additionally, I have CrashPlan running that continuously backs up all data from the RAID into CrashPlan's cloud. If the backup is failing, I get an email to check what's wrong. If one of the drives is failing, I get an email and can replace it while the system is running - the rebuild starts automatically. I tested it while building the system, it's actually pretty sweet. :) The odd thing is that since I installed the system (about 3 years ago), I never had a single component fail on me.
I mount most directories via NFS to my desktop system, my girlfriend (who insists on using Windows) can access her data via Samba. I additionally run a DLNA server to stream music and movies to my TV, iPod or whatever is equipped with a UPNP client.
As for the cloud problems that OP mentioned, I don't know where the problem is. CrashPlan allows you to use private keys for encryption and if you're in the States, you can just send them a drive to seed the backup. Since I am not in the US, this option was not available to me. I just let the server run for a few weeks, uploading continuously. After that was done, the bandwidth and time required to keep the backup up-to-date is minimal.
There is a ton of other cloud backup companies out there that provide similar services. They're usually not for free, but quite honestly their prices will also not bankrupt you.
I keep important stuff (spreadsheets, letters, vector graphics, coding projects) in a subversion repo on my "home-server" which I keep synced with my laptop and desktop. That way, I always have backups of this stuff and I can update/commit from remote locations.
Now for the stuff I don't want in the svn because it's just too big and I don't really need a history (photos, videos): that just resides on the server, mounted via CIFS.
On some Sundays (maybe about 4 times a year) I hook up an external SATA drive, put an encrypted filesystem on it and do manual backups (not always the same stuff and not all of it, because I don't want to pay for that many big drives). That drive then gets transported to alternating family members' places for safekeeping.
SuperMicro X7SLA-H board (E150 when I bought it more than a year ago)
A couple of WD Greens (E100 each at the moment)
A low power silent PSU (E100) (WARNING: this one has no ground and therefore no decent surge protection. Always combine with an external surge protector)
Some RAM that fits (E50) (FreeNAS advises 1 GB per TB of hard disk, but will function perfectly under low loads with much, much less)
A case (free if you have one idling in the attic)
FreeNAS (free).
Total: E600 for 3x 1.5 TB (3 TB under RAID 5), expandable quite a bit with PCI-E SATA cards (E50 for 4 devices). RAID controllers are overrated for home use. Soft RAID gives less trouble if the hardware dies.
Well, I might have a way, but it only works on a semi spherical planet in a vacuum.
Manage your backups with bup (available in recent ubuntus, and in apenwarr's git repository). It is "the awesome", as they say.
It's backup how it should be done. I'm not sure what data you actually keep, but if it is e.g. a lot of similar virtual machine images (e.g. 50 different Win7 VM images), you might find that your backup set is 10 times smaller than the original set, because bup detects repeating data in the source and stores it just once, to great effect. "Deduplication" is what it is usually called, but bup goes a lot farther than most.
(Though, pay attention to the section on PAR redundancy -- the downside to deduplication is that if the single deduplicated copy goes bad, it takes every virtual copy with it)
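To make the deduplication idea above concrete, here is a minimal Python sketch. It is not bup's actual algorithm: it assumes simple fixed-size chunks and a plain in-memory dictionary, and the two "VM images" are made-up byte strings. It only shows the principle that identical chunks are stored once and each file becomes a list of chunk hashes.

```python
import hashlib

def dedup_store(blobs, chunk_size=4096):
    """Split each blob into fixed-size chunks and keep one copy per unique chunk.

    Returns (store, recipes): store maps chunk hash -> chunk bytes,
    recipes holds, per blob, the ordered hashes needed to rebuild it.
    """
    store = {}
    recipes = []
    for blob in blobs:
        recipe = []
        for i in range(0, len(blob), chunk_size):
            chunk = blob[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # stored once, however often it repeats
            recipe.append(digest)
        recipes.append(recipe)
    return store, recipes

def restore(store, recipe):
    """Rebuild a blob from its list of chunk hashes."""
    return b"".join(store[d] for d in recipe)

# Two hypothetical "VM images" that share most of their content.
base = bytes(range(256)) * 64
image_a = base + b"A" * 4096
image_b = base + b"B" * 4096

store, recipes = dedup_store([image_a, image_b])
raw_size = len(image_a) + len(image_b)
stored_size = sum(len(chunk) for chunk in store.values())
print(raw_size, "bytes of input stored as", stored_size, "bytes of unique chunks")
```

The sketch also shows the risk mentioned above: every recipe that references a chunk depends on the single stored copy, so a damaged chunk breaks every file that uses it.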
I'm currently building my own multi-solution. Beige box solution is the way to go.
My game rig/media server/work station is a fast AMD Phenom with 1 150GB 10K rpm OS disk, and then four 2TB disks in RAID 6 as "storage".
Eventually, I'll separate the boxes into one Amusement machine and another for media player/storage. I have CAT6A cables in the walls here.
The storage disks are backed up over the network to my 4 disk Netgear NAS with "RAID X" (~6). I'm thinking 3 times a week.
PC and NAS are on UPS, so I hope to get a 2nd NAS placed at my brother's house (in return for some storage space at my place) for redundancy with monthly backups.
of some sort, use RAID 6 if possible, with as many/large drives as you can, and keep a couple spares handy. Backup super important stuff to the cloud (www.backblaze.com)
If security is an issue, TrueCrypt is your friend.
Don't use RAID 5.
RAID5 was a great solution when disks were very small and very expensive. Now they are relatively huge and cheap, so RAID5 is almost never a good idea.
I'd keep the music. One of my favorite hobbies is running music through Audacity to fiddle with the tempo and/or pitch. There's a ton of songs that I only sorta like as is, but they become favorites when I blast off a couple of custom adjustments.
But give or take a couple of years, we may see a potentially explosive breakthrough in storage tech, so it may not even be worth nitpicking over files if you think you'll want them later.
My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
If the submitter doesn't want to just trash his disks, it might work to reorganize them. For me at least, office files tend to be small, so make one of those small drives office/text data only. Then he can get a big new drive to hold all the music and videos. If he still has more small drives to use, the third could hold backup copies of software installers.
RAID 5 still gives n-1 drives' worth of storage for n drives, so even with the larger drives available it is still a pretty efficient way to get fault-tolerant storage. Mirroring (RAID 1) is also cheap given the size of the drives available vs their cost, but is about the least efficient storage: n/2 drives' worth of storage for n drives. If you are going to be backing up a high volume, fault tolerance makes sense, as does a soft de-dupe if you can set it up using something like Linux on an older PC chassis.
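The capacity arithmetic in this comment can be written down as a small Python sketch. It assumes equal-size drives and ignores filesystem and metadata overhead; the function name and the "mirror" label (pairs of mirrored drives, giving the n/2 figure) are just for illustration. RAID 6, which comes up elsewhere in the thread, is included for comparison.

```python
def usable_tb(level, n_drives, drive_tb):
    """Usable capacity in TB for n equal-size drives, ignoring overhead."""
    if level == "raid5":
        if n_drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        data_drives = n_drives - 1   # one drive's worth of parity
    elif level == "raid6":
        if n_drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        data_drives = n_drives - 2   # two drives' worth of parity
    elif level == "mirror":
        if n_drives < 2 or n_drives % 2:
            raise ValueError("mirrored pairs need an even number of drives")
        data_drives = n_drives // 2  # every drive has a twin
    else:
        raise ValueError("unknown level: " + level)
    return data_drives * drive_tb

for level in ("raid5", "raid6", "mirror"):
    print(level, usable_tb(level, 4, 2.0), "TB usable from four 2 TB drives")
```

With four 2 TB drives this gives 6 TB for RAID 5 and 4 TB for both RAID 6 and mirrored pairs, so the efficiency gap only shows up against RAID 5 at this drive count.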
You need to keep extra hard drives around. There isn't anything that comes close for being price competitive.
All my data that I care to keep backed up fits on a 2TB drive. I have three of those 2TB drives, two are live in my server and manually mirrored via script bi-monthly. The third drive sits in a safe deposit box in a local bank's vault.
Once a month I pull the secondary drive out of the server, drop off that more-fresh backup in the safe deposit box, and take the outdated backup drive back to my server and put it back in.
My data does not change too rapidly, so my periodic backups work for me. I'm protected against single-drive failure, and I'm protected against the house burning down. The only thing that could get me is a meteorite destroying the entire metropolitan area. Maybe I should get a fourth drive that I ship to a friend who lives out-of-state.
Also, I use TrueCrypt to keep those 2TB drives secure when they aren't in my possession.
I'm sure people using Synology NAS appliances could do something quite similar with the hard-drive-safe-deposit-box-shuffle.
I think the problem you need to solve first is how to keep your personal data at a reasonable size. I can't see what would take up terabytes. I'm sure a large portion of it is not that important and could get lost without affecting your life.
The way I do it is try to keep my *most important* data at a reasonable size (a few 10 GBs), compress all that can be compressed, and put the absolutely most important part of it on Dropbox.
I used to have your problem. Then I went out and bought a Synology 211 NAS. I love it. It has a fantastic interface (DSM 4.0). It is set up for RAID 1 (mirroring) at the moment. I have had integrity issues with my drives where I had to rebuild one of the drives. Other than that, I love the peace of mind I now have. Plus it is so compact... check it out.
Huh? [devShell.org]
1) badblocks -swo foo.bb /dev/foo #this will erase all data on disk. If it reports even one single bad block, toss the disk out.
2) git-annex (if on Unix)
I am a big fan of zfs for its data integrity features. What I did is below
Home Server (you can configure a minimal hardware file server with components to fit your needs, this is just what I used) Quad core Athlon X4
16gb memory
LSI HBA SAS
9x2tb hdd
The hard drives are configured using raid-z2 with one spare. I am using zfs on linux now, but freebsd works great, and their forums are very helpful for people new to bsd. This server is overpowered for my uses, but it also is a media server, web server, samba server, etc. ZFS has snapshots as well, which work great.
Off site (at a friend's house)
Core2 Duo
8GB memory
onboard sata
4x2tb hdd using zfs / raidz2
This system is a low power system sitting in his closet using his internet connection. I only sync the stuff it would be difficult for me to replace. The initial filesystem was created at my house, directly off my server, so rsync only has to send deltas.
I looked at online backup solutions, but my upload speed is so slow (512K up) that it would take forever to get the initial upload into the vapor. Further, the low end, low cost providers I would look at are more likely to fold, leaving me without a backup.
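For a sense of scale, the upload time can be estimated with a few lines of Python. This is a rough sketch under assumptions: "512K up" is read as 512 kilobits per second, the link is fully saturated around the clock, and the 1 TB data size is a made-up example rather than the poster's actual figure.

```python
def upload_days(data_bytes, uplink_bits_per_s):
    """Days needed to push data_bytes through a fully used uplink."""
    seconds = data_bytes * 8 / uplink_bits_per_s
    return seconds / 86400

# Hypothetical example: 1 TB (10**12 bytes) over a 512 kbit/s uplink.
days = upload_days(10**12, 512_000)
print(round(days), "days for the initial upload")
```

Under those assumptions a single terabyte takes about 180 days, which is why seeding the remote copy locally and only sending deltas afterwards is attractive on a slow line.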
This is Grok's wife. I had to answer for him after he ruined two monitors trying to chisel his answer onto my new 22" monitor.
I've had a good experience with MooseFS (http://www.moosefs.org/) using my macbook pro and a couple of old BSD boxes. There are other distributed filesystems that will give you peace of mind in terms of storage resiliency - http://en.wikipedia.org/wiki/List_of_file_systems#Distributed_file_systems
I have a Norco rack mount server with an Areca RAID card. I currently have 6 2TB drives in RAID 6. They are Hitachi "green" desktop drives. The controller died a few months ago, and I RMAd it. Once I got it back, all was well. I didn't lose ANY data. I think the risk of RAID card failures and drive failures is way overrated. From what I see, people are saying parity is unsafe. I just don't get it. My future plans are a Supermicro case, 2 Xeons, tons of RAM, a bunch of LSI HBAs, 24 drives (haven't decided on the size yet, the shortage is being played well by the corporations), and a pool of 2 striped raidz2 arrays (like RAID 60). I'm not much of a fan of backups. I am a home user and my data is not worth the cost of a second server with another 24 drives. A tolerance of 4 drive failures will be plenty of peace of mind for me. Cloud storage is BS unless you have some unlimited gigabit fiber ISP. Even then, certain data can become contraband and be deleted.
RAID5 on a small number of large drives is not particularly fault tolerant. By the time you rebuild the one failed drive you'll have another one failed, in practice.
It's got lousy performance too, because it has to write parity bits and incomplete blocks.
You need LOTS of drives to make RAID5 worthwhile.
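One argument often made for this kind of claim (not necessarily the one the poster has in mind) is about unrecoverable read errors during the rebuild rather than a whole second drive dying. The Python sketch below is a back-of-envelope model only. It assumes an error rate of one unreadable bit per 10^14 bits read, a figure commonly quoted on consumer drive spec sheets, and it assumes errors are independent; real drives often do better, so treat the result as illustrative.

```python
import math

def rebuild_success_probability(surviving_drives, drive_tb, ure_per_bit=1e-14):
    """Chance of reading every surviving drive in full without one read error.

    ure_per_bit is an assumed spec-sheet style error rate, not a measured one.
    """
    bits_to_read = surviving_drives * drive_tb * 1e12 * 8
    # (1 - p) ** bits, computed via log1p to stay accurate for tiny p
    return math.exp(bits_to_read * math.log1p(-ure_per_bit))

# A 4 x 2 TB RAID 5 with one failed drive: the other three must be read completely.
p = rebuild_success_probability(3, 2.0)
print("chance of a clean rebuild: {:.0%}".format(p))
```

Under these assumptions the clean-rebuild chance is roughly 62%, and it drops further as drives get larger. A second parity drive (RAID 6) lets the array ride through such an error.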
Buy 2 servers, preferably used storage array servers. Set up a RAID 5 or 6 array on one. This is the storage server and your main storage: store ALL data on it. It's helpful to have it support multiple access methods (SMB, NFS, iSCSI, etc). You could go with a full OS like Debian or something like OpenNas or OpenFiler (BSD based).
On the second box, add as much storage as is accessible on the first. This is the backup server. Run a cron job to regularly rsync the data from the storage server over to the backup server.
In this configuration, you have some redundancy in the RAID and a true backup in the second server. You also have the ability (hopefully) to drop in drives as you need, so you can expand as you go. And if the hardware itself breaks, you can simply replace it and keep going.
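The cron-plus-rsync step could look something like the Python sketch below. The paths, the host name backup-server and the script location in the crontab comment are all hypothetical. The only rsync options used are -a (archive mode) and the optional --delete, which removes files on the backup that no longer exist on the source and therefore also propagates accidental deletions.

```python
import subprocess

def build_rsync_command(source, destination, delete=False):
    """Build the rsync argument list for copying source to destination."""
    cmd = ["rsync", "-a"]          # -a: archive mode, keeps permissions and times
    if delete:
        cmd.append("--delete")     # mirror deletions too; use with care
    cmd += [source, destination]
    return cmd

def run_backup(source, destination, delete=False):
    """Run rsync and report whether it exited cleanly."""
    result = subprocess.run(build_rsync_command(source, destination, delete))
    return result.returncode == 0

# Hypothetical crontab entry to run this script nightly at 03:00:
#   0 3 * * * /usr/bin/python3 /usr/local/bin/backup.py
# Here the command is only printed; a real script would call run_backup().
print(" ".join(build_rsync_command("/srv/storage/", "backup-server:/srv/backup/")))
```

Leaving --delete off by default means the backup server keeps files that were removed from the storage server, at the cost of the backup slowly growing.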
I do security
I use a variety of methods to back up my files. First off, it's only about 140GB compressed, but I limit it to "important" files and even that could be trimmed down.
My setup:
Linux box with a 1 TB encrypted HD for my data. 1 TB external HD, encrypted, for weekly backups using rsync. Dropbox with an encrypted container for a small, selected amount of not overly personal data. I'll never drop personal data in Dropbox because the company has access to my encryption key. A small backup goes to Wuala: slightly more personal data than Dropbox, but nowhere near the bulk. I also have an issue with upload speed (650 Kb) and a 60G cap. Every 6 months I back up (update) photos to DVDs and put them in a lock box, and also keep a set locked away at work. Finally, every year I travel to my parents' place for vacation and make a copy of the external HD to keep at their place.
It's a bit of work to keep on top of but it becomes routine.
How many people actually restore these backups? Too many times I've seen backups go bad because they were never checked. I randomly do a full restore 2-3 times a year.
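One way to check a test restore is to compare checksums of the restored files against the originals. The Python sketch below is a minimal example of that idea: the function names are made up, it only looks at regular files, and the demo uses two throwaway temporary directories with invented file names standing in for the original data and the restored copy.

```python
import hashlib
import tempfile
from pathlib import Path

def file_digest(path):
    """SHA-256 of a file, read in 1 MiB blocks so large files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

def compare_trees(original, restored):
    """Return relative paths that are missing or differ in the restored copy."""
    original, restored = Path(original), Path(restored)
    problems = []
    for src in sorted(original.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original)
        dst = restored / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(str(rel))
    return problems

# Demo with made-up files: one restores fine, one comes back corrupted.
with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    (Path(a) / "notes.txt").write_text("same")
    (Path(b) / "notes.txt").write_text("same")
    (Path(a) / "photo.jpg").write_bytes(b"original bytes")
    (Path(b) / "photo.jpg").write_bytes(b"corrupted bytes")
    demo_problems = compare_trees(a, b)
print(demo_problems)
```

An empty list means every original file came back bit for bit; anything listed needs a closer look before you trust that backup.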
Keep a spare blank drive, and keep a copy in a Disaster Recovery (DR) site (friend/family) or DR fireproof safe.
Okay, I'm sold. :)
What would a better alternative be, then? At home, I've got a Linux RAID-5 configuration running on my file/media server/MythBox. I built it several years ago with four 500 GB drives. Its performance has always been more than enough to handle standard-def MPEG2 recording from the TV encoder. Soon, though, I will need to upgrade it with larger drives and that would be a good time to switch from RAID-5.
At work, I'm about to build a new more powerful workstation/server. To say that my budget is constrained is putting it lightly. Currently, it has two 1 TB drives in a RAID-1 config. We had an external backup/snapshot drive, but it has since died. Our write speed requirements are not at all extreme, so current drive speeds should be fine.
I guess RAID-6 seems like a good alternative in these cases, supposing the array is built with a minimum of four 1 TB drives? I don't think I need the performance of RAID-10. At least, at home, I'd rather have the extra space. I'm also not considering SSD drives in these two machines as I can't justify the cost.
So much to consider...
Elrond, Duke of URL
"This is the most fun I've had without being drenched in the blood of my enemies!"-Sam&Max
I have implemented several types of ISCSI targets here on wanfuse.blogspot.com.
You can read through the articles posted there. They are mainly implemented using VMWare or KVM virtualization along with a virtualized copy of Nexentastor, but you could just as easily implement it without virtualization (probably easier for a novice) by using the Nexentastor disk here @ http://www.nexentastor.org/projects/1/wiki/CommunityEdition
You are allowed to have a datastore of up to 14TB on the free license
As far as storage software goes, it is much simpler to operate than Openfiler or any of the other systems that are around.
It comes with a web interface which you access remotely from your existing workstation so there is no need for the command line like in my examples on wanfuse.blogspot.com (which is for the trained admin to use).
You will need a computer with at least one NIC, preferably gigabit Ethernet, and 2 or 4 drives set up in a RAID format, either raid 1z or raid 10z (four 2 TB drives will do), and about 8 GB of RAM (getting cheaper now). It doesn't need to be the latest; it can be the same model computer found here http://wanfuse.blogspot.com/search?q=my+esx+purchases which is two years old now (search my blog for Supermicro). Get a full-tower case that has proper cooling and space for the drives, and put it on a UPS.
Alternatively you could encrypt your data using something like TrueCrypt (or use the algorithm built in to Nexentastor)
You could then store it on the cloud using multiple layers of encryption which is unlikely to be broken by anything other than a quantum computer built by the government.
Don't forget to get a 800 to 1000 watt power supply.
It's not a free solution but it's an excellent one... Nexentastor supports multiple copies of data on disks (in case of corruption), and you could put this storage unit on the network of a family member you trust and create a VPN tunnel to copy the data over. If you're afraid of using open source software (Solaris based), there are many cheap commercial NAS-type storage units you can buy (make sure they take multiple disks for data protection) and you can access them through Microsoft sharing, such as this "http://www.google.com/products/catalog?hl=en&q=NAS+storage&ix=seb&ion=1&bav=on.2,or.r_gc.r_pw.r_qf.,cf.osb&biw=1024&bih=640&um=1&ie=UTF-8&tbm=shop&cid=7981614780730855982&sa=X&ei=GPJwT-ebB-Tx0gH1t4XcBg&ved=0CLwBEPICMAU" without the quotes, which is a cheaper method but not nearly as professional or fast as my setup.
In my setups I heavily use ESX 4.1 free edition to access a virtualized storage unit using the method found here: http://wanfuse.blogspot.com/2012/02/newly-updatedlocal-raw-disk-mapping-to.html called raw disk mapping to virtualized guest(this method is the only method I find that works) even better than "pci pass through" (no purple screens of death on esx which is what happens when you use raw disk mapping from local storage).
Good luck!
I've been using RAID10 on my squeezeserver/mythTV box, personally, for about five years now. But I got a case of server-grade hard drives for free (nearly all my equipment is dumpster-diving booty) and honestly I built around what I had on hand. Outrageous performance but it's a noisy, power-hungry machine compared to most home servers, even with temperature controlled fans and so forth.
I'm about due for a home server rebuild too, and I was also thinking about RAID6 this time around. Unless I find another case of hard drives somebody's throwing out, I guess.
megaupload won't always be there for you
If you think backup is slow, wait until you try a restore or RAID rebuild.
Copy/sync to a duplicate file system on a replacement drive in an external enclosure.
When the internal drive fails, swap in the pre-loaded replacement, and refill the external enclosure.
This also allows for growth.
--
The Truth of Large Numbers - almost all numbers are larger than you can imagine.
Publish your data as a book. As a process it will force you to really consider what is valuable. Your photos and correspondence make the cut. It will push you to put more effort into writing your diaries or letters. Your really sensitive information can be locked in a safe. Most of that stuff is useless. If you had two minutes to evacuate you should be able to get what really matters. And I don't think it will be your tax returns!
I could have used many other Christian or muslim sects as example. point is the NWO was set up centuries ago, too late to worry, pal. The "technology" for that can be done by hand or by computer.
I use a combination of a 4x1TB in Raid5 (3ware 9650) for storage. And then use a combination of BackupPC on Ubuntu and DeltaCopy for Windows.