Slashdot Mirror


Ask Slashdot: How Do You Manage Your Personal Data?

New submitter multimediavt writes "Ok, here's my problem. I have a lot of personal data! (And, no, it's not pr0n, warez, or anything the MPAA or RIAA would be concerned about.) I am realizing that I need to keep at least one spare drive the same size as my largest drive around in case of failure, or the need to reformat a drive due to corrupt file system issues. In my particular case I have a few external drives ranging in size from 200 GB to 2 TB (none with any more than 15 available), and the 2 TB drive is giving me fits at the moment so I need to move the data off and reformat the drive to see if it's just a file system issue or a component issue. I don't have 1.6 TB of free space anywhere and came to the above realization that an empty spare drive the size of my largest drive was needed. If I had a RAID I would have the same needs should a drive fail for some reason and the file system needed rebuilding. I am hitting a wall, and I am guessing that I am not the only one reaching this conclusion. This is my personal data and it is starting to become unbelievably unruly to deal with as far as data integrity and security are concerned. This problem is only going to get worse, and I'm sorry, but 'The Cloud' is neither an acceptable nor a practical solution. Tape for an individual as a backup mechanism is economically not feasible. A Blu-ray Disc holds only 50 GB in the best case and takes forever to back up any large amount of data, along with a great deal of human intervention in the process. So, as an individual with a large data collection and not a large budget, what do you see as options for now (other than keeping a spare blank drive around), and what do you see down the road that might help us deal with issues like this?"

24 of 414 comments

  1. Keep a spare blank drive around by Anonymous Coward · · Score: 5, Insightful

    I think you already have the answer

    1. Re:Keep a spare blank drive around by AngryDeuce · · Score: 5, Informative

      Agreed. I've been gradually rotating larger backup drives in and smaller backup drives out over the last 10 years or so. Right now I have about 2 TB of unique data in my archive which is kept on the host machine if it is regularly accessed or duplicated on another external hard drive. Everything (I care about) has two copies at all times. As my archive grows, I'm going to have to upgrade my archive device's capacity, but that's a given: no matter what you do, if you want it stored locally, you'll have to add capacity somewhere. DVD-Rs and Blu-ray discs aren't a viable option in my opinion, because I've got a ton of old self-burned discs that I recently had to toss because they were rendered useless by laser rot, even though they were in sealed containers in a cool, dry place.

      The cloud is, to me, not a backup solution. I see it as a way to globally access my data and I use it as such. No sensitive data of mine will go to the cloud because the likelihood of needing access to it without warning is completely nil, so in my case, it's limited to media that I want constant access to. Now, the cloud definitely has the potential to serve as a backup solution, don't get me wrong, but there's just too much uncertainty involved in the cloud these days, especially as concerns the government nuking sites from orbit without warning, whether justified or not.

      However, I agree with some others that are telling you to do some house-cleaning. I recently went through my backups and found 300 GB worth of crap, dating back to the early 2000s, that I hadn't accessed or used and was saving for some stupid reason. Disc images for ancient games that don't even run well on modern systems (or require a lot of fucking hassle to get running well), music that I haven't listened to in half a decade, old-ass videos that I'd downloaded from the internet back before there was such a thing as YouTube, etc. Not to say that everyone's data is as silly as mine was, but it just added up over the years...

    2. Re:Keep a spare blank drive around by Jabroney · · Score: 5, Informative

      https://www.googleapis.com/urlshortener/v1/url?shortUrl=http://goo.gl/rIh07

      { "kind": "urlshortener#url", "id": "http://goo.gl/rIh07", "longUrl": "http://www.backblaze.com/partner/af3012", "status": "OK" }

      Trying to sell cloud solutions on Slashdot? You must be new here.

    3. Re:Keep a spare blank drive around by xaxa · · Score: 5, Informative

      Link contains a referral ID, so Shikaku is earning from this, but not willing to say so.

      Eventually, it ends up at http://www.backblaze.com/

    4. Re:Keep a spare blank drive around by JackDW · · Score: 5, Informative

      Right. Other than buying new disks, there is no good solution.

      The asker seems to be looking for some kind of "join all my small disks together" solution. And yes, he can do this. RAID-0 or LVM. But... don't do it! If even one of those disks fails, all the data is effectively gone. The solution is cheap to implement but totally worthless. Sorry, your 250 GB SATA disk now belongs in a museum.
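      To put rough numbers on that warning, here is a minimal sketch of how the risk grows when disks are striped or spanned together. It assumes each disk fails independently with the same probability; the 5% figure is purely illustrative and not a measured failure rate.

```python
def any_disk_failure_probability(per_disk_failure: float, disk_count: int) -> float:
    """Probability that at least one of `disk_count` disks fails.

    Assumes every disk fails independently with the same probability,
    which is a simplification: real disks in one box share heat and power.
    """
    if not 0.0 <= per_disk_failure <= 1.0:
        raise ValueError("per_disk_failure must be between 0 and 1")
    if disk_count < 1:
        raise ValueError("disk_count must be at least 1")
    # The volume survives only if every single disk survives.
    return 1.0 - (1.0 - per_disk_failure) ** disk_count


if __name__ == "__main__":
    # 0.05 is a hypothetical per-disk yearly failure chance.
    for count in (1, 2, 4, 8):
        risk = any_disk_failure_probability(0.05, count)
        print(f"{count} disk(s): {risk:.1%} chance of losing the whole volume")
```

      With RAID-0 or a spanned LVM volume, losing any one member loses the volume, so every disk you add raises the chance of losing everything.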

      RAID-5/6 is, IMO, also a bad idea; there are too many instances where the controller has failed or multiple disks have failed.

      The asker explicitly excludes cloud solutions. It's depressing that people have recommended various cloud solutions nonetheless. Apart from not being answers to the question, these solutions are totally awful for large quantities of data. Amazon S3 may be nearly free if you want to store a few gigabytes, but if you want to store a few terabytes you are going to pay through the nose, and all the other service providers are the same. 2 TB would cost $234 per month just for storage, transfer cost not included. For the price of two weeks of S3 storage you can buy a 2 TB external disk. For the price of upload, download and a month's storage, you can buy four or five such disks and have as much redundancy as any normal person could ever need.
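      The comparison above can be sketched as simple arithmetic. The $234 per month figure is the one quoted in the comment; the disk price is an assumption derived from the "two weeks of S3" remark, not a quoted price, and transfer fees are left out.

```python
S3_MONTHLY_USD_FOR_2TB = 234.0  # storage-only figure quoted in the comment
ASSUMED_DISK_PRICE_USD = 117.0  # hypothetical: roughly two weeks of that bill


def cloud_cost(months: float, monthly_usd: float = S3_MONTHLY_USD_FOR_2TB) -> float:
    """Storage cost in USD after `months` months, ignoring transfer fees."""
    return months * monthly_usd


def break_even_months(disk_price_usd: float, monthly_usd: float) -> float:
    """Months of cloud storage that cost as much as buying one disk."""
    return disk_price_usd / monthly_usd


def disks_affordable(budget_usd: float, disk_price_usd: float) -> int:
    """How many whole disks a given budget buys."""
    return int(budget_usd // disk_price_usd)


if __name__ == "__main__":
    print("One year of storage:", cloud_cost(12), "USD")
    print("Break-even (months):",
          break_even_months(ASSUMED_DISK_PRICE_USD, S3_MONTHLY_USD_FOR_2TB))
    print("Disks for one month's bill:",
          disks_affordable(cloud_cost(1), ASSUMED_DISK_PRICE_USD))
```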

      --
      You're an immobile computer, remember?
    5. Re:Keep a spare blank drive around by DarwinSurvivor · · Score: 4, Insightful

      1) Find a trusted person you see often or have easy access to (friend, neighbor, relative, coworker, etc).
      2) Each of you buys enough HDDs to duplicate your stuff
      3) On a regular basis trade drives, update backups, trade back
      4) If you are worried about security (either from them or someone breaking in), encrypt the drive(s) and keep one copy of the key with yourself and another in a safety deposit box (or another friend, etc).

  2. Enjoy your delusion by Trixter · · Score: 4, Informative

    "I'm sorry 'The Cloud' is not an acceptable nor practical solution." Not sure what brand tin-foil hat you're wearing, but there are cloud backup solutions that encrypt your data *before* it leaves the machine. I use CrashPlan (I can't speak for others) and I've verified the encryption myself by capturing the traffic leaving my machine, even when CrashPlan was backing up to other machines on my own private network. Even the data it writes to locally-attached hard drives is encrypted. So there's at least one company who gets it right.

    1. Re:Enjoy your delusion by Anonymous Coward · · Score: 5, Insightful

      It's great that you know how fast his connection is and exactly what data restrictions his ISP imposes. I'm actually rather impressed you can be 100% sure his computer is connected to the internet at all. All I know is that if I had that much data, the time it would take to upload would probably be longer than the time it takes for the HDD to wear down and implode.

    2. Re:Enjoy your delusion by burisch_research · · Score: 5, Informative

      You're assuming that it's encryption that's the problem. In my case, it's a problem with the size of data vs. how much bandwidth I can use. I get an allocation of 20GB a month, and even that's very expensive. Backing up my 5+ TB to the cloud is simply not an option.

      Cloud is very trendy right now, but that doesn't mean it's a one-size-fits-all.
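      A back-of-the-envelope sketch of why a capped connection rules this out. The 5 TB and 20 GB per month figures come from the comment; the 1 Mbit/s uplink is a hypothetical value, and decimal units (1 TB = 1000 GB) are assumed.

```python
def months_to_upload(data_gb: float, monthly_cap_gb: float) -> float:
    """Months needed to push `data_gb` through a monthly transfer cap."""
    return data_gb / monthly_cap_gb


def days_to_upload(data_gb: float, uplink_mbit_per_s: float) -> float:
    """Days needed at a sustained uplink speed, ignoring protocol overhead."""
    bits = data_gb * 1e9 * 8
    seconds = bits / (uplink_mbit_per_s * 1e6)
    return seconds / 86400


if __name__ == "__main__":
    # 5 TB behind a 20 GB/month allocation, as described in the comment.
    print("Months under the cap:", months_to_upload(5000, 20))
    # The uplink speed is an assumed value for illustration only.
    print("Days at 1 Mbit/s:", round(days_to_upload(5000, 1.0)))
```

      Under those assumptions the initial upload alone takes 250 months, before any new data is counted.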

      --
      char*f="char*f=%c%s%c;main(){printf(f,34,f,34);}";main(){printf(f,34,f,34);}
    3. Re:Enjoy your delusion by Wrath0fb0b · · Score: 4, Informative

      "You're assuming that it's encryption that's the problem. In my case, it's a problem with the size of data vs. how much bandwidth I can use. I get an allocation of 20GB a month, and even that's very expensive. Backing up my 5+ TB to the cloud is simply not an option."

      CrashPlan will let you Fedex them a hard drive to get the backup started. From then on, you only need to send deltas.

    4. Re:Enjoy your delusion by AmiMoJo · · Score: 5, Funny

      "I'm actually rather impressed you can be 100% sure his computer is connected to the internet at all."

      Well he did post his question to an internet forum...

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    5. Re:Enjoy your delusion by Anonymous Coward · · Score: 4, Insightful

      In typical "I need IT advice, but I have preconceived notions about how things should work and am not willing to budge on that" fashion, the asker has discounted some reasonable options without specifying the reasons that won't work for him, and failed to provide some super useful info like how large his data actually is, how often it changes, how much existing data changes, how much new data there is, and how quickly it grows.

      So it could be that the reasons for his concern are unmerited, and GP merely points out that if his concern is privacy, there's ways to use the cloud safely. In typical Slashdot fashion, you rebuke the potential shortcomings of the advice without knowing whether those shortcomings actually apply to the asker.

      Backup should be provided in depth: several prongs provide the best redundancy and the fewest single points of failure. Cloud storage is an excellent option for one of the prongs given certain factors. If most of the data rarely changes (pretty typical for very large data sets), incremental bandwidth usage past the initial storage is usually not much more than the data growth rate. As observed, it can be done in a way that respects privacy and safety.

      Cloud storage has two main advantages over local backup solutions. You won't run out of disk space, and it's off-site (so a house fire won't take out your data set). Any on-site solution automatically fails that level of redundancy. Storage on S3 is ridiculously inexpensive these days.

      I have about 6 TB of data that I need to keep backed up. I have about 12 years of digital photography and video originals, including stuff like wedding and honeymoon photos, as well as the birth and first years of my children's lives. When people suffer house fires, one of the most common and greatest laments is the things that can't be replaced, usually photographs.

      My solution is four tiers. I have a local RAID0 in my Mac Pro. I have Time Machine backups of that (this is hands-down the best consumer on-site backup solution on the market). I rsync those files to a local RAID10 NAS device (Synology units are a bit pricey, but they are completely worth it: really excellent built-in software with a lot of features you might find surprisingly useful, and you can purchase expansion bays to extend capacity when you're running low). Then finally I back up to Amazon S3 in encrypted form with JungleDisk (I no longer recommend this software. I own a copy of it from before it was bought by RackSpace; the quality has gone down since RackSpace bought it and "improved" it, plus I gather you now have to pay a monthly subscription AND pay for your own storage. Crap).

      The only way my data is in jeopardy is if my house burns down (taking out 3 local redundancy & backup solutions) on the same day that Amazon has a critical failure. And it's all 100% automated: RAID0 happens at time of write, Time Machine alerts me if there are problems creating a backup and gives me local history, my NAS warns me by email & SMS if it so much as writes too slowly (my rsync cron script emails me if it can't reach the NAS for some reason), and JungleDisk does a nightly sync with S3 and sends me weekly reports so I can be sure that it's doing its job. I have quick local access, and slow offsite access if everything else fails (I'd probably go bum my work's huge pipe to do the initial restore if I had to rely on that).
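      The commenter's rsync cron script is not shown, so the following is only a sketch of what such a wrapper might look like. The function names are hypothetical, the paths are up to the reader, and actual mail delivery is left out: the sketch just returns a message that a cron job could print so that cron, if configured to do so, mails it to the owner.

```python
import subprocess
from typing import Callable, List, Tuple


def build_rsync_command(source: str, destination: str) -> List[str]:
    """Build an rsync invocation that mirrors `source` into `destination`.

    -a keeps permissions and timestamps; --delete removes files on the
    destination that no longer exist in the source.
    """
    return ["rsync", "-a", "--delete", source, destination]


def run_backup(command: List[str],
               runner: Callable = subprocess.run) -> Tuple[bool, str]:
    """Run the command and return (succeeded, message) instead of raising.

    `runner` is injectable so the logic can be exercised without rsync.
    """
    try:
        result = runner(command, capture_output=True, text=True)
    except OSError as exc:
        # Raised when the program itself cannot be started.
        return False, f"could not start {command[0]}: {exc}"
    if result.returncode != 0:
        detail = (result.stderr or "").strip()
        return False, f"{command[0]} exited with {result.returncode}: {detail}"
    return True, "backup completed"
```

      A nightly job would call run_backup(build_rsync_command(source, destination)) and print the message only when the first element of the result is False.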

    6. Re:Enjoy your delusion by BlackPignouf · · Score: 4, Insightful

      "I have Time Machine backups of that (this is hands-down the best consumer on-site backup solution on the market)."

      Did you actually use it for recovery?

      Both my rotated TimeMachines were corrupt. They never complained during backups, but failed miserably while trying to recover my Pictures HDD.
      Only some of the backup files were corrupt, but when you try to recover a complete disk with TM, it's all or nothing, and the process stops after the first error, leaving you in the dust.
      I had to write a parsing script with ruby, "cp -avX", ditto and chmod in order to get my system back.
      It wasn't so hard, but it sure was stressful with one disk down, two corrupt disks and no other backup to get my pictures back.

      BTW, TimeMachine doesn't back up every file in your system, and is too stupid to realize that it should not begin from scratch after recovery: it needs twice the storage after that, because it thinks every file is new.

      My drives weren't big enough, so I had to wipe the backups and lose the local history.

      Fuck it. I began using Carbon Copy Cloner since then, and never looked back.
      It's free as in donationware, it works, it gives you a bootable backup that you can actually test and rotate properly, it can easily be automated, it archives the files that you've deleted between backups, and uses much less space than TimeMachine.
      I hear SuperDuper is just as good.
      TimeMachine is some crappy software with nice looking interface that gives you a false sense of security.

  3. Bare Drives and a USB Drive Dock? by wanderfowl · · Score: 5, Informative

    One way to save a bit of cash is to buy a USB/eSATA drive dock (single or double) with some bare SATA drives. This cuts the enclosure out, and allows you to buy bare drives, which are often cheaper than enclosed drives.

    You could also consider Drobo or one of the Wiebetech multi-drive RAID containers. But encryption + cloud isn't all bad.

  4. Budget by macemoneta · · Score: 4, Informative

    "large data collection and not a large budget"

    This is your problem right there. You can't enter into a situation like this without planning a budget for the inevitable failures. I suggest purchasing a new larger drive (3 TB drives are common now) and migrating the data from the problematic drive. Then migrate the data from several older smaller drives. This will reduce the component count (points of failure), save you power (cost in the long run) and keep you ahead of failures. You should plan on doing this periodically to maintain the integrity of the data.

    --
    Can You Say Linux? I Knew That You Could.

  5. Buddy NAS by Anonymous Coward · · Score: 5, Interesting

    I have a solution I call the "Buddy NAS". Go out and get two cheap computers. It could be a PC or a mini-NAS or a low-end server. Anything that will hold multiple hard drives. You jam both full of hard disks and use them as a backup/NAS server. One PC is kept at your place, the other at your friend's house.

    Both computers have an account for you and an account for your friend (it helps if your friend is nerdy and "gets" backup solutions). Both of you now have a backup solution in your own home and a remote backup server at a friend's place. Two copies of your data, one remote. Basically it's like having local and cloud storage for you and your friend and it'll cost less than a grand if you shop around. If neither of you have static IPs you can use dyndns.org to connect to the remote boxes. Bandwidth shouldn't be an issue if you use rsync to backup changed files nightly.
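    Why nightly bandwidth stays small can be sketched as follows. This mimics, in simplified form, the size-and-modification-time quick check that rsync applies by default to decide which files need attention; it is not rsync itself, and it counts whole files even though rsync can also send only the changed parts of a file. The function names are hypothetical.

```python
import os
import tempfile
from typing import Dict, List, Tuple

Signature = Tuple[int, int]  # (size in bytes, whole-second modification time)


def scan(root: str) -> Dict[str, Signature]:
    """Record size and modification time for every file under `root`."""
    index: Dict[str, Signature] = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            index[os.path.relpath(path, root)] = (info.st_size, int(info.st_mtime))
    return index


def files_to_send(current: Dict[str, Signature],
                  previous: Dict[str, Signature]) -> List[str]:
    """Files that are new, or whose size or mtime differs from last time."""
    return sorted(path for path, sig in current.items() if previous.get(path) != sig)


def bytes_to_send(current: Dict[str, Signature],
                  previous: Dict[str, Signature]) -> int:
    """Upper bound on the nightly transfer: total size of the changed files."""
    return sum(current[path][0] for path in files_to_send(current, previous))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "photo.raw"), "w") as handle:
            handle.write("x" * 1000)
        last_night = scan(root)
        with open(os.path.join(root, "notes.txt"), "w") as handle:
            handle.write("new file")
        tonight = scan(root)
        print(files_to_send(tonight, last_night), bytes_to_send(tonight, last_night))
```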

  6. I delete stuff by Amiga500_Rulez · · Score: 5, Insightful

    Seriously. How much crap do you really need to keep around?

  7. Magic by lucm · · Score: 5, Insightful

    So your disks are full and possibly broken. You don't want to have more disks, you don't want tape or optical media, and a storage provider (aka The Cloud) is not an option... Then you have three solutions "down the road":

    1) Delete stuff
    2) Invent a new compression algorithm that will allow you to reuse the same disks forever without losing data
    3) Rely on magic*

    *might overlap with solution #2

    --
    lucm, indeed.
  8. It's that time again, is it? by QuasiSteve · · Score: 4, Insightful

    It's that time again, is it?
    http://ask.slashdot.org/comments.pl?sid=2452630&cid=37557630

    Either..
    A: Buy that HDD. Yes, they're a bit more expensive right now ..or..
    B: Wait a few months, prices will come down again, buy that HDD then. Yes, you may lose your data in the meantime.

    Now stop asking or I'm going to pull over.

  9. Hoarders: Digital Edition by joocemann · · Score: 4, Funny

    You might be on the next spinoff of Hoarders programs, a digital hoarders show.

    In this show, redundancy, old versions, and files that haven't been opened in 5+ years are brought into question, and you will be embarrassed to defend them... You will attempt to justify why you still have Linksys drivers for a WRT54G you don't even have anymore. And no, the DVD ISO of the Alvin and the Chipmunks movie, that you never burned or watched, is not worth saving.... Neither are about 85% of the digital pictures you took (you know, the ones that were the 'bad shot' that you took before finally getting the good one).

    Take a day or two, go through it chunk by chunk, and purge! PURGE!
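    A small sketch for finding purge candidates. It lists files whose modification time is older than a chosen number of years; modification time is used as a stand-in for "last opened" on the assumption that access times are often not recorded reliably. The function name and the 5-year threshold are illustrative, and nothing is deleted.

```python
import os
import tempfile
import time
from typing import List, Optional, Tuple

SECONDS_PER_YEAR = 365 * 24 * 3600


def stale_files(root: str, years: float,
                now: Optional[float] = None) -> List[Tuple[str, int]]:
    """Return (path, size) for files not modified in the last `years` years."""
    now = time.time() if now is None else now
    cutoff = now - years * SECONDS_PER_YEAR
    found = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            if info.st_mtime < cutoff:
                found.append((path, info.st_size))
    return sorted(found)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        old_path = os.path.join(root, "old_driver.zip")
        with open(old_path, "w") as handle:
            handle.write("dusty")
        # Pretend the file was last touched six years ago.
        six_years_ago = time.time() - 6 * SECONDS_PER_YEAR
        os.utime(old_path, (six_years_ago, six_years_ago))
        for path, size in stale_files(root, years=5):
            print(size, os.path.basename(path))
```

    Reviewing the list by hand before deleting anything is the safer habit.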

  10. Re:RAID array on a spare box by swalve · · Score: 4, Interesting

    That's not a bad idea. I started with the OP's problem, trying to keep data from multiple machines in sync and backed up and with enough room to spare. After having spent too many weekends copying data back and forth to clear out a drive in order to replace it, I decided to go to the fileserver paradigm. I built a machine with three 40 GB drives RAIDed together and made that the only place useful data would be stored. I've since expanded it up to 3 TB in various increments, and it has worked well. It has saved tons of time and money by allowing my computers to use whatever cheap hard drive was available and just restore from backup when it went TU. But with the need for increased data availability outside my house (i.e., making my notebook my main computer), I'm starting to reverse course and move to your idea. Using robocopy on the clients and shell scripts + hard links on the server, I've set up a workable versioning backup system that doesn't take up too much space.

    I also use Dropbox for some stuff.
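    The commenter's shell scripts are not shown; the sketch below only illustrates the general hard-link idea in Python, with hypothetical function names. Each snapshot is a full directory tree, but a file that is unchanged since the previous snapshot is hard-linked rather than copied, so it occupies disk space once. It assumes a filesystem that supports hard links and compares files by size and whole-second modification time.

```python
import os
import shutil
import tempfile
from typing import Optional


def unchanged(source_file: str, previous_file: str) -> bool:
    """True if the previous snapshot already holds an identical-looking copy."""
    if not os.path.isfile(previous_file):
        return False
    a, b = os.stat(source_file), os.stat(previous_file)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def make_snapshot(source: str, snapshot: str, previous: Optional[str] = None) -> None:
    """Copy `source` into `snapshot`, hard-linking files unchanged since `previous`."""
    for folder, _dirs, files in os.walk(source):
        relative = os.path.relpath(folder, source)
        target_dir = os.path.normpath(os.path.join(snapshot, relative))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            source_file = os.path.join(folder, name)
            target_file = os.path.join(target_dir, name)
            previous_file = os.path.join(previous, relative, name) if previous else None
            if previous_file and unchanged(source_file, previous_file):
                os.link(previous_file, target_file)      # no extra space used
            else:
                shutil.copy2(source_file, target_file)   # copy2 keeps the mtime


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work:
        src = os.path.join(work, "src")
        os.makedirs(src)
        with open(os.path.join(src, "report.txt"), "w") as handle:
            handle.write("unchanged between runs")
        make_snapshot(src, os.path.join(work, "monday"))
        make_snapshot(src, os.path.join(work, "tuesday"),
                      previous=os.path.join(work, "monday"))
        print(os.path.samefile(os.path.join(work, "monday", "report.txt"),
                               os.path.join(work, "tuesday", "report.txt")))
```

    Deleting an old snapshot directory frees only the files that no newer snapshot still links to.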

  11. Re:Solution.. buy hard drives! by TheRaven64 · · Score: 4, Informative

    If you're using ZFS, then the best solution is to use RAID-Z for online storage and then have two external disks which you use zfs send / zfs receive to update. This means that catastrophic failure (e.g. a power supply problem blowing all of the drives in the machine) will still leave you able to recover. Ideally, you should store one drive at home and one elsewhere, so that if someone steals your computer then they don't get the data.
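    A hedged sketch of the command sequence this implies. It only builds the zfs command lines and does not run them; the dataset names (tank/data, backup/data) and snapshot labels are hypothetical placeholders. The first run sends a full snapshot, and later runs use "zfs send -i" to send only the difference from the previous snapshot.

```python
from typing import List, Optional


def snapshot_command(dataset: str, label: str) -> List[str]:
    """Command that creates the snapshot dataset@label."""
    return ["zfs", "snapshot", f"{dataset}@{label}"]


def send_command(dataset: str, label: str,
                 previous_label: Optional[str] = None) -> List[str]:
    """Full send when there is no previous snapshot, incremental otherwise."""
    if previous_label is None:
        return ["zfs", "send", f"{dataset}@{label}"]
    return ["zfs", "send", "-i", f"{dataset}@{previous_label}", f"{dataset}@{label}"]


def receive_command(target_dataset: str) -> List[str]:
    """Command that applies a send stream to the dataset on the external disk."""
    return ["zfs", "receive", target_dataset]


def pipeline(dataset: str, target_dataset: str, label: str,
             previous_label: Optional[str] = None) -> str:
    """Shell pipeline joining send and receive, for display or a script."""
    send = " ".join(send_command(dataset, label, previous_label))
    receive = " ".join(receive_command(target_dataset))
    return f"{send} | {receive}"


if __name__ == "__main__":
    print(" ".join(snapshot_command("tank/data", "tuesday")))
    print(pipeline("tank/data", "backup/data", "tuesday", previous_label="monday"))
```

    Alternating the target between the two external disks means each disk needs its own record of the last snapshot it received.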

    --
    I am TheRaven on Soylent News
  12. Re:Solution.. buy hard drives! by AngryDeuce · · Score: 4, Interesting

    Honestly, it's my Western Digital drives that have lasted the longest. My dad is still rocking several single digit GB capacity WD drives actively in his legacy tower, and I've yet to have one die on me. Not to say I haven't replaced them as their capacity becomes outdated, but I've had much better luck with them than Maxtor (the worst brand I've ever used), which is now a part of Seagate, which I've also had a couple fail on me (but nowhere near as bad as Maxtor).

    I've never used Hitachi or Samsung or any other brand that I know of, so I can't speak as to their quality, but I'm sticking with Western Digital.

  13. Use a NAS with backup by AliasMarlowe · · Score: 5, Interesting

    What I did some years ago was recognize that "manual backups" were not done often enough, and important stuff was scattered around a few PCs. So I got a NAS, stuck a pair of disks into it (RAID 0 for speed), and set up its automated incremental backup to run 3 times per week to an external USB drive. The PCs now mount the NAS at login, and that's where all data files are stored by default (even the kids use it).

    We're up to 2 NAS units now, with 7TB[*] of disk space between them, all backed up on schedule. The USB backup drives are rotated every few weeks with another set kept in a secure place in the garage.

    [*] One NAS unit doubles up as media server, so it's got a load of movies & music in addition to user files in its 6TB. The other one is our web server and email server with only 1TB of disk space.

    --
    Those who can make you believe absurdities can make you commit atrocities. - Voltaire