Slashdot Mirror


Ask Slashdot: How Do You Manage Your Personal Data?

New submitter multimediavt writes "Ok, here's my problem. I have a lot of personal data! (And, no, it's not pr0n, warez, or anything the MPAA or RIAA would be concerned about.) I am realizing that I need to keep at least one spare drive the same size as my largest drive around in case of failure, or the need to reformat a drive due to corrupt file system issues. In my particular case I have a few external drives ranging in size from 200 GB to 2 TB (none with any more than 15 available), and the 2 TB drive is giving me fits at the moment so I need to move the data off and reformat the drive to see if it's just a file system issue or a component issue. I don't have 1.6 TB of free space anywhere and came to the above realization that an empty spare drive the size of my largest drive was needed. If I had a RAID I would have the same needs should a drive fail for some reason and the file system needed rebuilding. I am hitting a wall, and I am guessing that I am not the only one reaching this conclusion. This is my personal data and it is starting to become unbelievably unruly to deal with as far as data integrity and security are concerned. This problem is only going to get worse, and I'm sorry 'The Cloud' is not an acceptable nor practical solution. Tape for an individual as a backup mechanism is economically not feasible. Blu-ray Disc only holds 50 GB at best case and takes forever to backup any large amount of data, along with a great deal of human intervention in the process. So, as an individual with a large data collection and not a large budget, what do you see as options for now (other than keeping a spare blank drive around), and what do you see down the road that might help us deal with issues like this?"

64 of 414 comments

  1. Keep a spare blank drive around by Anonymous Coward · · Score: 5, Insightful

    I think you already have the answer

    1. Re:Keep a spare blank drive around by AngryDeuce · · Score: 5, Informative

      Agreed. I've been gradually rotating larger backup drives in and smaller backup drives out over the last 10 years or so. Right now I have about 2 TB's of unique data in my archive which is kept on the host machine if it is regularly accessed or duplicated on another external hard drive. Everything (I care about) has two copies at all times. As my archive grows, I'm going to have to upgrade my archive device's capacity, but that's a given, no matter what you do, if you want it stored locally, you'll have to add capacity somewhere obviously. DVD-R's and BluRay discs aren't a viable option in my opinion, because I've got a ton of old self-burned discs that I recently had to toss because they were rendered useless from laser rot, even though they were in sealed containers in a cool, dry place.

      The cloud is, to me, not a backup solution. I see it as a way to globally access my data and I use it as such. No sensitive data of mine will go to the cloud because the likelihood of needing access to it without warning is completely nil, so in my case, it's limited to media that I want constant access to. Now, the cloud definitely has the potential to serve as a backup solution, don't get me wrong, but there's just too much uncertainty involved in the cloud these days, especially as concerns the government nuking sites from orbit without warning, whether justified or not.

      However, I agree with some others that are telling you to do some house-cleaning. I recently went through my backups and found 300 GB's worth of crap that I hadn't accessed or used dating back to the early 2000's that I was saving for some stupid reason. Disc Images for ancient games that don't even run well on modern systems (or require a lot of fucking hassle to get running well), music that I haven't listened to in half a decade, old-ass videos that I'd downloaded from the internet back before there was such a thing as youtube, etc. Not to say that everyone's data is as silly as mine was, but it just added up over the years...

    2. Re:Keep a spare blank drive around by CapOblivious2010 · · Score: 3, Informative

      This doesn't help much for those of us with crappy internet - I've only got about 300K (bits) upload speed, and at that speed backing up 1TB would take around a year.
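
      The upload estimate above is easy to sanity-check. The sketch below assumes a decimal terabyte (10**12 bytes), a steady 300 kbit/s uplink, and no protocol overhead; the function name is made up for illustration.

```python
def upload_days(size_bytes: float, link_bits_per_second: float) -> float:
    """Days needed to push size_bytes through a link running at a constant bit rate."""
    seconds = size_bytes * 8 / link_bits_per_second
    return seconds / 86_400  # seconds per day


# 1 TB over a 300 kbit/s uplink (assumed figures, no overhead or downtime)
days = upload_days(1e12, 300_000)
print(f"{days:.0f} days")
```

      Under those assumptions the raw transfer takes roughly ten months, so "around a year" is about right once overhead and interruptions are counted.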

      FWIW, my strategy is to keep truly important stuff on a raid enclosure (and backup to other disks periodically), and to just live with the fact that there's really nothing irreplaceable about the rest.

    3. Re:Keep a spare blank drive around by Jabroney · · Score: 5, Informative

      https://www.googleapis.com/urlshortener/v1/url?shortUrl=http://goo.gl/rIh07
      { "kind": "urlshortener#url", "id": "http://goo.gl/rIh07", "longUrl": "http://www.backblaze.com/partner/af3012", "status": "OK" }

      Trying to sell cloud solutions on Slashdot? You must be new here.

    4. Re:Keep a spare blank drive around by xaxa · · Score: 5, Informative

      Link contains a referral ID, so Shikaku is earning from this, but not willing to say so.

      Eventually, it ends up at http://www.backblaze.com/

    5. Re:Keep a spare blank drive around by JackDW · · Score: 5, Informative

      Right. Other than buying new disks, there is no good solution.

      The asker seems to be looking for some kind of "join all my small disks together" solution. And yes, he can do this with RAID-0 or LVM. But... don't do it! If even one of those disks fails, all the data is effectively gone. The solution is cheap to implement but totally worthless. Sorry, your 250GB SATA disk now belongs in a museum.

      RAID-5/6 is, IMO, also a bad idea; there are too many instances where the controller has failed or multiple disks have failed.

      The asker explicitly excludes cloud solutions. It's depressing that people have recommended various cloud solutions nonetheless. Apart from not being answers to the question, these solutions are totally awful for large quantities of data. Amazon S3 may be nearly free if you want to store a few gigabytes, but if you want to store a few terabytes you are going to pay through the nose, and all the other service providers are the same. 2TB would cost $234 per month just for storage, transfer cost not included. For the price of two weeks of S3 storage you can buy a 2TB external disk. For the price of upload, download and a month's storage, you can buy four or five such disks and have as much redundancy as any normal person could ever need.

      --
      You're an immobile computer, remember?
    6. Re:Keep a spare blank drive around by chrismcb · · Score: 2

      I use CrashPlan. They have a service where they'll send you a 1TB hard drive to seed your backup. Of course this is a problem if you have more than 1TB or you have a LOT of churn.
      Another option is a couple of external drives that you keep at the office, a friend's house, or the bank.

    7. Re:Keep a spare blank drive around by Electricity+Likes+Me · · Score: 3, Insightful

      FYI: storing hard disks in a fire-proof safe is not a good idea. Fire-proof safes are generally rated for their ability to protect paper documents from burning up - but paper is very robust, and stable to very high temperatures provided it isn't actually exposed to a source of ignition.

      This isn't really true of a hard disk - you can heat paper to 150 degrees C no problems, but as far as I know most hard disks when in storage may not actually survive prolonged exposure to those sorts of temperatures.

    8. Re:Keep a spare blank drive around by DarwinSurvivor · · Score: 4, Insightful

      1) Find a trusted person you see often or have easy access to (friend, neighbor, relative, coworker, etc).
      2) Each buy enough HDD's to duplicate your stuff
      3) On a regular basis trade drives, update backups, trade back
      4) If you are worried about security (either from them or someone breaking in), encrypt the drive(s) and keep one copy of the key with yourself and another in a safety deposit box (or another friend, etc).

  2. Solution.. buy hard drives! by FrozenFood · · Score: 3, Informative

    1. Buy hard drive from brand A
    2. Buy hard drive from brand B
    3. put in separate eSATA enclosures
    4. backup to both drives.

    1. Re:Solution.. buy hard drives! by Ichoran · · Score: 2

      Exactly. If you can't afford two new drives from different vendors large enough to hold your data, then you cannot afford to keep your data safe.

      Don't bother fiddling with RAIDs unless you have many terabytes of data. Single drives are a lot faster to get and use.

    2. Re:Solution.. buy hard drives! by houstonbofh · · Score: 2

      I only buy Western Digital, and avoid Seagate/Maxtor for the same reasons.

    3. Re:Solution.. buy hard drives! by TheRaven64 · · Score: 4, Informative

      If you're using ZFS, then the best solution is to use RAID-Z for online storage and then have two external disks which you use zfs send / zfs receive to update. This means that catastrophic failure (e.g. a power supply problem blowing all of the drives in the machine) will still leave you able to recover. Ideally, you should store one drive at home and one elsewhere, so that if someone steals your computer then they don't get the data.
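
      A rough sketch of one such backup round follows. It only builds the command strings and runs nothing; the dataset, pool, and snapshot names (tank/data, backup1/data, the dates) are hypothetical placeholders, and the helper function is not part of any ZFS tooling.

```python
import shlex


def zfs_backup_commands(dataset, target, new_snap, prev_snap=None):
    """Return the shell commands for one backup round (nothing is executed here)."""
    snapshot = shlex.join(["zfs", "snapshot", f"{dataset}@{new_snap}"])
    send = ["zfs", "send"]
    if prev_snap is not None:
        # -i sends only the changes between the previous and the new snapshot.
        send += ["-i", f"{dataset}@{prev_snap}"]
    send.append(f"{dataset}@{new_snap}")
    receive = shlex.join(["zfs", "receive", target])
    return [snapshot, f"{shlex.join(send)} | {receive}"]


# Hypothetical names: "tank/data" lives on the RAID-Z pool, "backup1" is one external disk.
for command in zfs_backup_commands("tank/data", "backup1/data", "2012-03-26", "2012-03-19"):
    print(command)
```

      Leaving out prev_snap yields a full send, which is what the first run against a new external disk would need. Alternating between the two external disks means tracking the last snapshot each one received.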

      --
      I am TheRaven on Soylent News
    4. Re:Solution.. buy hard drives! by AngryDeuce · · Score: 4, Interesting

      Honestly, it's my Western Digital drives that have lasted the longest. My dad is still rocking several single digit GB capacity WD drives actively in his legacy tower, and I've yet to have one die on me. Not to say I haven't replaced them as their capacity becomes outdated, but I've had much better luck with them than Maxtor (the worst brand I've ever used), which is now a part of Seagate, which I've also had a couple fail on me (but nowhere near as bad as Maxtor).

      I've never used Hitachi or Samsung or any other brand that I know of, so I can't speak as to their quality, but I'm sticking with Western Digital.

    5. Re:Solution.. buy hard drives! by issicus · · Score: 2

      But how long can you shelve a hard drive and have it still work? That is the question.

    6. Re:Solution.. buy hard drives! by Fjandr · · Score: 2

      For my own part, Maxtor, Quantum, and WD drives are the only consumer drives I've had fail on me. Have a dead 1TB caviar green drive on my desk right now. Then again, I haven't had to use any of the Seagate 7200.11s (or was it the 12s?) that had so many problems.

      At least from the perspective of SMART output, the most reliable drives per hour put on them that I've run are Samsung drives, but those have also been a minority of drives I've used.

      That said, I try not to put drives from one single manufacturing batch into the same array if I can help it. That's the one nearly-universal sentiment I've heard from others who deploy massive numbers of drives. In the end, company name doesn't really matter all that much. Failures come in clusters, and all companies suffer from them. If looked at neutrally, I'd guess that Seagate and WD probably run fairly similar failure rates. People tend to go with their experience though, and it's no surprise there are people like you or I who have had better luck with one specific company. The smaller the sample size, the more likely non-trends will appear to be something they're not.

    7. Re:Solution.. buy hard drives! by d3ac0n · · Score: 2

      Agree with this, mod up.

      Particularly for home users (which TFA seems to indicate is the case here) a simple mirrored RAID array will do the trick. I recommend the following setup:

      Buy 4 2TB drives.

      Put 2 drives in a Mirrored array using motherboard-based RAID.

      Put 1 drive in a USB 2.0, 3.0 or eSATA drive enclosure and back up RAID array to this drive.

      Keep 4th drive as a spare.

      Replace all 4 drives with larger drives as needed and available.

      Done. You will almost never lose data using this method. If you REALLY want to, you can purchase a Carbonite (or Mozy, or whatever) cloud storage backup service and back up your most critical and valuable data to the cloud, or you can simply archive to DVD or BD and store appropriately.

      The only concern I would have for this setup for TFA writer is that he appears to actually be OVER the 2TB level already, which means he's looking at some pretty pricey 3 or 4TB drives. Unfortunately, at this data capacity level the solutions can be technically simple but they are never CHEAP. No matter what, to store that kind of data, TFA writer will be spending close to a grand, regardless of what solution they pursue (Home built RAID with big drives, buying a large NAS, Etc.)

      --
      Official Heretic from the "Church of Global Warming". Proven right thanks to whistle blowers. AGW = Flat Earth Theory
  3. Enjoy your delusion by Trixter · · Score: 4, Informative

    "I'm sorry 'The Cloud' is not an acceptable nor practical solution." Not sure what brand tin-foil hat you're wearing, but there are cloud backup solutions that encrypt your data *before* it leaves the machine. I use CrashPlan (I can't speak for others) and I've verified the encryption myself by capturing the traffic leaving my machine, even when CrashPlan was backing up to other machines on my own private network. Even the data it writes to locally-attached hard drives is encrypted. So there's at least one company who gets it right.

    1. Re:Enjoy your delusion by Anonymous Coward · · Score: 5, Insightful

      It's great that you know how fast his connection is and exactly what data restrictions his ISP imposes. I'm actually rather impressed you can be 100% sure his computer is connected to the internet at all. All I know is that if I had that much data, the time it would take to upload would probably be longer than the time it takes for the HDD to wear down and implode.

    2. Re:Enjoy your delusion by burisch_research · · Score: 5, Informative

      You're assuming that it's encryption that's the problem. In my case, it's a problem with the size of data vs. how much bandwidth I can use. I get an allocation of 20GB a month, and even that's very expensive. Backing up my 5+ TB to the cloud is simply not an option.

      Cloud is very trendy right now, but that doesn't mean it's a one-size-fits-all.

      --
      char*f="char*f=%c%s%c;main(){printf(f,34,f,34);}";main(){printf(f,34,f,34);}
    3. Re:Enjoy your delusion by Anonymous Coward · · Score: 2, Informative

      You're assuming that it's encryption that's the problem. In my case, it's a problem with the size of data vs. how much bandwidth I can use. I get an allocation of 20GB a month, and even that's very expensive. Backing up my 5+ TB to the cloud is simply not an option.

      Cloud is very trendy right now, but that doesn't mean it's a one-size-fits-all.

      Crashplan has an option where they will send you a hard drive to seed your backup locally and mail it back. That way you only have to do incremental backups once you do the initial seed.

      If there's no offsite backup, the whole scheme is worthless. What happens if there is a fire?

    4. Re:Enjoy your delusion by Wrath0fb0b · · Score: 4, Informative

      You're assuming that it's encryption that's the problem. In my case, it's a problem with the size of data vs. how much bandwidth I can use. I get an allocation of 20GB a month, and even that's very expensive. Backing up my 5+ TB to the cloud is simply not an option.

      CrashPlan will let you Fedex them a hard drive to get the backup started. From then on, you only need to send deltas.

    5. Re:Enjoy your delusion by AmiMoJo · · Score: 5, Funny

      I'm actually rather impressed you can be 100% sure his computer is connected to the internet at all.

      Well he did post his question to an internet forum...

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    6. Re:Enjoy your delusion by Anonymous Coward · · Score: 4, Insightful

      In typical "I need IT advice, but I have preconceived notions about how things should work and am not willing to budge on that" fashion, the asker has discounted some reasonable options without specifying the reasons that won't work for him, and failed to provide some super useful info like how large his data actually is, how often it changes, how much existing data changes, how much new data there is, and how quickly it grows.

      So it could be that the reasons for his concern are unmerited, and GP merely points out that if his concern is privacy, there's ways to use the cloud safely. In typical Slashdot fashion, you rebuke the potential shortcomings of the advice without knowing whether those shortcomings actually apply to the asker.

      Backup should be provided in depth: several prongs provide the best redundancy and the fewest single points of failure. Cloud storage is an excellent option for one of the prongs given certain factors. If most of the data rarely changes (pretty typical for very large data sets), incremental bandwidth usage past the initial storage is usually not much more than the data growth rate. As observed, it can be done in a way that respects privacy and safety.

      Cloud storage has two main advantages over local backup solutions. You won't run out of disk space, and it's off-site (so a house fire won't take out your data set). Any on-site solution automatically fails that level of redundancy. Storage on S3 is ridiculously inexpensive these days.

      I have about 6 TB of data that I need to keep backed up. I have about 12 years of digital photography and video originals, including stuff like wedding and honeymoon photos, as well as the birth and first years of my children's lives. When people suffer house fires, one of the most common and greatest laments are the things that can't be replaced - usually photographs.

      My solution is four tiers. I have a local RAID0 in my Mac Pro. I have Time Machine backups of that (this is hands-down the best consumer on-site backup solution on the market). I rsync those files to a local RAID10 NAS device (Synology are a bit pricy, but they are completely worth it, really excellent built-in software with a lot of features you might find surprisingly useful, and you can purchase expansion bays to extend capacity as you're running low). Then finally I back up to Amazon S3 in encrypted form with JungleDisk (I no longer recommend this software, I own a copy of it from before it was bought by RackSpace, the quality has gone down since RackSpace bought it and "improved" it, plus I gather you now have to pay a monthly subscription, AND pay for your own storage - crap).

      The only way my data is in jeopardy is if my house burns down (takes out 3 local redundancy & backup solutions) on the same day that Amazon has critical failure. And it's all 100% automated, Raid0 happens at time of write, TimeMachine alerts me if there's problems creating a backup and gives me local history, my NAS warns me by email & SMS if it so much as writes too slowly (my rsync cron script emails me if it can't reach the NAS for some reason), and JungleDisk does a nightly sync with S3, and sends me weekly reports so I can be sure that it's doing its job. I have quick local access, and slow offsite access if everything else fails (I'd probably go bum my work's huge pipe to do the initial restore if I had to rely on that).

    7. Re:Enjoy your delusion by Anne+Thwacks · · Score: 3, Funny

      Have you actually tried to backup 1TB using IP over avian carrier?

      --
      Sent from my ASR33 using ASCII
    8. Re:Enjoy your delusion by Anonymous Coward · · Score: 3, Insightful

      My solution is four tiers. I have a local RAID0 in my Mac Pro. ...

      You do understand what RAID0 is, right? RAID0 is strictly for performance and offers zero data redundancy or failure protection. In fact, since you need both disks to function to read your data - you're essentially halving the MTBF of using one disk. Perhaps you meant RAID1? (a mirrored set)

    9. Re:Enjoy your delusion by BlackPignouf · · Score: 4, Insightful

      I have Time Machine backups of that (this is hands-down the best consumer on-site backup solution on the market).

      Did you actually use it for recovery?

      Both my rotated TimeMachines were corrupt. They never complained during backups, but failed miserably while trying to recover my Pictures HDD.
      Only some of the backup files were corrupt, but when you try to recover a complete disk with TM, it's all or nothing, and the process stops after the first error, leaving you in the dust.
      I had to write a parsing script with ruby, "cp -avX", ditto and chmod in order to get my system back.
      It wasn't so hard, but it sure was stressful with one disk down, two corrupt disks and no other backup to get my pictures back.

      BTW, TimeMachine doesn't backup every file in your system, and is too stupid to realize that it should not begin from scratch after recovery : it needs twice the storage after that, because it thinks every file is new.

      My drives weren't big enough, so I had to wipe the backups and lose the local history.

      Fuck it. I began using Carbon Copy Cloner since then, and never looked back.
      It's free as in donationware, it works, it gives you a bootable backup that you can actually test and rotate properly, it can easily be automated, it archives the files that you've deleted between backups, and uses much less space than TimeMachine.
      I hear SuperDuper is just as good.
      TimeMachine is some crappy software with nice looking interface that gives you a false sense of security.

    10. Re:Enjoy your delusion by cheetah · · Score: 3, Informative

      S3 storage for 5TB isn't what I would call cheap. We are talking about $580/month (or almost $7k per year). For that amount of money, you could buy a new set of 5TB worth of hard drives each month and then ship them to a remote location and pocket about $200 a month in savings.

      Not a perfect solution (no online access) but I think it underscores just how costly S3 still is for large amounts of data. If you are talking about a few hundred GB of data, S3 storage is cheaper (and better) than anything you could reasonably do yourself. But once you scale up the usage... Heck, you could buy and colo a remote server and ship drives back and forth for less than what S3 would cost...
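
      The figures in this comment can be checked with a few lines. The sketch takes the quoted $580/month for 5TB at face value and treats a terabyte as 1000 GB; it does not try to reproduce Amazon's actual pricing tiers.

```python
def storage_summary(monthly_usd: float, data_tb: float):
    """Return (annual cost, implied price per GB per month) for a quoted monthly bill."""
    annual_usd = monthly_usd * 12
    per_gb_month = monthly_usd / (data_tb * 1000)  # decimal terabytes assumed
    return annual_usd, per_gb_month


annual, per_gb = storage_summary(580.0, 5)
print(f"annual: ${annual:,.0f}, implied rate: ${per_gb:.3f} per GB-month")
```

      That gives $6,960 a year, matching "almost $7k". The same function applied to the 2TB at $234 per month quoted earlier in the thread gives about the same per-GB rate.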

    11. Re:Enjoy your delusion by way2trivial · · Score: 2

      http://www.wired.com/gadgetlab/2010/03/32gb-microsd-card/

      weight of a micro sd card :~ .5 grams

      capacity of a micro sd card:~ 64 gb
      http://www.amazon.com/SanDisk-Mobile-MicroSDXC-Memory-Adapter/dp/B005V7WIA2/ref=sr_1_1?ie=UTF8&qid=1332714352&sr=8-1

      MATH:~ 1000/64 = 15.6 or 16 cards
      https://www.google.com/search?q=1000%2F64&btnG=Search

      16 cards weigh 8 grams

      Capacity of a pigeon:~ 38 grams
      http://interbug.com/pigeon/technology/homing_pigeon_with_gps.pdf
      "Thirty-eight grams total is still a lot for a pigeon to carry, representing about ten percent of its body weight."

      --
      every day http://en.wikipedia.org/wiki/Special:Random
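
      For anyone who wants to rerun the pigeon arithmetic above, a small sketch using only the numbers quoted in the comment (0.5 g and 64 GB per card, 38 g payload); the per-pigeon maximum is an extrapolation the comment does not make.

```python
import math

CARD_GB = 64         # capacity per microSD card, as quoted
CARD_GRAMS = 0.5     # weight per card, as quoted
PAYLOAD_GRAMS = 38   # what the cited paper says a pigeon carried

cards_for_1tb = math.ceil(1000 / CARD_GB)           # cards needed for 1 TB
weight_for_1tb = cards_for_1tb * CARD_GRAMS         # grams for those cards
max_cards = int(PAYLOAD_GRAMS // CARD_GRAMS)        # cards that fit in the payload
max_gb_per_flight = max_cards * CARD_GB             # GB per pigeon, ignoring packaging

print(cards_for_1tb, weight_for_1tb, max_cards, max_gb_per_flight)
```

      So 1 TB fits well within one pigeon's payload, and a full 38 g load would carry a bit under 5 TB, ignoring packaging weight.
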
    12. Re:Enjoy your delusion by Centurix · · Score: 2

      Put them all in a coconut and you could transport them by swallow.

      --
      Task Mangler
    13. Re:Enjoy your delusion by turbidostato · · Score: 2

      "Until that company goes out of business"

      So what? It is not live data but one for backup purposes. If the company goes out of business you look for another one, no data is lost in the meantime.

  4. Bare Drives and a USB Drive Dock? by wanderfowl · · Score: 5, Informative

    One way to save a bit of cash is to buy a USB/eSATA drive dock (single or double) with some bare SATA drives. This cuts the enclosure out, and allows you to buy bare drives, which are often cheaper than enclosed drives.

    You could also consider Drobo or one of the Wiebetech multi-drive RAID containers. But encryption + cloud isn't all bad.

  5. Budget by macemoneta · · Score: 4, Informative

    "large data collection and not a large budget"

    This is your problem right there. You can't enter into a situation like this without planning a budget for the inevitable failures. I suggest purchasing a new larger drive (3TB are common now) and migrating the data from the problematic drive. Then migrate the data from several older smaller drives. This will reduce the component count (points of failure), save you power (cost in the long run) and keep you ahead of failures. You should plan on doing this periodically to maintain the integrity of the data.

    --
    Can You Say Linux? I Knew That You Could.

    1. Re:Budget by nine-times · · Score: 2

      Yeah, in my view, there's a simple time-tested method for managing your data: Get a centralized storage medium big enough to hold it all, move everything to it, and then back that up. Your backup medium should be at least twice as big as your storage medium-- for example, if you have a 2TB drive that you're storing everything on, you should have a backup device that can hold at least 4TB.

      But it seems like he's saying, "I want to store everything and back it up, but I'm unwilling to pay for the storage media. Does anyone have a solution where I get things for free?"

  6. Buddy NAS by Anonymous Coward · · Score: 5, Interesting

    I have a solution I call the "Buddy NAS". Go out and get two cheap computers. It could be a PC or a mini-NAS or a low-end server. Anything that will hold multiple hard drives. You jam both full of hard disks and use them as a backup/NAS server. One PC is kept at your place, the other at your friend's house.

    Both computers have an account for you and an account for your friend (it helps if your friend is nerdy and "gets" backup solutions). Both of you now have a backup solution in your own home and a remote backup server at a friend's place. Two copies of your data, one remote. Basically it's like having local and cloud storage for you and your friend and it'll cost less than a grand if you shop around. If neither of you have static IPs you can use dyndns.org to connect to the remote boxes. Bandwidth shouldn't be an issue if you use rsync to backup changed files nightly.

  7. Here's a couple of solutions by jchawk · · Score: 2

    I can offer a couple of suggestions... What I did was buy a used Dell PowerEdge 2950 on eBay for about 500 bucks shipped, and I added 4 x 1TB SATA drives to it and I run a RAID 5 setup with 3TB of usable space across the four 1TB drives. This solution cost me less than $1000 and I have a nice playground to experiment with VMware ESXi.
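
    The 3TB figure follows from how RAID 5 spends one disk's worth of space on parity. A minimal sketch of the usual capacity rules for equal-sized disks, with a made-up function name and ignoring filesystem overhead:

```python
def usable_tb(level: str, n_disks: int, disk_tb: float) -> float:
    """Usable capacity of an array of equal-sized disks under common RAID levels."""
    if level == "raid0":                      # striping, no redundancy
        return n_disks * disk_tb
    if level == "raid1" and n_disks >= 2:     # every disk holds the same data
        return disk_tb
    if level == "raid5" and n_disks >= 3:     # one disk's worth of parity
        return (n_disks - 1) * disk_tb
    if level == "raid6" and n_disks >= 4:     # two disks' worth of parity
        return (n_disks - 2) * disk_tb
    if level == "raid10" and n_disks >= 4 and n_disks % 2 == 0:  # striped mirror pairs
        return n_disks / 2 * disk_tb
    raise ValueError(f"unsupported combination: {level} with {n_disks} disks")


print(usable_tb("raid5", 4, 1.0))   # the setup described above
print(usable_tb("raid1", 2, 2.0))   # a pair of mirrored 2TB drives
```

    Four 1TB drives in RAID 5 give 3TB, while a mirrored pair of 2TB drives gives 2TB, so mirroring costs more raw disk per usable terabyte.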

    I know that's not exactly budget-conscious but it works great for me.

    If I were on a tight budget I would just buy a 2tb USB drive from Newegg or somewhere similar. It looks like you can buy a name brand for about $130 bucks.

    If you have a little bit more money to spend you could always buy a couple of 2tb internal SATA drives and run RAID-1 mirroring on them. You could put these into an old computer and make a little NAS linux server...

    If you're saying you have no money to spend then maybe you need to consider cleaning up your data. Oftentimes all those "personal files" that you think you need to keep really aren't required. Just my 2 cents but this problem is very solvable.

  8. I delete stuff by Amiga500_Rulez · · Score: 5, Insightful

    Seriously. How much crap do you really need to keep around?

    1. Re:I delete stuff by SuricouRaven · · Score: 2

      True... but in the experience of my own family, none of it ever gets looked at. It just becomes data you are obliged to keep safe forever even though no-one really cares to access it.

  9. Magic by lucm · · Score: 5, Insightful

    So your disks are full and possibly broken. You don't want to have more disks, you don't want tape or optical medias, and a storage provider (aka The Cloud) is not an option... Then you have three solutions "down the road":

    1) Delete stuff
    2) Invent a new compression algorithm that will allow you to reuse the same disks forever without losing data
    3) Rely on magic*

    *might overlap with solution #2

    --
    lucm, indeed.
  10. Redundancy and Archiving. by Leareth · · Score: 2

    As several people have said, you already answered the question yourself: spare HDD + Blu-ray.

    You can achieve what you want by also changing the way you think about your data.

    How much of your personal data is live? As in, how much of it do you access constantly, and need immediate access to?

    Here's what I do: I have a discrete HDD set up for each data type (not needed... but I had spare ~500GB drives, so it's how I did it). These are broken down into Music, Projects, Video, and Photos. Each of them is synced monthly to a 2TB external drive that is spun up only to do a differential backup.

    Data that I haven't accessed in 6 months (mostly photos and old closed projects) is moved to archival-grade DVD and removed from the archival HDD.

    So irreplaceable things (3 decades of photos, years of work) are stored and can be accessed within a few moments, less important but commonly accessed stuff (music and instructional videos, or documents I use every day) are live and backed up on the Archive.

    --
    *A)bort, R)etry, I)nfluence with large hammer.*
  11. Drobo + BackBlaze = Win! by mveloso · · Score: 3

    Drobo -> mostly reliable local backup
    BackBlaze -> mostly reliable offsite backup

    You might want to substitute a ZFS-based FreeNAS for the Drobo, if you're so inclined. It's less automatic, but seems just as reliable.

  12. Is your time more valuable than a new disc? by petes_PoV · · Score: 3, Insightful

    Deleting stuff is all very well. But unless you just do an "rm -rf *" and be done with it, you need to invest some time in deciding what to remove, what to keep and whether that directory called family-photos really does contain what you expect it to. Even at minimum wage rates, the time spent trawling through a couple of TB of "stuff" could easily exceed the cost of a new disc - and then a background copy / backup onto it.

    Obviously you still have an issue of tracking things down on the rare occasions when you actually need some of your family photos. But you can rest assured that they're in there somewhere and weren't purged last time you needed a few GB for more webserver logs.

    Maybe the first step is to de-dup the existing data. You'll still have some manual intervention to check possible duplicates, but it's a first step towards tackling the bigger problem.
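
    A minimal sketch of what such a de-dup pass could look like, assuming "duplicate" means byte-identical content. It groups files by size first so only same-sized files get hashed, and it only reports candidates and deletes nothing; the function names are made up for illustration.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are byte-identical."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            by_hash[(size, sha256_of(path))].append(path)

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

    Calling find_duplicates on a mount point returns groups to review by hand, which is the manual check mentioned above. Near-duplicates such as resized or re-encoded photos will not be caught by this approach.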

    --
    politicians are like babies' nappies: they should both be changed regularly and for the same reasons
  13. Have Less Data or Build a Server by MarcQuadra · · Score: 2, Insightful

    I think it's time to admit that you're a hoarder. What exactly -is- your personal data that's so precious? I run a server just to keep my skill set up and run my side business, but I've only managed to accumulate around 600GB of data, only about 35GB of it is 'mine', the rest is client backups.

    So first admit that you're a hoarder, then decide if you want to address that issue or indulge it. If you choose to indulge it, you're going to want to build a small home server. Something with a low-end 64-bit CPU (i3?), a gigabit LAN port, and lots of SATA ports and 3.5" drive bays. Buy a bunch of high-quality (WD RE4?) matching drives that fit your data needs times two (you're RAIDing space away). Once you have that, install Linux on it, build a software RAID-1 or 0+1 array (don't do RAID-5 unless you can handle days of rebuild time), and format it with something accessible (read: in the kernel, like EXT4). Create a share on the array with Samba and happily access it from all your machines (don't bother with Netatalk or NFS; CIFS is great on all platforms). As your data needs grow, you can add drives in pairs or replace drives with larger ones and grow the volume. If you need backup, you'll want another array, preferably on another low-end box (an enclosure on your desktop?) but it can be built on a RAID-0 or JBOD to save money.

    --
    "Sometimes, I think Trent just needs a cup of hot chocolate and a blankie." -Tori Amos on Nine Inch Nails
  14. It's that time again, is it? by QuasiSteve · · Score: 4, Insightful

    It's that time again, is it?
    http://ask.slashdot.org/comments.pl?sid=2452630&cid=37557630

    Either..
    A: Buy that HDD. Yes, they're a bit more expensive right now ..or..
    B: Wait a few months, prices will come down again, buy that HDD then. Yes, you may lose your data in the meantime.

    Now stop asking or I'm going to pull over.

    1. Re:It's that time again, is it? by QuasiSteve · · Score: 2

      Don't worry - happens to the best of us; I just think that /. should do better in these stories in terms of filtering them (off the front page or entirely). But that's a different discussion :)

      Your question seemed to set limits which are unrealistic, that's why the conclusion is really 'HDD' even though you specifically set that as something that wouldn't be an option.

      I can tell you how I do it, it's just not cheap. Also, it's not really 'managing' my data.. it's a storage/backup solution. The difference is that if I 'managed' my data, I wouldn't have tens of thousands of digital camera photos in a bunch of folders with meaningless names, but just a few that are actually worth saving to me. It's not that I'm saving all the others for future generations either, I just don't have the energy to go through so many photos and delete all but the best (the very best I've already shared anyway).

      But if it's just storage/backup...
      1. Every write made to the main HDD is mirrored via a mirroring RAID setup. Pure mirroring, I don't want to deal with RAID levels that use parity/etc. that may save some space but are a PITA to rebuild (and must be rebuilt - a simple mirrored HDD mounts just fine when taken out of the RAID).

      2. Files are written to a versioning filesystem, so that if I delete something that I later regret, I can get an older version back (presuming things didn't run out of space and it had to be overwritten with new data).

      3. Files saved to a specific area are further synced with a cloud storage solution. These are basically files that I need to be able to access from any location at any time (short of the cloud hoster folding/etc.) asap in case of an emergency. There's very few files that qualify, so bandwidth and monthly caps aren't an issue. I did upload about half a GiB worth initially, though.

      4. Every night the computer does a differential backup to an external, also mirrored, HDD, over the network. This is a set that is in a completely different area of the house, so if I manage to trip and splash water all over everything here, the others are fine.

      5. Every 2 weeks (used to be weekly) I bring one of the HDDs in the mirroring set in the other room to an off-site location (basically a storage locker). From that off-site location I bring back another HDD and put that into the RAID, and force an update of that HDD from the other one.

      So -if- one of my main HDDs dies, there's always the other one. If they both manage to die at the same time, I've still got a daily backup in another room. If that dies, that has another one. If those both die, I still have a 2-weekly backup in an offsite location. If that one's dead as well (what are the odds??), then all my most important stuff is also in the cloud. If that cloud storage solution goes belly-up at the same time and data can't be retrieved? Well, I'm screwed. But life does go on - people whose houses burn down often don't have such a rigorous backup method in place, and they pick up again as well.

      That said...
      6. Of very important photos, I've got prints (a Kodak booth does better than your home inkjet) or even negatives (the better photography stores can point you in the right direction for that). Of very important documents, I've got print-outs (laserjet). Of very important video? Nothing. Of very important (music) recordings? Also nothing. I have no such 'very important' of the latter two - but I think you get the idea: I would have gotten those transferred to film and/or tape. The reason is that those can easily be seen by human eyes or played back for human interpretation - digital data not so much.

  15. Hoarders: Digital Edition by joocemann · · Score: 4, Funny

    You might be on the next spinoff of Hoarders programs, a digital hoarders show.

    In this show, redundancy, old versions, and files that haven't been opened in 5+ years are brought into question, and you will be embarrassed to defend them... You will attempt to justify why you still have linksys drivers for a wrt54g you don't even have anymore. And no, the DVD ISO of the Alvin and the Chipmunks movie, that you never burned or watched, is not worth saving.... Neither are about 85% of the digital pictures you took (you know, the ones that were the 'bad shot' that you took before finally getting the good one).

    Take a day or two, go through it chunk by chunk, and purge! PURGE!

    1. Re:Hoarders: Digital Edition by AmiMoJo · · Score: 3, Informative

      You nearly said it, but I'll just come out and ask: what is this vast quantity of personal data? If it isn't downloaded movies or rips of DVDs the OP owns... Maybe he's a film maker or prolific but unpublished musician.

      We need more info because the only option, other than deleting stuff and throwing money at the problem, is some clever solution based on the specific data in question.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
  16. Re:RAID array on a spare box by swalve · · Score: 4, Interesting

    That's not a bad idea. I started with the OP's problem, trying to keep data from multiple machines in sync and backed up and with enough room to spare. After having spent too many weekends copying data back and forth to clear out a drive in order to replace it, I decided to go to the fileserver paradigm. I built a machine with three 40 GB drives RAIDed together and made that the only place useful data would be stored. I've since expanded it up to 3 TB in various increments, and it has worked well. It has saved tons of time and money by allowing my computers to use whatever cheap hard drive was available and just restore from backup when it went TU. But with the need for increased data availability outside my house (i.e., making my notebook my main computer), I'm starting to reverse course and move to your idea. Using robocopy on the clients and shell scripts + hard links on the server, I've set up a workable versioning backup system that doesn't take up too much space.

    I also use Dropbox for some stuff.

  17. Define what your Backup Must Do First & Why by BoRegardless · · Score: 2

    People have different needs. Some needs are imposed by either employers or the wonderful US Govt. for mandatory data retention. Others are your life's design work that you want to retain until you die. Other data you want to pass to your kids. If you can't afford to lose it keep multiple backups on multiple media in multiple locations. Books & pamphlets have been written on this. Transfer the data to new media once a year or two or three & keep all working drives.

    No single storage device local or remote is immune from disaster. The Alexandria Library succumbed and took with it countless early human treasures. Wars have done in archives all over the world. Lightning, outages and power surges can defeat the best protections even when electronic equipment is turned off, but still plugged in (laptops are better when left unplugged, which is actually a great asset).

    Backup is one thing; recovery is another and it can be GUT WRENCHING. The recovery process needs as much thought as backup.

    A Clue or Two: A business partner had his MBPro backed up to 2 external HDs. Not great, but OK. Said MBPro crashed on the Lion upgrade. No way to know whether it was hardware or software and the MBPro should have at that point been off limits for use until carefully checked out. He happens to live in an area subject to lightning and outages which can affect anyone (even with a UPS). However, he reinstalled the Snow Leopard and plugged the first BU HD in an attempt to reload the data; HD became corrupted. Should have stopped, but then the 2nd HD was corrupted. Moral of the story; Recover data from a backup to an external HD running on another computer than the one that got mucked up.

    The cost of 3-4 external 2-3 Terabyte hard drives and a couple cases or RAID box is dirt cheap compared to the value of the hours you put in on your computer each year, as are Blu-ray drives & discs.

    Caution: Someone on this list mentioned putting drives and disks in a "fireproof safe" or "fireproof file cabinet"; wrong! The UL approved boxes are designed only to protect "paper" for a given amount of time in a typical fire by releasing steam (212 deg. F = goodbye DVD/BR disks). Once the fireproof agent uses up its water...Fahrenheit 451 takes care of all contents...permanently. This is why multiple locations are needed.

  18. Manual on External HDs and good organization by seriesrover · · Score: 2

    My personal and family data (not including ripped DVDs etc) are about 1 TB. Mostly photographs and video with my DSLR so the files tend to get large...but I also have a ton of documents, app installs, and all sorts of misc data. I must admit I'd be curious as to what fills multiple external HDs for personal data but to each their own.

    Good organization outweighs medium in my case. 2xExternal 2 TB HDs - primary and secondary...and then a third stored off site at my parents' that I update about 3 times a year, so if the worst happens I'm 6 months out of date, but it's usually about 4. And that's if both my primary and secondary go down. That's a cost of about $300 total and a little bit of effort.

    "A little bit of effort" is defined by how you organize. Backing up manually means I don't rely on software or a service, but it requires some forethought. For me I break it up by data type and usually year...sometimes I go one more by how that data was acquired (photos I add who took the picture). This is important because I put anything new into a diff folder so I know what's new and what's not. It took me a couple of years to get to the structure I have but I sometimes add small tweaks. The effort or time now is fairly minuscule.

    What I'm trying to get at is this : if you're prepared to put a small amount of time in every now and again, with an initial overhead, you can do this very easily and cheaply.
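
    To make the type-then-year layout concrete, here is a small sketch. It only computes where a file would go and does not move anything. The KINDS table and the destination function are illustrative assumptions, not anyone's actual scheme, and it uses the file's modification time as a stand-in for "year".

```python
import os
import time

# Hypothetical mapping from file extension to top-level folder name.
KINDS = {
    ".jpg": "photos", ".jpeg": "photos", ".png": "photos",
    ".mov": "video", ".mp4": "video",
    ".pdf": "documents", ".doc": "documents",
}


def destination(path, archive_root, mtime=None):
    """Return archive_root/<kind>/<year>/<filename> for the given file."""
    ext = os.path.splitext(path)[1].lower()
    kind = KINDS.get(ext, "misc")          # unknown types land in "misc"
    if mtime is None:
        mtime = os.path.getmtime(path)     # fall back to the file's own mtime
    year = time.localtime(mtime).tm_year
    return os.path.join(archive_root, kind, str(year), os.path.basename(path))
```

    A further level (who took the picture, which camera) would just be one more path component, picked by hand or from metadata.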

  19. Well let's see by jon3k · · Score: 2

    You ruled out: hard drives, tape, discs and cloud storage. What exactly do you expect us to say here? There isn't some other magical form of storing data we've been hiding from you.

  20. Data Hoarding and my solution by IndustrialComplex · · Score: 3, Interesting

    First, let's look at your problem: You are gathering too much data. Either the data is 100% needed and irreplaceable, or it isn't. If it isn't, your first step is to treat your data just like you would physical junk that accumulates in your house.

    Create Three folders.
    1. Critical Keep
    2. Unsure
    3. Toss

    Go through your data and MOVE it to one of those three folders. If it is critically important data that you would be upset to lose and that can't be recreated (wedding videos, etc.), it goes in the Critical Keep folder. If you aren't sure about it right now, but you can't declare it for folder 1, put it in 2. Anything else (old install files, backup data from a Windows 98 machine, etc.) can be deleted. Be harsh with yourself. Think of it like moving from house to house: if you haven't opened that box by your third move, just toss it in folder 3.

    Repeat the process until you either have everything in your Critical Keep folder, or your delete folder.
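
    If the pile is too big to triage by eye, a script can at least surface candidates for the Unsure and Toss folders. This is only a sketch under assumptions: stale_files is a made-up name, "stale" here means "not modified in five years" (modification time is more reliable than access time on most setups), and nothing gets moved or deleted.

```python
import os
import time


def stale_files(root, max_age_days=5 * 365, now=None):
    """Return (size_bytes, path) pairs for old files, largest first."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400    # 86400 seconds per day
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue                   # unreadable or vanished; skip it
            if st.st_mtime < cutoff:
                hits.append((st.st_size, path))
    return sorted(hits, reverse=True)
```

    Sorting by size puts the biggest wins at the top, so an hour of decisions frees the most space.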

    Now, hopefully you have reduced the size of the data you are using to something marginally manageable. I'm a data hoarder, and I've managed to keep the rate of growth of my data to lag behind the general rate of growth of HDD capacity. Now for the fun stuff:

    Two things you want to avoid.

    1. Loss due to a dying disk
    2. Loss due to a destroyed home (fire, theft, etc)

    Here was my budget solution that resulted in a fire and forget backup system that is suitable for a home user and is about as minimal as you can get for cost.

    3 Disk Drives.

    A primary drive to run the operating system and hold installed programs and two LARGE data drives in a RAID1 configuration.

    Static data files (Video, pictures, etc) get stored on the RAID1

    A scheduled process (once per month for me) backs up the OS drive to a virtual HD file on the RAID1. The files on the RAID1 are then backed up to a cloud storage service (Carbonite in my case).

    So, what is the result of this?

    My operating environment is backed up monthly. The only thing I lose here is configuration changes or programs installed since the last backup (less than 30 days for me)

    The RAID1 ensures that my personal/static data is protected from a single disk failure, and helps a bit with read performance for the static (and large) files.

    Should a cataclysmic failure occur and my entire computer is lost to something like a fire, remember that I've been sending what is on the RAID1 out to the cloud (Carbonite), so when I can rebuild a computer I can just download the (very large) offsite backup from the cloud to my new machine.

    The downsides I have right now:
    1. I maintain the Windows backup as a VHD file because it allows me to ensure that the backup data is 'packaged'. I don't know the exact details about Windows backup, but given that Carbonite sometimes excludes system files I didn't want to risk an important hidden/system file being missed in the backup. In addition I didn't like how it could only back up to the root folder of a drive. The downside is that the resulting 100GB file is a pain to back up, which is why I restrict the backup interval to 30 days (previously I had it back up every 3 days). This keeps it from continually uploading the VHD file to Carbonite.

    2. The HDDs for the raid1 lose half their total capacity in that configuration. I used it because it let me only have to use 2 drives and the performance boost. If you can afford 3 drives, go for a RAID5.

    3. Most Motherboards support RAID natively now. However, I understand that you can run into issues with hardware RAID if you have to switch to a different hardware solution. I haven't tested this, but it could potentially be an issue if you use a RAID5 from hardware and your motherboard fails and you can't replace it with an exact model. The good news here though, is if you have been backing up to the cloud, typically it's done on a per file basis, and thus you don't have to worry about this. Just download your stuff ba

    --
    Out of modpoints but really liked a post? 1BDkF6TtmmeZ3yqXbz9yhdYVqRYnwFoXDj
  21. My 2 TB errrr cents by denmarkw00t · · Score: 2

    Delete your porn

    The rest of your personal data will fit on a floppy.

  22. Use a NAS with backup by AliasMarlowe · · Score: 5, Interesting
    What I did some years ago was recognize that "manual backups" were not done often enough, and important stuff was scattered around a few PCs. So I got a NAS, stuck a pair of disks into it (RAID 0 for speed), and set up its automated incremental backup to run 3 times per week to an external USB drive. The PCs now mount the NAS at login, and that's where all data files are stored by default (even the kids use it).

    We're up to 2 NAS units now, with 7TB[*] of disk space between them, all backed up on schedule. The USB backup drives are rotated every few weeks with another set kept in a secure place in the garage.

    [*] One NAS unit doubles up as media server, so it's got a load of movies & music in addition to user files in its 6TB. The other one is our web server and email server with only 1TB of disk space.

    --
    Those who can make you believe absurdities can make you commit atrocities. - Voltaire
  23. Fireproof is NOT Firesafe; Oxymoron by BoRegardless · · Score: 2

    The term "Fireproof Safe" is defined by Underwriters Laboratories. It merely refers to being fire resistive for a given number of hours in a typical fire "for paper".

    Forget DVDs and CDs and any hard drives surviving a fire in one of these "fireproof" devices. They are designed to release steam to keep the temperature at 212 deg F until the fireproofing material exhausts all its water, at which time the temperature goes up and, well...you can imagine what.

    1. Re:Fireproof is NOT Firesafe; Oxymoron by Bender0x7D1 · · Score: 3, Informative

      Unless, of course, you get the safe that's rated for computer media.

      This is an area where you really need to RTFTS (tech specs) as it will tell you EXACTLY what kind of fire, temperature and duration it will protect a specific kind of media for.

      --
      Reading code is like reading the dictionary - you have to read half of it before you can go back and understand it.
  24. rsnapshot + raid6 on server in basement by Janek+Kozicki · · Score: 3, Interesting

    My solution to this problem is painfully simple: about 5 years ago I bought 5 drives 500GB each. I have put a server (made from old parts, like pentium IV and so on) in the basement (where nobody hears it, and it can be as noisy as hell). I installed debian on it and configured cron to call rsnapshot three times per day for doing automatic backups of all PCs in my family. I never touched this machine since then.

    With one exception: 3 years ago I started to run out of space, so I bought 2 HDDs 2 TB each, reconfigured raid6, which was extremely easy because for raid I am using mdadm, which supports such operations online. Also I had few more spare drives during the years, so I kept adding them to the array, and currently there are 9 HDDs in this PC. It is very noisy, but nobody cares about that.

    It runs flawlessly, untouched for years, and nobody cares about it, except for when somebody in my family accidentally loses or deletes a file. Then suddenly the backup comes in very handy.

    Rsnapshot is especially good, because it keeps hardlinked copies of data from last week, 2 weeks ago, last month, and much more, depending on how you configure /etc/rsnapshot.conf. Currently I have backups dating back about 2 years, with granularity of 1 month. And it only occupies the space on HDD to reflect the changes between data, thanks to hardlinks.
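
    For anyone wondering how hardlinked snapshots save space, here is a toy illustration of the idea. It is not rsnapshot's code (rsnapshot drives rsync and decides "unchanged" differently); this sketch compares file contents directly, which is slower but easy to follow. The snapshot function is a made-up name, and it assumes dest is a new directory on the same filesystem as previous, since hard links cannot cross filesystems.

```python
import filecmp
import os
import shutil


def snapshot(source, dest, previous=None):
    """Copy source into dest, hard-linking files unchanged since previous."""
    for dirpath, _dirs, files in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)        # unchanged: reuse the old copy's disk blocks
            else:
                shutil.copy2(src, dst)   # new or changed: store a fresh copy
```

    Every snapshot directory looks like a full copy, but an unchanged file costs only a directory entry. Deleting an old snapshot frees space only for files that no newer snapshot still links to.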

    So my raid6 array has a total size of about 4TB with 500GB still free, and I feel this will last at least a year or two. The most recent snapshot alone uses about 2 TB. In case of problems I can start deleting copies that are more than 1 year old.

    Rsnapshot also can backup windows machines, so you don't need to worry about compatibility. Though I don't have windows machines and I don't test that in practice ;)

    --
    #
    #\ @ ? Colonize Mars
    #
  25. Re:Enjoy your pricy delusion by Nikademus · · Score: 3, Insightful

    "Storage on S3 is ridiculously inexpensive any more.
    I have about 6 TB of data that I need to keep backed up."

    So you mean that 6000 GB * $0.125/GB/month = $750/month is cheap?
    Or did I miss something?
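
    Spelled out, using only the figures above (6 TB taken as 6000 GB, and the $0.125 per GB per month rate as quoted in this comment, not checked against any current price list):

```python
def monthly_storage_cost(size_gb, usd_per_gb_month):
    """Flat per-GB storage cost; ignores transfer and request fees."""
    return size_gb * usd_per_gb_month


monthly = monthly_storage_cost(6000, 0.125)
print(f"${monthly:,.2f} per month, ${monthly * 12:,.2f} per year")
# prints: $750.00 per month, $9,000.00 per year
```

    That is storage alone, before any bandwidth charges for getting the data in or out.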

    --
    I gave up with the idea of an useful sig...
  26. So now EVERYTHING is powered? by professorguy · · Score: 2
    So they've convinced us that WASTEPAPER BASKETS must be plugged in at all times (shredders and so-called "electric dustpans"). And I see everyone out with gas-powered BROOMS. And even the SAP which drips freely from our maples is, in modern sugar houses, vacuum pumped to a tank.

    And now there's 200 comments where the people are proud of their kilowatt server arrays which are powered 24 hours a day for their PHOTO ALBUMS? Are you people shitting me? I mean, you're putting me on, right? You don't really use up 10,000 kWH per year storing your family photos, do you?

    Hey, I've just invented the electric elevator-button-pusher. I save a TON of finger wear and tear.

    Sometimes, humanity makes me sad.

  27. The delete button by kikito · · Score: 2

    Not trying to troll here. I'm serious. Consider that you might be simply storing too much stuff.

  28. Ok, let me steer a little... by multimediavt · · Score: 2

    I am making my way through the comments, and want to clarify a little. I am not talking about backup. I am asking about disaster recovery or just plain drive maintenance tasks that should be done annually. The drives are my backup. Yes, good corporate data storage practice is to have spare drives around. I am talking about home. How many have 2 TB drives sitting empty on a shelf at home, just in case? I don't know anyone, personally, and I know hundreds of geek admin types of all ages and experience levels, myself included. We usually buy storage upgrades as needed and seldom have current technology, large drives just laying around because we're using them! Other than that, great stuff so far. Thanks all.

  29. All My Data Is Belongs To Me by ios+and+web+coder · · Score: 2

    I use a 3TB (4TB -1) OWC Mercury RAID 5 array that is backed up constantly via Apple Time Machine.
    Time Machine rocks. By the time it starts bucketing data, it will be at the 3-year mark. I keep my eye on the disk health, and am prepared to swap out as necessary.

    I keep an external drive rsynced to my main drive, so I have an immediate backup, if necessary. I have used it, on occasion (sucks when I do -my primary drive is an SSD).

    That's my personal data. I have a Mini that uses a Drobo to store my Web site stuff.

    I don't keep any data from my "day job" on my personal systems. My system at work is backed up very well indeed.

    BTW: I use git for my personal source control, and Perforce for my day job source control.

    --

    "For every complex problem there is an answer that is clear, simple, and wrong."

    -H. L. Mencken

  30. Re:Prioritise and use tape by mlts · · Score: 2

    I do consider tape the best answer to backups. No, it isn't flashy like cloud storage, or the latest Internet rendition of it (be it a glorified file share, rsync, etc.)

    Modern tape drives are a lot more reliable than the old 8mms and QIC cartridges. I still have DLT tapes from '98 which are still readable.

    It all depends on the size of data. For the amount the OP has, if there is money, I'd consider an external LTO tape drive. They are around $1400 on NewEgg for a LTO-3 external, plus one will need a couple C notes for a SAS card. However, LTO-3 tapes are $15-$20, so $200 would back up 3-4 TB of compressed data, and with uncompressed, definitely more. Once the drive is bought, having the ability to save off the data completely for $200 or so is pretty economical. To boot, once the read/write tab gets flicked to read-only, nothing software-wise is going to be able to corrupt the tapes, and you can always pay for LTO-3 WORM tapes if you want true tamper-resistance.

    Tape will always have a place in IT, even if it is just for a place to save data cheaply long term for pleasing the auditors.

  31. Data integrity by thereitis · · Score: 3, Interesting

    This is my personal data and it is starting to become unbelievably unruly to deal with as far as data integrity and security are concerned.

    Keep all your important files in a version control system. Personally, I use Perforce (it's free for 2 users or less). That gives you: multi-revision history and checkin comments, an easy way to pull a subset of files to any computer in your house, and peace of mind that you don't need to worry about kids deleting anything important as it's all stored on the server with history. Also easy to see what has changed on any computer and check those files in. And there's a big win for data integrity checks: Perforce stores the checksum of all files (and revisions) and can easily check that every file still matches the checksum in the central database. If you have any disk corruption, you'll know about it when you run 'p4 verify -q //...'. You can store files of several gigabytes each with no problem.
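
    If you don't want to run a version control server, the integrity-check part can be approximated with a checksum manifest. The sketch below is not how Perforce does it internally; it is a stand-alone illustration of the same idea, with made-up function names (sha256_of, build_manifest, verify) and SHA-256 picked as an assumption.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = sha256_of(path)
    return manifest


def verify(root, manifest):
    """Return sorted relative paths that are missing or no longer match."""
    bad = []
    for rel, digest in manifest.items():
        path = os.path.join(root, rel)
        if not os.path.isfile(path) or sha256_of(path) != digest:
            bad.append(rel)
    return sorted(bad)
```

    Build the manifest once, store it somewhere other than the drive it describes, and re-run verify on a schedule. Anything it reports is either an edit you forgot about or corruption, which is the signal that matters.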

    On top of this, I use rsync to copy the server data onto backup drives. I'm also looking at storing backups online, but haven't taken that step yet.

    I've been using this system for years and I couldn't imagine being without it. It's so easy to find and retrieve exactly what I want - my resume 5 revisions ago, my tax return, photos from 2003. Even without that, the data integrity checks give a lot of peace of mind.