Slashdot Mirror


Ask Slashdot: What's The Best Way To Backup Large Amounts Of Personal Data? (foxdeploy.com)

An anonymous Slashdot reader has "approximately two terabytes of photos, currently sitting on two 4-terabyte 'Intel Rapid Storage' RAID 1 disks." But now they're considering three alternatives after moving to a new PC: a) Keep these exactly as they are... The current configuration is OK, but it's a pain if a RAID re-sync is needed as it takes a long time to check four terabytes.

b) Move to "Storage Spaces". I've not used Storage Spaces before, but reports seem to show it's good... It's a Good Thing that the disks are 100% identical and removable and readable separately. Downside? Unknown territory.

c) Break the RAID, and set up the second disk as a file-copied backup... [This] would lose a (small) amount of resilience, but wouldn't suffer from the RAID-sync issues. Ideally, a Mac-like "TimeMachine" backup would handle file histories.

Any recommendations?

This is also a good time to share your experiences with Storage Spaces, so leave your answers in the comments. What's the best way to back up large amounts of personal data?

21 of 366 comments

  1. Commit it to memory! by danomac · · Score: 4, Interesting

    Memorize it! Just don't take any head injuries or you won't remember anything.

    More seriously, backing up to hard drives is the only viable option. Then make sure you have more than one backup drive and store one at some other site. A relative's place, maybe?

    Cloud options with that kind of storage would take forever to upload. And I've heard of people having stuff randomly go missing on their cloud service, not the entire contents, but a file here and there. I'm not so sure that's a good option.

    For storing on-site you can get a fire rated media safe, but they can be quite a bit more expensive than a regular safe.

    1. Re:Commit it to memory! by danomac · · Score: 4, Informative

      In addition, I forgot the 3-2-1 backup principle. 3 copies of data, on at least 2 different types of media, and 1 copy off-site.
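The 3-2-1 rule is easy to check mechanically. Here is a minimal Python sketch, assuming you describe each copy by hand; the `BackupCopy` record and its field names are made up for illustration and do not come from any backup tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical record, filled in by hand)."""
    label: str
    media: str      # e.g. "internal-hdd", "usb-hdd", "cloud"
    offsite: bool   # True if the copy lives somewhere other than home


def satisfies_3_2_1(copies):
    """Return True if there are >= 3 copies, on >= 2 media types, with >= 1 off-site."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite
```

A RAID 1 pair on its own fails this check no matter how you count the disks: same media type, same location.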

  2. Come the fuck on by Anonymous Coward · · Score: 5, Insightful

    2 Terabytes is nothing.

    Here's how you do this:

    10 You buy an external hard disk that is 4 Terabytes or larger, and USB 3.0.
    20 Copy the fucking files to that thing.

    You're done. Now you have two copies: one on whatever bad idea you have as your main drive, and the other on a physically separate drive.

    Not good enough? GOTO 10

    1. Re:Come the fuck on by spire3661 · · Score: 3, Insightful

      You forgot checksumming and verification after transfer. You have something on the other drive after the transfer, but you won't know what until you verify it.
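One way to do that verification, sketched in Python with only the standard library: hash every file under the source with SHA-256 and compare it with the file at the same relative path on the backup drive. The function names are mine, and this assumes the backup is a plain file-by-file copy rather than an archive or image.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large photos never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source_dir, backup_dir):
    """Return relative paths that are missing from, or differ in, backup_dir."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(rel.as_posix())
    return problems
```

An empty list means every source file has a byte-identical twin on the backup. It does not report extra files that exist only on the backup side.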

      --
      Good-bye
    2. Re:Come the fuck on by AmiMoJo · · Score: 4, Informative

      Bad idea, because it requires on-going effort. Most people will forget, or get lazy.

      For most people encrypted online backup is the best option. I use SpiderOak (I took up the unlimited space special offer, about £100/year), but there are others. It's automatic and happens constantly in the background. I've got over 4TB on SpiderOak, and it only took a few months to upload. Obviously you need a reasonable upload speed and no/high data caps.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    3. Re:Come the fuck on by joe_frisch · · Score: 4, Insightful

      Agreed! The minor changes I would make (and do for my own few TB of files):

      Have a script that runs the backup. I use rsync on Linux.
      Make two copies, one that mirrors, one that just adds files.
      Use two backup disks, always have one at a remote location (your work) so you don't lose data in a house fire.

      If it is a single command (mine is "backup") then it's easy to remember to do every week.

      I actually have 3 backups. One at home. Two at different work sites that I cycle through. I do my backups from a Linux machine that doesn't provide write access to my main Windows machines. That makes me a little more resistant to hacks (since they would have to hack two different OSes).
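The "one that mirrors, one that just adds files" setup could look roughly like the Python wrapper below. This is a guess at the shape of such a script, not the poster's actual one: the function names and paths are placeholders, and it assumes `rsync` is on the PATH. The mirror uses `--delete`; the additive copy leaves it out, so files deleted from the source stay on that disk.

```python
import subprocess


def rsync_commands(source, mirror_dest, additive_dest):
    """Build the two rsync invocations (all paths are placeholders)."""
    src = source.rstrip("/") + "/"  # trailing slash: copy the contents of source
    # Mirror: destination becomes an exact copy, deletions included.
    mirror = ["rsync", "-a", "--delete", src, mirror_dest]
    # Additive: never deletes. Changed files are still updated; rsync's
    # --ignore-existing could be added to skip files already present.
    additive = ["rsync", "-a", src, additive_dest]
    return [mirror, additive]


def run_backup(source, mirror_dest, additive_dest, dry_run=True):
    """Run both copies; dry_run=True only shows what rsync would do."""
    for cmd in rsync_commands(source, mirror_dest, additive_dest):
        if dry_run:
            cmd = cmd[:1] + ["--dry-run"] + cmd[1:]
        subprocess.run(cmd, check=True)
```

Keeping the additive copy around means an accidental delete on the source, faithfully propagated to the mirror, is still recoverable from the second disk.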

    4. Re:Come the fuck on by Chelloveck · · Score: 4, Insightful

      What he said. And for ongoing backup, keep the disk at a buddy's house and rsync your files to them periodically. And reciprocate. Keep their backup disk at your place and let them rsync to you. Done. You're safe and you've made the world a better place.

      Although I imagine that our "anonymous Slashdot reader" who asked this question wouldn't know rsync if it bit them on the ass, being the marketing person for Storage Spaces and all. Come on, the only purpose of such a fucking obvious question is to get some front-page name recognition for the product. Nice timing, too, slipping it onto the feed Sunday night, ready for everyone's Monday morning Slashdot-and-coffee ritual. Kudos.

      --
      Chelloveck
      I give up on debugging. From now on, SIGSEGV is a feature.
    5. Re:Come the fuck on by mlts · · Score: 4, Informative

      As others have said, 4TB isn't that much. The key is to have a 3-2-1 plan for the data -- 3+ copies, 2 on different media, one offsite:

      First, I'd recommend purchasing a NAS appliance. Synology and QNAP offerings are inexpensive and even though one can build their own system with FreeNAS or something else, a small NAS appliance takes up relatively little in wattage, which is nice for the electric bill. I also like the fact that you have the ability to encrypt data, and segment it into shares. Some NAS models even allow for snapshots. They are not too expensive -- an ARM based dual-drive NAS is about $150 + drives.

      For four terabytes, I would recommend a Synology DS216+ ii (the reason for the long name is that the DS216+ had components which were discontinued, so the mark 2 edition is current). This NAS model is x86 based and can use btrfs to detect bit rot on the RAID volumes. Then, drop in two WD Reds (6 or 8 TB), and you have RAID 1.

      Second, buy an external USB drive to plug to the NAS. RAID and snapshots are nice, but this provides a true backup mechanism.

      Third, get an offsite backup mechanism. QNAP and Synology have software that can back up to a number of providers, and back stuff up encrypted. There are many offsite backup providers out there.

      Fourth, consider a manual offsite mechanism, even if it is another external hard drive that you plug in, dump the contents of the NAS to, remove, and put offsite somewhere. This way, if you lose your NAS and Net connection, you still have some means to access your data.

  3. RAID is not backup by bad_fx · · Score: 5, Insightful

    Say it with me: "RAID is not backup!"

    1. Re:RAID is not backup by Frobnicator · · Score: 5, Insightful

      Say it with me: "RAID is not backup!"

      Indeed. There is also a difference between backup and archive.

      RAID = This is running live, and I need a duplicate that is instantly available so I can keep running in case one drive fails. The trick is that if there is an operation that destroys data (e.g. ransomware infection that encrypts your stuff) then you lose all disks. This is why RAID is not backup.

      Backup = Just in case the machine dies, or I accidentally delete a bunch of stuff, or a virus hits, I can restore from the backup. This generally follows the 3-2-1 rule: At least three copies, at least two media, at least one off site. Businesses often use D2D2T systems for this.

      Archive = I will probably never look at this again, but I absolutely need to keep a copy around for historic or business reasons. Think about services like Iron Mountain or Amazon Glacier. Tape archives that are quite cheap and almost certainly never reviewed again. This is along the lines of "show me the obituaries from a newspaper published 7 May 1957", or similar.

      For the original story, it seems like he is looking for an archival solution rather than a backup solution.

      --
      //TODO: Think of witty sig statement
    2. Re:RAID is not backup by Anonymous Coward · · Score: 4, Informative

      Of course RAID isn't a backup technology. It's a way of providing fault tolerance across large filesystems. It does this by alerting administrators to failed drives, and allowing them to be swapped in & out while the filesystem stays online. At that task, it works reasonably well, although it does need to be supported by a robust alerting & "hands+feet" strategy. That's why it's still in widespread use in enterprise environments. They have the $$ and manpower to make it work.

      Conversely, maintaining a good backup of your data (vs keeping it online) is a different beast. For that you have a whole bunch of other technologies like incremental copy, snapshotting, and clever combinations of the two, that store the resulting backups on everything from another RAID array, to tape systems, USB3 portable drives, remote filesystems, cloud solutions, etc etc.

      What the OP seems to be asking is "what backup strategy should I consider to back up 2TB of personal data using SOHO technologies?" Personally, I wouldn't even consider doing it locally, as it's prone to human error and keeps all the data in the same location (thus failing to protect against the two most likely causes of data loss in a home environment: you forgot to run the backup, or your house got flooded/burnt/ransacked). I'd consider a cloud-based solution (rsync.net or something similar) as it solves both those issues, albeit at a higher ongoing (opex) cost rather than just a straight capital cost for a USB3 portable drive. It's hard to say whether an ongoing cost would be acceptable in this case, as the OP didn't mention whether $$ was a factor.

    3. Re:RAID is not backup by dgatwood · · Score: 3, Informative

      The problem with cloud-based solutions is that the cost for backing up several terabytes of data is typically several orders of magnitude higher than building your own RAID array, and the performance of Internet-based backup absolutely sucks beyond measure unless you're the sort of person whose data needs are measured in tens of megabytes.

      • To back up 2 TB over a typical cable modem (say 3 megabit upload speed) will take you 61 days. Over typical DSL (300 kilobits per second), it will take almost two years.
      • If you lose your original copy, getting the data back will be almost as painful. On a fairly fast cable modem (30 Mbps), assuming the cloud-based backup server can completely saturate your downlink (which is by no means guaranteed), it will take you 6 days of continuous downloading to restore the backup. Over 3 megabit DSL, again, that number goes up to about 61 days.
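The arithmetic behind those figures is short enough to check yourself. A small Python sketch, assuming decimal units (1 TB = 10^12 bytes, 1 megabit = 10^6 bits) and a link that stays fully saturated with no protocol overhead, which real connections will not manage:

```python
SECONDS_PER_DAY = 86_400
TWO_TB = 2 * 10**12  # bytes, decimal terabytes


def transfer_days(size_bytes, link_bits_per_second):
    """Days needed to move size_bytes over a fully saturated link."""
    seconds = size_bytes * 8 / link_bits_per_second
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"{transfer_days(TWO_TB, 3e6):.1f}")    # 3 Mbps upload    -> 61.7
    print(f"{transfer_days(TWO_TB, 300e3):.1f}")  # 300 kbps upload  -> 617.3
    print(f"{transfer_days(TWO_TB, 30e6):.1f}")   # 30 Mbps download -> 6.2
```

Treat these as best-case numbers; any throttling or shared use of the line only makes them longer.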

      The ideal solution, if you can pull it off, would be to build a small concrete bunker in your yard, run power out to it, put a UPS and power conditioner in there to protect against bad power, put a RAID array in there, wire it with Ethernet to your house underground, put a watertight door on the thing, add a power cutoff that shuts down power if water does get inside (e.g. a GFI breaker and an unused extension cord whose output end is lower than your equipment), and hope for the best.

      But more realistically, I would tend to suggest an IOSafe fireproof RAID array loaded up with five 6 TB drives (or maybe even 8 TB drives). Put it in a closet somewhere, and hope for the best. If you want to increase your protection a bit, you could also get two RAID expansion cabinets, store them at work, and periodically bring one home, clone your main RAID array to it, and bring it back.

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

  4. RAID IS NOT BACKUP by networkzombie · · Score: 4, Insightful

    1) RAID IS NOT BACKUP unless you have another read only set.
    2) STORAGE SPACES IS NOT BACKUP unless you have another read only set, and please, it is JBOD with some added features.
    3) You are exchanging RAID sync issues with backup sync issues.

    I would set up hardware RAID, but that is not related to what you need... Back up to two other disks. Upgrade disk size and technology as needed. A 4TB disk is like $140.

  5. Not an advert - but Backblaze by forgottenusername · · Score: 4, Interesting

    https://www.backblaze.com/clou...

    $5/month unlimited data size (writes).

    You can sync files back over or they will actually ship you a HD with your data; if you return the drive you get a refund of the drive cost but you're also free to keep it.

    The cost for individual file reads is reasonable too.

    No muss no fuss

    1. Re:Not an advert - but Backblaze by hcs_$reboot · · Score: 4, Informative
      Backblaze:

      Linux, BSD, Unix and other *nix systems:
      These operating systems are not supported and Backblaze can not be installed on them

      --
      Slashdot, fix the reply notifications... You won't get away with it...
  6. RAID is NOT backup! by gweihir · · Score: 4, Informative

    RAID is fine to reduce downtime, but completely unsuitable as a replacement for backup.

    The RAID does not have the following things which you critically need from backup (the following list is not complete):
    - resilience against operator error (accidentally delete/overwrite files, e.g.)
    - geographic redundancy, usually not even safe against the box killing the disks, lightning, fire, theft, etc.
    - too few copies: Usually 3 (!) independent backup copies used in rotation are considered the minimum. RAID1 gives you one and it is not independent.

    My recommendation is to get at least 3 external USB disks, and establish a backup with them, because currently you have none.

    Steps:
    - Select a backup interval. This represents the maximum time-interval for which you think losing new data is acceptable
    - At the end of each interval, do the following:
          1. Fetch oldest backup disk from off-site location
          2. Put backup copy on it, making it the newest backup. Make sure to do a file-by-file comparison.
          3. Move disk to off-site location

    For somewhat reduced reliability keep the oldest copy at home and do the following:
          1. Make backup, overwriting oldest copy. Make sure to do a file-by-file comparison.
          2. Move new backup to off-site location and fetch oldest from off-site location.

    An "off-site location" can be anything from a garden-shack to a storage locker at work to an arrangement with a neighbor or a friend you see regularly.
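If you keep a note of when each disk was last written, picking the disk for the next round of the rotation is a one-liner. A Python sketch, assuming you track the dates yourself; the disk names and dates in the usage note are invented for illustration.

```python
from datetime import date


def next_disk_to_overwrite(last_backup_dates):
    """Pick the disk holding the oldest backup.

    last_backup_dates maps a disk name to the date of its last backup,
    or to None for a disk that has never been used (those go first).
    """
    return min(
        last_backup_dates,
        key=lambda name: last_backup_dates[name] or date.min,
    )
```

With three disks last written on 1, 8 and 15 September, this picks the one from 1 September, so the two newer copies stay untouched while the new backup is being written.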

    If you think this is too much effort, then your data must not be worth much. This is pretty much the agreed minimum experienced sysadmins want. Of course, there are always those that never lost any important data and they almost universally think this is way too much effort. Many of them learn in time when whatever they do results in that loss.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  7. Make a restore plan first by houghi · · Score: 3, Informative

    I have the following:
    1) 1 SSD that I work on and another that is mirrored every day. If one disk fails, I have another. This is my working disk.
    2) Incremental backup of data that changes often, like emails or some directories I work in. Mostly used if I delete a file by accident. Just copy it back and be done. This goes to a NAS.
    3) Data that does not change often, like movies, images and music, is stored on a NAS.
    4) Second NAS to back up the data of the first NAS.
    5) Essential data (less than 10MB) is put on my website in a personal directory. This is data that I might need in case of the house burning down.

    So when something goes wrong (unless the house burns down, but then I have other problems and my music is not one of them) I have a way to restore it.

    The most important thing, however, is not the backup but the knowledge of how to restore it. You need to test that out from time to time. I have seen people who did backups to /dev/null to test it and forgot to remove that parameter.

    What you can do if you REALLY need to have things off site, like photos and other things that you can't replace is just buy a dedicated HD that you put this data on and keep it in a drawer at your office. Once a month or so you take it home and add the new data.
    And if that disk is full, buy a new one or a bigger one. If data is really THAT important, the price of the HD is well worth it.

    But again, test the restore.
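A cheap way to test the restore regularly is a spot check: pull a random sample of files out of the backup into a scratch directory and compare them byte for byte with the live copies. The Python sketch below assumes a plain file-tree backup and that the sampled files have not changed since the backup ran; the function name and parameters are my own.

```python
import filecmp
import random
import shutil
import tempfile
from pathlib import Path


def spot_check_restore(backup_dir, live_dir, sample_size=20, seed=None):
    """Restore a random sample from backup_dir and compare it with live_dir.

    Returns the relative paths whose restored copy is missing from, or
    differs from, the live tree.
    """
    backup_dir, live_dir = Path(backup_dir), Path(live_dir)
    files = sorted(p for p in backup_dir.rglob("*") if p.is_file())
    sample = random.Random(seed).sample(files, min(sample_size, len(files)))
    failures = []
    with tempfile.TemporaryDirectory() as scratch:
        for index, backed_up in enumerate(sample):
            rel = backed_up.relative_to(backup_dir)
            restored = Path(scratch) / f"{index}_{backed_up.name}"
            shutil.copy2(backed_up, restored)  # the actual "restore" step
            original = live_dir / rel
            if not original.is_file() or not filecmp.cmp(
                restored, original, shallow=False
            ):
                failures.append(rel.as_posix())
    return sorted(failures)
```

A file you edited after the last backup will show up as a failure here, so read the list before panicking. A file that cannot be read off the backup disk at all raises an exception, which is exactly the kind of thing you want to find out before you need the data.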

    --
    Don't fight for your country, if your country does not fight for you.
  8. Re:RAID is not backup... by Jason+Levine · · Score: 3, Insightful

    I'll second BackBlaze - but with the caveat of expecting your initial upload to take a long time depending on your Internet speeds. I have a 15/1 connection so the ~1TB that I wanted to back up took me about 8 months. (I couldn't use my full 1Mbps upstream bandwidth for backup traffic.) Now that this is done, however, it's pretty much automatic. New data gets written and the backup occurs. They even have an app you can use so you can access your data no matter where you are.

    If you need to restore from backup, BackBlaze will ship you a thumb drive or external hard drive for a fee. The fee is refunded if you send the drive back (thus ensuring that people don't abuse this service) and it beats having to download TBs of data.

    Besides BackBlaze, I back up everything on to two external hard drives. This way, if one drive blows, the other drive keeps the data safe. As another person posted, follow the 3-2-1 rule. 3 copies of the data (for me, 2 external HDs and 1 on BackBlaze), 2 different mediums (e.g. external HDD and cloud), and 1 copy offsite (e.g. BackBlaze or another cloud provider).

    --
    My sci-fi novel, Ghost Thief, is now available from Amazon.com.
  9. That problem has already been solved by luis_a_espinal · · Score: 4, Funny

    You forgot checksumming and verification after transfer. You have something on the other drive after the transfer, but you won't know what until you verify it.

    By the tits of Baal, rsync or xcopy /v or robocopy in combination with fciv.

  10. Re:RAID is not backup... by im_thatoneguy · · Score: 3, Interesting

    I have over 10TB on Backblaze for $5/mo. Works great and recovery is easy.

    I would add, though, that if you want more control and more flexibility, I've started using Backblaze's B2 API with SyncBack, CloudBerry, or whatever backup software you prefer. That costs about $5/month per TB but has the advantage of control over hash checks and retention.

  11. Can We Drop the "Online" Vapor? by rally2xs · · Score: 3, Informative

    C'mon, online backup? Really? The poster said "terabytes." Cable companies in this area say "hundreds of kilobits per second" as an upload speed. That'd be tens of kilobytes per second. How long? Get optimistic at, say, 800 kbps -> 80 - 100 kBps and you have a really long time. Lessee, 2 X 10^12 bytes / 1 X 10^5 bytes/s = 2 X 10^7 seconds = 20 million seconds to upload 2 terabytes. 20 X 10^6 seconds / 3.6 X 10^3 seconds / hour = about 5.5 X 10^3 hours, or 5,500 hours. 5,500 hours / 24 hours / day = 229 days. I aborted Carbonite some years ago when I had only a couple hundred gigabytes; it was _NOT_ uploading every single file on my disk, and looked like it was going to exceed 3 weeks to do it.