Ask Slashdot: What's The Best Way To Backup Large Amounts Of Personal Data? (foxdeploy.com)
An anonymous Slashdot reader has "approximately two terabytes of photos, currently sitting on two 4-terabyte 'Intel Rapid Storage' RAID 1 disks." But now they're considering three alternatives after moving to a new PC:
a) Keep these exactly as they are... The current configuration is OK, but it's a pain if a RAID re-sync is needed as it takes a long time to check four terabytes.
b) Move to "Storage Spaces". I've not used Storage Spaces before, but reports seem to show it's good... It's a Good Thing that the disks are 100% identical and removable and readable separately. Downside? Unknown territory.
c) Break the RAID, and set up the second disk as a file-copied backup... [This] would lose a (small) amount of resilience, but wouldn't suffer from the RAID-sync issues, ideally a Mac-like "TimeMachine" backup would handle file histories.
Any recommendations?
This is also a good time to share your experiences with Storage Spaces, so leave your answers in the comments. What's the best way to backup large amounts of personal data?
In addition, I forgot the 3-2-1 backup principle. 3 copies of data, on at least 2 different types of media, and 1 copy off-site.
RAID is fine to reduce downtime, but completely unsuitable as a replacement for backup.
The RAID does not have the following things which you critically need from backup (the following list is not complete):
- resilience against operator error (accidentally delete/overwrite files, e.g.)
- geographic redundancy: a RAID is usually not even safe against the box killing the disks, lightning, fire, theft, etc.
- enough copies: Usually 3 (!) independent backup copies used in rotation are considered the minimum. RAID1 gives you one and it is not independent.
My recommendation is to get at least 3 external USB disks, and establish a backup with them, because currently you have none.
Steps:
- Select a backup interval. This represents the maximum time-interval for which you think losing new data is acceptable
- At the end of each interval, do the following:
1. Fetch oldest backup disk from off-site location
2. Put backup copy on it, making it the newest backup. Make sure to do a file-by-file comparison.
3. Move disk to off-site location
For somewhat reduced reliability keep the oldest copy at home and do the following:
1. Make backup, overwriting oldest copy. Make sure to do a file-by-file comparison.
2. Move new backup to off-site location and fetch oldest from off-site location.
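The rotation and the file-by-file comparison above could be sketched in Python roughly like this. It is only a sketch under assumptions: the disk labels, the timestamps, and the function names `pick_oldest` and `differing_files` are made up for illustration, and the copy step itself is left to whatever backup tool you already use.

```python
import filecmp
from pathlib import Path


def pick_oldest(last_backup_times: dict[str, float]) -> str:
    """Return the label of the disk whose last backup is the oldest.

    The labels and timestamps are hypothetical bookkeeping kept by you.
    """
    return min(last_backup_times, key=last_backup_times.get)


def differing_files(source: Path, backup: Path) -> list[str]:
    """List files that are missing from the backup or differ byte-for-byte."""
    problems = []
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = backup / rel
        # shallow=False forces a content comparison instead of trusting
        # size and modification time alone.
        if not dst.is_file() or not filecmp.cmp(src, dst, shallow=False):
            problems.append(str(rel))
    return sorted(problems)
```

An empty list from `differing_files` means every source file has an identical copy on the backup disk. It does not report extra files that exist only on the backup.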
An "off-site location" can be anything from a garden-shack to a storage locker at work to an arrangement with a neighbor or a friend you see regularly.
If you think this is too much effort, then your data must not be worth much. This is pretty much the agreed minimum experienced sysadmins want. Of course, there are always those that never lost any important data and they almost universally think this is way too much effort. Many of them learn in time when whatever they do results in that loss.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
I have the following:
1) 1 SSD that I work on and another that is mirrored every day. If one disk fails, I have another. This is my working disk.
2) Incremental backup of data that changes often, like emails or some directories I work in. Mostly used if I delete a file by accident. Just copy it back and be done. This goes to a NAS.
3) Data that does not change often, like movies, images and music, is stored on a NAS.
4) Second NAS to backup the data of the first NAS.
5) Essential data (less than 10MB) is put on my website on a personal directory. This is data that I might need in case of the house burning down.
So when something goes wrong (unless the house burns down, but then I have other problems and my music is not one of them) I have a way to restore it.
The most important thing however is not the backup itself, but knowing how to restore it. You need to test that out from time to time. I have seen people who did backups to /dev/null to test it and forgot to remove that parameter.
What you can do if you REALLY need to have things off site, like photos and other things that you can't replace is just buy a dedicated HD that you put this data on and keep it in a drawer at your office. Once a month or so you take it home and add the new data.
And if that disk is full, buy a new one or a bigger one. If data is really THAT important, the price of the HD is well worth it.
But again, test the restore.
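One way to test a restore, sketched in Python under assumptions: record a checksum manifest when you make the backup, then check a trial restore against it. The names `build_manifest` and `verify_restore` are made up for illustration, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large photos or videos do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def verify_restore(saved: dict[str, str], restored_root: Path) -> list[str]:
    """Return files from the saved manifest that are missing or altered."""
    restored = build_manifest(restored_root)
    return sorted(
        name for name, digest in saved.items() if restored.get(name) != digest
    )
```

A backup that silently went nowhere (the /dev/null case) shows up here as every file in the manifest being reported missing.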
Don't fight for your country, if your country does not fight for you.
Of course RAID isn't a backup technology. It's a way of providing fault tolerance across large filesystems. It does this by alerting administrators to failed drives, and allowing them to be swapped in & out while the filesystem stays online. At that task, it works reasonably well, although it does need to be supported by a robust alerting & "hands+feet" strategy. That's why it's still in widespread use in enterprise environments. They have the $$ and manpower to make it work.
Conversely, maintaining a good backup of your data (vs keeping it online) is a different beast. For that you have a whole bunch of other technologies like incremental copy, snapshotting, and clever combinations of the two, that store the resulting backups on everything from another RAID array, to tape systems, USB3 portable drives, remote filesystems, cloud solutions, etc etc.
What the OP seems to be asking is "what backup strategy should I consider to back up 2TB of personal data using SOHO technologies?" Personally, I wouldn't even consider doing it locally, as it's prone to human error and keeps all the data in the same location (thus failing to protect against the two most likely causes of data loss in a home environment: you forgot to run the backup, or your house got flooded/burnt/ransacked). I'd consider a cloud-based solution (rsync.net or something similar) as it solves both those issues, albeit at a higher ongoing (opex) cost rather than just a straight capital cost for a USB3 portable drive. It's hard to say whether an ongoing cost would be acceptable in this case, as the OP didn't mention whether $$ was a factor.
Bad idea, because it requires on-going effort. Most people will forget, or get lazy.
For most people encrypted online backup is the best option. I use Spideroak (I took up the unlimited space special offer, about £100/year), but there are others. It's automatic, happens constantly in background. I've got over 4TB on Spideroak, only took a few months to upload. Obviously you need a reasonable upload speed and no/high data caps.
const int one = 65536; (Silvermoon, Texture.cs)
SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
As others have said, 4TB isn't that much. The key is to have a 3-2-1 plan for the data -- 3+ copies, 2 on different media, one offsite:
First, I'd recommend purchasing a NAS appliance. Synology and QNAP offerings are inexpensive and even though one can build their own system with FreeNAS or something else, a small NAS appliance takes up relatively little in wattage, which is nice for the electric bill. I also like the fact that you have the ability to encrypt data, and segment it into shares. Some NAS models even allow for snapshots. They are not too expensive -- an ARM based dual-drive NAS is about $150 + drives.
For four terabytes, I would recommend a Synology DS216+ ii. (The reason for the long name is that the DS216+ had components which were discontinued, so the mark 2 edition is current.) This NAS model is x86 based and can use btrfs to detect bit rot on the RAID volumes. Then, drop in two WD Reds (6 or 8 TB), and you have RAID 1.
Second, buy an external USB drive to plug to the NAS. RAID and snapshots are nice, but this provides a true backup mechanism.
Third, get an offsite backup mechanism. QNAP and Synology have software that can back up to a number of providers, and back stuff up encrypted. There are many offsite backup providers out there.
Fourth, consider a manual offsite mechanism, even if it is another external hard drive that you plug in, dump the contents of the NAS to, remove, and put offsite somewhere. This way, if you lose your NAS and Net connection, you still have some means to access your data.
The problem with cloud-based solutions is that the cost for backing up several terabytes of data is typically several orders of magnitude higher than building your own RAID array, and the performance of Internet-based backup absolutely sucks beyond measure unless you're the sort of person whose data needs are measured in tens of megabytes.
The ideal solution, if you can pull it off, would be to build a small concrete bunker in your yard, run power out to it, put a UPS and power conditioner in there to protect against bad power, put a RAID array in there, wire it with Ethernet to your house underground, put a watertight door on the thing, add a power cutoff that shuts down power if water does get inside (e.g. a GFI breaker and an unused extension cord whose output end is lower than your equipment), and hope for the best.
But more realistically, I would tend to suggest an IOSafe fireproof RAID array loaded up with five 6 TB drives (or maybe even 8 TB drives). Put it in a closet somewhere, and hope for the best. If you want to increase your protection a bit, you could also get two RAID expansion cabinets, store them at work, and periodically bring one home, clone your main RAID array to it, and bring it back.
Check out my sci-fi/humor trilogy at PatriotsBooks.
Linux, BSD, Unix and other *nix systems:
These operating systems are not supported and Backblaze can not be installed on them
Slashdot, fix the reply notifications... You won't get away with it...
C'mon, online backup? Really? The poster said "terabytes." Cable companies in this area say "hundreds of kilobits per second" as an upload speed. That'd be 10's of kilobytes per second. How long? Get optimistic at, say, 800 kbps -> 80 - 100 kBps and you have a really long time. Lessee, 2 X 10^12 bytes / 1 X 10^5 bytes/s = 2 X 10^7 seconds = 20 million seconds to upload 2 terabytes. 20 X 10^6 seconds / 3.6 X 10^3 seconds / hour = about 5.5 X 10^3 hours, or 5,500 hours. 5,500 hours / 24 hours / day = 229 days. I aborted Carbonite some years ago when I had only a couple hundred gigabytes; it was _NOT_ uploading every single file on my disk, and looked like it was going to exceed 3 weeks to do it.
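For anyone who wants to plug in their own numbers, the same back-of-the-envelope estimate as a small Python sketch. The function names are made up for illustration, and it assumes a steady upload rate with no protocol overhead, throttling, or data caps, which real connections will not give you.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_second_from_kbps(kbps: float) -> float:
    """Convert a line rate in kilobits per second to bytes per second."""
    return kbps * 1000 / 8


def upload_days(total_bytes: float, bytes_per_second: float) -> float:
    """Days needed to upload total_bytes at a constant rate."""
    return total_bytes / bytes_per_second / SECONDS_PER_DAY


rate = bytes_per_second_from_kbps(800)  # 100,000 bytes/s before overhead
days = upload_days(2e12, rate)          # about 231 days for 2 TB
print(f"{days:.0f} days")
```

The figure comes out a couple of days above the 229 in the post only because the post rounds to 5,500 hours along the way.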