Best Way to Back Up Photos and Video?
jsalbre writes "I do a lot of digital video work, and my wife is a professional photographer. With raw DV from the video camera using up 11GB/hr, and raw images from the digital SLR using 7MB each, I'm quickly using up a lot of space. I currently back up all my important files each night from one harddrive to another, but I now have over 200GB of irreplaceable data (more than just DV and photos, but those make up the largest chunk) and I'm having to exclude the "less important" irreplaceable files as my backups have started failing. Several people have suggested backing up vital unchanging files (video, images) to DVD and continuing to back up frequently accessed files to harddrive, but with recent studies showing that optical media doesn't last very long I don't want to come back in a few years and find that all my backups are useless. Not to mention that some of my DV files are larger than even a dual-layer DVD, and it would be near impossible to automate backup to DVD. How do other Slashdotters back up their important data? I'd appreciate a distinction between methods for frequently accessed files and for infrequently accessed files. Any suggestions will be highly appreciated!"
"How do other Slashdotters back up their important data?"
I memorize it.
Vincent J. Murphy
Spandex Justice
Since when is tape archival quality? It's barely backup quality. I've had way more properly stored tapes fail than I have properly stored optical media.
Treat optical media like magnetic media (store in a cool, dry place) and use high-quality media and you'll get far better results than tape.
Add in the speed at which tape drives become obsolete and tapes become hard to obtain, while CDs are still readable, and I've found optical to be a superior archive medium.
If you examine the study cited, you'll notice that the study is for optical media in harsh conditions. Additionally, they specifically state: "It is demonstrated here that CD-R and DVD-R media can be very stable (sample S4 for CD-R and sample D2 for DVD-R). Results suggest that these media types will ensure data is available for several tens of years and therefore may be suitable for archival uses."
Rename all of the files so they have filenames like "Teen_Lesbian_fff_Hot!Hot!Hot!.avi". Now make them available through your favorite p2p service. Even better, prepend these files with short snippets of pr0n. You'll find that years later you can kick up just about any p2p client and you'll find your files are still available.
Doesn't it make you feel good to know that our freedoms are protected by politicians, lawyers and journalists?
While I love RAID, RAID is not a backup. RAID is about availability and consistency, so if you delete an item on a RAID it is SUPPOSED to be lost to the entire array.
In everything I've read, the moral definitely seems to be harddrives, lots of harddrives, for price/performance. I'm assuming you have a reasonable LAN or can set one up.
Here's the setup I haven't finished implementing yet. PLEASE give me any comments about it to help me improve my setup.
1. Set up a file server using at least one big, inexpensive disk. (This can also be a desktop as long as it can reasonably serve files.) This is your "USE" server.
2. Separate your files (on a per-directory basis) into categories based on how frequently they are changed. The important consideration is: "If a file is changed/deleted on USE, how long should I wait before deleting it from the backup?" Personally, I only need two categories: "current" = a month or so, depending on disk space, and "archive" = never (family pics, videos, etc.)
That means that if I delete something in my "current" tree _AND_ I don't notice for a month, my backups will delete it and it's gone forever.
3. Set up a "backup server" using at least one inexpensive hard disk. Set your backup server to log in to your USE server and sync your files.
It should be able to do both "full" (copy everything) and "incremental versioning" = "IV" (if something is changed, keep BOTH copies, marking them appropriately) backups. Neither of these kinds of backups should ever eliminate any information automatically; they should just add information.
4. For me, I'd run:
1) An IV backup of "archive" every night.
2) A full backup of "current" every week.
3) An IV backup of "current" every night.
4) A job that deletes the oldest backups of "current" every week.
Notice that I'm _never_ running a full backup of "archive" but I'm also _never_ deleting the backup.
Notes:
rsync or rsync over ssh is my preference for doing this kind of backup. It works very nicely, but I'm too tired to get it right just this minute so I'm leaving IV/full backup commands as exercises for other /. readers, but it's two 1-line scripts and I've seen them on here before :)
cron is fine for setting it up automatically.
wget has similar functionality to rsync for a website and you don't need any privileges.
I think most of /etc belongs in "current"
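For what it's worth, here is one possible reading of the "full" and "IV" backups described above, written as a small Python sketch that only builds the rsync command lines. It is not the poster's script. The host name, the /srv and /backup paths, and the directory naming are all made up for illustration. The IV variant leans on rsync's --backup and --backup-dir options, which move a replaced file aside instead of overwriting it, and it deliberately leaves out --delete so nothing is ever removed from the mirror.

```python
import shlex
from datetime import date

# Hypothetical host and paths, used only for illustration.
USE_HOST = "backup@use-server"
BACKUP_ROOT = "/backup"


def full_backup_cmd(tree: str, day: date) -> list[str]:
    """Copy the whole tree into a fresh, dated directory on the backup server."""
    src = f"{USE_HOST}:/srv/{tree}/"
    dest = f"{BACKUP_ROOT}/{tree}/full-{day.isoformat()}/"
    return ["rsync", "-a", "-e", "ssh", src, dest]


def iv_backup_cmd(tree: str, day: date) -> list[str]:
    """Update the live mirror, moving replaced files into a dated directory.

    No --delete is passed, so files removed on the USE server stay in the mirror.
    """
    src = f"{USE_HOST}:/srv/{tree}/"
    dest = f"{BACKUP_ROOT}/{tree}/latest/"
    old = f"{BACKUP_ROOT}/{tree}/changed-{day.isoformat()}"
    return ["rsync", "-a", "-e", "ssh", "--backup", f"--backup-dir={old}", src, dest]


if __name__ == "__main__":
    today = date(2005, 1, 2)  # fixed date so the printed commands are reproducible
    for cmd in (iv_backup_cmd("archive", today), full_backup_cmd("current", today)):
        print(shlex.join(cmd))
```

To actually run one of these from the backup server, something like subprocess.run(cmd, check=True) from a cron job would do, assuming ssh keys are already set up between the two machines.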
Do make sure you log the output of your syncing software. Also make sure you monitor disk usage. If you want to be fancy, it could keep all of the full-backups of "current" until space is short (with a reasonable margin) and then always delete as many of the oldest ones as it needs to to make enough room. This means your number of snapshots will vary with disk space - some people think that's evil.
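A rough sketch of the "fancy" variant, in Python: pick the oldest full backups of "current" to delete until enough space is free, and log each deletion. The "full-YYYY-MM-DD" directory naming and the keep_min safety margin are assumptions made for the example, not something from the post. Sizes are summed per directory, so this only makes sense for plain full copies, not hard-linked snapshots.

```python
import shutil
from pathlib import Path


def snapshots_to_delete(snapshots: dict[str, int], free_bytes: int,
                        needed_bytes: int, keep_min: int = 1) -> list[str]:
    """Return the oldest snapshot names to remove until needed_bytes are free.

    `snapshots` maps a name such as "full-2005-01-02" to its size in bytes.
    ISO dates in the name make alphabetical order match age order.
    The newest `keep_min` snapshots are never offered for deletion.
    """
    names = sorted(snapshots)
    candidates = names[: max(len(names) - keep_min, 0)]
    doomed = []
    for name in candidates:
        if free_bytes >= needed_bytes:
            break
        doomed.append(name)
        free_bytes += snapshots[name]
    return doomed


def prune(root: Path, needed_bytes: int) -> None:
    """Apply the plan to real directories under `root` and log what was removed."""
    sizes = {
        d.name: sum(f.stat().st_size for f in d.rglob("*") if f.is_file())
        for d in root.iterdir()
        if d.is_dir() and d.name.startswith("full-")
    }
    free = shutil.disk_usage(root).free
    for name in snapshots_to_delete(sizes, free, needed_bytes):
        print(f"deleting {root / name}")
        shutil.rmtree(root / name)
```

Keeping the decision in a pure function means it can be checked with made-up numbers before it is ever pointed at a real disk.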
This system scales reasonably well - for more size add more harddrives per server and/or more servers. For redundancy add more backups per live copy. As long as you can keep it organized and your network handles it, there's also no reason a USE server can't be served by two backup servers or a backup server can't also serve several smaller workstations - or any combination thereof.
Do not add multiple harddrives to a backup server for redundancy. These servers are essentially free and you get much more redundancy (and some scalability) if you use two backup servers. With a setup like this, any server should only have one copy (excepting multiple versions of the same tree).
You could just do a full backup of current every night or whatever, and you could have many possibly more complicated "current" backup schemes. But for me the total size of "current" is massively smaller than "archive" so it's really not important. Remember, having more of these isn't more redundant - they're all on the same drive.
This backup server should generally run no services except possibly ssh and certainly shouldn'
I have been in the same position the author describes, and I have come to ONLY negative conclusions. In a few words, and I hate to say this, buddy:
WE'RE FUCKED.
Digital is a loser's proposition. Backing up to analogue, or even to digital data on analogue substrates (such as DV tape), fails. Simply and purely.
The *only* thing that comes close is some kind of RAID, and those, even with the plummeting price of storage, are still too expensive given the needs.
Also, a RAID assumes a continuity of several things that are not likely to be continuous:
With Video:
Framerate, number of lines, colour depth, aspect ratio, file format, compression format, Operating system compatibility, etc etc etc. All of these things are variables.
With Audio:
Sample rate, compression format, bit depth, file format, etc.
Basically all of it points to very bad places.
I am fairly well convinced that our age will simply disappear. They will find our garbage, the few books not pressed on acidic paper, our paintings (fat lot of good the abstract stuff will mean to them) and drawings, and that's about it. The rest will just be shiny little bits of crap in the landfill.
Since we will have used up all the dense energy forms, they will be appalled at the energy requirements just to get the few remaining museum-piece devices to work. Archiving the 21st century will be impossible. To the 25th century, the 21st century will be seen as a dark age, not only for the holocaust of the die-off caused by the failure of the petroleum-based economy, but for the simple fact that very few of the information formats we are totally geared into will survive, including this note on /.
His problem of saving personal video is just the tip of the iceberg. His problem is the problem of our very civilisation, writ small.
That's why I am abandoning video, and going back to painting. In 500 years, my painting CAN survive. The video simply won't.
RS
Shoes for Industry. Shoes for the Dead.
Rsync ( http://rsync.samba.org/ ) is really great for backup of Unix-like systems. The ability to hardlink identical files allows me to store hundreds of daily full images of 100GB of sources to a single target 250GB hard disk. Rsync is very smart about moving only changed data over the network, resulting in speedups of 10x to 100x. This allows me to do full backup on my offsite colo without using a lot of bandwidth. Note that Rsync is great for Mac/Unix/Linux, but it does sometimes have problems with windoze clients. But then, so do I ...
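To show why hard-linked daily images cost so little disk, here is a toy Python version of the idea. It is not rsync and not Dirvish, only a sketch of the same trick that rsync's --link-dest option relies on: a file that is unchanged since the previous snapshot becomes a hard link to the old copy, and only new or changed files get copied. The function name and the byte-for-byte comparison are choices made for this example (rsync normally decides by size and modification time).

```python
import filecmp
import os
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def snapshot(source: Path, dest: Path, previous: Optional[Path] = None) -> None:
    """Copy `source` into `dest`, hard-linking files unchanged since `previous`."""
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        old = previous / rel if previous else None
        if old is not None and old.is_file() and filecmp.cmp(src_file, old, shallow=False):
            os.link(old, target)  # same data blocks, no extra space used
        else:
            shutil.copy2(src_file, target)  # new or changed file gets its own copy


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "src"
        src.mkdir()
        (src / "video.dv").write_bytes(b"x" * 1000)
        (src / "notes.txt").write_text("day one")
        snapshot(src, root / "day1")
        (src / "notes.txt").write_text("day two")
        snapshot(src, root / "day2", previous=root / "day1")
        same = os.stat(root / "day1" / "video.dv").st_ino == os.stat(root / "day2" / "video.dv").st_ino
        print("unchanged file shared between snapshots:", same)
```

Each snapshot directory still looks like a complete copy of the source, which is what makes restores easy, but the big unchanged video file is stored only once. Hard links only work within one filesystem, so all snapshots have to live on the same disk.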
Dirvish (originally written by jw schultz) is a Perl wrapper around Rsync. It facilitates the scheduling and management of Rsync based backups. We have a fairly active mailing list and contributions from around the world (open source is so cool!).
Backups should be safe against:
Backups should be automatic (or they will not get done) and cheap (hard disks are cheaper than tape, and much cheaper when you use hard linking). Rsync stores the data in a file system closely approximating the original, which facilitates restores.
If a cheap electrolytic filter capacitor dries out in your power supply, and the 5V output decides to start making a 15V squarewave instead, everything in your computer case will get fried. Including every one of the RAID disks. External USB enclosures (or airgaps!) protect against host and power supply failure.
If I was really paranoid about protecting my data, I would run a long ethernet cable to a nerdly neighbor a few houses away, and put a second dirvish server there. While I do rotate my drives into ziplok bags in a fire-resistant safe, the maximum credible accident (a furnace explosion) would tear open the firesafe. If I was paranoid and rich, I would use a high bandwidth VPN connection to a big disk in a colo machine in a different city.
The best backup is server-pull, frequent, automated backup onto multiple R/W media in multiple places, and frequent checking of that data. The closer you can approximate this, the more secure your data will be.
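The "frequent checking of that data" part is the one people skip. A minimal sketch of one way to do it in Python: record a SHA-256 digest for every file when the backup is made, then re-hash later and report anything missing or changed. The function names and the manifest layout are invented for this example. Dirvish itself is not involved.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so multi-gigabyte video never sits in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the files that are missing or whose contents no longer match."""
    bad = []
    for rel, expected in manifest.items():
        path = root / rel
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(rel)
    return bad


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "photo.raw").write_bytes(b"pixels")
        manifest = build_manifest(root)
        (root / "photo.raw").write_bytes(b"pixe1s")  # simulate silent corruption
        print("damaged files:", verify(root, manifest))
```

The manifest should be stored somewhere other than the disk it describes, otherwise one failure takes out both the data and the means of checking it.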
Keith
Keith Lofstrom server-sky.com
For the past five or six years, I've been taking my data, applying steganography techniques to encrypt it into the background of porn images, and then distributing those images via usenet and a few porn sites I've whipped together (ok, ok, the bangbus videos.)
At any time when I need to recover the data, I just use google to find someone with a copy of my data, download, decrypt, and voila!
This is my cheapskate's Network Storage Device!
Information now is much more permanent than it was in the past, and digital has improved this a great deal. The amount of information we generate these days is enormous, far more than ever before the digital age. Thus it's not surprising that much of it gets destroyed. For that matter, most of it isn't worth saving anyhow.
Books are not such a permanent medium as you might think. They wear out, and can be destroyed. A good example is the Mayan Codices. Records seem to indicate there were thousands, however Spanish priests burned them as "works of the devil" during the European conquest of the Americas. Today only 4 remain.
Digital data can be so permanent because it is so easily copied. Permanence of data does not come from trying to make a single, eternal copy, but from having many copies all over the world. Digital data can be copied for essentially zero cost very easily. Thus it's easy to give it a great deal of robustness. Also, as new formats come out, you simply copy and convert the data. I have data on my harddrive today that originally existed on 5.25" floppy for the Apple II. It has simply been copied and converted a number of times.
Finally, it's not like books are going away. On the contrary, we publish millions of works a year amounting to billions of books.
You seem to have a false sense of permanence, as though in the past things were archived forever. That's not the case. Actually, most data was lost, and that's one of the reasons we have such an incomplete picture of history. You don't even know all that was lost, because the record of it even existing, if there was one, is also lost. What has survived is by chance, or by effort, not because we had some wonderful archival system.
You don't have to have something on an immutable, indestructible medium for it to survive. The Nordic Legends weren't written down for centuries, yet today we still have them. They were passed down as an oral tradition for generations. There was no permanence to them other than stories in people's minds, yet they've survived thousands of years.