Automated CD/DVD Archival?
An anonymous reader asks: "Our department used to use a Cedar Technologies Desktop CD-R Publisher for fully automated backup of data (~2 CDs per day) controlled by a Linux PC. The publisher just broke and we are looking into a new backup solution to automatically burn and print CDs or DVDs. Solutions for CD/DVD duplication are available for Windows and Mac (for example: Primera and Rimage [which acquired Cedar in 2000]) but not for Linux. While a Mac would be OK, none of the manufacturers seems to offer scriptability or a command line interface which is essential for our task. Tape and HD backup are not an option - the data is already mirrored on RAIDs. Has anyone set up a similar archival system using Linux?"
You can take it home or to the beach on the weekends.
Just go with a SCSI jukebox; it should work fine on Linux
with "mt", "mtx" and some shell voodoo; ours did.
You might perhaps check if your vendor supports standard
SCSI commands, though.
Tape and HD backup are not an option - the data is already mirrored on RAIDs.
What does one have to do with the other? Why are you insisting on CD backups when there are superior solutions available? It sounds like you're intent on duplicating a poor solution instead of examining the problem as a whole and finding the best solution for your needs.
Interested in open source engine management for your Subaru?
Get a DVD burner and write a Perl script that, once a week, makes a complete tarball of the directories to be backed up and burns it to DVD (compressed, preferably, to save space), and every day writes a tarred diff of those directories to the DVD. You could also set it up to send you an email, or whatever, with a manifest.
Next silly question?
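The weekly-full-plus-daily-diff scheme described above could be sketched roughly like this. It is in Python rather than Perl, and it treats "diff" as "files modified since the last full backup", which is an assumption on my part; deleted files are not tracked, and the burning step is left out. The function names are made up for illustration.

```python
import os
import tarfile


def changed_since(root, since_epoch):
    """Yield paths of regular files under root modified after since_epoch."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and os.path.getmtime(path) > since_epoch:
                yield path


def write_backup(root, archive_path, since_epoch=0.0):
    """Write a gzip-compressed tarball of files changed after since_epoch.

    since_epoch=0.0 gives the weekly full backup; passing the timestamp of
    the last full backup gives the daily "diff". Returns the manifest
    (sorted list of archived paths), which could be emailed.
    """
    manifest = sorted(changed_since(root, since_epoch))
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in manifest:
            # Store paths relative to root so the archive restores anywhere.
            tar.add(path, arcname=os.path.relpath(path, root))
    return manifest
```

Keep the archive outside the directory being backed up, and record the time of each full backup somewhere so the daily run knows what to compare against.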
Sounds exactly like what you need. There are many more tools like that. Good luck.
Sincerely,
Pan Tarhei Hosé, PhD.
"Homo sum et cogito ergo odi profanum vulgus et libido."
Recently at work I had to recommend an automated cd printing / burning system, and I went with the Rimage 2000i. We're a mac-only design shop in nyc, and needed the machine not for backup, but more for one-offs with automated labels, in a machine that was networkable.
One of the things on the Rimage website that's kind of misleading (at least it was to me) is that it NEEDS a Windows PC in order to share the Rimage machine with other machines, like a Mac. But once it's set up, the machine works wonders.
What's interesting about this machine though, is that despite the ridiculous setup hurdles, after it all works they provide a fairly decent way of writing your own scripts to control the machine.
The entire device uses xml files in order to handle job requests, and the client they ship it with is actually just a beast of a java app. But the xml files are used for the imaging orders, the production reports, everything. They also have a fairly extensive sdk that allows you to do pretty much anything.
I had an unfortunately difficult time setting this thing up, but the tech support (while their english was a little lacking) were actually incredibly knowledgeable. One of the things they told me was that almost no-one who buys this device uses the provided client. It is designed to be integrated into custom work solutions, so for you this might actually be appropriate.
If you're looking for a solid dvd archival device / printer that has an autoloading function and is fully scriptable, the Rimage 2000i (or any of their devices higher end than that one) could work.
If he says "Tape and HD backup are not an option", well,
I'd think he has considered all the other solutions.
There are cases where CD archiving is the only solution.
For example:
- Legal requirements of read-only media. My case.
- If archives must be guaranteed readable by common
hardware (and I mean COMMON, try buying a tape reader
in your favorite supermarket...)
Sure, backup on CD is a pain, but this was not
his question.
Take an old Linux box with a burner. Set up an automated backup burn... and use a LEGO Mindstorms setup to pull out the disc, slap an automatically printed label on, and put a new, blank disc in. Sure, it's not the most cost-effective or efficient way, but it would be damn cool looking.
Amanda will back up to CDs, no problem. www.amanda.org
Probably end up costing you $0.50 an hour and your data will have the added benefit of being held off-site.
I would not name my company Rimage -- I wouldn't be able to stomach the illicit alternative connotation.
Hire a monkey. Or a college student if you want cheaper.
I must say I have been horrified by my experience with a Primera Bravo box. Not because it's bad -- it's really great -- but because there are no linux drivers, and the Windows methods are absurdly awful.
To burn a disk, you go into a GUI and mock up (or just load) an image to print on the disk.
Then, you print it to file -- something.prn
Then you go into another GUI and set up the task, picking an ISO image, and the image file you just made, click here, click there, then burn.
That works just fine for 40 of the same disk, but if you want a different image on each (different date or different text, or the ISO filesize) you need to make each change manually (or with tags) and then print to file and then set up each task.
Man.
In unix/linux, or with command-line tools for windows, even, that would be:
create_postscript_with_substitutions [inputs] > printerfile.prn
burn_image_and_print isofile printerfile.prn
Done. You'd be able to do everything this guy wants and more with 10 seconds of typing. You'd be able to automate processes. And it's not hard. Primera's been selling this stuff for years, and yet, no Linux support, and no command-line support.
If this had Linux support, or even DOS command-lines, I'd recommend it to everyone I meet. As is, it's an anchor.
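As a rough illustration of what the create_postscript_with_substitutions step above might look like, here is a minimal Python sketch that fills a per-disc label from a template. The template layout, the placeholder names and the function names are all hypothetical; nothing here talks to a Primera device, and whether a given printer accepts plain PostScript is an open question.

```python
from datetime import date
from string import Template

# Hypothetical PostScript label; the $-placeholders are invented for this sketch.
LABEL_TEMPLATE = Template(
    "%!PS\n"
    "/Helvetica findfont 14 scalefont setfont\n"
    "72 720 moveto ($title) show\n"
    "72 700 moveto ($burn_date) show\n"
    "72 680 moveto ($iso_size_mb MB) show\n"
    "showpage\n"
)


def ps_escape(text):
    """Escape characters that are special inside a PostScript string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def render_label(title, burn_date, iso_size_bytes):
    """Return label PostScript with the per-disc values substituted in."""
    return LABEL_TEMPLATE.substitute(
        title=ps_escape(title),
        burn_date=burn_date.isoformat(),
        iso_size_mb=iso_size_bytes // (1024 * 1024),
    )


if __name__ == "__main__":
    print(render_label("Daily backup", date.today(), 650 * 1024 * 1024))
```

With something like this, a different date, title or ISO size on every disc is one function call per job instead of a round trip through two GUIs.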
I'd like to do (somewhat) automated backups as well... though more on a home-size scale...
I'm trying to back up my pictures and my MP3s on a regular basis, but at 15 GB for one and 56 GB for the other, even putting them on DVD can be a pain.
Every app I've used (at least on XP) can automatically "split" the files over multiple discs, but they all use their own format for it, making recovering the files difficult if not impossible without the original program.
Is there anything that will split on the file structure so I can just read the files like they were burned normally?
Right now I keep adding and removing files from the "to burn" list, trying to get as close as possible to each disc's limit, and then do the same for the next disc... which makes me put off my backups for longer than I should....
Wiwi
"I trust in my abilities,
but I want more then they offer"
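Splitting "on the file structure" as asked above is a bin-packing problem: whole files go onto discs, nothing is cut in the middle, and every disc stays readable on its own. A minimal sketch using the first-fit-decreasing heuristic follows. It is not optimal packing, the function name is made up, and the disc capacity is whatever you pass in.

```python
def pack_into_discs(file_sizes, disc_capacity):
    """Group whole files into discs using first-fit decreasing.

    file_sizes: dict mapping path -> size in bytes.
    Returns a list of discs, each a list of paths. Files are never split.
    """
    discs = []  # paths assigned to each disc
    free = []   # remaining bytes on each disc
    # Largest files first; ties broken by name so the result is repeatable.
    for path, size in sorted(file_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        if size > disc_capacity:
            raise ValueError(f"{path} does not fit on a single disc")
        for i, room in enumerate(free):
            if size <= room:
                discs[i].append(path)
                free[i] -= size
                break
        else:
            # No existing disc has room, so start a new one.
            discs.append([path])
            free.append(disc_capacity - size)
    return discs
```

For single-layer DVDs you would probably pass something a bit under the nominal 4.7 GB to leave room for filesystem overhead; the exact margin is a guess you would have to test with your burning software.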
Burned DVDs and CD-Rs have piss-poor shelf life for archival material. RAID arrays can disappear at the snap of a RAID card's whim (yes - it happened to me last week).
Please - re-evaluate your solution before you actually need to recover data or get (gasp) audited and need every single file to work 100% for the auditors.
Recordable optical media sold retail are not up to the standards of archival requirements of most governing bodies (like the SEC). There are optical WORM drives that are used by medical data centers and hold only 30GB on a huge platter. And most of those are getting retired for other methods.
(this is not a plug, it's just what works best for us)
There are many tape solutions like Exabyte's VXA-3 with 160GB native storage space on an $80 tape. Granted $/GB is higher than a DVD-R - the tape will not let you down. The tape is equal to ~35 DVDs and writes at 500MB+/minute.
We have an Exabyte autoloader with 10 tapes on a weekly rotation - and it was as close to heaven as it gets when we needed to restore a server. We also back up >400GB of data weekly from a few of our database servers - and need it to be there when something fails.
The entire rig will set you back about $3000 including tapes. This will give you over a TB of backup space. And the tapes are archival ready.
For your use (1-2 CDs/day), each tape will last you about 3-4 months. It will also allow you to rotate your backups off site, and give you much better utilization of space and a much higher chance of recovering this data years down the road.
Please reconsider your backup solutions... if it's worth saving at all - it's worth being able to get it back later.
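The capacity figures in the post above are easy to sanity-check. This back-of-the-envelope calculation assumes 700 MB per CD-R and 4.7 GB per single-layer DVD-R, neither of which is stated in the post, and it ignores compression and filesystem overhead.

```python
TAPE_GB = 160   # VXA-3 native capacity quoted above
CD_MB = 700     # assumed capacity of one CD-R
DVD_GB = 4.7    # assumed capacity of one single-layer DVD-R


def days_per_tape(cds_per_day):
    """Days until one tape fills up at the given daily volume of CDs."""
    return TAPE_GB * 1000 / (cds_per_day * CD_MB)


print(round(TAPE_GB / DVD_GB))   # DVDs per tape
print(round(days_per_tape(2)))   # days per tape at 2 CDs/day
print(round(days_per_tape(1)))   # days per tape at 1 CD/day
```

Under those assumptions the "~35 DVDs" figure holds up, and the 3-4 month estimate matches the 2 CDs/day end of the range; at 1 CD/day a tape would last roughly twice as long.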
Check out Microtech's ImageAutomator line.
It's Windows-based, but I've set up a few of them to pull their data from a Samba share. Think of it as an appliance. I wrote some software on the Linux side to control it - they have specs available for their file formats if you want to explore writing your own. That software has probably done more than 30K CDs, so it's definitely workable.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Product Brochure
This is probably overkill, but it is a really cool piece of equipment, and it doesn't rely on shitty Windows software to do its job. Unfortunately, it costs $10000 fully loaded with 4 DVD-RW drives.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
RAID isn't going to help you if your building goes up in flames.
RAID isn't going to help you if a file is deleted accidentally.
RAID isn't going to help you when someone comes in and steals your boxen.
In 15 years I have never, ever, EVER recommended that someone back up to optical media as their only recovery method. DAT drives can be had in the sub-$200 range, and the $/MB cost is cheaper than DVD media - and much more reliable.
I realize that this doesn't really answer your question, but it's an important point that shouldn't be overlooked.
http://unixhelp.ed.ac.uk/CGI/man-cgi?split
http://unixhelp.ed.ac.uk/CGI/man-cgi?cat
You can get them in XP through Cygwin.
You can't judge a book by the way it wears its hair.
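For anyone who would rather not install Cygwin, the byte-level split and rejoin that split and cat do can be sketched in a few lines of Python. The function names and the ".000" suffix scheme are made up here. Note that, like split, this cuts the archive at arbitrary byte offsets, so the pieces are only useful once they have all been joined back together.

```python
import shutil


def split_file(path, chunk_bytes):
    """Write path.000, path.001, ... each at most chunk_bytes long.

    Each chunk is read into memory in one go, which is fine for a sketch
    but would want buffering for DVD-sized pieces.
    """
    parts = []
    with open(path, "rb") as src:
        while True:
            data = src.read(chunk_bytes)
            if not data:
                break
            part = f"{path}.{len(parts):03d}"
            with open(part, "wb") as dst:
                dst.write(data)
            parts.append(part)
    return parts


def join_files(parts, out_path):
    """Concatenate the parts in order, like `cat part1 part2 > out`."""
    with open(out_path, "wb") as dst:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, dst)
```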
on linux for a few years now. It's actually a fairly quick hack that I've never gotten around to fixing up. Each of my computers runs a backup program at 4am on Monday morning and saves a tar.gz of the dirs. The tar.gz is owned by an account that is not used by anything else. Then, at 5am, my main computer (the one with the burner) uses scp (with DSA ID keys) to download the tar.gz files. It then runs mkisofs and cdrecord, so that I have a CD-RW backup of my data.
The programs are pretty crap (only very basic error checking and file size checking are implemented), and I often forget to put a CD-RW in the drive, but it works pretty well. The other main problem for your situation is that you need to burn multiple CDs per backup. It seems to me that a second burner (or a DVD burner) would work better than trying to get a system that can deal with multiple discs.
From my point of view, you want a dedicated machine to do this. A network-writable Samba drive (limited to the maximum size of the burnable media?) would be my choice. Then you can use generic backup software to back up the necessary files on whatever OSes you are running.
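The fetch, mkisofs, cdrecord pipeline described above can be sketched as a list of commands driven from one script. This is not the poster's actual program: the host names, paths and the device string are placeholders, and the right dev= value depends on your kernel and cdrecord version. By default the sketch only prints what it would run.

```python
import subprocess


def backup_commands(hosts, remote_path, staging_dir, iso_path, device):
    """Build the command lines for one backup run.

    hosts, remote_path, staging_dir, iso_path and device are placeholders
    for illustration; adjust them to the local setup.
    """
    cmds = [
        ["scp", f"{host}:{remote_path}", f"{staging_dir}/{host}.tar.gz"]
        for host in hosts
    ]
    # -R adds Rock Ridge and -J adds Joliet names, so the disc reads on most OSes.
    cmds.append(["mkisofs", "-R", "-J", "-o", iso_path, staging_dir])
    cmds.append(["cdrecord", "-v", f"dev={device}", iso_path])
    return cmds


def run_all(cmds, dry_run=True):
    """Print the commands, or run them and stop at the first failure."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(backup_commands(["alpha", "beta"], "/backup/latest.tar.gz",
                            "/tmp/stage", "/tmp/backup.iso", "/dev/cdrw"))
```

Using check=True means a failed scp or mkisofs stops the run before anything gets burned, which covers a little of the error checking the post says is missing.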
Don't save Windows XP! http://www.petitiononline.com/jjw1xp/petition.html
This is exactly what the new Automator feature in Tiger (the new Mac OS, supposedly out next month) is for. The original demo of Automator involved someone setting up an automated DVD burn of a slideshow movie every so often.
Granted, this doesn't solve your problem TODAY, but it makes it ridiculously easy within a month or so.
http://www.apple.com/macosx/tiger/automator.htm
Depending on your needs, you might be able to get away with a handful of DVD burners. If, for example, you know that a backup will always take less than 3 DVDs, buy 3 or 4 burners. Then, make your backup, split it, and write a piece to each burner.
Spending a few hundred dollars on burners may be better than spending a few thousand on a robot.
--
There are many comments about how RAID isn't a backup scheme. While that's mostly true, I've known some shops that actually do use a RAID1 array as a backup scheme.
Not sure of all the details, but it works like this:
1. Set up 2 disks with RAID1 mirroring. Designate one as the backup disk.
2. Copy files to the mirror
3. Once a week, unmount the backup disk, take it out and insert a new disk.
4. Turn on the machine. When the array comes back online, the disks sync.
I'm still trying to figure out if this is a wise idea, but it's low-cost and effective. 1 Device and 4 decent-quality 200GB disks will cost you what-- $1200? How much do 200GB tape solutions cost now?
Anyone hear of some similar solutions?
"Can of worms? The can is open... the worms are everywhere."
$0.02,
ptd
I'm an animal lover -- they're delicious!
http://www.akfentertainment.com/vcic/encore.htm
How well it's supported is anyone's guess. We have an old Cedar CD duplicator that I would love to find the protocol for. The strange thing is that the robotics are controlled via SCSI.
Anyone know how to capture SCSI data?
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
2 years ago-
i te_series.asp
The worst tech support experience I've consistently had was with Rimage, and it lasted over a month!! All I was trying to do was replace a broken drive that they had assured me was standard way back in the presale days. They had no replacements and wanted me to buy a new unit at $6k. My old software, which was great in its earlier days, had gone through some 'improvements' post-Veritas buyout and wasn't working anymore. Rimage had a new $$product that was supposed to replace all these features and more. I demoed it all and it was horridly inefficient: IIRC, 4 CD-R drives took 3 hrs to make 4 CDs, never mind the crashes. The old unit with 4x drives was much faster. Then they too told me nobody actually uses this out of the box; they just buy the dev kit and customize it. Then they recommended I do the same for my team of developers. Sounds like a handy tech center stock answer. Guess what: I don't have a team of developers! I suggested they use their team of developers to make their own stuff work as advertised.
After much strife they finally put me in touch with someone at another company who could provide me with a copy of their customized firmware so I could replace my drives with identical ones. The robotics finally gave up the ghost a year later.
My replacement for it was from a lesser-known company, handled through Discmakers. DiscJuggler runs it with a special autoloader product called ImageJuggler, which works very nicely (you can also use DJ.NET, and they have a web interface too). They could improve a few things, but all in all the robotics are top notch and simple: no weird firmware, replaceable (upgradable!) drives (as long as your burning app supports them, and DJ handles many), and I can easily burn different images simultaneously. Furthermore, support from both Discmakers and DJ was great.
http://www.discmakers.com/duplicators/products/el
And yup, it's only for Windows, but at least you get a browser UI.
Firefox &
Copy shows I tape onto a DVD, with the interface done by a script. Any possibility? That'd be a good Ask /.
#hostfile 0.0.0.0 primidi.com 0.0.0.0 www.primidi.com 0.0.0.0 radio.weblogs.com
Rimage Developers API
The Rimage Client API
- Tight integration with your application
- Order and system status in real time - no polling
- Fault tolerance
- APIs in C, C++, VB, Java, .NET
- Industry standard XML support
- Multi-platform support
Rimage Network Publishing Multi-Platform
- Monitors a network folder for text file orders
- Generates XML from file orders and submits to Rimage system
- Multi-platform support
Based on poking around in the SDK (our company owns the DiskLab), it'll do almost anything you could possibly imagine. I'd call Rimage up and talk to a developer -- they're friendly and quick on advice.
If you read the friggin' article summary, you would see that what is needed is an automatic CD-R loader, not just the software to burn CDs.
We're using DVDs to back up 5 2TB NAS servers of video files. Our hardware is a 4-burner Flexwriter, similar to this one: http://www.amtren.com/products/sa4.shtml
The drivers come from Padus. We don't use their GUI much (DiskJuggler and ImageJuggler)...instead we primarily build jobs and submit them from another machine via a Perl script and their command-line tool, pfcnet. The level of automation we needed for this project simply required scripting. I haven't looked into their DJ.net product...it may have good potential.
If you like stats, we have the capacity of burning about 400 DVD's per day, but rarely hit it. The input spindle holds 200 disks. We don't often make it through the night without a failure that halts the burning. Don't get me wrong...the machine is a workhorse, but it should not be considered a lights-out unit. After we got our system built and pretty well tuned, we could pretty reliably back up one NAS (1200 files, 2 TB) in about two weeks. This includes a fair amount of idle time, especially on the weekends.
Also note that we're burning one file per disk; 60% are 1GB and 35% are 2GB. Project specs indicate that we only want one video (file) per disk. These files will never change (many are encoded videos 30 years old!), and must be accessible both by set-top DVD players as video and as ready-to-edit .mpg files on a computer. Fortunately, most modern set-tops will play .mpg files from a non-video DVD, so we don't have to author these disks.
So far, we've burned about 5000 disks holding about 6TB of data. Disk failure rate is running around 5%. The Padus drivers do a good job of reburning failed disks, but with this many disks and the automation behind it, some still fall through the cracks. As such, we're printing a barcode on each disk, and all disks will be scanned at least twice so we can catch the missing ones.
Our biggest bottleneck initially was network speed. Over 100 Mbit ethernet, we could not keep four burners busy at 4x. We attempted to upgrade the machine to gigabit, but design limitations forced us to choose between fast network or a fast hard drive, and we decided the hard drive was more critical. Part of this was due to a high failure rate burning at 4x, and anecdotal advice that 2x disks have better shelf life than 4x.
After backing down to 2x burn, 100Mbit is adequate to keep the burners working full-time. The bottleneck now seems to be CPU utilization while burning and network transfers are both occurring.
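The network numbers above line up with a quick calculation, assuming the standard DVD 1x rate of about 1.385 MB/s (roughly 11.08 Mbit/s) and ignoring protocol overhead. The function name is just for illustration.

```python
DVD_1X_MBIT = 1.385 * 8   # DVD 1x is about 1.385 MB/s, i.e. about 11.08 Mbit/s


def required_mbit(burners, speed):
    """Sustained network rate needed to feed `burners` drives at `speed`x."""
    return burners * speed * DVD_1X_MBIT


print(round(required_mbit(4, 4)))   # four burners at 4x
print(round(required_mbit(4, 2)))   # four burners at 2x
```

Four burners at 4x need roughly 177 Mbit/s, well beyond 100 Mbit Ethernet, while four at 2x need roughly 89 Mbit/s, which fits on paper but leaves little headroom once real-world Ethernet overhead is counted.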
All of the machines (other than the NAS's) are Windows boxes.
We received good support from our vendor (Amtren) and outstanding support from Padus. Both worked closely with us to resolve several software and hardware issues. Both admitted that we were pushing the system harder than any of their other clients and were eager to see us succeed. Extra special kudos go to Fred P. at Padus! :-D
...
In a related project, we will be producing video DVD's on demand. Customers can choose one or more episodes. Those files are authored into a DVD, complete with an onscreen menu with titles, semi-custom graphics, a custom printed label on the disk and a mailable jacket sleeve. This process is up and running now, with no hands-on intervention, but hasn't been launched yet. I've written a perl module that facilitates this, weaving the Padus software in with dvdauthor and ImageMagick to provide an end-to-end solution.
...
That's all the trivia I can think of. I'm happy to discuss the project with anyone who's interested.
-dave
Who the hell thought that using a DVD array would be better than a disk array/SAN? Just the thought of a mechanical anything moving constantly to keep up with random access makes my head hurt.
;-)
If you don't have at least 16 DVD readers it's pointless. You'd need at least that just to saturate SCSI bandwidth. And all those moving parts. GAASAAAH.
And $10000+ when 1TB RAID would cost the same at the time of purchase... double GAAAH.
*Cough cough* *wheeze*
I sure hope they fired the guy who thought of that brilliant idea.