Ask Slashdot: >2GB Backup Software for Linux?
Fer asks:
"Are there backup programs for
Linux that do not have filesystem or volume
size limits? I am trying to make a full backup of
a 22 GB FTP server, using 4 GB TR-4 tapes. I have tried tar,
dump, afio, taper, and afbackup, and every one of
them either did not allow >2GB volumes or had
weird problems with >4GB filesystems. Currently
I am using dd to do the job, but I think there
must be another option. Any suggestions on free
programs which I may use?"
Legato Networker works great and can be used in the enterprise network.
Try this command and see if it works
mkisofs -R /filesystem/ | cdrecord -v fs=6m speed=2 dev=0,0 -eject -dummy -
and if it does, remove the -dummy to make the real thing
mkisofs -R /filesystem/ | cdrecord -v fs=6m speed=2 dev=0,0 -eject -
You may have to change some of the flags. If your CDR won't do double speed, remove the speed= flag (or change it to 4 if you are lucky enough to have a 4x drive), and if the drive is at a different SCSI id you'll have to change the dev= flag. I find that the fs= flag is not necessary but that it doesn't hurt either.
One caveat: if the filesystem is larger than 650MB you will have just made yourself a coaster. You can check the file system size with
du -s /filesystem
or, for more accurate results,
mkisofs -print-size /filesystem
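The size check could be wrapped in a small helper before burning. This is only a rough sketch: the function name fits_on_cd and the 650 MB constant are assumptions made for illustration, and du only approximates the final image size (the ISO filesystem adds its own overhead).

```shell
#!/bin/sh
# Rough pre-flight check: does a directory tree look small enough for a CD-R?
# CD_LIMIT_KB is an assumed 650 MB capacity; adjust it for your media.
CD_LIMIT_KB=$((650 * 1024))

# Usage: fits_on_cd DIR [LIMIT_KB]
# Returns 0 if du reports no more than LIMIT_KB kilobytes for DIR.
fits_on_cd() {
    dir=$1
    limit_kb=${2:-$CD_LIMIT_KB}
    # du -sk prints "<kilobytes><TAB><path>"; keep only the number.
    used_kb=$(du -sk "$dir" | cut -f1)
    [ "$used_kb" -le "$limit_kb" ]
}
```

Something like `fits_on_cd /filesystem && echo "should fit"` would then gate the mkisofs | cdrecord pipeline.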
Intel 32-bitness, or any other kind of system architecture, has nothing to do with >2GB file size problems. 64-bit and larger integers are used all the time and you probably don't even think about it: 128-bit encryption, NTFS, timestamps, version numbers on MS binaries, etc. It's just a bit more of a bother on non-64-bit processors.
When we had 16-bit DOS, did it mean we could only have 64K files? You might say, "but 32-bit only addresses 2GB (signed) of memory or whatever". Programs don't typically load 2-4GB worth of file at a time anyway, but that doesn't mean we can't or shouldn't use the contents of an 80GB file just because we have a 32-bit processor.
A true 64-bit file system can exist on a 32-bit machine, which would lead to having >2GB files. In fact, if you abstract the filesystem from the OS well, allow for 64-bit references, you could have any kind of large file system, like SGI, WinNT, etc... Maybe he just needs to expand the API / working integer sizes in the kernel to support true(r) 64-bit file systems. Maybe that's what this is all about.
Yes, the reason is not the speed, or whether it does backups properly; it's the cost of operation and the reliability. Travan tapes cost between $15 and $25 each, where DAT tapes are around $5-10. DLT tapes are more expensive, but have a capacity of over 35 gig uncompressed. Travan tapes are limited to several hundred backup-restore cycles, while DLT/DAT are in the thousands, which means that they produce fewer errors. DLT tapes are even cooler: they have a shelf storage life of up to 30 years, compared to the 5 years of Travan (I don't know about DAT tapes).
We use a program called Lone-Tar at the ISP where I work. We have a couple of +2gig partitions and it handles them nicely. It also has internal dialogs for setting up cron jobs, and tons of other stuff. I like it a lot, as much as one can like a backup program I guess. Lone-Tar.com.
- ...around ten boxes running Linux 5.2: Red Hat Linux 5.2, I assume - or perhaps SuSE. There is no Linux 5.2 as of yet, though - it's 2.2.
- Wait ages for tar?: I haven't noticed tar being slow. What in particular is wrong with it for backup? That's kind of its point - tape archiver.
- Textmode amanda?: What's wrong with textmode? If something does the work, why is it bad if it's textmode?
There are plenty of people talking about how GNU tar is perfect for what they need. I'm not sure exactly how the lack of GUI backup tools somehow makes Linux not 'enterprise-ready' - perhaps you'll enlighten me? After all, if there is a deficiency in this department, it takes people who actually need to use this sort of thing to point it out so it can be fixed.
Posted by FascDot Killed My Previous Use:
If ANYONE suggests Legato Networker, run away as fast and as far as you can.
We had this piece of crap installed on Netware and it SUCKED. Oh, it backed up and restored just fine. But it was literally an all day event to restore a single file. And the "user interface" (in quotes because it barely qualified) was the WORST I have ever seen for ANY program. And it required numerous patched NLMs to handle file-locking correctly. And it still brought down our servers regularly.
--
"Please remember that how you say something is often more important than what you say." - Rob Malda
If I want to backup 4G or 8G of data, and I don't want to have to switch media halfway through, what should I buy?
Bonus points if it's media that is still likely to be in fashion (and thus easily readable) five years from now.
I'm leaning toward the idea of not using tapes for backups at all, but just cloning everything to a spare hard disk, that is only ever mounted during the backup process.
If you take large amounts of backup and need to store it off site, you need tape. Tape is much cheaper than harddisks. Only maintaining one backup is also foolish. Media fails. So you need two harddisks anyway. And are you going to have 20 of those 22GB harddisks for backups? When you transport the harddisk for offsite storage, the heads might fail. Tape has not been superseded.
I would recommend the Amanda backup system, which we have used at work for many years and which deals nicely with these problems.
The 2GB filesize is actually a VFS limitation, so it applies to all filesystems. But it only happens on 32-bit Linuxae. And there's a patch that addresses it. Other 2G limits: MS-DOS FAT partition size (addressed by FAT32 or a real FS), IDE on non-LBA BIOSes (can be worked around).
If you want to use files larger than 2 GB (the largest number that will fit in a signed 32-bit integer), then get a 64-bit system to put them on. If you want to have simple, efficient, easy to code and easy to port seeks within your files, then you're going to want to be able to use a signed integer to seek back and forth (let's not even mention the trouble with mmap() if your files are larger than a pointer can index...)
The *last* thing Linux should do about 2GB files is try and use hack after kludge to satisfy people who want to use Intel chips but don't want to hear about their limitations.
On some Linux installations, I have used the BRU 2000 backup software. It costs a couple of hundred $$$, IIRC, but it is really excellent software with many features. So if you are willing to spend some money, that should work for you. I must defer to others, however, in the area of doing it with free tools.
Tape is more reliable. And if you ever work for a company that is audited by its shareholders regularly you will find that you are required to keep backups around for quite some time. For instance, a financial institution may keep backups of the transaction journal forever! An insurance company I once did work for was required to keep weekly grandfather and daily incremental backups for 5 years... that's a lot of storage for HDDs :-)
--------------------------------------------
bash# lynx http://www.slashdot.org >>/dev/geek
Matt on IRC, Nick: Tuttle
Yes, for systems under 100 GB which don't need to keep historical data, backing up to disk is feasible, and in many circumstances is better than tape. However, that doesn't mean that tape has "been superseded" entirely.
afbackup is pretty painless to set up, does speedy backups, can run over ssh, prompts by email when tape changes are needed, and does reasonable restores of entire backup sets, but is very slow for selected file restores.
burt is wicked fast for backups, tcl-based interface, imho elegant, and can run over ssh. afbackup was better documented and offered an emergency restore option that i preferred at the time.
i ruled out amanda because it is complex and tends to want a holding disk the size of an entire backup set.
about sean dreilinger
I use standard GNU tar v 1.12 to back up several systems to 12G DAT tapes; and have never had a problem using the '--multi-volume' switch to put an 18G filesystem on 2 tapes.
I've had problems with Tar (and the other commands) as well when the number of files I had was extremely large. This is regardless of file size, eg: I had 200,000+ small files, but storage size was about 1 gig or so and I had problems with tar.. I ended up just breaking down my backups into batches. something like:
tar -cf part-a-of-tree.tar /files/parta
tar -cf part-b-of-tree.tar /files/partb
etc...
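The batching idea above could be automated with a loop. This is a minimal sketch, assuming one archive per top-level subdirectory is a sensible split; the function name tar_in_batches and the output layout are made up for illustration.

```shell
#!/bin/sh
# Archive each top-level subdirectory of a tree into its own tar file,
# so no single tar run has to walk the whole tree.
# Note: files sitting directly in SOURCE_DIR and hidden directories
# are not picked up by the glob and would need a separate run.
# Usage: tar_in_batches SOURCE_DIR OUTPUT_DIR
tar_in_batches() {
    src=$1
    out=$2
    mkdir -p "$out" || return 1
    for part in "$src"/*/; do
        [ -d "$part" ] || continue      # glob matched nothing
        name=$(basename "$part")
        # -C makes the archived paths relative to SOURCE_DIR
        tar -cf "$out/$name.tar" -C "$src" "$name" || return 1
    done
}
```

Calling `tar_in_batches /files /backup` would leave /backup/parta.tar, /backup/partb.tar and so on, each of which can be listed or restored on its own.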
-Booya "No Try Not. Do or do not, there is no try." -Yoda
15 Linux boxes, 2 HP9000s, 5 IRIX, a dozen NT, handful of Macs, 4 VMS, and one NetApp toaster backed up through Linux NFS-mount (poor Linux 2.0 NFS performance is an advantage here... when mounted on the HP it pretty much killed the network).
The interface isn't that terrible. 5.5 is much better. We have had intermittent problems backing up a 36GB RAID filesystem (Linux 2.2) though.
It's far from free, and the server requires NT or a commercial UNIX. We run it on HP-UX 10.20. But for multiplatform backup on a high-end tape changer robot, you need to go the commercial route.
True, put in that context the previous 4 or 5 posts are true and okay I see circumstances that tapes are useful (and probably necessary) .. for backing up financial records for tax purposes and for multi-terabyte databases.
But,
I was responding to the article about backing up just 22GB, and for the home or semi-pro user tape is just not worth the hassle any more (tapes stretch, to counter the 'heads crash' argument) when multiple disks (can you say RAID, even, if you want online reliability of data) can be had for much better performance.
Okay I admit maybe 'superseded' was a bit strong given the arguments raised, but I think anyone has to admit that storage capacity of HDs (vs. cost) has shot up incredibly against tapes in the last few years.
Delphis
oops forgot to mention that kdat uses "tar" which is good if you have multiple servers in case you need to untar it on any unix box :)
:)
(did i mention that it has a nifty gui front end
im currently doing a 8+gig backup using kdat from kde and it works well (no compression tho)
:(
:)
we had some trouble with backing up to another server via nfs mount that would only allow me to do a 2 gig max file
if i use the tar with compression i can get up to 24 gig backup to tape (12 gig without compression)
but kdat allows you to span tapes and keeps a nice little index of all previous files backed up on that tape (very nice gui app)
not sure if that will help but it's all i got rite now
That limit only exists on 32bit machines. 64bit Linux platforms, such as Linux/Alpha, have a truly astronomic file size limit, IIANVMM.
They laughed at Einstein. They laughed at the Wright Brothers. But they also laughed at Bozo the Clown. -- C. Sagan
You can also use AMANDA backup, which I use on my GNU/Linux machines for backup. It seems to handle the large backup sizes acceptably.
Finally, you can always just split up huge files using dd.
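For what it's worth, splitting with dd might look something like the sketch below. The function name dd_split and the .partNNN naming are invented for illustration, and it assumes a shell whose arithmetic can hold the file size.

```shell
#!/bin/sh
# Split FILE into pieces of at most CHUNK_BLOCKS blocks of BLOCK_SIZE bytes,
# named FILE.part000, FILE.part001, ...
# Usage: dd_split FILE BLOCK_SIZE CHUNK_BLOCKS
dd_split() {
    file=$1
    bs=$2
    count=$3
    size=$(($(wc -c < "$file")))        # file size in bytes
    chunk_bytes=$((bs * count))
    i=0
    offset=0
    while [ "$offset" -lt "$size" ]; do
        # skip= is counted in input blocks of size bs
        dd if="$file" of="$(printf '%s.part%03d' "$file" "$i")" \
           bs="$bs" skip=$((i * count)) count="$count" 2>/dev/null || return 1
        i=$((i + 1))
        offset=$((offset + chunk_bytes))
    done
}
```

The pieces go back together with `cat FILE.part* > FILE`, since the numbered suffixes sort in the right order.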
Cheers,
Joshua.
--jon. Postel is dead. May we all mourn his, and our, loss.
I think that a lot of decisions will depend on what sort of disasters you want to recover from. Planning for only 'normal' aging drives having media failures is a lot different (and easier) than planning for your office burning down.
There are all sorts of potentially problematic tradeoffs with various sorts of hard drives for backup. Things like:
I suspect that everything short of Zip/Jaz cartridges is more susceptible to damage when removed than tapes are. Especially drives that aren't mounted in an external enclosure and are carried around with the drive electronics bare.
If you make more than one backup, tapes may become substantially cheaper than more and more HD space. If you store and handle lots of backups, again tapes may become easier and cheaper to deal with than other media.
I have a fairly high trust for the long term durability of tapes sitting around. Manufacturers test this stuff and will tell you about it, including cautions for temperature limits and so on. Do disk manufacturers give equivalent figures for removed drives?
For personal home use, I think that anything is better than nothing; a second HD is cheap and easy (especially if you aren't worried about things that would take both drives out at once, like overheating or fire). For professional use in an office or the like (even a home office), I'd trust tape more. The up-front costs are bigger, but the benefits can be substantial, and there are things that are far more convenient with tapes that are very important for professional use (such as periodic offsite backups).
If you trust tapes they can also be used for archival purposes (where you delete the data off the HD after storing it on tape and verifying the tape) as well as normal backups. Depending on how much call you have for this, this may also represent a money savings with tape over disks.
A couple of people have written in asking how to do a restore operation with "restore" (the companion program to dump). By far the easiest way is to do:
# restore -i -f /dev/st0
(where 0 is your tape drive number)
However, if you have multiple filesystems on a single tape (like the example above), you must first use mt to fast-forward to the correct tape-mark. Let's say we want to get the second file-system off the tape:
# mt fsf 1
# restore -i -f /dev/st0
This will then put you in restore's little "shell" for adding files/directories to be restored. For example:
--- 8< *snip* ---
restore > ls
.:
.automount/ bin/ lib/ proc/ usr/
.bash_history boot/ lost+found/ root/ var/
.mc.hot dev/ misc/ sbin/
.mc.ini etc/ mnt/ tftpboot/
.netwatch home/ net/ tmp/
restore > ?
Available commands are:
ls [arg] - list directory
cd arg - change directory
pwd - print current directory
add [arg] - add `arg' to list of files to be extracted
delete [arg] - delete `arg' from list of files to be extracted
extract - extract requested files
setmodes - set modes of requested directories
quit - immediately exit program
what - list dump header information
verbose - toggle verbose flag (useful with ``ls'')
help or `?' - print this list
If no `arg' is supplied, the current directory is used
restore > add etc
restore > extract
You have not read any tapes yet.
Unless you know which volume your file(s) are on you should start
with the last volume and work towards towards the first.
Specify next volume #: 1
( it will now restore from the tape to your cwd)
Done!
--- 8< ---
Just to sum up, the example above opens the tape, lists the files, and adds "etc/" to the list of files to be extracted. Since this is a level 0 backup (a full non-incremental backup) I need not put in any other tapes and simply say "1" when it asks me for the next volume number.
The etc/ directory (with all its sub-directories) will be in whatever directory I started restore from. If you are doing a system restore, do it from "/".
-AP (Jordan Husney)
Here is our cron.daily/daily.dump file:
This dumps all three of our partitions out to a single tape. The options are: 0 ("zero"), which dumps the entire filesystem (our tape drive is fast, so we don't bother specifying a dump level > 0, which is for doing various levels of incremental backups); u, which updates the human-readable /etc/dumpdates file; B, the number of blocks ("kilobytes") the tape is long (this is your problem); and finally f, the device to dump to.
One of the things that really gets people is how to pass arguments correctly to dump. A little diagram might serve as an aid:
We use the /dev/nst0 device to write to the tape three times without the thing rewinding. This is the key to putting more than one filesystem on per tape.
Hope that helps!
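A script along the lines described (level 0, u, B, f, three filesystems, non-rewinding tape device) might look roughly like this. It is a hypothetical sketch, not the poster's actual file: the filesystem list and the tape length are placeholders, and by default it only prints the dump commands instead of running them.

```shell
#!/bin/sh
# Sketch of a daily full-dump script. Every value below is a placeholder;
# the real partitions and tape length are assumptions, not known values.
TAPE=/dev/nst0              # non-rewinding device, so dumps land back to back
TAPE_KB=4000000             # hypothetical tape length in kilobytes (-B)
FILESYSTEMS="/ /usr /home"  # hypothetical list of three partitions

# Prints the dump commands by default; set RUN=1 to actually execute them.
daily_dump() {
    for fs in $FILESYSTEMS; do
        # -0: full (level 0) dump, -u: update /etc/dumpdates,
        # -B: volume length in kilobytes, -f: output device
        set -- dump -0 -u -B "$TAPE_KB" -f "$TAPE" "$fs"
        if [ "${RUN:-0}" = 1 ]; then
            "$@" || return 1
        else
            echo "$@"
        fi
    done
}

daily_dump
```

Each option that takes a value is followed directly by that value here, which sidesteps the argument-ordering confusion with the bundled form of the flags.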
If anybody has any questions about using dump, I would be happy to help.
-AP
jordanh@remotepoint.com
WARNING: A tape or harddisk in a fireproof container will still be destroyed in a fire. Most fireproof containers are designed to save paper from burning by a combination of steaming away water and thermal insulation. As such, the internal temperature of the container will easily get over 210 degrees F. Most tapes and harddisks will be destroyed at that point.
I'd be interested to know what trouble you've been having with dump. I've been doing dumps of 2+ gig partitions for ages with dump and it works very well. Perhaps you're experiencing some other problem which is not related to the backup program you're using? If you can send me more information about where dump fails, I'd be happy to have a look at it.
Tapes are a great way to get a snapshot of your file system as of right now. With a sensible tape rotation, it's easy to go back and get a file as of a few months ago, i.e. before someone accidentally deleted the first 20 pages of the document or whatever, and nobody noticed until now.
Keeping stuff online with big hard drives is a great way to backup data that is static, i.e. scanned documents, but is somewhat less useful for stuff that is constantly changing.
My 0.02 $CAN
I use a commercial backup program called Arkeia. You can get it from http://www.arkeia.com/.
There is also a free for personal use version for linux server/clients. Just go to http://www.arkeia.com/downloadfree.html .
Regards,
Oliver.
Both, I believe. To access more than 2GB, you need to use 64-bit file access functions. fseek(), for example, takes its offset as a long, which is a 32-bit signed integer on these machines, so it can only address 2GB of a file. I think that SGI and others use special functions for 64-bit file access (fseek64() for example), and leave the traditional system calls alone. It's going to mean recompiling everything, at the least.
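To put numbers on the signed 32-bit limit, here is a quick bit of shell arithmetic. It assumes the shell itself computes with 64-bit integers, and it treats the 22 GB from the original question as a single archive of 22 * 2^30 bytes purely for illustration.

```shell
#!/bin/sh
# Largest offset a signed 32-bit integer can hold, and what that means
# for file sizes. Assumes the shell does its arithmetic in 64 bits.
max_off32=$(( (1 << 31) - 1 ))
echo "max signed 32-bit offset: $max_off32 bytes"

# A 22 GB archive expressed in bytes (taking 1 GB = 2^30 bytes).
archive=$(( 22 * 1024 * 1024 * 1024 ))
echo "22 GB archive:            $archive bytes"

# Number of pieces needed if no piece may exceed the 32-bit limit
# (ceiling division).
pieces=$(( (archive + max_off32 - 1) / max_off32 ))
echo "pieces needed:            $pieces"
```

The limit is one byte short of a full 2 GB, which is why the piece count comes out one higher than a naive 22 / 2.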
I highly recommend the Sony AIT drives (I think Seagate or Quantum also sells a variation of the same format). They do 25 GB native per 8mm tape, at 5MB/s. The drives are under $2000, and tapes are around $60. It may be a bit overkill for what you need, but the speed is VERY nice. It does compression as well, but people that quote compressed capacities should be shot.
:)
They also have a cool feature that allows storing directory info on NVRAM on the tape cartridge - 16 KB or so.
And because it's Sony, it's definitely likely to stay around. I think they still sell Betamax decks, and I kinda think they know what they are doing when it comes to helical scan recording equipment.
Tar and others should work fine as long as you are writing directly to tape, instead of to a temp file. Linux has a 2GB (2^31-1) maximum file size, so if your backup software is trying to spool to disk before streaming to tape, it may fail.
Amanda handles this by splitting the disk files into 2 GB chunks and reassembling them when it writes to tape. It also deals well with network backups. The filesystem side backend is dump or GNU TAR, so it's fairly standard in that regard. I've had no problems with 8+ GB filesystems using Amanda.
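The same chunking idea can be tried by hand with tar and split. To be clear, this is not Amanda's code, just a sketch of the general approach; the function name tar_chunked and the 2000m default are assumptions.

```shell
#!/bin/sh
# Stream a tar archive through split so that no single output file
# exceeds CHUNK bytes. Only an illustration of the chunking idea.
# Usage: tar_chunked SOURCE_DIR OUTPUT_PREFIX [CHUNK]
tar_chunked() {
    src=$1
    prefix=$2
    chunk=${3:-2000m}     # default stays below 2^31-1 bytes
    # tar writes to stdout (-f -); split reads stdin (-) and writes
    # PREFIXaa, PREFIXab, ... each at most CHUNK bytes long.
    tar -cf - -C "$src" . | split -b "$chunk" - "$prefix"
}
```

The chunks can be read back without ever creating one large file, for example with `cat PREFIX* | tar -tf -` to list the contents.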
I would not recommend using e2fsdump - AFAIK, it's still beta, and I had problems with the interactive restore and some other issues. Because it accesses the filesystem at a lower level than standard file access (I believe), I'd be careful with trusting important backups to it.
TAR is definitely a safer choice.
BTW, I have a question myself... does anyone know how to get TAR (or something else) to restore ownership on symlinks? Typically it doesn't matter, but Apache checks symlink ownership for the SymLinksIfOwnerMatch option, and every time I restore or copy a web partition, I have to go through and fix all the links that are now root owned.
BackupEdge is the most powerful backup program for Linux (or any other Unix) that I've seen.
It does do what you want, and has a lot of other great features:
- great automatic backups/verifies
- backup recovery programs
- bootdisk manager
downside: it's commercial
see www.microlite.com for details
"Nyquil - The stuffy, sneezy, why-the-hell-is-the-room-spinning medicine."
AFAIK Amanda uses dump or tar behind the scenes.
-- Wodin
There is a version of tar at alpha.gnu.org (and its mirrors) which supports 64-bit file access and which can be used with glibc >= 2.1 to make archives bigger than 2 GB on 32-bit hosts. I have just downloaded it; it compiles and its checks succeed without any problem. I just can't really test it because I have only 50 MB left...
If you split your backup in a few different files (say, backup ~ftp/pub/mirrors and then ~ftp/pub/linux or whatever) then you should be able to get away with tar.
This will change soon when SGI releases portions of xfs as open source, and when ext3fs is ready.
One more comment to add to the other replies re: usage of tape for backups:
Not only do most companies need backup rotations, they also need a backup for disaster recovery.
Ideally, you have a backup of your data kept offsite, in case of disaster. Tape makes this simple - it's readily portable. Every week night our DBA throws a couple tapes into his bag before heading out. If our building goes down in flames, we still have our data.
Insurance money can help to rebuild, but it won't get back data!
I have used a program called CTAR to cure these problems. Check it out here.
http://www.ctar.com