Slashdot Mirror


User: ccGecko

ccGecko's activity in the archive.

Stories
0
Comments
7

Comments · 7

  1. I have installed this product multiple times . . . on New 25x Data Compression? · · Score: 1

    . . . so I might be able to clear up some confusion. The word 'compression' is probably not the right choice. 'De-duplication' is probably a better word. Try this: "ProtecTIER can achieve a 25:1 de-duplication ratio." That sounds more accurate to me. Currently it works as a virtual tape engine. Take 10+ TB of disk and attach it to a Linux server (x86_64 only). ProtecTIER makes that disk look like a tape library and tape drives filled with tape cartridges for use by an enterprise backup system like Veritas NetBackup, IBM/Tivoli TSM, Legato NetWorker, etc. Most large companies today use a pretty similar backup strategy: fulls once a week, incrementals the other days; weekly fulls are kept for 2-8 weeks, 'monthly' fulls are kept 2-6 months, daily incrementals are kept for 7-21 days. Depending on the retentions chosen, that's 10-30 or more copies of the same data, plus the maybe 5-10% that actually changed. ProtecTIER gets the 25:1 ratio by eliminating the redundant copies.

    The algorithm is pretty elegant, actually. It holds a metadata index in RAM. As data comes in (at rates up to 200MB/s) it looks for a similar data set already stored. It reads the old data in, does a diff against the new data, stores the unique data untouched and uses pointers to refer to the duplicate data. With this method, even if the system is completely wrong about which existing data set to match with, the data will still be stored safely (just with a low de-duplication ratio in that instance).
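
    A minimal sketch of the diff-and-pointer idea described above, not ProtecTIER's actual implementation. It swaps in Python's standard `difflib.SequenceMatcher` as the diff engine, and the names `dedupe` and `restore` are hypothetical. New data is encoded as pointers into an already-stored data set plus literal bytes for whatever is unique:

```python
from difflib import SequenceMatcher


def dedupe(old: bytes, new: bytes):
    """Encode `new` as pointers into `old` plus literal unique bytes.

    Illustrative only: a generic diff stands in for the real matching engine.
    """
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Duplicate data: store only (offset, length) into the old copy.
            ops.append(("ptr", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # Unique data: store the new bytes untouched.
            ops.append(("lit", new[j1:j2]))
        # "delete" means old bytes absent from `new`; nothing to store.
    return ops


def restore(old: bytes, ops) -> bytes:
    """Rebuild the new data from the old copy and the stored operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "ptr":
            _, start, length = op
            out += old[start:start + length]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops) -> int:
    """Count how many bytes had to be stored as unique data."""
    return sum(len(op[1]) for op in ops if op[0] == "lit")
```

    If the chosen old data set turns out to be unrelated, `dedupe` simply emits everything as literals, so `restore` still returns the new data exactly; only the space saving is lost.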

    Yes, the product works as advertised. If you don't have several terabytes of data to protect in an enterprise environment, it's probably not for you. But, if you do have a large environment and are tired of dealing with tape, this product rocks.

  2. This will be expensive, but . . . on Large-Scale Video Archiving? · · Score: 2, Informative

    I am an Engineer for a company that does only storage, so I might be able to offer some suggestions. The best solution would probably be SamFS, which is a Hierarchical Storage Management product developed by LSC software, now part of Sun. SamFS runs only on Solaris Sparc, so that means a Sun box. Your reqs. would max out an E450, so you should look at a 4500 or 4800 at the minimum. For disk, avoid Sun T3's like the plague. They suck. For your needs, a Clariion FC4700 running RAID 3 is perfect. So perfect, that Sony just signed an OEM agreement to sell Clariions with their video editing solutions. For tape, I would suggest LTO drives in a StorageTek L700 library. SDLT is too new to be trusted. Also look at AIT-3 in SpectraLogic Gator 64000 libraries. If you have the cash, the ultimate tape solution would be STK T9940 or 9840B drives in a StorageTek 9310 powderhorn (as seen in the movie Eraser). Unfortunately, a powderhorn with no drives is about $200k, T9940 drives are $35k each, and 9840B drives are about $30k. Good luck.

  3. Re:Why the reluctance... on Early Man: The Cause of Mass Extinction? · · Score: 1

    OT, but I'm guessing you saw that giraffe killed in an anthropology class of some kind. I saw that one in college, too. There's a sort of behind the scenes part you may not have seen that my professor showed us: the giraffe was actually killed by a couple bushmen standing behind the camera with rifles. Too funny.

  4. Mach 2.1 record for Production? Hardly . . . on NASA Prototype Plane Scheduled To Attempt Mach 5+ · · Score: 1

    . . . unless you only consider NATO countries and you classify the SR-71 as either not a standard jet (its engines being turbojet/ramjet hybrids rather than plain turbojets) or not production. Since espionage and aerial interception are the hot topic right now, I think it's appropriate to single out the MiG-25. The Soviets built the MiG-25 for one reason: to intercept the US B-70 Valkyrie bomber. This was a six-engine, Mach 3+ bomber that the US never even produced (except some prototypes which I think all crashed during testing). But, the Soviets knew about the project and countered with a Mach 3+ fighter, which became the MiG-25.

  5. Storage Considerations on 30+ GB Databases On Unix? · · Score: 1

    Before I get to the storage, yes Sybase works on Linux, and yes, cross-OS data migration is possible (and actually not that hard) with Sybase. Where I work we replicate a production Sybase database from AIX to a reporting server running HP-UX. Multi-hosted, network-connected databases are one of Sybase's strengths.

    Anyway, on to the storage. Sybase works best when you give it raw devices, which if I remember correctly Linux doesn't support (yet). So, you're stuck with a filesystem. I'll let other, more competent Linux fs folks advise you there. Databases stress two things hardest: memory bandwidth and disk I/O. Memory bandwidth can be best dealt with on x86 boxes by getting Xeon-based systems with as much L2 cache as you can afford in addition to as much main memory as you can afford. As for disk, forget IDE. Go SCSI or Fibre Channel all the way. Definitely use RAID, but before you choose which RAID level, consider your usage of the database. If 80% or more of your transactions are read-only, then RAID 5 is okay. If more than 20% are writes, DO NOT USE RAID 5. You will regret it. Every small write on a RAID 5 volume requires 2 reads and 2 writes to the physical disks (read the old data and old parity, then write the new data and new parity). You will notice this big time once the write mix passes 20%. In this case use RAID 1+0, aka RAID 10. This is different from (and significantly better than) RAID 0+1 for reasons I won't go into. Use hardware RAID. Without a ballpark on your budget, I have no idea what is realistic, but get a hardware RAID system with as much cache as possible. Spread the RAID volume across as many physical drives as possible. One last thing: spend some time developing a solid backup strategy. This step is so often overlooked because it doesn't affect you until you have a problem. Don't make that mistake, and most other problems can be recovered from. Good luck.
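
    To put rough numbers on the RAID 5 write cost mentioned above, here is a back-of-the-envelope sketch using the common rule-of-thumb "write penalty" model. The model, the function name `effective_iops`, and the figures of 8 disks at 100 IOPS each are assumptions for illustration, not measurements; controller cache and full-stripe writes are ignored:

```python
# Physical disk I/Os consumed per logical write (rule-of-thumb values).
RAID5_WRITE_PENALTY = 4   # 2 reads + 2 writes, as described above
RAID10_WRITE_PENALTY = 2  # one write to each side of the mirror


def effective_iops(disks, iops_per_disk, write_fraction, write_penalty):
    """Estimate host-visible IOPS for a given read/write mix."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw = disks * iops_per_disk
    # Average number of physical I/Os needed per logical I/O.
    cost_per_io = (1.0 - write_fraction) + write_fraction * write_penalty
    return raw / cost_per_io


if __name__ == "__main__":
    for pct in (0, 10, 20, 50):
        w = pct / 100
        r5 = effective_iops(8, 100, w, RAID5_WRITE_PENALTY)
        r10 = effective_iops(8, 100, w, RAID10_WRITE_PENALTY)
        print(f"{pct:>3}% writes: RAID 5 ~{r5:.0f} IOPS, RAID 10 ~{r10:.0f} IOPS")
```

    The two levels are identical for a read-only load under this model, and the gap widens as the write share grows. The comparison uses the same number of spindles for both, so RAID 10 delivers less usable capacity for the same disk count.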

  6. Translation not all that new . . . on Is The x86 Obsolete? · · Score: 2

    DEC developed a technology called FX!32 for the Alpha processor version of Windows NT. This was available when NT 4.0 came out in 1996. Not having used an Alpha NT box in several years, I'll have to just assume it's still around. Basically, FX!32 runs x86 binary applications in emulation mode on the Alpha, all the while watching the execution and translating the binary into a native Alpha version. The first several times you run an application, it gets quite a bit faster each time. At best it's still slower than a comparable Intel box, but it's better than having two machines. Here's a link to some info at Compaq:

    http://www.digital.com/amt/fx32/fx-relnotes.html

  7. Thoughts on commercial solutions on Unix Backup And Recovery · · Score: 2

    I haven't read this book yet, but others have touched on its brief comments on commercial products and have asked for more information on what's available commercially. Since my current job is head of distributed backup for one of the largest private companies in the world, my experiences with the three biggest (by market share in the Fortune 500) commercial backup products might be of interest to some. I currently use Legato NetWorker on Solaris, but I have evaluated Veritas NetBackup and IBM's Tivoli Storage Manager (formerly ADSM). All three have a lot in common. Each has a server that runs on the major commercial Unix platforms as well as NT, each has clients for even more OS's, and each supports a wide variety of tape and optical drives, or you can write to files on a hard disk. All have modules to provide backups for the major databases and database-driven apps, like Oracle, Sybase, Informix, SAP R/3, Lotus Notes, MS SQL Server, MS Exchange, etc. All three are actively developing bare-metal recovery solutions for the major (read: money-making) platforms. On comparable hardware, the relative performance is a wash. All three support HSM to some degree. The three are radically different under the covers, however.

    I'll start with Legato NetWorker. I have kind of a love-hate relationship with Legato. The product has many strengths to recommend it, but also many significant weaknesses. It has a good graphical interface on both Unix and NT. The NT interface is a lot better for configuration, while the Unix version is better for operations and monitoring. Both GUIs connect via the network to the backup server and are installed with the agent on all client machines. It also has a well-rounded set of CLI tools that again are network-based and installed on all clients. In general, everything in the GUI can be done from the command line, but some of it is rather painful. Still, if you are planning to support the product 24x7, you'd better learn the command line for those nights when the VPN server craps out and you have to dial in by modem instead of using DSL. The overall architecture is well thought out and works pretty well for the most part. I can sustain 50MB/s on a Sun E450 writing to 10 DLT7000 tape drives in a single robotic library, and have seen the peak go over 80MB/s. The biggest weakness of the current version is the index structure. This is the system that stores which files were backed up from which client, when, and to what tape. Legato uses a hacked-up b-tree structure stored in compressed binary files. The lookups are pretty fast, but it can choke if you are backing up many streams of small files simultaneously because it can't write as fast. The real problem, though, is that the indexes get corrupted too easily. The result is a lot of time spent cross-checking and recompressing the file indexes. The media index is worse because it can't be repaired. If an error is found, you have to restore from an earlier version (the media index is written to tape several times a day). This doesn't happen very often, but it shouldn't happen at all. Acknowledging the problem, Legato will be replacing their index system in the next version.

    Another annoyance is the lack of a decent global management utility. I have many E450 backup servers, each ignorant of the others, and each has to be configured separately. The final major drawback is its use of a proprietary tape format. You can only read the data with Legato. Still, I throw several terabytes at the system each day, and it gets the job done, for the most part.

    Veritas NetBackup is the newest of these three products, but has come on strong in the large datacenter segment these products play in. It supports several advanced features like dynamic robotic tape library sharing, which is very useful in a Fibre Channel Storage Area Network. The index structure is flat-file based, so it doesn't get corrupted and is human readable, but takes up more hard drive space. That last part is a non-trivial point. My oldest Legato server has accrued over 120GB of index information. If this were flat files, I would need to buy a lot more disk than I currently have. Another positive is the data is written in tar format, whether to tape, optical, or filesystem. NetBackup supports more client OS's than Legato, including support for Linux, but not BSD. Legato has unsupported clients for Linux, NetBSD, and BSDi. The major drawback is administration. NetBackup is more complicated to configure correctly, particularly in a large environment. It is also harder to maintain as the environment expands.

    IBM's Tivoli Storage Manager is the only package that can back up the entire enterprise, from the Mac desktop in PR to the OS/390 in the datacenter. TSM supports just about any client platform you can think of that's still in use, except (curiously) Linux or BSD. For the index structure, TSM doesn't mess around: it comes with a custom version of DB2 specifically hardened for use with TSM. Because it uses a DBMS, TSM has by far the best reporting abilities of the three. You can buy a package of reports from IBM, or roll your own using standard SQL. Another major advantage is the backups are 'incremental always,' to use the IBM marketese. The first time a client does a backup, it is a full. From then on, only changed files are sent to the server. While the other packages support this, with them rolling through all the incrementals in the case of a full restore is painfully slow and requires a lot of tape mounts. TSM can do this because of the DB2 index system and very advanced media management inherited from the mainframe world. Like NetBackup, TSM writes all data in tar format. All this power comes at a price, unfortunately. TSM is extremely complicated to configure across a large enterprise and appallingly expensive.
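
    The difference between replaying incrementals and looking up the latest copy of each file in an index can be shown with a toy model. This is only a sketch under simple assumptions: it is not TSM's actual schema or media management, file deletions are not modeled, and the names `replay_restore`, `build_index`, `indexed_restore`, and `BACKUPS` are hypothetical:

```python
# Each backup set is (set_id, {path: content}), oldest first.
BACKUPS = [
    ("full", {"a": "a0", "b": "b0", "c": "c0"}),
    ("inc1", {"a": "a1"}),
    ("inc2", {"a": "a2"}),
    ("inc3", {"b": "b1"}),
]


def replay_restore(backup_sets):
    """Restore by reading the full and then every incremental in order."""
    state = {}
    for _set_id, files in backup_sets:
        state.update(files)  # later sets overwrite earlier versions
    return state, len(backup_sets)  # every set has to be read


def build_index(backup_sets):
    """Record, per file, which backup set holds its most recent version."""
    index = {}
    for set_id, files in backup_sets:
        for path in files:
            index[path] = set_id
    return index


def indexed_restore(backup_sets, index):
    """Restore by fetching only the latest version of each file."""
    by_id = dict(backup_sets)
    state = {path: by_id[set_id][path] for path, set_id in index.items()}
    sets_read = len(set(index.values()))  # only sets holding a latest version
    return state, sets_read
```

    Both functions return the same restored files for `BACKUPS`, but the indexed version never touches `inc1`, because its only file was superseded by `inc2`. In a real system each skipped set can mean one less tape mount.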

    On a final note, a word of caution: backup administration is the most thankless job in all of IT. No one notices the 99+% of backups that run successfully every day, but one failure on a business-critical system and you get crucified. Also, be prepared for your damned pager to go off at the most unfortunate times, day and night. To anyone considering a job as a backup admin, Just Say No. Trust me.