Disaster Recovery?
M. Grochmal asks: "A three-alarm fire at Southern Maine Technical College burned through the Computer Technology and Technical Graphics departments. We have salvaged most of what we can, but cannot return into the building until the asbestos risk decreases. The hard part now is rebuilding the networks in another building. The schedules have been rearranged, many of the department students and faculty are volunteering to relocate salvageable computers, as well as install/configure the new computers that will be arriving in the next day or so. On top of that, we have to rebuild the Netware servers, restore from backups, and get them networked again. I was wondering how other Slashdot readers were able to recuperate from unforeseen damage to their work (and learning) environments. You can read about the fire here and see what the schedule is. Wish us luck."
I was visiting some friends at your campus just this past December; sorry to hear about your loss.
Sadly, I can't give you any suggestions on how to better recover from your current situation -- seems like what can be done now is being done. It seems there's not been much of a response to this as yet, so I'll go out on a limb and offer some ideas that may sound obvious, but forest and trees and all that.
I'm reading between the lines, but I suspect that prior thoughts of backups and disaster recovery were shot down by the PHBs as being too expensive or time consuming. Here's your chance!
You now have a rare opportunity where proposals for FUTURE disaster recovery would actually be listened to!
First off, document what you are doing now! Write it down in a notebook, carry around a pocket tape recorder, use a PDA, hire some students who will answer a phone so that when something comes to mind, you can just dial a phone and get it recorded; whatever, but document what it is actually costing to recover! And not just the hardware/software expenses either! Increased calls to the help desk. Impact on faculty and students' schedules. Reconstructing the network topology.
Anything you can think of, now, document it! If, upon later review, some items look questionable, you can omit them then. But you don't want that later review to end with: "Gee, this took more than we had thought it would, too bad we didn't keep track..." Get the picture?
So, now you'll have some kind of baseline as to what the actual recovery costs were, in this case. With that, you can now make a strong business case to implement a solid disaster recovery plan. Include server configs, backups, inventory of hardware and software... in short you've got a list of what you actually had to do to recover from this disaster; use that to identify what you'd need to do again.
Other ideas off the top of my head: Get a fire suppression system. Split some of the equipment (e.g. labs) across multiple buildings so that if one burns down, there's some infrastructure that is still usable. You'll have a working system that you can refer to while rebuilding the destroyed system, too.
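For the "document what it is actually costing" advice above, here is a minimal sketch of a running cost log, in case a notebook or tape recorder isn't enough. Everything in it is an assumption for illustration: the `RecoveryLog` and `CostEntry` names, the category labels, and the flat hourly rate are all made up, and the figures in the usage note are placeholders, not real numbers from the fire.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class CostEntry:
    """One recorded recovery cost (hypothetical structure)."""
    day: date
    category: str       # e.g. "hardware", "help desk", "network rebuild"
    description: str
    hours: float = 0.0  # staff or volunteer time spent
    dollars: float = 0.0  # direct out-of-pocket expense


class RecoveryLog:
    """Collects cost entries and totals them per category."""

    def __init__(self, hourly_rate: float):
        # Assumption: one flat rate is used to convert hours into dollars.
        self.hourly_rate = hourly_rate
        self.entries = []

    def record(self, day, category, description, hours=0.0, dollars=0.0):
        self.entries.append(CostEntry(day, category, description, hours, dollars))

    def totals_by_category(self):
        # Direct expenses plus time valued at the flat hourly rate.
        totals = defaultdict(float)
        for entry in self.entries:
            totals[entry.category] += entry.dollars + entry.hours * self.hourly_rate
        return dict(totals)

    def grand_total(self):
        return sum(self.totals_by_category().values())
```

Usage would look something like `log = RecoveryLog(hourly_rate=25.0)`, then `log.record(date.today(), "help desk", "extra calls after relocation", hours=3)` each time something comes up, and `log.totals_by_category()` when it's time to write the proposal. Counting hours separately from dollars matters because most of the hidden cost is people's time, not purchases.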
First off, servers belong in a proper server room, not in a closet near the lab. A closet may be OK for your home network, but for a network at a college or company, a real server room is a must. Also, if you can, put the server room in one building and the labs in others. This way your lab may go up in smoke and your servers will be fine, or your servers may get damaged and your clients will be fine.
When building a server room, make sure it has raised floors (about 1 foot above the rest of the floor), conveyance trays, redundant air conditioning, FM200 fire suppression, TSM or some other backup solution, possibly offsite mirroring of servers, NO WINDOWS (the glass kind, not the OS kind), and UPSes. If possible, make it a hardened, one-floor building with the chillers located inside (storms can't rip a chiller off the ground if it is inside), generator backup, and some bathrooms, food storage, and maybe even a shower facility in case admins must pull an all-nighter. This may sound silly for a school, but that depends on how important your data is.
We used to have servers serving the labs all over campus, but now they are all centrally located in the data center. Management is easier, but then we have more to lose if our data center is hit. That's why we have halon fire suppression (until the new center is built, which will use FM200) and a disaster recovery plan including a hot site. Having all of the servers centrally located also helps with running backups, either via a networked TSM-type solution (Tivoli software, IBM hardware) or individual tapes (not recommended, but better than nothing).
Gorkman
Contrary to much popular belief, a good data recovery contingency (off-site back-ups, etc...) is only half of a sound DRP. When it comes to recovering from a cataclysmic disaster of this nature - the second, and equally critical component of a well thought out DRP is an all-inclusive BCP (Business Continuation Plan)...
Without this vital aspect, companies such as Deutsche Bank (who were ravaged by the WTC disaster on 9/11), would have been down for days/weeks while attempting to relocate, rebuild and restore their data center operations...
I, for instance, work at a rather large, international Fortune 500 company and we have BCP strategies that include a complete off-site location. This facility houses fail-over systems for all business-critical processes, including a 1.2 terabyte, mirrored SAP database that can go online on a few minutes' notice, and a phone bank/workstations for our 50+ CSRs (customer service reps) and our global helpdesk. Even more, we regularly (twice yearly) perform non-production drills to validate the systems' health and improve upon our strategies...
This is obviously a bit late for you, but I would suggest reading up on the matter a bit more thoroughly prior to redesigning your future systems and developing your next DRP...
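As a small-scale illustration of the drill idea above, one simple check after a test restore is to compare checksums of the restored files against the originals. This is only a sketch under assumptions: the function names are made up, it works on plain directory trees rather than any particular backup product, and it reads each file fully into memory, which is fine for a drill on modest files but not for very large ones.

```python
import hashlib
from pathlib import Path


def checksum_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Reads the whole file at once; acceptable for a small drill.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def compare_manifests(source, restored):
    """Report files that are missing, unexpected, or different after a restore."""
    missing = sorted(set(source) - set(restored))
    extra = sorted(set(restored) - set(source))
    changed = sorted(
        name for name in source.keys() & restored.keys()
        if source[name] != restored[name]
    )
    return {"missing": missing, "extra": extra, "changed": changed}
```

Running `compare_manifests(checksum_manifest(live_dir), checksum_manifest(restored_dir))` after each drill, where `live_dir` and `restored_dir` are whatever paths you restored from and to, gives a short list to investigate. Three empty lists means the restore matched the source at the time the manifest was taken.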
Beer is proof that God loves us and wants us to be happy. -- Benjamin Franklin