Slashdot Mirror


Planning for Survivable Networks

Priscilla Oppenheimer writes "Annlee A. Hines' book Planning for Survivable Networks is quite a page-turner. Yes, that's surprising for a technical book, but I found it to be true. I was fascinated by the stories of real companies (Lehman Brothers, the Wall Street Journal, and others) that survived the 9/11 attack and resumed business quickly. There are also stories from other disasters, both man-made and natural, and information on companies that were not able to quickly resume business. The author summarizes the stories with explanations of what went right and what went wrong, with advice on developing your own disaster recovery plan." Read on for the rest of her review.

Planning for Survivable Networks
author: Annlee A. Hines
pages: 320
publisher: Wiley Publishing, Inc.
rating: 10
reviewer: Priscilla Oppenheimer
ISBN: 047123284X
summary: Designing networks that can recover from natural and unnatural disasters

As Hines explains, Lehman Brothers had headquarters in Tower 1, as well as in 1, 2, and 3 World Financial Center (across the street from the WTC towers). Lehman moved to a backup recovery location and performed cash-management functions the same day as the attack. The company was online trading fixed-income securities by the next day. They had 400 traders online when the NYSE reopened Monday, 9/17.

The Wall Street Journal (WSJ) published the story of its own recovery and Hines used that as source material for her book. WSJ had an extensive disaster recovery plan, based on lessons learned in the 1990 power blackouts in New York. After the blackouts and a subsequent fire in the emergency generator room, WSJ decided that it would never again depend on just one location being operational. WSJ opened other offices that could perform some of the necessary tasks to bring out a paper. Geographical diversity of resources seems to be a key to success.

When the 9/11 terrorists attacked the buildings across the street from WSJ's main offices, senior managers called for an evacuation, knowing that they could still produce the paper. The Wall Street Journal managed to publish a full newspaper with eyewitness accounts of the tragedy the next day.

Hines' writing is easy to follow. Although she delves into some technical details, with the requisite IP and TCP header depictions that you will find in so many networking books, the book can easily be read by managers and business people. Planning for Survivable Networks has many factual tidbits about disasters of all sorts, and although these are interesting, the primary benefit of reading the book is to gain an understanding of the characteristics of companies that sustained business after a disaster compared to companies that did not.

As Hines says, the companies that survived disasters all had disaster recovery plans in place. The plans were activated by decisive managers, who also promptly got their people out of harm's way. (If people don't survive, it won't matter much if systems survive.) Another point she makes is that the managers had to be adaptable. Not everything went according to plan, and it shouldn't be expected that it will.

The book opens with the author being rocked by a terrorist-caused explosion herself. She wasn't present for the 9/11 attacks. Rather, the bombing she survived occurred at Ramstein Air Base in Germany, 20 years before. A retired Air Force officer, she has dealt with threats all over the world for many years. Her direct command and control experience has taught her many lessons, which she shares with the reader in Planning for Survivable Networks.

Probably one of the most useful chapters, Chapter 11, "The Business Case," offers advice on presenting to management a case for a network continuity plan. According to the back cover, Hines has taught economics at a community college, and I would say that experience helped her explain the many costs involved in having a disaster recovery plan, including fixed, variable, direct, and indirect costs. She also explains the expected value of having a plan and how to sell that to management.

I recommend this book as an informative discussion of how companies can ensure business and technology continuity in a world with hackers, terrorists, natural disasters, and human error. It's a practical book, but also a surprisingly uplifting book, considering its technical content. I truly enjoyed reading about the adaptable human spirit that enabled managers and workers to keep their businesses going after the 9/11 attacks.

You can purchase Planning for Survivable Networks from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

27 of 115 comments

  1. The irony by teamhasnoi · · Score: 5, Funny
    of this book being on /. is enormous.

    My book on this subject is one page long.

    Page 1: Don't let Slashdot link to you.

    1. Re:The irony by Lennie · · Score: 2, Interesting
      This sounds like a bit of a time-critical section. :-) You shouldn't be using a perl-compatible regular expression for that (quite slow string comparison) or if you do, you should make it fast:
      "/^http:\/\/slashdot\.org/"
      (starts with), but best is of course:
      if (strtolower(substr($_SERVER['HTTP_REFERER'], 0, 19)) == 'http://slashdot.org') {
          exit();
      }
      Well, just for completeness.
      --
      New things are always on the horizon
  2. I’m planning on dying in the disaster… by Anonymous Coward · · Score: 2, Funny

    …as I would prefer death to running a network these days.

  3. speaking of irony... by ed.han · · Score: 2, Funny

    from the article:

    "probably one of the most useful chapters, chapter 11, "the business case," offers advice on presenting to management a case for..."

    in light of the current economy, i find this particular chapter arrangement particularly funny.

    ed

  4. No mention of slashdot in the book? by Prince_Ali · · Score: 4, Funny

    Why didn't they mention the survival of slashdot in the face of countless disasters? The great troll strike of 2002 comes to mind! The revival of beowulf jokes, the lawsuit from Nat Portman and the hot grits famine that followed were all destructive but /. survived. Slashdot is able to survive just about any disaster whether in Soviet Russia or at home, and for that it should be commended!

  5. Worst Chapter Name Ever by teamhasnoi · · Score: 4, Funny
    Chapter 11, "The Business Case"

    Seems like that chapter is required reading these days.

  6. other survival books... by TWX · · Score: 5, Funny

    "Surviving Slashdot" by Oliver Clozoff

    "Surviving Slashdot" illustrates how to build a corporate network that accepts large numbers of incoming connections from stories posted at Slashdot.org, while still allowing employees to make the network connections that they need. Techniques covered include round-robin DNS with different servers in different geographical locations, multiple HTTP servers with load balancing, and smooth transition over to a volume web host like Conxion or cNet at a moment's notice without significant downtime. Other Anti-Slashdotting tactics are also discussed.

    --
    Do not look into laser with remaining eye.
  7. What about a natural outbreak scenario? by zptdooda · · Score: 4, Insightful

    "Planning for Survivable Networks has many factual tidbits about disasters of all sorts..."

    I wonder if that's included.

    When SARS hit earlier this year our disaster recovery planning team was faced with a situation they hadn't anticipated: potential quarantining of large numbers of staff with critical business-continuity functions.

    The building and computer systems would be physically secure, but staff would not be allowed into the workplace.

    So there was a scramble to survey everyone's job function and set up broadband and VPN access from home if needed.

    --
    Esteem isn't a zero sum game
    1. Re:What about a natural outbreak scenario? by bobbozzo · · Score: 2, Funny
      group that could stay isolated for some time. Ie, they don't have contact with other people and the outside world for a considerable length of time

      You mean like programmers?

      --
      Nothing to see here; Move along.
  8. Lehman Brothers by gorbachev · · Score: 4, Interesting

    Their trading floor might've been up in no time, but speaking as someone who worked with the Lehman Brothers in WTC on 9/11, I can say some of their other divisions weren't as lucky.

    The team I was on lost 2 months worth of work, because it wasn't backed up on a remote site. The version control servers were at WTC.

    If it wasn't for a single developer, who had made an unauthorized copy of the project on a floppy, we would've lost much more than just 2 months.

    Proletariat of the world, unite to kill terrorism

    --
    In Soviet Russia, I ruled you
  9. Re:I've made my own list of disaster lessons by I8TheWorm · · Score: 4, Funny

    I've been involved with disaster recovery plans since 1993 in Houston (hurricane seasons, a propensity for flooding). Most reputable companies down here have viable plans including offsite call centers, daily backups to servers/db's offsite, etc.
    I have to relate a funny story though. I wrote code for a large bank with a few offices in downtown Houston. As tropical storm Allison approached (you may have seen pictures of the aftermath), we started sending people home. Unfortunately, the shortsighted management had placed two offsite databases IN HOUSTON for data and call center recovery. The last I saw of our particular network administrator was him loading the physical DB server into his truck in hopes that he could get it home and upstairs. The two DR sites both flooded and we lost those servers. Needless to say, that manager is no longer employed with .

    --
    Saying Android is a family of phones is akin to saying Linux is a family of PCs.
  10. Re:I've made my own list of disaster lessons by kc0dxh · · Score: 2, Insightful

    Ha! I love the political incorrectness. Seriously, isn't the whole idea planning and a second location? Really, when disaster hits, whether external (terrorism) or internal (hard disk failure), is the person responsible for these systems in a frame of mind to create a plan?

    I've watched my 24/7 server choke and die. I had a fever and still got things up and running in less than 8 hours. Why? A plan. I knew where it was and where all my manuals and documentation were.

    Just because a server is small and easy to set up doesn't mean it should be treated any less than a mainframe should. Let me say that again, because this is why this is a topic of discussion: treat your servers like mainframes were treated 20 years ago.

    --

    --- "1.21 Jigawatts!" -Doc

  11. Rammstein bombing by chiph · · Score: 4, Funny

    Rather, the bombing she survived occurred at Ramstein Air Base in Germany, 20 years before.

    I happened to be at Rammstein the day after the bombing mentioned. The transmission from the car got blown over the top of a four-story building (other parts didn't quite make it through the building). Quite a powerful bomb that killed and hurt many people. I think it eventually got pinned on the Red Army Faction.

    The fun part was I was returning a Siemens teletype to the maintenance depot there, and the other guy in the VW pickup with me had forgotten his military ID (he had left it in his field jacket back at our base). So here we are pulling up to the main gate with this huge wooden crate in the back, and only one of us has any ID. We were lucky they didn't strip search us on the spot.

    Chip H.

  12. our current plan in full by Anonymous Coward · · Score: 2, Funny

    if it can't be recovered from the on-site week+ old backup, then we close the doors (if the doors are still there) and file for chapter (7, 11, 13, whatever the lawyer suggests)

  13. take a page from a bank by Archfeld · · Score: 4, Interesting

    They are the best prepared for a disaster, by virtue of being required to be open on the fourth day. Ever since the stock market crash, banks have exactly 3 days to recover from ANY disaster and open the doors or the federal government will step in and take over. The fines for failing to uphold any of the fed regs are ENORMOUS. Both BofA and Wells Fargo have used their plans successfully in the past: BofA in both SF during the quake and in LA during the riots, and Wells Fargo when its main headquarters burned. A good Contingency Operations Program is VERY EXPENSIVE, and requires many things beyond the obvious. Do your sales people have all their numbers in a Rolodex on their desk? Will they be able to function without it?

    --
    errr....umm...*whooosh* *whoosh* Is this thing on ?
  14. That developer by Faust7 · · Score: 4, Interesting

    If it wasn't for a single developer, who had made an unauthorized copy of the project on a floppy,

    I ask this question only half-jokingly:

    Was s/he fired?

    1. Re:That developer by gorbachev · · Score: 3, Informative

      As far as I know, no, he wasn't.

      --
      In Soviet Russia, I ruled you
  15. Disaster recovery, what 911 taught me by Anonymous Coward · · Score: 5, Insightful

    Run down on what I learned from 9-11.
    We're constantly under attack on some front. Hey, I knew this in my Marine Corps days; some attacks are just worse than others.

    What YOU should have learned from 9-11.
    Don't take life for granted. You're a freaking SysAdmin, a programmer, a Techie or god forbid some kind of manager that can be replaced. Work when you're at work, back shit up, and when you leave work, leave work; don't take it with you. If you're gone tomorrow, someone will notice; in a week there will be a new face in the crowd to replace you.
    You never really know when you're gonna be part of some F-ed up shit that is going to happen. Go surfing, get a Girlfriend, get a life outside of work.

    The most important disaster you should be planning for is your own, is this mentioned in the book?

  16. But whatever you do, by EvanED · · Score: 4, Funny

    ...when disaster strikes, don't forget your towel.

  17. Disaster Recovery != Survivable Network by sczimme · · Score: 3, Informative


    The Survivable Network Technology program at the Software Engineering Institute (part of Carnegie Mellon University) describes in detail what "survivable network" actually means. The author [of the book in the /. review] seems to have missed some key points. Nutshell version: a survivable network keeps going despite disasters, etc; moving to a different network to continue business does not mean you have a survivable network.

    In fact, a quick google on "survivable network" turns up several hits (on the first page) from the SEI.

    (Disclaimer: I used to work at the SEI, but in a different area.)

    --
    I want to drag this out as long as possible. Bring me my protractor.
    1. Re:Disaster Recovery != Survivable Network by villy · · Score: 2, Funny

      The editor probably thought "Survivable Network" had a more sexy yet ambiguous (profitable) connotation than "Survivable IT Infrastructure". My $.02.

    2. Re:Disaster Recovery != Survivable Network by khallow · · Score: 2, Insightful
      Nutshell version: a survivable network keeps going despite disasters, etc; moving to a different network to continue business does not mean you have a survivable network.

      What if it's cheaper to move your functions to a new network than maintain the old one after a disaster? Ie, if the new network appears exactly the same to the user as the old network did, then the network has "survived" whether or not it is the same network as before.

  18. Price by jdehnert · · Score: 3, Informative

    $40 at Barnes and Noble
    $28 at Amazon

    --
    Eschew Obfuscation
  19. Error-proof networks != Attack-proof networks by rfischer · · Score: 2, Informative

    There was an interesting article in Nature a while back... said that networks like the Internet, which are very tolerant of faults in links and nodes, are not so tolerant of intentional attacks on nodes with high connectivity.

    here's the ref. for the curious:
    Albert R, Jeong H, Barabasi AL, Error and attack tolerance of complex networks, Nature 406:378-382, 2000

  20. Re: Programming Satan's Computer by plcurechax · · Score: 2, Interesting

    Ross Anderson, professor at Cambridge University, has some works on this, including Programming Satan's Computer (PDF), which looks at cryptographic protocols being attacked by being deployed on a hostile system, such as satellite TV decoders that rely on smartcards which are in the possession of the attacker / customer.

    The Tamper Lab is pretty impressive too.

    Making your system reliable in the presence of a hostile attacker or on a hostile system is very hard, well nearly impossible.

  21. Lehman Brothers Headquarters by dvk · · Score: 2, Interesting

    #1: OK, small nitpick: Lehman's HQ was in WFC3, NOT in WTC1. However, it did have presence in all 4 buildings mentioned.

    #2: While theoretically Lehman was migrated more-or-less OK (we did have off-site backups, backup datacenter, etc...), in practice the only thing that saved them was the working-to-death of IT people in the next week.

    Many of the backups were made on the same-site servers. Restores were difficult, obviously (read: almost impossible in some cases).

    Many servers didn't have decent failover h/w in the backup datacenter. (Hint: the datacenter was increased by over 100% in 4 days, based on my visual estimates while carrying servers up there.)

    FYI, I was "blessed" with starting off with a 24-hour shift, and then pulling 12-hour night shifts for over a week. Considering the fact that 9/12/01 was my 1-month wedding anniversary and that both Mrs. and myself were in WTC1 when the plane flew into it, one can see how I was a bit upset at the management, ESPECIALLY since my own application failed over with no problems - i'd rather have spent more time with her.
    What did I get for all that effort? Yay! A plaque, with an image of WTC. Nice gesture, Mr. CEO! :(

    -DVK

    --
    "The right to figure things out for yourself is the only true freedom everyone shares. Go use it"-R.A.Heinlein
  22. A few things to keep in mind about Disaster Plans by Dolemite_the_Wiz · · Score: 3, Interesting

    1) Just because you have a disaster plan doesn't put your company in the clear. You've got to put it into action and make sure that this plan will be ready to go at a moment's notice.

    2) You've got to test the plan/Backups periodically.

    3) During 9/11 in NYC, the only portable communication devices that worked in the Twin Towers were Blackberry devices.

    4) A Remote, out of state, location for a backup datacenter is a good thing.

    5) If you need justification for Management for putting together a disaster plan, say this: "Which will cost more, putting together a Disaster Plan or repairing a company's reputation as a result of not having one?"

    Dolemite
    _______________________

    --
    Save the World! Use a Quote!
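
The point raised in comment 19 (networks that shrug off random faults can still be fragile when their best-connected nodes are targeted) can be illustrated with a small simulation. What follows is a rough sketch, not the method of the cited Nature paper: the graph generator, the function names, the node count, and the 15% removal fraction are all assumptions chosen for illustration, and the "attack" simply removes the highest-degree nodes of the intact graph in one pass.

```python
import random
from collections import deque


def preferential_attachment_graph(n, m, rng):
    """Build an undirected graph in which each new node links to m existing
    nodes chosen with probability proportional to their current degree."""
    adj = {i: set() for i in range(n)}
    # Start from a small fully connected core of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
    # Each node appears here once per edge endpoint, so a uniform draw
    # from this list is a degree-weighted draw over nodes.
    endpoints = [v for v in range(m + 1) for _ in adj[v]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints.extend((new, t))
    return adj


def largest_component(adj, removed):
    """Size of the largest connected component once `removed` nodes are gone."""
    seen = set(removed)  # treat removed nodes as already visited
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best


def compare(n=1000, m=2, fraction=0.15, seed=1):
    """Return (size after random failures, size after targeted attack)."""
    rng = random.Random(seed)
    adj = preferential_attachment_graph(n, m, rng)
    k = int(n * fraction)
    random_loss = rng.sample(sorted(adj), k)  # faults: nodes picked at random
    hub_loss = sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:k]  # attack: biggest hubs
    return largest_component(adj, random_loss), largest_component(adj, hub_loss)


if __name__ == "__main__":
    failed, attacked = compare()
    print("largest component after random failures:", failed)
    print("largest component after targeted attack:", attacked)
```

Under these assumptions, losing the hubs should leave a noticeably smaller connected core than losing the same number of randomly chosen nodes, which is the asymmetry the comment describes. For planning purposes, the practical reading is that redundancy sized for random hardware failure says little about resilience to the deliberate loss of a few heavily connected sites.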