Slashdot Mirror


City's IT Infrastructure Brought To Its Knees By Data Center Outage

An anonymous reader writes "On July 11th in Calgary, Canada, a fire and explosion were reported at the Shaw Communications headquarters. This took down a large swath of IT infrastructure, including Shaw's telephone and Internet customers, local radio stations, emergency 911 services, and provincial services such as Alberta Health Services computers and Alberta Registries. One news site reports that 'The building was designed with network backups, but the explosion damaged those systems as well.' No doubt this has been a hard lesson on how NOT to host critical public services."

10 of 102 comments

  1. First post! by Svartormr · · Score: 3, Informative

    I use Telus. >:)

    1. Re:First post! by clarkn0va · · Score: 5, Funny

      So Shaw customers get all their disappointment in one fell swoop, while you suffer subclinical abuse on an ongoing basis. Congrats.

      --
      I am literally 3000 tokens away from the chaotic crossbow --Stephen
  2. Or... by Transdimentia · · Score: 4, Insightful

    ... it just points out what should be practical thought: no matter how many redundancies you build, you can never escape the (RMS) Titanic effect. So stop claiming stupidity.

  3. No Site Level Resiliency? by sociocapitalist · · Score: 5, Insightful

    Whoever designed this should be smacked in the head. You never have critical services relying on a single location. You should have redundancy at every level, including geographic (i.e. not in the same flood / fault / fire zone).

    --
    blindly antisocialist = antisocial
    1. Re:No Site Level Resiliency? by sumdumass · · Score: 3, Insightful

      This is why I do not understand the rush to cloud space. The same types of outages that apply to locally hosting the data apply to the cloud space providers. You still need the backups and disaster plans with the ability to access the servers and such, much of the same stuff, if not more, than you would need if hosting it yourself. Is the cloud that much cheaper or something? Or is it more about marketing hype that talks PHBs and supervisors who want to sound cool into situations like this, where diligence is not necessarily a priority?

    2. Re:No Site Level Resiliency? by foradoxium · · Score: 3, Interesting

      Imagine if the Library of Alexandria had backup copies of all those books, manuscripts and other treasures. How about Constantinople? I'm sure there were people who tried to protect that data and believed it was worth more than their lives. I hope that brings stuff into perspective.

  4. Maybe the city/provinces should skip on redundancy by Anonymous Coward · · Score: 3, Interesting

    The issue with the city/provincial critical services is that they didn't have geographical redundancy due to the cost. Yes, the building had redundant power and networks, but it was the whole building that was affected by this. At the end of the day, Shaw did fuck up, but all the essential servers completely fucked up.

  5. Not surprising by Anonymous Coward · · Score: 3, Informative

    There are buildings all over the US that can have a similar effect, but worse. In Seattle it would be the Westin Tower: get the two electrical vaults in that building and you'll pretty much take most phone service, internet service and various emergency agency services all over the state offline for a while.

    What I now consider a classic example is the outage at Fisher Plaza. It not only took down credit card processors, Bing Travel and a couple of other big online services, it also took out Verizon's FiOS service for Western Washington.
    http://www.datacenterknowledge.com/archives/2009/07/03/major-outage-at-seattle-data-center/
    (apologies don't comment a lot and don't know how to properly link)

    The big problem is that many services, no matter how redundant they may seem to be, nowadays have an upstream geographic single point of failure (à la my Westin Tower example).

  6. What really happened... by Anonymous Coward · · Score: 4, Interesting

    Shaw had a generator overheat and literally blow up which damaged their other 2 generators and caused an electrical arc fire. This fire set off the sprinklers and in turn, the water shut down the backup systems.

    Yes, it was stupid that Shaw housed all their critical systems, including backups, in one building, but even more stupid was the fact that they used a water-based sprinkler system in a bloody telecom room.

    Also, Alberta has this wonderful thing called Alberta SuperNet, which, if I recall, all health regions used to use before our government decided to spend hundreds of millions of dollars to merge everything together and spend even more money to use the Shaw network to connect everything. The SuperNet was specifically designed with government offices in mind but nooo, why use something you have already paid for when you can spend more money and use something different?

  7. It was so bad.. by Megahard · · Score: 5, Funny

    It caused a stampede.

    --
    I eat only the real part of complex carbohydrates.