Slashdot Mirror


Power Problems Force Seattle To Throttle City Data Center For Days

Nerval's Lobster writes with an excerpt from sister site SlashDataCenter: "On Aug. 23, Mayor Mike McGinn of Seattle informed residents that the city would partially shut down its municipal data center for five days including the Labor Day weekend. As a result, city residents will be unable to pay bills, apply for business licenses, or take advantage of other online services. In a Webcast press conference, McGinn isolated the issue as a failure in one of the electrical 'buses' that supplies power to the data center. Because that piece of equipment began overheating, the city had to begin taking servers and applications offline to prevent overloading the system. The maintenance will cost the city $2.1 million of its maintenance budget. A second power bus will remain operational, supplying enough electricity to power redundant systems for critical life and fire safety systems, including 911 services and fire dispatch. The city's Web sites should also be up and running in some capacity."

23 of 85 comments

  1. Hey, so let's post it to Slashdot! by Anonymous Coward · · Score: 4, Funny

    That should help the situation.

  2. Seattle Times, Where Are You? by Frosty+Piss · · Score: 2

    Interesting that this is not on the front page of the Seattle Times. In fact, I can't find it at Washington's biggest paper at all.

    --
    If you want news from today, you have to come back tomorrow.
  3. 5 days no government, is that so bad? by DevotedSkeptic · · Score: 2

    If you lived in podunk nowhere then no, probably not; if emergency services continue to operate it wouldn't be a big issue. But for such a large municipality to go dark for 5 days...would definitely be impactful locally and possibly regionally/nationally to a smaller degree. Emergency services are very important but the business of government (no matter how I feel about it from time to time) needs to continue and serve its people...I am sure (at least I hope) that they looked into portable power generation, but it seems that this is a poor solution. Just my 2 pennies.

    --
    Chief Thinker www.devotedskeptic.com
    1. Re:5 days no government, is that so bad? by jrmcferren · · Score: 3, Insightful

      They have the power, they just can't get it where they need it without equipment overheating. Since it is a busbar overheating you can't just switch over to emergency power to fix it, you have to route power around the issue which is not economically feasible in this case except for the emergency services systems which can use their redundant power supplies.

      --
      sudo mod me up
    2. Re:5 days no government, is that so bad? by DevotedSkeptic · · Score: 2

      Well, being able to keep up emergency services is definitely most important. I don't think we are getting the whole story, since either something was added to create extra electrical draw or something is failing. I wonder if that is the 2.1 million spoken of to add capacity...or repair.

      --
      Chief Thinker www.devotedskeptic.com
    3. Re:5 days no government, is that so bad? by dnay · · Score: 2

      They have the power, they just can't get it where they need it without equipment overheating. Since it is a busbar overheating you can't just switch over to emergency power to fix it, you have to route power around the issue which is not economically feasible in this case except for the emergency services systems which can use their redundant power supplies.

      Run down to Autozone and grab a couple dozen jumper cables.

      --
      Since I gave up hope, I feel much better.
    4. Re:5 days no government, is that so bad? by Isaac-1 · · Score: 2

      Am I the only one to think, how many modern servers does the city of Seattle really need? Google says the population is only 608,000 in 2010

    5. Re:5 days no government, is that so bad? by Anonymous Coward · · Score: 3, Funny

      The city of Seattle, or any modern city, needs exactly three modern servers to provide their public services. And two of them are to provide redundancy for the one that does the actual work. Internally they may need more servers for VDI or some such, or need to physically isolate one service from another. But one modern server is adequate to provide all of the public services Seattle provides, and two more provide geographic redundancy through their fiber network, which could be upgraded to 100 Gig for a reasonable cost because they own the fiber and the endpoints. The devil is in the I/O, and SSD takes care of that.

      But I can't tell them that. I sell their multitudinous departments a lot of servers.

  4. Forgetting something? by LostCluster2.0 · · Score: 2

    If power problems are downing the city's datacenter for a holiday weekend, couldn't they just rent a few $100/mo servers and run the city apps on them for the downtime and make the problems transparent to the end user? No one-place site is ever safe for important apps, we call that a Single Point of Failure around here.

    --
    I'm LostCluster but I lost my password to that user. Hey Slashdot, how about helping me get it back!
  5. the cloud by Lord+Ender · · Score: 2

    Seattle? The home of Amazon? Why on earth don't they just move their datacenter to Amazon Web Services? They could probably do it for less than the $2.1 million they're spending on this single part!

    --
    A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
    1. Re:the cloud by 93+Escort+Wagon · · Score: 2

      It's also the home of Microsoft; and Google is also strongly represented. You can't afford to piss off any of these guys...

      --
      #DeleteChrome
  6. Re:Looking to move to Seattle... by 93+Escort+Wagon · · Score: 3, Informative

    I live south of Seattle, and work in the city.

    Any political gridlock is largely because current Mayor McGinn is a joke. Seattle is a fairly liberal city, but McGinn was largely seen as too extremely left-wing to be electable even there; so he remade himself into a pragmatist - a change that lasted until he was sworn in. McGinn made specific promises pre-election that he wouldn't let his personal ideology affect policies where the citizenry clearly differed from him... then he turned around and spent most of his time fighting ideological political battles, ignoring real problems while devoting 100% of his time to tilting at his personal windmills.

    --
    #DeleteChrome
  7. 911 and emergency services by girlintraining · · Score: 4, Insightful

    What I'm trying to figure out is why 911 and emergency services didn't have a separate offsite backup. I mean, how much more mission critical can you get than that? Every time I see one of these articles I think to myself: Why are they mentioning this if there wasn't some risk of failure? And the answer is... because quite obviously, there was some risk.

    I don't want my cause of death to be "Your call could not be completed as dialed. Please check the number and try your call again later..."

    --
    #fuckbeta #iamslashdot #dicemustdie
    1. Re:911 and emergency services by adolf · · Score: 2

      In my experience with 911 and emergency communications (none of which is anywhere near the scale of what Seattle must have), they have power redundancy (consisting of one or more UPS and one or more standby generator), connectivity redundancy (multiple telephone/data circuits going to different places), and physical redundancy.

      So if one 911 PSAP goes completely offline for any reason, there is one or more geographically independent backup PSAPs which can take over in quid-pro-quo fashion.

      Do things get a little bit hairier when this happens? Absolutely: You've got folks who, no matter how good they are at doing their usual job, are now doing a somewhat different and more complex job. Efficiency goes to shit, but more hands are easily called in/moved around to help with that in short order.

      So. The 911 phone will still be answered, and your ambulance/fire brigade/armed posse is still within easy grasp.

  8. Re:Okay, A Point Here by AK+Marc · · Score: 2

    I've had a similar issue with a private data center. There wasn't a UPS bypass switch because the UPS had an internal bypass switch (installed with the datacenter years before). But the UPS was old, and a new UPS was cheaper than replacing all the batteries (and more powerful with better features). So my coworker planned out the switch, 2 days outage over a weekend. Of course, since I took most of the classes to be an EE, I re-drew the plans and got the project done with half the labor time and two 30-second outages (well, both were about a second, but longer than the time a server could live without power, so it was safer to turn everything off as if it were a longer outage). The problem was caused by a stupid "cost saving" choice on installation.

    Sounds like something similar here, where there's an issue with part of the redundancy, but it's not actually capable of running fully redundantly. Otherwise, cut everything over, then fix it. Or just turn it off and fix it (and the power will flow). I've seen it more than once in corporate world, so it's not an example of governmental oops, just IT oops.

  9. Re:It has to be said... by AK+Marc · · Score: 2

    The cloud doesn't need power?

  10. Use the remote site by hawguy · · Score: 2

    Why don't they just fail over the critical life and fire safety systems to the backup datacenter, and keep normal services up at the primary datacenter while they do the work? They do have a second site, right? Surely no one would host a system deemed "critical" and "life safety" at a single site?

    1. Re:Use the remote site by hawguy · · Score: 2

      Because while things may have been well designed originally or planned including all the fancy redundancy, after years of no major issues it becomes a target of its own success: cutbacks and people saying "see, we never needed it, and look at how much money we can save". Such is the way of things.

      If you personally are worried about 911 services being out then go write down the various 7 (or 10-digit if your exchange requires it) numbers for your local emergency services. 911 is not an exclusive to reach them, just the easiest. Whatever happened to the days of the list of those various numbers on the fridge? I'm not even that old and I remember my parents having the list posted just in case.

      I thought I was already paying for a reliable E-911 service through the 911 service fees we've all been paying on our phone bills for years.

      So what you're saying is that even though we've been paying for 911 for years, we've been paying for cheap, non-redundant service, and if we expect the type of multi-site redundancy that's normally reserved for moderately successful websites, then we need to pay even more? What value are we getting from the hundreds of millions of dollars already collected?

      I've called 911 a handful of times, but never from my own house so I'm not sure how that list of phone numbers taped to the fridge is supposed to help me. There used to be a time when you could count on finding a phone book under the phone in your friend's house with the local emergency numbers inside the front cover, but I haven't seen a phone book at a friend's house in years.

  11. overheating power buses / wires are a fire risk by Joe_Dragon · · Score: 2

    overheating power buses / wires are a fire risk and that comes from them being under sized for the load.

    See the towering inferno to see where that can get you.
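
    The undersizing point can be put in rough numbers. What follows is a minimal sketch, assuming a plain copper conductor and simple resistive (I²R) heating; the 20 m length, the cross-sections and the currents are hypothetical illustration values, not figures from the Seattle data center, and real busbar overheating can also come from bad joints or poor ventilation.

```python
# Resistive heating in a conductor: P = I^2 * R, with R = rho * L / A.
# All dimensions and currents below are made-up illustration values.

COPPER_RESISTIVITY_OHM_M = 1.68e-8  # approximate resistivity of copper at 20 C


def conductor_heat_watts(current_a, length_m, area_mm2,
                         resistivity=COPPER_RESISTIVITY_OHM_M):
    """Return the heat dissipated (watts) by a uniform conductor."""
    area_m2 = area_mm2 * 1e-6                      # mm^2 -> m^2
    resistance_ohm = resistivity * length_m / area_m2
    return current_a ** 2 * resistance_ohm


# Hypothetical 20 m run: same bar at two loads, then a thinner bar.
heat_1000a = conductor_heat_watts(1000, 20, 600)
heat_2000a = conductor_heat_watts(2000, 20, 600)
heat_thin_bar = conductor_heat_watts(1000, 20, 300)

print(f"1000 A, 600 mm^2: {heat_1000a:.0f} W")
print(f"2000 A, 600 mm^2: {heat_2000a:.0f} W")
print(f"1000 A, 300 mm^2: {heat_thin_bar:.0f} W")
```

    Under these assumptions, doubling the load quadruples the heat, and halving the cross-section doubles it, which is why a bus sized for an older, smaller load can run hot after years of added equipment.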

  12. Re:Looking to move to Seattle... by symbolset · · Score: 4, Funny

    Seattle has great parking. You can park your car on I5 for several hours each day without concern that traffic might move forward while you're shopping.

    --
    Help stamp out iliturcy.
  13. Re:Okay, A Point Here by CAIMLAS · · Score: 2

    I had an almost identical situation happen to me this past spring, too. I was the sysadmin at one of the facilities. It happened right after I gave my two weeks, and damn was I busy. :P I ended up having to take all my UPSes off the mains and run them over some two phase at one point to get additional power onto a secondary genset, because the amp load simply was too high (oops, poor planning - someone forgot to figure high load overhead amperage requirements).

    Unlike this situation, my situation only had a single power run due to the topographical location of where we were: on top of a hill/small mountain, on the edge of a park. There were 5 fairly sizeable facilities on the hill, some of which have some fairly significant power requirements due to the type of work they perform (lots of sciencey stuff).

    Fortunately, all of the buildings had (100 KW+) gensets. Unfortunately, only one of the 5 was NG, and the others were diesel. This gets really costly, really quickly, since it's California, diesel's at something like $4.50/gallon, and the things will burn through a full 500 gallon tank in a day at around 60% utility. So we're talking ~$10k a day just to keep these things fueled (including an extra pulled up due to additional crunch demand).
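
    The fuel estimate above can be checked with a few lines of arithmetic. This is a rough sketch using only the figures quoted in the comment (five gensets, one on natural gas, about 500 gallons per day each, about $4.50 per gallon); treating the "extra" unit as one more diesel genset burning at the same rate is an assumption.

```python
# Rough daily diesel cost for the gensets described in the comment.
# Inputs come from the comment; the single extra unit is an assumption.

def daily_diesel_cost(units, gallons_per_unit_per_day, price_per_gallon):
    """Return the fuel cost in dollars for one day of running `units` gensets."""
    return units * gallons_per_unit_per_day * price_per_gallon


TOTAL_GENSETS = 5
NATURAL_GAS_GENSETS = 1
DIESEL_GENSETS = TOTAL_GENSETS - NATURAL_GAS_GENSETS  # 4 diesel units
GALLONS_PER_DAY = 500      # one full tank per day at ~60% utilization
PRICE_PER_GALLON = 4.50    # approximate California price quoted above

base_cost = daily_diesel_cost(DIESEL_GENSETS, GALLONS_PER_DAY, PRICE_PER_GALLON)
cost_with_extra = daily_diesel_cost(DIESEL_GENSETS + 1, GALLONS_PER_DAY,
                                    PRICE_PER_GALLON)

print(f"Four diesel gensets: ${base_cost:,.0f} per day")
print(f"With one extra unit: ${cost_with_extra:,.0f} per day")
```

    That gives $9,000 per day for the four diesel units and $11,250 with a fifth, so "~$10k a day" is a fair ballpark.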

    Plant faculty - probably a good 30-60 people in all - were in the conduit going up the hill for a day trying to figure out where the fault was, and then another three days getting new cable run and relay substation. (God, I hate how slow many union workers work.) Turns out the relay fused up pretty solidly, welding itself nicely into the culvert.

    I seem to recall talk back and forth that the total damage was going to be over $500,000, so it really doesn't surprise me that a large city's power infrastructure would cost a multiple of that. If cities are like some of the hospitals I've seen, they've got lecherous IT sales people at their door on an almost-daily basis. They also buy a lot of the crap the sales people are peddling, much of which seems to (still) require being run on its own proprietary platform and/or a dedicated piece of hardware. And then, the old systems don't really go away until they die, and there's a cost incurred to recover the lost data - because they're non-profit, they don't really seem to understand cost of maintenance, depreciation, or anything like that. So, I can certainly see the power requirements for some poorly designed cluster for public facing things, a handful or three of interface systems to tie in with the governmenty systems, and so on.

    In my mind, it makes sense that they just shut those services down temporarily. "Forced vacation use" for city workers, maybe? They'll save a lot more than 2.5 million that way, if they can do it, I'm sure (funny how government is able to cut costs when there's no alternative :P). I imagine it's too much of a cost and/or risk to try to move essential services (fire/PD/911) to the hot site, and really no reason to do so, especially when they've not yet tested their DR plan.

    --
    ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
  14. Re:2.1 million? What? by Xero · · Score: 4, Informative

    The datacenter is on the 26th floor of the municipal tower and the overheating bus runs up to that floor. The power company in question is municipally owned, either way it would be the city's problem.

  15. Story doesn't make sense by Xero · · Score: 2

    McGinn had quite a few facts wrong in the press conference. The equipment is working fine now and the overheating only caused a minor amount of downtime. The major issue though was the backup generator never kicked in because as it turns out, the electric starter for the diesel generator is connected to the same bus. Labor Day weekend was then chosen to fix this majorly obvious design deficiency.