Slashdot Mirror


Explosion At ThePlanet Datacenter Drops 9,000 Servers

An anonymous reader writes "Customers hosting with ThePlanet, a major Texas hosting provider, are going through some tough times. Yesterday evening at 5:45 pm local time an electrical short caused a fire and explosion in the power room, knocking out walls and taking the entire facility offline. No one was hurt and no servers were damaged. Estimates suggest 9,000 servers are offline, affecting 7,500 customers, with ETAs for repair of at least 24 hours from onset. While they claim redundant power, because of the nature of the problem they had to go completely dark. This goes to show that no matter how much planning you do, Murphy's Law still applies." Here's a Coral CDN link to ThePlanet's forum where staff are posting updates on the outage. At this writing almost 2,400 people are trying to read it.

10 of 431 comments

  1. Kudos to their support team by QuietLagoon · · Score: 5, Insightful

    ... for posting frequent updates on the status of the outage.

    1. Re:Kudos to their support team by larien · · Score: 4, Insightful

      It's probably less effort to spend a few minutes updating a forum than it would be to man the phones against irate customers demanding their servers be brought back online.

  2. Re:Recovery costs by macx666 · · Score: 4, Insightful

    Not to mention the cost of pulling all those consultants in, overnight, on a weekend...

    Also, only the electrical equipment (and structural stuff) was damaged - networking and customer servers are intact (but without power, obviously). I read that they pulled in vendors. Those types would be more than happy to show up at the drop of a hat for some un-negotiated products that insurance will pay for anyway, and they'll even throw in their time for "free" so long as you don't dent their commission.

  3. Re:Server/customer ratio? by 42forty-two42 · · Score: 5, Insightful

    Wouldn't people who want such redundancy consider putting the other server in another DC?

  4. Re:Server/customer ratio? by p0tat03 · · Score: 4, Insightful

    ThePlanet is a popular host for hosting resellers. Many of the no-name shared hosting providers out there host at ThePlanet, amongst other places. So... Many of these customers would be individuals (or very small companies), who in turn dole out space/bandwidth to their own clients. The total number of customers affected can be 10-20x the number reported because of this.

  5. Re:More planning could have prevented this by ottawanker · · Score: 5, Insightful

    So you're agreeing with me. The servers getting blown up was a huge mistake, one that certainly could have been avoided with a little proper planning. You are a fucking moron.

  6. I'm a customer in that DC, and I'm a firefighter by CFD339 · · Score: 4, Insightful

    My servers dropped off the net yesterday afternoon, and if all goes well they'll be up and running late tonight. At 1700PST they're supposed to do a power test, then start bringing up the environmentals, the switching gear, and blocks of servers.

    My thoughts as a customer of theirs:

    1. Good updates. Not as frequent or clear as I'd like, but mostly they didn't have much to add.

    2. Anyone bitching about the thousands of dollars per hour they're losing has no credibility with me. If your junk is that important, your hot standby server should be in another data center.

    3. This is a very rare event, and I will not be pulling out of what has been an excellent relationship so far with them.

    4. I am adding a failover server in another data center (their Dallas facility). I'd planned this already but got caught being too slow this time.

    5. Because of the incident, I will probably make the new Dallas server the primary and the existing Houston one the backup. This is because I think there will be long-term stability issues in this Houston data center for months to come. I know what concrete, drywall, and fire extinguisher dust does to servers. I also know they'll have a lot of reconstruction work ahead, and that can lead to other issues.
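
    The primary/backup arrangement in points 4 and 5 could be sketched roughly as below. This is only an illustration, not anything ThePlanet or the poster runs: the site names, hostnames, and port are hypothetical placeholders, and a plain TCP connect is assumed to be a good enough health check.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def pick_active(sites, check=is_reachable):
    """Return the name of the first reachable site, or None if all are down.

    `sites` is an ordered list of (name, host, port) tuples, with the
    preferred (primary) site first and backups after it.
    """
    for name, host, port in sites:
        if check(host, port):
            return name
    return None


# Hypothetical layout: Dallas as primary, Houston as backup.
SITES = [
    ("dallas", "primary.example.invalid", 80),
    ("houston", "backup.example.invalid", 80),
]
```

    Calling pick_active(SITES) from a machine outside both data centers gives the site that traffic should go to; actually repointing DNS or a load balancer is a separate step that this sketch leaves out.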

    For now, I'll wait it out. I've heard of this cool place called "outside". Maybe I'll check it out.

    --
    The problem with quotes on the internet, is that nobody bothers to check their veracity. -- Abraham Lincoln

  7. UMM.. USE STATIC PAGE?? by kyoorius · · Score: 5, Insightful

    There's no reason to use the forum software when they've locked the thread and are only using it to disseminate information. A Pentium one running lighttpd serving a static html page would be sufficient to handle the flood of requests.
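
    As a rough sketch of the static-page idea: the updates could be rendered once into a plain HTML file, which a web server such as lighttpd then serves as an ordinary file. The function name and the (timestamp, text) update format are assumptions made for this illustration; this is not how ThePlanet's forum works.

```python
from html import escape


def render_status_page(title, updates):
    """Return a complete static HTML page listing updates, newest first.

    `updates` is a list of (timestamp, text) string pairs in posting order.
    All text is HTML-escaped so that update text cannot inject markup.
    """
    items = "\n".join(
        f"    <li><strong>{escape(when)}</strong>: {escape(text)}</li>"
        for when, text in reversed(updates)
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        "<body>\n"
        f"  <h1>{escape(title)}</h1>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )
```

    Staff would write the returned string to a file in the server's document root after each update, so every reader request is a plain file read with no forum software or database involved.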

  8. Re:5 servers, 5 cities, 5 providers by aronschatz · · Score: 4, Insightful

    Yeah, because everyone can afford redundancy like you can.

    Most people own a single server that they make backups of in case of it crashing OR have two servers in the same datacenter in case one fails.

    I don't know how you can easily do off-site switchover without a huge infrastructure to support it, which most people don't have the time or money to build.

    Get off your high horse.

  9. I'm a firefighter AND a geek. You, not so much. by CFD339 · · Score: 4, Insightful

    Look, when I go into a building in gear and carrying an axe and an extinguisher, breathing bottled air, wading through toxic smoke, I couldn't give crap number one about your 100 sites being down.

    I have a crew to protect. In this case, I'm going into an extremely hazardous environment. There has already been one explosion. I don't know what I'm going to see when I get there, but I do know that this place is wall to wall danger. Wires everywhere to get tangled in when it's dark and I'm crawling through the smoke. Huge amounts of current. Toxic batteries everywhere that may or may not be stable. Wiring that may or may not be exposed.

    If it's me in charge, and it's my crew making entry, the power is going off. It's getting a lock-out tag on it. If you won't turn it off, I will. If I do it, you won't be turning it on so easily. If need be, I will have the police haul you away in cuffs if you try to stop me.

    My job, as a firefighter -- as a fire officer -- is to ensure the safety of the general public, of my crew, and then if possible of the property.

    NOW -- As a network guy and software developer -- I can say that if you're too short-sighted or cheap to spring for a secondary DNS server at another facility, or if your servers are so critical to your livelihood that losing them for a couple of days will kill you but you haven't bothered to go with hot spares at another data center, then you, sir, are an idiot.

    At any data center - anywhere - anything can happen at any time. The f'ing ground could open up and swallow your data center. Terrorists could target it because the guy in the rack next to yours is posting cartoon photos of their most sacred religious icons. Monkeys could fly out of the site admin's [nose] and shut down all the servers. Whatever. If it's critical, you have off-site failover. If not, you're not very good at what you do.

    End of rant.

    --
    The problem with quotes on the internet, is that nobody bothers to check their veracity. -- Abraham Lincoln