Slashdot Mirror


1 In 3 Data Center Servers Is a Zombie

dcblogs writes with these snippets from a ComputerWorld story about a study that says nearly a third of all data-center servers are comatose ("using energy but delivering no useful information"). What's remarkable is this percentage hasn't changed since 2008, when a separate study showed the same thing. ... A server is considered comatose if it hasn't done anything for at least six months. The high number of such servers "is a massive indictment of how data centers are managed and operated," said Jonathan Koomey, a research fellow at Stanford University, who has done data center energy research for the U.S. Environmental Protection Agency. "It's not a technical issue as much as a management issue."

25 of 107 comments

  1. Money by 14erCleaner · · Score: 4, Insightful

    It's not a management issue, either - it's money. People cost more than dead servers.

    --
    Have you read my blog lately?
    1. Re:Money by gstoddart · · Score: 4, Insightful

      But how hard is it to automate a process that says, in effect, "if no data is going in or out of this server, shut it down"?

      Why should the data center even care?

      Most of them are essentially charging rent ... as long as the customer keeps paying, WTF do they care if you actually use them for anything?

      This isn't incompetence on the part of the data centers. If anything, it's on the companies that have lost track of what their machines are for.

      --
      Lost at C:>. Found at C.
    2. Re:Money by myowntrueself · · Score: 2

      Money (or lack of it) IS a management issue....

      But how hard is it to automate a process that says, in effect, "if no data is going in or out of this server, shut it down"? I suspect that there is a more nefarious purpose here and I propose a corollary to Hanlon's (Heinlein's) Razor:

      This is the 21st Century - "You have attributed conditions to villainy that simply result from villainy". Incompetence is for the proletariat - we're the NSA. You're toast.

      If a customer is paying for it to be there and be kept turned on, *maybe* that customer has some use for the server. Oh, I don't know, maybe it's a hot spare in case another server in another data center goes down? So you turn it off, their other server goes down, their service can't fail over, and now your customer has a problem.

      --
      In the free world the media isn't government run; the government is media run.
    3. Re:Money by thegarbz · · Score: 2

      Depends. If you can virtualise, then you can over-provision. I'd love having multiple people pay rent for the same system.

    4. Re:Money by jbolden · · Score: 2

      At this point, for almost all companies, good-quality colo space is effectively infinite. Most of the time a company isn't even using a meaningful fraction of its colo's space, so it could double or triple instantly without hassle, to say nothing of an extra 33%. And even if its colo doesn't have the room, others directly connected to it do have extra space... So consider space infinite once you are willing to rent.

      That being said, I have problems believing the 1/3rd of servers figure from the article. That's not my experience at all.

  2. Sounds about right. by Anonymous Coward · · Score: 2, Insightful

    One in three people consumes energy and produces nothing interesting.

    1. Re: Sounds about right. by Anonymous Coward · · Score: 2, Funny

      Like this comment.
      Crap, now it's 2 out of 3.

  3. Re:Zombies or fail over? by prefec2 · · Score: 4, Informative

    A fail-over server is not considered useless. They did not monitor server output and decided then after a period of time that the servers were not doing anything. You can infer this knowledge by reading the "paper", as they switched these servers off after identifying them. Switching off fail-over servers normally would raise alarms and then you get thrown out ;-) So you could safely assume that they mean unused servers.

  4. Bad Title by seven+of+five · · Score: 4, Informative

    Reading the title, my first thought was, cripes, those botnets have taken over everything!

    1. Re:Bad Title by JustAnotherOldGuy · · Score: 2

      Lol, I had the same exact thought here.

      --
      Just cruising through this digital world at 33 1/3 rpm...
  5. Obviously by penguinoid · · Score: 5, Insightful

    Those are the servers hosting Slashdot's new "share" button. No one's ever clicked on it.

    --
    Don't waste your vote! Vote for whoever you want, unless you live in a swing state it won't matter anyways
    1. Re:Obviously by jones_supa · · Score: 2

      Also the "Share" links under comments are quite redundant as well IMHO.

  6. Re:Yes, it's called redundancy by petes_PoV · · Score: 4, Informative

    In a modern data center you would be able to shut down the servers not used for a longer period and restart them automatically when the load rises.

    Many businesses that rely on servers (i.e. all of them) will be running hot standby systems - ones that can automatically take load if there's a hardware failure or software problem.

    One major (world-ranked) international company I consulted at was legally required to have 100% failover capacity - so it was inevitable that they would automatically have 50% of their production servers performing no functions - except for the twice a year when they were "flipped" just to make sure that each set of servers worked as expected.

    Although the source paper does specify physical "zombie" servers, if you need failover VMs, the same basis is applied there, too.

    --
    politicians are like babies' nappies: they should both be changed regularly and for the same reasons
  7. They are not consuming 30% of power by iamacat · · Score: 3, Insightful

    Modern systems are good at reducing power consumption when idle. It's quite reasonable to have 30% of capacity as spares, reserve for unexpected load, capacity for new apps and so on. They probably consume 3% of the power and nobody is motivated enough to look for more savings. Keeping things completely off is problematic, because you never know how much of the hardware and software will come up in time to handle an emergency unless you run and test it all the time.

    There is certainly room for further environmental/financial improvement, but the 30% figure is sensationalized.

    1. Re:They are not consuming 30% of power by Dagger2 · · Score: 2

      Maybe. But on the other hand, even active servers spend a lot of their time idle (the paper says server utilization "rarely exceeds 6%"), and I bet a lot of these "comatose" servers are actually long-forgotten old hardware, or machines that nobody can be bothered to decommission -- it's possible that on average they're older than active servers and thus eating a lot more power.

  8. Chaos Monkey by Netflix by tepples · · Score: 2

    I was under the impression that a fail-over server that does not occasionally handle traffic in periodic tests could not be trusted to handle traffic in a true failure situation. Netflix routinely conducts tests of its failover infrastructure, shutting down large blocks of its leased Amazon capacity to make sure the rest of its capacity can keep up.

  9. Re:Zombies or fail over? by slydder · · Score: 5, Informative

    I've been in IT Management for 15+ years and I can assure you it is a good thing you are not in management. I would lose my job in a heartbeat if a production server decided to take a dump and I had shut off all our fail-over servers.

    It's not just a matter of what those fail-over servers cost. It's the question "Can we afford (financially) to NOT have fail-over servers?". If you stand to lose more due to a production server failure than the cost of running a fail-over for a year then you will not EVER wish to be caught without one.
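
The rule of thumb in the comment above can be written as a one-line comparison. This is only a sketch of that rule as stated, not a full risk model: the function name and the dollar figures are hypothetical, and it ignores how likely a failure actually is.

```python
def failover_pays_off(loss_from_one_failure: float, yearly_failover_cost: float) -> bool:
    """True if a single production failure would cost more than a year of fail-over."""
    return loss_from_one_failure > yearly_failover_cost


# Hypothetical figures, for illustration only.
print(failover_pays_off(loss_from_one_failure=50_000, yearly_failover_cost=12_000))
print(failover_pays_off(loss_from_one_failure=5_000, yearly_failover_cost=12_000))
```

With these made-up numbers the first case favors keeping the fail-over server and the second does not.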

  10. Re:Yes, it's called redundancy by iamacat · · Score: 2

    A hardware server start may take ten minutes - if it actually comes up successfully. If you are starting a cluster in an emergency outage, you never know how many servers, power supplies and network switches kicked the bucket since you last used them. Plus, your DNS, NFS, db and other dependencies have to be unaffected by the outage and handle the added load of hundreds of servers starting at the same time. If you do a staggered restart of 100 servers in groups of 10, that's an hour and 40 minutes of outage if everything goes without a hitch. Worth the power savings from idle standby?
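
The arithmetic behind the "hour and 40 minutes" figure can be checked with a short sketch. It assumes, as the comment does, that each group takes the full ten minutes and that groups start strictly one after another; the function name is made up for illustration.

```python
import math


def staggered_restart_minutes(servers: int, group_size: int, minutes_per_group: float) -> float:
    """Wall-clock time when groups of servers boot one group after another."""
    groups = math.ceil(servers / group_size)  # a partial last group still costs a full slot
    return groups * minutes_per_group


total = staggered_restart_minutes(servers=100, group_size=10, minutes_per_group=10)
hours, minutes = divmod(int(total), 60)
print(f"{hours} h {minutes} min")  # prints "1 h 40 min"
```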

  11. Re:Zombies or fail over? by pfleming · · Score: 2

    I've been in IT Management for 15+ years and I can assure you it is a good thing you are not in management. I would lose my job in a heartbeat if a production server decided to take a dump and I had shut off all our fail-over servers.

    It's not just a matter of what those fail-over servers cost. It's the question "Can we afford (financially) to NOT have fail-over servers?". If you stand to lose more due to a production server failure than the cost of running a fail-over for a year then you will not EVER wish to be caught without one.

    How is it a failover server if no data has traveled into or out of the machine in six months? Wouldn't you want to keep a failover server up to date (data and software updates) so you don't notice the failover? What good is a failover server if you have to load six months of data from tape? The machine could be off until you need it in that case.

  12. Re:Yes, it's called redundancy by Iamthecheese · · Score: 2

    Because doing it right involves a full fail-over test including transferring loads or test loads, DNS auto-reconfiguration, and possibly even paying extra to bring up extra capacity elsewhere. You need to make sure it happens right when it's needed. Extra paperwork, overtime, it's all in there.

    --
    If video games influenced behavior the Pac Man generation would be eating pills and running away from their problems.
  13. Re:Zombies or fail over? by rubycodez · · Score: 2

    wrong, you don't understand how it's usually done these days

    it only need have the ability to access a SAN where replicated information from the primary server exists

    you will not see any data movement to the machine

  14. Bad terminology by pubwvj · · Score: 4, Insightful

    Unfortunate confusion of terminology. "Zombie computers" is a term also used to mean those taken over by botnets.

  15. Re:Zombies or fail over? by rubycodez · · Score: 3, Informative

    yes, but these researchers were ignoring traffic below a certain threshold.
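
A threshold-based check of the kind described here might look roughly like the sketch below. The thread does not give the study's actual threshold or method, so the threshold value, the 180-day stand-in for "six months", and the function name are all assumptions.

```python
from typing import Sequence


def is_comatose(daily_bytes: Sequence[int], threshold_bytes: int, window_days: int = 180) -> bool:
    """Flag a server whose traffic stayed below the threshold on every day of the window."""
    if len(daily_bytes) < window_days:
        return False  # not enough history to judge
    recent = daily_bytes[-window_days:]
    return all(b < threshold_bytes for b in recent)


# Hypothetical traffic logs: heartbeat-only traffic vs. one busy day.
quiet = [200] * 180
busy_once = [200] * 179 + [5_000_000]
print(is_comatose(quiet, threshold_bytes=1_000))
print(is_comatose(busy_once, threshold_bytes=1_000))
```

Note that a standby machine that only exchanges low-volume heartbeats would be flagged by a check like this, which is the concern raised elsewhere in the thread.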

  16. So, an average 1.33 safety factor? by spiritplumber · · Score: 2

    A bit low, but reasonable. Try making stuff that goes on ships, there's usually double redundancy AND a completely mechanical system in case everything goes to pot.

    --
    Liberty - Security - Laziness - Pick any two.
  17. turn them into mail servers ... by Skapare · · Score: 2

    turn them into mail servers ... then spammers will keep them active.

    --
    now we need to go OSS in diesel cars