1 In 3 Data Center Servers Is a Zombie
dcblogs writes with these snippets from a ComputerWorld story about a study that says nearly a third of all data-center servers are comatose ("using energy but delivering no useful information"). What's remarkable is this percentage hasn't changed since 2008, when a separate study showed the same thing. ... A server is considered comatose if it hasn't done anything for at least six months. The high number of such servers "is a massive indictment of how data centers are managed and operated," said Jonathan Koomey, a research fellow at Stanford University, who has done data center energy research for the U.S. Environmental Protection Agency. "It's not a technical issue as much as a management issue."
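For illustration only, the six-month rule quoted above could be applied to an inventory roughly like this. It is a minimal sketch, not the study's method: the 182-day threshold, the function names, and the sample inventory are all assumptions, and deciding what counts as "useful activity" is the hard part that this code takes as given.

```python
from datetime import date, timedelta

# Assumption: "at least six months" is approximated as 182 days.
COMATOSE_AFTER = timedelta(days=182)

def is_comatose(last_activity: date, today: date) -> bool:
    """True if the server has shown no useful activity for six months or more."""
    return today - last_activity >= COMATOSE_AFTER

def comatose_fraction(last_activity_by_server: dict, today: date) -> float:
    """Fraction of servers in the inventory that count as comatose."""
    if not last_activity_by_server:
        return 0.0
    comatose = sum(is_comatose(seen, today) for seen in last_activity_by_server.values())
    return comatose / len(last_activity_by_server)

# Hypothetical inventory: server name -> date of last useful activity.
inventory = {
    "web-01": date(2015, 6, 1),
    "db-02": date(2014, 9, 15),
    "batch-03": date(2015, 5, 20),
}
fraction = comatose_fraction(inventory, today=date(2015, 6, 15))
print(f"{fraction:.0%} of servers look comatose")
```

Note that a hot spare or DR box that is healthy but idle would be flagged by this rule too, which is the objection several comments below raise.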
It's not a management issue, either - it's money. People cost more than dead servers.
Have you read my blog lately?
Are these really dead servers, or only servers that are there in case of failure elsewhere?
So you leave zombies on the wire, staking your claim. Then when you need the space, you swap it. Otherwise it's an endless wait for power, cooling, CAB, governance, and all sorts of fail.
We need enough servers for peak load, not average load.
One in three people consumes energy and produces nothing interesting.
Reading the title, my first thought was, cripes, those botnets have taken over everything!
Apparently, the researchers have never heard of business continuity planning. If your primary data center gets knocked offline because your company located it in a hurricane-prone area of the country in order to take advantage of state tax breaks and a cheaper labor force (happens all the time), then you're gonna need another site you can switch your data/voice traffic over to instantly when the inevitable hurricane hits. That means maintaining a certain amount of redundant equipment at the failover site that will largely just sit there until it's needed for disaster recovery. None of this is mentioned in that paper, which reads more like an advertisement for some software that measures energy efficiency.
Those are the servers hosting Slashdot's new "share" button. No one's ever clicked on it.
Don't waste your vote! Vote for whoever you want, unless you live in a swing state it won't matter anyways
Modern systems are good at reducing power consumption when idle. It's quite reasonable to have 30% of capacity as spares, reserve for unexpected load, capacity for new apps and so on. They probably consume 3% of the power and nobody is motivated enough to look for more savings. Keeping things completely off is problematic, because you never know how much of the hardware and software will come up in time to handle an emergency unless you run and test it all the time.
There is certainly room for further environmental/financial improvement, but the 30% figure is sensationalized.
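A quick back-of-the-envelope check of the "30% of capacity, 3% of the power" guess above. This is a sketch under a simplifying assumption that every busy server draws the same power and every idle server draws a fixed fraction of that; the function names are made up for the example.

```python
def idle_power_share(idle_fraction: float, idle_to_busy_power: float) -> float:
    """Share of total power drawn by idle servers.

    idle_fraction: fraction of servers that are idle (between 0 and 1).
    idle_to_busy_power: power of one idle server divided by power of one busy server.
    """
    idle_power = idle_fraction * idle_to_busy_power
    busy_power = 1.0 - idle_fraction  # busy servers normalized to 1 unit each
    return idle_power / (idle_power + busy_power)

def ratio_for_share(idle_fraction: float, share: float) -> float:
    """Idle-to-busy power ratio that would make idle servers draw `share` of the total."""
    return share * (1.0 - idle_fraction) / (idle_fraction * (1.0 - share))

# What idle draw would make 30% of the servers use 3% of the power?
needed_ratio = ratio_for_share(0.30, 0.03)
print(f"idle servers would need to draw {needed_ratio:.1%} of a busy server's power")

# Hypothetical comparison: idle servers drawing half the power of a busy one.
print(f"at a 50% idle draw, the idle share is {idle_power_share(0.30, 0.5):.1%}")
```

Under these assumptions the 3% figure only holds if an idle box draws about 7% of what a busy one does. At a hypothetical 50% idle draw the idle share comes out near 18%, so the real number depends heavily on how well the hardware actually powers down.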
I was under the impression that a fail-over server that does not occasionally handle traffic in periodic tests could not be trusted to handle traffic in a true failure situation. Netflix routinely conducts tests of its failover infrastructure, shutting down large blocks of its leased Amazon capacity to make sure the rest of its capacity can keep up.
How do they judge whether or not a server is contributing useful information? I have two personal VPSs out there that do almost nothing on the public internet. They mostly act as a place where I can store data as a form of backup, but also a place I can access when I need it to test programs, get a really fast download, etc. But most of the time these VPSs just act as central nodes in my private VPN. So by their definition are my servers in the 1/3 "zombie" servers? I pay the rent, so to speak, so I'm paying for the energy costs.
... of purple prose.
The mere existence of servers on standby is not a problem, let alone a "massive" one.
This ratio seems pretty close to the ratio of zombie public servants.
Achille Talon
Hop!
Unfortunate confusion of terminology. "Zombie computers" is a term also used to mean those taken over by botnets.
Isn't this what we want, an instant capacity boost of 1/3 ready at hand for holidays, product launches, etc? Would we really expect to be running at 100% capacity?
In my experience, it is a management issue and comes from management being risk averse and not willing to make commitments. When you have a project to roll out new servers they always want to keep the old ones around "just in case", or sometimes it's a department like finance saying they need to keep them around just in case they need to access some old data. If the migration job was done correctly this isn't necessary, but you can't tell them that. They have no reason not to let them just sit around. The bill for the rack space and electricity is all lumped together, so there's no accountability.
Another minor reason is failed projects. We get a lot of "We just HAVE to have this new app/server/widget NOW" and then they never use it because they didn't think it through with their business model.
A bit low, but reasonable. Try making stuff that goes on ships, there's usually double redundancy AND a completely mechanical system in case everything goes to pot.
Liberty - Security - Laziness - Pick any two.
Back when I was a sysadmin for a government department I had been assigned a couple of chassis of HP blades that were bought in one of the famous fiscal year end splurges. For the most part I had no use for them and I didn't even install Linux on them. I think I only ever used a couple of blades and I hated them. It was the first generation and they ran very hot and we had lots of issues with bad RAM. The other three chassis on the rack belonged to the VMWare team and were in heavy use.
Since I had no need for the servers to be on I tried to have the blades turned off but the VMWare team was always turning them on. I would go on occasionally and turn off all of my machines but not long after they would be back on doing absolutely nothing. I never got a good answer from them why my servers had to be turned on wasting electricity and heating up the data centre (especially since special cooling had to be installed for the HP blades, yes they ran that hot).
The last time Microsoft had a major Xbox Live outage due to high demand they just spun up a bunch of VMs and everything was fine 4 hours later. You keep them idling so that when you need 'em they're ready at a moment's notice. Also if you're not Microsoft or Oracle this means you're not paying the licensing costs associated with the software being in production non-stop.
Hi! I make Firefox Plug-ins. Check 'em out @ https://addons.mozilla.org/en-US/firefox/addon/youtube-mp3-podcaster/
In the industry I'm in, we have regulations which require 3+ years of data retention, and that data "isn't providing anything useful" until it is. If we have a legal "issue" then that will extend until the legal issue goes away and the judge says we can destroy data. While we can use archive methods, sometimes the live system is really what is needed to retrieve data. It's better to just keep disks spinning than shut them down and hope they spin back up.
IT has a long tail where I work. Things planned to last 5 years often have a good deal of life for another 2-5 years (not all, but many). The "usage" of these systems may only be once a month, quarter, or even annually, but it makes more sense than porting over data that doesn't need to be kept in the replacement system.
Many times even when we have an "official" cutoff for a system, we just power it down and let it sit in the rack until the next year's inventory, at which time it is sent off to the auction yard (sans hard drives) to be bid on by the pallet load.
People pay for servers all the time and never use them. If they paid for the year, then why should the hoster care?
Buck Feta. You know what to do.
I would argue that for much of the corporate computing that goes on, the real worth of many of the bits produced on the active machines is less than the null bits from the inactive ones. It would therefore make more sense to turn off at least some of the active machines.
Month end management presentations come to mind.
The purpose of meetings is to keep management out of the hair of the people doing the work.
I frequently purge old unused application servers. If I think a server/service isn't being used, I monitor its logs for a few days, then turn it off for a month, if no one screams I back it up and delete it. Has worked pretty well so far.
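The purge routine described above is basically a small state machine. Here is a rough sketch of it; the stage names, the `next_stage` function, and the exact durations ("a few days" taken as 3, "a month" as 30) are assumptions for the example, not anyone's actual tooling.

```python
from enum import Enum

class Stage(Enum):
    WATCH_LOGS = "watching logs"
    POWERED_OFF = "powered off, waiting for screams"
    ARCHIVED = "backed up and deleted"
    IN_USE = "in use, keep it"

# Assumed durations: "a few days" -> 3, "a month" -> 30.
WATCH_DAYS = 3
OFF_DAYS = 30

def next_stage(stage: Stage, days_in_stage: int,
               activity_seen: bool = False, complaint: bool = False) -> Stage:
    """Advance one server a step through the purge workflow."""
    if stage is Stage.WATCH_LOGS:
        if activity_seen:
            return Stage.IN_USE  # the logs show real use, so leave it alone
        return Stage.POWERED_OFF if days_in_stage >= WATCH_DAYS else Stage.WATCH_LOGS
    if stage is Stage.POWERED_OFF:
        if complaint:
            return Stage.IN_USE  # someone screamed: power it back on and keep it
        return Stage.ARCHIVED if days_in_stage >= OFF_DAYS else Stage.POWERED_OFF
    return stage  # ARCHIVED and IN_USE are terminal

# A quiet server: watched for 3 days, then off for 30 days with no complaints.
stage = next_stage(Stage.WATCH_LOGS, days_in_stage=3)
stage = next_stage(stage, days_in_stage=30)
print(stage.value)
```

The powered-off month is the important step: it is the cheap, reversible test that catches the quarterly or annual job the logs from a few days would never show.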
A research paper would be peer reviewed... and if approved likely published in a journal. I'm not saying it is inaccurate (nor am I claiming that it is accurate), it just reads more like an ad for TSO's product. There is zero indication of research in the paper. There are conclusions (and scathing remarks of blame) w/o any apparent attempt to comprehend the results in the context of reliability, accessibility and/or service... three components which can easily contribute to a perceived waste of energy without dutiful scrutiny. There may be truth to the conclusions... one just won't find any hint of justification in this ... "report".
From personal experience, the bureaucracy of our org makes procurement of servers so difficult that section managers tend to hoard them when they get them.
I'm hoping virtualization will improve this situation, but something tells me it will only create different problems. The bureaucratic culture usually invents new ways to foul up new tools.
Table-ized A.I.
turn them into mail servers ... then spammers will keep them active.
now we need to go OSS in diesel cars
There's this rumor that when Yahoo expanded its Lockport "chicken coop" data centers in upstate NY they vacated at least two large data centers in Northern VA and because the lease isn't up for another two years they have been mostly empty ever since.
Yet, Yahoo is saving lots of money by doing this.
Kriston
I mean, that's still better than the human race
You do not spec for "average" usage; you spec for *max*. You also have to spec for how many machines (when we're talking about thousands, or tens of thousands of servers) are going to fail today, to be picked up by the "zombie" machines that are, in fact, hot spares.
And then there are the Big Events, like the shooting in Charleston, or when the SCOTUS announces about gay marriage or the ACA - how many of those "zombie" machines are going to go live to help carry the traffic load?
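To put rough numbers on the "spec for max, plus hot spares" argument, here is a minimal sizing sketch. Every figure in it (loads, per-server capacity, daily failure rate) is hypothetical, and the function names are invented for the example.

```python
import math

def servers_needed(peak_load: float, per_server_capacity: float,
                   daily_failure_rate: float, event_multiplier: float = 1.0) -> int:
    """Servers to provision for peak load, plus hot spares for expected daily failures."""
    serving = math.ceil(peak_load * event_multiplier / per_server_capacity)
    spares = math.ceil(serving * daily_failure_rate)
    return serving + spares

def idle_fraction_at(average_load: float, per_server_capacity: float,
                     provisioned: int) -> float:
    """Fraction of the provisioned fleet with nothing to do at average load."""
    busy = math.ceil(average_load / per_server_capacity)
    return 1.0 - busy / provisioned

# Hypothetical site: 9000 req/s at peak, 6000 req/s on average,
# 100 req/s per server, 1% of servers expected to fail on a given day.
fleet = servers_needed(peak_load=9000, per_server_capacity=100, daily_failure_rate=0.01)
idle = idle_fraction_at(average_load=6000, per_server_capacity=100, provisioned=fleet)
print(f"provision {fleet} servers; {idle:.0%} sit idle at average load")
```

With these made-up inputs, roughly a third of the fleet has nothing to do on an average day, before any Big Event headroom is added via `event_multiplier`. Whether those machines are "zombies" or hot spares depends entirely on whether anyone would actually cut traffic over to them.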
mark
It costs $50 to get the data chimp to power a server on.