Green Grid Argues That Data Centers Can Lose the Chillers
Nerval's Lobster writes "The Green Grid, a nonprofit organization dedicated to making IT infrastructures and data centers more energy-efficient, is making the case that data center operators are operating their facilities in too conservative a fashion. Rather than rely on mechanical chillers, it argues in a new white paper (PDF), data centers can reduce power consumption via a higher inlet temperature of 20 degrees C. Green Grid originally recommended that data center operators build to the ASHRAE A2 specifications: 10 to 35 degrees C (dry-bulb temperature) and between 20 and 80 percent humidity. But the paper also presented data that a range of between 20 and 35 degrees C was acceptable. Data centers have traditionally included chillers, mechanical cooling devices designed to lower the inlet temperature. Cooling the air, according to what the paper originally called anecdotal evidence, lowered the number of server failures that a data center experienced each year. But chilling the air also added costs, and PUE numbers would go up as a result."
Tree huggers telling an IT manager it's OK for his servers to burn up to save a baby seal.
Well, Google has already started running their data center much warmer than many data centers of the past, apparently with no ill effect.
It has nothing to do with hugging trees, simply hard-nosed economics. If 5 degrees induces 3 more motherboard failures in X number of months and you already have the fail-over problem handled, it only takes a few seconds on a handheld calculator to figure out that trees have nothing to do with it.
The rules were written, as the article explains, based on little if any real-world data, designed for equipment that no longer exists, built with technology long since obsolete. It was probably never justified, and even if it was back in the 70s and 80s, it isn't any more.
Google and Amazon and others have carefully measured real-world data taken from bazillions of machines in hundreds of data centers. They know how to do the math.
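For anyone who wants the handheld-calculator version spelled out, here is a minimal sketch. The dollar figures are made-up placeholders, not numbers from the white paper or from any operator; the function name is hypothetical too.

```python
def warmer_setpoint_net_savings(cooling_savings, extra_failures, cost_per_failure):
    """Net savings over some period from running warmer.

    A positive result means the cooling savings outweigh the cost of the
    extra hardware failures; a negative result means they do not.
    """
    return cooling_savings - extra_failures * cost_per_failure


# Hypothetical inputs: $10,000 saved on cooling over the period,
# 3 extra motherboard failures, $400 per replacement (parts plus labor).
net = warmer_setpoint_net_savings(
    cooling_savings=10_000.0,
    extra_failures=3,
    cost_per_failure=400.0,
)
print(net)
```

Plug in your own utility bill and failure counts; the point is only that the comparison is a single subtraction once fail-over is already handled.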
Sig Battery depleted. Reverting to safe mode.
They aren't going to die of heatstroke in 95 degrees. Drama queen much?
I've been an operator and sysadmin for many years now, and I've seen this experiment done involuntarily a lot of times, in several different data centers. Trust me, even if you accept 35 C, the temperature goes well beyond that in a big hurry when the chillers cut out.
Heat is death to computer hardware. Maybe not instantly, but it definitely causes premature failure. Just look at electrolytic capacitors, to name one painfully obvious component that fails with horrifying regularity in modern hardware. Fifteen years ago, capacitors were made with bogus electrolyte and failed prematurely. Some apparently still do, but the bigger problem NOW is that lots of items are built with nominally-good electrolytic capacitors that fail within a few months, precisely when their official datasheet says they will. A given electrolytic capacitor might have a design half-life of 3-5 years at temperatures of X degrees, but be expected to have 50/50 odds of failing at any time after 6-9 months when used at temperatures at or exceeding X+20 degrees. Guess what temperature modern hardware (especially cheap hardware with every possible component cost reduced by value engineering) operates at? X+Y, where Y >= 20.
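A rough way to put numbers on this is the common rule of thumb that electrolytic capacitor life halves for every 10 degrees C of extra operating temperature. The sketch below assumes that rule and assumes the "degrees" above are Celsius; the 85 C rating and the 4-year figure are illustrative placeholders, not datasheet values.

```python
def estimated_life_years(rated_life_years, rated_temp_c, operating_temp_c):
    """Scale rated life using the 'halves per 10 C' rule of thumb.

    This is an approximation, not a datasheet formula; real parts also
    depend on ripple current and applied voltage.
    """
    return rated_life_years * 2 ** ((rated_temp_c - operating_temp_c) / 10.0)


# Hypothetical part: 4 years of expected life at X = 85 C, run at X + 20 = 105 C.
life_hot = estimated_life_years(4.0, 85.0, 105.0)
print(life_hot)
```

Under these assumptions a 20-degree rise cuts life to a quarter, which is the same ballpark as the years-to-months drop described above.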
Heat also does nasty things to semiconductors. A modern integrated circuit often has transistors whose junctions are literally just a few atoms wide (18 is the number I've seen tossed around a lot). In durability terms, ICs from the 1980s were metaphorically constructed from the paper used to make brown paper shopping bags, and 21st-century semiconductors are made from a single layer of 2-ply toilet paper that's also wet, has holes punched into it, and is held under tension. Heat stresses these already-stressed semiconductors out even more, and like electrolytic capacitors, it causes them to begin failing in months rather than years.
Yes, it's generally in the nature of these companies to spend unneeded money. They hire people whose exact job is to make data centers as efficient as possible, even to the extent that Facebook and others are open-sourcing their information to try to get others involved in improving data center design. I say generally, as I'm sure most have seen the story on here recently about Microsoft wasting energy to meet a contract target; that, however, is a totally different kettle of fish.
If the owners of the building could run cooler I would think they would.
Have a look here for more background.
Basically, they're describing four types of data centers. Have you seen the Google data centers with their heat curtains and all that? I surely don't work in any of those types of data centers. Some of the fancier ones around here have hot/cold aisles, but the majority are just machines in racks, sometimes with sides, stuck in a room with A/C. Fortunately it's more split systems than window units these days!
The conventional wisdom was that AC is cheaper than downtime/hardware so they told the building owner what to run the temperature at and they paid for it. Some of those assumptions are now being challenged.
I do dig energy-efficient IT - I focus on this whenever I spec gear - but many people just 'go big', 'go cheap', or 'go IBM' (for various values of 'IBM'). Focusing on operating heat is an after-the-fact approach if you have the opportunity to cut down on heat (freebie: do you put SSDs in front of your big drives to keep them cooler?)
With that said, there's one very good reason to run a cold room: power failures. I typically see places with decent to nice UPS units, but the A/C units are almost never on battery backup, and generators are too rare (even when they're there, they're rarely sized for or connected to the A/C). A data room can get hot in a hurry without A/C and if you're running at 65, you get to 95 much more slowly than you do when you're running at 82. Yeah, if you're a government contractor you just buy a CAT diesel and go about your day, but for many businesses the monthly cost of A/C is weighed against the purchase of the generator to make it able to sustain those kinds of conditions.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Well, Google has already started running their data center much warmer than many data centers of the past, apparently with no ill effect.
This is an understatement. Google increased the temp in their data centers after discovering that servers in areas with higher temps had fewer hard errors. So they went with higher temps across the board, saved tons of money on lower utility bills, and have fewer hard errors.
Back in the 1950s, early computers used vacuum tubes, which failed often and were difficult to replace. So data centers were kept very cool. Since then, data centers have continued to be aggressively cooled out of tradition and superstition, with little or no hard data to show that it is necessary or even helpful.
Imagine that you used no cooling at all. The components wouldn't get infinitely hot; they'd get very hot, but the hotter they get the more readily the heat would escape, until they reach some steady state where they're hot enough that the heat escapes fast enough that it doesn't get any hotter.
So technically you're correct--a steady state always means that exactly the same amount of energy is being added and removed at the same time--but using cooling will allow this steady state to exist at lower temperatures where the natural escape of heat isn't so efficient.
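The steady-state argument can be sketched with Newton's law of cooling: heat leaves at a rate proportional to the temperature difference with the surroundings, so the temperature settles where heat out equals heat in. Modeling active cooling as a larger heat-transfer coefficient is a simplifying assumption here, and every number below is a placeholder.

```python
def steady_state_temp_c(ambient_c, power_w, conductance_w_per_k):
    """Temperature at which heat escaping equals heat being generated.

    Solves power_w = conductance_w_per_k * (T - ambient_c) for T.
    """
    return ambient_c + power_w / conductance_w_per_k


# Hypothetical 2400 W load in 25 C surroundings.
# Poor natural heat escape: 20 W per kelvin of temperature difference.
passive = steady_state_temp_c(25.0, 2400.0, 20.0)
# Forced cooling, modeled (as an assumption) as 200 W per kelvin.
cooled = steady_state_temp_c(25.0, 2400.0, 200.0)
print(passive, cooled)
```

Both cases reach a steady state where exactly as much energy leaves as arrives; the cooled one just gets there at a far lower temperature.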
The board of directors of the "Green Grid" is composed almost entirely of the companies that would benefit if data centers had to buy more computing hardware more frequently, rather than continued paying for cooling equipment.
Liberty in your lifetime
Yes and no. If the room is properly insulated, any heat generated in the room will have to be forcefully removed. At some point, the room will reach equilibrium -- heat will escape at the rate it's generated, but it will be EXTREMELY hot in there by then. Rate of thermal transfer is dependent on the difference in temperature; the larger the difference, the faster energy transfers. Raising the temp of the room will lead to higher equipment temps; until you do it, you won't know if you've made the difference better (wider) or worse (narrower).
The key finding from Google's research was that temperature stability was the most important factor. Fluctuating temperatures are very hard on machines -- esp. hard drives.
I once worked in an office building where the building would shut off the HVAC in the evenings and all day on weekends... it would be 100F+ in there Sunday evening (over 120 on 100-degree days). Then they have tons of heat to dump come Monday morning; and all the while, they're destroying every piece of electronics in the building. (Net cooling costs... they saved very little. Add in replacing the damaged everything, and it cost them money.)
A data room can get hot in a hurry without A/C and if you're running at 65, you get to 95 much more slowly than you do when you're running at 82.
That really depends on the size of your datacenter and your server load. If you've got a huge room with one rack in the middle, you're good to go. If you've got a 10x10 room with 2 or 3 loaded racks and your chiller goes tits up, you're going to be roasting hardware in a few short minutes. Some quick back-of-the-napkin calculations show that a 10x10x8 room with a single rack pulling all the juice it can from a 20 amp circuit will raise the temperature in the room about 10 degrees every 2 minutes. From 82 to 95 is about 3 minutes, from 65 to 95 is about 6.
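Here is one way to redo the napkin math. It assumes a 120 V circuit (the voltage isn't stated above), heats only the air in the room, and ignores the thermal mass of walls, racks, and equipment, so treat it as an upper bound on how fast things warm up rather than a prediction.

```python
AIR_DENSITY_KG_M3 = 1.2      # approximate, near room temperature
AIR_CP_J_PER_KG_K = 1005.0   # specific heat of air at constant pressure
FT3_TO_M3 = 0.0283168


def air_only_heating_rate_f_per_min(volume_ft3, power_w):
    """Degrees F per minute if all the power goes into heating the room air."""
    air_mass_kg = volume_ft3 * FT3_TO_M3 * AIR_DENSITY_KG_M3
    kelvin_per_second = power_w / (air_mass_kg * AIR_CP_J_PER_KG_K)
    return kelvin_per_second * 60.0 * 9.0 / 5.0


def minutes_to_reach(start_f, limit_f, rate_f_per_min):
    """Minutes to climb from start_f to limit_f at a constant heating rate."""
    return (limit_f - start_f) / rate_f_per_min


# Rate quoted above: about 10 degrees every 2 minutes, i.e. 5 F per minute.
quoted_rate = 5.0
print(minutes_to_reach(82.0, 95.0, quoted_rate))  # a bit under 3 minutes
print(minutes_to_reach(65.0, 95.0, quoted_rate))  # 6 minutes

# Air-only estimate for a 10x10x8 ft room and 20 A at an assumed 120 V.
air_rate = air_only_heating_rate_f_per_min(10 * 10 * 8, 20 * 120.0)
print(air_rate)
```

With these assumptions the air-only rate comes out faster than the quoted 5 degrees per minute; which derating or thermal-mass allowance the original estimate used isn't stated. Either way, the colder starting point buys only a few extra minutes.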