Google Reveals "Secret" Server Designs
Hugh Pickens writes "Most companies buy servers from the likes of Dell, Hewlett-Packard, IBM or Sun Microsystems, but Google, which has hundreds of thousands of servers and considers running them part of its core expertise, designs and builds its own. For the first time, Google revealed the hardware at the core of its Internet might at a conference this week about data center efficiency. Google's big surprise: each server has its own 12-volt battery to supply power if there's a problem with the main source of electricity. 'This is much cheaper than huge centralized UPS,' says Google server designer Ben Jai. 'Therefore no wasted capacity.' Efficiency is a major financial factor. Large UPSs can reach 92 to 95 percent efficiency, meaning that a large amount of power is squandered. The server-mounted batteries do better, Jai said: 'We were able to measure our actual usage to greater than 99.9 percent efficiency.' Google has patents on the built-in battery design, 'but I think we'd be willing to license them to vendors,' says Urs Hoelzle, Google's vice president of operations. Google has an obsessive focus on energy efficiency. 'Early on, there was an emphasis on the dollar per (search) query,' says Hoelzle. 'We were forced to focus. Revenue per query is very low.'"
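To put the quoted efficiency figures in perspective, here is a rough sketch of how much power each one implies is lost per megawatt delivered. The 1 MW load and the function name `ups_loss_kw` are assumptions for illustration, not figures from Google; only the 92%, 95% and 99.9% values come from the summary.

```python
def ups_loss_kw(load_kw: float, efficiency: float) -> float:
    """Power lost in the backup stage (kW) while delivering load_kw to the servers."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    input_kw = load_kw / efficiency  # power that must be drawn upstream
    return input_kw - load_kw


LOAD_KW = 1000.0  # hypothetical 1 MW of server load, not a Google figure

for eff in (0.92, 0.95, 0.999):
    print(f"{eff:.1%} efficient: {ups_loss_kw(LOAD_KW, eff):.1f} kW lost")
```

Under those assumptions the losses come out at roughly 87 kW, 53 kW and 1 kW respectively, which is the gap Jai is describing.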
I think Google may be selling themselves short. Once you start building standardized data centers in shipping containers with singular hookups between the container and the outside world, you've stopped building individual rack-mounted machines. Instead, you've begun building a much larger machine with thousands of networked components. In effect, Google is building the mainframes of the 21st century. No longer are we talking about dozens of mainboards hooked up via multi-gigabit backplanes. We're talking about complete computing elements wired up via a self-contained, high speed network with a combined computing power that far exceeds anything currently identified as a mainframe.
The industry needs to stop thinking of these systems as portable data centers, and start recognizing them for what they are: Incredibly advanced machines with massive, distributed computing power. And since high-end computing has been headed toward multiprocessing for some time now, the market is ripe for these sorts of solutions. It's not a "cloud". It's the new mainframe.
Javascript + Nintendo DSi = DSiCade
Google claims they did the math and found it was cheaper with commodity hardware. I advise everyone else to do the same and run the calculations for themselves to determine the optimal hardware for their particular load. Without the specifics of their situation, it's difficult to criticize in an intelligent fashion, beyond a generalized statement of surprise at their configuration.
Well.. maybe. Or Maybe not. But Definitely not sort of.
A patent is an implementation of an idea.
You can have an idea of how to put a UPS in a computer one way, and I can do it another way, and both can be valid patents.
I do know this gets abused, and companies try to sue because it's their 'idea', but that's not how it works.
If you find a different way to do a hard drive plugin board, then yes you can patent it. I would advise you only do it if it's better in some way, and there is a demand.
The Kruger Dunning explains most post on
I've a few questions, if the data centre is built in the desert don't you have a number of issues?
* Latency, if you have all your data centres located in essentially a single part of the USA (let's ignore the rest of the world for this.. regardless that there are no deserts in Europe for example) won't that increase latency quite a bit for the more distant places that want the search results?
* Bandwidth/redundancy, if you have all your eggs in one basket as it were aren't you going to have to pay extra to have lots of extra fibre laid down to be able to handle all that extra traffic? What about natural disasters, if you have all your data centres in a single location then surely you run the risk of things going pear shaped if it burns down, suffers earthquakes, aliens destroy the building etc.
* Cooling, because it's in the desert isn't a lot of the electricity that is generated going to be cooling not only the building because of the outside heat, but also the heat generated by the servers? Surely it makes more logical sense to build in a colder climate say further north and use hydroelectricity? (if you're talking of using exclusively non active polluting (and non radioactive) natural electricity solutions)
Greater than 99.9% efficiency? They likely made a mistake in their measurements.
Maybe they measured 99.92% efficiency.
That is greater than 99.9% efficiency and they aren't breaking any laws of thermodynamics.
...why desktops didn't have a built in battery deal that lived in an expansion bay. If you could even keep RAM alive for extended periods even with the machine shut down that would be spiffy as an option, let alone as a little general UPS.
This is a questionable number. The best DC-DC conversion is around 95% so they aren't including voltage conversions from the battery to what the system is actually using.
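A quick sketch of the point being made: if the stages sit in series, their efficiencies multiply. Treating the 99.9% figure as the battery stage and 95% as a best-case DC-DC stage is an assumption for illustration, as is the helper name `chain_efficiency`.

```python
from math import prod


def chain_efficiency(stages):
    """Overall efficiency of power stages connected in series."""
    return prod(stages)


# Assumed stages: 99.9% (figure quoted by Google), 95% (best-case DC-DC per the comment)
overall = chain_efficiency([0.999, 0.95])
print(f"{overall:.1%}")
```

On those numbers the end-to-end figure would be about 94.9%, so the 99.9% claim only makes sense if it covers the battery stage alone.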
This is composed purely of commodity parts. The power supply is the same thing you'd buy for your desktop, those are SATA disks (not SAS), and that looks like a desktop motherboard (see the profile view where all the ports on the "back" are lined up in the same manner they would need for a standard desktop enclosure).
Only the battery is custom (or even non-consumer grade), and you can note that since the battery sits after the PSU, what it supplies is DC power. DC is significantly better than AC here, since a battery feeding AC would force the PSU to convert AC-to-DC again (which wastes power and generates needless heat). While you can get DC battery supplies for server-grade systems, these are not server-grade systems. Built-in DC battery backup therefore affords them the ability to keep the motherboards cheaper. Very smart.
Also, if you recall from a few months ago, Google has applied pressure on its suppliers (I'm not sure why Dell comes to mind...) to develop servers that can tolerate a significantly higher operating temperature (IIRC, they wanted a 20 degree (Fahrenheit?) boost). I wouldn't be surprised if the higher temperature cuts down on operating expenses more than smarter battery placement.
Use my userscript to add story images to Slashdot. There's no going back.
Dawned on me the other day how little innovation occurs in our industry EXCEPT by hungry companies. For example, desktops and laptops have not really changed, while both have a piss poor design. About 4 years ago, it dawned on me that a much better way to design these is to merge them. Basically, different cases where the laptop has a keyboard and a monitor hookup while the desktop is sans the prior. The smart move is to move the battery OUT of the case and into the power supply. Right now, you do not get to buy variable amounts of batteries. But a company would do well to sell an external power supply with varying storage capacities, but with a simple 12V line. In this fashion, people can pick the parts for a laptop similar to a desktop, while the desktop gets to take advantage of the drop in prices of the laptop lineage.
I prefer the "u" in honour as it seems to be missing these days.
Or maybe they think bigger...
They're deploying containers of servers. Maybe when a container gets to a certain age or a certain failure rate, they replace/refurbish the entire container.
I doubt they care if some of their nodes go down in a power outage as long as some percentage of them stay up.
"Google's designs supply only 12-volt power, with the necessary conversions taking place on the motherboard"
This seems to be a more interesting point than the battery part. 12V-only?
This means that there's some serious power conversion done on each of the motherboards, and with SMPS evolving at the rate that it is, this could be relevant to anything larger than a laptop.
How much exactly is gained by making such a big change, to a point where you'd need to redesign all of your motherboards, each time for each different chipset? (they mention they use both Intel and AMD)
Will this particular change make it into desktops? How much *more* efficient would it make the overall system?
Entomologically speaking, the spider is not a bug, it's a feature.
Hundreds of thousands of servers == thousands of dead batteries each month, since those batteries don't last more than a few years.
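The arithmetic behind that estimate can be sketched as follows, assuming a steady state in which battery ages are spread evenly. The 200,000 servers and 3-year battery life are hypothetical stand-ins for "hundreds of thousands" and "a few years"; the function name is made up for the example.

```python
def replacements_per_month(servers: int, battery_life_years: float) -> float:
    """Average batteries replaced per month, one battery per server, evenly aged fleet."""
    return servers / (battery_life_years * 12)


# Hypothetical fleet: 200,000 servers, batteries lasting 3 years
print(round(replacements_per_month(200_000, 3)))
```

With those assumed inputs the result is on the order of five and a half thousand batteries a month, which is consistent with "thousands".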
I would imagine that the battery replacement schedule mimics the server obsolescence perfectly.
LOL, when the battery catches fire, time to replace the server.
Yes, but without looking at the specs, I would imagine that if the technology is significantly different, Google would still be eligible for a patent. Especially so if they were aware of the "prior art" and took the necessary steps not to include language that would overlap. Though IANAL, nor am I a patent expert.
I cut it three times, and it's still too short.
Clearly Google's entire business model has failed because of your insight!
Read the Google paper on hard drive failure. They may have thought about things.
Just because something is published on April 1st doesn't mean it's an April Fools joke. In the case of this article, it's clearly not.
You're seeing a connection where there is none. The two SATA cables run back behind the plate the drives are mounted on. Presumably, the mainboard connectors are back there as they're not visible on the rest of the mainboard.
Google's pretty sure about it...
do you also run a multibillion dollar server farm?
turn up the jukebox and tell me a lie
How did this get marked informative?
I mean it's certainly true that Deserts are defined by lack of rainfall but since the GP said
"Build your data center in the desert and build 150 MW industrial solar thermal system to power it."
I think it's fair to assume they were talking about the stereotypical sunny and hot desert.
Secondly the reason it's cool underground is because soil is generally a very good insulator. I would suggest that it's a really bad idea to put things that are going to get hot inside a huge lump of insulating material.
Arguably, APC has become a mainframe vendor. They sell rack systems with integrated power, cooling, and cable management. Add commodity motherboards, CPU parts, disk drives, and software, and you have a mainframe. It's not that different from what HP or SGI or IBM or Sun will sell you. Especially since the "mainframe" vendors have mostly moved to commodity CPU parts.
I've pointed out before that computing is becoming more like stationary engineering. Stationary engineers run and maintain all the equipment in building basements and penthouses. With containerized data centers, computing looks more and more like that.
Have you ever even seen Mainframe pricing? No really have you?
It will cost you at least $10,000 to match the power of a single quad-core Intel/AMD CPU.
And you do not want to run a mainframe (or any other computer that has a CPU-bound task) for a decade. I think my current desktop computer has more power than the average mainframe from a decade ago, and when I buy a new development workstation in the next decade, it will most likely have more CPU power than a $1 million mainframe you could buy today.
Just to put things in perspective: I am pretty sure that Google has more CPU power, more RAM, more HD space and more aggregate I/O than all the mainframes in the USA combined.
Given what has been said about Google's maintenance policies in the past, probably not. Google doesn't do detail maintenance - they wait till an entire rack (or now probably container) falls below a certain performance level, and then replace it with a new one and scrap the old.
There is no possible way their solution is cheaper than a real mainframe (created for the task) when all costs are considered.
Nor is there any possible way their solution is more reliable, or more "green".
That depends on how you're measuring cost, reliability and "green"itude. Cost-wise, there's an enormous opportunity cost associated with going with a single mainframe vendor. Reliability... well, they've made the choice of having small, frequent failures that are cheap and easy to deal with rather than single large uncommon events that might put a division out of action all at once. Green credentials? Again, it's a trade-off. They've traded physical resource cost against energy cost.
Also, by doing it this way, they can take incremental improvements far more easily than they could with a mainframe installation. Once your mainframe is installed, that's it - you don't get to improve power efficiency or processing power ever again. With these, if you figure out how to get a percentage point improvement, you can roll it into the next build cycle, knowing that it'll probably be across half the company in a couple of years.
Oh, and you're slightly wrong about hard drives. They don't RAID them. They just chuck them.
Trash (Magnatek) power supply.
A couple of years ago, they announced that they had their own PSU design that was supposedly much more efficient than anything available on the market. If this is a cheap commodity PSU, it predates that.
A 12v battery. I never knew DC was more efficient than AC!
Dude... UPS. If you're using the battery, you don't *have* AC.
A good mainframe would last decades. Google's frankenframe (let's call it what it is) must be sloughing off parts like skin cells from a Texan with eczema.
And that, presumably, is just the way they like it, because if you upgrade something that hasn't failed yet, you lose whatever value was left in it.
Reality is the ultimate Rorschach.
Ok. So, your load fits onto 5 mainframes. Now your requirement increases. What do you do? Do you buy number 6 now, and have it running at less than capacity for the next 18 months (or whatever)? That's a huge waste. Do you degrade your service for the next 9 months until number 6 would be at half capacity, then install? Again, you've wasted an opportunity, and number 6 is *still* not going to be at capacity.
Smaller computational units means better matching of demand to supply.
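A minimal sketch of that granularity argument: for a given demand, count how many units you must buy and how much capacity then sits idle. All numbers are arbitrary capacity units invented for the example, and `provision` is a hypothetical helper.

```python
def provision(demand: int, unit_capacity: int) -> tuple[int, int]:
    """Return (units needed, idle capacity) when capacity comes in fixed-size units."""
    units = -(-demand // unit_capacity)  # ceiling division on integers
    idle = units * unit_capacity - demand
    return units, idle


DEMAND = 5205  # hypothetical load, just over what 5 big machines can carry

print(provision(DEMAND, 1000))  # mainframe-sized units
print(provision(DEMAND, 10))    # commodity-server-sized units
```

With these made-up figures the big units leave 795 units of capacity idle on the sixth machine, while the small units leave 5.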
In Google's case, I'd say they just seal off the container and be done with it. If there is a fire, they bring in a new (40') box.
But anyway. A rack mount HP UPS I installed in the past year has a stand-off that you can hook into the "Big Red Button System". I'm guessing such hookups are either standard on rack mount units, or at least it wouldn't be hard to find models with that feature.
Wait, my laptop has one of those too...
In other news, is anyone else surprised that a built-in UPS is so slow to catch on for the desktop when notebooks have had it by definition for years? Sure, powerful batteries are expensive, but you'll wish you had one when a power blackout destroys half a day's work. It's one reason why I hesitate to get a desktop PC.
Pretty sure if the fire department is coming in to throw water lines around, they are going to cut the power to the building and not to just the circuit on the datacenter floor.
Yes, but if they cut the power to the building the server room will still be fully energized thanks to all those huge batteries running the place. That's why they have the big red buttons - they kill all the power in the room so that there is no electrocution danger.
As another poster indicated, commercial UPS systems typically have an input for the big red button so that they cut off. Your $80 home UPS probably doesn't have this.
There are a lot of safety concerns with UPS devices in large datacenters - you're talking about a LOT of power in a semi-industrial setting. Among other things it is important to make sure that the hardware doesn't leak much current to ground. Without a UPS, current leaking to ground isn't a big deal - it goes out the plug and isn't much of a shock hazard (within reason). However, if you have a UPS and somebody disconnects the plug then the whole rack is isolated from ground (until you touch it and the rack next to it). If you have 100 devices each leaking a few mA to the chassis, that is a potentially dangerous situation.
And 12V DC isn't automatically safe - I don't know enough to say for sure one way or another, but lead acid batteries can produce fairly high current levels. Do you think that turning over the engine in your car requires a trivial amount of power? An arc welder only requires a few volts of potential difference - although it relies on more than just batteries. A room full of 12V batteries each capable of running a 500W power supply isn't a trivial matter.
I'm sure Google has thought this out. Probably by wiring every server to that big red button...
Only by converting it from DC power. Which is less efficient than using the DC power directly.
And is DC even any less efficient? I know it's more efficient to transmit AC power over long distances (i.e. power lines), but does that apply to short distances like these?
And you do not want to run a mainframe (or any other computer that has a CPU-bound task) for a decade. I think my current desktop computer has more power than the average mainframe from a decade ago
You can run a mainframe for a decade or more because every part except the steel frame is hot-replaceable. You upgrade the processors, memory, everything really, every few years, without ever interrupting service. There's a reason they aren't cheap.
Even good minicomputers (or expensive servers, if you like that term better) let you swap processors, I/O processors, memory, and sometimes motherboards while the machine is running. High-end mainframes just take that to the next level, by ensuring that every board is hot-replaceable.
Of course, Google's approach to the same problem (just hot-swap cheap commodity servers in and out of the cloud as units) may well be cheaper, in terms of hardware costs. I doubt it's cheaper if you include all of the related development costs, but sometimes that's a good trade-off to avoid vendor lock-in.
Socialism: a lie told by totalitarians and believed by fools.
Modern high speed chips (which draw the bulk of the power in a typical PC) run their core logic at much lower voltages. Typically somewhere between 1V and 2V, though I think some may have gone below a volt now. These very low voltages have to be produced very close to the chip that uses them to avoid huge losses.
This means that modern PC motherboards take most of their power at 12V anyway. The 5V and 3.3V lines really only serve to power the low speed chips and some of the interfaces between chips.
Given that, I doubt there would be too much efficiency loss from making a 12V-only board. You could probably even design it to happily deal with an input that was only approximately 12V without losing too much (since most of that 12V power is going to the input of switchers anyway).
note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
there's also the issue that using cheap unreliable hardware means you need MORE of it, and it'll be using resources because you have to have the parallelism in order to cope with failures, so even if the power conversion chain from AC mains to system board is 100% efficient, you're wasting at least 30% due to redundancy if maintaining 1 in 3.
:-)
having more hardware around means having more engineers to maintain it, engineers are expensive.
modern computers offer a better bang for the buck - the biggest cost in datacentres is electricity, so it makes sense to maximise the performance of the server against power. whenever I get a quote for datacentre facilities, the first question asked is how much power I need, then space, then bandwidth! by the time I've had four rackfulls of servers in a datacentre, the cost of the servers is irrelevant!
having to maintain your own pool of spares is expensive and wasteful. it also occupies space, and wasteful because it will never get used before it's obsolete. take a look at ebay - there's huge quantities of servers for sale in excellent condition, most are from disaster recovery centers where the primary site has been upgraded and the secondary unused site can no longer be used as a failover.
buying cheap disposable hardware is also particularly bad for the environment, but who cares, all the old server tin goes to the third world, eh, google?
once upon a time, google's strategy might have been sensible, but these days I don't think it makes much sense.
submitted A/C 'cos I already moderated. and my brother works for google and I don't want them to take away his gphone
Please see the Patent Cooperation Treaty which covers this situation; China acceded in 1993, India in 1998.
Yes, and we all know how seriously India & China deal with intellectual property infringement. They don't.
The law is one thing, enforcement of the law is very different.