Slashdot Mirror


Google Reveals "Secret" Server Designs

Hugh Pickens writes "Most companies buy servers from the likes of Dell, Hewlett-Packard, IBM or Sun Microsystems, but Google, which has hundreds of thousands of servers and considers running them part of its core expertise, designs and builds its own. For the first time, Google revealed the hardware at the core of its Internet might at a conference this week about data center efficiency. Google's big surprise: each server has its own 12-volt battery to supply power if there's a problem with the main source of electricity. 'This is much cheaper than huge centralized UPS,' says Google server designer Ben Jai. 'Therefore no wasted capacity.' Efficiency is a major financial factor. Large UPSs can reach 92 to 95 percent efficiency, meaning that a large amount of power is squandered. The server-mounted batteries do better, Jai said: 'We were able to measure our actual usage to greater than 99.9 percent efficiency.' Google has patents on the built-in battery design, 'but I think we'd be willing to license them to vendors,' says Urs Hoelzle, Google's vice president of operations. Google has an obsessive focus on energy efficiency. 'Early on, there was an emphasis on the dollar per (search) query,' says Hoelzle. 'We were forced to focus. Revenue per query is very low.'"
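As a rough illustration of the efficiency figures quoted above, the sketch below computes how much power is lost at each efficiency level. Only the percentages come from the summary; the 1,000 kW load and the function name `wasted_kw` are assumptions chosen for illustration, and efficiency is taken to mean delivered power divided by input power.

```python
def wasted_kw(load_kw: float, efficiency: float) -> float:
    """Power lost upstream in order to deliver load_kw at the given efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    input_kw = load_kw / efficiency  # power drawn from the source
    return input_kw - load_kw        # the difference never reaches the servers


LOAD_KW = 1000.0  # hypothetical load, not a figure from the article

for eff in (0.92, 0.95, 0.999):  # percentages quoted in the summary
    print(f"{eff:.1%} efficient: {wasted_kw(LOAD_KW, eff):.1f} kW lost")
```

Under these assumptions the losses come to roughly 87 kW, 53 kW and 1 kW respectively, which is the gap the quote is pointing at.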

11 of 386 comments

  1. The New Mainframe by AKAImBatman · · Score: 5, Insightful

    Most people buy computers one at a time, but Google thinks on a very different scale. Jimmy Clidaras revealed that the core of the company's data centers are composed of standard 1AAA shipping containers packed with 1,160 servers each, with many containers in each data center.

    Mainstream servers with x86 processors were the only option, he added. "Ten years ago...it was clear the only way to make (search) work as free product was to run on relatively cheap hardware. You can't run it on a mainframe. The margins just don't work out," he said.

    I think Google may be selling themselves short. Once you start building standardized data centers in shipping containers with singular hookups between the container and the outside world, you've stopped building individual rack-mounted machines. Instead, you've begun building a much larger machine with thousands of networked components. In effect, Google is building the mainframes of the 21st century. No longer are we talking about dozens of mainboards hooked up via multi-gigabit backplanes. We're talking about complete computing elements wired up via a self-contained, high speed network with a combined computing power that far exceeds anything currently identified as a mainframe.

    The industry needs to stop thinking of these systems as portable data centers, and start recognizing them for what they are: Incredibly advanced machines with massive, distributed computing power. And since high-end computing has been headed toward multiprocessing for some time now, the market is ripe for these sorts of solutions. It's not a "cloud". It's the new mainframe.

    1. Re:The New Mainframe by AKAImBatman · · Score: 4, Insightful

      By some measurements they exceed the computing power of a mainframe, by others they don't.

      A fair point. However, I should probably point out that mainframe systems are always purpose built with a specific goal in mind. No one invests in a hugely expensive machine unless they already have clear and specific intentions for its usage. When used for the purpose this machine was built for, these cargo containers outperform a traditional mainframe tasked for the same purpose.

    2. Re:The New Mainframe by divisionbyzero · · Score: 4, Insightful

      Not quite. While these server farms in a box are fault-tolerant, they are not fault-tolerant in the same way as at least some mainframes, where the calculations are duplicated. With mainframes you'd have wasted resources (doing every calculation twice) with lower latency. With server farms in a box you get, arguably, better resource utilization (route around something that is broken, but wait until it breaks before doing so) but higher latency. The difference is incorporating the way the internet works into "mainframe" design.

    3. Re:The New Mainframe by Znork · · Score: 4, Insightful

      by others they don't.

      Seriously, I've fairly recently gone through every single benchmark, comparison, inference, etc., that I've been able to find on the subject (they're not exactly sprinkled all over the place) and I can't find any indications anywhere that mainframe hardware can surpass modern commodity hardware on any measurement. On price/performance variants it's not rare to see it outclassed by more than an order of magnitude, and in absolute performance, well, there's very little magic hardware in the mainframe anymore either; it's pretty much the same silicon as anywhere else: Power CPUs, DDR InfiniBand, CPU to SC bandwidth almost equivalent to HyperTransport, the same SAN as is used anywhere else, and as far as I can tell, to my horror, DDR2 533 memory(??). Please correct me if I'm wrong, and I very well may be, because actual specs aren't exactly flaunted. I mean, it's nice enough, but it's hardly magic.

      Sure, there's the old trick of moving system and IO load into extra dedicated CPUs, but that's becoming less and less relevant as pretty much any significant IO load has long since moved to dedicated ASICs that do DMA on their own without any CPU cost, and things like encryption accelerators aren't that hard to find. And it's not like you're not paying for the assist processors.

      Two or three years ago it might have been conceivable that it could have had at least a possibility of being superior in consolidation capabilities, like being able to have the most unused OS instances running at a time, but with paravirtualized Xen-derived tech, commodity x86 hardware can accomplish the same or higher density. I can't say I've tried running 1500 instances, but for fun I did try running 100 instances on 5-year-old junked x86 hardware, which went fine until I ran out of memory at 6GB on the (like I said, junk) hardware in question. There was no significant performance degradation in relation to load versus what could be expected of the hardware. I kept all 100 instances fully loaded on both IO and CPU for a week to test for any throughput issues or over-time degradation, and that worked as well.

      I.e., no practical limit for any non-contrived consolidation situation, and I have no doubt that it scales fine up to 1500 instances on reasonably modern hardware as well as it did on that hardware (and if you need higher density than that, you should seriously be considering why you're using that number of OS instances that don't appear to actually be doing anything, or consider moving to system-level virtualization like vserver or openvz).

      So have you found any measurements that I couldn't find that you could point out that demonstrate lingering categories in which a mainframe might consistently outperform commodity hardware (i.e., any measurement that is or can be compared to another at least somewhat related measurement on commodity hardware which demonstrates an advantage for the mainframe)?

      Outside pure performance there is the in-system redundancy which is nice in theory but which in practice seems to rarely result in higher actual uptime (mainframes appear to require an inordinate amount of scheduled service time and admins often engage in a disturbingly high IPL frequency).

      There is also the consistent load levels they tend to get (which seems to be largely due to culture, load selection and ROI requirements, rather than any inherent capacity), but beyond that it seems that the remaining aura of capability doesn't have much basis in reality anymore.

  2. Re:Hey google, want to save some money? by Bill, Shooter of Bul · · Score: 5, Insightful

    Google claims they did the math and found it was cheaper with commodity hardware. I advise everyone else to do the same and run the calculations for themselves to determine the optimal hardware for their particular load. Without the specifics of their situation, it's difficult to criticize in an intelligent fashion, other than a more generalized statement expressing surprise at their configuration.

    --
    Well.. maybe. Or Maybe not. But Definitely not sort of.

  3. Re:Onboard UPS not new by geekoid · · Score: 5, Insightful

    A patent is an implementation of an idea.

    You can have the idea of how to put a UPS in a computer one way, and I can do it another way, and both can be valid patents.

    I do know this gets abused, and companies try to sue because it's their 'idea', but that's not how it works.

    If you find a different way to do a hard drive plugin board, then yes you can patent it. I would advise you only do it if it's better in some way, and there is a demand.

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect

  4. Re:Hey google, want to save some money? by EvilMonkeySlayer · · Score: 4, Insightful

    I've a few questions, if the data centre is built in the desert don't you have a number of issues?

    * Latency: if you have all your data centres located in essentially a single part of the USA (let's ignore the rest of the world for this, even though there are no deserts in Europe, for example), won't that increase latency quite a bit for the more distant places that want the search results?
    * Bandwidth/redundancy: if you have all your eggs in one basket, as it were, aren't you going to have to pay extra to have lots of extra fibre laid down to handle all that traffic? What about natural disasters? If you have all your data centres in a single location, then surely you run the risk of things going pear-shaped if it burns down, suffers earthquakes, aliens destroy the building, etc.
    * Cooling: because it's in the desert, isn't a lot of the electricity that is generated going to go towards cooling, to deal with not only the outside heat but also the heat generated by the servers? Surely it makes more sense to build in a colder climate, say further north, and use hydroelectricity? (That is, if you're talking about exclusively using natural electricity sources that don't actively pollute and aren't radioactive.)

  5. Re:No way by Anonymous Coward · · Score: 4, Insightful

    Greater than 99.9% efficiency? They likely made a mistake in their measurements.

    Maybe they measured 99.92% efficiency.

    That is greater than 99.9% efficiency and they aren't breaking any laws of thermodynamics.

  6. 99.9% efficiency by Anonymous Coward · · Score: 4, Insightful

    This is a questionable number. The best DC-DC conversion is around 95%, so they aren't including voltage conversions from the battery to what the system is actually using.
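
    The point about stacked conversions can be made concrete: stages in series multiply, so the overall figure can never beat the weakest stage. The sketch below is only an illustration. The function name `chain_efficiency` is hypothetical, 0.999 is the figure Google quoted, and 0.95 is this comment's estimate for DC-DC conversion rather than a measured value.

```python
from math import prod


def chain_efficiency(stages):
    """Overall efficiency of power stages connected in series (product of each stage)."""
    return prod(stages)


battery_path = 0.999  # figure quoted by Google for the on-board battery
dc_dc = 0.95          # commenter's estimate for the best DC-DC conversion

overall = chain_efficiency([battery_path, dc_dc])
print(f"overall: {overall:.1%}")
```

    With those two inputs the combined efficiency is about 94.9%, so the 99.9% claim would have to describe the battery stage alone.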

  7. Re:Who swaps out all those dead batteries? by WPIDalamar · · Score: 4, Insightful

    Or maybe they think bigger...

    They're deploying containers of servers. Maybe when a container gets to a certain age or a certain failure rate, they replace/refurbish the entire container.

    I doubt they care if some of their nodes go down in a power outage as long as some percentage of them stay up.

  8. Re:Who swaps out all those dead batteries? by mlwmohawk · · Score: 4, Insightful

    Hundreds of thousands of servers == thousands of dead batteries each month, since those batteries don't last more than a few years.

    I would imagine that the battery replacement schedule mimics the server obsolescence perfectly.

    LOL, when the battery catches fire, time to replace the server.
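
    The quoted estimate of thousands of dead batteries a month can be sanity-checked with a steady-state calculation. This is a back-of-the-envelope sketch: the fleet size of 200,000 and the 3-year battery life are assumptions standing in for "hundreds of thousands" and "a few years", and the function name `replacements_per_month` is made up for the example.

```python
def replacements_per_month(servers: int, battery_life_years: float) -> float:
    """Steady-state battery failures per month.

    Assumes one battery per server and battery ages spread evenly,
    so the whole fleet turns over once per battery lifetime.
    """
    return servers / (battery_life_years * 12)


FLEET = 200_000   # hypothetical fleet size
LIFE_YEARS = 3    # hypothetical battery lifetime

print(round(replacements_per_month(FLEET, LIFE_YEARS)))
```

    Under those assumptions the fleet loses about 5,556 batteries a month. If batteries really do last about as long as the servers, that work simply folds into the normal server replacement cycle.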