Slashdot Mirror


How Many Google Machines, Really?

BoneThugND writes "I found this article on TNL.NET. It takes information from the S-1 Filing to reverse engineer how many machines Google has (hint: a lot more than 10,000). 'According to calculations by the IEEE, in a paper about the Google cluster, a rack with 88 dual-CPU machines used to cost about $278,000. If you divide the $250 million figure from the S-1 filing by $278,000, you end up with a bit over 899 racks. Assuming that each rack holds 88 machines, you end up with 79,000 machines.'" An anonymous source claims over 100,000.
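The arithmetic in the summary can be reproduced with a minimal sketch. The three constants are the figures quoted above; the function and variable names are made up for illustration, and rounding down to whole racks is an assumption.

```python
# Figures quoted in the summary; everything else here is illustrative.
EQUIPMENT_SPEND_USD = 250_000_000  # S-1 figure cited in the article
RACK_COST_USD = 278_000            # one rack of 88 dual-CPU machines
MACHINES_PER_RACK = 88


def estimate_machines(spend, rack_cost, machines_per_rack):
    """Return (whole racks, total machines) that a given spend would buy."""
    racks = spend // rack_cost  # assumption: partial racks are dropped
    return racks, racks * machines_per_rack


racks, machines = estimate_machines(
    EQUIPMENT_SPEND_USD, RACK_COST_USD, MACHINES_PER_RACK
)
print(racks, machines)                           # 899 79112
print(round(RACK_COST_USD / MACHINES_PER_RACK))  # 3159 dollars per machine
```

The result is only as good as the rack price: halve the cost per rack and the machine count doubles.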

30 of 476 comments

  1. $278k ?? by r_cerq · · Score: 5, Insightful

    That's $3159 per machine, and those are today's prices... They weren't so low a couple of years ago...

    1. Re:$278k ?? by Gilk180 · · Score: 5, Interesting

      I really doubt they are spending anywhere near this for the machines themselves. A former student, now a Google employee, made one of those recruiting/marketing visits to my university last semester. I got to speak to him at length about Google's operation. According to him (and he had pictures to back this up), all of their boxen are a motherboard, an IDE drive and a processor sitting on a shelf in the rack. No cases, no fans, no CD, etc. Plus they buy in bulk and get good prices.

    2. Re:$278k ?? by sql*kitten · · Score: 5, Insightful

      so it means if you are smart enough, you don't need to have a $1,500,000 Sun server or that kind of shit. leave that for big corporations with lame-ass programmers. imagine what google could do with that kind of shit

      The difference is that if Google loses track of a few pages due to node failure it's no big deal because a) they don't guarantee to index every page on the web anyway and b) the chances are that page will be spidered again in the near future - and it may not even still exist anyway.

      Your bank, on the other hand, can't just "lose" a few transactions here and there. FedEx can't just lose a few packages here and there. Sure they occasionally physically lose one, but they never lose the information that at one point, they did have it. Your phone company can't just lose a few calls you made and not bill you for them. Your hospital can't just lose a few CAT scans and think oh well, he'll be in for another scan eventually.

      Now, I'm not saying that Google's technique isn't clever - I'm saying that it can't really be generalized to other applications. And that's why very smart people - and big corporations can afford to hire very smart people - keep on buying Sun and IBM kit by the boatload.

    3. Re:$278k ?? by geniusj · · Score: 5, Insightful

      I can confirm this as well.. I have seen their racks in Equinix in Ashburn, VA. I pass by their cages every time I go to my cage there. I believe I also saw them in Exodus in Santa Clara a couple of years ago. They are 1U half depth and do indeed lack a case. There are definitely thousands of their servers in Ashburn, VA, and they are very space efficient (as they would need to be).

    4. Re:$278k ?? by Anonymous Coward · · Score: 5, Informative

      The high-end Sun machines are designed for high availability. Not only will a CPU failure not crash the machine, the CPUs are hot swappable so you can replace a failed CPU without so much as a reboot.

    5. Re:$278k ?? by sql*kitten · · Score: 5, Informative

      Any of those 64 CPUs fails, and your system will crash.

      Doesn't work like that, kid. A CPU on a high-end Sun fails, and the system will keep on running. You can swap the CPU out and replace it with a new one, the system will simply pick it up, assign threads to it, and keep on running. Had a couple of CPUs fail a little while ago... the first we users noticed of it was that the application slowed down slightly. Sysadmin just said yeah, I know, I'll replace 'em when the parts arrive this afternoon. Cool, we said. No data lost, no need to shut down or even restart our app. 'Course you gotta architect your app to deal with that - like don't have just one thread that does a crucial task, 'cos there's a chance that might be on the CPU that fails. But still, it's no big deal.

    6. Re:$278k ?? by Anonymous Coward · · Score: 5, Informative

      Dude, big iron is not comparable in the slightest to that dinky little dual PPro Linux 'server' you keep in your closet. A CPU can fail, on a live running system, and the machine and Solaris or AIX won't even hiccup. Your application will notice, because suddenly a couple of its threads will quit, but that's ok, software like Oracle already knows how to deal with failed transactions. And if you can schedule a CPU board removal/swap, then there won't be ANY problems at all, as the OS will migrate threads to other CPUs and allow the removal of hardware.

      And hey, if you want to mix and match CPU types (uSparc 2 and 3, etc), speeds, etc, no problem either. So if you wanna upgrade your server's CPUs, there will be zero downtime, you just do it a board at a time (board = 2 or 4 CPUs).

  2. Google, will you marry me? by Anonymous Coward · · Score: 5, Funny

    1) google is so pretty and smart
    2) google is worth so much money
    3) google has a huge rack!!

    1. Re:Google, will you marry me? by Anonymous Coward · · Score: 5, Funny

      whoops, forgot to sign the letter!

      Love,
      Yahoo.

  3. IPO changes things by Have+Blue · · Score: 5, Interesting

    There was an article recently about how Google constantly understates various statistics about itself to mislead potential competitors. This article also said that the SEC would not allow them to do this once they became a publicly traded company.

  4. At $699 per CPU by earthforce_1 · · Score: 5, Funny

    SCO now knows how big an invoice to send Google! :-D

    --
    My rights don't need management.

  5. All that power by Chucklz · · Score: 5, Funny

    With all those TFlops, no wonder Google converts units so quickly.

  6. Really? by irikar · · Score: 5, Funny

    You mean the PigeonRank(tm) technology is a hoax?

  7. Re:Assumptions? by digitac · · Score: 5, Funny

    That's right, they probably got in on the "Buy 899 Get 1 Free" Sale. So in reality they have a nice even 900 racks. Makes much more sense that way.

  8. Re:What is that as a percentage ... by Anonymous Coward · · Score: 5, Funny

    Well, let's count. I have two servers at home and 8 at work. They all run linux. Now, if everyone else in the world joins this thread, we can find out.

  9. Re:What a waste by phoxix · · Score: 5, Interesting

    If you've ever read a white paper of Google's, you'd realize that they even tell people why they deal with massive clusters over mainframes: lower latency.

    Sunny Dubey

  10. Re:Can you imagine by mkavanagh · · Score: 5, Funny

    can you imagine a beowulf cluster of karma in soviet russia whoring YOU, you insensitive cliched clod?

  11. Heat by gspr · · Score: 5, Informative

    A Pentium 4 dissipates around 85 W of heat. I don't know what the Xeon does, but let's be kind and say 50 W (wild guess). Using the article's "low end" estimate, that brings us to 4.7 MW!
    I hope they have good ventilation...
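A rough sketch of how a heat figure like this is built up. The 50 W per CPU is the comment's own wild guess and the 79,000 machines comes from the story summary; the comment does not say how many CPUs per machine it counted, so both cases are shown. CPU heat only: disks, memory and power supply losses are ignored.

```python
def cluster_heat_mw(machines, cpus_per_machine, watts_per_cpu):
    """Total CPU heat output in megawatts."""
    return machines * cpus_per_machine * watts_per_cpu / 1_000_000


MACHINES = 79_000     # "low end" count from the story summary
WATTS_PER_CPU = 50    # the comment's guess for a Xeon

# CPUs per machine is an assumption; the summary says the racks are dual-CPU.
for cpus in (1, 2):
    print(cpus, "CPU(s) per machine:",
          cluster_heat_mw(MACHINES, cpus, WATTS_PER_CPU), "MW")
```

Either way the answer lands in the megawatt range, which is the point of the comment.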

    1. Re:Heat by Neil+Blender · · Score: 5, Funny

      Google engineer reading your post: OH SHIT!

      <sound of door slamming>

      <sound of car engine starting>

      <sound of tires squealing away>

    2. Re:Heat by gammelby · · Score: 5, Informative

      I once attended a talk by Google Fellow Urs Hölzle on the Google architecture, and he mentioned how they handle the cooling issue: they do not depend on each individual unit being cooled separately. Instead they have an enormous flow of air between the racks (sitting back to back), generated by some large fan in the roof.

      Ulrik

  12. Re:Why do we care? by 0xC0FFEE · · Score: 5, Funny

    Well, everybody knows that black ink is best for the job and that Linus prefers it.

  13. hardcore by mooosenix · · Score: 5, Funny

    After many scientific and time consuming experiments, we have found the number of servers to be.........

    42.

  14. Why reverse engineer... by SporkLand · · Score: 5, Informative

    When you can just open "Computer Architecture: A Quantitative Approach, 3rd Edition" by Hennessy and Patterson to page 855 and find out that in summary:
    Google has 3 sites (two west coast, one east)
    Each site connected with 1 OC48
    Each OC48 hooks up to 2 Foundry BigIron 8000 ...
    80 PCs per rack * 40 racks (at an example site)
    = 3200 PCs.
    A Google site is not a homogeneous set of PCs; instead, different types of PCs are upgraded on different cycles based on the price/performance ratio.

    If you want more info get the Hennessy and Patterson book that I mentioned. Not the other version they sell. This one rocks way harder. You get to learn fun things like Tomasulo's algorithm.

    If I am violating any copyrights feel free to remove this post.

  15. Re:15 Megawatts by gspr · · Score: 5, Funny

    According to Google herself, dried wood contains 15.5 MJ of energy per kg. It seems that Google consumes about 1 kg of wood per second (if they've found a way to utilize 100% of the energy, which they of course have - they're Google, after all), and that the pigeons are just there to use their wings to dry the wood!
    We're on to you, Google!

  16. Re:What a waste by Waffle+Iron · · Score: 5, Informative

    I'm sure a single IBM mainframe could do the same amount of work in half the amount of time and cost a fraction of what that Linux cluster cost.

    Mainframes are optimized for batch processing. Interactive queries do not take full advantage of their vaunted I/O capacity.

    Moreover, while a mainframe may be a good way to host a single copy of a database that must remain internally consistent, that's not the problem Google is solving. It's trivial for them to run their search service off of thousands of replicated copies of the Internet index. Even the largest mainframe's storage I/O would be orders of magnitude smaller than the massively parallel I/O operations done by these thousands of PCs. Google has no reason to funnel all of the independent search queries into a single machine, so they shouldn't buy a system architecture designed to do that.

  17. I'm more interested.. by diegomontoya · · Score: 5, Funny

    in how they recycle their gigantic heat output...perhaps move data center to the windy city, open up a homeless shelter next door, and put the hot air to good use for once. They might even get a tax break on this.

    Better yet, open up a nursery (plant type) next door, build a greenhouse, and pipe 25% of the heat to it. Have you guys seen the price of trees lately? Google could make a killing with the "recycling" plant.

  18. Absolutely Beautiful by Anonymous Coward · · Score: 5, Insightful

    All those machines, all that complexity and activity, all boiled down to one little box under a Google logo. The most useful input box on the internet.

    Thanks Google!

  19. False advertising. by duckpoopy · · Score: 5, Funny

    They better have at least 10^100 machines, or they will be getting a call from my lawyers.

    --
    word.

  20. Re:Acquisition by Anonymous Coward · · Score: 5, Informative

    >>Disks are going to fail at a rate of several hundred or thousand PER DAY

    that's a little over the top big guy. i've worked at a 10,000 node corp doing desktop support. We lost ONE disk perhaps a week....if that much. We often went several weeks with no disks lost.

    even if you factor in multiple drives per server, say TWO (because they are servers not desktops)

    Interpolate for 100,000, that's a max of 20 disks per week...on the high end.
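The scaling above, as a minimal sketch. It assumes failures scale linearly with the number of disks and that server disks fail at the same rate as the desktop disks in the anecdote; both are assumptions, and the function name is made up for illustration.

```python
def weekly_disk_failures(observed_failures_per_week, observed_nodes,
                         target_nodes, disks_per_node=1):
    """Scale an observed weekly failure count linearly to a larger fleet."""
    return (observed_failures_per_week * target_nodes * disks_per_node
            / observed_nodes)


# One disk a week across 10,000 desktops, scaled to 100,000 two-disk servers.
print(weekly_disk_failures(1, 10_000, 100_000, disks_per_node=2))  # 20.0
```

Servers running flat out around the clock may well wear disks faster than office desktops, so treat the result as a floor rather than a forecast.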

  21. Server pricing by JWSmythe · · Score: 5, Informative

    His pricing in the summary may be a bit off.

    Every article I've read about Google's servers says they use "commodity" parts, which means they buy pretty much the same stuff we buy. They also indicate that they use as much memory as possible, and don't use hard drives, or use the drives as little as possible. In my interview with Google, they asked quite a few questions about RAID0, RAID1 (and combinations of those), so I'd believe they stick in two drives to ensure data doesn't get lost due to power outages.

    We get good name brand parts wholesale, which I'd expect is what they do too. So, assuming 1u Asus, Tyan, or SuperMicro machines stuffed full of memory, with hard drives big enough to hold the OS plus an image of whatever they store in memory (ramdrives?), they'd require at most 3GB (OS) + 4GB (ramdrive backup). I don't recall seeing dual CPUs, but we'll go with that assumption.

    The nice base machine we had settled on for quite a while was the Asus 1400r, which consisted of dual 1.4GHz PIIIs, 2GB RAM, and 20GB and 200GB hard drives. Our cost was roughly $1500. They'd lower the drive cost, but increase the memory cost, so they'd probably cost about $1700, but I'm sure Google got better pricing, buying the quantity they were getting.

    The count of 88 machines per rack is a bit high. You get 80u's per standard rack, but you can't stuff it full of machines, unless you get very creative. I'd suspect they have 2 switches, and a few power management units per rack. The APC's we use take 8 machines per unit, and are 1u tall. There are other power management units, that don't take up rack space, which they may be using, but only the folks at Google really know.

    Assuming the maximum density, and equipment that was available as "commodity" equipment at the time, they'd have 2 Cisco 2948's and 78 servers per rack.

    $1700 * 78 (servers)
    +
    $3000 * 2 (switches)
    +
    $1000 (power management)
    --------
    $139,600 per rack (78 servers)
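The per-rack total above as a small sketch. All prices are the guesses made in this comment, not Google's actual costs, and the names are made up for illustration.

```python
# Per-rack bill of materials, using the comment's guessed prices.
SERVER_COST_USD = 1_700
SERVERS_PER_RACK = 78
SWITCH_COST_USD = 3_000
SWITCHES_PER_RACK = 2
POWER_MGMT_COST_USD = 1_000  # lump sum for all power management in the rack


def rack_cost(server_cost=SERVER_COST_USD, servers=SERVERS_PER_RACK,
              switch_cost=SWITCH_COST_USD, switches=SWITCHES_PER_RACK,
              power_mgmt=POWER_MGMT_COST_USD):
    """Cost of one fully populated rack."""
    return server_cost * servers + switch_cost * switches + power_mgmt


print(rack_cost())  # 139600
```

The servers are about 95% of the total, so the per-server price is the number that matters.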

    Let's not forget core networking equipment. That's worth a few bucks. :)

    Each set of 39 servers would probably be connected to their routers via GigE fiber (I couldn't imagine them using 100baseT for this). Right now we're guesstimating 1700 racks. They have locations in 3 cities, so we'll assume they have at least 9 routers. They'd probably use Cisco 12000's, or something along that line. Checking eBay, you can get a nice Cisco 12008 for just $27,000, but that's the smaller one. I've toured a few places who had them, and pointed at them citing them to be just over $1,000,000.

    So....

    $250,000,000 (ttl expenses)
    - $ 9,000,000 (routers)
    ------
    $241,000,000
    / $ 139,600
    ------
    1726 racks
    * 78 (machines per rack)
    ------
    134,628 machines

    Google has a couple thousand employees, but we've found that our servers make *VERY* nice workstations too. :) Well, not the Asus 1400r, those are built into a 1u case, but other machines we've built for servers are very easy to build into midtowers instead. Those machines don't get gobs of memory, but do get extras like nice sound cards and CD/DVD players. The price would be the same, as they'd probably still be attaching them to the same networking equipment. 132,000 servers, and 2,628 workstations and dev machines is probably fairly close to what they have.
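The whole estimate can be sketched in a few lines. The inputs are the guesses from this comment ($250M total, 9 routers at about $1M each, $139,600 per rack, 78 machines per rack); rounding down to whole racks is an assumption, and the names are made up for illustration.

```python
def estimate_fleet(total_spend, router_cost, routers, rack_cost,
                   machines_per_rack):
    """Return (whole racks, machines) after setting aside router money."""
    remaining = total_spend - router_cost * routers
    racks = remaining // rack_cost  # assumption: partial racks are dropped
    return racks, racks * machines_per_rack


racks, machines = estimate_fleet(
    total_spend=250_000_000,
    router_cost=1_000_000,
    routers=9,
    rack_cost=139_600,
    machines_per_rack=78,
)
print(racks, machines)
```

Swapping in a different per-rack price or router count shows how sensitive the final number is to each guess.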

    I believe this to be a fairer estimate than the story gave. They're quoting pricing for a nice fast *CURRENT* machine, but Google has said before that they buy commodity machines. They do like we do. We buy cheap (relatively) and lots of them, just like Google does. We didn't pattern ourselves after Google, we made this decision long before Google even existed.

    When *WE* decided to go this route, we looked at many options. The "provider" we had, before we went on our own, leasing space and bandwidth directly from Tier 1 providers, opted for the monolithic sy

    --
    Serious? Seriousness is well above my pay grade.