Slashdot Mirror


How Many Google Machines, Really?

BoneThugND writes "I found this article on TNL.NET. It takes information from the S-1 Filing to reverse engineer how many machines Google has (hint: a lot more than 10,000). 'According to calculations by the IEEE, in a paper about the Google cluster, a rack with 88 dual-CPU machines used to cost about $278,000. If you divide the $250 million figure from the S-1 filing by $278,000, you end up with a bit over 899 racks. Assuming that each rack holds 88 machines, you end up with 79,000 machines.'" An anonymous source claims over 100,000.
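The estimate in the summary is plain division and multiplication, so it is easy to check. The sketch below only restates the quoted figures; the variable names are made up for illustration.

```python
# Figures quoted in the summary; variable names are illustrative only.
capex_usd = 250_000_000      # equipment figure taken from the S-1 filing
rack_cost_usd = 278_000      # quoted cost of one rack
machines_per_rack = 88       # dual-CPU machines per rack

racks = capex_usd / rack_cost_usd
machines = racks * machines_per_rack

print(f"racks: {racks:.1f}")
print(f"machines: {machines:,.0f}")
```

The result is a bit over 899 racks and roughly 79,000 machines, as the summary says. The estimate scales inversely with the assumed rack cost, so a cheaper rack means proportionally more machines.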

126 of 476 comments

  1. Nice Rack! by turnstyle · · Score: 4, Funny

    No wonder I'm'a Googlin'

    --
    Here's what I do: Bitty Browser & Andromeda
  2. What is that as a percentage ... by Alain+Williams · · Score: 3, Interesting

    * of servers in the world
    * of servers in the USA
    * of servers running Linux

    1. Re:What is that as a percentage ... by Anonymous Coward · · Score: 5, Funny

      Well, let's count. I have two servers at home and 8 at work. They all run linux. Now, if everyone else in the world joins this thread, we can find out.

    2. Re:What is that as a percentage ... by Phurd+Phlegm · · Score: 3, Funny
      "* of servers in the world"

      0.0001%

      * of servers in the USA

      0.00000045%

      So there are 222 times as many servers in the United States as in the entire world, and there are 1.96e13 servers in the USA. Too bad there isn't a "-1 bad arithmetic" moderation....
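The "222 times" figure can be checked directly from the two quoted percentages. The second part is an assumption: the thread does not say which machine count was used, but a guess of 88,000 machines happens to reproduce the 1.96e13 figure.

```python
# Percentages quoted from the parent comment.
pct_world = 0.0001        # claimed share of the world's servers, in percent
pct_usa = 0.00000045      # claimed share of the USA's servers, in percent

# If the same fleet is a smaller share of the USA than of the world,
# the USA must have this many times more servers than the world.
ratio = pct_world / pct_usa

# ASSUMPTION: 88,000 machines is a guess, not a figure stated in the thread.
assumed_google_machines = 88_000
implied_usa_servers = assumed_google_machines / (pct_usa / 100)

print(f"USA/world ratio: {ratio:.0f}")
print(f"implied USA servers: {implied_usa_servers:.2e}")
```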

    3. Re:What is that as a percentage ... by Lispy · · Score: 4, Funny

      Nono, only those you have root access to. Post your pass as proof.

    4. Re:What is that as a percentage ... by identity0 · · Score: 2, Funny

      Hello, my name is Thabo Mugabe and I am a subsistence farmer in Africa. I have 5 head of cattle and a herd of sheep, of which none run Linux. Thank you for asking. :)

    5. Re:What is that as a percentage ... by theCoder · · Score: 4, Funny

      OK, it's 1-2-3-4-5 -- the same as my luggage!

      --
      "Save the whales, feed the hungry, free the mallocs" -- author unknown
  3. $278k ?? by r_cerq · · Score: 5, Insightful

    That's $3159 per machine, and those are today's prices... They weren't so low a couple of years ago...
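The per-machine figure is just the quoted rack price split evenly across the 88 machines. This is a quick check that ignores switches, cabling and the rack itself:

```python
rack_cost_usd = 278_000   # quoted cost of one rack
machines_per_rack = 88    # machines per rack, from the summary

cost_per_machine = rack_cost_usd / machines_per_rack
print(f"${cost_per_machine:,.0f} per machine")
```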

    1. Re:$278k ?? by toddler99 · · Score: 4, Informative

      google doesn't buy pre-built machines; they have been building custom machines from the very beginning. Although, with fab'n their own memory, I'm sure today they do a lot more. Google runs the cheapest, most unreliable hardware you can find. It's in the software that they make up for the unreliable hardware. Though unreliable hardware is OK so long as you have staff to get the broken systems out and replaced with a new unreliable cheap-ass system. When google started they used Legos to hold their custom-built servers together.

    2. Re:$278k ?? by hjf · · Score: 3, Flamebait

      so it means if you are smart enough, you don't need to have a $1,500,000 Sun server or that kind of shit. leave that for big corporations with lame-ass programmers.
      imagine what google could do with that kind of shit

    3. Re:$278k ?? by Gilk180 · · Score: 5, Interesting

      I really doubt they are spending anywhere near this for the machines themselves. A former student, now a Google employee, made one of those recruiting/marketing visits to my university last semester. I got to speak to him at length about Google's operation. According to him (and he had pictures to back this up), all of their boxen are a motherboard, an IDE drive and a processor sitting on a shelf in the rack. No cases, no fans, no CD, etc. Plus they buy in bulk and get good prices.

    4. Re:$278k ?? by jarich · · Score: 2, Insightful
      Agreed. If you are able to code in your fault tolerance, it's a heck of a lot cheaper than buying it.

      What's cheaper: buying a round-robin DNS router (hardware) or coding your client to try the next web server in its list (software)? Now, multiply that savings for every customer you sell to.

      The problem is finding someone who knows how to do that robustly and reliably. Most places have trouble finding developers whose programs don't crash every 15 minutes. This sort of thing is a little more advanced.
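A minimal sketch of the "try the next web server in its list" idea, assuming the client already has a list of equivalent server URLs. The function names (`http_get`, `fetch_with_failover`) are hypothetical, and real code would also want retries, backoff and health tracking.

```python
import urllib.request


def http_get(url, timeout=2.0):
    """Fetch a URL with the standard library; raises OSError on failure."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_failover(servers, fetch=http_get):
    """Try each server in order and return the first successful response."""
    last_error = None
    for url in servers:
        try:
            return fetch(url)
        except OSError as err:   # URLError and socket timeouts are OSError subclasses
            last_error = err     # remember the failure and move on to the next server
    raise ConnectionError(f"all {len(servers)} servers failed") from last_error
```

Passing the fetch function in as a parameter keeps the failover logic testable without a network.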

    5. Re:$278k ?? by sql*kitten · · Score: 5, Insightful

      so it means if you are smart enough, you don't need to have a $1,500,000 Sun server or that kind of shit. leave that for big corporations with lame-ass programmers. imagine what google could do with that kind of shit

      The difference is that if Google loses track of a few pages due to node failure it's no big deal because a) they don't guarantee to index every page on the web anyway and b) the chances are that page will be spidered again in the near future - and it may not even still exist anyway.

      Your bank, on the other hand, can't just "lose" a few transactions here and there. FedEx can't just lose a few packages here and there. Sure they occasionally physically lose one, but they never lose the information that at one point, they did have it. Your phone company can't just lose a few calls you made and not bill you for them. Your hospital can't just lose a few CAT scans and think oh well, he'll be in for another scan eventually.

      Now, I'm not saying that Google's technique isn't clever - I'm saying that it can't really be generalized to other applications. And that's why very smart people - and big corporations can afford to hire very smart people - keep on buying Sun and IBM kit by the boatload.

    6. Re:$278k ?? by Anonymous Coward · · Score: 2, Informative

      Google Filesystem replicates same data on three nodes (by default, can be configured to more), so the probability of data loss is rather small. Source here.
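A back-of-the-envelope sketch of why three copies make loss unlikely. It assumes each node loses its copy independently with the same probability over some window, which real clusters only approximate (correlated failures and re-replication are ignored). The function name and the 1% figure are illustrative, not from the source.

```python
def loss_probability(p_node, replicas=3):
    """Chance that every replica is lost, assuming independent node failures."""
    return p_node ** replicas


# Hypothetical example: a 1% chance that any one node loses its copy.
for k in (1, 2, 3):
    print(k, "replica(s):", loss_probability(0.01, k))
```

Under these assumptions each extra replica multiplies the loss probability by the per-node failure probability.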

    7. Re:$278k ?? by geniusj · · Score: 5, Insightful

      I can confirm this as well.. I have seen their racks in Equinix in Ashburn, VA. I pass by their cages every time I go to my cage there. I believe I also saw them in Exodus in Santa Clara a couple of years ago. They are 1U half depth and do indeed lack a case. There are definitely thousands of their servers in Ashburn, VA, and they are very space efficient (as they would need to be).

    8. Re:$278k ?? by Anonymous Coward · · Score: 5, Informative

      The high-end Sun machines are designed for high availability. Not only will a CPU failure not crash the machine, the CPUs are hot swappable so you can replace a failed CPU without so much as a reboot.

    9. Re:$278k ?? by sql*kitten · · Score: 5, Informative

      Any of those 64 CPUs fails, and your system will crash.

      Doesn't work like that, kid. A CPU on a high-end Sun fails, and the system will keep on running. You can swap the CPU out and replace it with a new one, the system will simply pick it up, assign threads to it, and keep on running. Had a couple of CPUs fail a little while ago... the first we users noticed of it was that the application slowed down slightly. Sysadmin just said yeah, I know, I'll replace 'em when the parts arrive this afternoon. Cool, we said. No data lost, no need to shut down or even restart our app. 'Course you gotta architect your app to deal with that - like don't have just one thread that does a crucial task, 'cos there's a chance that might be on the CPU that fails. But still, it's no big deal.

    10. Re:$278k ?? by Anonymous Coward · · Score: 5, Informative

      Dude, big iron is not comparable in the slightest to that dinky little dual PPro Linux 'server' you keep in your closet. A CPU can fail, on a live running system, and the machine and Solaris or AIX won't even hiccup. Your application will notice, because suddenly a couple of its threads will quit, but that's ok, software like Oracle already knows how to deal with failed transactions. And if you can schedule a CPU board removal/swap, then there won't be ANY problems at all, as the OS will migrate threads to other CPUs and allow the removal of hardware.

      And hey, if you want to mix and match CPU types (uSparc 2 and 3, etc), speeds, etc, no problem either. So if you wanna upgrade your server's CPUs, there will be zero downtime, you just do it a board at a time (board = 2 or 4 CPUs).

    11. Re:$278k ?? by mikis · · Score: 2, Informative

      google doesn't buy pre-built machines

      Yes, they do.

    12. Re:$278k ?? by jburroug · · Score: 4, Interesting

      Your hospital can't just lose a few CAT scans and think oh well, he'll be in for another scan eventually.

      You've never worked in a medical field have you? You'd think that that would be a big deal and in theory data integrity is a very high priority but in reality...

      I used to work as the IT Manager for a diagnostic imaging and cancer treatment center (and still do contract work with them because my replacement is kind of a noob). While losing studies isn't exactly a "no big deal" situation, it's still far more common than patients will ever realize. The server that stores and processes all of the digital images from the scanning equipment is a single CPU home rolled P4 using some shitty onboard IDE raid controller (doesn't even do RAID5!) running Windows 2K. The most money I could get for setting up a backup solution was the $200 an external firewire drive cost. Somehow we never managed to lose a study once it reached my network in the 9 months I worked there, but I know three or four were deleted from the cameras themselves before being sent properly, so whoops it's gone, gotta reschedule (and bill their insurance or Medicare again!) Two weeks ago one of the drives in that 0+1 array failed and despite my pleadings they still haven't ordered a replacement yet...

      Now it's tempting to think that this place is just a special case of cheapness and sloppiness, but from talking to the diagnostic techs (the people that operate the cameras) that's not so. That clinic is a little worse than average in terms of losing patient information, but by no means the worst some of them have seen/heard of/worked at in their careers. It's worse in general at small facilities, but even large hospitals often suffer from the same unprofessionalism.

      Your bank and the phone company keep much better track of your calls or your ATM transactions than most hospitals do with your CT or MRI scans...

      --
      "Listen: We are here on Earth to fart around. Don't let anybody tell you any different!" - Kurt Vonnegut
    13. Re:$278k ?? by Anonymous Coward · · Score: 2, Informative
      DIMMs is one thing, CPU's are another.
      If it's a soft error (a "cosmic ray" or whatever), then it will log it and keep going. For CPUs, if you haven't affinitied any processes/threads to one, you should be able to do:
      psradm -v -f cpuid
      to take the processor offline yourself. Obviously failing is not exactly the same, but I think I have had some fail without crashing. Unless, of course, you are trolling, and if so, you got me.
    14. Re:$278k ?? by Hans+Lehmann · · Score: 2, Funny
      When google started they used Legos to hold their custom-built servers together

      They clearly had to put an end to that practice once they reached their first 10,000 machines or so. Have you looked at the price of Legos lately?

      --
      09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
    15. Re:$278k ?? by tekunokurato · · Score: 2, Informative

      Funny--I've spent LOADS of time on the consumer side of Brigham & Women's/Dana Farber in Boston getting various operations, chemotherapies, catscans, etc, and seeing many different doctors and nurses (don't worry I'm fine now ^_^). I'm consistently impressed with both the ubiquity and the reliability of their information systems. They're extremely universal and always seem to display quite simply exactly what Medical Care Personnel X needs to access. Perhaps a model to suggest to your clinic.

    16. Re:$278k ?? by Craigy · · Score: 2, Informative

      As it happens on high end (regatta-class) pSeries Kit from IBM, you don't need to worry about how your app works either as the firmware reassigns your (single, very important) thread to a still-working CPU. Craigy

  4. Can you imagine by Sadiq · · Score: 4, Funny

    Can you imagine a beowul.... oh.. wait..

    --
    SysWear - Geek T-shirts (UK/Europe)
    1. Re:Can you imagine by mkavanagh · · Score: 5, Funny

      can you imagine a beowulf cluster of karma in soviet russia whoring YOU, you insensitive cliched clod?

  5. Google, will you marry me? by Anonymous Coward · · Score: 5, Funny

    1) google is so pretty and smart
    2) google is worth so much money
    3) google has a huge rack!!

    1. Re:Google, will you marry me? by Anonymous Coward · · Score: 5, Funny

      whoops, forgot to sign the letter!

      Love,
      Yahoo.

    2. Re:Google, will you marry me? by slickwillie · · Score: 3, Funny

      Of course you can't marry Google, but you might have a chance with Sergey.

    3. Re:Google, will you marry me? by Bobdoer · · Score: 2, Funny

      There's only one problem: she's always being used by other men.

    4. Re:Google, will you marry me? by madsh · · Score: 2, Funny

      Don't worry...

      just send it through your gmail...

      and you will also get a few ads about rings and flowers to go with that...

      Mads

  6. IPO changes things by Have+Blue · · Score: 5, Interesting

    There was an article recently about how Google constantly understates various statistics about itself to mislead potential competitors. This article also said that the SEC would not allow them to do this once they became a publicly traded company.

  7. Why do we care? by the_raptor · · Score: 4, Funny

    Seriously? What is the point of this article? What next? Linus found to prefer blue ink, over black ink?

    --

    ========
    CINC, 4th Penguin Legion
    1. Re:Why do we care? by 0xC0FFEE · · Score: 5, Funny

      Well, everybody knows that black ink is best for the job and that Linus prefers it.

  8. Not unexpected... by avalys · · Score: 4, Insightful

    I don't think this is that strange: after all, that 10,000 machines figure is several years old. It's only logical that Google has expanded their facilities since then.

    --
    This space intentionally left blank.
  9. At $699 per CPU by earthforce_1 · · Score: 5, Funny

    SCO now knows how big an invoice to send Google! :-D

    --
    My rights don't need management.
    1. Re:At $699 per CPU by ComaVN · · Score: 4, Funny

      Yeah, but they stole it from SCO.

      --
      Be wary of any facts that confirm your opinion.
    2. Re:At $699 per CPU by damiam · · Score: 2, Informative

      No. They run Linux, with their own proprietary software over it.

      --
      It's hard to be religious when certain people are never incinerated by bolts of lightning.
  10. What a waste by Anonymous Coward · · Score: 3, Funny
    I'm sure a single IBM mainframe could do the same amount of work in half the amount of time and cost a fraction of what that Linux cluster cost.

    I hang around too many old-timer mainframe geeks. MVS forever!!! and such.

    1. Re:What a waste by phoxix · · Score: 5, Interesting

      If you've ever read a white paper of Google's, you'd realize that they even tell people why they deal with massive clusters over mainframes: lower latency.

      Sunny Dubey

    2. Re:What a waste by Waffle+Iron · · Score: 5, Informative
      I'm sure a single IBM mainframe could do the same amount of work in half the amount of time and cost a fraction of what that Linux cluster cost.

      Mainframes are optimized for batch processing. Interactive queries do not take full advantage of their vaunted I/O capacity.

      Moreover, while a mainframe may be a good way to host a single copy of a database that must remain internally consistent, that's not the problem Google is solving. It's trivial for them to run their search service off of thousands of replicated copies of the Internet index. Even the largest mainframe's storage I/O would be orders of magnitude smaller than the massively parallel I/O operations done by these thousands of PCs. Google has no reason to funnel all of the independent search queries into a single machine, so they shouldn't buy a system architecture designed to do that.

  11. Assumptions? by waytoomuchcoffee · · Score: 4, Interesting

    According to calculations by the IEEE, in a paper about the Google cluster, a rack with 88 dual-CPU machines used to cost about $278,000

    Um, don't you think if you were buying 899 racks you might actually, you know, negotiate for a better price?

    This isn't the only assumption in your analysis, and the problems with them will be compounded. What's the point of this, really?

    1. Re:Assumptions? by digitac · · Score: 5, Funny

      That's right, they probably got in on the "Buy 899 Get 1 Free" Sale. So in reality they have a nice even 900 racks. Makes much more sense that way.

    2. Re:Assumptions? by 2MuchC0ffeeMan · · Score: 4, Insightful

      i thought of this too, but then i thought that they probably bought them 5/10/20 at a time as they grew.

      --
      Runnin' On Empty .... I'm Still Alive
  12. Maybe just me... by hot_Karls_bad_cavern · · Score: 4, Insightful

    Might just be me, but damn, don't you think this has raised the interest of our three letter entities? i mean, damn that is just some serious computing and indexing power on cheap, "disposable" hardware...with a filesystem that can keep track of that many machines? If i headed one of such entities, i'd sure want to know more about it!

  13. Come on! Does it really matter? by diegomontoya · · Score: 2, Insightful

    My guess is just as your guess which would be:

    your guess + 1 = my guess.

    We already know they have enough servers to saturate a T1000 line so might as well stop here and talk about something more constructive.

    1. Re:Come on! Does it really matter? by MrHanky · · Score: 4, Funny
      How in the hell does a present day search engine saturate a fictional liquid metal robot from the future???

      Well, that depends on what sort of time portal they use. Now, a T1000 would probably be saturated by a time portal following the Terminator rules: one way only. But Google seems to favour Back to the Future rules, as shown by number of hits:

      13,500,000 for back to the future

      3,460,000 for terminator

      This would make saturating a T1000 a lot easier, since you could saturate it while travelling back in time yourself, or maybe even while standing still in time. This would make Google's bandwidth infinite, as a measly T1000 would stand still. Unless it was using its own time portal to travel back in time to destroy Google, but that would create a paradox, since, as we all know, Google will become Skynet, which will create the T1000 in the first place.

      What I'm trying to say is: I don't know, but I'm sure Google could do it.

  14. Re:Pretty Broad by avalys · · Score: 4, Insightful

    Yes, but aside from dealing with hardware failures and other physical / logistical problems, there really isn't much of a difference between managing 45,000 computers and managing 80,000. They're both Really Big Numbers, and I'm sure whatever software they're using is scalable enough to smoothly handle many more machines than that.

    --
    This space intentionally left blank.
  15. wait by Docrates · · Score: 4, Insightful

    Remember there's a little thing called "volume discount"...

    It's gotta be more than that.

    --

    There are two kinds of people in the world: Those with good memory.
  16. All that power by Chucklz · · Score: 5, Funny

    With all those TFlops, no wonder Google converts units so quickly.

  17. Really? by irikar · · Score: 5, Funny

    You mean the PigeonRank(tm) technology is a hoax?

  18. This is actually useful by 2MuchC0ffeeMan · · Score: 2, Interesting

    because with ~80,000 machines, they can easily put a few hard drives in each, and give everyone 1gb of gmail space... I didn't think it was possible.

    where do you go to buy 80,000 hard drives?

    --
    Runnin' On Empty .... I'm Still Alive
    1. Re:This is actually useful by Anonymous+Brave+Guy · · Score: 4, Insightful
      where do you go to buy 80,000 hard drives?

      You don't; their Sales Director comes to you...

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    2. Re:This is actually useful by drinkypoo · · Score: 4, Funny

      You almost have it right... first, a couple of hookers come to you. Then, a few hours later, they are followed by the sales director.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  19. 88 machines per rack? hardly. by cyclop5 · · Score: 3, Interesting

    In your standard 42U cabinet, you're talking a half-U per server. Umm.. not happening. Let's just say I happen to know they use 2U servers, for a total of 21 per cabinet. Custom jobs - just the "floor pan" (i.e. no sides, or top for the case), system board, power supply, and I think a single (or possibly dual) hard drive (I didn't want to be too nosy staring into someone else's colo space). Oh, and network. And rumor has it, they're putting in close to 200 cabinets in just this location alone.
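The density argument is simple division. The sketch assumes a standard 42U cabinet with every unit usable for servers, which ignores space for switches and power:

```python
rack_units = 42                      # standard full-height cabinet

# 88 machines per rack, as the article assumes
units_per_server = rack_units / 88
print(f"{units_per_server:.2f}U per server at 88 per rack")

# 2U servers, as this comment reports seeing
servers_per_rack_2u = rack_units // 2
print(f"{servers_per_rack_2u} servers per rack at 2U each")
```

At 88 per rack each machine gets just under half a U, which is why the figure only works with half-depth or blade-style packaging like that described elsewhere in the thread.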

    1. Re:88 machines per rack? hardly. by Anonymous Coward · · Score: 2, Informative

      In your standard 42U cabinet, you're talking a half-U per server. Umm.. not happening.

      IBM has a blade center that can hold 84 2-way blades in a 42U cabinet.

    2. Re:88 machines per rack? hardly. by PenguinOpus · · Score: 4, Informative

      Racksaver was selling dual-machine 1U racks for several years and I owned a few of them. Think deep, not tall. Racksaver seems to have renamed itself Verari and only has dual-Opteron in a 1U now. Most dense configs seem to be blade-based these days. Verari advertises 132 processors in a single rack, but I suspect they are not king in this area.

      If Google is innovating in this area, it could either be on price or in density.

    3. Re:88 machines per rack? hardly. by cyclop5 · · Score: 3, Interesting

      From the cabinets I saw, it was definitely 2U vertical space. It was one of those things that surprised me a little - I would have assumed they'd use blade servers, or at least 1U boxes just to get the rack density. So when I had the opportunity to "sneak a peek", I tried to notice as much as I could, without poking and prodding. Unfortunately, there wasn't much to notice, other than what I mentioned previously. That, and they were all pre-installed in the cabinets before shipping out to the colo. (There were 30 or 40 cabinets in the shipping/receiving area of the colo).

  20. Power by ManFromAnotherPlace · · Score: 3, Funny

    This many computers must use quite a bit of power and they probably also need some serious air conditioning. I sure wouldn't want to receive their electricity bill by mistake. :)

  21. Google hosting by titaniam · · Score: 4, Interesting

    I wonder if google will start up a web-hosting business? I bet you can't beat their uptime guarantees. They could provide sql, cgi, etc, and build in multi-machine redundancy for your data just like they do for theirs. It'll be the google server platform, just one more step to replacing Microsoft as the evil monopoly.

    1. Re:Google hosting by cyberformer · · Score: 4, Insightful

      If they did, there's a real chance that there could be no more Internet for a lot of applications: people would just upload their Web pages to Google, users would log on to Google to search, and most email will go through Gmail.

      This is a good thing for Google, but not for the world as a whole.

    2. Re:Google hosting by Angostura · · Score: 4, Interesting

      Actually, I would be more worried if I was Akamai. If Google went after the corporate market and offered some kind of grid-esque caching-and-execution environment, that would be something to look at. However it would need some rather nifty scheduling and admin tools, and would add a lot of complexity, so I don't think that's too likely.

    3. Re:Google hosting by Ian+Bicking · · Score: 3, Interesting

      There's an interesting article comparing Google and Akamai which talks about that as well, since they have technical similarities, but are strategically very different -- Akamai does massive web hosting, while Google does massive web applications.

  22. Someone call the FBI by Durandal64 · · Score: 4, Funny

    The number of machines Google uses is considered a trade secret. By attempting to determine how many machines they have, you're in violation of the DMCA. I'm calling the FBI.

  23. I have seen the light by Anonymous Coward · · Score: 3, Informative

    working at abovenet, i've seen google pull their machines in and out of our data centers many times. it's incredible the way they have their shit set up.

    they fit about 100 or so 1u's on each side of the rack. they're double-sided cabinets that look like refrigerators. they're separated in the center by no-name brand switches and they have caster wheels on the bottoms of them. google can at the drop of a dime roll their machines out of a datacenter onto their 16 wheeler, move, unload and plug into a new data center in less than a day's time.

  24. Makes Perfect Since by peterdaly · · Score: 2, Insightful

    Since the 10k server number was first floated, I believe google has added quite a few, meaning 6 to 10 whole new datacenters around the world.

    It would only make sense that the server count would now be in the ballpark of what is mentioned here.

    Google hasn't been standing still, and I've heard the "Google has 10k servers" for 1-2 years now.

    -Pete

  25. 15 Megawatts by SuperBanana · · Score: 4, Interesting

    ...assuming 200W per server, which is probably low, but probably compensates for 79,000 being most likely an overestimate. However, that doesn't even begin to account for the energy used to keep the stuff cool.
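The headline number follows from the two figures in the comment: the article's 79,000 machines and an assumed 200 W per server. Cooling is left out, as the comment notes.

```python
machines = 79_000          # estimate from the article
watts_per_server = 200     # the comment's assumption

total_watts = machines * watts_per_server
total_mw = total_watts / 1_000_000
print(f"{total_mw:.1f} MW")
```

This gives a little under 16 MW, which the subject line rounds down to 15.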

    Anyone know how many trees per second that would be? Conversion to clubbed-baby-seals-per-sec optional.

    1. Re:15 Megawatts by gspr · · Score: 5, Funny

      According to Google herself dried wood contains 15.5 MJ of energy per kg. It seems that Google consumes about 1 kg of wood per second (if they've found a way to utilize 100% of the energy, which they of course have - they're Google, after all), and that the pigeons are just there to use their wings to dry the wood!
      We're on to you, Google!
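The wood figure can be reproduced from the parent's 15 MW and the quoted 15.5 MJ/kg, assuming (as the joke does) perfect conversion of heat to electricity:

```python
power_watts = 15e6               # 15 MW, i.e. 15e6 joules per second
wood_energy_j_per_kg = 15.5e6    # 15.5 MJ per kg of dried wood, as quoted

kg_per_second = power_watts / wood_energy_j_per_kg
print(f"{kg_per_second:.2f} kg of wood per second")
```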

    2. Re:15 Megawatts by Sponge+Bath · · Score: 3, Funny

      The servers could be powered by 15 Megahamsters on treadmills (@ 1 watt/hamster). But that would require sufficient management to motivate the hamsters with the threat of off-shoring their jobs.

    3. Re:15 Megawatts by glenstar · · Score: 4, Funny
      According to Google herself...

      Hm... Google seems decidedly male to me.

      1) Answers questions rapidly without offering any description of how the answer was derived? Check.

      2) Works in short, fast bursts of energy and then tells you proudly it only took them .009 seconds? Check

      3) Has an inability to accessorize his appearance? Check.

      4) Returns 82,200,000 results when asked about porn? Check and match!

  26. Heat by gspr · · Score: 5, Informative

    A Pentium 4 dissipates around 85 W of heat. I don't know what the Xeon does, but let's be kind and say 50 W (wild guess). Using the article's "low end" estimate, that brings us to 4.7 MW!
    I hope they have good ventilation...
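The comment does not state which machine count the "low end" estimate uses. Working backwards from 4.7 MW, and assuming two CPUs per machine at the guessed 50 W each, gives the implied count. Both the dual-CPU assumption and the reading of the total as CPU heat only are guesses.

```python
total_heat_watts = 4.7e6     # figure given in the comment
watts_per_cpu = 50           # the comment's "wild guess" for a Xeon
cpus_per_machine = 2         # ASSUMPTION: dual-CPU machines, as in the article summary

implied_machines = total_heat_watts / (watts_per_cpu * cpus_per_machine)
print(f"implied machine count: {implied_machines:,.0f}")
```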

    1. Re:Heat by Neil+Blender · · Score: 5, Funny

      Google engineer reading your post: OH SHIT!

      <sound of door slamming>

      <sound of car engine starting>

      <sound of tires squealing away>

    2. Re:Heat by gammelby · · Score: 5, Informative
      I once attended a talk by google fellow Urs Hölzle on the google architecture, and he mentioned how they handle the cooling issue: They do not depend on each individual unit to be cooled separately - instead they have an enormous flow of air between the racks (sitting back to back), generated by some large fan in the roof.

      Ulrik

    3. Re:Heat by bob_jordan · · Score: 4, Funny

      More like a google engineer taking wet squelching footsteps to the door crying out,

      "I'm meeeelllllttttiinnnnnggggg"

      Bob.

    4. Re:Heat by Dawn+Keyhotie · · Score: 2, Funny
      Well, by my calculations, taking into account the effects of Einstein's theory of general relativity, once the data center is accelerated to 88mph, it will consume exactly 1.21GW of electricity!

      Now, where's my Delorean?

      Cheers!

      --
      "The only good windmill is a tilted windmill."
  27. SCO by WindBourne · · Score: 2, Insightful

    Since it is known that Google has the largest installed base of Linux and now they are about to go IPO in the billions, I wonder why SCO has not gone after them? Apparently, it is not use of Linux that makes SCO pursue a company.

    The interesting thing is that if SCO really has MS backing and MS is pulling strings, then I would think that MS would want SCO to pursue Google to tie them up for a while.

    --
    I prefer the "u" in honour as it seems to be missing these days.
  28. hardcore by mooosenix · · Score: 5, Funny
    After many scientific and time consuming experiments, we have found the number of servers to be.........

    42.

    1. Re:hardcore by njcoder · · Score: 2, Interesting
      "42"

      Actually, that's pretty close to the number of copies of Red Hat Google actually paid for in 200.

      The price was right; Google doesn't pay any significant amount of money to Red Hat. Google downloads the software for free and gets support in-house and from the Linux community. Google actually paid for only about 50 copies of Red Hat, and those purchases were more of a goodwill gesture. "I feel like I should be nice, so when I go to Fry's I pick up a copy," Brin said.
      From here
  29. Re:Which brings up an interesting question... by gregwbrooks · · Score: 3, Interesting
    Not a thing, in terms of the number of their servers, or internal data such as line-item hardware purchases.

    This is how it should be, since knowing the size of Google's hardware capacity is a very, very strategic bit of information, and the kind of thing that would allow Yahoo/MSN/whoever to get a feel for how much capital would be necessary to duplicate or improve upon it.

    --


    "It was a summer's tale: Just a boy, his Linux, and a head full of dreams..."
  30. Re:Cheap hardware by flxkid · · Score: 2, Funny

    Right, 12-15 per rack...they're smart enough to develop an amazing search engine, but not to understand processing power density issues...

    --
    Better VDF than VD...check it out: Data Access
  31. Why reverse engineer... by SporkLand · · Score: 5, Informative

    When you can just open "Computer Architecture: A Quantitative Approach, 3rd Edition" by Hennessy and Patterson to page 855 and find out that in summary:
    Google has 3 sites (two west coast, one east)
    Each site connected with 1 OC48
    Each OC48 hooks up to 2 Foundry BigIron 8000 ...
    80 PCs per rack * 40 racks (at an example site)
    = 3200 PCs.
    A Google site is not a homogeneous set of PCs; instead there are different types of PCs that are being upgraded on different cycles based on the price/performance ratio.
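The per-site count above is one multiplication. Extending it to all three sites assumes each site matches the example site, which the comment does not claim, so treat the total as a rough guess.

```python
pcs_per_rack = 80
racks_per_site = 40      # "at an example site"
sites = 3                # two west coast, one east

pcs_per_site = pcs_per_rack * racks_per_site

# ASSUMPTION: every site is the same size as the example site.
pcs_total_if_uniform = pcs_per_site * sites

print(pcs_per_site, pcs_total_if_uniform)
```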

    If you want more info get the Hennessy and Patterson book that I mentioned. Not the other version they sell. This one rocks way harder. You get to learn fun things like Tomasulo's algorithm.

    If I am violating any copyrights feel free to remove this post.

    1. Re:Why reverse engineer... by Anonymous Coward · · Score: 2, Insightful

      That book was written about 3 years ago.

  32. Re:Ask by FuzzyBad-Mofo · · Score: 2, Funny

    Or they could ask Jeeves:

    Say, Jeeves old boy: how many servers does Google have?

    Jeeves: Piss off!

  33. inside information by sir_cello · · Score: 4, Interesting

    Interesting People 2004/05:
    I know for a FACT they passed 100,000 last November. One thing the Louis calculation may have missed is Google's obsession with low cost. For example read the company's technical white paper on the Google file system. It was designed so that Google could purchase the cheapest disks possible, expecting them to have a high failure rate. What happens when you factor cost obsession into his equation?

    1. Re:inside information by gammelby · · Score: 4, Informative
      In the talk mentioned in a previous posting, Mr. Hölzle also talked about disk failures: they have so many disks (obviously of low quality, according to you) and read so much data that they cannot rely on standard CRC-32 checks. They use their own checksumming in a higher layer to circumvent the fact that CRC-32 gives a false positive result in one out of some large number of cases.
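      A rough sketch of why a 32-bit checksum stops being enough at that scale. It assumes a randomly corrupted block still matches its CRC-32 about once in 2**32 cases, and it uses SHA-256 purely as a stand-in for a stronger higher-layer checksum; the talk did not say which algorithm Google uses, and the block contents here are made up.

```python
import hashlib
import zlib

# Assumption: a randomly corrupted block still matches its CRC-32
# with probability about 1 in 2**32.
CRC32_MISS_PROBABILITY = 1 / 2**32


def expected_undetected(corrupted_blocks: int) -> float:
    """Expected number of corrupted blocks that CRC-32 alone would not flag."""
    return corrupted_blocks * CRC32_MISS_PROBABILITY


def checksums(block: bytes) -> tuple[int, str]:
    """Return the CRC-32 and a SHA-256 digest for one block.

    SHA-256 is only a stand-in for "their own checksumming".
    """
    return zlib.crc32(block), hashlib.sha256(block).hexdigest()


def verify(block: bytes, expected: tuple[int, str]) -> bool:
    """Accept a block only if both checksums match the stored pair."""
    return checksums(block) == expected


stored = checksums(b"index shard 42")      # hypothetical block contents
print(verify(b"index shard 42", stored))   # True
print(verify(b"index shard 43", stored))   # False
print(expected_undetected(2**32))          # 1.0
```

      The point of the second checksum is simply that both would have to collide at once for a bad block to slip through.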

      Ulrik

  34. I'm more interested.. by diegomontoya · · Score: 5, Funny

    in how they recycle their gigantic heat output...perhaps move data center to the windy city, open up a homeless shelter next door, and put the hot air to good use for once. They might even get a tax break on this.

    Better yet, open up a nursery (plant type) next door, build a greenhouse, and pipe 25% of the heat to it. Have you guys seen the price of trees lately? Google could make a killing with the "recycling" plant.

    1. Re:I'm more interested.. by bgarcia · · Score: 3, Funny
      I'm more interested in how they recycle their gigantic heat output...perhaps ... open up a homeless shelter next door, and put the hot air to good use for once.
      By cooking the homeless people?
      I guess that's one way to solve the homeless problem.
      --
      I'm a leaf on the wind. Watch how I soar.
    2. Re:I'm more interested.. by JDWTopGuy · · Score: 2, Funny

      Hey, with a little hot sauce... after all, beggars can't be choosers, but they make great tacos!

      --
      Ron Paul 2012
    3. Re:I'm more interested.. by identity0 · · Score: 2, Funny

      Even better - open up a greenhouse next door, and buy some grow lamps. I'm sure geeks will pay a lot of cash for "Google Doobies" :)

      Google could corner the geek pothead market with their revolutionary "plant ranking" engine, and the power consumption from the grow lamps can be easily hidden in Google's normal power bill, too.

  35. Absolutely Beautiful by Anonymous Coward · · Score: 5, Insightful

    All those machines, all that complexity and activity, all boiled down to one little box under a Google logo. The most useful input box on the internet.

    Thanks Google!

  36. Re:Pretty Broad by victor_the_cleaner · · Score: 4, Funny

    Yeah it's kind of like:

    Your wife has slept with 80 other men, or was it 200?

    Either way, it's not good for you.

  37. You're not factoring in Google's culture by gregwbrooks · · Score: 3, Interesting
    Google is all about two things from an operational standpoint:

    • Keep costs down; and
    • What happens inside the company, stays inside the company.
    Figuring out the number of servers they have is why we're noodling over the second point, but the first point is what probably has us all thrown off. Someone in a position to know said recently that he could state as an absolute fact they have more than 100,000 servers -- and added that merely mentioning it probably violated multiple NDAs he had.
    --


    "It was a summer's tale: Just a boy, his Linux, and a head full of dreams..."
  38. When the CIO was at SVLUG by nbahi15 · · Score: 2, Informative

    The CIO and Head Brainsurgeon (he really is a medical doctor) was at SVLUG last year; he said there were about 11,500 Linux boxes at Google.

  39. Re:The things you could do with that... by polyp2000 · · Score: 2, Funny

    With that much computer power at their disposal they could do some cool things - maybe some sort of distributive computing thingie or big database of some kind.

    What about building a really big search engine?
    To build a really big search engine, you're going to need some serious distributed computing, and a big database! Hey, wait a minute, that's what they are doing!

    --
    Electronic Music Made Using Linux http://soundcloud.com/polyp
  40. False advertising. by duckpoopy · · Score: 5, Funny

    They better have at least 10^100 machines, or they will be getting a call from my lawyers.

    --
    word.
  41. Environmental impact: power to 68,000 homes by XavierItzmann · · Score: 3, Funny

    Did anyone think of the electricity needed to power and cool 50,000 servers?

    The 1,100-machine Apple cluster at Virginia Tech uses 3 megawatts, sufficient to power 1,500 Virginia homes
    http://www.research.vt.edu/resmag/2004resmag/HowX.html
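
    The 68,000-homes figure in the subject line appears to come from scaling the Virginia Tech numbers linearly up to 50,000 servers. A minimal sketch of that scaling, assuming a Google server draws about as much power (including cooling) as one machine in that cluster, which is a guess rather than a measured figure:

```python
# Figures from the post: a 1,100-machine cluster drawing 3 MW,
# said to be enough for 1,500 homes.
CLUSTER_MACHINES = 1_100
CLUSTER_MEGAWATTS = 3.0
CLUSTER_HOMES = 1_500


def scale_linearly(servers: int) -> tuple[float, int]:
    """Scale the cluster's power draw and home-equivalent to `servers` machines."""
    factor = servers / CLUSTER_MACHINES
    return CLUSTER_MEGAWATTS * factor, round(CLUSTER_HOMES * factor)


megawatts, homes = scale_linearly(50_000)
print(f"{megawatts:.0f} MW, about {homes:,} homes")  # 136 MW, about 68,182 homes
```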

    Yes, it is true: every time you hit Google, you are polluting the Earth.

    --
    The next pasture is always greener
    1. Re:Environmental impact: power to 68,000 homes by A.T.+Hun · · Score: 3, Interesting

      Yes, it is true: every time you hit Google, you are polluting the Earth.

      Whereas Slashdot uses nothing but solar power.

    2. Re:Environmental impact: power to 68,000 homes by lawpoop · · Score: 4, Insightful

      Yes, it is true. We can't exist without polluting. However, I'm willing to bet, without doing the calculations, that the pollution you personally generate by querying Google is much less than what you generate browsing Slashdot on your home computer.

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
  42. Scary... DDOS? by moosesocks · · Score: 2, Interesting

    Isn't it scary that according to these figures, Google's datacenter should theoretically be able to DDOS the entire Internet?

    Someone mentioned that they have enough bandwidth/processing power to saturate a T1000 line. Scary...

    --
    -- If you try to fail and succeed, which have you done? - Uli's moose
  43. Acquisition by MrChuck · · Score: 4, Insightful
    recall that important mantra:
    The cost of acquiring the machine is a fraction of the cost of owning it.

    And let's not forget the overhead of 2 networks per machine and all the patch panels, wiring, switches. Toss in console management (which may not be on all machines at all times), monitoring and management of said machines. Oh, and one really tired guy running around.

    Disks are going to fail at a rate of several hundred or thousand PER DAY, just statistically. (along with power supplies etc)

    Toss in that in three years, ALL of those machines are obsolete.
    That's huge.

    I've got ~300 racks in a half full data center upstairs from me. All network cables run to a room below it to patch panels. Around 50% the size of the DC is cable management. Next to that is a room FILLED with chest high batteries - these are used during outages until the generators need to be kicked on. And a NOC takes up about 1/5th the space of the DC (monitoring systems worldwide, but it's got seating for maybe 40 people - tight and usually filled with 10 folks, but in a crunch we live up there).

    So that $3159 is only a bit of it. And in 3 years, all those machines will likely be replaced for whatever $3k buys then. That's about to be a 2 CPU Athlon64 box. If Sun can pull a rabbit out of its ass, we'll have 8 and 16CPU Athlon64 boxes. At least with that, some of the CPUs can talk to each other really really really fast.

    1. Re:Acquisition by Anonymous Coward · · Score: 5, Informative

      >>Disks are going to fail at a rate of several hundred or thousand PER DAY

      that's a little over the top big guy. i've worked at a 10,000 node corp doing desktop support. We lost ONE disk perhaps a week....if that much. We often went several weeks with no disks lost.

      even if you factor in multiple drives per server, say TWO (because they are servers not desktops)

      Interpolate for 100,000, that's a max of 20 disks per week...on the high end.
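
      That extrapolation can be written out as a short sketch. It takes the poster's observation (about one failed disk per week across 10,000 single-drive desktops) as the baseline and assumes, as the post does, that failures scale linearly with the number of drives; server drives under heavy load may well fail more often than desktop drives.

```python
def weekly_failures(
    machines: int,
    drives_per_machine: int,
    observed_failures_per_week: float = 1.0,  # poster's estimate
    observed_drives: int = 10_000,            # one drive per desktop assumed
) -> float:
    """Scale an observed weekly failure count linearly to a larger fleet."""
    return machines * drives_per_machine * observed_failures_per_week / observed_drives


def implied_annual_failure_rate(
    observed_failures_per_week: float = 1.0,
    observed_drives: int = 10_000,
) -> float:
    """Per-drive failure probability per year implied by the observation."""
    return 52 * observed_failures_per_week / observed_drives


fleet = weekly_failures(machines=100_000, drives_per_machine=2)
print(fleet)                                   # 20.0
print(f"{implied_annual_failure_rate():.2%}")  # 0.52%
```

      Even at ten times that per-drive rate, the result is a couple of hundred failures per week, not per day.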

    2. Re:Acquisition by onepoint · · Score: 2, Informative

      Well, it really depends on what you're willing to spend for a drive and the quality. I will agree, 10000 drives should give you about 2 to 6 failures per week. But I have seen that sometimes in a web server situation 10000 drives have a failure rate of about 15 per week. In one case (very very bad case) we had a batch of bad drives come in: the first 70 had a complete failure within 4 weeks, then the rest of the order failed within 6 months.... we never ordered that model of drive again.

      Now we have had some great luck also, where we found a brand that almost never failed for 12 to 18 months at a time, so we set up a specific policy that we used those drives as back-up redundancy drives for every main drive ( about 2500 drives ), to this day I have yet to see more than 1 failure per every 3 weeks with those drives.

      Now I have a PC at home that has been abused daily and has never had a drive failure. It's been turned on every day since 1999, so it cycles completely from hot/cold and sleep/awake. Maybe I'm lucky, but I've abused that drive consistently (and back up weekly), so maybe I'm due.

      Drive spin has become a huge factor in relation to drive failure in a web server farm. You want the fastest spin rate and at the same time you need the fast read times, but the faster spin rates give you higher failures, so you really have to learn to blend caches, hardware and software, and the dreaded mixed-drive RAID.

      best of luck to all

      Onepoint

      --
      if you see me, smile and say hello.
  44. But his low end number are Wrong... by quasi-normal · · Score: 3, Interesting

    He displayed a little numerical dyslexia... it's 359 racks, not 539, for $100 million, which makes the stats a little different: 31,592 machines, 63,184 CPUs, 63,184 GB RAM, and 2,527.36 TB of disk space. I'm not sure what his logic is behind the teraflops calculations... looks like he's taking 1Ghz==1TFlop, which would give about 126.4 TFlops. Aside from that error, the figures sound pretty realistic to me. But I wanna know how much bandwidth they use.
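
    The rack arithmetic is easy to re-run. In this sketch the $278,000 rack cost and 88 machines per rack come from the article summary; the 2 GB of RAM and 80 GB of disk per machine are back-derived from the totals quoted above, so treat them as assumptions.

```python
RACK_COST = 278_000        # dollars per 88-machine rack (article summary)
MACHINES_PER_RACK = 88
CPUS_PER_MACHINE = 2       # dual-CPU machines
RAM_GB_PER_MACHINE = 2     # assumption: 63,184 GB / 31,592 machines
DISK_GB_PER_MACHINE = 80   # assumption: 2,527.36 TB / 31,592 machines


def cluster_stats(budget: int) -> dict:
    """Estimate cluster size from a hardware budget, counting whole racks only."""
    racks = budget // RACK_COST
    machines = racks * MACHINES_PER_RACK
    return {
        "racks": racks,
        "machines": machines,
        "cpus": machines * CPUS_PER_MACHINE,
        "ram_gb": machines * RAM_GB_PER_MACHINE,
        "disk_tb": machines * DISK_GB_PER_MACHINE / 1000,
    }


low = cluster_stats(100_000_000)
high = cluster_stats(250_000_000)
print(low["racks"], low["machines"], low["cpus"])  # 359 31592 63184
print(high["racks"], high["machines"])             # 899 79112
```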

  45. Re:They don't need 40K machines! by DiscoOnTheSide · · Score: 2, Insightful

    Google also indexes images, newsgroups, has things like Froogle, as well as the upcoming Gmail. Not to mention all the research and other things they have going, on top of redundancy...

    --
    Viva La Revolucion! Buy a Mac!
  46. Re:Nobody has 88 systems in a rack by Grimster · · Score: 4, Interesting

    I was in Exodus - Toyama facility in Sunnyvale, CA back in 2001 and was talking to some of the data center techs. They were bitching because Google DOES stack 44 -half depth- servers in a rack, on EACH SIDE (aka 88 servers per rack indeed), about how the heat that produces is absolutely fucking insane, and how they can't believe the machines don't melt down. One was complaining about how frugal Google was, not giving the systems more room to breathe.

    --
    --- www.f-theocean.com
  47. lego? by sfraggle · · Score: 2, Insightful

    Sounds like a pretty stupid idea to me. Lego is expensive stuff.

    --
    were you expecting to see a sig here? perhaps you'd rather see the inside of an ambulance!
    1. Re:lego? by james+b · · Score: 2, Interesting

      I think the parent is probably referring to some of the pictures on Google's early hardware photos page, courtesy of the Wayback Machine. If so, the Lego never necessarily went into `production', it was just when they were messing around.

  48. Corrected version - Re:I have seen the light by imroy · · Score: 2, Funny

    Geez dude, go back to school and learn how to punctuate properly and the proper use of there/they're/their. I'm not a grammer/spelling nazi. Even though mistakes annoy the shit out of me, I usually let it pass. I know I make the occasional mistake myself. But your post was just too much.

    I don't know why I'm doing this, but here's a corrected version of your post:

    Working at AboveNet, Google has pulled their machines in and out of our data centers many a time. It's incredible the way they have their shit set up.

    They fit about 100 or so 1U machines on each side of the rack. They're double sided cabinets that look like refrigerators, separated in the center by noname brand switches and they have castor wheels on the bottom. Google can at the drop of a dime roll their machines out of a data centre onto their 16 wheeler then move, unload and plug into a new data centre in less than a days time.

    1. Re:Corrected version - Re:I have seen the light by NoData · · Score: 4, Funny

      I'm not a grammer/spelling nazi.

      Obviously.

    2. Re:Corrected version - Re:I have seen the light by darkmeridian · · Score: 3, Funny

      I'm not a grammer/spelling nazi.

      Obviously.

      I'm not a grammer/spelling Nazi.

      Obviously.

      --
      A NYC lawyer blogs. http://www.chuangblog.com/
    3. Re:Corrected version - Re:I have seen the light by AVryhof · · Score: 3, Funny

      According to Google...

      Did you mean: grammar

  49. Hey by daishin · · Score: 2, Funny

    Google executives sir,mam,person, do you mind if you could lend me a few boxes?

    --
    (\_/)
    (O.o) This is Bunny. Add Bunny to your signature
    (> <) to help him achieve world domination.
  50. google is starting to resemble... by Anonymous Coward · · Score: 2, Funny

    google is starting to resemble the wonka factory from willy wonka and the chocolate factory.

  51. Interesting list... by glpierce · · Score: 4, Funny

    "Your phone company can't just lose a few calls you made and not bill you for them."

    Wait, what's wrong with that one?

    --
    G
    1. Re:Interesting list... by Anonymous Coward · · Score: 2, Insightful

      Working for a phone company, I can say that "we can" and "we do" :-)

  52. Server pricing by JWSmythe · · Score: 5, Informative

    His pricing in the summary may be a bit off.

    Every article I've read about Google's servers says they use "commodity" parts, which means they buy pretty much the same stuff we buy. They also indicate that they use as much memory as possible, and don't use hard drives, or use the drives as little as possible. From my interview with Google, they asked quite a few questions about RAID0, RAID1 (and combinations of those), I'd believe they stick in two drives to ensure data doesn't get lost due to power outages.

    We get good name brand parts wholesale, which I'd expect is what they do too. So, assuming 1u Asus, Tyan, or SuperMicro machines stuffed full of memory, with hard drives big enough to hold the OS plus an image of whatever they store in memory (ramdrives?), they'd require at most 3Gb (OS) + 4Gb (ramdrive backup). I don't recall seeing dual CPU's, but we'll go with that assumption.

    The nice base machine we had settled on for quite a while was the Asus 1400r, which consisted of dual 1.4Ghz PIII's, 2Gb RAM, and 20Gb and 200Gb hard drives. Our cost was roughly $1500. They'd lower the drive cost, but increase the memory cost, so they'd probably cost about $1700, but I'm sure Google got better pricing, buying the quantity they were getting.

    The count of 88 machines per rack is a bit high. You get 80u's per standard rack, but you can't stuff it full of machines, unless you get very creative. I'd suspect they have 2 switches, and a few power management units per rack. The APC's we use take 8 machines per unit, and are 1u tall. There are other power management units, that don't take up rack space, which they may be using, but only the folks at Google really know.

    Assuming the maximum density, and equipment that was available as "commodity" equipment at the time, they'd have 2 Cisco 2948's and 78 servers per rack.

    $1700 * 78 (servers)
    +
    $3000 * 2 (switches)
    +
    $1000 (power management)
    --------
    $139,600 per rack (78 servers)

    Let's not forget core networking equipment. That's worth a few bucks. :)

    Each set of 39 servers would probably be connected to their routers via GigE fiber (I couldn't imagine them using 100baseT for this). Right now we're guesstimating 1700 racks. They have locations in 3 cities, so we'll assume they have at least 9 routers. They'd probably use Cisco 12000's, or something along that line. Checking eBay, you can get a nice Cisco 12008 for just $27,000, but that's the smaller one. I've toured a few places that had them, where people pointed at them and cited them as costing just over $1,000,000.

    So....

    $250,000,000 (ttl expenses)
    - $ 9,000,000 (routers)
    ------
    $241,000,000
    / $ 139,600
    ------
    1726 racks
    * 78 (machines per rack)
    ------
    134,628 machines

    Google has a couple thousand employees, but we've found that our servers make *VERY* nice workstations too. :) Well, not the Asus 1400r, those are built into a 1u case, but other machines we've built for servers are very easy to build into midtowers instead. Those machines don't get gobs of memory, but do get extras like nice sound cards and CD/DVD players. The price would be the same, as they'd probably still be attaching them to the same networking equipment. 132,000 servers, and 2,628 workstations and dev machines is probably fairly close to what they have.

    I believe this to be a more fair estimate, than the story gave. They're quoting pricing for a nice fast *CURRENT* machine, but Google has said before that they buy commodity machines. They do like we do. We buy cheap (relatively) and lots of them, just like Google does. We didn't pattern ourselves after Google, we made this decision long before Google even existed.
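
    The arithmetic in this estimate can be re-run as a quick check. All prices in this sketch are the guesses from the post ($1700 per server, $3000 per switch, $1000 of power management per rack, nine routers at about $1,000,000 each), not real Google costs. Note that 1726 racks times 78 machines works out to 134,628.

```python
# Per-rack guesses from the post.
SERVER_COST = 1_700
SERVERS_PER_RACK = 78
SWITCH_COST = 3_000
SWITCHES_PER_RACK = 2
POWER_MGMT_COST = 1_000

# Totals from the post and the S-1 figure it starts from.
BUDGET = 250_000_000
ROUTER_COST = 1_000_000
ROUTERS = 9

rack_cost = (
    SERVER_COST * SERVERS_PER_RACK
    + SWITCH_COST * SWITCHES_PER_RACK
    + POWER_MGMT_COST
)
# Whole racks only: leftover budget that cannot buy a full rack is dropped.
racks = (BUDGET - ROUTER_COST * ROUTERS) // rack_cost
machines = racks * SERVERS_PER_RACK

print(rack_cost)  # 139600
print(racks)      # 1726
print(machines)   # 134628
```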

    When *WE* decided to go this route, we looked at many options. The "provider" we had, before we went on our own, leasing space and bandwidth directly from Tier 1 providers, opted for the monolithic sy

    --
    Serious? Seriousness is well above my pay grade.
    1. Re:Server pricing by Andy_R · · Score: 3, Insightful

      "hard drives ... they'd require at most 3Gb (OS) + 4Gb (ramdrive backup)"

      Which is why they have no problems finding space for GMail - you can't buy full size drives as small as 7Gb anymore, so they already have countless Tbs of unused drive space in their racks.

      --
      A pizza of radius z and thickness a has a volume of pi z z a
  53. Re:Not to sound like your Mom [or Señor Ashcr by Anonymous Coward · · Score: 4, Funny

    You might also be interested to know that there are a lot of government buildings in Washington DC.

  54. I think they include infrastructure & air cool by melted · · Score: 2, Interesting

    I think they include infrastructure and air cooling in their $250M figure. I think these things can actually cost MORE than the racks themselves, especially if these racks consist of commodity hardware, and considering the size of their data center.

  55. Re: Did you factor in screws and cable? by Darthmalt · · Score: 2, Funny

    NO they just use legos, duh. Though I personally prefer duct tape

  56. No by metalhed77 · · Score: 4, Informative

    It would not be a very distributed DDOS, and that would stop any attack quite quickly. Quite simply, Google's bandwidth providers (or the providers above them) would just unplug them. They may be global, but they probably have less than 40 datacenters. It would not be distributed enough to sufficiently attack. If you could take over the same number of machines with the same amount of bandwidth, but distributed globally on various subnets (say a massive virus), *then* you'd have a DDOS machine. As is, Google's DDOS would be shut down quite quickly.

    --
    Photos.
    1. Re:No by rayvd · · Score: 3, Insightful

      The best way for Google to accomplish a DDOS if they _really_ wanted to would be to make every search result point to the target website. :)

      Now that would be impressive...

  57. Re:Not to sound like your Mom [or Señor Ashcr by geniusj · · Score: 2, Informative

    I'm not saying that it's impossible. I'm sure any dedicated individual could do it. However, tours in datacenters are typically guided (especially at equinix). As far as getting in via unlocked doors, I'd say definitely would not happen here. You have to go through about 4 doors and 4 hand scanners to get in. There are no other entrances.

    Of course, most of it is more for show than practicality. I mean, they have hand scanners on every single cage. Definitely a little bit excessive :). However, I'm sure it impresses many decision makers.

    -JD-

  58. Redundancy by crucini · · Score: 3, Interesting
    The google file system is redundant. Loss of one node does not lose data.

    Some of the reasons these techniques aren't used in enterprise computing:
    1. They're hard, and business programmers are not that bright. And nobody has encapsulated these technologies in an IT product.
    2. The system can only respond quickly to a finite set of transactions that was known at design time. It lacks the flexibility of a standard file system or relational database.
    3. By the time a business has a lot of data, it usually has enough money to store the data conventionally. Search engines are a bit different.

    Since I've seen it up close a few times, I can say that the standard "enterprise way" (Oracle/Sun/EMC) delivers very poor bang for the buck. If Google wanted to, they could deliver a modified GFS with any desired level of reliability by increasing the redundancy. And even after that bloating, it would still deliver greater bang for the buck than the conventional solutions.
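
    The claim that reliability can be bought with more redundancy can be illustrated with a very simple model. This is a generic replication sketch, not a description of how GFS actually works: it assumes each copy of a chunk sits on a different node, that nodes fail independently, and it uses a hypothetical 1% chance that a given node is lost before its data can be re-replicated.

```python
def loss_probability(node_failure_prob: float, replicas: int) -> float:
    """Chance that every replica of a chunk is lost, assuming independent failures."""
    return node_failure_prob ** replicas


def replicas_needed(node_failure_prob: float, target_loss_prob: float) -> int:
    """Smallest replica count whose loss probability is at or below the target.

    Expects 0 < node_failure_prob < 1 and target_loss_prob > 0.
    """
    replicas = 1
    while loss_probability(node_failure_prob, replicas) > target_loss_prob:
        replicas += 1
    return replicas


P_NODE = 0.01  # hypothetical per-node failure probability

print(loss_probability(P_NODE, 3))
print(replicas_needed(P_NODE, 1e-9))  # 5
```

    Each extra replica multiplies the loss probability by the per-node failure probability, which is why cheap nodes plus more copies can still come out ahead on cost.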
    1. Re:Redundancy by sql*kitten · · Score: 2, Interesting

      They're hard, and business programmers are not that bright. And nobody has encapsulated these technologies in an IT product.

      Hmm, yes. The really bright programmers are living in their parents' basement and working for IBM for free. The dumb ones are getting paid a pile of money to code up forms and reports in fancy code-generation tools, then clocking off at 5 and enjoying themselves.

      The system can only respond quickly to a finite set of transactions that was known at design time.

      Those dumb business programmers left that paradigm behind in the 80s. The tech to do it (the relational database) was developed in the 70s.

      Since I've seen it up close a few times, I can say that the standard "enterprise way" (Oracle/Sun/EMC) delivers very poor bang for the buck.

      You've "seen it up", I've "set it up", kid. Once you've been around the block a few times, you'll drop your tech-snobbery and just choose the right tool for the job.

  59. Sure, 10 years ago... by B4RSK · · Score: 2, Informative

    The high-end Sun machines are designed for high availability. Not only will a CPU failure not crash the machine, the CPUs are hot swappable so you can replace a failed CPU without so much as a reboot.

    Yes, 10 years ago this was an important thing to have... As were many other "big iron" features. And it still sounds very cool in a geeky kinda way.

    But with redundant relatively cheap clusters available, these types of things aren't worth the $$$ they used to be.

    Except at the extreme high end of the computing world, hardware is steadily progressing to commodity level.

    --
    Some people are like slinkies--basically useless but they bring a smile to your face when pushed down the stairs.