How Many Google Machines, Really?
BoneThugND writes "I found this article on TNL.NET. It takes information from the S-1 Filing to reverse engineer how many machines Google has (hint: a lot more than 10,000).
'According to calculations by the IEE, in a paper about the Google cluster, a rack with 88 dual-CPU machines used to cost about $278,000. If you divide the $250 million figure from the S-1 filing by $278,000, you end up with a bit over 899 racks. Assuming that each rack holds 88 machines, you end up with 79,000 machines.'" An anonymous source claims over 100,000.
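For anyone who wants to replay the summary's arithmetic, here is a minimal sketch. It uses only the three figures quoted above and assumes, as the article does, that the whole $250 million went to racks at the quoted price; the function and variable names are made up for illustration.

```python
# Figures quoted in the summary above.
CAPEX_USD = 250_000_000      # equipment figure attributed to the S-1 filing
RACK_COST_USD = 278_000      # quoted cost of one rack of 88 dual-CPU machines
MACHINES_PER_RACK = 88


def estimate_machines(capex_usd, rack_cost_usd, machines_per_rack):
    """Return (whole racks, machines), assuming every dollar buys racks at the quoted price."""
    racks = capex_usd // rack_cost_usd  # whole racks only
    return racks, racks * machines_per_rack


racks, machines = estimate_machines(CAPEX_USD, RACK_COST_USD, MACHINES_PER_RACK)
print(f"{racks} racks, {machines} machines")
```

Changing RACK_COST_USD is the quickest way to see how sensitive the result is to the bulk-discount objections raised in the comments.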
* of servers in the world
* of servers in the USA
* of servers running Linux
There was an article recently about how Google constantly understates various statistics about itself to mislead potential competitors. This article also said that the SEC would not allow them to do this once they became a publicly traded company.
According to calculations by the IEE, in a paper about the Google cluster, a rack with 88 dual-CPU machines used to cost about $278,000
Um, don't you think if you were buying 899 racks you might actually, you know, negotiate for a better price?
This isn't the only assumption in your analysis, and the problems with them will be compounded. What's the point of this, really?
I was always under the impression that Google used a lot of "cheap" hardware. Meaning, they only used IDE and non-rackmount machines.
So, they probably don't use "racks", but if they did, they could only get about 12-15 desktop machines (single proc) per rack. That's a whole lot fewer than the 42 1U rackmounts it takes to fill the rack.
You might be able to get machines slightly cheaper than retail if you, say, buy 79,000 of them.
because with ~80,000 machines, they can easily put a few hard drives in each, and give everyone 1gb of gmail space... I didn't think it was possible.
where do you go to buy 80,000 hard drives?
Runnin' On Empty
In your standard 42U cabinet, you're talking a half-U per server. Umm.. not happening. Let's just say I happen to know they use 2U servers, for a total of 21 per cabinet. Custom jobs - just the "floor pan" (i.e. no sides, or top for the case), system board, power supply, and I think a single (or possibly dual) hard drive (I didn't want to be too nosy staring into someone else's colo space). Oh, and network. And rumor has it, they're putting in close to 200 cabinets in just this location alone.
I wonder if Google will start up a web-hosting business? I bet you can't beat their uptime guarantees. They could provide SQL, CGI, etc., and build in multi-machine redundancy for your data just like they do for theirs. It'll be the Google server platform, just one more step to replacing Microsoft as the evil monopoly.
If you've ever read a white paper of Google's, you'd realize that they even tell people why they deal with massive clusters over mainframes: lower latency.
Sunny Dubey
...assuming 200W per server, which is probably low, but probably compensates for 79,000 being most likely an overestimate. However, that doesn't even begin to account for the energy used to keep the stuff cool.
Anyone know how many trees per second that would be? Conversion to clubbed-baby-seals-per-sec optional.
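A rough sketch of the power arithmetic, using the 200 W per server assumed above and the 79,000-machine estimate from the story. Cooling is left out, as in the comment, and the annual figure assumes the machines run flat out all year, which is an assumption added here.

```python
SERVERS = 79_000           # machine estimate from the story
WATTS_PER_SERVER = 200     # the comment's assumed draw per server
HOURS_PER_YEAR = 24 * 365  # assumes the cluster never powers down


def cluster_power_mw(servers, watts_per_server):
    """Total electrical draw in megawatts, excluding cooling."""
    return servers * watts_per_server / 1_000_000


def annual_energy_mwh(power_mw, hours=HOURS_PER_YEAR):
    """Energy used over a year at constant draw, in megawatt-hours."""
    return power_mw * hours


power_mw = cluster_power_mw(SERVERS, WATTS_PER_SERVER)
energy_mwh = annual_energy_mwh(power_mw)
print(f"{power_mw:.1f} MW, {energy_mwh:,.0f} MWh per year")
```

Trees and baby seals are left as an exercise.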
Please help metamoderate.
This is how it should be, since knowing the size of Google's hardware capacity is a very, very strategic bit of information, and the kind of thing that would allow Yahoo/MSN/whoever to get a feel for how much capital would be necessary to duplicate or improve upon it.
"It was a summer's tale: Just a boy, his Linux, and a head full of dreams..."
Interesting People 2004/05:
I know for a FACT they passed 100,000 last November. One thing the Louis calculation may have missed is Google's obsession with low cost. For example read the company's technical white paper on the Google file system. It was designed so that Google could purchase the cheapest disks possible, expecting them to have a high failure rate. What happens when you factor cost obsession into his equation?
- Keep costs down; and
- What happens inside the company, stays inside the company.
Figuring out the number of servers they have is why we're noodling over the second point, but the first point is what probably has us all thrown off. Someone in a position to know said recently that he could state as an absolute fact they have more than 100,000 servers -- and added that merely mentioning it probably violated multiple NDAs he had.
"It was a summer's tale: Just a boy, his Linux, and a head full of dreams..."
I really doubt they are spending anywhere near this for the machines themselves. A former student, now a Google employee, made one of those recruiting/marketing visits to my university last semester. I got to speak to him at length about Google's operation. According to him (and he had pictures to back this up), all of their boxen are a motherboard, an IDE drive and a processor sitting on a shelf in the rack. No cases, no fans, no CD, etc. Plus they buy in bulk and get good prices.
Isn't it scary that according to these figures, Google's datacenter should theoretically be able to DDOS the entire Internet?
Someone mentioned that they have enough bandwidth/processing power to saturate a T1000 line. Scary...
-- If you try to fail and succeed, which have you done? - Uli's moose
Yes, it is true: every time you hit Google, you are polluting the Earth.
Whereas Slashdot uses nothing but solar power.
He displayed a little numerical dyslexia... it's 359 racks, not 539, for $100 million, which makes the stats a little different: 31,592 machines, 63,184 CPUs, 63,184 GB RAM and 2,527.36 TB of disk space. I'm not sure what his logic is behind the teraflops calculation... it looks like he's taking 1 GHz == 1 GFlop, which would give about 126.4 TFlops. Aside from that error, the figures sound pretty realistic to me. But I wanna know how much bandwidth they use.
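The corrected totals can be reproduced with a short sketch. Only the rack price, the 88 machines per rack and the dual CPUs come from the story; the per-machine RAM, disk and clock speed below are assumptions worked backwards from the totals in the comment, and the teraflops line reads the rule of thumb as one GFLOP per GHz.

```python
RACK_COST_USD = 278_000
MACHINES_PER_RACK = 88
CPUS_PER_MACHINE = 2      # "dual-CPU machines" per the story

# Assumptions back-derived from the comment's totals, not stated in it:
RAM_GB_PER_MACHINE = 2
DISK_GB_PER_MACHINE = 80
CPU_GHZ = 2
GFLOPS_PER_GHZ = 1        # crude rule of thumb, not a benchmark


def cluster_stats(budget_usd):
    """Derive cluster totals from a hardware budget under the assumptions above."""
    racks = budget_usd // RACK_COST_USD
    machines = racks * MACHINES_PER_RACK
    cpus = machines * CPUS_PER_MACHINE
    return {
        "racks": racks,
        "machines": machines,
        "cpus": cpus,
        "ram_gb": machines * RAM_GB_PER_MACHINE,
        "disk_tb": machines * DISK_GB_PER_MACHINE / 1000,
        "tflops": cpus * CPU_GHZ * GFLOPS_PER_GHZ / 1000,
    }


stats = cluster_stats(100_000_000)
for name, value in stats.items():
    print(name, value)
```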
I was in the Exodus - Toyama facility in Sunnyvale, CA back in 2001, talking to some of the data center techs. They were bitching because Google DOES stack 44 -half depth- servers in a rack, on EACH SIDE (aka 88 servers per rack indeed), and about how the heat that produces is absolutely fucking insane. One of them couldn't believe the machines don't melt down, and was complaining about how frugal Google was in not giving the systems more room to breathe.
--- www.f-theocean.com
I think the parent is probably referring to some of the pictures on google's early hardware photos page, courtesy of the wayback machine. If so, the lego never necessarily went into `production', it was just when they were messing around.
Given how fast Google is, we expect that they keep all the text of all the web pages that they index in memory. If we estimate 100K machines and 4,285,199,774 web pages, that's 42,852 web pages per machine. Let's guess 1 GB RAM per machine, then that's an allocation of about 25 KB per page (quite a bit larger than the average page size, I suspect). Of course, they've probably replicated the web a few times; let's guess 3 times, so that's about 8 KB per page -- still room to spare, and it's possible that the average memory per machine is greater than 1 GB. Plus, they could compress less popular pages -- the delay of decompression in memory is probably small.
Of course, once you consider that they keep thumbnails of all the images they index, things get tight very quickly. Plus, we can't forget the actual INDEX from words to documents -- that's in memory, too. And Orkut (which is probably pretty small, come to think of it).
GMail is another story altogether. 1 GB per user for 100K users would saturate their cluster. Plus indexes for searching mail. It seems unlikely that we'll have all-memory mail accounts anytime soon.
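Here is the same back-of-envelope as a sketch. The machine count, page count, 1 GB of RAM per machine and 3x replication are the guesses from the comment; treating 1 GB as 10^9 bytes is an assumption made here, which is why the results come out a little under the rounded 25 KB and 8 KB quoted above.

```python
MACHINES = 100_000
PAGES = 4_285_199_774
RAM_BYTES_PER_MACHINE = 10**9   # 1 GB, decimal (assumption)
REPLICAS = 3                    # guessed number of copies of the web
GMAIL_QUOTA_BYTES = 10**9       # 1 GB per user

# Pages each machine must hold if the web is spread evenly.
pages_per_machine = PAGES / MACHINES

# RAM budget per page with one copy, then with three copies.
bytes_per_page = RAM_BYTES_PER_MACHINE / pages_per_machine
bytes_per_page_replicated = bytes_per_page / REPLICAS

# Users whose whole quota would fit in RAM if the cluster did nothing else.
gmail_users_in_ram = MACHINES * RAM_BYTES_PER_MACHINE // GMAIL_QUOTA_BYTES

print(f"{pages_per_machine:,.0f} pages per machine")
print(f"{bytes_per_page / 1000:.1f} KB per page, "
      f"{bytes_per_page_replicated / 1000:.1f} KB with {REPLICAS} replicas")
print(f"{gmail_users_in_ram:,} all-in-memory GMail users")
```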
I think they include infrastructure and air cooling in their $250M figure. I think these things can actually cost MORE than the racks themselves, especially if the racks consist of commodity hardware, and considering the size of their data center.
Your hospital can't just lose a few CAT scans and think oh well, he'll be in for another scan eventually.
You've never worked in a medical field have you? You'd think that that would be a big deal and in theory data integrity is a very high priority but in reality...
I used to work as the IT Manager for a diagnostic imaging and cancer treatment center (and still do contract work with them because my replacement is kind of a noob). While losing studies isn't exactly a "no big deal" situation, it's still far more common than patients will ever realize. The server that stores and processes all of the digital images from the scanning equipment is a single-CPU home-rolled P4 using some shitty onboard IDE RAID controller (doesn't even do RAID5!) running Windows 2K. The most money I could get for setting up a backup solution was the $200 an external FireWire drive cost. Somehow we never managed to lose a study once it reached my network in the 9 months I worked there, but I know three or four were deleted from the cameras themselves before being sent properly, so whoops, it's gone, gotta reschedule (and bill their insurance or Medicare again!) Two weeks ago one of the drives in that 0+1 array failed, and despite my pleadings they still haven't ordered a replacement yet...
Now it's tempting to think that this place is just a special case of cheapness and sloppiness, but from talking to the diagnostic techs (the people that operate the cameras) that's not so. That clinic is a little worse than average in terms of losing patient information, but by no means the worst some of them have seen/heard of/worked at in their careers. It's worse in general at small facilities, but even large hospitals often suffer from the same unprofessionalism.
Your bank and the phone company keep much better track of your calls or your ATM transactions than most hospitals do with your CT or MRI scans...
"Listen: We are here on Earth to fart around. Don't let anybody tell you any different!" - Kurt Vonnegut
Actually, that's pretty close to the number of copies of Red Hat Google actually paid for in 200.
From here
Open Source Java DAO Generator
Some of the reasons these techniques aren't used in enterprise computing:
Since I've seen it up close a few times, I can say that the standard "enterprise way" (Oracle/Sun/EMC) delivers very poor bang for the buck. If Google wanted to, they could deliver a modified GFS with any desired level of reliability by increasing the redundancy. And even after that bloating, it would still deliver greater bang for the buck than the conventional solutions.