How Many Google Machines, Really?
BoneThugND writes "I found this article on TNL.NET. It takes information from the S-1 Filing to reverse engineer how many machines Google has (hint: a lot more than 10,000).
'According to calculations by the IEE, in a paper about the Google cluster, a rack with 88 dual-CPU machines used to cost about $278,000. If you divide the $250 million figure from the S-1 filing by $278,000, you end up with a bit over 899 racks. Assuming that each rack holds 88 machines, you end up with 79,000 machines.'" An anonymous source claims
over 100,000.
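The arithmetic in the summary is easy to re-run. A minimal sketch, using only the figures quoted above ($250 million, $278,000 per rack, 88 machines per rack); the function name `estimate_machines` is made up for illustration, and rounding down to whole racks is an assumption:

```python
def estimate_machines(budget_usd, rack_cost_usd, machines_per_rack):
    """Return (whole racks, machines, cost per machine) for a hardware budget."""
    racks = budget_usd // rack_cost_usd          # only whole racks are counted
    machines = racks * machines_per_rack
    cost_per_machine = rack_cost_usd / machines_per_rack
    return racks, machines, cost_per_machine


racks, machines, per_machine = estimate_machines(250_000_000, 278_000, 88)
print(f"{racks} racks, {machines} machines, ${per_machine:.0f} per machine")
```

Changing the rack price or the machines-per-rack figure moves the result a lot, which is most of what the comments below argue about.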
No wonder I'm'a Googlin'
* of servers in the world
* of servers in the USA
* of servers running Linux
That's $3159 per machine, and those are today's prices... They weren't so low a couple of years ago...
Can you imagine a beowul.... oh.. wait..
1) google is so pretty and smart
2) google is worth so much money
3) google has a huge rack!!
There was an article recently about how Google constantly understates various statistics about itself to mislead potential competitors. This article also said that the SEC would not allow them to do this once they became a publicly traded company.
Seriously? What is the point of this article? What next? Linus found to prefer blue ink, over black ink?
========
CINC, 4th Penguin Legion
I don't think this is that strange: after all, that 10,000 machines figure is several years old. It's only logical that Google has expanded their facilities since then.
This space intentionally left blank.
SCO now knows how big an invoice to send Google! :-D
My rights don't need management.
I hang around too many old-timer mainframe geeks. MVS forever!!! and such.
According to calculations by the IEE, in a paper about the Google cluster, a rack with 88 dual-CPU machines used to cost about $278,000
Um, don't you think if you were buying 899 racks you might actually, you know, negotiate for a better price?
This isn't the only assumption in your analysis, and the problems with them will be compounded. What's the point of this, really?
Might just be me, but damn, don't you think this has raised the interest of our three-letter entities? I mean, damn, that is some serious computing and indexing power on cheap, "disposable" hardware... with a filesystem that can keep track of that many machines? If I headed one of those entities, I'd sure want to know more about it!
My guess is just as your guess which would be:
your guess + 1 = my guess.
We already know they have enough servers to saturate a T1000 line so might as well stop here and talk about something more constructive.
Yes, but aside from dealing with hardware failures and other physical / logistical problems, there really isn't much of a difference between managing 45,000 computers and managing 80,000. They're both Really Big Numbers, and I'm sure whatever software they're using is scalable enough to smoothly handle many more machines than that.
Remember there's a little thing called "volume discount"...
It's gotta be more than that.
There are two kinds of people in the world: Those with good memory.
With all those TFlops, no wonder Google converts units so quickly.
You mean the PigeonRank(tm) technology is a hoax?
because with ~80,000 machines, they can easily put a few hard drives in each, and give everyone 1gb of gmail space... I didn't think it was possible.
where do you go to buy 80,000 hard drives?
Runnin' On Empty
In your standard 42U cabinet, you're talking a half-U per server. Umm.. not happening. Let's just say I happen to know they use 2U servers, for a total of 21 per cabinet. Custom jobs - just the "floor pan" (i.e. no sides, or top for the case), system board, power supply, and I think a single (or possibly dual) hard drive (I didn't want to be too nosy staring into someone else's colo space). Oh, and network. And rumor has it, they're putting in close to 200 cabinets in just this location alone.
This many computers must use quite a bit of power, and they probably also need some serious air conditioning. I sure wouldn't want to receive their electricity bill by mistake. :)
I wonder if google will start up a web-hosting business? I bet you can't beat their uptime guarantees. They could provide sql, cgi, etc, and build in multi-machine redundancy for your data just like they do for theirs. It'll be the google server platform, just one more step to replacing Microsoft as the evil monopoly.
The number of machines Google uses is considered a trade secret. By attempting to determine how many machines they have, you're in violation of the DMCA. I'm calling the FBI.
working at abovenet google has pulled there machines in and out of our data centers many a times. its incredible the way they have there shit is setup.
they fit about 100 or so 1u's on each side of the rack, there double sided cabinets that look like refrigerators. there seperated in the center by noname brand switches and they have castor wheels on the bottoms of them. google can at the drop of a dime roll there machines out of a datacenter onto there 16 wheeler, move, unload and plug into a new data center in less than a days time.
Since the 10k server number was first floated, I believe google has added quite a few, meaning 6 to 10 whole new datacenters around the world.
It would only make sense that the server count would now be in the ballpark of what is mentioned here.
Google hasn't been standing still, and I've heard the "Google has 10k servers" for 1-2 years now.
-Pete
...assuming 200W per server, which is probably low, but probably compensates for 79,000 being most likely an overestimate. However, that doesn't even begin to account for the energy used to keep the stuff cool.
Anyone know how many trees per second that would be? Conversion to clubbed-baby-seals-per-sec optional.
A Pentium 4 dissipates around 85 W of heat. I don't know what the Xeon does, but let's be kind and say 50 W (wild guess). Using the article's "low end" estimate, that brings us to 4.7 MW!
I hope they have good ventilation...
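A rough sketch of the heat arithmetic. The comment does not say which machine count it used; 47,000 dual-CPU machines is an assumption chosen here only because it reproduces the 4.7 MW figure at 50 W per CPU. The second call uses the 200 W per server and 79,000 machines mentioned elsewhere in the thread. The function name is made up for illustration.

```python
def total_power_mw(machines, watts_per_machine):
    """Total electrical load in megawatts, ignoring cooling overhead."""
    return machines * watts_per_machine / 1_000_000


# Assumption: 47,000 machines with 2 CPUs each, 50 W per CPU (CPU heat only).
cpu_heat_mw = total_power_mw(47_000, 2 * 50)

# Figures from the other comment: 79,000 machines at 200 W per whole server.
whole_server_mw = total_power_mw(79_000, 200)

print(f"CPU heat only: {cpu_heat_mw:.1f} MW")
print(f"Whole servers: {whole_server_mw:.1f} MW")
```

Neither number includes the air conditioning needed to remove that heat again.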
Since it is known that Google has the largest installed base of Linux, and now they are about to go IPO in the billions, I wonder why SCO has not gone after them? Apparently, it is not use of Linux that makes SCO pursue a company.
The interesting thing is that if SCO really has MS backing and MS is pulling strings, then I would think that MS would want SCO to pursue Google to tie them up for a while.
I prefer the "u" in honour as it seems to be missing these days.
42.
This is how it should be, since knowing the size of Google's hardware capacity is a very, very strategic bit of information, and the kind of thing that would allow Yahoo/MSN/whoever to get a feel for how much capital would be necessary to duplicate or improve upon it.
"It was a summer's tale: Just a boy, his Linux, and a head full of dreams..."
Right, 12-15 per rack... they're smart enough to develop an amazing search engine, but not to understand processing power density issues...
When you can just open "Computer Architecture: A Quantitative Approach, 3rd Edition" by Hennessy and Patterson to page 855 and find out that in summary: ...
Google has 3 sites (two west coast, one east)
Each site connected with 1 OC48
Each OC48 hooks up to 2 Foundry BigIron 8000
80 Pc's per rack * 40 racks(at an example site)
= 3200 PC's.
A Google site is not a homogeneous set of PCs; instead, there are different types of PCs that are upgraded on different cycles based on the price/performance ratio.
If you want more info, get the Hennessy and Patterson book that I mentioned. Not the other version they sell. This one rocks way harder. You get to learn fun things like Tomasulo's algorithm.
If I am violating any copyrights, feel free to remove this post.
Or they could ask Jeeves:
Say, Jeeves old boy: how many servers does Google have?
Jeeves: Piss off!
Interesting People 2004/05:
I know for a FACT they passed 100,000 last November. One thing the Louis calculation may have missed is Google's obsession with low cost. For example read the company's technical white paper on the Google file system. It was designed so that Google could purchase the cheapest disks possible, expecting them to have a high failure rate. What happens when you factor cost obsession into his equation?
in how they recycle their gigantic heat output...perhaps move data center to the windy city, open up a homeless shelter next door, and put the hot air to good use for once. They might even get a tax break on this.
Better yet, open up a nursery (plant type) next door, build a greenhouse, and pipe 25% of the heat to it. Have you guys seen the price of trees lately? Google could make a killing with the "recycling" plant.
All those machines, all that complexity and activity, all boiled down to one little box under a Google logo. The most useful input box on the internet.
Thanks Google!
Yeah it's kind of like:
Your wife has slept with 80 other men, or was it 200?
Either way, it's not good for you.
- Keep costs down; and
- What happens inside the company, stays inside the company.
Figuring out the number of servers they have is why we're noodling over the second point, but the first point is what probably has us all thrown off. Someone in a position to know said recently that he could state as an absolute fact that they have more than 100,000 servers, and added that merely mentioning it probably violated multiple NDAs he had.
The CIO and Head Brainsurgeon (he really is a medical doctor) was at SVLUG last year; he said there were about 11,500 Linux boxes at Google.
With that much computer power at their disposal they could do some cool things - maybe some sort of distributive computing thingie or big database of some kind.
What about building a really big search engine?
To build a really big search engine, you're going to need some serious distributed computing, and a big database! Hey, wait a minute, that's what they are doing!
They better have at least 10^100 machines, or they will be getting a call from my lawyers.
word.
Did anyone think of the electricity needed to power and cool 50,000 servers?
The 1,100-machine Apple cluster at Virginia Tech uses 3 megawatts, sufficient to power 1,500 Virginia homes:
http://www.research.vt.edu/resmag/2004resmag/HowX.html
Yes, it is true: every time you hit Google, you are polluting the Earth.
The next pasture is always greener
Isn't it scary that according to these figures, Google's datacenter should theoretically be able to DDOS the entire Internet?
Someone mentioned that they have enough bandwidth/processing power to saturate a T1000 line. Scary...
-- If you try to fail and succeed, which have you done? - Uli's moose
The cost of acquiring the machine is a fraction of the cost of owning it.
And let's not forget the overhead of 2 networks per machine and all the patch panels, wiring, switches. Toss in console management (which may not be on all machines at all times), monitoring and management of said machines. Oh, and one really tired guy running around.
Disks are going to fail at a rate of several hundred or thousand PER DAY, just statistically. (along with power supplies etc)
Toss in that in three years, ALL of those machines are obsolete.
That's huge.
I've got ~300 racks in a half full data center upstairs from me. All network cables run to a room below it to patch panels. Around 50% the size of the DC is cable management. Next to that is a room FILLED with chest high batteries - these are used during outages until the generators need to be kicked on. And a NOC takes up about 1/5th the space of the DC (monitoring systems worldwide, but it's got seating for maybe 40 people - tight and usually filled with 10 folks, but in a crunch we live up there).
So that $3159 is only a bit of it. And in 3 years, all those machines will likely be replaced for whatever $3k buys then. That's about to be a 2 CPU Athlon64 box. If Sun can pull a rabbit out of its ass, we'll have 8 and 16CPU Athlon64 boxes. At least with that, some of the CPUs can talk to each other really really really fast.
He displayed a little numerical dyslexia... it's 359 racks, not 539, for $100 million, which makes the stats a little different: 31,592 machines, 63,184 CPUs, 63,184 GB RAM, and 2,527.36 TB of disk space. I'm not sure what his logic is behind the teraflops calculation... it looks like he's taking 1 GHz == 1 GFlop, which would give about 126.4 TFlops. Aside from that error, the figures sound pretty realistic to me. But I wanna know how much bandwidth they use.
Google also indexes images and newsgroups, and has things like Froogle as well as the upcoming Gmail. Not to mention all the research and other things they have going, on top of redundancy...
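The corrected figures can be reproduced in a few lines. This is a sketch, not the article's own method: the per-machine specs (2 CPUs, 1 GB of RAM per CPU, 80 GB of disk, 2 GHz clocks) are assumptions inferred from the totals in the comment, and treating 1 GHz as 1 GFlop is the same crude rule of thumb the comment questions.

```python
RACK_COST_USD = 278_000      # per-rack price quoted in the story
MACHINES_PER_RACK = 88

# Assumed per-machine specs, inferred from the totals in the comment.
CPUS_PER_MACHINE = 2
RAM_GB_PER_CPU = 1
DISK_GB_PER_MACHINE = 80
CPU_GHZ = 2.0                # assumption: 1 GHz is counted as 1 GFlop

racks = 100_000_000 // RACK_COST_USD        # whole racks for $100 million
machines = racks * MACHINES_PER_RACK
cpus = machines * CPUS_PER_MACHINE
ram_gb = cpus * RAM_GB_PER_CPU
disk_tb = machines * DISK_GB_PER_MACHINE / 1000
tflops = cpus * CPU_GHZ / 1000

print(racks, machines, cpus, ram_gb, disk_tb, round(tflops, 1))
```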
Viva La Revolucion! Buy a Mac!
I was in the Exodus Toyama facility in Sunnyvale, CA back in 2001 and was talking to some of the data center techs. They were bitching because Google DOES stack 44 -half depth- servers in a rack, on EACH SIDE (aka 88 servers per rack indeed), and about how the heat that produces is absolutely fucking insane and how they can't believe the racks don't melt down. One was complaining about how frugal Google was, not giving the systems more room to breathe.
Sounds like a pretty stupid idea to me. Lego is expensive stuff.
were you expecting to see a sig here? perhaps you'd rather see the inside of an ambulance!
Geez dude, go back to school and learn how to punctuate properly and the proper use of there/they're/their. I'm not a grammer/spelling nazi. Even though mistakes annoy the shit out of me, I usually let it pass. I know I make the occasional mistake myself. But your post was just too much.
I don't know why I'm doing this, but here's a corrected version of your post:
Google executives sir,mam,person, do you mind if you could lend me a few boxes?
google is starting to resemble the wonka factory from willy wonka and the chocolate factory.
"Your phone company can't just lose a few calls you made and not bill you for them."
Wait, what's wrong with that one?
G
His pricing in the summary may be a bit off.
:)
:) Well, not the Asus 1400r, those are built into a 1u case, but other machines we've built for servers are very easy to build into midtowers instead. Those machines don't get gobs of memory, but do get extras like nice sound cards and CD/DVD players. The price would be the same, as they'd probably still be attaching them to the same networking equipment. 132,000 servers, and 2,682 workstations and dev machines is probably fairly close to what they have.
Every article I've read about Google's servers says they use "commodity" parts, which means they buy pretty much the same stuff we buy. They also indicate that they use as much memory as possible, and don't use hard drives, or use the drives as little as possible. From my interview with Google, they asked quite a few questions about RAID0, RAID1 (and combinations of those), I'd believe they stick in two drives to ensure data doesn't get lost due to power outages.
We get good name brand parts wholesale, which I'd expect is what they do too. So, assuming 1u Asus, Tyan, or SuperMicro machines stuffed full of memory, with hard drives big enough to hold the OS plus an image of whatever they store in memory (ramdrives?), they'd require at most 3Gb (OS) + 4Gb (ramdrive backup). I don't recall seeing dual CPU's, but we'll go with that assumption.
The nice base machine we had settled on for quite a while was the Asus 1400r, which consisted of dual 1.4 GHz PIIIs, 2 GB RAM, and 20 GB and 200 GB hard drives. Our cost was roughly $1500. They'd lower the drive cost but increase the memory cost, so they'd probably cost about $1700, but I'm sure Google got better pricing, buying the quantity they were getting.
The count of 88 machines per rack is a bit high. You get 80u's per standard rack, but you can't stuff it full of machines, unless you get very creative. I'd suspect they have 2 switches, and a few power management units per rack. The APC's we use take 8 machines per unit, and are 1u tall. There are other power management units, that don't take up rack space, which they may be using, but only the folks at Google really know.
Assuming the maximum density, and equipment that was available as "commodity" equipment at the time, they'd have 2 Cisco 2948's and 78 servers per rack.
$1700 * 78 (servers)
+
$3000 * 2 (switches)
+
$1000 (power management)
--------
$139,600 per rack (78 servers)
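The same rack-cost sum as a small sketch, so the inputs are easy to vary. The prices are the ones guessed in this post; the function and parameter names are made up for illustration.

```python
def rack_cost(server_cost, servers, switch_cost, switches, power_mgmt_cost):
    """Total hardware cost of one rack, in dollars."""
    return server_cost * servers + switch_cost * switches + power_mgmt_cost


cost = rack_cost(server_cost=1700, servers=78,
                 switch_cost=3000, switches=2,
                 power_mgmt_cost=1000)
print(f"${cost:,} per rack")
```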
Let's not forget core networking equipment. That's worth a few bucks.
Each set of 39 servers would probably be connected to their routers via GigE fiber (I couldn't imagine them using 100baseT for this). Right now we're guesstimating 1700 racks. They have locations in 3 cities, so we'll assume they have at least 9 routers. They'd probably use Cisco 12000s, or something along that line. Checking eBay, you can get a nice Cisco 12008 for just $27,000, but that's the smaller one. I've toured a few places that had them, and the staff pointed at them and cited them as costing just over $1,000,000.
So....
$250,000,000 (ttl expenses)
- $ 9,000,000 (routers)
------
$241,000,000
/ $ 139,600
------
1726 racks
* 78 (machines per rack)
------
134,682 machines
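A quick re-run of that division, as a sketch using this post's own guesses (9 routers at $1,000,000 each, $139,600 per rack, 78 machines per rack). Rounding down to whole racks is an assumption. Note that 1,726 racks times 78 works out to 134,628, so the 134,682 above looks like a transposed pair of digits.

```python
TOTAL_SPEND_USD = 250_000_000
ROUTER_SPEND_USD = 9 * 1_000_000   # 9 routers at roughly $1M each (a guess)
RACK_COST_USD = 139_600
MACHINES_PER_RACK = 78

racks = (TOTAL_SPEND_USD - ROUTER_SPEND_USD) // RACK_COST_USD  # whole racks only
machines = racks * MACHINES_PER_RACK

print(f"{racks} racks, {machines} machines")
```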
Google has a couple thousand employees, but we've found that our servers make *VERY* nice workstations too.
I believe this to be a more fair estimate, than the story gave. They're quoting pricing for a nice fast *CURRENT* machine, but Google has said before that they buy commodity machines. They do like we do. We buy cheap (relatively) and lots of them, just like Google does. We didn't pattern ourselves after Google, we made this decision long before Google even existed.
When *WE* decided to go this route, we looked at many options. The "provider" we had, before we went on our own, leasing space and bandwidth directly from Tier 1 providers, opted for the monolithic sy
Serious? Seriousness is well above my pay grade.
You might also be interested to know that there are a lot of government buildings in Washington DC.
I think they include infrastructure and air cooling in their $250M figure. I think these things can actually cost MORE than the racks themselves, especially if these racks consist of commodity hardware, and considering the size of their data center.
NO, they just use Legos, duh. Though I personally prefer duct tape.
It would not be a very distributed DDOS and that would stop any attack quite quickly. Quite simply google's bandwidth providers (or the providers above them) would just unplug them. They may be global, but they probably have less than 40 datacenters. It would not be distributed enough to sufficiently attack. If you could take over the same number of machines with the same amount of bandwidth, but distributed globally on various subnets (say a massive virus), *then* you'd have a DDOS machine. As is, google's DDOS would be shut down quite quickly.
Photos.
I'm not saying that it's impossible. I'm sure any dedicated individual could do it. However, tours in datacenters are typically guided (especially at equinix). As far as getting in via unlocked doors, I'd say definitely would not happen here. You have to go through about 4 doors and 4 hand scanners to get in. There are no other entrances.
Of course, most of it is more for show than practicality. I mean, they have hand scanners on every single cage. Definitely a little bit excessive :). However, I'm sure it impresses many decision makers.
-JD-
Some of the reasons these techniques aren't used in enterprise computing:
Since I've seen it up close a few times, I can say that the standard "enterprise way" (Oracle/Sun/EMC) delivers very poor bang for the buck. If Google wanted to, they could deliver a modified GFS with any desired level of reliability by increasing the redundancy. And even after that bloating, it would still deliver greater bang for the buck than the conventional solutions.
The high-end Sun machines are designed for high availability. Not only will a CPU failure not crash the machine, the CPUs are hot swappable so you can replace a failed CPU without so much as a reboot.
Yes, 10 years ago this was a important thing to have... As were many other "big iron" features. And it still sounds very cool in a geeky kinda way.
But with redundant relatively cheap clusters available, these types of things aren't worth the $$$ they used to be.
Except at the extreme high end of the computing world, hardware is steadily progressing to commodity level.
Some people are like slinkies--basically useless but they bring a smile to your face when pushed down the stairs.