A Look At the Workings of Google's Data Centers
Doofus brings us a CNet story about a discussion from Google's Jeff Dean spotlighting some of the inner workings of the search giant's massive data centers. Quoting:
"'Our view is it's better to have twice as much hardware that's not as reliable than half as much that's more reliable,' Dean said. 'You have to provide reliability on a software level. If you're running 10,000 machines, something is going to die every day.' Bringing a new cluster online shows just how fallible hardware is, Dean said. In each cluster's first year, it's typical that 1,000 individual machine failures will occur; thousands of hard drive failures will occur; one power distribution unit will fail, bringing down 500 to 1,000 machines for about 6 hours; 20 racks will fail, each time causing 40 to 80 machines to vanish from the network; 5 racks will "go wonky," with half their network packets missing in action; and the cluster will have to be rewired once, affecting 5 percent of the machines at any given moment over a 2-day span, Dean said. And there's about a 50 percent chance that the cluster will overheat, taking down most of the servers in less than 5 minutes and taking 1 to 2 days to recover."
I understand distributed computing and I understand distributed searching. But the fact of the matter is that at some point at the top of the chain, you're usually transferring very large amounts of data--no matter how tall your 'network pyramid' is. The coding itself is no simple feat but I have heard rumors that Google was building their own 10-Gigabit ethernet switches since they couldn't find any on the market. You'll notice a lot of sites are just speculating but it certainly is a nontrivial problem to network clusters of thousands of computers with more than 200,000 in the whole lot and not require some serious switch/hub/networking hardware to back it.
My work here is dung.
At what point is skimping on hardware because the system is failure tolerant costlier than using more reliable hardware?
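One way to start on that question is to put the first-year figures from Dean's quote on a common scale. The snippet below is only back-of-the-envelope arithmetic: every input is a number from the quote, the variable names are made up for illustration, and it says nothing about dollar costs, which the story does not give.

```python
# Rough arithmetic on the first-year cluster figures quoted in the story.
# All inputs come from the quote; the names are illustrative only.

DAYS_PER_YEAR = 365

# "1,000 individual machine failures" in a cluster's first year.
machine_failures_per_year = 1_000
failures_per_day = machine_failures_per_year / DAYS_PER_YEAR

# One PDU failure: 500 to 1,000 machines down for about 6 hours.
pdu_machine_hours = (500 * 6, 1_000 * 6)

# 20 rack failures, each making 40 to 80 machines vanish from the network.
# This counts machine outages, not distinct machines.
rack_machine_outages = (20 * 40, 20 * 80)

print(f"machine failures per day: {failures_per_day:.1f}")
print(f"PDU outage, machine-hours lost: {pdu_machine_hours[0]} to {pdu_machine_hours[1]}")
print(f"machine outages from rack failures: {rack_machine_outages[0]} to {rack_machine_outages[1]}")
```

So individual machine failures alone come to a few per day per cluster, which is the scale at which handling failure in software starts to look unavoidable rather than optional.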
I'd like to see the traffic patterns for their data centers. Our University has a daily and weekly pattern, no surprise there, but I wonder how much their traffic changes through the night.
When looking at it on that massive scale, you really get an idea of just how fragile a hard drive is. I wonder how much the new generations of data storage are going to cost large corporations like Google. Not to mention how existing corporations will handle it once those devices go from "super computers" to mainstream hardware.
It's all fun & games until someone loses the game.
The hardware failures I can understand, but needing to rewire the data center after it's been wired once, and the fact that half of them overheat? Those sound like problems that should be addressed in the engineering and installation phases of the datacenter.
All pass beyond reach of medicine. None pass beyond the reach of love.
I want this Google File System thing for myself. Imagine, if we could even figure out how to use it, how much fun it might be to use!
or maybe not
I've been managing a dorm network consisting of two "servers" (routing, PPPoE, some services like network printing etc.), a single industrial rack-mounted switch and dozens of consumer switches spread all over the building.
;)
And they failed. And then they failed again. And again. Sometimes completely, but usually just a single port, or just "a bit" - it looked as if the switch was working, but every - or every n-th, or every bigger than x - packet got mangled, misdirected or whatever. Or sometimes packets appeared just out of the blue (probably some partial leftovers from the cache) and a few of them made enough sense to be received and reported. Sometimes a switch with no network cables attached to it started blinking its lights - sometimes on two ports, sometimes just on a single one.
Well, I could go on for hours, but you get the idea. What happens at Google happens everywhere, they just have some nice numbers.
Regardless, the article is quite entertaining to read for a networking geek
This is Slashdot. Common sense is futile. You will be modded down.
The fact that they attribute success to the software did not surprise me; the chunk and shard (not mentioned in the article) approach has been known for some time. But the fact that the GFS architecture works with BigTable and MapReduce was interesting, as was the fact that it handles many data/content types. What this creates is not only a structure that scales with volume size, but also a sustainable business model. As new content types are added, regardless of size or type, they can generally be indexed appropriately. I am looking forward to searching more within types like video and audio, or even medical records like X-rays or MRI results. The possibilities are staggering.
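For anyone who hasn't seen the chunk approach before, here is a toy sketch of the general idea: split the data into fixed-size chunks and put each chunk on several different servers, so that losing one machine loses nothing. This is not GFS code. The function names, the server names, the tiny chunk size and the round-robin placement are all assumptions made up for illustration.

```python
def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut data into fixed-size chunks; the last chunk may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_replicas(num_chunks: int, servers: list[str], replicas: int) -> dict[int, list[str]]:
    """Assign each chunk to `replicas` distinct servers, round-robin.

    Hypothetical placement policy: a real system would also weigh disk
    usage, rack location and load, none of which is modeled here.
    """
    if replicas > len(servers):
        raise ValueError("need at least as many servers as replicas")
    return {
        chunk_id: [servers[(chunk_id + r) % len(servers)] for r in range(replicas)]
        for chunk_id in range(num_chunks)
    }


if __name__ == "__main__":
    chunks = split_into_chunks(b"abcdefghij", chunk_size=4)
    placement = place_replicas(len(chunks), ["s0", "s1", "s2", "s3"], replicas=3)
    for chunk_id, holders in placement.items():
        print(chunk_id, chunks[chunk_id], holders)
```

With three copies of every chunk, any single server in this toy layout can die and each chunk still has two live copies, which is the software-level reliability the article is talking about.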
no comment
It's always going to be cheaper to use anthill labor on this type of problem. Even relatively powerful 1RU and .5RU servers are dirt cheap these days. Hell, I was able to buy a pile of .5RU machines for one of my projects this week. I can't believe how cheap things have gotten:
quad-core Xeon @ 2.66 GHz
4 GB RAM
2 x 500 GB Barracudas (RAID 1)
dual gigabit Ethernet
CentOS 5.1
US$1100 per unit
They are all stashed behind a Foundry ServerIron to load balance the cluster. So far, it seems to scale VERY well and increasing capacity is as simple as tossing another US$1k server on the pile.
Cheers,
That's a LOT of porn!
Seems that the whole server/complex monitoring aspect was left off. With 100K servers per complex, how do they even know which ones are broken? How do they even find them on the floor and in the racks?
My ism, it's full of beliefs.
I look at their data farm - and its complexity.. and cannot help but wonder how Google has organized itself for their thousands of employees to properly maintain it. No one employee can know every piece of it - or is it simply so simple that every employee knows all of it?
Then.. we realize that our own lifespans and lives are as prone to failure as the servers in their datacenters. Our lifespans are short and everyone has problems.. So Google has mastered the ability to make us interchangeable.
WE ARE ANTS!
--- We need more Ron Paul!
and yet many companies manage to have 100% uptime with a very small number of reliable machines. I guess this simply proves that there are many ways to skin a cat...
There are also two other software packages. The first, called Open Job Queue, handles batch jobs and dispatches them to servers in a cluster. The other, called Chubby, is something like a centralized distributed mutual-exclusion (lock) service, and it's used by other packages like GFS, BigTable and MapReduce.
You can find a lot of information about Chubby on Google Scholar, and I heard about Open Job Queue at a conference at my university.
A Beowulf cluster of Google's data centers?
He was my friend in high school and roommate in college for a year. Smartest guy I've ever met in my life, easily smarter than any other PhDs I've known, including people I know with Harvard post-med school doctorates.
I'm not trying to be a jerk (I just play one on the internet), but this worries me:
And there's about a 50 percent chance that the cluster will overheat, taking down most of the servers in less than 5 minutes and taking 1 to 2 days to recover.
I'm not much of a cluster guy, but if the risk of overheating is so great, and the damage so vile, maybe they should beef up the A/C? Just a thought.
-Billco, Fnarg.com
This whole article is a persuasive argument for mainframes. Continually servicing cheap hardware that fails is labor-expensive, and replacing all those failed components costs real money. They're only cheap the first time you buy them. I don't see where they're saving money, though the guy does sound like he's having a ball directing armies of lackeys all through these vast, twisty datacenters.
we will end no whine before its time
that Google isn't being green enough? Think how much power they are pulling off the grid to run their thousands of servers especially when those servers are doing something useful. Not to mention the power required to cool these servers. I bet when they turn on new data centers the lights in the nearby cities lose a few lumens.
this nation, under God, shall have a new birth of freedom. -- Lincoln, Gettysburg Address
Lots of talk about DC hardware and networking here, but the part about parallelization was the really fascinating part to me. The software is the really cool bit here.
There is a meme, here and some other places, that multi-core/multi-unit processors are some great swindle by chip manufacturers. I've come to the conclusion that this is not the case; rather, we're all just dinosaur programmers stuck in the procedural/oop paradigm.
Python in its functional paradigm and MapReduce are amazingly safe and efficient ways to solve Google's class of problems with parallel hardware.
What I'm saying here is that every self-respecting geek should learn a functional language.
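To make that concrete, here is a minimal single-process sketch of the map/shuffle/reduce pattern in plain Python, counting words. It is not Google's MapReduce, and the function names are my own invention. The point is only that the map and reduce steps are pure functions of their inputs, which is what lets a framework farm them out to many machines and rerun them when one dies.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(documents):
    # Each map_phase call depends only on its own document, so the calls
    # could run on different machines; here they simply run in sequence.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["the cluster failed", "the cluster recovered"]))
```

Because nothing here mutates shared state, a failed map or reduce call can be retried on another worker without any cleanup, which fits the cheap-and-unreliable hardware approach described in the story.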
Wikipedia - MapReduce, Functional Programming
Also, Google's engEdu videos are freaking AWESOME.