How Google Uses Linux
postfail writes 'lwn.net coverage of the 2009 Linux Kernel Summit includes a recap of a presentation by Google engineers on how they use Linux. According to the article, a team of 30 Google engineers is rebasing to the mainline kernel every 17 months, presently carrying 1208 patches to 2.6.26 and inserting almost 300,000 lines of code; roughly 25% of those patches are backports of newer features.'
Hmmm... Techno-Amish? (i.e. "We'll use your roads, but not your damned cars!")
Google does not distribute the binaries, so they are not obliged to publish the source.
You missed the point of open source, then.
I'm god, but it's a bit of a drag really...
The whole article sounds so painful, what do they actually get out of it?
Google started with the 2.4.18 kernel - but they patched over 2000 files, inserting 492,000 lines of code. Among other things, they backported 64-bit support into that kernel. Eventually they moved to 2.6.11, primarily because they needed SATA support. A 2.6.18-based kernel followed, and they are now working on preparing a 2.6.26-based kernel for deployment in the near future. They are currently carrying 1208 patches to 2.6.26, inserting almost 300,000 lines of code. Roughly 25% of those patches, Mike estimates, are backports of newer features.
In the area of CPU scheduling, Google found the move to the completely fair scheduler to be painful. In fact, it was such a problem that they finally forward-ported the old O(1) scheduler and can run it in 2.6.26. Changes in the semantics of sched_yield() created grief, especially with the user-space locking that Google uses. High-priority threads can make a mess of load balancing, even if they run for very short periods of time. And load balancing matters: Google runs something like 5000 threads on systems with 16-32 cores.
Google makes a lot of use of the out-of-memory (OOM) killer to pare back overloaded systems. That can create trouble, though, when processes holding mutexes encounter the OOM killer. Mike wonders why the kernel tries so hard, rather than just failing allocation requests when memory gets too tight.
Ooooh... efficiency. I'm curious what the net savings are, compared to buying more cheap hardware.
So what is Google doing with all that code in the kernel? They try very hard to get the most out of every machine they have, so they cram a lot of work onto each.
(30 * kernel engineer salary) / (generic x86 server + cooling + power) = ?
This is something I have been wondering too. Doesn't it just lead to applications crashing more often, rather than reporting normally that they cannot allocate more memory?
Does Google give any code and patches back to the Linux kernel maintainers? They probably only use it internally and never distribute anything, so they are not required to by the GPL, but it would still be the right thing to do.
Hmm, you realize that Android alone is over 10 million lines of code, right? That's a pretty big open source contribution right there. But then there's also over a million lines of code across 100+ smaller projects too. So I am not sure what your definition of "table scraps" is, but it's significantly more lines of code than most companies contribute.
In Unix, if malloc() returns NULL, the memory allocation failed and you don't have the memory. A well-written program should check for that. Overcommitting memory can have efficiency advantages, but things can also turn out badly. Linux has heuristics to determine how much memory to overcommit, or overcommit can be disabled entirely.
http://utcc.utoronto.ca/~cks/space/blog/unix/MemoryOvercommit
http://utcc.utoronto.ca/~cks/space/blog/linux/LinuxVMOvercommit
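To make the overcommit settings a bit more concrete, here is a minimal Python sketch of the three values of the Linux vm.overcommit_memory sysctl, plus a rough estimate of the commit limit that applies in strict mode. The helper names (describe_mode, commit_limit_kb, read_current_mode) are made up for illustration, and the limit calculation is a simplification that assumes no huge pages and that vm.overcommit_kbytes is unset.

```python
from pathlib import Path

# Meanings of the vm.overcommit_memory sysctl values on Linux.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default): obviously excessive requests are refused",
    1: "always overcommit: requests are not refused on overcommit grounds",
    2: "strict accounting: commits are capped at swap + overcommit_ratio% of RAM",
}


def describe_mode(mode: int) -> str:
    """Return a short description of an overcommit mode (hypothetical helper)."""
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


def commit_limit_kb(ram_kb: int, swap_kb: int, ratio_percent: int) -> int:
    """Approximate commit limit in mode 2.

    Simplification: ignores huge pages and vm.overcommit_kbytes.
    """
    return swap_kb + ram_kb * ratio_percent // 100


def read_current_mode(path: str = "/proc/sys/vm/overcommit_memory"):
    """Return the current mode, or None if the file is unavailable (e.g. not Linux)."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    mode = read_current_mode()
    if mode is None:
        print("overcommit mode not available on this system")
    else:
        print(f"vm.overcommit_memory = {mode}: {describe_mode(mode)}")
```

In mode 2 an allocation that would exceed the limit fails up front, which is the "just fail the request" behavior discussed above. In modes 0 and 1, malloc() can succeed and the trouble shows up later, when the pages are actually touched.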
I'm not a huge goog fan, and I never take their cookies, so I don't use anything but search... but JUST search is way more "give back" than table scraps. If they announced tomorrow that their search would now cost x dollars a year, as long as it was somewhat reasonable, like an extra 5 bucks a month on top of my ISP bill, I'd pay for those table scraps. Google search has done more than anything else to make the web actually *useful* since the invention of the hyperlink.
Sure, there are other search engines, but if you actually learn to *use* the features and filters present with Google's, it just stomps all the others flat.
Whatever they give back in terms of code is just gravy on top of that.
I. Want. This.
DTrace code:
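#pragma D option quiet

/* On every I/O start, add the request size to a total keyed by device, program and PID. */
io:::start
{
        @[args[1]->dev_statname, execname, pid] = sum(args[0]->b_bcount);
}

/* On exit (Ctrl-C), print a header row and then the per-key totals. */
END
{
        printf("%10s %20s %10s %15s\n", "DEVICE", "APP", "PID", "BYTES");
        printa("%10s %20s %10d %15@d\n", @);
}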
Output:
# dtrace -s ./whoio.d
^C
    DEVICE                  APP        PID           BYTES
     cmdk0                   cp        790         1515520
       sd2                   cp        790         1527808
More examples at:
http://wikis.sun.com/display/DTrace/io+Provider
Oh sorry...title had me thinking this was penguin porn
Back in the 90's, we had a customized patch to Apache to make it forward tickets within our intranet as supplied by our (also customized) Kerberos libraries for our (also customized) build of Lynx. It all had to do with a very robust system for managing customer contacts that ran with virtually no maintenance from 1999 to 2007--and I was the only person who understood it because I wrote it as the SA--when it was scrapped for a "modern" and "supportable" solution that (of course) requires a dozen full-time developers and crashes all the time.
Not really bitching too much, because that platform was a product of the go-go 90's, and IT doctrine has changed for the better. No way should a product be out there with all your customer information that only one person understands. But it was a sweet solution that did its job and did its job well for a LONG time. Better living through the UNIX way of doing things!
But, anyway, I never bothered to contribute any of the patches from that back to the Apache tree (or the other trees) because they really only made sense in that particular context and as a group. If you weren't doing EXACTLY what we were doing, there was no point in the patches, and NOBODY was doing exactly what we were doing.
"He who would learn astronomy, and other recondite arts, let him go elsewhere. " -- John Calvin, commenting on Genesis 1
That's a drop in the bucket compared to what Sun has contributed to open source. Of course, Slashdot appears to be perversely against Sun for some reason I cannot fathom.
Names are very important. The name Sun reminds us of that place on the other side of the door where, if we go, our skin gets red and burns. Google reminds us of that friendly homepage that would load in under 5 seconds on dial-up.
Somehow I'm reminded of the whole Android thing. Google really seems to have the urge to only do their own thing. It's the same with Android, where they have thrown out the whole "Linux" userspace to reinvent the wheel (only not as well; see Harald Welte's blog for a rant about it). Here it seems to be the same: they just do their own thing without merging back, disregarding experience others might have had.
On a side note, their problems with the Completely Fair Scheduler should be a good argument for pluggable schedulers. It shows one scheduler can't fit all use cases, but I doubt Linus will listen.
They take and take from open source and throw back a couple of table scraps and you people all kiss their ass for it.
300K lines of code? Yep, table scraps.
For people who wonder why I continue to want to see the end of the FSF, the above attitude is the reason why. Stallman and his organisation are the reason for it.
Aside from being ugly and spiritually bankrupt, reciprocity paranoia is based on completely erroneous reasoning as well. The same people who talk about how music piracy isn't harming anyone, because it doesn't physically take away from a finite supply of copies, are also the ones who express the above paranoia about people "taking" from FOSS, as if that were somehow a physically finite resource when music isn't.
Get rid of your fear.
And yet, tar is still broken. Well, maybe not today, but it sure as fuck was in 1994.
Pick your poison.