How Google Uses Linux
postfail writes 'lwn.net coverage of the 2009 Linux Kernel Summit includes a recap of a presentation by Google engineers on how they use Linux. According to the article, a team of 30 Google engineers is rebasing to the mainline kernel every 17 months, presently carrying 1208 patches to 2.6.26 and inserting almost 300,000 lines of code; roughly 25% of those patches are backports of newer features.'
Hmmm... Techno-Amish? (i.e. "We'll use your roads, but not your damned cars!")
I. Want. This.
Google does not distribute the binaries, so they are not obliged to publish the source.
you missed the point of open source then
I'm god, but it's a bit of a drag really...
The whole article sounds so painful, what do they actually get out of it?
Google started with the 2.4.18 kernel - but they patched over 2000 files, inserting 492,000 lines of code. Among other things, they backported 64-bit support into that kernel. Eventually they moved to 2.6.11, primarily because they needed SATA support. A 2.6.18-based kernel followed, and they are now working on preparing a 2.6.26-based kernel for deployment in the near future. They are currently carrying 1208 patches to 2.6.26, inserting almost 300,000 lines of code. Roughly 25% of those patches, Mike estimates, are backports of newer features.
In the area of CPU scheduling, Google found the move to the completely fair scheduler to be painful. In fact, it was such a problem that they finally forward-ported the old O(1) scheduler and can run it in 2.6.26. Changes in the semantics of sched_yield() created grief, especially with the user-space locking that Google uses. High-priority threads can make a mess of load balancing, even if they run for very short periods of time. And load balancing matters: Google runs something like 5000 threads on systems with 16-32 cores.
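For anyone wondering why a change in sched_yield() semantics would hurt user-space locking, here is a minimal sketch of the general pattern. It is not Google's locking code; `YieldingLock`, `yield_cpu`, and `run_workers` are hypothetical names made up for this illustration. The lock never blocks in the kernel: on contention it simply yields and retries, so its latency depends entirely on what the scheduler does with a yielding thread.

```python
import os
import threading
import time


def yield_cpu():
    """Voluntarily give up the CPU; fall back to a zero-length sleep
    on platforms where os.sched_yield() is not available."""
    if hasattr(os, "sched_yield"):
        os.sched_yield()
    else:
        time.sleep(0)


class YieldingLock:
    """Toy user-space lock: try to take it, and yield instead of blocking."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self):
        """Spin until the lock is taken; return how many times we yielded."""
        spins = 0
        while not self._lock.acquire(blocking=False):
            yield_cpu()  # how soon we run again is up to the scheduler
            spins += 1
        return spins

    def release(self):
        self._lock.release()


def run_workers(n_threads=4, increments=1000):
    """Have several threads bump a shared counter under the yielding lock."""
    lock = YieldingLock()
    counter = [0]

    def worker():
        for _ in range(increments):
            lock.acquire()
            counter[0] += 1
            lock.release()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

The counter always ends up correct; what changes with the scheduler's yield behavior is how long contended threads wait, which is where the grief would come from.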
Google makes a lot of use of the out-of-memory (OOM) killer to pare back overloaded systems. That can create trouble, though, when processes holding mutexes encounter the OOM killer. Mike wonders why the kernel tries so hard, rather than just failing allocation requests when memory gets too tight.
Ooooh... efficiency.. I'm curious what the net savings is.. compared to buying more cheap hardware.
So what is Google doing with all that code in the kernel? They try very hard to get the most out of every machine they have, so they cram a lot of work onto each.
(30 * kernel engineer salary) / (generic x86 server + cooling + power) = ?
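To make that back-of-the-envelope ratio concrete, here is a small sketch. The function name `servers_per_team` and every dollar figure in the example are placeholders invented for illustration; neither the article nor the thread gives real salary or hardware costs.

```python
def servers_per_team(engineers, salary_per_engineer,
                     server_cost, cooling_cost, power_cost):
    """Return how many servers (with cooling and power) the yearly
    cost of the engineering team would buy instead."""
    team_cost = engineers * salary_per_engineer
    per_server = server_cost + cooling_cost + power_cost
    if per_server <= 0:
        raise ValueError("per-server cost must be positive")
    return team_cost / per_server


# Placeholder numbers only: 30 engineers at 150,000 each, and a server
# costing 2,000 plus 500 for cooling and 500 for power.
ratio = servers_per_team(30, 150_000, 2_000, 500, 500)
```

With those made-up inputs the ratio comes out to 1500 servers, so the kernel work would pay off if it saves more than that many machines' worth of capacity across the fleet.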
This is something I have been wondering too. Doesn't it just lead to applications crashing more often, rather than reporting normally that they cannot allocate more memory?
Does Google give any code and patches back to the Linux kernel maintainers? Since they probably only use it internally and never distribute anything, they are not required to by the GPL, but it would still be the right thing to do.
This is very interesting, but I have a question: as I understand it, does Google have its own kernel development line?
Under Windows, if you commit memory, it's yours and it will be there. If the system can't make that promise, it will fail to commit the memory and return an error.
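For comparison, Linux exposes its policy through the vm.overcommit_memory sysctl, and mode 2 is the closest thing to the commit behavior described above. A minimal sketch for checking which mode a machine is in follows; the helper names `describe_overcommit` and `read_overcommit_mode` are made up for this example, and the descriptions are paraphrased, not quoted from the kernel documentation.

```python
# Meaning of the three vm.overcommit_memory values on Linux.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default): obvious overcommits are refused",
    1: "always overcommit: allocations are not refused on accounting grounds",
    2: "strict accounting: allocations fail once the commit limit is reached",
}


def describe_overcommit(mode):
    """Map a vm.overcommit_memory value to a short description."""
    return OVERCOMMIT_MODES.get(mode, "unknown mode")


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the current mode as an int, or None if it cannot be read
    (for example on a non-Linux system)."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    mode = read_overcommit_mode()
    print(mode, describe_overcommit(mode))
```

In modes 0 and 1 an allocation can succeed and the process can still be killed later by the OOM killer when it touches the memory, which is the behavior being questioned in this thread.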
For Free Software, 'take' is fine. 'Provide but restrict' is not.
Hmm, you realize that Android alone is over 10 million lines of code right? That's a pretty big open source contribution right there. But then there's also over a million lines of code across 100+ smaller projects too. So I am not sure what your definition of "table scraps" is but it's significantly more lines of code than most companies do.
This vs the other metric.. how many anonymous posters downplay their massive contributions.
I'm not a big fan of Google, but god damn man. These guys are a huge player no matter what they do.
"His name was James Damore."
Is "Amazingly short sighted" your sig, that is a self referential thing you need to tack onto everything you write? Seems very apt.
Using the "I only count the bits I want them to release the source to" metric is also a shit way to gauge their contribution to open source.
I'm not a huge goog fan, I never take their cookies so I don't use anything but search... but JUST search is way more "give back" than table scraps. If they announced tomorrow their search would now cost x dollars a year, as long as it was somewhat reasonable, like an extra 5 bucks a month on top of my ISP bill, I'd pay for those table scraps. Google search has done more than anything else to make the web actually *useful* since the invention of the hyperlink.
Sure, there are other search engines, but if you actually learn to *use* the features and filters present with Google's, it just stomps all the others flat.
Whatever they give back in terms of code is just gravy on top of that.
I. Want. This.
DTrace code:
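#pragma D option quiet

/* Fires on every I/O request: sum bytes per device, process name, and PID. */
io:::start
{
        @[args[1]->dev_statname, execname, pid] = sum(args[0]->b_bcount);
}

/* On exit (Ctrl-C): print a header, then the aggregated totals. */
END
{
        printf("%10s %20s %10s %15s\n", "DEVICE", "APP", "PID", "BYTES");
        printa("%10s %20s %10d %15@d\n", @);
}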
Output:
# dtrace -s ./whoio.d
^C
    DEVICE                  APP        PID           BYTES
     cmdk0                   cp        790         1515520
       sd2                   cp        790         1527808
More examples at:
http://wikis.sun.com/display/DTrace/io+Provider
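If you don't read D, the aggregation in that script amounts to a keyed running sum. Here is a rough Python sketch of the same idea. It does not trace anything; the event tuples are made-up sample data, and `sum_bytes_by_key` and `format_report` are hypothetical names for this illustration.

```python
from collections import defaultdict


def sum_bytes_by_key(events):
    """Sum byte counts per (device, app, pid), like @[...] = sum(...)."""
    totals = defaultdict(int)
    for device, app, pid, nbytes in events:
        totals[(device, app, pid)] += nbytes
    return dict(totals)


def format_report(totals):
    """Render totals with the same column widths as the D script.
    Sorting by total is a choice made for this sketch."""
    lines = ["%10s %20s %10s %15s" % ("DEVICE", "APP", "PID", "BYTES")]
    for (device, app, pid), nbytes in sorted(totals.items(),
                                             key=lambda kv: kv[1]):
        lines.append("%10s %20s %10d %15d" % (device, app, pid, nbytes))
    return "\n".join(lines)


# Made-up events: (device, process name, pid, bytes in this request).
sample_events = [
    ("sd2", "cp", 790, 1000),
    ("cmdk0", "cp", 790, 200),
    ("sd2", "cp", 790, 500),
]
report = format_report(sum_bytes_by_key(sample_events))
```

The difference, of course, is that DTrace collects the real events from the running kernel for you.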
Oh sorry...title had me thinking this was penguin porn
Back in the 90's, we had a customized patch to Apache to make it forward tickets within our intranet as supplied by our (also customized) Kerberos libraries for our (also customized) build of Lynx. It all had to do with a very robust system for managing customer contacts that ran with virtually no maintenance from 1999 to 2007--and I was the only person who understood it because I wrote it as the SA--when it was scrapped for a "modern" and "supportable" solution that (of course) requires a dozen full-time developers and crashes all the time.
Not really bitching too much, because that platform was a product of the go-go 90's, and IT doctrine has changed for the better. No way should a product be out there with all your customer information that only one person understands. But it was a sweet solution that did its job and did its job well for a LONG time. Better living through the UNIX way of doing things!
But, anyway, I never bothered to contribute any of the patches from that back to the Apache tree (or the other trees) because they really only made sense in that particular context and as a group. If you weren't doing EXACTLY what we were doing, there was no point in the patches, and NOBODY was doing exactly what we were doing.
"He who would learn astronomy, and other recondite arts, let him go elsewhere. " -- John Calvin, commenting on Genesis 1
Hmm, you realize that Android alone is over 10 million lines of code right? That's a pretty big open source contribution right there. But then there's also over a million lines of code across 100+ smaller projects too. So I am not sure what your definition of "table scraps" is but it's significantly more lines of code than most companies do.
I see millions of lines of Code from the Apache Foundation's various Java projects in Android.
That's a drop in the bucket compared to what Sun has contributed to open source. Of course, slashdot appears to be perversely against Sun for some reason I cannot fathom.
Android is not GPL'd. Android is released under the Apache license. As of Android 2.0 Google has opted not to release the code to the Android Open Source Project. Those 10 million lines of code are for the most part closed. Sure, they have to release the kernel itself, but "Android" is theirs and they are keeping it.
I'm assuming this is to give Verizon exclusivity with their "droid" phone to be the only one running 2.0. I don't think they anticipated projects like cyanogenmod taking off quite like they have. Why buy a droid if your cheap g1 can run the latest software?
Do No Evil?
Somehow I'm reminded of the whole Android thing. Google really seems to have the urge to only do their own thing. Same with Android, where they have thrown out the whole "Linux" userspace to reinvent the wheel (only not as good, see Harald Welte's blog for a rant about it). Here it seems to be the same thing: they just do their own thing without merging back, disregarding the experience others might have had.
On a side note, their problems with the Completely Fair Scheduler should be a good argument for pluggable schedulers. It shows one scheduler can't fit all use cases, but I doubt Linus will listen.
C
It's amazing how many of these problems, especially with regard to multi-threading issues and multiple cores, have already been solved and implemented in Sun Solaris. In 1994. Fifteen years ago.
Kriston
They take and take from open source and throw back a couple of table scraps and you people all kiss their ass for it.
300K lines of code? Yep, table scraps.
For people who wonder why I continue to want to see the end of the FSF, the above attitude is the reason why. Stallman and his organisation are the reason for it.
Aside from being ugly and spiritually bankrupt, reciprocity paranoia is based on completely erroneous reasoning, as well. The same people who talk about how music piracy isn't harming anyone, because it doesn't physically take away from a finite supply of copies, are also those who express the above paranoia about people "taking," from FOSS, as if that is somehow a physically finite resource, when music isn't.
Get rid of your fear.
The articles cited are about two weeks old. Isn't it odd for Slashdot to discuss news that old?
404 Not Found
Could you explain what the "table scraps" are?
- Their Google Summer of Code program, into which they have invested millions of dollars for many years now?
- The huge open source android framework you can use on mobile phones?
- The numerous useful projects they have released, including the tools they used to make most of their products?
- The free project hosting resources they give open source projects?
- The government lobbying they have done to level the playing field for open technologies?
Yes, you are!
Google uses open source extensively, but is not giving back any significant technology to the open source world.
No efficient search technology.
No decent OCR software (ocropus + tesseract are still years behind what you get for free with any multifunction HP printer in the Windows world).
No GIS technology.
No JSP cooperation.
Minimal kernel patches, etc.
Google could be a major open/free source contributor. They have the money and the skills, but they have no will to do it. In fact, Google is behaving like any other big greedy corporation: they only do what they see fit for their own interest. The sore point is that Google exists THANKS TO open/free programming.
What's in a sig?
I think you need to distinguish between true FOSS zealots and leeches who just want stuff for free. Hint: the grandparent is the latter.
Who said they were throwing those lines back? They don't have to, and a short look at TFA didn't make it look as if they did. (Not that I mind -- I myself maintain such kernel code at work.)
As of Android 2.0 Google has opted not to release the code to the Android Open Source Project. Those 10 million lines of code are for the most part closed. Sure, they have to release the kernel itself, but "Android" is theirs and they are keeping it.
Dude. Android 2.0 was just announced in October. Give them a little time, sheesh. You think they're going to go through all this trouble to set up an open-source alliance for their code, then not release it as open source?
They're most likely just busting their butts to support all the new phones coming out for Christmas. I bet they work on cleaning up the code & releasing it early next year.
I'm a leaf on the wind. Watch how I soar.
A simple "the code will be released later" statement from Google would certainly clear that up and keep the wolves at bay. Currently, the only word from them is noncommittal comments from Jean Baptiste Queru. Why so sheepish about it if that's the case? I don't think they like having stories like this about their utopian OS. Seems like a simple fix.
Google and the kernel developers should talk to IBM and ask them to publish the scheduler used in OS/2 v2 and up, which in turn was a makeover of a mainframe scheduler. It would certainly solve a large part of their current problems.
oops, wrong bad news story link. that's the one I was aiming for. A source release would fix that.
Google runs their own flavor of linux on most engineers' workstations, too. When I first got there, it was based on Red Hat 9 (and called grhat). After Red Hat fucked their users with the whole Fedora nonsense, Google went to a version based on Ubuntu (called goobuntu). I don't recall the amounts, but I do know that they submit patches back to Ubuntu fairly frequently.
:-)
Not telling how many servers they run. My numbers would be 18 months out of date anyway.
-B
Ash and Hickory, straight-grained and true, make excellent bludgeons, dandy for the cudgeling of vegetarians.
In fairness, the new system has capabilities that the old system did not. And I also think that there is a certain amount of overhead that necessarily accompanies a serious development effort, especially one that involves more than one person. If I just want to throw something up that will work, I can do it, by myself, without documentation etc., in a matter of weeks. Require documentation, controlled processes, and make me work with a team, and it will take months. The "mythical man month" is an ever-present reality--software projects do NOT grow in a linear fashion, and in fact as you add developers productivity regresses with depressing frequency.
However, I do think this example is a bit of a commentary on the power of the "unix way of doing things": flat files are a fundamentally powerful approach for those applications where they can do the job, and Java, Oracle, and the like will sap productivity in applications where their features are not actually needed. This is why I'm always highly suspicious of any developer who "does everything in [Java|.net|perl|whatever]". You need to pick the right tool for the job.
"He who would learn astronomy, and other recondite arts, let him go elsewhere. " -- John Calvin, commenting on Genesis 1
you've missed his point - the emphasis is on the 'kissing ass', not the table scraps.
he's not worried about the table scraps per se, he's worried about the fact that people go apeshit with the whole "google are omfg so awesome holy crap" thing to an extent that they believe (by default) google actually make important oss contributions.
The fact that Google is talking is all good - sounds like there is not too much secret squirrel stuff going on and everyone wins. They do very cool stuff with their hardware and software sooo tempting to be lured in ... But I still think Google=Evil (so much data/knowledge + nothing lasts forever).
Or buy one of their Solaris/ZFS/Dtrace based storage devices, you can do what you ask with a few clicks of the mouse...
IANAL but write like a drunk one.
People just don't know about the innovation that has been going on in the storage arena at Sun.
It is funny, but I think what let Sun down was their marketing, not their Engineering.
You can currently get storage devices that run those diagnostics at the click of a mouse.
How regrettable that this wonderful technology may be shelved. Tragic really.
IANAL but write like a drunk one.
Support of older applications has a great pedigree in the IT industry.
You will find much more interesting problems from a technical point of view doing that, because you will be basically on your own.
IANAL but write like a drunk one.
One had cpio and dumpfs which worked fine as far as I can tell.
When you bought Sun back then the last thing you were worrying about was if tar worked or not ....
IANAL but write like a drunk one.
You should really read carefully the licenses of open source software before the foam in your mouth asphyxiates you.
IANAL but write like a drunk one.