Drive-By Contributors to the Linux Kernel
eldavojohn writes "There's an interesting post over at KernelTrap that focuses on a man's attempt to find out how many one-time contributors Linux averages per release. Although imperfect due to some obvious unavoidable flaws, he got a few dirty numbers of 'never seen from agains' in the commits for releases 2.6.11 through 2.6.25, and the numbers are: {63, 148, 128, 92, 96, 122, 137, 140, 135, 95, 136, 153, 179, 179, 304}. This makes sense as another reader, Greg KH, pointed out that the distribution curve is tilted towards one-hit contributions, 'the distribution of all of our users are: 50% only contributed 1 patch; 25% contributed 2; 12% contributed 3; 6% contributed 4 and so on ...'"
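For anyone wondering how such a count might be produced, here is a minimal sketch, not the method from the KernelTrap post. It assumes the history has already been reduced to (release, author) pairs, and it assumes "one-time" means an author whose patches all landed in a single release. The function name, the input format and the sample names are made up for illustration.

```python
from collections import defaultdict


def one_time_contributors(commits, release_order):
    """Count authors whose patches all landed in a single release.

    commits: iterable of (release, author) pairs (hypothetical input format).
    release_order: list of release names, oldest first.
    Returns {release: number of authors seen only in that release}.
    """
    releases_by_author = defaultdict(set)
    for release, author in commits:
        releases_by_author[author].add(release)

    counts = dict.fromkeys(release_order, 0)
    for releases in releases_by_author.values():
        if len(releases) == 1:
            only_release = next(iter(releases))
            counts[only_release] = counts.get(only_release, 0) + 1
    return counts


# Made-up sample data, not real kernel history.
sample = [
    ("2.6.23", "alice"), ("2.6.24", "alice"),  # seen in two releases
    ("2.6.23", "bob"),                         # one release only
    ("2.6.24", "carol"), ("2.6.24", "carol"),  # two patches, one release
    ("2.6.25", "dave"),                        # newest release only
]
print(one_time_contributors(sample, ["2.6.23", "2.6.24", "2.6.25"]))
```

Under this definition carol counts as a one-timer despite having two patches, and the same person committing under two spellings of their name would be counted twice, which is one of the unavoidable flaws the summary alludes to.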
How long til we get Jack Thompson on the case? Drive-bys and other kernel related violence is just not on! ~
Caesar si viveret, ad remum dareris.
There doesn't seem to be a lot of room for a web of trust here. I wonder how hard it would be to inject some sort of extremely non-obvious race condition hidden in a large and useful patch that just so happens to let you execute arbitrary code?
If you were found out, you could just claim, plausibly, that you hadn't seen the possibility of the race. Or maybe just be a one-time contributor so there is no way to track you down if the hole is discovered.
And how many years will it take to find and fix them?
Well, when you think about it, 1 patch just happens to be the number that most people will need. For example, a hardware business may put in one patch to get a device to work, a software company may put in one patch to make other things work, an individual may put in one patch to fix an obvious bug, etc. The kernel, though necessary, is not what the end user generally interacts with, so for the most part it stays in the background, as a stable kernel should, which makes it less likely that any one person will find many bugs to fix.
Taxation is legalized theft, no more, no less.
My own confusion as to the meaning of "drive by" in this context did make me wonder about ease of contribution.
What if there were a bifurcation or distribution of the bug-fixing/feature-adding problem? This may be really stupid, but I imagine a situation where testers go through finding things that are wrong or where they go wrong, then submit that bit of code.
On top of this, there is a system which grabs the trace and shows the bit of code where everything got derailed, and in other panes the stuff it called to, so anybody could look over the "offending" code without having to be intimately familiar with the kernel or the library or whatever, since it is all laid out for them. Then people can tinker with the code and submit their changes for (automatic, since you know what to look for) testing, maybe leave comments on the ticket to help others, or as a group try to figure it out.
I don't have time to wade through mountains of kernel code looking for bugs, but I would be more than willing to look over a (relatively) small bit of code in a collaborative fashion to see if I pick up on something others had missed.
Nice to see my name in 2.6.25 :)
I submitted a workaround for a buggy USB device a few months ago, which was my first patch after using linux for more than 10 years. Usually when I find a problem in the kernel it's either already been fixed in a later version, or it looks too complicated for me to risk wasting my time on. I would bet that a lot of my one-off colleagues have had the same experience.
F0 07 C7 C8
Hmm? I would say it would take less time and effort to find the malicious changes to the Linux kernel than it does for MS to fix all the (hopefully) non-malicious changes to Windows. Basically, the system isn't perfect, but it is Linux or Windows if you want an OS (yes, there are Macs, but they are more expensive than most computers and the OS doesn't work on non-Macs without a bit of hacking), and it seems that Linux is much more secure than Windows can ever hope to be.
... whenever I've tried to submit a patch, it's a nightmare of process. Make sure you have the latest build, use the right diff tool, package it the right way, submit it to the right place, get the attention of the right person. Even then, you have to convince whomever that your patch is actually needed (are you sure you have the latest code, I'm not convinced that this is a real bug), and that your code doesn't stink -- "not invented here" is a huge problem.
I predict a record number of AC posts on this article.
One of our competitors trademarked the term "hypothesis". From now on, we will call them "boneheaded ideas".
I have been thinking of a single contribution to the kernel myself. I would like to see a link somewhere in /proc that points to the location of the kernel source code used to build the currently running kernel. This would remove the need for the current hacks of using a link in /usr/src/kernel that may or may not point to the correct kernel source. This would be used by scripts that build kernel modules for code that is not in the normal kernel.
If not a link, then a file (somewhere in /proc) whose contents are the pathname would provide a similar capability.
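For context, here is a rough sketch of the kind of guessing a module build script has to do without such a /proc entry. It assumes the usual /lib/modules/<release>/build symlink that many distributions install, with /usr/src/linux as a fallback; both paths are assumptions about the system, and find_kernel_source is a hypothetical helper name, not an existing tool.

```python
import os
import platform


def find_kernel_source(release=None, candidates=None):
    """Return the first candidate kernel source directory that exists.

    release: kernel release string; defaults to the running kernel's.
    candidates: paths to try in order (the defaults are assumed conventions).
    Returns the resolved path, or None if nothing is found.
    """
    release = release or platform.release()
    if candidates is None:
        candidates = [
            f"/lib/modules/{release}/build",  # symlink installed by many distros
            "/usr/src/linux",                 # older hand-maintained symlink
        ]
    for path in candidates:
        if os.path.isdir(path):
            # Follow symlinks so the caller sees the real source tree.
            return os.path.realpath(path)
    return None


print(find_kernel_source())
```

Neither default is guaranteed to match the tree the running kernel was actually built from, which is exactly the gap the proposed /proc entry would close.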
The real "Libtards" are the Libertarians!
I'm a drive by contributor to the "Drive-By Contributors to the Linux Kernel" thread!
That's not a very good example. Hacking a CVS repository and adding a relatively easy to detect root exploit isn't how you'd want to go about this. It's far too likely to get noticed.
As the parent poster mentioned, you'd ideally want to submit a fairly large patch that does something useful (fixes a bug, adds a minor feature), but which itself contains an exploitable bug.
The trick would be in making the submission large enough and the bug subtle enough that it just skates past review. Of course, it'd be *much* easier to insert bugs into any other Linux project, rather than the kernel. Not as much exposure to ultimately exploit, but there are quite a few executables and libraries that nearly every Linux system uses.
I think this is a very good idea as well. Right now I wish I had mod points. The kernel is just massive code-wise, not to mention there are tons of legacy bits of code that can be reduced, or just removed, or rewritten but people don't have the time to hunt them all down.
Quite frankly, I'd love to see professors assign this sort of thing as homework to their students. Grab a piece of OSS, and squash a bug. Or give them a reasonable size chunk of code and have them review it.
http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
I think that would be a very interesting statistic.
G.
I was even going to call out the Debian OPENSSL debacle as an example, but I figured they'd had enough grief over it already.
Sorry if someone already said this, but here's what I think: All you have to do is write a patch that gets in *once* and you can forever brag about how your code is in the Linux kernel.
Property is theft.
You are an idiot if you think there is only Linux, Windows, and OS X to choose from.
And it's far more secure and logically built than Linux.
The Internet is a wide open place. Software testing is inadequate. I'm sure there is no way you could possibly have known that when combined with that patch from the florist in Scotland, your code could possibly overwrite a few bytes of memory if called with an unlikely sequence of parameters. Whoops, I clobbered the userid? How silly of me.
Alternatively, accomplish the same thing via 2+ patches to 2+ projects which are likely to be used together. One patch against httpd + one patch against php = potential for a lot of mischief. It's not hard to fit a key to a lock when you're allowed to modify both the key and the lock to suit your tastes.
Help poke pirates in the eyepatch, arr.
These stringent development procedures are precisely what make the Linux kernel that robust. It's not the code, it's the coding procedures. Your patches should be documented, easy to review, QA, and apply. I wouldn't accept patches from a sloppy developer who wouldn't be bothered following the appropriate procedures.
Maybe not straight into the kernel tree, but the fact that a person can only contribute 25% of what needs doing, and does so, shouldn't inherently prevent a patch from being accepted.
Patch submission shouldn't inherently require the patch to be 'complete' to be catalogued (and potentially included).
However, this barrier to entry may help ensure that the patches that do get submitted are only of a reasonably high quality. (If you're going to take the time to figure out how to do the rest, you're probably fairly thorough/methodical).
Would you please get a clue about who Greg Kroah-Hartman is? Thanks.
I'm not sure about that. Arguably, if you're not experienced enough to follow good coding procedures and have proper tests and documentation, you have no business messing about with the kernel. Same if you don't have knowledge of the concepts behind low-level OS code.
As a user of the Linux kernel, I expect it to perform reliably. Having high code quality standards helps ensure that. In the open source model you only get an opportunity to have your code included, not a right.
We all know what to do, but we don't know how to get re-elected once we have done it
The most obvious flaw I can think of is that the newer releases have fewer follow-up releases with which to submit extra patches. This doesn't seem to be mentioned in the article. For instance the 2.6.25 release is the most current. Of course all 'new' submitters to the 2.6.25 release will not have submitted any more patches. All 'new' submitters to the 2.6.24 release will have had the chance to also submit a patch for 2.6.25.
Let's assume that each release has 100 'new' submitters and that the chance they will submit another patch in each following release decreases by 50%.
This makes perfect sense if 'new' submitters start submitting all the time. Anyone that expects someone that 'just' submitted a patch to have submitted a patch to a non-existent future release already is an idiot. This would only make any kind of sense if you included the number of 'new' submitters in each release, not just the ones that were never heard from again. Having the number of future patches and how long it took for them to release another patch would help you understand the numbers even better. For instance, if there were 304 new submitters to 2.6.11 and it took them on average 8 releases to submit another patch then the 304 number for 2.6.25 could be the same and simply the sign of a continually growing developer community. Give those people that first submitted to 2.6.25 another 14 releases to submit more patches and maybe the number of "one-time" submitters would shrink to 63.
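To put rough numbers on the thought experiment above, here is a minimal sketch. It reads "decreases by 50%" as: the chance of a newcomer coming back in the k-th release after their first is 0.5 to the power k. That reading is an assumption, as are the function and parameter names.

```python
def apparent_one_timers(new_per_release=100, later_releases=0, decay=0.5):
    """Expected newcomers with no later patch, given how many releases followed.

    Assumes the chance of returning in the k-th following release is decay**k
    and that each following release is an independent chance to return.
    """
    p_never_back = 1.0
    for k in range(1, later_releases + 1):
        p_never_back *= 1.0 - decay ** k
    return new_per_release * p_never_back


# 2.6.25 has had no follow-up releases yet; 2.6.11 has had 14.
for later in range(15):
    count = apparent_one_timers(later_releases=later)
    print(f"{later:2d} later releases: {count:6.1f}")
```

With these assumptions the newest release always shows all 100 newcomers as "one-timers", a release with one follow-up shows 50, one with two follow-ups shows 37.5, and the figure levels off a little under 29 for older releases. So a spike at the most recent release is what you would expect even if nothing about the community has changed.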