Linux Kernel Surpasses 10 Million Lines of Code
javipas writes "A simple analysis of the most updated version (a Git checkout) of the Linux kernel reveals that the number of lines of all its source code surpasses 10 million, but attention: this number includes blank lines, comments, and text files. With a deeper analysis thanks to the SLOCCount tool, you can get the real number of pure code lines: 6.399.191, with 96.4% of them developed in C, and 3.3% using assembler. The number grows clearly with each new version of the kernel, that seems to be launched each 90 days approximately."
Lines of code is not a good metric for performance. I'm in a software engineering class listening to how to use metrics on code.
Because we'd all like to know how many man-months something a big as the linux kernel should take to implement. And laugh at the huge price tag sloccount will put on it.
“Common sense is not so common.” — Voltaire
I think they are including modules as well. And there are a growing number of userland drivers as well. So you can't come to a conclusion without knowing the size of the parts outside the kernel.
"Thanks for all the money you paid to us. We've used it to buy off ISO among other things" -Microsoft
Funny that the summary calls attention to the fact that the number of lines includes comments and whitespace without any mention of how worthless lines of code is as a metric. Someone could easily go in and add or remove newlines wherever they wanted and without changed a bit of code make it 50 million or 50 thousand.
Whale
Makefiles, build scripts, etc., perhaps?
I, for one, am looking forward to the inevitable
While Linux is huge, for a backdoor to be successful it would need to hit a huge number of systems. The majority of the kernel at this point tends to be drivers, not all of which are used in a given kernel.
For it to be even remotely worthwhile, it'd have to be placed into something that was both heavily used AND given little attention. These two positions are almost mutually exclusive.
Can anyone think of a place that would fall into these two categories? Even the more seemingly obscure parts of the kernel get attention fairly often and malicious changes wouldn't go unnoticed for long.
I'm a developer and was wondering what kind of testing is done to verify the code.
Guinea pigs. Millions of us.
Pathological kinda promises Path + Logical - but instead, you get stuck with pathetic.
This only proves that the Linux Kernel is in need of a significant refactoring effort. The capacity for any single developer to understand or even read a significant portion of this code is NIL. As a result, the opportunity to reduce duplication of effort is quickly diminishing, and the ability of new users to contribute anything other than additional bloat is similarly diminishing. And while the core of the kernel may be "small", and much of this code is dealing with special cases for specific hardware, because of the size of the code involved it is increasingly difficult to identify what is substantial and what is merely stylistic differences between two drivers. Increasing LOC counts is a sure sign of under analysis and over reliance on the availability of cheap labor. You can pick any arbitrary number of lines of code (less than say 20k) and pick that as the number of lines the kernel should occupy. As an individual line may define a new abstraction, LOC represent a potential for a geometric increase in complexity. So either these 6-10 million lines of code represent some truly staggering level of irreducible complexity (most unlikely), or are merely the result of not refactoring the code sufficiently (most likely). This really is a milestone in gratuitous complexification that should be morned, not hailed.
Comments are also code.
If you only count as code what can be feed to the machine, you should look at the size of the compiled binary. Source code is meant to be read by *humans*, so comments do count. That's why the GPL requires them to be left in the files (the "preferred form" to edit), otherwise it wouldn't be source code.
Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
...still, we should think about adding Asimov's three laws before we reach such an event horizon, no?
No sig for the moment.
``Unfortunately, as you approach the limit, the performance must drop as you've now abstracted so far that your code becomes essentially a virtual machine on which your data runs.''
I don't see that. Not all abstraction makes things slower. In many cases, abstraction lets you write code at a higher level, while still compiling down to the code you would have written if working at a lower level.
Please correct me if I got my facts wrong.
i believe a more appropriate measure of the 'bloat' (i.e. useless functions) or the size of any software package is through function point analysis
I recall many years ago, a PHB (this is long enough ago that nobody called them that yet) was talking about developer productivity metrics; he announced that the powers that be were considering either KLoC or Function Points. The guy sitting next to me said "I have no idea what function points are, but they've got to be better than KLoC". The remark made one of those wonderful whooshing sounds as it sailed straight over the PHB's head...
LOC is without question one of the easiest measurement (aside from total package size in bytes, which is nearly as uninformative)
+1 - Fundamental Law Of Physics.
LoC's only redeeming feature as a metric of anything is that it's (relatively) easy to measure. Of course, there's the debate about "do we count comments", "do we count whitespace", "how do we count curly braces" - so it turns out it's actually NOT all that easy to measure. But don't let a PHB hear you speaking such heresy...
Not everything that can be measured matters; Not everything that matters can be measured.