Linux Kernel Surpasses 10 Million Lines of Code
javipas writes "A simple analysis of the most recent version (a Git checkout) of the Linux kernel reveals that the number of lines of all its source code surpasses 10 million, but note: this number includes blank lines, comments, and text files. With a deeper analysis thanks to the SLOCCount tool, you can get the real number of pure code lines: 6.399.191, with 96.4% of them developed in C, and 3.3% using assembler. The number clearly grows with each new version of the kernel, and a new version seems to be released roughly every 90 days."
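The gap between "all lines" and "pure code lines" in the summary can be illustrated with a rough sketch. This is not how SLOCCount works internally; it is a naive classifier for C-style source that assumes comment markers never appear inside string literals, and the function name `count_c_lines` is made up for this example.

```python
def count_c_lines(source: str) -> dict:
    """Classify each physical line of C source as blank, comment-only, or code.

    Naive on purpose: it does not understand string literals, so a "/*"
    inside a string would be misread as the start of a comment.
    """
    counts = {"total": 0, "blank": 0, "comment": 0, "code": 0}
    in_block = False  # True while inside a /* ... */ comment
    for line in source.splitlines():
        counts["total"] += 1
        text = line.strip()
        has_code = False
        i = 0
        while i < len(text):
            if in_block:
                end = text.find("*/", i)
                if end == -1:
                    i = len(text)      # comment continues on the next line
                else:
                    in_block = False
                    i = end + 2
            elif text.startswith("/*", i):
                in_block = True
                i += 2
            elif text.startswith("//", i):
                break                  # rest of the line is a comment
            else:
                if not text[i].isspace():
                    has_code = True
                i += 1
        if not text:
            counts["blank"] += 1
        elif has_code:
            counts["code"] += 1
        else:
            counts["comment"] += 1
    return counts


sample = """/* header
 * comment */

int main(void)
{
    return 0; // done
}
"""
print(count_c_lines(sample))
```

Only the "code" bucket corresponds to what the summary calls pure code lines; the "total" bucket is the kind of figure behind the 10 million headline.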
That the line count increases with each new version unless you are starting from scratch?
--
Oh Well, Bad Karma and all . . .
Beer is proof that God loves us and wants us to be happy.
And how many of these lines are for core functions (memory management, scheduler, etc.) and how many for drivers (USB, filesystems)?
AND???
In other news, trees tend to grow up unless they tend to grow down or sideways. Sharks tend to eat anything they can, unless they are not hungry.
Anonymous will beat me to FP for sure, unless they don't.
NO SIG
Too bad 9,999,999 lines of that code were ripped off from SCO.
*cough*assembly*cough*
"assembler" is the tool, not the language.
"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
Because we'd all like to know how many man-months something as big as the Linux kernel should take to implement. And laugh at the huge price tag sloccount will put on it.
“Common sense is not so common.” — Voltaire
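For the man-months question above, here is a back-of-the-envelope sketch. It assumes the Basic COCOMO "organic" coefficients (effort = 2.4 × KSLOC^1.05 person-months, schedule = 2.5 × effort^0.38 months), which as far as I know are the defaults SLOCCount reports; the function name is invented for this example and no cost figure is computed because that depends on salary assumptions nobody in the thread gave.

```python
def basic_cocomo_organic(sloc: int) -> tuple[float, float]:
    """Return (effort in person-months, schedule in months).

    Uses the Basic COCOMO organic-mode constants; these are an assumption
    about which model to apply, not something stated in the story.
    """
    ksloc = sloc / 1000
    effort_pm = 2.4 * ksloc ** 1.05
    schedule_months = 2.5 * effort_pm ** 0.38
    return effort_pm, schedule_months


# 6,399,191 is the code-line count quoted in the summary.
effort, schedule = basic_cocomo_organic(6_399_191)
print(f"about {effort / 12:.0f} person-years over {schedule / 12:.1f} years")
```

The model was calibrated on much smaller projects, so treat the result as a conversation piece and not a budget.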
I used to have GEOS on my Commodore 64. I have absolutely no idea how many lines of code it used, but it could squeeze itself into just 20 kilobytes of RAM, and yet had lots of functionality (as good as an 80s-era Mac). I consider "how much RAM occupied" to be a FAR more useful metric.
I would love to see someone develop an OS that followed a similar philosophy of using as little RAM as possible.
FOX NEWS.com should be BANNED from television and internet. Have the Congress take it over and give us Truespeak.
I'm a developer and was wondering what kind of testing is done to verify the code. Do they use unit testing? Regression testing?
I'm just curious because keeping 6+ million lines of code almost completely bug free is pretty amazing.
Yeah but you can customize the Linux kernel. If you don't want features, just don't compile them in.
It's easy, there's even a gui interface.
Good luck compiling a custom NT kernel. :)
Mod me down, my New Earth Global Warmingist friends!
Exactly. The better metric would be how many Libraries of Congress the kernel is.
If you have something that you dont want anyone to know, maybe you shouldnt be doing it in the first place -Eric Schmidt
It's significantly easier to hide a malicious backdoor inside a huge software project than a small one. Linux has already had a near miss back in 2003, when the CVS repository was compromised. Considering how many mission-critical applications run under Linux, there's a huge financial incentive to hide a backdoor somewhere in those 10 million lines.
Now, where do we find a birthday cake with ten million candles?
15. The Residents - Not Available
If Obama is missing that record, I'd be glad to lend him my copy.
Momentarily, the need for the construction of new light will no longer exist.
96.4% of them developed in C, and 3.3% using assembler
That leaves 0.3% that is unaccounted for. What was it written in?
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
May I suggest that large parts of this shouldn't be in the kernel at all? That there should be independent sub-systems so that in the event of a crash or panic, the entire OS doesn't come tumbling down?
So that badly written drivers (especially graphic card drivers) don't affect the stability of the entire system?
May I suggest that flame-wars are good and the EMACS is also bloated?
(And lots of other folks have already talked about the bad metric that lines of code is...)
I wank in the shower.
Since that many lines = approx. 125,000 pages, which = approx. 0.0175 terabytes, and... a LOC is approx. 18 TB, I'd say they have a ways to go...
I wonder what the breakdown is of the almost 4 million lines that were omitted from the count as blank lines, comments, etc.? I've always said that commenting your code is a very good thing to do, so it would be interesting to see what percentage of this is comments, as opposed to blank lines (which aren't a bad thing for readability).
Attention all planets of the Solar Federation! We have assumed control! - Neil Peart
Vista had 50 million lines at Beta 2
Basically, this story is "Linux kernel surpasses 10 million lines of code! Just kidding."
Funny that the summary calls attention to the fact that the number of lines includes comments and whitespace without any mention of how worthless lines of code is as a metric. Someone could easily go in and add or remove newlines wherever they wanted and, without changing a bit of code, make it 50 million or 50 thousand.
Whale
I'm in a software engineering class listening to how to use metrics on code.
No, you're in a software engineering class posting on Slashdot.
And what would be better is a kernel where you could simply include or leave out certain modules without the need for compilation, making the kernel truly modular, and hot-swap them in or out based on your needs. That would make the kernel much more powerful and also useful for "normal" users/admins who might not want to mess with compiling. But I'm sure my argument will be slapped down by some leave-things-be get-off-my-lawn fanboy who hates the idea of scary new features like true/better modularity.
Save a tree. Let the actual devs do compiling unless someone really actually wants to see the code.
Promote true freedom - support standards and interoperability.
Ship Date  Product                         Dev Team Size  Test Team Size  Lines of Code (LoC)
Jul-93     NT 1.0 (released as 3.1)        200            140             4-5 million
Sep-94     NT 2.0 (released as 3.5)        300            230             7-8 million
May-95     NT 3.0 (released as 3.51)       450            325             9-10 million
Jul-96     NT 4.0 (released as 4.0)        800            700             11-12 million
Dec-99     NT 5.0 (Windows 2000)           1,400          1,700           29+ million
Oct-01     NT 5.1 (Windows XP)             1,800          2,200           40 million
Apr-03     NT 5.2 (Windows Server 2003)    2,000          2,400           50 million
Of course, you can't compare a whole OS to a kernel.
Data is from http://www.knowing.net/PermaLink,guid,c4bdc793-bbcf-4fff-8167-3eb1f4f4ef99.aspx
Why? Are you still using an 80s-era Mac as your primary computer?
I live ze unknown. I love ze unknown. I am ze unknown.
If 1 Line of Code = 1 Library of Congress, you should acquaint yourself with the Enter key.
Momentarily, the need for the construction of new light will no longer exist.
is the same length as this...
This article summary is not very informative. The very least they could do is tell us which ten million lines of code Linux has surpassed.
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
The better metric would be how many Libraries of Congress the kernel is.
Perhaps better would be number of times the size of the Unix System 6 kernel.
That's the one that the University of Waterloo printed as a textbook, half of a two book set. (The other book was the OS course text using it as the example.) They printed it at 50 lines per page column and added (lots of) whitespace and adjusted comments so routines fell on nice page boundaries. Even padded this way it came out to a total of ten thousand lines (of which I think 2 thousand were still in assembly code). Just right for one person to maintain full-time by the then-current rule-of-thumb.
So the Linux kernel is a thousand times the size of that (whitespace-padded) version of the Unix kernel.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
i believe a more appropriate measure of the 'bloat' (i.e. useless functions) or the size of any software package is through function point analysis--
http://en.wikipedia.org/wiki/Function_point
http://www.softwaremetrics.com/fpafund.html
the lines of code metric has long been considered an inadequate measure of software cost, complexity, or size - here is an article on why:
http://www.creativyst.com/Doc/Articles/Mgt/LOCMonster/LOCMonster.htm
but LOC is without question one of the easiest measurements (aside from total package size in bytes, which is nearly as uninformative)
I think that what you are suggesting is already standard fare for the Linux kernel.
Typically, the kernel and all modules are precompiled. Then, modules are swapped in and out as needed.
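To see that swapping in action, you can look at which modules are currently loaded. The sketch below parses text in the format of /proc/modules (the same data `lsmod` displays); the sample lines and the function name `parse_proc_modules` are made up for illustration, and on a real Linux box you would pass in the contents of /proc/modules instead.

```python
def parse_proc_modules(text: str) -> list[dict]:
    """Parse /proc/modules-style text into a list of module records.

    Each line looks like: name size refcount used_by state address
    where used_by is "-" or a comma-separated list with a trailing comma.
    """
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name, size, refcount, deps = fields[:4]
        used_by = [] if deps == "-" else [d for d in deps.split(",") if d]
        modules.append(
            {"name": name, "size": int(size), "refcount": int(refcount), "used_by": used_by}
        )
    return modules


# Hypothetical sample; real input would come from open("/proc/modules").read()
sample = (
    "usbcore 286720 5 xhci_hcd,usbhid, Live 0x0000000000000000\n"
    "loop 40960 0 - Live 0x0000000000000000\n"
)
for mod in parse_proc_modules(sample):
    print(mod["name"], mod["size"], mod["used_by"])
```

A module with a refcount of zero and nothing in its used-by list is one that could be removed again without recompiling anything.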
Kernel only, or bundled software included? Because if you count bundled software, it isn't a fair comparison. And note that the whole graphics subsystem is included in there also, so add X11 to the lot... but whatever, comparing the number of lines of code is akin to comparing the number of bolts in a car.
it's interesting information nonetheless. Divide the number of bugs by the number of LoC and you get a better-than-industry ratio in both cases. Which says a lot.
Of Code And Men
This only proves that the Linux kernel is in need of a significant refactoring effort. The capacity for any single developer to understand or even read a significant portion of this code is NIL. As a result, the opportunity to reduce duplication of effort is quickly diminishing, and the ability of new users to contribute anything other than additional bloat is similarly diminishing. And while the core of the kernel may be "small", and much of this code is dealing with special cases for specific hardware, because of the size of the code involved it is increasingly difficult to identify what is a substantial and what is a merely stylistic difference between two drivers. Rising LOC counts are a sure sign of under-analysis and over-reliance on the availability of cheap labor. You can pick any arbitrary number of lines of code (less than say 20k) and pick that as the number of lines the kernel should occupy. As an individual line may define a new abstraction, LOC represent a potential for a geometric increase in complexity. So either these 6-10 million lines of code represent some truly staggering level of irreducible complexity (most unlikely), or are merely the result of not refactoring the code sufficiently (most likely). This really is a milestone in gratuitous complexification that should be mourned, not hailed.
You could try:
DIVIDE SLOC BY 1000 GIVING KLOC.
Not everything that can be measured matters; Not everything that matters can be measured.
If you're actually serious, (sarcasm is kind of hard to detect in plain text): man modprobe. Since Linux 2.0.
In addition there is also ksplice, to swap the actual kernel too.
IranAir Flight 655 never forget!
Comments are also code.
If you only count as code what can be fed to the machine, you should look at the size of the compiled binary. Source code is meant to be read by *humans*, so comments do count. That's why the GPL requires them to be left in the files (the "preferred form" to edit), otherwise it wouldn't be source code.
Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
>at 100 characters per line
No no, you are thinking of Java. Linux is written in C
Climate Progress - Hell and High Water
I'm in a software engineering class listening to how to use metrics on code.
No, you're in a software engineering class posting on Slashdot.
You are likely to be eaten by a GNU.
Escher was the first MC and Giger invented the HR department.
the real number of pure code lines: 6.399.191, with 96.4% of them developed in C, and 3.3% using assembler.
Personally I thought the news was that no one knows what 0.3% of the linux kernel is written in. THAT'S news! (I'm betting it's BASIC).
It's COBOL, that crap is still just everywhere.
In Capitalist America, bank robs you!
Check out cyclomatic complexity. It basically measures the number of different execution paths you can go through in a given function. It's not quite what you're looking at, but it's close. It's also closely related to the nesting depth of conditionals/loops, which is a good way to eyeball conceptual "size".
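As a rough illustration of the idea, the sketch below approximates cyclomatic complexity for a single C function by counting decision points and adding one. It is a keyword-counting heuristic, not a real parser: it assumes no decision keywords appear inside comments or strings, and the names `DECISION` and `approx_cyclomatic` are invented for this example.

```python
import re

# Branching constructs that each add one execution path:
# if / for / while / case, plus short-circuit operators and the ternary "?".
DECISION = re.compile(r"\b(?:if|for|while|case)\b|&&|\|\||\?")


def approx_cyclomatic(c_function: str) -> int:
    """Approximate cyclomatic complexity as 1 + number of decision points."""
    return 1 + len(DECISION.findall(c_function))


sample = """
int clamp(int x, int lo, int hi)
{
    if (x < lo)
        return lo;
    if (x > hi)
        return hi;
    return x;
}
"""
print(approx_cyclomatic(sample))
```

Unlike a raw line count, this number does not change when someone adds or removes newlines, which is the complaint raised earlier in the thread.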
/* 3k lines of workaround for 8 lines of code. WTF were they thinking? */
//This might work.
//Blocks undocumented interface used only by WordPerfect.
//Passes test. Ship it. I'm done. <Allchin>
Help stamp out iliturcy.
No, but a modern PC running Windows uses 1000 times more RAM than GEOS on a Commodore 64 and doesn't really do anything extra. The OS needs to go on a diet.
GEOS supported thousands of printers, hundreds of hard drive adapters, hundreds of video cards, streaming network video, 3d gaming, virtual memory, several CPU vendors, hundreds of mice, and all that in 20KB of memory? Impressive!
Less sarcastic answer: modern computers do a whole awful lot more than GEOS did.
Dewey, what part of this looks like authorities should be involved?