Slashdot Mirror


Linux Kernel Surpasses 10 Million Lines of Code

javipas writes "A simple analysis of the latest version (a Git checkout) of the Linux kernel reveals that its source tree now surpasses 10 million lines, but note that this number includes blank lines, comments, and text files. With a deeper analysis thanks to the SLOCCount tool, you can get the real number of pure code lines: 6,399,191, with 96.4% of them developed in C, and 3.3% using assembler. The number clearly grows with each new version of the kernel, which seems to be released approximately every 90 days."
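
The gap between "all lines" and "pure code lines" can be illustrated with a minimal sketch. This is not SLOCCount's actual algorithm; the function name `count_lines` and the sample input are hypothetical, and comment markers inside string literals are deliberately not handled.

```python
def count_lines(source: str) -> dict:
    """Classify each physical line of C-like source as blank, comment-only, or code.

    Simplification (an assumption of this sketch): comment markers that
    appear inside string literals are treated as real comment markers.
    """
    counts = {"total": 0, "blank": 0, "comment": 0, "code": 0}
    in_block = False  # True while inside a /* ... */ comment
    for line in source.splitlines():
        counts["total"] += 1
        if not line.strip():
            counts["blank"] += 1
            continue
        has_code = False
        i = 0
        while i < len(line):
            if in_block:
                end = line.find("*/", i)
                if end == -1:
                    break  # the block comment continues on the next line
                in_block = False
                i = end + 2
            elif line.startswith("/*", i):
                in_block = True
                i += 2
            elif line.startswith("//", i):
                break  # the rest of the line is a comment
            else:
                if not line[i].isspace():
                    has_code = True
                i += 1
        counts["code" if has_code else "comment"] += 1
    return counts


# Hypothetical sample input, not taken from the kernel.
SAMPLE = """/* header
 * comment */

int main(void) {
    return 0; // done
}
"""

print(count_lines(SAMPLE))
```

Only the lines counted under "code" correspond to what the summary calls pure code lines; everything else inflates the raw total.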

41 of 432 comments

  1. Isn't that normal? by arizwebfoot · · Score: 4, Interesting

    That the line count increases with each new version unless you are starting from scratch?

    --
    Oh Well, Bad Karma and all . . .

    --
    Beer is proof that God loves us and wants us to be happy.
    1. Re:Isn't that normal? by jd · · Score: 5, Interesting

      Yes, but it can go down with optimizations and refactoring (finding duplicated code and pushing it into a function or macro, for example) and with eliminating dead code. Ideally, code size should be asymptotic to an optimal size. As you approach the optimal size, more and more of what you need to do is already available to you. As you approach the limit, the amount of special-case logic and hardcoding approaches zero, and the amount of data-driven logic approaches 100%. Unfortunately, as you approach the limit, the performance must drop as you've now abstracted so far that your code becomes essentially a virtual machine on which your data runs. Simulating a computer is always going to be slower than actually using the real computer directly. In most cases, this is considered "acceptable" because your virtual machine is simply too advanced for any physical hardware to support at this time. (There is also the consideration of code changes, but as you approach the limit, your changes will largely be to the data and not to the codebase. At the limit, you will change the codebase only when changing the hardware, so if you could hardwire the code, it would not impact maintenance at all. All the maintenance you could want to do would be at the data level, given this level of abstraction.)

      Linux is clearly nowhere near the point of being that abstract, although some components are probably getting close. It would be interesting to see, even if it could only be done by simulation, what would happen if you moved Linux's VMM into an enlarged MMU, or what would happen if an intelligent hard drive supported Linux's current filesystem selection and parts of the VFS layer. Not as software running on a CPU, but as actual hard-wired logic. Software is just a simulation of wiring, so you can logically always reverse the process. Given that Linux has a decent chunk of the server market, and the server market is less concerned with cost than with high performance, high reliability and minimal physical space, it is possible (unlikely but possible) that there will eventually be lines of servers that use chips specially designed to accelerate Linux by this method.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  2. Core functions vs Drivers? by bubulubugoth · · Score: 4, Interesting

    And how many of these lines are for core functions (memory management, scheduler, etc.) and how many for drivers (USB, filesystems)?

    --
    Â_Â
  3. Meh by alexborges · · Score: 4, Funny

    AND???

    In other news, trees tend to grow up unless they tend to grow down or sideways. Sharks tend to eat anything they can, unless they are not hungry.

    Anonymous will beat me to FP for sure, unless they don't.

    --
    NO SIG
    1. Re:Meh by V!NCENT · · Score: 5, Funny

      Yeah so!? Cars are also getting bigger and more complex over time, so Linux must be heading in the right direction!

      Did I just... ? Oh sh-

      --
      Here be signatures
  4. Stolen code by CRCulver · · Score: 5, Funny

    Too bad 9,999,999 lines of that code were ripped off from SCO.

    1. Re:Stolen code by Tubal-Cain · · Score: 4, Funny

      And the unique line is commented out.

    2. Re:Stolen code by RiotingPacifist · · Score: 5, Funny

      only in the Debian version

      --
      IranAir Flight 655 never forget!
    3. Re:Stolen code by earlymon · · Score: 4, Funny

      Take one down, pass it around, 9,999,998 lines of code from SCO

      --
      Pathological kinda promises Path + Logical - but instead, you get stuck with pathetic.
  5. assembler? by TheRealMindChild · · Score: 5, Informative

    *cough*assembly*cough*

    "assembler" is the tool, not the language.

    --

    "When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
    1. Re:assembler? by lilomar · · Score: 5, Funny

      Sure it is, why, I was assembly some assembler code just the other day. I was using my assemble to do it.

      --
      The creator of this post (Jacob Smith) hereby releases it, and all of his other posts, into the public domain.
    2. Re:assembler? by hondo77 · · Score: 4, Informative

      Then again, maybe not.

      --
      I live ze unknown. I love ze unknown. I am ze unknown.
    3. Re:assembler? by Hatta · · Score: 4, Funny

      I realize English is hard for you, but you can usually use verbs as nouns, and nouns as verbs.

      It's better if you don't. Verbing weirds language.

      --
      Give me Classic Slashdot or give me death!
  6. Re:Lines of Code by theaveng · · Score: 4, Interesting

    I used to have GEOS on my Commodore 64. I have absolutely no idea how many lines of code it used, but it could squeeze itself into just 20 kilobytes of RAM, and yet had lots of functionality (as good as an 80s-era Mac). I consider "how much RAM occupied" to be a FAR more useful metric.

    I would love to see someone develop an OS that followed a similar philosophy of using as little RAM as possible.

    --
    FOX NEWS.com should be BANNED from television and internet. Have the Congress take it over and give us Truespeak.
  7. Reply from actual kernel developer please . . . by EraserMouseMan · · Score: 4, Interesting

    I'm a developer and was wondering what kind of testing is done to verify the code. Do they use unit testing? Regression testing?

    I'm just curious because keeping 6+ million lines of code almost completely bug free is pretty amazing.

    1. Re:Reply from actual kernel developer please . . . by Anonymous Coward · · Score: 5, Funny

      Almost completely bug free? What are you smoking?

    2. Re:Reply from actual kernel developer please . . . by ZombieRoboNinja · · Score: 5, Funny

      >>There are literally thousands of men running the code on even more setups regularly

      Plus upwards of 7 women!

    3. Re:Reply from actual kernel developer please . . . by earlymon · · Score: 5, Insightful

      I'm a developer and was wondering what kind of testing is done to verify the code.

      Guinea pigs. Millions of us.

      --
      Pathological kinda promises Path + Logical - but instead, you get stuck with pathetic.
  8. Re:Um by binarylarry · · Score: 5, Informative

    Yeah but you can customize the Linux kernel. If you don't want features, just don't compile them in.

    It's easy, there's even a gui interface.

    Good luck compiling a custom NT kernel. :)

    --
    Mod me down, my New Earth Global Warmingist friends!
  9. Re:Lines of Code by megamerican · · Score: 5, Funny

    Exactly. The better metric would be how many Libraries of Congress the kernel is.

    --
    If you have something that you dont want anyone to know, maybe you shouldnt be doing it in the first place -Eric Schmidt
  10. Line Count Not Always a Good Thing? by linuxmeepster · · Score: 5, Interesting

    It's significantly easier to hide a malicious backdoor inside a huge software project than a small one. Linux has already had a near miss back in 2003, when the CVS repository was compromised. Considering how many mission-critical applications run under Linux, there's a huge financial incentive to hide a backdoor somewhere in those 10 million lines.

    1. Re:Line Count Not Always a Good Thing? by Microlith · · Score: 4, Insightful

      While Linux is huge, for a backdoor to be successful it would need to hit a huge number of systems. The majority of the kernel at this point tends to be drivers, not all of which are used in a given kernel.

      For it to be even remotely worthwhile, it'd have to be placed into something that was both heavily used AND given little attention. These two positions are almost mutually exclusive.

      Can anyone think of a place that would fall into these two categories? Even the more seemingly obscure parts of the kernel get attention fairly often and malicious changes wouldn't go unnoticed for long.

  11. Happy Ten Million, Linux! by Drakkenmensch · · Score: 5, Funny

    Now, where do we find a birthday cake with ten million candles?

    1. Re:Happy Ten Million, Linux! by Anonymous Coward · · Score: 5, Funny

      Now, where do we find a birthday cake with ten million candles?

      At John McCain's Birthday Party?

  12. What about the other .3% ? by damn_registrars · · Score: 5, Funny

    96.4% of them developed in C, and 3.3% using assembler

    That leaves .3% that is unaccounted for. What was it written in?

    --
    Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
    1. Re:What about the other .3% ? by atomic-penguin · · Score: 5, Funny

      Visual Basic 6.

      --
      /^([Ss]ame [Bb]at (time, |channel.)){2}$/
  13. Not as much as you'd think by djupedal · · Score: 4, Informative

    Since that many lines = approx. 125,000 pages, which = approx. 0.0175 terabytes, and... a LOC is approx. 18 TB, I'd say they have a ways to go...
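
    The arithmetic above can be sketched as follows. The 0.0175 TB and 18 TB figures are the post's own estimates; the 80 lines-per-page value is an assumption, chosen only because it reproduces the post's 125,000 pages.

```python
LINES = 10_000_000      # headline line count from the summary
LINES_PER_PAGE = 80     # assumption: inferred from the post's page count
KERNEL_TB = 0.0175      # the post's estimate of the kernel source size
LOC_TB = 18             # the post's estimate for one Library of Congress

pages = LINES / LINES_PER_PAGE
fraction_of_loc = KERNEL_TB / LOC_TB

print(f"pages: {pages:,.0f}")
print(f"Libraries of Congress: {fraction_of_loc:.6f}")
```

    Under these figures the kernel comes to a little under one thousandth of a Library of Congress.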

  14. Lines of code as a metric by qoncept · · Score: 4, Insightful

    Funny that the summary calls attention to the fact that the number of lines includes comments and whitespace without any mention of how worthless lines of code is as a metric. Someone could easily go in and add or remove newlines wherever they wanted and, without changing a bit of code, make it 50 million or 50 thousand.

    --
    Whale
  15. Re:Lines of Code by stephentyrone · · Score: 4, Funny

    I'm in a software engineering class listening to how to use metrics on code.

    No, you're in a software engineering class posting on Slashdot.

  16. Re:Micro-kernel vs massive kernel? by soulsteal · · Score: 5, Funny

    Tanenbaum, is that you? If so, give it up! It's been 16 years and you're not fooling anybody!

  17. Re:Lines of Code by hondo77 · · Score: 5, Insightful

    Why? Are you still using an 80s-era Mac as your primary computer?

    --
    I live ze unknown. I love ze unknown. I am ze unknown.
  18. Re:Lines of Code by QRDeNameland · · Score: 5, Insightful

    If 1 Line of Code = 1 Library of Congress, you should acquaint yourself with the Enter key.

    --
    Momentarily, the need for the construction of new light will no longer exist.
  19. Re:Lines of Code by rumblin'rabbit · · Score: 4, Interesting

    A better metric is the number of semicolons. Thus this

    for (int i = 0; i < n; i++) a[i] = b[i];

    is the same length as this...

    for (int i = 0;
         i < n;
         i++)
    {
        a[i] = b[i];
    }
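
    A minimal sketch of the comparison, assuming a naive count that ignores semicolons inside strings and comments (the helper names are hypothetical):

```python
ONE_LINE = "for (int i = 0; i < n; i++) a[i] = b[i];"

MULTI_LINE = """for (int i = 0;
     i < n;
     i++)
{
    a[i] = b[i];
}"""


def semicolon_count(source: str) -> int:
    """Count semicolons as a rough, layout-independent size measure."""
    return source.count(";")


def line_count(source: str) -> int:
    """Count physical lines, which depends entirely on layout."""
    return len(source.splitlines())


for name, snippet in (("one-line", ONE_LINE), ("multi-line", MULTI_LINE)):
    print(name, "lines:", line_count(snippet), "semicolons:", semicolon_count(snippet))
```

    The line counts differ while the semicolon counts agree, which is the point of the metric.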

  20. Not very informative. by hey! · · Score: 4, Funny

    This article summary is not very informative. The very least they could do is tell us which ten million lines of code Linux has surpassed.

    --
    Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  21. A thousand Unix System 6 kernels. by Ungrounded+Lightning · · Score: 4, Interesting

    The better metric would be how many Libraries of Congress the kernel is.

    Perhaps better would be number of times the size of the Unix System 6 kernel.

    That's the one that the University of Waterloo printed as a textbook, half of a two book set. (The other book was the OS course text using it as the example.) They printed it at 50 lines per page column and added (lots of) whitespace and adjusted comments so routines fell on nice page boundaries. Even padded this way it came out to a total of ten thousand lines (of which I think 2 thousand were still in assembly code). Just right for one person to maintain full-time by the then-current rule-of-thumb.

    So the linux kernel is a thousand times the size of that (whitespace-padded) version of the Unix kernel.

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
  22. Re:Can this be converted into kloc ? by DrVxD · · Score: 4, Funny

    You could try:

                  DIVIDE SLOC BY 1000 GIVING KLOC.

    --
    Not everything that can be measured matters; Not everything that matters can be measured.
  23. "Actual" code? by TuringTest · · Score: 4, Insightful

    Comments are also code.

    If you only count as code what can be fed to the machine, you should look at the size of the compiled binary. Source code is meant to be read by *humans*, so comments do count. That's why the GPL requires them to be left in the files (the "preferred form" to edit), otherwise it wouldn't be source code.

    --
    Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
  24. Re:What did sloccount say the kernel was worth? by bendodge · · Score: 4, Informative

    Ohloh has a COCOMO calculator, which spits out ~$181M if you pay coders $55,000 a year.

    http://www.ohloh.net/projects/linux
    http://en.wikipedia.org/wiki/COCOMO
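
    For a rough idea of how a COCOMO estimate is formed, here is a sketch of the basic COCOMO model with Boehm's organic-mode coefficients. Which mode, line count, and overhead factor Ohloh applies is not stated here, so this is an assumption and will not necessarily reproduce the ~$181M figure; the function name is hypothetical.

```python
def cocomo_basic_organic(ksloc: float, salary_per_year: float) -> dict:
    """Basic COCOMO, organic mode (an assumed choice of model and mode).

    effort   = 2.4 * KSLOC^1.05   person-months
    schedule = 2.5 * effort^0.38  months
    Cost here is plain salary with no overhead multiplier.
    """
    effort_pm = 2.4 * ksloc ** 1.05
    schedule_months = 2.5 * effort_pm ** 0.38
    cost = effort_pm / 12 * salary_per_year
    return {
        "effort_person_months": effort_pm,
        "schedule_months": schedule_months,
        "cost": cost,
    }


# 6,399.191 KSLOC is the SLOCCount figure from the summary, not Ohloh's own count.
estimate = cocomo_basic_organic(6399.191, 55_000)
print(estimate)
```

    Because effort grows slightly faster than linearly with code size, the estimate is sensitive to which line count is fed in.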

    --
    The government can't save you.
  25. Re:Lines of Code by TeknoHog · · Score: 5, Funny

    I'm in a software engineering class listening to how to use metrics on code.

    No, you're in a software engineering class posting on Slashdot.

    You are likely to be eaten by a GNU.

    --
    Escher was the first MC and Giger invented the HR department.
  26. Re: so freaking what? by Smauler · · Score: 4, Funny

    the real number of pure code lines: 6,399,191, with 96.4% of them developed in C, and 3.3% using assembler.

    Personally I thought the news was that no one knows what 0.3% of the linux kernel is written in. THAT'S news! (I'm betting it's BASIC).

  27. Re: so freaking what? by colmore · · Score: 4, Funny

    It's COBOL, that crap is still just everywhere.

    --
    In Capitalist America, bank robs you!