Slashdot Mirror


The Linux Kernel Has Grown By 225,000 Lines of Code This Year, With Contributions From About 3,300 Developers (phoronix.com)

Here's an analysis of the Linux kernel repository that attempts to find some fresh numbers on the current kernel development trends. The author writes: The kernel repository is at 782,487 commits in total from around 19,009 different authors. The repository is made up of 61,725 files and from there around 25,584,633 lines -- keep in mind there is also documentation, Kconfig build files, various helpers/utilities, etc. So far this year there have been 49,647 commits that added 2,229,836 lines of code while dropping 2,004,759 lines of code. Or a net gain of just 225,077 lines. Keep in mind there was the removal of some old CPU architectures and other code removed in kernels this year so while a lot of new functionality was added, thanks to some cleaning, the kernel didn't bloat up as much as one might have otherwise expected. In 2017 there were 80,603 commits with 3,911,061 additions and 1,385,507 deletions. Given just over one quarter to go, on a commit and line count basis 2018 might come in lower than the two previous years.
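The net figure in the summary is plain subtraction. A quick check in Python, using only the numbers quoted above (the variable and function names are my own):

```python
# Additions and deletions as quoted in the summary.
added_2018, removed_2018 = 2_229_836, 2_004_759   # 2018 so far
added_2017, removed_2017 = 3_911_061, 1_385_507   # all of 2017


def net_change(added: int, removed: int) -> int:
    """Net line growth: lines added minus lines removed."""
    return added - removed


net_2018 = net_change(added_2018, removed_2018)
net_2017 = net_change(added_2017, removed_2017)

print(f"2018 so far: {net_2018:+,} lines")
print(f"2017:        {net_2017:+,} lines")
```

The 2018 result matches the 225,077 in the summary. Note that the 2018 numbers cover only part of the year, so the two rows are not directly comparable.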

Linus Torvalds remains the most frequent committer at just over 3% while the other top contributors to the kernel this year are the usual suspects: David S. Miller, Arnd Bergmann, Colin Ian King, Chris Wilson, and Christoph Hellwig. So far in 2018 there were commits from 3,320 different email addresses. This is actually significantly lower than in previous years.

41 of 88 comments

  1. doesn't seem surprising by thePsychologist · · Score: 1

    This doesn't seem surprising, given that new architecture comes out all the time. What's great is that this amazing piece of work is still FLOSS, powering my Macbook as I type this comment.

    --
    "What lies behind us, and what lies before us are tiny matters compared to what lies within us." Ralph Waldo Emerson
  2. Lines of code by SCVonSteroids · · Score: 2

    Why do we measure in lines of code? Serious question.

    --
    I tend to rant.
    1. Re:Lines of code by Archtech · · Score: 1

      What's a good alternative? Serious answer.

      Function points?

      At least LOC gives us a rough estimate of how much work has been done writing source code. Of course it may be bad code, or completely wasted, and of course it must be calibrated by the language used.

      The problem IMHO is not the use of LOC but the foolish assumptions some people make based on that metric.

      --
      I am sure that there are many other solipsists out there.
    2. Re:Lines of code by Anonymous Coward · · Score: 1

      Because we are lacking better ways to measure code.

      Sure, lines of code doesn't have to be a good thing and neither does binary size, but those are easy to quantify.
      Would you rather that we used electro-psychometry to measure it?

    3. Re:Lines of code by ShanghaiBill · · Score: 4, Insightful

      Why do we measure in lines of code? Serious question.

      LOC is an important metric, for quantifying both progress and complexity.

      The mistake is assuming that more is better.

    4. Re:Lines of code by Kjella · · Score: 1

      Why do we measure in lines of code? Serious question.

      Lack of a better metric? Though at this level of abstraction I think classifying a project as small = 1 kLOC, medium = 10 kLOC, big = 100 kLOC and huge = 1000 kLOC project works just fine. I would think the number of maintainers you need scales pretty linearly with LOC. But it doesn't mean it's a useful measure of productivity....

      --
      Live today, because you never know what tomorrow brings
    5. Re:Lines of code by QuietLagoon · · Score: 1

      ... Why do we measure in lines of code? Serious question. ...

      Some metrics have as their base the number of lines of code. In and of itself, LoC is pretty meaningless. Other views which use LoC, however, can be quite useful.

    6. Re:Lines of code by fahrbot-bot · · Score: 3, Insightful

      Why do we measure in lines of code? Serious question.

      LOC is an important metric, for quantifying both progress and complexity.

      And yet, LOC doesn't necessarily quantify either progress or complexity.

      --
      It must have been something you assimilated. . . .
    7. Re:Lines of code by Zero__Kelvin · · Score: 1

      The question isn't Why measure in Lines of Code, but why not put those numbers in a context. It is necessary to understand how LOC increase does, and more importantly perhaps, doesn't increase overall complexity, vulnerability / attack surface, etc. I don't think there is any intention to deceive here ... I just think the author forgot how much they know that the target audience doesn't.

      --
      Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
    8. Re:Lines of code by arth1 · · Score: 1

      What's a good alternative? Serious answer.

      Function points?

      At least LOC gives us a rough estimate of how much work has been done writing source code.

      Z factor is a better one - the size of the code after being fed through compression (like compress, giving .Z files). Large amounts of extra spacing or unnecessary line breaks won't be factored in, while code originality gives a higher score than copy/paste jobs.
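A rough sketch of the compressed-size idea in Python. It uses zlib rather than the classic `compress` tool, so it is a stand-in for the "Z factor" described above, not the same algorithm, and the sample source text is invented for illustration:

```python
import zlib


def compressed_size(source: str) -> int:
    """Size in bytes of the source text after zlib compression."""
    return len(zlib.compress(source.encode("utf-8"), 9))


# Invented sample "code": 50 small, similar functions.
original = "".join(
    f"def handler_{i}(x):\n    return x * {i} + {i * 7 % 13}\n" for i in range(50)
)
padded = original.replace("\n", "\n\n\n")  # same code with extra blank lines
pasted = original * 4                      # same code copy/pasted four times

for label, text in (("original", original), ("padded", padded), ("pasted", pasted)):
    print(f"{label:9s} raw={len(text):6d} bytes  compressed={compressed_size(text):5d} bytes")
```

On this sample the raw size grows a lot for the padded and pasted variants while the compressed size grows far less, which is the behaviour the metric relies on.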

    9. Re:Lines of code by AlanObject · · Score: 1

      Why do we measure in lines of code? Serious question.

      I would like to know a decent alternative. For the past 20 years whenever I was involved in a contractual transfer of intellectual property of software I would invariably get asked "how many lines of code?"

      Generally the question is not too hard to answer as long as you don't get too picky. After all we are talking about a "find" command piped into "wc" on various checked out directory trees. Who knows what percentage of that is actual compile-active source code versus everything else? The legal and accounting and M&A departments never seem to think to ask.
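For reference, a minimal Python equivalent of that kind of "find piped into wc" count. The function name and the default suffix list are assumptions for illustration, and it counts every line in matching files, including comments and blanks:

```python
import os


def count_lines(root, suffixes=(".c", ".h")):
    """Count lines in every file under `root` whose name ends with one of `suffixes`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                # Binary mode avoids decoding errors on odd encodings.
                with open(os.path.join(dirpath, name), "rb") as fh:
                    total += sum(1 for _ in fh)
    return total
```

Calling `count_lines("path/to/checkout")` gives one number for the whole tree, which is exactly why it says nothing about what share of that is compiled code.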

      To make things more uncomfortable for the software engineers who just want to be honest about it, more often than not the seller wants to include the lines of open source in the total because that is much, much bigger. Of course. One example: last century I worked with an embedded (Linux) product selling pretty well and the unique, proprietary code was about 300,000 lines or so. Include the kernel and utilities compiled and bundled in and you ended up with about 3,500,000 lines of code. Guess which number was used to price the sale.

      I spent some time trying to think of some way to vault over this kind of BS but I never got anywhere with it and ceased trying. They don't want to hear it.

      At the end of the day it doesn't really matter. What matters is the revenue stream the resulting software generates and the customer base it has. What they really are asking is how hard it is to duplicate the software by, say, some cheap programming team in the third world. That in fact is a forever unknowable quantity but you have to give them something to believe in that they can use to convince the buyer to sign the check.

    10. Re: Lines of code by jd · · Score: 4, Informative

      Lines of code tells you how much work was put in.

      The ratio of lines of code to code blocks tells you how maintainable the code is.

      Defect density tells you the quality of the code.

      A triple of these would give you a reasonable analysis.
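One way to sketch that triple in Python. "Code blocks" is read here as functions, which is an assumption, and the sample numbers are made up:

```python
from dataclasses import dataclass


@dataclass
class CodeStats:
    lines: int    # lines of code (effort proxy)
    blocks: int   # code blocks, assumed here to mean functions
    defects: int  # known defects

    @property
    def lines_per_block(self) -> float:
        """Average block length (maintainability proxy)."""
        return self.lines / self.blocks

    @property
    def defects_per_kloc(self) -> float:
        """Defect density per thousand lines (quality proxy)."""
        return self.defects / (self.lines / 1000)

    def triple(self):
        return (self.lines, self.lines_per_block, self.defects_per_kloc)


stats = CodeStats(lines=12_000, blocks=400, defects=6)  # made-up project
print(stats.triple())
```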

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    11. Re:Lines of code by AHuxley · · Score: 1

      The idea was for a small amount of code to do DOTADIW, or "Do One Thing and Do It Well."
      That would see a lot of really well understood code working hard to make a really great OS.

      --
      Domestic spying is now "Benign Information Gathering"
    12. Re:Lines of code by Pseudonym · · Score: 1

      Because that's what diff measures.
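A small Python illustration of how a line-based diff yields "added" and "removed" counts. It uses the standard library's difflib, not Git's own diff algorithm, so counts can differ from `git diff --stat` in edge cases; the function name is my own:

```python
import difflib


def diff_stat(old: str, new: str) -> tuple[int, int]:
    """Return (added, removed) line counts between two versions of a text."""
    old_lines, new_lines = old.splitlines(), new.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    added = removed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1   # lines that exist only in the old version
        if tag in ("replace", "insert"):
            added += j2 - j1     # lines that exist only in the new version
    return added, removed


old = "a\nb\nc\n"
new = "a\nB\nc\nd\n"
print(diff_stat(old, new))  # one changed line counts as 1 removed + 1 added
```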

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    13. Re:Lines of code by gTsiros · · Score: 1

      considering that the best code is the one you don't write, i'd say LOC is a horrible, horrible metric.

      at work, our main product is roughly 2 million LOC (not counting blank lines, lines with only comments, preprocessor statements...)

      considering that so far any class i've tried to simplify was down by *at least* 95% (not exaggerating, in some cases i straight up *removed* classes because they were duplicated), i'd say the actual code is less than 100k. Possibly on the order of 10k.

      --
      Looking for people to chat about multicopters, coding, music. skype: gtsiros
  3. LOC != kernel bloat by Zero__Kelvin · · Score: 4, Informative

    An important caveat here is that an increase in LOC count does not mean a linear increase in loaded kernel memory usage. For example, every line of a new driver is counted, but that driver may or may not be compiled in or loaded as a module. If a driver for wireless card X is 1200 lines of code, but your system doesn't have that card and the driver was compiled as a module, then none of the machine code generated from those added lines gets loaded at runtime.

    There are more than 1000 .config options, and over 30 supported hardware architectures, so your code mileage *will* vary.
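A toy model of that point in Python. Apart from the 1200-line wireless driver from the comment above, every driver name and line count is hypothetical, and "a module is loaded only when its hardware is present" is a simplification of how module loading actually works:

```python
# name -> (source lines, Kconfig-style state)
# "y" = built into the kernel image, "m" = loadable module, "n" = not built
drivers = {
    "wifi_x": (1_200, "m"),     # the wireless card X example
    "nic_a": (3_000, "y"),      # hypothetical
    "gpu_b": (50_000, "m"),     # hypothetical
    "legacy_c": (8_000, "n"),   # hypothetical
}
hardware_present = {"nic_a", "gpu_b"}  # this system has no wireless card X


def resident_source_lines(drivers, hardware_present):
    """Source lines whose compiled code would be resident on this system."""
    total = 0
    for name, (loc, state) in drivers.items():
        if state == "y":
            total += loc   # built-in code is always part of the image
        elif state == "m" and name in hardware_present:
            total += loc   # module assumed loaded only for present hardware
    return total


all_lines = sum(loc for loc, _state in drivers.values())
print(f"{resident_source_lines(drivers, hardware_present):,} of {all_lines:,} source lines are resident")
```

Source lines are only a proxy here, since compiled size per line varies widely.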

    --
    Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
    1. Re:LOC != kernel bloat by deathguppie · · Score: 1

      the Linux kernel is a mature piece of code. The amount of changes to core architecture should be limited to planned events and take years to mature to the point of inclusion. What we are looking at is just the amount of interest in making sure that the Linux kernel is being maintained at a prodigious rate so that we can wake up knowing that our bugs are being fixed and our security patches are on time. If there are people out there thinking that any amount of "lines of code" are some kind of developmental feature in the Linux kernel without a subsequent announcement they are deluding themselves. This is just to let people know that the machine is working fine.

      --
      once more into the breach
    2. Re:LOC != kernel bloat by Zero__Kelvin · · Score: 1

      Absolutely, my brother!

      --
      Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  4. Linux jumped the shark years ago by Anonymous Coward · · Score: 1

    Never had the critical mass to win over desktop users (real workstation desktops, not hobbyists). Now it's just a bunch of change for the sake of change, plus a ton of meltdown/spectre bloat.

    It's too bad, it could have been a contender had the chips fallen their way and had they made a few critical decisions differently. Now with the systemd malware being stuffed down everyone's throat, it's assuredly game over.

  5. Embedded = Fubar by Anonymous Coward · · Score: 1

    If/When you look at the linux embedded scene, you'll notice that there is way too much duplication of functionality going on.
    Then you have to look at the linux embedded kernel forks which haven't been integrated into the mainstream kernel, and you find even more duplication going on.
    It is horrible, and it is getting worse every day.

    Also, the newly integrated Cryptocoin miner for the x86 SMM EC is a real LOC hog, but that's what you get when you need to hide its functionality.

  6. All in C? by Anonymous Coward · · Score: 1

    Is all written in C by men? That's horrible!

  7. How many bugs in, say, 10,000 lines of code? by QuietLagoon · · Score: 3, Interesting

    What's the rate of bug occurrence per 10k LoC in the Linux kernel? I'm less concerned about additional kernel bloat than I am about additional kernel bugs.

    1. Re:How many bugs in, say, 10,000 lines of code? by Anonymous Coward · · Score: 3, Informative

      For example, this: https://scan.coverity.com/proj...

      They state 0.45 defects/kLOC. Of course, they won't "find" all defects... and there might be some false positives in there. But you get the ballpark.
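To put that density in the units the parent asked about, a back-of-envelope Python calculation. Applying the rate to the full 25,584,633-line figure from the summary is an assumption, since that total includes documentation and other non-code files that the scan may not cover:

```python
DEFECTS_PER_KLOC = 0.45    # density quoted above
TOTAL_LINES = 25_584_633   # repository line count from the summary

defects_per_10k = DEFECTS_PER_KLOC * 10
estimated_defects = TOTAL_LINES / 1000 * DEFECTS_PER_KLOC

print(f"about {defects_per_10k:.1f} detected defects per 10,000 lines")
print(f"about {estimated_defects:,.0f} detected defects if the rate held for the whole tree")
```

This only scales the detected-defect rate; it says nothing about defects the scanner cannot find.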

      Use your favourite search engine (hopefully not Google and its ilk).

      Kids, these days. When I was young, I queried Altavista with telnet. Tsk, tsk.

    2. Re:How many bugs in, say, 10,000 lines of code? by phantomfive · · Score: 1

      They state 0.45 defects/kLOC. Of course, they won't "find" all defects...

      They won't find most defects.

      But you get the ballpark

      Not really because the kernel developers know how to avoid the kind of bugs Coverity scans for (Coverity has been haranguing them over it for nearly two decades now).

      --
      "First they came for the slanderers and i said nothing."
    3. Re:How many bugs in, say, 10,000 lines of code? by Lost+Race · · Score: 1

      If kernel code is like most code, and it probably is, there is about 1 bug per line of code. So 10,000 or so.

  8. Re: LOC != kernel bloat. FALSE!! by Anonymous Coward · · Score: 1

    The kernel is bloat. Most of the new code should be outside and independent of the kernel. Look at ZFS: it is outside of the kernel and proves that NO file system code is needed in the kernel except for the boot file system, so the kernel can load the first needed modules, like a FS, and the rest will follow.

    Break the kernel up and we can get to the point of truly replaceable parts and modules. Even having 2 or more of the “same” FS so you do not have to do Big Bang upgrades.

  9. Kernel code or drivers? by Gravis+Zero · · Score: 2

    What really matters here is if we are talking about if this is a lot of code that has a direct effect on the functionality of all kernels or if this is really about code for specific kernel drivers. Last I read, the kernel core was like 2M LOC while kernel drivers made up 31M LOC.

    --
    Anons need not reply. Questions end with a question mark.
  10. So, these contributors.... by dwywit · · Score: 1

    is Sievers still banned?

    --
    They sentenced me to twenty years of boredom
  11. Re: LOC != kernel bloat. FALSE!! by Zero__Kelvin · · Score: 1

    Congratulations ... You win the SFC award!

    --
    Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  12. Uncontrolable Bloat by aberglas · · Score: 2

    If those statistics are really true, over 200,000 NEW lines in one year in an existing, complex system is as unmanageable as 3000 contributors. As products age, the rate of additions goes down as things have to be integrated into a complex system.

    Sure, Linux is not actually as monolithic as described. But a bug in any one of those lines could bring down the whole kernel.

    It is a credit to the skill of the maintainers that they can make this work. And a debit that they try.

    1. Re:Uncontrolable Bloat by fibonacci8 · · Score: 1

      You're one of those people that compiles kernels with every available option enabled, regardless of the use case, aren't you?

      --
      Inheritance is the sincerest form of nepotism.
    2. Re:Uncontrolable Bloat by phantomfive · · Score: 1

      But a bug in any one of those lines could bring down the whole kernel.

      Most of it is drivers, and most of those drivers are for devices not running on your computer, so if there is a bug, it will be in a code path that is impossible to reach on your system (are you using JFS?). The core kernel is a lot smaller.

      --
      "First they came for the slanderers and i said nothing."
    3. Re:Uncontrolable Bloat by phantomfive · · Score: 1

      It really doesn't beg any question, you are not writing clearly. If you want to know why little-used drivers are in the kernel, the answer is that even if only .05% of the users actually need it, and someone is willing to maintain it, then that is a good enough reason to put it in. The kernel is well architected so adding another driver won't affect the quality of the other parts of the code.

      --
      "First they came for the slanderers and i said nothing."
  13. 5 commits per day by mapkinase · · Score: 1

    Linus Torvalds makes 5 commits per day

    --
    I do not believe in karma. "Funny"=-6. Do good and forbid evil. Yours, Oft-Offtopic Flamebaiting Troll.
  14. Yup. Moving to L4 by meburke · · Score: 1

    As a person who's been doing UNIX since 1984, ATT SVR3.2 was my favorite, although I've used other variants. I'm tired of Linux crap. I'm tired of systemd and the systemd wars. I'm tired of having to learn nuanced differences in various distros just to do basic, common, tasks. I'm tired of package repositories that suck when it comes to good maintenance (although they are still better than rpm hell). I'm tired of half-baked security measures that are badly designed and beyond human understanding. (IMO, admin should be a straight-forward set of tasks where doing one thing doesn't break 10 others.)

    I'd rather move to seL4 and have to write drivers for the few things that I use which aren't totally implemented yet. Luckily, I am old enough that I don't have to work for anyone who doesn't agree with me, and I still have skills current enough to make the change for me and my clients.

    Linux is going to drown in its own sh*t.

    --
    "The mind works quicker than you think!"
    1. Re:Yup. Moving to L4 by Bengie · · Score: 1

      FreeBSD is still maintained by many of the original AT&T Unix, BSD, and Solaris engineers. Some have taken more high level project management positions, but quite a few are still in the trenches writing code.

  15. Line counting by peretto · · Score: 1

    And how many lines did they remove? :)

    1. Re:Line counting by sabbede · · Score: 1

      2,004,759

  16. Totally massive fail!!! by LostMyBeaver · · Score: 1

    25 million lines of code is inexcusable at any level.

    The amount of code required to make Linux work is not even a million. Let's assume you can get every feature of interest into a LOT less than that and then depend on modules for everything else.

    Consider that the Linux Kernel as it stands today is one massive repository of trash on trash on trash on something less trashy.

    Linus made Linux and he made Git and Git has things like submodules and there are things like GVFS as well in some environments. Why not scale the kernel back to bare minimums and then link modules in separate repositories and make something like a package manager to tie it all together? I mean seriously, there is not now, nor will there ever be, a suitable excuse for such a horrible monstrosity as the Linux kernel.

    If Linus is taking a break... maybe it's time to just boot him from the whole thing and get someone in who is more focussed on cleanliness and organization. For example... let the Linux foundation toss the new maintainer $100,000 per million lines of code stripped from the kernel and offloaded into an external project. Rebuild the make config system so that it works based on a series of features and git URLs instead of being one gigantic pile of define nastiness.

    I think it's time for a real change to Linux.

    BTW... I don't think it's ever possible without a complete rewrite, but it's time to clean up the headers. I mean seriously... have you seen the pile of shit in the linux header directories these days? Github as well as others won't even let you see the directory listing anymore. What about multi-thousand line headers which contain so much crud you can't even read them anymore?

    I think Linux is awesome in its scale... but if you've ever written a kernel module, you'd know how severely horrible the current design ... or lack of design is.

  17. So, they're counting documentation? by sabbede · · Score: 1

    The summary seems to indicate that part of those changes are in documentation, which is not code.

  18. Re:Bloatware? by cyberchondriac · · Score: 1

    Latest release name for Redhat AES8: Bloaty McBloatface.

    Just poking fun. Linux is still quite lean compared to most OSes.

    --

    Look back up at my post, now look back down, you're on the Internet. Now look back up. I'm a signature.