Slashdot Mirror


Linux Kernel Gets Fully Automated Test

An anonymous reader writes "The Linux Kernel is now getting automatically tested within 15 minutes of a new version being released, across a variety of hardware and the results are being published for all to see. Martin Bligh announced this yesterday, running on top of IBM's internal test automation system. Maybe this will enable the kernel developers to keep up with the 2.6 kernel's rapid pace of change. Looks like it caught one new problem with last night's build already ..."

25 of 159 comments

  1. now all we need is automated.... by 3seas · · Score: 4, Funny

    code generation...

    1. Re:now all we need is automated.... by Baal+Sebub · · Score: 3, Funny

      I already got 1 million monkeys in my basement working on it.

      --
      120 chars are not enough for a signature. I have discovered a truly remarkable proof which this margin is too small to c
    2. Re:now all we need is automated.... by Curtman · · Score: 4, Interesting

      Actually, that could be done, could it not?

      Apparently it works for Samba. :)

  2. Question: by bogaboga · · Score: 4, Interesting

    How were the previous kernels being tested? Were sources for improvement/change/modification, bugs and areas requiring refactoring being discovered by chance?

    1. Re:Question: by Anonymous Coward · · Score: 3, Informative
      "How were the previous kernels being tested?"
      Hey guys, new kernel is out, bang away at it and let me know what you think.
  3. What took so long by Timesprout · · Score: 3, Interesting

    Most projects of any complexity use automated continuous build and testing as a standard development practice.

    --
    Do not try to read the dupe, thats impossible. Instead, only try to realize the truth
    What truth?
    There is no dupe
  4. This is awesome by jnelson4765 · · Score: 5, Insightful

    But it can't catch everything - the 1394 bus was screwed in 2.6.11. There are a lot of regressions that show up - and even that healthy cluster of systems will not show every problem.

    Sound issues? Older network and SCSI cards? There are a lot of drivers that break, and no one notices it because there is nobody with the hardware testing the -rc or -mm kernels.

    Wouldn't it make more sense to package these tools for someone to install on their collection of oddball equipment, and assist in the debugging/testing?

    Where's the ARM, MIPS, and SH?

    --
    Why can't I mod "-1 Idiot"?
    1. Re:This is awesome by Meshach · · Score: 5, Insightful
      But it can't catch everything...
      But that is not the point of automated testing. As a member of a QA team that is developing automated tests, I get comments like that every day.

      Automated tests are not intended to catch everything or test strange permutations of pre-conditions. Their purpose is to provide a mechanism for verifying that a build satisfies the basic requirements of the project.

      More exotic configs need to be tested manually as usual but automated tests can provide a "failsafe" just in case a basic part of the build is broken.
      --
      "Maybe this world is another planet's hell"
      Aldous Huxley
  5. ARM Linux has something similar by kyllikki · · Score: 5, Informative

    ARM Linux has had something similar in Kautobuild for some time.

    Although the testing and building is limited to the ARM platform.

    The site also has a who's who that's worth looking at ;-)

  6. Re:Within 15 Minutes? WTF by DigiShaman · · Score: 3, Insightful

    Sounds like the solution to this problem is clear. Always use the second to latest kernel released. Stay away from the new one until it's fully tested to your satisfaction.

    --
    Life is not for the lazy.
  7. Presumably... by Kjella · · Score: 4, Insightful

    ...the cross-platform, cross-hardware part? Setting up one machine to build automatically is easy. Setting up a whole bunch of them (all unique, read: administration nightmare) and tying them together into one system, that's quite a bit of work.

    Kjella

    --
    Live today, because you never know what tomorrow brings
    1. Re:Presumably... by oxfletch · · Score: 5, Informative

      Indeed. The automation system I wrote is just a wrapper around an internal harness called ABAT that has a massive amount of work behind it. If systems crash it can detect that, power cycle them, etc.

      Going from 90% working to 99.9% working is frigging hard. I had all this working 3-6 months ago, but the results weren't good enough quality to be published. Several people internally put a massive amount of work into improving the quality and stability of the harness.

  8. Re:Within 15 Minutes? WTF by doshell · · Score: 3, Insightful

    "Release" in the open source world has a broader sense than in commercial software. In open source not all "released" versions are meant for general public consumption; they include unstable versions targeted mostly at developers, so that severe issues can be detected and patched quickly.

    Taking this into account, I believe this is meant to catch bugs mainly in nightly (unstable) builds and release candidates, not in "final" versions (those should, at least in theory, have no serious bugs left around as the latter have already been eradicated from release candidates).

    --
    Score: i, Imaginary
  9. Re:Within 15 Minutes? WTF by oxfletch · · Score: 5, Informative

    I automatically test every nightly -git snapshot release, so it's fairly well tied in anyway. This also means my heaviest usage of our machines is at night, when most of the (US) developers are asleep.

    So it's fairly well tied in already ... and the whole -rc cycle should enable us to catch a lot of stuff.

  10. News Flash by sirReal.83. · · Score: 4, Informative

    Red Hat (and probably Novell/SuSe, since they use over one thousand kernel patches) runs a myriad of tests on each of its own kernel builds nightly - and has been doing so for years. On more than just the 3 architectures covered by this test.

    That said, pushing tests upstream is a great idea. Just not revolutionary or anything.

  11. Long uptimes by rice_burners_suck · · Score: 4, Interesting
    This is a very smart system. The Samba team uses something very similar. The key to finding regressions with this method is to create tests for every piece of functionality, and to integrate it with the rest of the testing suite, so that each function of the kernel will be continuously tested. For new features, it is preferable to create these tests as the features are being coded. For existing millions of lines of code, it is necessary for some brave souls to go in and create these tests.

    I hope they are using code from the Linux testing suite. That piece of work has already formed a nice set of tests. Also, I hope that the kernel is automatically built with many different combinations of options. And with time, I hope this will become better. The more tests, with the more hardware configurations, with the more kernel configurations, with the more types of input data (including many imaginative forms of incorrect input data to test that the kernel handles it gracefully and thwarts attacks based on such methods), the better quality we will have in the kernel, and it is likely that Linux will be unmatched in quality, stability, efficiency (well, maybe not efficiency necessarily), and long uptimes.

  12. Re:How much testing? by oxfletch · · Score: 5, Informative

    Compiles, boots, runs dbench, tbench, kernbench, reaim, fsx. If one test fails, it'll highlight it in yellow, rather than green or red. I have a few of those in the internal tests, but not the external set.
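
    A minimal sketch of how one cell of the results matrix might be colored. The post only says that a single failing test gives yellow rather than green or red; the rest is an assumption made for illustration: green means every test passed, and red means the kernel did not compile or boot. The function and variable names are hypothetical and are not taken from the real harness.

```python
# Hypothetical cell coloring; these rules are assumptions, not the harness's logic.
TESTS = ["dbench", "tbench", "kernbench", "reaim", "fsx"]  # tests named in the post


def cell_color(compiled, booted, test_results):
    """Return 'green', 'yellow' or 'red' for one kernel/machine pair.

    test_results maps a test name to True (passed) or False (failed).
    A test missing from the mapping is treated as failed.
    """
    if not (compiled and booted):
        return "red"      # assumed: nothing could run at all
    if all(test_results.get(name, False) for name in TESTS):
        return "green"    # every listed test passed
    return "yellow"       # built and booted, but at least one test failed


all_pass = {name: True for name in TESTS}
one_fail = dict(all_pass, fsx=False)
print(cell_color(True, True, all_pass))
print(cell_color(True, True, one_fail))
print(cell_color(True, False, {}))
```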

    This is only the tip of the iceberg as to what can be done. We're already running LTP, etc. internally, and several other tests. Some have licensing restrictions on results release (SPEC) ... LTP is a pain because some tests always fail, and I have to work out the differential against baseline. Will come later.
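
    The "differential against baseline" idea can be sketched as a set comparison: tests that fail now but did not fail in the baseline are the interesting ones, while tests that fail in both are the known "always fail" cases. This is only an illustration under that assumption; the function name and the test names are made up and do not come from LTP or the harness.

```python
# Hypothetical sketch: compare a run's failures against a known baseline.
def diff_against_baseline(baseline_failures, current_failures):
    """Split the current failures into new, fixed and expected ones.

    Both arguments are iterables of test names. Returns a tuple
    (new_failures, fixed, expected), each a sorted list.
    """
    baseline = set(baseline_failures)
    current = set(current_failures)
    new_failures = sorted(current - baseline)   # failing now, passed in the baseline
    fixed = sorted(baseline - current)          # failed in the baseline, passing now
    expected = sorted(current & baseline)       # failing in both runs
    return new_failures, fixed, expected


# Made-up test names, for illustration only.
baseline_run = ["test_a", "test_b"]
current_run = ["test_b", "test_c"]
new, fixed, expected = diff_against_baseline(baseline_run, current_run)
print("new failures:", new)
print("fixed:", fixed)
print("expected failures:", expected)
```

    Only the first list would need to be flagged in a report; the third can be ignored until the baseline itself changes.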

  13. through the looking glass... by moviepig.com · · Score: 3, Funny

    With an automated test suite, what happens when a class of bug is discovered to be untested-for? Presumably, the suite is modified to detect it. Then, is the resulting new suite itself subjected to an automated test suite? And, then...[divide-by-zero error...]

    --
    Seeing bad movies only encourages them. Watch responsibly
    1. Re:through the looking glass... by oxfletch · · Score: 4, Informative

      There is indeed an internal self-test suite on the harness. It's not desperately sophisticated, and I wouldn't dare show it to anyone ;-) However, it does catch a lot of stupid bugs. It requires some manual intervention/inspection to work.

      Plus, there's a separate development grid where we test new test-harness code before it's put onto the production grid.

  14. Re:Within 15 Minutes? WTF by Metteyya · · Score: 5, Informative

    Because they are nightly builds, that is, versions with the patches applied but not yet tested.

  15. Re:Maybe... by oxfletch · · Score: 5, Informative

    The results are all there if anyone wants to play with them. Go to the results matrix, and click on the numerical part of the green box. Pick a test, and drill down to the results directory.

    The numbers are there, it's just a question of drawing graphs, etc. I have some for kernbench already, but I'm not finished automating them. If anyone wants to email me code to generate them from the directory structure published there, feel free ;-) Preferably python or perl into gnuplot.
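
    A rough Python sketch of the kind of script being asked for, turning a results tree into gnuplot input. The layout of the published directories is not described in this thread, so the layout here is purely hypothetical: `<root>/<kernel>/kernbench/elapsed`, with one elapsed time in seconds per line. The function names, file names and demo data are likewise invented for illustration.

```python
import os
import statistics
import tempfile


def collect_means(root, test="kernbench", filename="elapsed"):
    """Return {kernel: mean elapsed seconds} for each kernel directory under root.

    Assumed (hypothetical) layout: <root>/<kernel>/<test>/<filename>, where the
    file holds one elapsed time in seconds per line.
    """
    means = {}
    for kernel in sorted(os.listdir(root)):  # plain string sort, not version order
        path = os.path.join(root, kernel, test, filename)
        if not os.path.isfile(path):
            continue  # this kernel has no results for the test
        with open(path) as handle:
            times = [float(line) for line in handle if line.strip()]
        if times:
            means[kernel] = statistics.mean(times)
    return means


def gnuplot_data(means):
    """Format the means as 'index kernel mean' rows for gnuplot."""
    rows = ["# index kernel mean_elapsed_s"]
    for index, (kernel, mean) in enumerate(means.items()):
        rows.append(f"{index} {kernel} {mean:.2f}")
    return "\n".join(rows) + "\n"


def gnuplot_script(data_file, title="kernbench"):
    """Return a gnuplot script plotting column 3, with column 2 as x labels."""
    return (
        'set ylabel "elapsed (s)"\n'
        "set xtics rotate\n"
        f'plot "{data_file}" using 1:3:xtic(2) with linespoints title "{title}"\n'
    )


# Demo on a throwaway tree with made-up kernel names and timings.
with tempfile.TemporaryDirectory() as root:
    for kernel, times in {"kernel-a": [100.0, 102.0], "kernel-b": [98.0]}.items():
        os.makedirs(os.path.join(root, kernel, "kernbench"))
        with open(os.path.join(root, kernel, "kernbench", "elapsed"), "w") as handle:
            handle.write("\n".join(str(t) for t in times) + "\n")
    demo_means = collect_means(root)

print(gnuplot_data(demo_means), end="")
print(gnuplot_script("kernbench.dat"), end="")
```

    Writing the two strings to `kernbench.dat` and a script file, then running gnuplot on the script, would give one point per kernel. Sorting kernel names as plain strings is a simplification; real version strings would need a version-aware sort.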

  16. Safety issues by DruggedBunny · · Score: 5, Funny

    Martin Bligh announced this yesterday, running on top of IBM's internal test automation system.

    Hope he doesn't fall off and hurt himself.

  17. Re:Why has it taken so long? by teh_cn · · Score: 3, Informative

    mod me troll, but (Free)BSD has had this for years, and not only for the kernel, but for world, too.

  18. Furthermore, it prevents regressions by xant · · Score: 3, Insightful

    Reliable, repeatable testing is a great way to prevent fixes in one area from causing bugs in another. When I fix A, I generally only test A manually. I don't test every other conceivable code path, even though my fix for A might well impact them.

    An automated test for B will catch regressions caused by my fix in A, making it harder to backslide. Backsliding is very expensive because bugs are far removed from their cause. If an automated test sees that changes in A caused a regression in B, the cause is immediately obvious.
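
    A toy illustration of the point, with everything in it invented for the example: features A and B share one helper, a "fix" for A changes that helper, and the automated check for B is what notices the breakage.

```python
# Hypothetical example: two features (A and B) share one helper.
def split_fields_v1(line):
    """Original helper: split on commas, keeping empty fields."""
    return line.split(",")


def split_fields_v2(line):
    """'Fix' made for feature A: drop empty fields."""
    return [field for field in line.split(",") if field]


def count_fields(line, split):        # feature A
    return len(split(line))


def third_field(line, split):         # feature B, relies on field positions
    return split(line)[2]


def run_suite(split):
    """Run the automated checks for A and B; return the names that fail."""
    checks = {
        "A: count ignores empties": lambda: count_fields("x,,z", split) == 2,
        "B: third field is z": lambda: third_field("x,,z", split) == "z",
    }
    failures = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a crash counts as a failure
        if not ok:
            failures.append(name)
    return failures


print(run_suite(split_fields_v1))  # before the fix: A's check fails
print(run_suite(split_fields_v2))  # after the fix: A passes, B now fails
```

    Testing only A by hand after the change would have shown success; the check for B, run on the same change, points straight at the helper as the cause.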

    --
    It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
  19. Re:Within 15 Minutes? WTF by digitalunity · · Score: 3, Insightful

    Ummm...

    If everyone did this, the newest kernels would never get tested. I think it is important that we have a diverse range of users using new, almost new, and older but well tested kernels.

    --
    You can't legislate goodness. Let each to his own destiny, by will of his freely made choices.