Lack of Testing Threatening the Stability of Linux
sebFlyte writes "Andrew Morton, a Linux kernel maintainer, has said that he thinks the lack of 'credit or money or anything' given to the people who put in long hours testing Linux releases is going to cause serious problems further down the line. In his speech at Linux.Conf.Au he also waded into the ongoing BitKeeper debate, saying 'If you pick a good technology and the developers are insane, it's all going to come to tears.'"
Just imagine what'll happen if Linux actually makes a dent in the non-geek desktop market, and widespread use by "appliance operators" ensues.
I'm really surprised no company has used this as a business model.
I think it'd be awesome to run a software debugging/testing firm, where basically you have a bunch of computers and a bunch of users come in and try their best to break the software. Cheap labor and a good variety in machines, and you could quite quickly clean up even some of the nastiest code.
"Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
Testing of Linux might be easier if it contained some automated features for sending crash reports back to a central database. Gathering some basic data on the stack trace, thread states, processes, etc. might help troubleshoot the OS in the context of the wide array of systems, configurations, and usage patterns. I know that both Microsoft and Apple have benefited strongly from this feature. Some tin-foil-hat wearers might object to their box phoning home. Tin foil hatters can just disable the feature but it might mean that the types of bug they experience never get fixed.
If developers are going to fix the bugs that occur in the real world, they need data from the real world.
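To make the crash-report idea concrete, here is a minimal user-space sketch in Python. It is not how the kernel, Microsoft, or Apple do it; the function names `build_crash_report` and `serialize_report` and the choice of report fields are made up for illustration. It only gathers the data and honours an opt-out flag. Where the report would actually be sent is deliberately left out.

```python
import json
import platform
import threading
import traceback


def build_crash_report(exc, enabled=True):
    """Collect basic diagnostic data for an exception.

    Returns None when the user has opted out (enabled=False).
    All field names here are hypothetical, chosen for illustration.
    """
    if not enabled:
        return None
    return {
        "exception": type(exc).__name__,
        "message": str(exc),
        # Formatted stack trace, one string per frame or header line.
        "stack": traceback.format_exception(type(exc), exc, exc.__traceback__),
        # Names of the threads alive at the time of the crash.
        "threads": sorted(t.name for t in threading.enumerate()),
        # Coarse system details, to spot configuration-specific bugs.
        "system": platform.system(),
        "release": platform.release(),
        "machine": platform.machine(),
    }


def serialize_report(report):
    """Turn a report into JSON text suitable for storing or uploading."""
    return json.dumps(report, sort_keys=True)


if __name__ == "__main__":
    try:
        1 / 0
    except ZeroDivisionError as exc:
        print(serialize_report(build_crash_report(exc)))
```

Passing `enabled=False` is the tin-foil-hat switch: nothing is collected at all, so nothing can be sent.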
Two wrongs don't make a right, but three lefts do.
I would think it would be hard, really hard, not to develop a bit of a prima donna attitude when you get all the attention someone like Linus does. Even geeks who tend to use logic not to become that way will still succumb to the pressure. It seems they become quite insane at times.
He needs a break, but not the kind you are thinking of. He needs a break away from all the attention. He needs to be "taken down a notch." You know, a good swift kick in the balls to make him realize he is just a regular human like the rest of us.
"Bugzilla is fine for tracking bugs, but as it's currently set up, it's not very good for resolving bugs."
Hmm... I'd be interested to understand what alternatives to a web-based system he has in mind. Any thoughts?
"This process, where individuals communicate via a Web site, is very bad for the kernel overall."
Someone you trust is one of us.
You may be able to 'break it', but can you repeatedly 'break it'?
Can you predictably reproduce the bug?
You are being MICROattacked, from various angles, in a SOFT manner.
Virtual machines can help with this; running the kernel in a sandbox to get an actual snapshot of the kernel in action. But at the same time, the kernel's going to be running, and userland/kernel-land interaction will cause plenty of bugs to crop up and show themselves. But you are right; it's hard to poke at a kernel to see what's broken, especially when some code paths are very hard to follow and others are almost never used on certain systems.
I don't think that is the issue at hand. The issue is the way he is behaving in public. The flames, the "fuck off" attitude towards people working on the kernel, etc...
The kernel did not get where it is with his current attitude.
As I said, the pressure can get to anyone and the kernel is now a mighty beast of a project to maintain. He just needs to get his head screwed on straight. Either that or risk turning into another Theo.
I have successfully used Test Driven Development in several of my projects and it is a uniquely satisfying experience. Writing test cases before writing the code then completing each test case one after another in steady progression gives a constant stream of small victories. It also means you can run all test cases at a later time and see that "yep, everything still works" or "doh! that change just broke 10 things I already had working."
There are several other benefits to writing tests first as well. The experts in the link above explain it all better than I could, I'm sure.
Many open source projects are taking this approach already and usually boast about the number of unit tests along with the lines of code included in the distribution. Anyone can type in "build test", for example, and it will show the program run and pass a thousand-odd tests.
Is it time for the kernel to embrace this methodology? I certainly think it is a genuine best practice. But is it applicable to OS development as well? I don't see any reason why it wouldn't be, but I am not a kernel developer myself.
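For anyone who hasn't seen test-first in practice, here is a tiny sketch using Python's standard `unittest` module. `parse_version` is a hypothetical helper invented for this example, not anything from the kernel tree. The idea is that the four test cases were written first, and the function was then filled in until they all passed.

```python
import unittest


def parse_version(text):
    """Turn a dotted version string such as '2.6.11' into a tuple of ints."""
    parts = text.strip().split(".")
    if any(not part.isdigit() for part in parts):
        raise ValueError(f"not a dotted version: {text!r}")
    return tuple(int(part) for part in parts)


class ParseVersionTest(unittest.TestCase):
    def test_three_components(self):
        self.assertEqual(parse_version("2.6.11"), (2, 6, 11))

    def test_four_components(self):
        self.assertEqual(parse_version("2.6.11.7"), (2, 6, 11, 7))

    def test_ordering_is_numeric(self):
        # A plain string comparison would wrongly put 2.6.9 after 2.6.11.
        self.assertLess(parse_version("2.6.9"), parse_version("2.6.11"))

    def test_rejects_garbage(self):
        with self.assertRaises(ValueError):
            parse_version("2.6.x")
```

Saved as `test_version.py` (an arbitrary file name), the suite runs with `python -m unittest test_version`. Rerunning it after every change is what gives the "yep, everything still works" check described above.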
I remember in the early days there was a program called 'crashme' that threw randomly-generated executables at the system, and it was credited with bolstering stability. Do tests like this still happen frequently, run by the unappreciated? Is there a good place online to read about these tests and their results for different point-releases? Along similar lines, I recall someone throwing random input at the various GNU utilities, and it was discovered that they were more robust against this sort of abuse than the commercial Unix equivalents. Are there any other interesting tests that anybody knows about? Breaking stuff is fun.
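The random-input approach is easy to sketch. The Python below is a toy in the spirit of those tests, not crashme itself and not the GNU utilities study: `fuzz`, `fragile_parser`, and `robust_parser` are names invented here, and the "parser" is a deliberately buggy stand-in for a real program. A fixed seed makes every run repeatable, so any failure it finds can be reproduced on demand.

```python
import random


def random_bytes(rng, max_len=64):
    """Return a random byte string of length 0..max_len."""
    return bytes(rng.randrange(256) for _ in range(rng.randrange(max_len + 1)))


def fuzz(target, runs=1000, seed=0, expected=(ValueError,)):
    """Feed random inputs to target and return the inputs that crashed it.

    Exceptions listed in `expected` count as a clean rejection of bad
    input; anything else is recorded as a failure.
    """
    rng = random.Random(seed)  # fixed seed, so failures are reproducible
    failures = []
    for _ in range(runs):
        data = random_bytes(rng)
        try:
            target(data)
        except expected:
            pass
        except Exception as exc:
            failures.append((data, repr(exc)))
    return failures


def fragile_parser(data):
    """Toy target: trusts the first byte as an index into the record."""
    index = data[0]      # IndexError on empty input
    return data[index]   # IndexError when the index byte is too large


def robust_parser(data):
    """Same job, but malformed records are rejected with ValueError."""
    if not data or data[0] >= len(data):
        raise ValueError("malformed record")
    return data[data[0]]


if __name__ == "__main__":
    print(len(fuzz(fragile_parser)), "crashing inputs for fragile_parser")
    print(len(fuzz(robust_parser)), "crashing inputs for robust_parser")
```

The design choice worth noting is the `expected` tuple: a program that refuses garbage cleanly is doing its job, so only unexpected exceptions are counted as bugs.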
The "cue the foo posts in 3, 2, 1..." posts will commence with no subsequent foo posts in 3, 2, 1...
It may not be a Linux issue per se (more of a distro issue, I think), and it's purely anecdotal, but I've been seeing some QA problems lately in the mainstream distro I use. They include a bug that requires me to hand-edit the X11 config file to get my mouse to work, having to manually rebuild the routing table after every boot, and a so-far baffling total freeze of the system after rand() hours, only when it's serving web pages. I've been using Linux to do this job for six years, and never had these kinds of problems before.
http://alternatives.rzero.com/
If anybody reading this is interested in participating in the test procedure, check out the Linux Test Project.
My lame blog.
If you actually engineer something, you can design in testability.
But Linus T., in an interview back last century, said that 'Linux is organic', not engineered.
So who is surprised that testing is 'hard'?
BTW, we have not one, but two of my colleagues down under right now listening to Andrew in person. It should be interesting to get a first-hand account of what was said.
- Necron69
True. Torvalds has always acted like an ass. Up to now however, he has benefitted from being the centre of a personality cult. As a consequence, he has developed a reputation of being a brilliant manager of men. The truth however is that very few people were prepared to contradict his "managerial decisions" even when he was being at his most hostile because of this idolatry that so many people were practicing.
I never thought I'd see the day when the zeitgeist on Slashdot was very much opposed to Torvalds and his behaviour. I can only conclude therefore that the Torvalds personality cult is crumbling.
that is NOT Microsoft's approach to testing.
Where did you hear or get the impression that that was the MS "approach" to QA?
I've written test suites for the following Microsoft products:
- Visual Basic Compiler, 7.0
- Microsoft Business Framework 1.0 (unreleased)
None of them involved just using the compiler or the business framework over and over in day to day work to find bugs.
We have a variety of test approaches, including a few that _might_ be construed as what you describe. There are a few ways that we get test coverage via product usage:
- stress
- bug bashes
- app weeks
Stress is funnier than it sounds. Did you know we're not allowed to ship Windows until the exact build of Windows under ship consideration has been running on hundreds (thousands, usually) of machines continuously with no problems while enlisted in a distributed "stress" client... where they're pounded and pounded with automated tests that do things like starve memory whilst performing other work, etc? Same with ASP.NET and the CLR - they have to _survive_ for a pre-determined time period before the build can be considered shippable. We don't think there are any show-stopper bugs at this point - but we just want to be reasonably sure. Note that if we find a bug (even an unrelated one, like the documentation has a typo) and take a fix for it, the stress cycle resets because the bits have changed. Better safe than sorry. In the end game of a product release it can literally be the case that taking a bug fix means delaying ship for another week or more.
- bug bashes
This is probably most like what you're describing. Everyone on the team sits down for a couple of days and really just beats on a specific area of the product. Security bug bashes have become popular in the last couple of years (wonder why ;) ). These really don't happen that often during the product cycle, because ad-hoc testing doesn't catch that much stuff if you've got well-developed automation suites. However, it's still very worthwhile because it is a good feedback mechanism to explain why your other testing missed something, and it's the best way to notice the odd "that's funny..." sort of issues that are not functionally incorrect but are still user-annoyance type issues.
- app week
For developer tool products (like Microsoft Business Framework) we like to do an app week with each milestone, where everyone on the team builds some sort of end-to-end application, using as much of the toolchain as possible. This sort of testing really makes the employees better (we're usually pretty compartmentalized on our areas of functionality ownership). It also lets unrelated parties take a look at pieces of the product they don't own (and so don't have preconceived notions about). Finally, it lets us simulate the end-to-end customer experience on our product stack. If we can build the sort of apps a customer might build with our tools, then the tools are probably alright. Where we run into problems, we know the tools need help.
Bug bashes and app weeks happen perhaps 1-2 weeks per milestone (which is on the order of 2 months). They are a small part of our testing in terms of time, effort, and results. It's still important to do, but it is not the _focus_ of QA at Microsoft.
My opinions are my own, and do not necessarily represent those of my employer.
Well, that would really be a problem, but Theo's personality (which I think might have its own charms) doesn't necessarily get in the way of development. Just think of the huge contributions OpenBSD has made. Common Address Redundancy Protocol (CARP), for instance. Or their excellent firewall, pf (now present in all BSDs). Not to mention OpenSSH. And besides these standalone or highly portable applications, they released a secure and stable OS. Not 'just' a kernel. They write their own libc. They maintain a lot of software in their base system. Their Apache 1.3.x can almost be considered a fork, with their security/stability related patchset. Which comes down to my main point: the problem is not lack of resources, monetary or otherwise.
Currently there are ~100 developers paid fulltime just to work on the kernel (at various organizations). There are none in FreeBSD. There are perhaps a dozen devs whose employers let them work on FreeBSD part-time, or there are various works that are sponsored by companies (pair network comes to mind) from time-to-time. But all in all, FreeBSD, which writes its own kernel, its own C library, and generally speaking maintains an OS (userland apps like their package management and ports system for instance, burncd - the native cd burning app of freebsd, etc.), does that with 1/50 of the resources Linux & co. have just to develop the kernel.
This is not about linux vs. freebsd btw. I chose to use the latter, you chose the former, I really don't care, and I'm not willing to engage in yet another linux vs. bsd flamefest. You can argue endlessly about why linux is better, and I can do the same about FreeBSD, but I think we can agree on one point: either way, neither is that much better (let's cut down that figure to 10x - you can't possibly claim that linux is 10x better or something). In other words, my point is that it is not about (monetary) resources. It is a problem of organization imho. Less frequent releases, more API/ABI stability, and a controlled release engineering process might be a solution. Perhaps a branch split like the one done during 2.5.x (current 2.6) development. Pronounce the current 2.6.x branch STABLE, meaning introduce a POLA (principle of least astonishment in freebsd) and forbid API/ABI changes, then continue development in a new 2.7 branch at the current pace.
I don't mean to imply that there is no release engineering in linux kernel development whatsoever. But somehow FreeBSD's (and I assume the other BSDs' as well) release engineering seems to me a lot more transparent. Click the first few links at the top of this page to see what I mean by "controlled release engineering process."
Actually, I'd say that giving proper credit and public recognition for bug reports is good enough for most of the end-users. Case in point (interestingly enough, from LKML and by Andrew Morton himself):
Getting an answer like that should lift anyone's spirits. Not only has the bug been fixed, it was also recorded for posterity that a certain user discovered it and helped as best he could in fixing it. And to top it all off, the reporter was given honest praise and a thank you. The last part alone is usually enough for most users, to see that the developers actually care.
As for resumes? If you have a verifiable record of reporting bugs and helping to test their fixes, you should be able to use that to your advantage in a CV or at least in an interview. If nothing more, it shows that you can communicate with different kinds of people and have enough technical ability to follow through with their requests for further details. You might even have gotten a better product for yourself to use.
There is no such thing as good luck. There is only misfortune and its occasional absence.
I believe there are real alternatives to Linux, as Linux is just a kernel. What we use and are comfortable with, the numerous programs and utilities that run in the OS space, is the GPLed GNU software, which can run on Linux alternatives too, like FreeBSD, MacOS, OpenBSD, NetBSD and so on.
For the diehard fans of GPLed software, there is the GNU Hurd, which can be embraced instead of the Linux kernel. And the end user will never know the difference. This scenario of lack of testing will not occur for Hurd because it is a purely non-profit venture where even the developers do not rely on the project for their livelihood. For them, it is a pure hobby and a pleasure to work on the project, and that is incentive enough for them to put in man hours bettering the software.
Linux Help
for all things on Linux
The issue may have already been covered on Slashdot. At any rate, these were the findings of the study "Maintainability of the Linux Kernel" http://www.vuse.vanderbilt.edu/~srs/preprints/linux.longitudinal.preprint.pdf at Vanderbilt University, Nashville:
"We have examined 365 versions of Linux. For every version, we counted the number of
instances of common (global) coupling between each of the 17 kernel modules and all the other
modules in that version of Linux. We found that the number of instances of common coupling
grows exponentially with version number. This result is significant at the 99.99% level, and no
additional variables are needed to explain this increase. On the other hand, the number of lines
of code in each kernel modules grows only linearly with version number. We conclude that,
unless Linux is restructured with a bare minimum of common coupling, the dependencies
induced by common coupling will, at some future date, make Linux exceedingly hard to
maintain without inducing regression faults."
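To see what "exponential versus linear" means in practice, one rough check is whether log(y) plotted against version number is closer to a straight line than y itself. The Python sketch below is not the study's method, and the two series in it are synthetic numbers made up purely for illustration; `r_squared` and `looks_exponential` are names invented here.

```python
import math


def r_squared(xs, ys):
    """R^2 of the least-squares straight line through the points (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def looks_exponential(xs, ys):
    """True if log(y) is fit better by a straight line than y itself."""
    log_ys = [math.log(y) for y in ys]
    return r_squared(xs, log_ys) > r_squared(xs, ys)


# Synthetic, made-up series: NOT data from the Vanderbilt study.
versions = list(range(1, 41))
coupling = [5.0 * math.exp(0.1 * v) for v in versions]   # exponential growth
lines_of_code = [1000.0 + 250.0 * v for v in versions]   # linear growth

if __name__ == "__main__":
    print("coupling looks exponential:", looks_exponential(versions, coupling))
    print("code size looks exponential:", looks_exponential(versions, lines_of_code))
```

With these invented series the check separates the two cases cleanly. Real measurements are noisy, so a proper significance test like the one the quoted abstract refers to would be needed before drawing any conclusion.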
This article is about kernel development. While I appreciate the development being done to make the kernel faster/better/cheaper (well, it doesn't get any cheaper), it's already a Pretty Damn Good kernel. It sounds to me like the most crucial thing would be to solidify it and test the bejeezus out of it, then largely freeze it, because that's not where the problems are.
When people complain about MS Windows, they're not (usually) complaining about the kernel. They're talking about all of the stuff built on top of it: window manager, IE, networking, configuration. If the Linux kernel is receiving too little testing to be stable, what about the millions of lines of code that go into X windows, Gnome, CUPS (as mentioned the other day), etc.?
If MS didn't have to make kernel changes to better support security, I suspect they wouldn't be touching it at all. BSODs are still more common than they should be, but most users find them extremely rare, and the kernel is Fast Enough relative to the work that needs to be done. The improvements in Longhorn are largely about changes above the kernel, especially in its spiffy interface.
While I'm grateful to Linus and all of the other developers for the kernel improvements, and while Open Source means never being told what to work on, kernel improvements other than stability are probably a terrible use of manpower. The kernel is a tiny fraction of the lines of code that go into a Linux distro. They are basic, and need to be rock-solid, but while performance improvements there benefit everybody, they don't benefit you at all if X, or KDE, or Konqueror, or any of the hundreds of other higher-level apps crash.
Read that, you might learn a thing or two.
Linus has always felt that his main role was to reject patches. If you take just anything, people won't refine patches to the point where they maintain or improve the overall quality of the code. Andrew and Linus essentially do a good cop/bad cop routine on patches.
Of course, he's essentially been on vacation from Linux work, developing git. I'd guess that writing his own thing has made him feel a lot better about the BitKeeper mess. He certainly seemed to be having fun coming up with brilliant solutions to problems, rather than the current kernel situation of endless refinement.
As for testing, the article is misinterpreting its own quotes. There's no lack of testing, and Andrew didn't say there was. There is a lack of reporting of test results and a lack of credit to people who would provide them. There is a lack of communication infrastructure for getting bug reports to people who might be able to fix them, to other people who may or may not be seeing the same problem (and can provide more details on when the bugs are triggered).
Of course, slowing down releases would make testing more difficult, because people don't test kernels that aren't released, and testing fewer releases just means that more people report the same bugs, because the fixes for those bugs are held up waiting for the next release.
The lack of properly stable releases should be fixed by the 2.6.x.y process; now an effective reporting process is needed to help the maintainers find out about bugs people hit, and determine whether correctness fixes actually deal with cases that happen in practice.
"Currently there are ~100 developers [paid] fulltime just to work on the kernel (at various organizations)."
I would be shocked if the number is that small.
"There are none in FreeBSD. There are perhaps a dozen devs whose employers let them work on FreeBSD part-time, or there are various works that are sponsored by companies (pair network comes to mind) from time-to-time."
That's a shame, but ok...
"[FreeBSD is released] with 1/50 of the resource[s of] Linux [...]"
Right, and so the fact that FreeBSD works well is quite impressive. The fact that it doesn't work at all on certain high-end platforms, obsolete platforms, lots of embedded platforms, etc., is also not shocking, nor does it make it a poor platform.
FreeBSD does what it can with the resources it has, and that's a good thing. Let's not try to compare them to Linux. Linux is Linux and BSD is BSD. They are excellent tools for different jobs.