Lack of Testing Threatening the Stability of Linux

Loose women ??? by noisymime · 2005-04-22 01:46 · Score: 5, Informative

Coming from someone who was at that talk, he specifically said NOT to give money to testers. His words were actually 'give them credit, fame and loose women'.

This drew laughs from the audience.

Re:Contrapositive by jejones · 2005-04-22 01:47 · Score: 5, Informative

That's the converse. The contrapositive is, after a quick application of de Morgan's law:

"If it doesn't come to tears, then you didn't pick a good technology or your developers are sane."

Today's grammar lesson by Anonymous Coward · 2005-04-22 01:48 · Score: 1, Informative

For the slashdot editors :

"may ultimately threaten" != "threatening"

But, hey, the second one is more dramatic looking, so by all means run the scary one.

Truly, you are now ready to join the mainstream media.

Ngrhrrk. Use Aegis. by Anonymous Coward · 2005-04-22 01:48 · Score: 0, Informative

Decentralised SCM. Testing framework as part of the release cycle (now easy for most kernel dev with UML or Xen, contrary to stable-release Aegis manual...)

Do we need something like what Sun do? by Anonymous Coward · 2005-04-22 01:48 · Score: 3, Informative

Osnews had an article a while ago about some of the testing Sun do on Solaris - http://www.osnews.com/story.php?news_id=10178

Brief Answer: No. by ciroknight · 2005-04-22 01:54 · Score: 4, Informative

Long answer; kinda. You can use core dumps and system logs to interpret what's going on, but you can never really know for sure. Besides, the kind of errors that are in the kernel are the kinds of errors that really don't return error codes; they're the kind that crash the computer and make you reboot.

Microsoft's method is for some of the higher up software, and so is Apple's. If there's a bug in the kernel it's very unlikely that their code will catch it. Or at least that's been my experience.

If the problem is that Linux is so buggy, we just need to run it on a bunch more machines, and start randomly poking it as hard as we can until we break something. Once we've broken it, do it again to make sure it's not hardware, and then go to work fixing it. Good old brute force repairs.
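
A rough sketch of that poke-until-it-breaks loop, in Python for brevity. Everything here is made up for illustration: `buggy_parse` is a stand-in for whatever is being tested, and `poke` and `find_and_confirm` are hypothetical helper names. Real kernel poking would need a very different harness. The one idea worth keeping is the seeded random generator, which lets a failing run be replayed exactly, so a failure that shows up again on the second pass is less likely to be flaky hardware.

```python
import random


def buggy_parse(data: bytes) -> int:
    """Hypothetical code under test: chokes on inputs that start with 0xFF."""
    if data and data[0] == 0xFF:
        raise ValueError("unhandled header byte")
    return len(data)


def poke(target, seed: int, rounds: int = 1000):
    """Feed random inputs to target; return the first input that breaks it, else None."""
    rng = random.Random(seed)  # seeded, so the same seed replays the same inputs
    for _ in range(rounds):
        length = rng.randrange(1, 8)
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            target(data)
        except Exception:
            return data
    return None


def find_and_confirm(target, seeds):
    """Return (seed, input) for the first failure that also fails on a replay."""
    for seed in seeds:
        first = poke(target, seed)
        # Run it again with the same seed; a repeat failure points at the code,
        # not at a one-off glitch.
        if first is not None and poke(target, seed) == first:
            return seed, first
    return None


if __name__ == "__main__":
    print(find_and_confirm(buggy_parse, range(20)))
```

Once a seed and input are in hand, they can go straight into the bug report so someone else can replay the same run.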

--
"Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush

Re:Vacation for Linus...? by mindstrm · 2005-04-22 01:56 · Score: 3, Informative

Why does it have to be an attitude? Linus has always maintained that it's his kernel tree, and that if you don't like the way he manages it, you are more than free to keep your own tree. The kernel is GPL, after all.
You won't hear Linus complaining if someone forks his kernel and attention shifts away. Linus will continue to integrate the things he wants to integrate.

Re:Contrapositive by Anonymous Coward · 2005-04-22 02:00 · Score: 1, Informative

Again, that's the converse. Converses are not guaranteed to be true, like contrapositives are. The original statement doesn't say when it won't come to tears, it only shows one situation where it will. If I said "If you jump off a cliff, you will die" would you say that if someone died, they jumped off a cliff?

Re:Moderators by DaHat · 2005-04-22 02:03 · Score: 1, Informative

Convicted Monopolist? There is no such thing. Microsoft was convicted of anti-trust violations, quite different. Remember, being/having a monopoly isn't a crime.

--
Help Brendan pay off his student loans

QA isn't sexy by ChaoticCoyote · 2005-04-22 02:03 · Score: 5, Informative

Morton is correct.

Even at commercial companies, QA isn't a "sexy" task. People would rather bang out code than write testing harnesses and run benchmarks.

Also, free software is driven by programmers, who tend to hate QA. Like any artist or craftsman, a programmer hates having their work critiqued. They spent hundreds (or thousands) of hours on a program, only to have someone nit-pick the details and point out the flaws. But for art, "quality" is a subjective quality -- and with software, quality and reliability are tangible quantities that can be measured.

My Acovea project demonstrated the problem. Users of GCC love Acovea; many developers of GCC, on the other hand, seem to treat it as an annoying distraction. Acovea identified more than a dozen errors (some quite serious) in the GCC 3.4/4.0 compilers -- and yes, I did report them to bugzilla. Only a couple of GCC's maintainers have said "thanks."

Not that the cool reception deters me. I have a new version of Acovea in the wings, and will be unleashing it on GCC 4.x Real Soon Now. ;)

As a consultant, I've been paid to perform QA work on commercial software packages -- but only one company, and a big one at that, has ever contracted me to QA a free software project.

Right now, free software is about many things, but quality is not job 1. And that needs to change.

--
All about me

Re:Contrapositive by Anonymous Coward · 2005-04-22 02:08 · Score: 1, Informative

p -> q
converse q -> p
contrapositive -q -> -p
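
For anyone who wants to check this rather than argue it, a brute-force truth table settles the matter. This is a small sketch in Python; the variable names `g` (good technology), `i` (insane developers) and `t` (comes to tears) are labels chosen here for the quoted statement "if you pick a good technology and the developers are insane, it's all going to come to tears".

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: a -> b."""
    return (not a) or b


def equivalent(f, g, nvars: int) -> bool:
    """True if f and g agree on every assignment of nvars boolean variables."""
    return all(f(*v) == g(*v) for v in product([False, True], repeat=nvars))


# g = good technology, i = insane developers, t = it comes to tears
original = lambda g, i, t: implies(g and i, t)
# Negate both sides and swap them, then apply de Morgan to not (g and i)
contrapositive = lambda g, i, t: implies(not t, (not g) or (not i))
# Just swap the two sides
converse = lambda g, i, t: implies(t, g and i)

print(equivalent(original, contrapositive, 3))  # True
print(equivalent(original, converse, 3))        # False
```

The contrapositive agrees with the original on all eight rows. The converse fails, for example, when there are tears but the technology was bad and the developers were sane.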

Re:Vacation for Linus...? by bfields · 2005-04-22 02:14 · Score: 4, Informative

I don't think that is the issue at hand. The issue is the way he is behaving in public. The flames, the "fuck off" attitude towards people working on the kernel, etc...
The kernel did not get where it is with his current attitude.

Oh, yes it did--go spend a few hours reading the lkml archives. He's always flamed people, and always been happy to drop patches that he thought weren't right for one reason or another. There's no sudden change here.... (But I wouldn't call it a "fuck off" attitude. Even when he flames someone he rarely seems to actually hold a grudge, or be unwilling to work with anyone.)

--Bruce Fields

Uh, no. by bmajik · 2005-04-22 02:17 · Score: 5, Informative

Software testing (usually) isn't monkeys pounding on keyboards until the box BSOD's.

It is difficult to test software without adequately understanding what it is supposed to do. Varying the underlying machine type is almost irrelevant for binary distributed software unless you're testing an operating system kernel or looking for race conditions in software (which is really just a stab in the dark).

How are you going to have 3rd party people debug software they know nothing about?

Where users help find bugs is by using the software. It honestly takes a certain mentality to be an effective software breaker, and it's not very common. It takes something else entirely to be a software tester; you've got to be a good developer (because software testing is about automation these days unless you're insane) but you've got to not get sucked into the developer's way of thinking.

I assure you - letting normal users play with software doesn't clean it up. We can show that this is true in the following way:

- more users use Microsoft software for more hours a day than any other software in the world
- slashdotters say Microsoft software is the buggiest software made

Clearly, if users using software was sufficient to find all the bugs, MS stuff would be bug free, based on its frequency of use alone. I know this isn't the case, because I'm a software tester at Microsoft.

(The appropriate response is "well then, stop posting and get back to work; you're clearly not done yet!" :)

W.r.t. Linux kernel testing: this is something that's always amazed me - Linux works surprisingly well for something with so little formal testing. On the other hand, when there are edge case problems my experience has been that nobody is much interested in fixing them. One example I had was at a consulting gig. The client was looking to move his web hosting business onto Linux boxes if he could get more sites per box than he could on Windows. He had a problem where his Linux server would start dying after a few days. I started to look into it and the box would basically panic() in low memory situations. I asked Alan Cox about it (via IRC) and the response was "buy more memory". Nice.

Another sore point with me growing up was xserver crashes. The Xserver was 99% reliable, but then you'd get some random crash and lose everything you were doing, and you knew there was no real way of getting it fixed or investigating it.. you just had to hope it magically got better somehow.. maybe when you switched hardware or something.

Then there's the just plain lack of testing of some F/OSS projects in general. When I was in college I had NeXT, Sun, and SGI boxes in my dorm room (but no Linux :). I remember downloading the Gaim tarball (this was loooong ago) and seeing about getting it built on my SGI machine. IIRC, there were some makefile / #include problems getting it to even build, and once it was built there were some other issues with its runtime. Ultimately I submitted a patch to the Gaim folks that more or less "enabled" Gaim on IRIX. There is no way anybody had ever used Gaim on an SGI without making these fixes, so it seems reasonable to suggest the authors had never tried it before. This lack of a platform test matrix is pretty common amongst smaller F/OSS apps; even when they say "works on *nix" they mean "works on the distribution of Linux I run at home".

Another baby patch I submitted was for the OpenBSD kernel, this time for the wdc driver. Back when UDMA 100 was newish, I bought 2 UDMA 100 disks a month or so apart, so they were different sizes and different vendors, but on the same bus. The UDMA rollback code in OpenBSD would drop the DMA level from 5 (UDMA100) to 2 (something much slower, I don't remember what) after a certain number of DMA errors. This obviously sucked since you can run UDMA devices at different speeds on the same bus, and you can also fall back to UDMA66 and UDMA33, both of which are better than mode 2.
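
To make the complaint concrete, here is a toy model of the behavior being argued for: each drive keeps its own error count and steps down one notch at a time instead of the whole bus jumping to the slowest setting. This is not the OpenBSD wdc code and not the patch itself (neither is shown here). The mode labels follow the wording above, and `ERROR_LIMIT`, `Drive` and `record_dma_error` are invented for the sketch.

```python
# Hypothetical mode ladder, fastest first. Labels follow the post's wording,
# not any real driver's constants.
MODES = ["UDMA100", "UDMA66", "UDMA33", "slow-fallback"]
ERROR_LIMIT = 3  # assumed number of DMA errors tolerated before stepping down


class Drive:
    """Tracks DMA errors for one drive, independent of others on the bus."""

    def __init__(self, name: str):
        self.name = name
        self.level = 0   # index into MODES
        self.errors = 0  # errors seen at the current level

    @property
    def mode(self) -> str:
        return MODES[self.level]

    def record_dma_error(self) -> str:
        """Count one error; step down a single level once the limit is hit."""
        self.errors += 1
        if self.errors >= ERROR_LIMIT and self.level < len(MODES) - 1:
            self.level += 1
            self.errors = 0
        return self.mode


if __name__ == "__main__":
    flaky, healthy = Drive("wd0"), Drive("wd1")
    for _ in range(ERROR_LIMIT):
        flaky.record_dma_error()
    print(flaky.mode, healthy.mode)  # UDMA66 UDMA100
```

The point of the per-drive counter is that one flaky disk does not drag its neighbor down with it.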

--
My opinions are my own, and do not necessarily represent those of my employer.

Somewhat alarmist headline by xixax · 2005-04-22 02:38 · Score: 3, Informative

From TFA:
"A lack of commitment to testing by the Linux community may ultimately threaten the stability..."

The content of the article is much better than the headlines and excerpts being quoted. I was there and felt that what he was getting at was that we need to start thinking about updating QA procedures. The ratio of bugs to features is decreasing, but the rate of features is (maybe?) growing that much faster. The point of his talk was to outline a number of options for improving QA; there are issues, but the sky certainly isn't falling either. It was an excellent follow on from Tridge's keynote the previous day on how to do quality system programming (overshadowed by his very brief coverage of the BK thing).

Xix.

--
"Everything is adjustable, provided you have the right tools"

Re:*sigh* by Anonymous Coward · 2005-04-22 03:29 · Score: 5, Informative

I was a Microsoft developer for about 6 years, and this guy gets it exactly right.

Most of the really first-class groups at Microsoft (Windows, SQL Server, Developer Tools, lots more) have INCREDIBLY exacting test requirements, and extremely competent and thorough and demanding test teams. The open source community has done well, but it is nowhere near the professionalism and thoroughness of commercial software development. And it's precisely because the testers get *paid* to do the same damn test on every single build -- something open source people won't do, because there's no glory in it.

Slashdotters will no doubt respond, "Well, if it's so good, then what about all those security bugs!" Which is a fair criticism. Commercial software development (such as Microsoft's high testing standards, and similar at Sun, Apple, etc.) only works when the testing priorities you start with are the right ones. For a long time, Microsoft's priorities were 1) features, 2) usability, 3) more features, 4) stability, 5) security, 6) even more features.

This has changed. Microsoft mid-level managers (dev managers, product unit managers, etc.) have internalized the idea that they are literally under attack, and that security must be a high priority from here on. I wish they had STARTED with that as a priority, but at least they get the message now.

But, seriously, the parent poster is right on the money. Microsoft has AMAZING testers and test/developers. The hardware and software matrix that they run code under has to be seen to be appreciated.

And, again. This is not intended as a slight at all to open source development or testing. It's just *very* different.

True... by jd · 2005-04-22 03:33 · Score: 3, Informative

However, there are aids. The Linux Test Project doesn't do much real testing, from what I hear, other than some basic standards stuff, but it should be simple enough to bolt on some real heavy-duty code testing routines.

Then, there's the mysterious Stanford Code Validator, used to great effect for a while. I feel certain that a few sweeps of that would uncover many of the more troublesome problems.

For those without SCV (99.9999% of the planet), there are some Open Source code validators out there. It should be possible, at the very least, to use those to identify the more blatant problems.

If you're not sure about using code validators, then it's simple enough to write programs that hammer some section of the kernel. For example, if you have some large number of threads mallocing, filling and freeing random-sized blocks of memory, can you demonstrate memory leaks? How well does the VMM handle fragmented memory? What is the average performance like, as a function of the number of threads?
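
As a rough idea of what such a hammer could look like, here is a minimal sketch in Python. It is an assumption-laden stand-in: a serious version would be written in C and call malloc directly, whereas this one goes through Python's allocator and only indirectly leans on the kernel. The function name `hammer_memory` and all of its parameters are invented here.

```python
import random
import threading


def hammer_memory(threads=4, rounds=200, max_block=64 * 1024, seed=0):
    """Allocate, fill, verify and free random-sized blocks from several threads.

    Returns the total number of bytes that were allocated and verified.
    """
    totals = [0] * threads
    errors = []

    def worker(idx: int) -> None:
        rng = random.Random(seed + idx)  # per-thread generator, reproducible
        for _ in range(rounds):
            size = rng.randrange(1, max_block + 1)
            pattern = rng.randrange(256)
            block = bytearray([pattern]) * size  # allocate and fill in one go
            if block.count(pattern) != size:     # every byte should match
                errors.append((idx, size))
            totals[idx] += size
            del block                            # drop the reference to free it

    workers = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    if errors:
        raise RuntimeError(f"corrupted blocks: {errors}")
    return sum(totals)


if __name__ == "__main__":
    print(hammer_memory())
```

Wrapping the call in a timer and varying `threads` gives a crude performance curve; watching the process's resident memory across repeated runs gives a crude leak check.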

Likewise, you can write disk-hammering tools, ethernet tests, etc. For the network code, for example, what is the typical latency added by the various optional layers? Those interested in network QoS would undoubtedly find it valuable to know the penalties added by things like CBQ, WFQ, RED, etc. Those developing those parts of the code would likely find the numbers valuable, too.
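
A disk hammer can start out very small. The sketch below, in Python, writes a handful of random-sized files, forces each one out with fsync, then reads them back and compares digests. The name `hammer_disk` and its parameters are invented for illustration, and note the caveat that the read-back will often be served from the page cache rather than the platter, so this is a starting point and not a real media test.

```python
import hashlib
import os
import random
import tempfile


def hammer_disk(files=8, max_size=256 * 1024, seed=0, directory=None):
    """Write random files, sync them, read them back and compare digests.

    Returns the number of files whose contents did not survive the round trip.
    """
    rng = random.Random(seed)
    mismatches = 0
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        expected = {}
        for n in range(files):
            size = rng.randrange(1, max_size + 1)
            # Reproducible pseudo-random payload of exactly `size` bytes
            data = rng.getrandbits(size * 8).to_bytes(size, "little")
            path = os.path.join(tmp, f"block{n}.bin")
            with open(path, "wb") as fh:
                fh.write(data)
                fh.flush()
                os.fsync(fh.fileno())  # ask the OS to push it to the device
            expected[path] = hashlib.sha256(data).hexdigest()
        for path, digest in expected.items():
            with open(path, "rb") as fh:
                if hashlib.sha256(fh.read()).hexdigest() != digest:
                    mismatches += 1
    return mismatches


if __name__ == "__main__":
    print(hammer_disk())
```

Pointing `directory` at the filesystem of interest and raising `files` and `max_size` turns this into something that can run for hours on a spare machine.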

If you don't want to write code, but have a spare machine that isn't doing anything, then throw on a copy of Linux and run Linpack or the HPC Challenge software. (Both are listed on Freshmeat.) The tests will give kernel developers at least some useful information to work with.

If you'd rather not spend the time, but want to do something, map a kernel. There's software for turning any source tree into a circular map, showing the connections within the program. If we had a good set of maps, showing graphically the differences between kernel versions (eg: 2.6.1 through to 2.6.12-pre3) and between kernel variants (eg: standard tree, the -ac version and the -mm version), it would be possible to get a feel for where problems are likely. (Bugs are most likely in knotty code, overly-complex code, etc. Latency is most likely in over-simplified code.) You don't have to do anything, beyond fetch the program, run it over the kernels, and post the images produced somewhere.

None of this is difficult. Those bits that are time-consuming are often mostly time-consuming for the computer - the individual usually doesn't need to put in that much effort. None of this will fix everything, but all of it will be usable in some way to narrow down where the problems really lie.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)

quoted out of context by mathgenius · 2005-04-22 09:48 · Score: 2, Informative

I was there, and the quote was taken _absolutely_ out of context: 'If you pick a good technology and the developers are insane, it's all going to come to tears.' He was not referring to BK in this instance; he was in fact talking more generally about SCM systems, and how he had noticed that these projects tended to attract "insane" developers (the IDE drivers do this too).
This was all part of a larger, very insightful remark, saying that had Linus chosen a free SCM tool three years ago, we would now have a fantastic SCM in the free software world. In this instance, it is not so much the _tool_ that would need to be good, but that the _team_ behind the tool needs to be solid, responsive etc.

Simon.

Slashdot Mirror
