Open Source Programmers Stink At Error Handling
Mark Cappel writes: "LinuxWorld columnist Nick Petreley has a few choice words for for the open source community in 'Open source programmers stink at error handling'. Do you think commercial software handles errors better?"
Things like checking pointers to see if they are NULL before using them. Simple basic things that could prevent errors.
Error handling doesn't just mean catching the error after its already happened. It also means being proactive about it before it happens.
A lot of programmers do not do that.
Visit the Arcade Restoration Workshop @ http://www.arcaderestoration.com
My textbook example:
It takes no argument, and only produces one line of output. Despite this apparent simplicity, I've been able to get each and every pwd that ships with a commercial Unix to dump core (almost always by executing in an exceedingly deep directory.)
The GNU shellutils version of pwd, on the other hand, has never dumped core on me.
I will admit, the fact that it took two decades for a non-crashable version of pwd to become available doesn't bode well for the many other vastly more complicated programs out there in any environment. But it does speak very highly of the GNU utilities in general, and I haven't even begun to praise the thousands of folks who have worked on making these tools quite portable!
The real problem, IMHO, is that nobody likes to do the intensive testing that is necessary to get a program to be truly robust. We do it here at IBM, and I promise you -- it's not something I would do if I weren't being paid to do it.
Taral
WARN_(accel)("msg null; should hang here to be win compatible\n");
-- WINE source code
which is why Open Source is required. So we can all see exactly where the code stinks and we can fix it. Too bad legacy development models don't provide such advantages. Wouldn't it be cool if you could just buy quality software?
------DO NOT WRITE BELOW THIS LINE------
And attitudes like that, ladies and gentlemen, are the reason why we're all going to be old and grey before Linux is accepted on the desktop.
And the men who hold high places must be the ones who start
To mold a new reality... closer to the heart
I was just re-re-reauthenticating my Cubase installation. The key CD is now scratched which hangs the authenticator forced a quite ungraceful reboot and corrupted my hard drive. (Perhaps a $150 upgrade will help. I'll never know.)
The last time I used Word a drive filled during a save operation and left me with just a mutilated copy of the original file. (I will not use it again.)
My HP PSC 750xi software informs me every morning that its controlling software was exploded and I should reboot the host computer. (I'll wait for the OS-X drivers. If they are still bad the PSC goes out the door.)
The most amazing part is that this state of affairs doesn't surprise me. If my refrigerator intermittently defrosted and melted icecream all over the kitchen I'd be ticked. If my car mysteriously dies at stop signs I get it fixed.
Programmers have managed to beat down everyone's expectations to the point where half-assed is pretty good.
The only way I see to fix it is for consumers to refuse to buy flawed products, or legislators to pass laws allowing redress for flawed products.
I don't think either is likely.
I now use OSS for my mission critical work and fix what needs it.
Nick Petreley is a moron. Intelligent people don't make blanket statements like "Open source programmers stink at error handling." Next thing you know, he'll be telling you "Closed source programmers use more descriptive variables." How the hell does he know?
Programming traits - just like preferences for pizza toppings, frequency in bathing and type of pr0n - vary from programmer to programmer. Some implement proper error handling, others could care less. It doesn't matter whether they're working on an open or closed source project. If the open-source programmers all traded places with the closed-source programmers, you'd have the same ratios of proper vs. improper error handling (although the traffic from open-source-programmers.com to goatse.cx would probably spike).
-Ryan, with the unoriginal sig
It seems a lot of open source programs do in fact have little error handling. Most open source programs seem to focus on functionality, rather than usability. It gets you from point A to point B, and doesn't give you much help in between. If the user is not a master programmer, any errors usually end up being cryptic and nonsensical. It seems that a lot of commercial software has decent error recovery, and prevents user error effectively. Now I know there are many exceptions, but I think it simply has to do with the fact that commercial programmers get paid to do their work, and competition forces companies to put out products that are competitively usable.
And I would be careful about holding up Tomcat as an (open source) triumph. It's had some major bugs all through the 3.x timeframe, and its team includes at least a few daytime profressional "closed source" programmers (there's no correlation between the two, by the way).
The only certainty is entropy.
Bullshit.
Sure, an application error in a Unix derived system is much less likely to bring down the whole system. But that's no excuse for not dealing with error conditions correctly.
Also "errors" don't just occur from bad code. Running Linux gives you no protection against drive failures, network flakiness, or plain old user error.
Sure, if the error messages is misleading you can look in the source to find out what actually caused the error. Heck, even if it SEGVs you can compile it with debugging symbols and let GDB tell you what line it's failing at.
However, the open source community will NEVER attract mainstream users with that kind of attitude. Furthermore, even hardcore geeks should have better things to do fix up crud in supposedly release quality code. Hey, it's one thing if I'm working on something clearly under development, but it's nice to be able to get stable stuff to.
That said, I don't find open source to be any worse than the commercial stuff I've worked with. With, say, Microsoft stuff, it is just much harder to distinguish bad error handling code from bad code even when no error conditions are encountered.
how is a programmer expected to deal with the CD being scratched? Does your car still work if the transmission is damaged or half the engine has been riddled with bullet holes?
Again, a very unexpected and unnatural scenario. How well do cars function when they run out of fuel?
But how well would your refrigerator react if you treated it shoddily such as by leaving it outdoors intermuitently or diconnecting and reconnecting the power several times a day?
Now, I'm not trying to excuse sloppy software development but the fact of the matter is that software is constantly expected to work perfectly under situations completely outside its specifications yet we don't expect this from other items or appliances that we use.
First let me say that I am a Linux user and an open source advovate.
Now let me compare this to a judge I once met, who said that men have more tickets in general, but women always follow too close.
This is interesting, but if we further evaluate, one could conclude that women are just as bad (equally so), but perhaps people were lighter on them along the way. A police officer might have let her off, and so forth (this isn't to sound mysogynist of course, but I know women who get let off all of the time).
Instead, following too close is an easy prelude to... an accident. After all, when your bumpers are crushed together, you're too close.
Now think of error handling. "Open Souce Software handles errors poorly," is another way of saying that it too crashes a lot. Perhaps other people get caught for other things, but we only rag on open source when it crashes.
This isn't to say ALL open source software though.... but lets be perfectly honest. Programming is a difficult profession that a lot of people think they can just pick up. How many people would volunteer to do surgery without med school because they read a book on the subject? How many people get offended when you flash some important programming credentials in front of them that they don't have?
The trick is sifting the wheat from the chaff. Sure, a 14 year old with a little ambition can whip up a pretty impressive looking windowed program in X... but he doesn't have the sophistication of a well educated programmer... generally. There are plenty of good programmers and bad programmers in open source. The key is to know whats good and whats bad. If you can't figure that out, then buy a distro made by people who do.
I agree with you that software in general is a lot more complex, and used in a lot more unexpected ways, than something like a car.
OTOH, there is such a thing called graceful degradation -- that is, if you push the limits of the software, it shouldn't just suddenly barf and die on you, but degrade gracefully. Too much code I've seen (both open and non-open source) assumes too much -- and dies badly when the assumptions fail.
It is possible and not overly difficult to design software such that it degrades gracefully. Sadly to say, sloppy programming (programmers), deadline pressure, or disinterest in handling error conditions, dominate the world of software. Not many would put in the extra work to make a program degrade gracefully, because it doesn't have very visible effect -- until things start to fail. And too many programmers have this "test only the cases that work" syndrome.
Poll Mastah
News flash: Technology pundit seemingly insults open source, Slashdot up in arms. None of them actually read the article. Story at 11.
The article does not say "open source doesn't handle errors as well as closed source". What the article does say is "like most commercial software developers, many open source programmers are just plain lazy about proper error handling. But we're supposed to be better than that...".
I don't see a problem with this statement. The fact is, most open-source software sucks donkey balls. Petreley is merely saying it's time to put your money where your mouth is -- if you want open source to be considered better than closed source software, it better stop being so danged flaky.
ZFS: because love is never having to say fsck
I'm not familiar with the rocket you describe, but yes, it is a superior error-handling philosophy. Imagine if there was an unchecked error, and the rocket, instead of detonating, landed in civilian housing?
Why would you assume the rocket was intentionally detonated by the computer? Its computers went down and it went completely out of control. It was only blown up after it broke apart because it happened to go into a spin. There is no upside to this computer failure.
You call blowing up a commercial satellite launch vehicle non-destructive? If this error was ignored the rocket would not have been affected by it, it was an utterly irrelevant mathematical value overflow error in a program that only did anything before launch.
This program became destructive because of the "error management." In particular, the error management philosophy that halting a suspicious system is always safer than allowing it to run.
The point you seem to have missed is that halting the program is often more destructive than ignoring the error. Data loss, control loss, vital services suspended, etc.
That's like saying there's no reason to assume knowing about a bug is better than just allowing a program to go on its merry way. Uncaught bugs are the cause of 99% of the security holes out there. It's always better to know when there is a problem.
I'm sure the European Space Agency found it worth every penny of the estimated half-billion dollars lost to find this otherwise irrelevant bug. After all, it's always better to know, whatever the cost of halting the system, right?
Want to improve as a programming in terms of how you handle errors? Be assigned to 7/24/365 support of your product. When it tips over, deal with the irate user. When its most minor function doesn't work, get paged out of bed at 3 am. Have it do something important with lives at stake and then just _try_ to walk away from a half-fixed problem or say SEP.
Once you've been on this hook, you look direly askance at poorly documented code (even code you wrote three years ago), code that doesn't do required error checking, needlessly convoluted code, and code which isn't easy to read at 3 am when the live system takes a hemorrage repeatedly and someone might not get a 911 call through...
Once you've had to put yourself in the pressure cooker some clients have to deal with, you develop either a) sympathy for their situation or b) a great desire to never ever ever hear from them again. That's a good motivation to quit or to do the damn thing right in the first place.
Too many coders are allowed to write crap, churn it out as a release, and then not worry about support, either doing none or passing it off to some other poor bugger. If core developers got stuck supporting what they wrote and their own laziness or poor planning came back to bite them on the ass, they'd learn rather rapidly to improve their code quality and handle the errors in a robust manner the first time!
Been there, learnt the lessons, DESIGNED the T-shirt....
Tomb.
-- Mal: "Well they tell you: never hit a man with a closed fist. But it is, on occasion, hilarious."
Anything beyond that -- for the programmer -- is simply fluff and busywork to impress someone.
Get off my lawn.
The user should not see errors unless they want to. I agree with sending errors to STDOUT. If the user wants to see the errors then they can either start the app in an XTERM so they can see the errors or switch virt terms over to VT1 and see the STDOUT output.
Also, I use error reporting to a logfile rather than alarming the user. Most applications should be able to survive the average error. Those applications should prompt the user for proper input - even to the point of placing the cursor in the proper field. Each field should be intelligent and be able to validate it's own input data.
Those error logs I spoke of should be used by the programmer to debug his/her application - don't alarm the user ok?
Codifex Maximus ~ In search of... a shorter sig.
The main difference between a great systems administrator and a technically competent sysadmin is paranoia.
A great sysadmin would cut out their own heart before operating without known good backups. A great sysadmin would chew their own arm off before putting something into production without testing it first in a development environment. A great sysadmin *always* has a backout plan.
And how does a lowly admin reach this amazing level of greatness, you ask?
Admins get paranoid after making hideous, terrible mistakes that immediately result in Bad Things Happening.
I have personally: killed the email server for 2 days...shut down distribution for the world's largest distributor of widgets (every Thursday for 3 weeks)...destroyed all connectivity (voice and data) to the world for 12 hours...hosed the upgrade on a 700GB Oracle database (and our backups were no good). And any semi-experienced administrator will have, at minimum, two stories that are at least this bad (like my friend who shut down trading at Fidelity for a day).
And for every one one of these instances, I immediately felt the wrath of: my manager, my manager's manager, other people's managers, other people who were affected, stray people wandering by my cube who weren't affected...I also became a part of the "mythical sysadmin storybook"--"I once worked with this guy, and (you won't believe this) he..."
I submit the hypothesis that: generally, most developers are not subject to this type of immediate and extremely negative form of feedback for their mistakes. Therefore it takes a developer a long time to develop an aversion reflex that conditions them to do "the right thing -- error handling, code documentation" instead of doing "the easy, interesting, enjoyable and sexy thing -- making spiffy algorithms, writing tight code".
Drifting into another analogy, error handling is like code docmentation. Why do most developers get good (and a little obsessive) about documenting code? Becuase they finally spent some years trying to maintain someone else's tight, sexy code that is virtually incomprehensible.
So, my point is, developers take a long time to viscerally learn the need for good error handling by repeatedly getting whacked on the head for lack of error handling. It's like evolution in action.
Without even having read the article (but I've read some of his previous stuff) I'm sure that Petrelly didn't base his statements on actually looking at code. No doubt he has some examples of errors that are no handled from the user perspective.
But that has nothing to do with the programmers. The difference between Open Source and commercial software here is simply that companies can afford dedicated testing staff... QA departments. Most of the errors that an idiot like Petrelly will be able to find will be caught by the QA department before release. Unfunded Open Source projects can't afford that kind of QA... and with time, widely used Open Source packages tend to become higher quality than much proprietary commercial software (the thousand eyeballs effect). But early releases do tend to have errors that a QA department at a company would have caught before release. That has nothing to do with the quality of the programming.
Two things I've learned are that (a) every "if" has an implied "else" clause that often represents an unconsidered error, and (b) those else cases, and other unexpected situations shouldn't be logged, they should be "asserted" in a way that makes the program stop dead, now. That forces you to fix them when they happen. The business the author cites of getting all these messages is truly evil, as it really helps no one, neither the programmer nor the end-user.
-dB
"It if was easy to do, we'd find someone cheaper than you to do it."