Open Source Programmers Stink At Error Handling
Mark Cappel writes: "LinuxWorld columnist Nick Petreley has a few choice words for the open source community in 'Open source programmers stink at error handling'. Do you think commercial software handles errors better?"
We really need this open source BSOD library
that would make our life more convenient and
our applications more commercial-like.
If programs would be read like poetry, most programmers would be Vogons.
Who spend days at a time at work (read: Stallman) without showers, removing the last 3 words provides a better description :o)
Things like checking pointers to see if they are NULL before using them. Simple basic things that could prevent errors.
Error handling doesn't just mean catching the error after it's already happened. It also means being proactive about it before it happens.
A lot of programmers do not do that.
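A minimal sketch of the kind of up-front check meant here, in C. The function name name_length is made up for illustration; the point is only that the pointer is tested before it is used, so a bad argument becomes a reportable error instead of a crash.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the length of name, or -1 if name is NULL.
 * Testing the pointer first turns a would-be crash into an error the
 * caller can see and handle. */
int name_length(const char *name)
{
    if (name == NULL) {
        fprintf(stderr, "name_length: name is NULL\n");
        return -1;
    }
    return (int)strlen(name);
}
```

A caller still has to look at the return value; the check only helps if the -1 is not ignored.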
Visit the Arcade Restoration Workshop @ http://www.arcaderestoration.com
What are these "errors" you speak of? Open source has no errors...
That's the error I'm getting. Could it possibly be slashdotted in only 3 minutes?
Too bad, I was hoping I could say something meaningful, or maybe even relevant...
Under capitalism man exploits man. Under communism it's the other way around.
it's a feature.
"I may not have morals, but I have standards."
Why does it seem like there are as many people in the "community" criticizing open source as there are supporting it?
Two Words: Apache and Tomcat
I'm a professional who works with the closed source equivalents all the time: Netscape iPlanet server, IIS and WebLogic.
Now: before you flame - I like working with WebLogic, but it is no better than Tomcat in my opinion (as far as error reporting goes). And IIS is a piece of crap! Not to mention Netscape's overly complicated UI that blasts every change you've ever made and is completely out of sync with the flat file configs.
Need I mention that Tomcat error logging is set up in an XML file that is easy to read, modify, and translate into a simple report for management (IT, that is)?
When was the last time Windows gave you a nice error.log when it blue-screened, or how about IIS on a buffer overflow?
I'm sick of the bashing of the free stuff out there. Sure, the fact that I can release one of my college projects as open source may mean that statistically there are more projects without good error reporting, but the real projects are pretty darn good.
My textbook example: pwd.
It takes no argument, and only produces one line of output. Despite this apparent simplicity, I've been able to get each and every pwd that ships with a commercial Unix to dump core (almost always by executing in an exceedingly deep directory.)
The GNU shellutils version of pwd, on the other hand, has never dumped core on me.
I will admit, the fact that it took two decades for a non-crashable version of pwd to become available doesn't bode well for the many other vastly more complicated programs out there in any environment. But it does speak very highly of the GNU utilities in general, and I haven't even begun to praise the thousands of folks who have worked on making these tools quite portable!
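For what it's worth, one common way to avoid the fixed-size-buffer trap in deep directories is to grow the buffer until getcwd() succeeds. The sketch below is an assumption about how such a crash can be avoided in general, not a description of what GNU pwd or any commercial pwd actually does; the helper name current_dir is invented.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns a malloc'd string holding the current
 * directory, or NULL on failure. The caller frees the result. */
char *current_dir(void)
{
    size_t size = 64;            /* deliberately small starting size */
    char *buf = NULL;

    for (;;) {
        char *bigger = realloc(buf, size);
        if (bigger == NULL) {    /* out of memory: give up cleanly */
            free(buf);
            return NULL;
        }
        buf = bigger;

        if (getcwd(buf, size) != NULL)
            return buf;          /* the path fit in the buffer */

        if (errno != ERANGE) {   /* a real error, not just "too small" */
            free(buf);
            return NULL;
        }
        size *= 2;               /* path longer than buffer: retry */
    }
}
```

Every failure path returns NULL instead of writing past a buffer, so the caller can print a message and exit with an error status.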
I've been coding for over 20 years and I've seen some beauties, and I'm sure others have as well. Like the guy who put about 500 lines of Java in one Try - Catch. I'd suggest they screen their contributors better. Use a carrot and very gentle stick approach and be certain to encourage coders to think "what could happen here and how should I handle it?" whenever writing.
A feeling of having made the same mistake before: Deja Foobar
The real problem, IMHO, is that nobody likes to do the intensive testing that is necessary to get a program to be truly robust. We do it here at IBM, and I promise you -- it's not something I would do if I weren't being paid to do it.
Taral
WARN_(accel)("msg null; should hang here to be win compatible\n");
-- WINE source code
As a professional programmer I adhere to a strict stylesheet which I think the Open Source community may appreciate a copy of:
main( arguments ){
try{
--code goes here--
}catch( exception ){
printout "I'm sorry to do that you need our $50k/year support plan. \n Thank you!"
}}
No need to thank me.
...is only as solid as the engineer behind it (and the design behind him/her). A poor design often results in a flaky system, difficult to implement and nearly impossible to predict. That, in turn, can result in very thin error handling. Whether or not a product is commercial has nothing to do with it. The only argument for that could possibly be that in many cases, more careful attention (in the form of testing and code reviews) is taken when a product is a revenue generator (or anything that will affect the perception of the quality of a company's engineering ability).
Ultimately, if the engineer (or team of engineers) is inexperienced, error handling will be weak and error recovery nearly nonexistent. However, a more senior engineer will generally start from error handling on up, making sure the code is robust before diving too deeply into business logic. The time taken for unit testing plays an especially large role here. The more time spent trying to break the code (negative test cases), the more likely you will have a system that has been revised throughout development to have rock-solid error handling/reporting/recovery.
[McP]KAAOS
It goes from God, to Jerry, to me.
which is why Open Source is required. So we can all see exactly where the code stinks and we can fix it. Too bad legacy development models don't provide such advantages. Wouldn't it be cool if you could just buy quality software?
------DO NOT WRITE BELOW THIS LINE------
And attitudes like that, ladies and gentlemen, are the reason why we're all going to be old and grey before Linux is accepted on the desktop.
And the men who hold high places must be the ones who start
To mold a new reality... closer to the heart
i think you are projecting some of your real life experiences on web browsers buddy ;)
Please help! I'm stuck inside my virtual reality headset!
I was just re-re-reauthenticating my Cubase installation. The key CD is now scratched, which hung the authenticator, forced a quite ungraceful reboot, and corrupted my hard drive. (Perhaps a $150 upgrade will help. I'll never know.)
The last time I used Word a drive filled during a save operation and left me with just a mutilated copy of the original file. (I will not use it again.)
My HP PSC 750xi software informs me every morning that its controlling software was exploded and I should reboot the host computer. (I'll wait for the OS-X drivers. If they are still bad the PSC goes out the door.)
The most amazing part is that this state of affairs doesn't surprise me. If my refrigerator intermittently defrosted and melted ice cream all over the kitchen I'd be ticked. If my car mysteriously dies at stop signs I get it fixed.
Programmers have managed to beat down everyone's expectations to the point where half-assed is pretty good.
The only way I see to fix it is for consumers to refuse to buy flawed products, or legislators to pass laws allowing redress for flawed products.
I don't think either is likely.
I now use OSS for my mission critical work and fix what needs it.
of handling errors that should never happen? You just double the size of your code, cause schedules to be missed, make maintenance more difficult and increase the probability of a grotesque coding error. I expected more macho stuff from the slashdot audience, not namby pamby whimpering! Sheesh! Welcome to the real world, get a thicker skin man!
8)
On a serious note : I've written commercial and non-commercial code. Sometimes I'm obsessive about completeness, sometimes I'm pragmatic. No point in generalizing about OSS vs. commercial.
I guess if /. killed the site, it should mirror it :)
Here is a select-n-middlemousebuttonclick(with my formatting):
Title: Open source programmers stink at error handling.
Outline: Commercial programmers stink at it too, but that's not the point. We should be better.
Summary: Why are we subjected to so many errors? Shouldn't open source be better at this than commercial software? Where are the obsessive-compulsive programmers? Plus, more reader PHP tips. (1,400 words)
Author: By Nicholas Petreley
Body: (LinuxWorld) -- Thanks to my very talented readers I've been able to start almost every recent column with a reader's PHP tip. I'm tempted to make it a regular feature, but with my luck the tips would stop rolling in the moment I made it official. So I want you to be aware that this week's tip is not part of any regular practice. It is purely coincidental that PHP tips appear in column after column. Now that I've jinx-proofed the column, I'll share the tip.
Reader Michael Anderson wrote in with an alternative to using arrays to pass database information to PHP functions. As you may recall from the column Even more stupid PHP tricks, you can retrieve the results of a query into an array and pass that array to a function this way:
<?PHP
$result = mysql_query("select name, address from customer where cid=1");
$CUST = mysql_fetch_array($result);
do_something($CUST);
function do_something($CUST) {
    echo $CUST["name"];
    echo $CUST["address"];
}
?>
Michael pointed out that you can also retrieve the data as an object and reference the fields as the object's properties. Here's the above example rewritten to use objects:
<?PHP
$result = mysql_query("select name, address from customer where cid=1");
$CUST = mysql_fetch_object($result);
do_something($CUST);
function do_something($CUST) {
    echo $CUST->name;
    echo $CUST->address;
}
?>
I can't help but agree with Michael that this is a preferable way to handle the data, but only because it feels more natural to me to point to an object property than to reference an element of an array using the string name or address. It's purely a personal preference, probably stemming from habits I learned using C++.
Subtitle: OCD programmers unite
Nothing could be a better segue into the topic I had planned for this week. I'm thinking about starting a group called OLUG, the Obsessive Linux User Group. Although I know enough about psychology to know I don't meet the qualifications of a person with full-fledged OCD (Obsessive-Compulsive Disorder), I confess that I went back and rewrote my PHP code to use objects instead of arrays even though there was no technical justification for doing so.
Certain things bring out the OCD in me. Warning messages, for example. It doesn't matter if my programs seem to work perfectly. If a compiler issues warnings when I compile my code, I feel compelled to fix the code to get rid of the warnings even if I know the code works fine. Likewise, if my program generates warnings or error messages at run time, I feel driven to look for the reasons and get rid of them.
Now I don't want you to get the wrong impression. My PHP and C++ code stand as testimony to the fact that my programming practices don't even come within light years of perfection. But just because I do not live up to the standards I am about to demand isn't going to stop me from demanding them. It's my right as a columnist. Those who can, do. Those who can't, write columns.
I'll be blunt. Open source programmers need to stop being so darned lazy about error handling. That obviously doesn't include all open source programmers. You know who you are.
If you want a demonstration of what I mean, start your favorite GUI-based open source applications from the command line of an X terminal instead of a menu or icon. In most cases this will cause the errors and warnings that the application generates to appear in the terminal window where you started it. (There are exceptions, depending on the application or the script that launches the application.)
Many of the applications I use on a daily basis generate anywhere from a few warnings or error messages to a few hundred. And I'm not just talking about the debug messages that programmers use to track what a program is doing. I mean warning messages about missing files, missing objects, null pointers, and worse.
These messages raise several questions. Doesn't anyone who works on these programs check for such things? Why do they go unfixed for so long? Are these problems something that should be of concern to users? Worse, what if these messages appear because of a problem with my installation or configuration, and not because the program hasn't been fully debugged? But even if it is my installation that is broken, shouldn't the application report the errors? Why do I have to start the application from a terminal window to see the messages?
Subtitle: Getting a handle on errors
At first I wondered if this was a problem that you would be more likely to find when developers use one graphical toolkit rather than another. But I see both good and bad error handling no matter which tools people use. For example, the GNOME/Gtk word processor AbiWord has been flawless lately. Not a single warning or error message appears in the console. It's possible that AbiWord simply isn't directing output to the console, but I'm guessing that it's simply a well-tested and well-behaved application.
On the other hand, GNOME itself has been a nightmare for me lately. At one point I got so frustrated that I deleted all the configuration files for all of GNOME and GTK applications in my home directory in disgust, determined never to use them again. When I regained my composure and restarted GNOME with the intent of finding the cause of the problems, the problems had already disappeared. Obviously one or more of my configuration files had been at fault. Which one, I may never know, because GNOME or some portion of it lacked the proper error handling that should have told me.
In this case I was lucky that the problems were so bad I lost my temper and deleted the configuration files. In most cases, the applications appear to function normally. Aside from being ignorant of any messages unless you start the application from a terminal, there's no way of knowing why the warnings exist, or if they are cause for concern. The warnings could be harmless, or they could mean the application will eventually crash, corrupt data, or worse.
Subtitle: Examples
Just so you know I'm not making this up, here are some samples of the console messages that appeared after just a couple of minutes of toying with various programs. By the way, did you know you can actually configure the Linux kernel from the KDE control panel? Bravo to whoever added this feature. Nevertheless, when I activate that portion of the control panel, I get the message:
QToolBar::QToolBar main window cannot be 0.
Is there supposed to be a toolbar that isn't displayed as a result? I may never know.
The e-mail client sylpheed generates this informative message after about a minute of use:
Sylpheed-CRITICAL **: file main.c: line 346 (get_queued_message_num): assertion `queue != NULL' failed.
The Ximian Evolution program generates tons of warnings, but most are repetitions. They begin with the following:
evolution-shell-WARNING **: Cannot activate Evolution component -- OAFIID:GNOME_Evolution_Calendar_ShellComponent
evolution-shell-WARNING **: e_folder_type_registry_get_icon_for_type() -- Unknown type `calendar'
evolution-shell-WARNING **: e_folder_type_registry_get_icon_for_type() -- Unknown type `tasks'
The KDE Aethera client generates even more warning messages than Evolution, but many of them are simply debug messages about what the program is doing. By the way, I finally figured out why I couldn't login to my IMAP server with Aethera. The Aethera client couldn't deal with the asterisks in my password. I could log in after I changed my password, but I still can't see my mail. The program simply leaves the folder empty and says there's nothing to sync. Here are just a few of the countless warnings I get from Aethera, including the sync message.
Warning: ClientVFS::_fact_ref could not create object vfolderattribute:/Magellan/Mail/default.fattr
Reason(s): -- object does not exist on server
Warning: VFolder *_new() was called on an already registered path
clientvfs: warning: could not create folder [spath:imap_00141, type:imap]
RemoteMailFolder::sync() : Nothing to sync!
The spreadsheet Kspread reports these errors all the time, even though what I'm doing has nothing to do with dates or times:
QTime::setHMS Invalid time -1:-1:-1.000
QDate::setYMD: Invalid date -001/-1/-1
The e-mail client Balsa popped up these messages just moments after using it:
changing server settings for '' ((nil))
** WARNING **: Cannot find expected file "gnome-multipart-mixed.png" (spliced with "pixmaps") with no extra prefixes
The Gnumeric spreadsheet only reported that it couldn't find the help file, as shown below:
Bonobo-WARNING **: Could not open help topics file NULL for app gnumeric
Many of these problems could easily have been handled more intelligently. For example, Gnumeric could have asked for the correct path to the help file, perhaps adding an option so a user can decide not to install the help files and disable the message. Unless GTK and Bonobo are a lot more complicated than they should be, it should be easy to create a generic component for handling things like this and then use the component to handle all optional help files as a rule.
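A rough sketch of what such a generic component for optional files might look like in C. Everything here is hypothetical (the struct, the field names, and the function open_optional are invented and have nothing to do with the real GTK or Bonobo APIs); it only illustrates checking for a missing path, reporting it once in plain words, and letting the user switch the warning off.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of an optional file such as a help index. */
struct optional_file {
    const char *path;        /* where the file is expected; may be NULL */
    int warn_if_missing;     /* 0 = user chose not to install it */
};

/* Returns an open FILE*, or NULL if the file is unavailable. A missing
 * file is reported with the path and the reason, unless the user has
 * disabled the warning. */
FILE *open_optional(const struct optional_file *opt)
{
    FILE *fp;

    if (opt == NULL || opt->path == NULL) {
        if (opt != NULL && opt->warn_if_missing)
            fprintf(stderr, "warning: no path configured for optional file\n");
        return NULL;
    }

    fp = fopen(opt->path, "r");
    if (fp == NULL && opt->warn_if_missing)
        fprintf(stderr, "warning: cannot open %s: %s\n",
                opt->path, strerror(errno));
    return fp;
}
```

In a GUI application the two fprintf calls would presumably become a dialog or a status-bar message, so the user does not need a terminal to see them.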
The only conclusion I can draw is that, like most commercial software developers, many open source programmers are just plain lazy about proper error handling. But we're supposed to be better than that, and it's time we started to live up to the reputation. I realize that most of these programs are works in progress. But good error handling is not something that should be left for last. It should be part of the development process. Although I may not practice it myself, I'm not the least bit ashamed to preach it.
Leonid Mamtchenkov
Nick Petreley is a moron. Intelligent people don't make blanket statements like "Open source programmers stink at error handling." Next thing you know, he'll be telling you "Closed source programmers use more descriptive variables." How the hell does he know?
Programming traits - just like preferences for pizza toppings, frequency in bathing and type of pr0n - vary from programmer to programmer. Some implement proper error handling, others could care less. It doesn't matter whether they're working on an open or closed source project. If the open-source programmers all traded places with the closed-source programmers, you'd have the same ratios of proper vs. improper error handling (although the traffic from open-source-programmers.com to goatse.cx would probably spike).
-Ryan, with the unoriginal sig
Out of all the open source software I use, my biggest complaint is with Linux and how freaking hard it is to swap a hard drive to a new machine. I can only imagine the insults that will be thrown my way, but with 98, or even NT/2000, nine times out of 10 I have no problem with this. Sure, it takes about 30 minutes of clicking yes, install new hardware, but it works (usually). Under Linux: can't load root fs, goto panic. Grr, that bugs me like nobody's business.
I am sure there is some semi-painful way to get around this but should I really have to? If you ask me, the kernel should not panic at this "error" and should recognize it, prompt you and try to solve it (probe the new hardware and load the correct module(s)). Maybe some distros are better than others (and I shouldn't be placing this "blame" on the kernel team).
I always have inaccessible boot device bluescreening problems under 2000. 98 does happily accept the other drive, but win2k seems to get far angrier. Perhaps I am missing something?
What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey
Over the past few years I've used several OSS programs in pre-release versions, and the tendency I observed was for the programmers to provide "last gasp" file saves to keep you from losing work when the program crashed. For instance, I never lost a keystroke when using early versions of LyX.
I don't recall ever seeing this in a commercial product, though I haven't used any commercial products to speak of lately, so perhaps the state of the art has changed. I sure used to lose a lot of work under commercial software, though.
Sheesh, evil *and* a jerk. -- Jade
See my .sig for linkage!
If you celebrate Xmas, befriend me (538
It seems a lot of open source programs do in fact have little error handling. Most open source programs seem to focus on functionality, rather than usability. It gets you from point A to point B, and doesn't give you much help in between. If the user is not a master programmer, any errors usually end up being cryptic and nonsensical. It seems that a lot of commercial software has decent error recovery, and prevents user error effectively. Now I know there are many exceptions, but I think it simply has to do with the fact that commercial programmers get paid to do their work, and competition forces companies to put out products that are competitively usable.
Exceptions are mandatory for good programming, period. If the language you are using doesn't support exceptions (C, Perl, etc), you are going to have problems. Exceptions make sure that if an error occurs, and you aren't aware of it, your program dies, and doesn't go on its merry way, causing a security hole/unstable software.
Perl's hack at exceptions using 'die' doesn't cut it; one important thing about implementing exceptions is that your base operations (e.g., opening files, and other system functions) need to raise exceptions when problems occur. If this doesn't happen, you're only going to struggle in vain to implement good, correct code.
Exceptions are a primary reason I've moved from Perl to Python. Python's exception model is standard and clean. Base operations throw exceptions when they encounter problems. And my hashes no longer auto-vivify on access, thank goodness. Auto-vivification on hash access is probably one of the principal causes of bad Perl code.
When you start Outlook Express, it often displays the password entry box before it has finished drawing the screen. Enter your password before the redraw has finished and Lookout locks up. (Netscape 4.7x has a similar problem, too.)
Most languages make error checking very hard. In particular, C and Perl, two of the most used languages in OSS development, lack good mechanisms for sane error checking. I might explain more, but it is better explained in this document.
btw, the document is part of a library that allows nicer error checking in C, called BetterC. (Yes, this is a plug, I've participated in the development).
It is modelled on Eiffel's "Design by contract", a set of techniques complemented with language support to make error checking a lot easier and semiautomatic. "Design by contract" has been described as "one of the most useful non-used engineering tool".
The open source community should take the same stance as closed source corporations when it comes to bugs. They are not really bugs but undocumented features!
Strange women lying in ponds distributing swords is no basis for a system of government.
And I would be careful about holding up Tomcat as an (open source) triumph. It's had some major bugs all through the 3.x timeframe, and its team includes at least a few daytime professional "closed source" programmers (there's no correlation between the two, by the way).
The only certainty is entropy.
Yeah? My code might not handle errors well but your server doesn't handle a load well and at least my code will never get /.'ed.
Check your return values!! As simple as it sounds, so many people just don't do this. Everyone just assumes that everything will go OK. Check the return, then print out an error to stderr. To be more helpful, use this define just before you print your error message to help find where the error was and debug it.
#define ERR_LOCATION fprintf(stderr, "ERROR in File: %s Line: %d :", __FILE__, __LINE__)
Then use it like so:
ERR_LOCATION;
fprintf(stderr, "foo returned %d.\n", ret);
I believe that's the correct code.
Outdoor digital photography, mostly in New Engl
Bullshit.
Sure, an application error in a Unix derived system is much less likely to bring down the whole system. But that's no excuse for not dealing with error conditions correctly.
Also "errors" don't just occur from bad code. Running Linux gives you no protection against drive failures, network flakiness, or plain old user error.
Sure, if the error message is misleading you can look in the source to find out what actually caused the error. Heck, even if it SEGVs you can compile it with debugging symbols and let GDB tell you what line it's failing at.
However, the open source community will NEVER attract mainstream users with that kind of attitude. Furthermore, even hardcore geeks should have better things to do than fix up crud in supposedly release-quality code. Hey, it's one thing if I'm working on something clearly under development, but it's nice to be able to get stable stuff too.
That said, I don't find open source to be any worse than the commercial stuff I've worked with. With, say, Microsoft stuff, it is just much harder to distinguish bad error handling code from bad code even when no error conditions are encountered.
I don't think that a straight comparison of open source to commercial software, in the context of error handling, has any merit.
I'll try to illustrate with an example. I'm running IE 5.00.2920.00 on Windows 2000. I get a huge number of "Cannot find server or DNS error" pages from IE. You know, those are the stock HTML files that IE displays that say "The page cannot be displayed", and it has a whole boatload of gibberish on it about clicking the Refresh button, contacting your network administrator, checking URL spelling, etc etc etc.
Unless the host machine is truly unreachable, I can click "Refresh" and get the appropriate page almost instantly about 80% of the time. Does that make you smell a fish? It makes me smell a fish.
The fish that I smell is commercial software handling errors in such a way as to blame anything other than itself when it encounters an error. I'm sure this works on most Windows users, because they've never used anything else, and their desktops crash all the time. Why shouldn't web sites just arbitrarily refuse to give up a page now and then? But if I'm debugging a web server that I'm telnetted to from my SPARCStation, and IE on Win2K claims that the web server can't be found 12% of the time, yet finds it instantly on refresh, I begin to see a pattern.
If you write commercial software, the pattern is to include fairly complete error handling, but make the error handling blame something else. IE didn't choke, DNS or the remote server did, or you typed the URL wrong. Anything but admit that IE had the problem.
Open source programmers don't experience pressure from marketeers and PR people and "product managers" to appear blameless. Open source programs tell it like it is, up to the limits of the programmer's articulation. That's why it's useless trying to compare the two: commercial software handles errors in order to shift the blame. Open source software handles errors in order to provide debugging information.
Wait for it...wait for it...ahhhhh!
This is about detected bugs which haven't been fixed yet.
Basically, his complaints boil down to, "bugs exist, causing error messages, why aren't all the ones that cause error messages fixed yet?"
Then he goes off on a confused tangent, apparently suggesting that "error handling" be added to work around any bugs. After all, if it can log the errors caused by bugs, it can respond to them in any way, up to and including fixing the problem (i.e. doing what the code should have done, except for the bug)! For example, if a system file is missing (meaning either a bug in the install, a bug in the program requesting something that isn't really a required system file, or an externally damaged system that can't be expected to work at all), just pop up a dialog to let the user search for it! Because of course the user should attempt to patch things up with his intimate knowledge of system internals instead of just seeing that there's a bug to report.
Hooooo boy....
I didn't see a single example of a genuine external error that wasn't handled properly, just bugs which should be fixed.
He chose some pretty bad examples of bad error handling - they all gave the module and direct cause of the error, and provided ample clues for the programmer in each case to go find out what went wrong. If we're looking for anything that open source programmers do that stinks, it's making GUI apps that pretend nothing's wrong.
Way to go, I say. Would rather have hugely detailed warnings any day.
Dave
I write a blog now, you should be afraid.
Error messages need to have numbers associated with them. For instance, when I get ORA-1241 in Oracle, a quick search on groups.google.com gives me a lot of information about the error: why it occurred and what I can do about it. Alas, there is no such thing in most open source software. You just have plain text, so the search is less effective: which search keywords are you going to choose? The situation is even worse for people who use localised versions of the software, because without the English translation of the message you can't search the English archive on groups.google.com, which accounts for 80% of the posts.
What might be cool is a codified error numbering scheme a la Oracle. I would love to have a KDE-2345 error, or a GNOME-1234 error, or Koffice-567, etc. That would make searches far more effective.
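To make the idea concrete, here is a small C sketch of a numbered error catalogue. The "APP" prefix, the code numbers, and the message texts are all invented for illustration; no real project's numbering is implied. The number stays the same whatever language the text is shown in, which is what makes it searchable.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical catalogue entry: a stable number plus a human-readable text. */
struct error_entry {
    int code;
    const char *text;
};

static const struct error_entry catalogue[] = {
    { 1001, "configuration file not found" },
    { 1002, "help file not found" },
    { 2001, "cannot connect to mail server" },
};

/* Formats "APP-<code>: <text>" into buf. Returns 0 on success, or -1 if
 * the code is not in the catalogue or the buffer is too small. */
int format_error(int code, char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < sizeof catalogue / sizeof catalogue[0]; i++) {
        if (catalogue[i].code == code) {
            int n = snprintf(buf, len, "APP-%04d: %s",
                             code, catalogue[i].text);
            return (n >= 0 && (size_t)n < len) ? 0 : -1;
        }
    }
    return -1;
}
```

A localised build would swap the text column and leave the numbers alone, so a search for the code would still find the English-language posts.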
Seriously, I once booted a Macintosh and the only thing that came on the screen was a little "Sad Macintosh." Apparently it means that your system folder is corrupt. How's that for error handling?
Actually the Sad Mac usually indicates hardware failure (failed POST). You'll notice a hex code underneath the icon; the hex code indicates the actual error. One of the reasons it doesn't give plain English errors is, the Sad Mac is in the ROM. Text strings would take more space (think back to 1984 when that was an issue). Also, the Mac hardware isn't supposed to be language-specific (notice that there are no text labels on the ports) - if English isn't your native language, you shouldn't have English error messages. On top of that, Apple originally intended the Mac not to be user-serviceable. If you got a Sad Mac, you were supposed to take it into an Apple-authorized repair center and have an Apple-certified technician (who would have a list of error codes) take a look at it.
Fortunately, Apple has changed their attitude, but the legacy Sad Mac remains. Personally I agree with you, it would be really helpful to have some idea of what the problem is without having to look up a number.
$x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
$x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;
how is a programmer expected to deal with the CD being scratched? Does your car still work if the transmission is damaged or half the engine has been riddled with bullet holes?
Again, a very unexpected and unnatural scenario. How well do cars function when they run out of fuel?
But how well would your refrigerator react if you treated it shoddily, such as by leaving it outdoors intermittently or disconnecting and reconnecting the power several times a day?
Now, I'm not trying to excuse sloppy software development but the fact of the matter is that software is constantly expected to work perfectly under situations completely outside its specifications yet we don't expect this from other items or appliances that we use.
First let me say that I am a Linux user and an open source advocate.
Now let me compare this to a judge I once met, who said that men have more tickets in general, but women always follow too close.
This is interesting, but if we further evaluate, one could conclude that women are just as bad (equally so), but perhaps people were lighter on them along the way. A police officer might have let her off, and so forth (this isn't to sound misogynist of course, but I know women who get let off all of the time).
Instead, following too close is an easy prelude to... an accident. After all, when your bumpers are crushed together, you're too close.
Now think of error handling. "Open Source Software handles errors poorly," is another way of saying that it too crashes a lot. Perhaps other people get caught for other things, but we only rag on open source when it crashes.
This isn't to say ALL open source software though.... but let's be perfectly honest. Programming is a difficult profession that a lot of people think they can just pick up. How many people would volunteer to do surgery without med school because they read a book on the subject? How many people get offended when you flash some important programming credentials in front of them that they don't have?
The trick is sifting the wheat from the chaff. Sure, a 14 year old with a little ambition can whip up a pretty impressive looking windowed program in X... but he doesn't have the sophistication of a well educated programmer... generally. There are plenty of good programmers and bad programmers in open source. The key is to know what's good and what's bad. If you can't figure that out, then buy a distro made by people who do.
Heh, the link to the site gives me a Proxy Error. ;-)
sigs are a waste of space
Exceptions make sure that if an error occurs, and you aren't aware of it, your program dies, and doesn't go on its merry way, causing a security hole/unstable software.
You mean like that Ariane rocket that blew up when its double-redundant computer system was halted because of an utterly irrelevant uncaught exception? Yeah, that's definitely a superior error-handling philosophy.
Aside from the conceptual problems of what are essentially COMEFROM statements with scope management, there's no reason to assume that halting the program is better than just allowing it to run.
Open source programmers may suck at handling errors, but commercial programmers suck much more.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
...and in that moment, he became enlightened.
When all you have is a hammer, everything looks like a skull.
I agree with you that software in general is a lot more complex, and used in a lot more unexpected ways, than something like a car.
OTOH, there is such a thing called graceful degradation -- that is, if you push the limits of the software, it shouldn't just suddenly barf and die on you, but degrade gracefully. Too much code I've seen (both open and non-open source) assumes too much -- and dies badly when the assumptions fail.
It is possible and not overly difficult to design software such that it degrades gracefully. Sad to say, sloppy programming (programmers), deadline pressure, or disinterest in handling error conditions dominate the world of software. Not many would put in the extra work to make a program degrade gracefully, because it doesn't have a very visible effect -- until things start to fail. And too many programmers have this "test only the cases that work" syndrome.
Poll Mastah
The more likely that I think someone other than me is going to use a program, the more likely I'm going to put work into error support. If I'm expecting really techie types to use it, I may be more oriented to super-terse responses.
For stuff that goes to the general public, I'm more likely to do nice error work, but even that may be cut short if I'm really short on time. Of course, there's the problem of systems that are originally intended for me but then get released to an unintended audience... These are the ones most likely to have what I consider inappropriate error handling for the audience. I think that this re-targeting of programs originally intended for internal consumption is also the source of sub-optimal error handling in some open source programs.
Some principles that I've learned for error reporting are (in roughly decreasing order of importance):
- it shouldn't do the unexpected in the face of the unexpected/unwanted
- the response should be reasonable and predictable
- error messages should indicate (as much as possible) where the error originated
- recovery should be as easy as possible
Now I think that I'm going to get flamed for putting easy recovery at the end, but that's really where I have it. I'd rather have consistent and reasonable (but cryptic) responses than really nice messages and erratic behaviour. If it happens with any regularity, I believe that erratic behavior is going to be more off-putting to users than terse messages. Now ideally, I'm going to achieve all of the above, but that's going to depend on how much time I have to put into the project in question. I guess that the listing I gave is the order in which I plan for error reporting depending on the 'techiness' of the expected user. (This may also include the techiness of my own expected mood when I'm going to be running the program.)
To be honest, I believe that if other users are going to be playing with my program, it's usually worth my while to go the full gamut for error response/ recovery. I know that if I don't take the time to make error recovery as easy as possible, I'm ultimately going to end up spending more than that amount of time responding to users who don't understand/like the errors that they get out of my program.
Short term investment, long term gain
--------
Microsoft has had a history of going to extremes with their error response. The response ranges from lacking all but (possibly) the most basic error handling, which may not even achieve my first principle (e.g. BSOD), to things like the paper clip that are so damned helpful that they're annoying. Part of the problem, I think, is that the Microsoft culture encourages it.
When program failures are cryptic and unpredictable, it encourages support calls that Microsoft actually gets paid for. The other reasonable response is to go to Microsoft for training -- once again paying them, this time for an MCSE certification that spends lots of time on how to placate customer dissatisfaction with Microsoft's problems. In other words, Microsoft gets paid for bad programming practices.
This seems to change, however, when Marketing decrees that problems need to be handled better. As far as I can tell, marketing seems to drive Microsoft, so when they decree that things need to change, they will.
An early example of this was when Windows 95 came out and decided to check the filesystem if the system was brought down badly. This was something that Unix and Mac boxes already did, and Microsoft probably wanted to jump on the "stability bandwagon". The result was the annoying wait at the "we're about to clean the filesystem, you naughty boy" prompt.
I expect that the paperclip was similarly mandated by Marketing (though I sometimes think that it may have started out as some programmer's prank and picked up by marketing as a 'good idea'). That feature was so annoying that it was ultimately dropped -- I think because it broke the principle of 'reasonable response'.
Something that distracts the user isn't helpful. The constant (nervous) motion of the paper clip and its obtrusive location on top of the main screen are tricks learned by advertisers to get people's attention -- unfortunately, most of the time the user's attention is trying to focus on the document being created. Had the paper clip been an unobtrusive text box in the toolbar, people probably would have welcomed it.
In any case (having rambled), I think that Microsoft error responses are more oriented towards making money for Microsoft than making life easier for the user (a common subtext on slashdot).
Sometimes boldness is in fashion. Sometimes only the brave will be bold.
you think checking return codes is the solution? Well, it is but at a cost.
Exercise for /. readers: add errorchecks to the following C function. 'return' and exception handling pseudocode allowed:
/* Here we do something with p1, p2, p3 */
int allocate_3(void){
    int *p1, *p2, *p3 ;
    p1 = malloc(SOME_NUMBER*sizeof(int)) ;
    p2 = malloc(SOME_NUMBER*sizeof(int)) ;
    p3 = malloc(SOME_NUMBER*sizeof(int)) ;
    free ( p1 ) ;
    free ( p2 ) ;
    free ( p3 ) ;
    return 0 ;
}
Let the game begin...
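One possible answer to the exercise, as a minimal sketch and certainly not the only way to do it: keep the function linear and send every failure to a single cleanup label. SOME_NUMBER is not defined in the exercise, so the value 1024 is an assumption, and the name allocate_3_checked is made up for illustration. The sketch leans on the fact that free(NULL) is a no-op.

```c
#include <assert.h>
#include <stdlib.h>

#define SOME_NUMBER 1024  /* assumed value; the exercise leaves it undefined */

/* Returns 0 on success, -1 if any of the three allocations fails. */
int allocate_3_checked(void)
{
    int status = -1;
    int *p1 = NULL, *p2 = NULL, *p3 = NULL;

    /* malloc(0) may legally return NULL, which would look like a failure */
    assert(SOME_NUMBER > 0);

    p1 = malloc(SOME_NUMBER * sizeof *p1);
    if (p1 == NULL)
        goto cleanup;
    p2 = malloc(SOME_NUMBER * sizeof *p2);
    if (p2 == NULL)
        goto cleanup;
    p3 = malloc(SOME_NUMBER * sizeof *p3);
    if (p3 == NULL)
        goto cleanup;

    /* Here we do something with p1, p2, p3 */
    p1[0] = p2[0] = p3[0] = 0;

    status = 0;

cleanup:
    /* free(NULL) does nothing, so one exit path covers every failure point */
    free(p3);
    free(p2);
    free(p1);
    return status;
}
```

The cost is visible: three lines of checking for every line that does real work, plus a goto that some people will object to on principle.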
I must have been transported into a parallel universe: first a story that's negative towards Open Source on /., and then I cannot find any of the usual "Imagine a Beowulf cluster of errors", or "Is a Beowulf cluster of errors a cluster fsck?", and where oh where is "All your errors are belong to us"? If anyone has directions back to my normal reality, please help me.
Any sufficiently advanced man is indistinguishable from God
Open Source sucks at error handling? Look at the standards in the PC industry.
They've been declining in general for the past 10 years, and before that they sucked as well. I think the standard is really set by the hardware itself.
Typically drive errors can have symptoms of software running more slowly as the drive retries - or applications will simply appear to hang, or if it's an error reading code into memory, well, anything goes.
Network errors can go completely unknown until you haul out the crusty old hacker with a sniffer - oh gee, did you know that your card is dumping half its packets?
Oh - especially network problems - where the software at the user level 90% of the time just sits there and goes "Duh!" for simple things like pulling the cable out.
Error checking and handling, in general, SUCKS and it's the main reason why computers suck - why the software industry spends billions of dollars chasing problems during the development phase that they never really get to pin down, so the problem ends up going into shipping products.
I blame the lax standards on the platform, and the dumbing down of programming in general (the over-reliance on high-level languages that remove the programmer progressively further and further from the hardware their programs run on).
If PCs had better standards for this sort of thing at the hardware level - and if the vendors adhered to those standards - then the software people could write software that handles errors better, and it would bubble up to the user level as more reliability and much simpler troubleshooting, probably tens of billions of dollars saved in productivity alone, and probably the PC industry would be 10 times the size it is today, because people would actually trust them for important tasks, rather than the next nifty home killer-app like pirating music. (not meant to be a troll against MP3 trading - meant to be a troll against the apparent purpose and direction of the PC industry in general).
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
If the language you are using doesn't support exceptions (C, Perl, etc), you are going to have problems. Exceptions make sure that if an error occurs, and you aren't aware of it, your program dies, and doesn't go on its merry way, causing a security hole/unstable software.
Unless it's an uninitialized-memory error or a buffer overrun that overwrites some other program variables, in which case a C++ program will still keep going on its merry way without throwing an exception, causing difficult-to-duplicate and hard-to-trace bugs.
If it's possible to check for the error at all, then anything that you can implement with exceptions you can implement without exceptions (though I agree that exceptions are a _neater_ way of doing it).
If your program can't check for the error (as is common for memory errors without extensive and slow wrapping on memory accesses), then exceptions won't be triggered and you're still screwed.
[Aside: You can propagate error codes up between levels either by making error codes bit vectors and masking subcall errors on to the parent call's failure code, or by implementing your own error stack (if you anticipate using deep recursion). Messy, so exceptions are still _preferable_, but it can still be _done_ without exceptions. Almost as cleanly, if you wrap error-handling helper functions nicely.]
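[A minimal sketch of the bit-vector variant from the aside, assuming made-up flag names and a made-up validation rule; the error-stack variant is not shown. Each subcall returns a set of flags and the parent simply ORs them into its own failure code, so nothing gets lost on the way up.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error flags; each kind of failure gets its own bit. */
enum {
    ERR_NONE      = 0,
    ERR_NEGATIVE  = 1u << 0,
    ERR_TOO_LARGE = 1u << 1,
    ERR_IO        = 1u << 2   /* reserved here; not raised by this sketch */
};

/* Subcall: reports everything wrong with one value as a set of flags. */
unsigned check_value(int value)
{
    unsigned err = ERR_NONE;
    if (value < 0)
        err |= ERR_NEGATIVE;
    if (value > 1000)
        err |= ERR_TOO_LARGE;
    return err;
}

/* Parent call: masks every subcall's flags onto its own failure code. */
unsigned check_all(const int *values, int count)
{
    unsigned err = ERR_NONE;
    assert(values != NULL || count == 0);
    for (int i = 0; i < count; i++)
        err |= check_value(values[i]);
    return err;
}
```

The caller tests individual bits, e.g. `if (check_all(v, n) & ERR_NEGATIVE)`, and still has to remember to look at the result, which is exactly the part exceptions take off your hands.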
News flash: Technology pundit seemingly insults open source, Slashdot up in arms. None of them actually read the article. Story at 11.
The article does not say "open source doesn't handle errors as well as closed source". What the article does say is "like most commercial software developers, many open source programmers are just plain lazy about proper error handling. But we're supposed to be better than that...".
I don't see a problem with this statement. The fact is, most open-source software sucks donkey balls. Petreley is merely saying it's time to put your money where your mouth is -- if you want open source to be considered better than closed source software, it better stop being so danged flaky.
ZFS: because love is never having to say fsck
For instance, I know many "average" users who eject floppy disks and CD-ROMs from the drive while they are being read. Any Linux user who tries a stunt like that deserves a seg fault (or worse)
Well, an Amiga would give you a "You MUST replace volume < disklabel > in drive < device >!!!" if you ejected a disk while it was in use. It was a good reminder to the user that he had just done a Bad Thing, but the program could (usually) continue once it got the disk back. I don't think the application program even knew that this had happened; the read() call just blocked until the disk was put back.
If the OS isn't able to handle this sort of situation, the application program should get an EIO error on its read(). However this shouldn't translate into a segfault. Nothing should translate into a segfault - at worst an abort() if the program doesn't feel like recovering from the error.
Of course a "real OS" will just lock the CD-ROM or floppy drive [hardware permitting], thus preventing the user from ejecting a disk that's in use (unless the user has a paperclip, in which case he does deserve whatever he gets).
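A rough sketch of the application side of this, assuming a POSIX system: a read loop that treats EIO (or any other failure) as an ordinary error return instead of something to crash on. The function name read_fully is made up for illustration, and whether a pulled disk really shows up as EIO depends on the OS and driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Reads up to len bytes from fd into buf. Returns the number of bytes
 * read (short only at end of file), or -1 with errno set on failure. */
ssize_t read_fully(int fd, void *buf, size_t len)
{
    size_t done = 0;
    assert(buf != NULL);

    while (done < len) {
        ssize_t n = read(fd, (char *)buf + done, len - done);
        if (n > 0) {
            done += (size_t)n;
        } else if (n == 0) {
            break;                  /* end of file */
        } else if (errno == EINTR) {
            continue;               /* interrupted by a signal; retry */
        } else {
            int saved = errno;      /* e.g. EIO if the medium went away */
            fprintf(stderr, "read failed: %s\n", strerror(saved));
            errno = saved;          /* fprintf may have changed errno */
            return -1;
        }
    }
    return (ssize_t)done;
}
```

The caller gets -1 and decides what to do: retry, ask for the disk back, or give up cleanly. No segfault required.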
Open source programmers are basically the same people as commercial programmers, maybe by night or maybe as different jobs come and go. The difference is that most open source projects arise from a person's need, and it is natural to ease up on the effort once that need is filled, i.e. once the program is good enough for your personal use.
Well, we all know how bug-free Internet Expl...<This program has caused an illegal operation in module kernel.dll and will now be terminated>
One thing that really bugs me about most programming languages is that they only allow 1 return value by their most natural idiom. So you get these stupid hacks where some settings of the returned value mean errors and some are useful results, or you have to define a new named data structure just for the return value of this one function, or you end up having to mix output variables with the inputs for a function.
This is one thing I like about Forth-style languages, where it's just as natural for a function to return multiple results as to receive multiple arguments, letting you do either:
A B / on_error{ log_error cleanup exit }else{ use_result } return
or
A B / on_error{ store_exception drop_result push_unhandled_exception_errcode }else{ use_result } return
or
A B / drop_error use_result return
Unlike with exceptions, the possibility of an error isn't hidden away somewhere; if you ignore it, or hand it down to reach exception handling code, you have to do so right there and then, explicitly at every step. Actually, that's a general plus: with a stack language, you have to explicitly dispose of everything, which makes it harder to ignore return values, and impossible to write programs without knowing whether a function returns anything ("What do you mean it can return an error code? I thought it was void!").
LinuxWorld is having issues, so I can't read the article, but I remember Petreley from when I used to get InfoWorld Magazine.
He's the stereotypical technology pundit. He learns just enough about technology to have an uninformed opinion about it.
The worst thing is that we on the internet have truckloads of people like him. Every mailing list, newsgroup, web log, IRC channel, or any other group in which people are trying to get things done will have a crew of wankers spouting their opinions with no attempt to actually contribute anything useful.
What really burns me about pundits is that they're getting paid to do what a couple million monkeys on the internet do for free.
Take Petreley. One time, he wrote an article about how maverick programmers don't write good code. I guess I can believe that. Then he went on to say that all brilliant programmers are mavericks, and Microsoft etc all hire them so they'll write bad code and people will have to buy bug fixes. Um, right. He then finished off by claiming that he used to be an absolutely outstanding programmer and that he had to quit because he was so amazingly good that writing decent code wasn't fun for him.
He has, to the best of my knowledge, never actually contributed anything at all even remotely useful to Free Software, or computing in general. He's even worse than Fred Langa, the guy who helped invent ethernet in 1976, then spent the rest of his career punditing, developing more and more bizarre opinions as his practical knowledge became antiquated.
So here's a message to Petreley: Do something useful, anything. If all you have to contribute is your opinion, then go home. Free Software writers are mostly volunteers, we don't have to put up with your wanking. If you have a problem with a program, file a fucking bug report. Actually, if you're such an amazing programmer, SHOW US SOME CODE! I don't care how much InfoWorld pays you, to us, your opinions are worthless. So do something useful, or I'll have to dig out my cluestick and use it to bash you into a profession that benefits humanity in some conceivable way.
Jordan Bettis
``Wherever you go, there's another stupid sigfile quote.''
That's right, open source software sucks at nearly everything it does![1]
Open Source as it stands today is great at bashing together a really "neat" program which gets the job done in a specific manner. Soon enough, lots of cool little features are added in, and before long you have a 'perpetual-beta application.'
Programming, however, requires some discipline which doesn't often get put towards OSS. Programs require good error handling (and error trapping, for that matter), usability (That means intuitive interfaces), and documentation. Oh yes, and freedom from bugs. However, these things are BORING to produce, compared to the original plan of bashing out a neat routine.
Ironically, the only way to achieve such things in a distributed and open development model, is to have a central administrative point. Without it, large projects are just impossible. Funny, eh?
[1] (of course, so does commercial software, but in different ways)
"People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
Fascism starts when the efficiency of the government becomes more important than the rights of the people.
This is not about open-source vs closed-source programs, nor for-fun vs for-money programmers. It's about computational models such as von Neumann machines that, at their deepest roots, assume there will be no errors. That chain-of-falling-dominos style of thinking so permeates conventional programming on conventional machines that it's almost surprising that any code has any error handling at all.
Of course it's possible to hand-pack error-handling code all around the main functional code in an application.. and of course quality designers and programmers in and out of open-source will do just that.. but viewed honestly we must admit it's a huge drag having to do so, and typically fragile to boot, because the typical underlying computational and programming models provide no help with it. Error-handling code tends to be added on later to applications just as try/catch was added on later to C++.
Lest we think this sad state must be inevitable, let's recall that other computational models, like many neural network architectures for example, are inherently robust to low level noise and error. Then, that underlying assumption colors and shapes all the `programming' that gets built on top of it. We're to the point where trained neural networks, for all the limitations they currently have, can frequently do the right thing in the face of entirely novel and unanticipated combinations of inputs. Now that's error handling.
The saddest part is that von Neumann knew his namesake architecture was bogus in just this way, and expressed hope that future architectures would move toward more robust approaches. Fifty years later and pretty much the future's still waiting..
I'll be blunt, too. I got your fix RIGHT HERE! I have whipped up some open source magic that uses a powerful error-finding heuristic in combination with a correction algorithm. It should fix all of these problems you have described.
----CUT HERE----
#!/bin/bash
if [ "$#" -lt "1" ]; then
    echo "Usage: $0 <program> {<args>}"
    exit 1
fi
"$@" 2>/dev/null
echo "All errors corrected!"
----CUT HERE----
You are not expected to understand how this works. Send me beer, we open source guys like that.
This sig is false.
no errors
Nothing to handle then stupid.
Yours Sincerely, Michael.
Error messages need to have numbers associated with them. For instance when I have ORA-1241 in oracle, a quick search in groups.google.com will give me a lot of information about this error, why it occurred, and what I can do about it.
C's errno codes use another approach: a short symbolic name for each error ("no such file or directory" is ENOENT, etc.) that stays constant across localizations, while the text that strerror() returns is what gets translated.
The situation is even worse for people who used localised versions of the software, as you don't have the English translation
Whether you get "Non ci è tale archivio o indice (ENOENT)" or "Es gibt keine solche Datei oder Verzeichnis (ENOENT)", you can still search on the ENOENT. (Translations by Babel Fish.)
Now if only the popular apps did this...
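For what it's worth, here is a minimal sketch of how an app could print both the localized text and the searchable name. Standard C has no function that turns an errno value into its symbolic name, so errno_name and its four-entry table are hand-rolled assumptions, as is format_error; the codes themselves come from POSIX <errno.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Maps a few errno values to their symbolic names. The table is
 * hand-written and deliberately incomplete. */
const char *errno_name(int err)
{
    switch (err) {
    case ENOENT: return "ENOENT";
    case EACCES: return "EACCES";
    case EIO:    return "EIO";
    case ENOMEM: return "ENOMEM";
    default:     return "E???";
    }
}

/* Formats "<localized text> (<NAME>)" into buf and returns buf. */
char *format_error(int err, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    snprintf(buf, size, "%s (%s)", strerror(err), errno_name(err));
    return buf;
}
```

Whatever language strerror() answers in, the part in parentheses stays the same, and that is the part you paste into a search box.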
Will I retire or break 10K?
A very smart guy from SGI once told me "A core dump is the best possible error message because it contains ALL the information you need to diagnose why the program had to stop running."
:-)
Mmmm'K
This one shows another of my points:
the problem itself has a very linear structure, but the solution here has a lot of nesting. If I had more blocks, it would have even deeper nesting.
If the allocation was non-linear (for example, a tree or a graph), and failed in the middle, deallocation would really be a mess. You would have to exit some mix of loops/recursion in the middle, and free everything again before exiting.
If you want a better solution, see my comment about BetterC. Or use Eiffel
Ever use a commercial unix? I use HP-UX at work and Tru64/Solaris at school. Of those, linux programs are the most understandable when things go awry.
Linux is a recreation of a system that has historically been 'terse'. Can we expect it to be very different?
...Amazingly, with all these errors and warnings, most of that software continues to run. Compare that with the way typical windows applications work (Crash on the first error and take out something important on your way down), that sounds like excellent error handling to me....
Just my $0.02
"Your superior intellect is no match for our puny weapons!"
I noticed that a lot of the errors he was talking about were missing files, blah blah blah. This I have had problems with in X using various managers. Here are some answers... a lot of the time missing files are due to a lack of checking the required packages. Also it can have to do with the way different Window Managers handle different things. I would like to see the results of these little tests done on machines that were running every available WM. I bet most of the problems with the GUI based programs that he is reporting are due to the fact that they were written on a machine running a different WM. Just my thoughts.... if you don't like it don't read it :P.
Later
Error Handling and Appropriate Technology
This article is right on target. It's much more important for open source software to be of a higher standard than closed-source, simply because with open source, shoddiness can't be hidden and swept under the carpet (to state the bleedin obvious). If we make shoddy open-source code, then up-and-coming programmers will see it, learn from it, and treat this very ordinary code as the 'norm'. Worse, they will treat it as a target to be aimed for, and cut corners so even this low standard isn't met.
FWIW I've worked in safety-critical areas for some 20 years. I've managed to dodge being assigned to management, and am doing neat stuff like spaceflight avionics for interest rather than chasing dollars doing yet-another-b-2-b system. The biggest problem I've found with re-educating interns is getting them to be paranoid enough. It's a matter of culture.
Quick N Dirty is an appropriate culture for some systems.
If you're writing throwaway code for a specific purpose (such as a simple script) then quality isn't an important issue.
If a deadline is approaching fast, your budget is zero, your team burnt out, then damn the long-term costs, hack it so it kinda works and ship it on time. It's crap, but they only paid for less than crap, so don't worry, be happy.
There's a major financial incentive to write high-maintenance code, both for programmers and companies. You make pennies on the initial sale, megabucks on the maintenance. What do you call a programmer who writes superb code that's maintenance-free? Unemployed.
This is true for design and requirement analysis, as well as code.
But... it's important to realise this is not the only way of doing things. It has its place. But not in open source. (And there's such a thing as professional pride too, but I digress)
If you're writing a re-useable module, you should treat all inputs as being guilty until proven innocent, always check any outputs from your area, and be honest regarding what side-effects your module has. In some languages, this is easy (e.g. Ada ), in others darn near impossible (e.g. C), but it has to be done. It's obvious that you have to do it when lives are at stake. It's less obvious when you're writing some device driver for Linux - but literally tens of billions of dollars may be riding on how well you do your job. Even if you're not getting paid for it.
I'm not asking for the degree of robustness typically shown by safety-critical systems ("What? Your code failed just because half the memory was corrupted and a CPU was on fire? Unacceptable! Failure is Not an Option!") but enough so that BSODs or their equivalent lead to puzzlement ("I've never seen one of those before!").
Zoe Brain - Rocket Scientist
I expect that it is true that commercial software does a generally better job than open source software at error handling. This is probably even measurable: run comparable software products under similar usage load, and collect the errors generated (reported or experienced by users). I haven't tried such an experiment myself.
I don't see why we (we = open source enthusiasts) should feel particularly worried about it at the moment. For most applications, the commercial alternative has a vastly larger user base than the open-source equivalent. And since these users are paying for their software, they are going to expect that there is a vendor who will respond to their concerns.
Customer service operations are very expensive. So it is very much in the interest of the commercial vendor to reduce the error rate to the level where the load on customer service is financially tolerable (this is not the same as zero).
Fixing obscure bugs is not the most exciting technical endeavor, and skilled engineers are more willing to do this work if you pay them...
As I said, I don't think we should feel bad about this state of affairs. A more appropriate line of discussion would be: What can the open source community do to create an environment where a higher degree of quality is met consistently for open source software products? Are there tools that we could build that will help? Non-intrusive processes that we can impose on one another? The Perl community in particular has an effort underway to establish a consistent level of testing for all the modules that are released on CPAN. Is that a worthwhile model?
I presume that's either a troll or a parody of the sort of nonsense that all too often gets put out in response to complaints. If it isn't, I'll simply note that you probably can't fix the code if you are not a programmer, which a lot of users aren't.
(On the other hand, users of open-source software should bear in mind that the developers are often not working full-time on the program - they may have day jobs, or may be going to school, etc. - and therefore that the feature that they really really really want may not have been implemented yet because nobody's had time to implement it, so they may Just Have To Wait.)
Requirements are specified in design documentation.
Each instance of Error checking or Recovery must be specified, just as the format of each element of output or the formula for each calculation must be specified.
Without that specification, who cares if you think the code is wrong? You can't prove it's wrong because you don't have the spec and didn't pay for its development. You bought (or five-finger GPL'ed) a license to operate the software. On an as-is basis, for every piece of software anyone posting to /. is ever likely to run.
You want it improved? Write the Engineering Change Request specifying the improvement, and send it along with the money necessary to get it done.
Design and validation of "bug-free" code is the most expensive software process there is. Just the paperwork on the validation process will double or triple the cost of the software. The problem is provably impossible to solve, and the best efforts on nontrivial code (and sometimes on what appears to be trivial code) end up with unresolved errors that are signed-off as calculated risks the costs of which will be borne by insurance, government, lucky avoidance of catastrophe, and the bottom line.
--Blair
"And it pays my bills, in spades."
ardax makes the point above, but is only scored 1, so...
First, about your analogies...
If I wear out the door key to my car, the car should not burst into flames when I try to open the door.
If my car runs out of fuel, I expect that after rectifying that little problem (and bleeding the injectors) it will be just like new. I do not expect that it will ruin my tires.
And yes, I have kept my refrigerator outdoors. I kept it on the front porch for two months while the house was being renovated 10 years ago. It worked just fine. It is 30 years old now (That's 15 PC generations to you young whippersnappers. Moore's law says the new fridges should be 1,000,000 times colder now. :-) and I expect it to continue running indefinitely. About every 5 years I put a drop of oil on the fan shaft behind the freezer to keep it from squealing. Software has a ways to go.
About the cubase incident...
Yes, the CD is scratched. I expect that I won't be able to re-authorize my copy of the software, but don't ruin ALL the data on my hard drive! (It's actually worse. It was my wife's laptop. You do NOT want to have to tell my wife that you just wiped out her laptop.)
About Word...
Destroying the on-disk copy of a document before successfully writing out the new copy is just plain stupid. Particularly on a Mac where there is a special file system function to swap two files. You write the new copy under a fake name, swap it atomically (even over file servers) with the original file, then delete the fake named file (which now contains the old data). No one gets hurt in error conditions, no one can ever have bad luck timing and read a partially written file off the file server. Life is good.
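A minimal sketch of the same write-then-replace idea, with one substitution stated plainly: it uses POSIX rename(), which atomically replaces the destination, rather than the Mac file-swap call, so the old data is simply dropped instead of ending up in the fake-named file. The function name save_atomically and the fixed ".tmp" suffix are assumptions for illustration; real code would want a unique temporary name and probably an fsync() before the rename.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes data to path by way of a temporary file, so a failure at any
 * step leaves the previous contents of path untouched.
 * Returns 0 on success, -1 on failure. */
int save_atomically(const char *path, const char *data)
{
    char tmp[4096];
    size_t len;
    FILE *f;
    int n;

    assert(path != NULL && data != NULL);
    n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;                   /* path too long for the buffer */

    f = fopen(tmp, "wb");
    if (f == NULL)
        return -1;

    len = strlen(data);
    if (fwrite(data, 1, len, f) != len) {
        fclose(f);
        remove(tmp);
        return -1;
    }
    if (fclose(f) != 0) {            /* buffered data is flushed here */
        remove(tmp);
        return -1;
    }
    if (rename(tmp, path) != 0) {    /* POSIX: atomically replaces path */
        remove(tmp);
        return -1;
    }
    return 0;
}
```

Every failure path cleans up the temporary file and never touches the original, which is the whole point.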
The third case (The PSC) you don't mention, but it isn't really a case of graceful degradation. It's just an irritating bug. Honestly I'd dump the device because of the irritation, but it actually feeds card stock out of its paper tray! A rare quality in a printer.
I suppose the more explicit point I should have made is that bad things are going to happen to software and it requires effort from the programmer to deal with it. Sometimes just a tiny bit of effort. Cubase performed so badly with a bad CD that I suspect they never tested it. They write about it in their documentation, but they probably didn't test it. The Word example is just careless programming which could have been trivially avoided if the programmers understood the platform's file system calls.
How about the cost? I estimate that it probably doubles the engineering effort to handle the exception cases to a degree that would cover the incidents I note above. In the calculus of software development the benefits do not outweigh that cost.
Or else he wouldn't think that only open source software has lousy error messages...
The user should not see errors unless they want to. I agree with sending errors to STDOUT. If the user wants to see the errors then they can either start the app in an XTERM so they can see the errors or switch virt terms over to VT1 and see the STDOUT output.
Also, I use error reporting to a logfile rather than alarming the user. Most applications should be able to survive the average error. Those applications should prompt the user for proper input - even to the point of placing the cursor in the proper field. Each field should be intelligent and be able to validate its own input data.
Those error logs I spoke of should be used by the programmer to debug his/her application - don't alarm the user ok?
Codifex Maximus ~ In search of... a shorter sig.
Something similar but worse once happened to me. I was editing something with Word while browsing the web; not doing anything out of the ordinary. I saved the file and logged off for the day. When I tried to open it again, Word refused, claiming the file format was incorrect.
I looked into the .doc with a hex editor and found that some HTML source had somehow found its way into the .doc! I was using win95, so I guess this can be chalked up to buggy filesystem code. The weirdest and most frustrating bug I've ever seen. I didn't manage to recover any of my work.
doh! heh make that STDERR heh :)) Must remember that Preview button.
Codifex Maximus ~ In search of... a shorter sig.
I don't think people are bashing the free stuff, but more along the lines of giving it the same type of scrutiny that everything else is given. Honestly if people can't take any criticism at all, you better crawl into a hole, because the *real* world is a scary place.
There is a famous mantra, all programs suck, some more than others (I'm replacing the original word OS's with programs, cause it still fits perfectly). That goes for closed, open, free, expensive everything; all programs suck, and being able to openly talk about deficiencies in them is the only way to make them suck less. It strikes me as rather two faced complaining about "people bashing free software", when just before that *you* bashed other software, for the same legitimate reasons others supposedly "bashed" free software. Again all programs suck, being critical of them makes them suck less.
Read the article. That's exactly what he said. Here's the title and the subtitle:
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
I agree with Nick. Programmer error handling sucks, but not just in Linux and Open Source. An example from work: we had a programmer write a batch file to concatenate(sp?) two to five files together into one big file. Only thing is it depended on a network drive mapping (on a volume up on a Novell server...yech) and on the files being there. If the batch file failed, there was no way to know it failed because of a network drive mapping error because bloody DOS has no frickin return codes. I WISH they'd let me and the systems programmer set these dang things up on a linux box so we could write a BASH or TCSH script with proper error checking so we could provide a return code back to the mainframe that triggers the script. That way the mainframe could holler at the operator that there's a problem. Right now, if the batch file fails it just drops thru. If it wasn't going to be replaced soon, I would rewrite the damn things, but since a new system will be entering implementation soon, we will be freezing all development except for fixing errors and fulfilling state/federal mandates. Hopefully the package we are going to (anyone ever heard of the education only package called Colleague by Datatel?.....it runs the business side as well as the scheduling, record keeping and all of the stuff a college computer system is doing....). Anyway, at least they picked the right OS and, in my opinion, the right DB for it (AIX for OS, and Oracle for DB......the other choice was....shudder....NT/2000 for OS and I believe SQL server, but it may have been something like DB/2 or something weird).
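For what it's worth, here is roughly what the wished-for script could look like, sketched in Python instead of BASH or TCSH. The exit codes and the function name are invented; the point is only that every failure path hands a distinct status back to whatever triggered the job.

```python
import os
import shutil
import sys

EXIT_OK = 0
EXIT_MISSING_INPUT = 1  # made-up codes; use whatever the mainframe job expects
EXIT_IO_ERROR = 2


def concatenate(inputs, output):
    """Join the input files into output and return an exit status."""
    # Check every input up front so a missing drive mapping is reported
    # before any output is written.
    for name in inputs:
        if not os.path.isfile(name):
            print(f"missing input file: {name}", file=sys.stderr)
            return EXIT_MISSING_INPUT
    try:
        with open(output, "wb") as out:
            for name in inputs:
                with open(name, "rb") as src:
                    shutil.copyfileobj(src, out)
    except OSError as err:
        print(f"concatenation failed: {err}", file=sys.stderr)
        return EXIT_IO_ERROR
    return EXIT_OK
```

A wrapper would finish with something like sys.exit(concatenate(sys.argv[2:], sys.argv[1])) so the calling job sees the status instead of the script just dropping through.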
Gorkman
I want Linux to continue to run on new, interesting hardware. There is an ongoing battle to get vendors to release their specs. There are also several ominous clouds on the horizon, SSSCA and TCPA (trusted computing platform architecture) which threaten to rain on our parade of cheap, open commodity hardware.
If even a small percentage of "normal" users use Linux, it will be nearly impossible for anyone to marginalize Linux and lock it out of new hardware.
Also, consider protocols. Imagine if Microsoft pushes us to a point where you need .NET to buy airline tickets, make hotel reservations, or file your tax returns. A substantial base of Linux users can apply enough pressure to keep these protocols open.
The author does have a point in his article. A lot of programs do spout nasty pointless error messages both at compile-time and at run-time. This is fine in development but stable versions should catch and properly handle such errors. That goes for any program regardless of the license it comes under. I think the main reason we notice it more on opensource apps is because they are public during development and a lot of times are already being included in your favorite distros. While the extra use does help the debugging process it can leave an impression of lack of polish.
At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
You could set up a process monitor script that either runs as a daemon or via cron. If the same process is using 80% CPU or 80% memory two samples in a row, it would kill the process and pop an xmessage saying "$program was using too much memory, so I killed it."
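The "two samples in a row" rule is the only part with any logic in it, so here is a small sketch of just that piece. Collecting the samples (ps, /proc, whatever), the actual kill and the xmessage popup are left out, and the class name HogDetector is invented.

```python
from collections import defaultdict


class HogDetector:
    """Flag a process once it exceeds the threshold on consecutive samples."""

    def __init__(self, threshold=80.0, samples_needed=2):
        self.threshold = threshold
        self.samples_needed = samples_needed
        self._strikes = defaultdict(int)

    def observe(self, pid, cpu_percent, mem_percent):
        """Record one sample; return True when the process should be killed."""
        if cpu_percent >= self.threshold or mem_percent >= self.threshold:
            self._strikes[pid] += 1
        else:
            self._strikes[pid] = 0  # one good sample resets the count
        return self._strikes[pid] >= self.samples_needed
```

Resetting the count on a good sample is what keeps a process with a brief spike from being killed.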
Well, yeah, because real core dumps go to the line printer. Everybody knows that.
This next song is very sad. Please clap along. -- Robin Zander
The main difference between a great systems administrator and a technically competent sysadmin is paranoia.
A great sysadmin would cut out their own heart before operating without known good backups. A great sysadmin would chew their own arm off before putting something into production without testing it first in a development environment. A great sysadmin *always* has a backout plan.
And how does a lowly admin reach this amazing level of greatness, you ask?
Admins get paranoid after making hideous, terrible mistakes that immediately result in Bad Things Happening.
I have personally: killed the email server for 2 days...shut down distribution for the world's largest distributor of widgets (every Thursday for 3 weeks)...destroyed all connectivity (voice and data) to the world for 12 hours...hosed the upgrade on a 700GB Oracle database (and our backups were no good). And any semi-experienced administrator will have, at minimum, two stories that are at least this bad (like my friend who shut down trading at Fidelity for a day).
And for every one of these instances, I immediately felt the wrath of: my manager, my manager's manager, other people's managers, other people who were affected, stray people wandering by my cube who weren't affected...I also became a part of the "mythical sysadmin storybook"--"I once worked with this guy, and (you won't believe this) he..."
I submit the hypothesis that: generally, most developers are not subject to this type of immediate and extremely negative form of feedback for their mistakes. Therefore it takes a developer a long time to develop an aversion reflex that conditions them to do "the right thing -- error handling, code documentation" instead of doing "the easy, interesting, enjoyable and sexy thing -- making spiffy algorithms, writing tight code".
Drifting into another analogy, error handling is like code documentation. Why do most developers get good (and a little obsessive) about documenting code? Because they finally spent some years trying to maintain someone else's tight, sexy code that is virtually incomprehensible.
So, my point is, developers take a long time to viscerally learn the need for good error handling by repeatedly getting whacked on the head for lack of error handling. It's like evolution in action.
This will probably annoy programmers who started with "pure" C++, Java, or VB.
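#include <stdlib.h>

#define SOME_NUMBER 1024 /* placeholder; the original post does not give a value */

int allocate_3(void){
    int *buf, *p1, *p2, *p3;
    /* One malloc for all three arrays: one failure check, one free. */
    buf = malloc(3 * SOME_NUMBER * sizeof(int));
    if (!buf) { return -1; }
    p1 = buf;
    p2 = buf + SOME_NUMBER;
    p3 = buf + SOME_NUMBER * 2;
    /* Here we do something with p1, p2, p3 */
    free(buf);
    return 0;
}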
* And remember, it's spelled N-e-t-s-c-a-p-e, but it's pronounced "Mozilla."
to bugs. Bugless programs don't need error checking, just input bounds checking. Use Lisp and prove your program with mathematical induction, or optionally, you can keep the same mindset in C. Just don't let the user screw up. If you have a finite set of inputs, you can very easily see that your program won't fail. I find it much easier to create functions to test that certain things work as expected than putting try blocks around things.
:)
:)
:) It's not laziness... just tired, and you aren't really getting paid for the work, so why try as hard as you do for paid work? I'm not even mentioning games which I believe are essential. All programmers need to take their frustrations out on some helpless AI creature, or else they would buckle under the stress.
Now that I've given all the tips that I'm aware of, it's time for the justification of my own faulty behavior that can't be justified
I think open source software does well for bug handling though. The biggest things I can think of that a lot of open source projects have faults with were never meant to be mission critical || are v1.0 || miscoordination caused some negative synergy. As for the first two, you should expect failures. The last is going to happen to even the best. I think it is a testament to OSS still. With such little time to invest, all the products I've seen get better every day.
And here come the excuses...
I really wouldn't call it laziness, but more a lack of motivation. The bulk of OSS is written in a geek's spare time, which in itself is small if the geek attends college and works. You have to account for all the reading a geek has to do on a daily basis. (Slashdot, Freshmeat, Changelogs, Anandtech et al, Pricewatch & EBay) Then account for all the time a geek spends perfecting his own system. (New kernel, apt-get, compiling his special favorite programs (MySQL, Apache, PostgreSQL, XBill)) By the time you get done with all the things you try to stay on top of, you really don't have much time left. From there on out, you're sleepy and are working purely on caffeine. You will inevitably make a few mistakes.
Before someone says it, I know the rewards of OSS programming. If there were no rewards, then no one would do it in the first place.
Karma Clown
I'm astonished at the poor error-handling in most software these days.
The biggest problem is not whether your language has exceptions (good error-handling has been done for years without them) or whether programmers are lazy. It's a matter of making it a priority. In fact, laziness caused a lot of us old-timers to take a major interest in error-handling.
Picture the days before internet access, running mainframe systems, probably with overnight batch cycles.
Good error handling might mean that you don't get a phone call at 3:00 am.
If that phone call comes, good error messages might mean that you can diagnose the problem over the phone and walk the operator through recovery.
In either case, you don't have to drive down to the data center.
Sleep. Now there's a motivator.
Dammit! This isn't a bloody pissing match! If we're going to set the bar so low that "it's okay as long as it's a little better than closed source", then we're destined for failure.
Instead, why don't we take this criticism at face value? "Open source programmers stink at error handling." Fine. Let's start disciplining ourselves and write our code with meticulous care. After all, we have no deadlines, we don't need to cut corners, we collectively have more time on our hands, so why couldn't we write excellent code if we trained ourselves to be careful? I think it's possible.
I'm not running XP, and I highly doubt I will in the near future. Win2K serves my needs perfectly well thank you ;) The error dialog doesn't come up for every app: the IE6 one is different to the one in Office XP, and nothing comes up after the frequent crashes of my own programmes that I'm testing. As for IE6 auto-restarting, I don't recall changing anything to make it do that. Office XP apps also restart automatically after one of these crashes.
Exceptions are mandatory for good programming, period. If the language you are using doesn't support exceptions (C, Perl, etc), you are going to have problems. Exceptions make sure that if an error occurs, and you aren't aware of it, your program dies, and doesn't go on its merry way, causing a security hole/unstable software.
C++ is implemented in C. Get out your copy of K&R and look up setjmp and longjmp. Do they sound scary? They should.
That is how C++ exceptions work too. Throwing an exception without catching it is calling longjmp without setjmp.
It is your job as a programmer to check error return values, and write your code to clean up after itself if an error is returned. Throwing an exception is a cop out from cleaning up properly.
If your app aborts when memory or disk space is low, you could lose hours of work for your user. This is not going to make the user think your app is stable.
This is Slashdot. Any post criticizing Linux or open source gets modded up. Where have you been?
War is necrophilia.
The problem with result codes is that you can't propagate the problem up to the level of scope that should be dealing with it. For example, imagine you have a GUI program. At some point, it needs to open "foo.txt", but fails. Since you're a good software engineer, you've well-separated your GUI code from logic code. The GUI needs to display an error message, but if you only check error calls, the only part that knows about the error that has happened is way down in the logic code, which has no idea how to tell the user. And propagating 'undef's all the way up through the code is uncool. Especially since return values should not be used to indicate errors; they should be used for return values.
That last sentence is stupid dogma. Take a look at the Mac OS APIs sometime. Almost all routines return an error value, of type OSErr. 0 means noErr, negative error values are well-defined by the OS. Positive errors above a certain range are left for applications to use.
With this convention, an error can be passed up the chain, and interpreted or transformed at each stage into something meaningful for the stage above.
At the GUI level, you can map error codes to strings based on these well-known values.
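To make that convention concrete, here is a small sketch in Python. The codes and names (Err, read_file, load_settings, describe) are all invented for the example and are not the real OSErr values; it only shows an error being handed up, transformed at the middle layer, and turned into a string at the top.

```python
from enum import IntEnum


class Err(IntEnum):
    """Invented codes: 0 is success, negatives are low-level, positives are the app's."""
    NO_ERR = 0
    FILE_NOT_FOUND = -1
    IO_ERROR = -2
    SETTINGS_UNAVAILABLE = 1000


MESSAGES = {
    Err.SETTINGS_UNAVAILABLE: "Your settings could not be loaded.",
}


def read_file(path):
    """Lowest layer: returns (error code, contents)."""
    try:
        with open(path, "r") as handle:
            return Err.NO_ERR, handle.read()
    except FileNotFoundError:
        return Err.FILE_NOT_FOUND, None
    except OSError:
        return Err.IO_ERROR, None


def load_settings(path):
    """Logic layer: turns a file-level code into one the layer above understands."""
    err, text = read_file(path)
    if err != Err.NO_ERR:
        return Err.SETTINGS_UNAVAILABLE, None
    return Err.NO_ERR, text.splitlines()


def describe(err):
    """GUI layer: maps a code to a string for the user."""
    return MESSAGES.get(err, f"Unexpected error {int(err)}.")
```

The logic layer never needs to know how the GUI talks to the user; it only has to return a code the GUI has a string for.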
"When was the last time Windows gave you a nice error.log when it blue-screened, or how about IIS on a buffer overflow?"
First of all logging in windows pretty much sucks ass no matter what you are dealing with. I suspect this is due to a severe lack of any decent text tools like awk, grep, tail etc. Windows admins would get too confused with utilities like that.
That aside here are my favorite error messages I deal with pretty routinely.
From Access "there is no message for this error". Oh yea that's real helpful.
When importing data into SQL server "Overflow". No mention of line numbers or data types or field names. All you know is that one line of thousands had some data that SQL server did not like. Good luck finding it. What I do here is to create the same structure in postgres and import it into there. Postgres tells me what line and what data is bad. Postgres is a great debugging tool for SQL server and in many ways a much better database.
And in ASP pages sometimes it pukes with a number (no message). A search on this number on the MS web site reveals that the error message means "exception occurred". Wow that's real helpful huh? A search on google shows many people with this problem and nobody giving solutions. My answer? Re-do the page in php.
War is necrophilia.
It's usually pretty easy to detect errors like this, for example, the program dies with a SEGV. The trick with errors is not in detecting the error, but rather in figuring out what to do when you detect it.
Is this error correctable, ignorable, or fatal?
If it is correctable, what is the correct action that corrects it. This can be more subtle than you think. And this correction code adds complexity and needs to be tested.
Which errors are minor and ignorable? IE, that are actually conditional status messages not actual errors?
What to do in a fatal error? What is the definition of a fatal error? A lot of code does not deal with resource starvation and treats running out of RAM as a fatal error. Should it? It doesn't have to, but that would make the program orders of magnitude more complicated, it would turn every allocation into a potential exception-causing step.
By avoiding these problems and making more things into fatal errors, we make software cheaper and more plentiful. Would you rather have a netscape that crashes a couple times a month, or no netscape at all?
To respond to the article, IMHO, I'd treat the complaints that those applications print out as being debugging notifications. The computer warning about possible situations that might cause problems. By the same token, that code may not be robust, but making it robust introduces complexity and thus more risk for errors.
Without even having read the article (but I've read some of his previous stuff) I'm sure that Petreley didn't base his statements on actually looking at code. No doubt he has some examples of errors that are not handled from the user perspective.
But that has nothing to do with the programmers. The difference between Open Source and commercial software here is simply that companies can afford dedicated testing staff... QA departments. Most of the errors that an idiot like Petreley will be able to find will be caught by the QA department before release. Unfunded Open Source projects can't afford that kind of QA... and with time, widely used Open Source packages tend to become higher quality than much proprietary commercial software (the thousand eyeballs effect). But early releases do tend to have errors that a QA department at a company would have caught before release. That has nothing to do with the quality of the programming.
Not to absolve IIS of blame, but when you run a .dll in proc and it hangs, the whole process hangs and there isn't much that IIS can do about it.
In my experience that's where I get most of the errors with IIS hanging.
"You can now flame me, I am full of love,"
Endless checking of pointers is pointless, and wastes CPU. A much better approach is to use good design. Simple idioms such as Resource Acquisition Is Initialisation (RAII) are much more reliable than manual pointer checks.
Thad
...but the same is equally true of the vast majority of commercial and closed source programs too. The sad fact is that jobs like reducing the number of warnings from the compiler and testing can be incredibly boring jobs that no one wants to do, so no one does them except in the most perfunctory manner.
It's been said that a lot of open source development projects ought to have some form of audit person or team whose job it is to look at the project code and, when they find problems, to go and re-educate the person who wrote the faulty code, preferably teaching him not to do it again [with a large hammer if necessary!]
Donte Alistair Anderson Roberts - hi son!
Karma: Chameleon
Open source is hardly alone in this. Commercial software may detect errors with greater regularity, but it, too, rarely does the right thing when it actually finds an error (a dialog box is not usually the right thing). Languages also often do the wrong thing: C has no exception handling or automatic cleanup, Java encourages programmers to handle exceptions poorly, and only very few languages have restartable operations. I think to address this, we need a lot more training and education, but what else is new.
I once implemented a Neural Network for a school project (from the bottom up in C++).
...
The thing was trained to recognize numbers, but we never got a success ratio bigger than 85%.
Supposedly we should've managed to get more than 90%, but there was a programming error in the code for the NN implementation.
The interesting thing is that the Neural Network actually adjusted to compensate for a bug in itself and achieved an 85% success ratio
Now that's error handling
Try explaining the code to the receptionist at your dentist's office. Can you do it? If not, maybe you don't really understand it yourself. Many problems in personal programming projects of mine were solved by explaining it to my wife. People like these tend to ask stupid questions, which generally point out your stupid assumptions.
If the code is too hard to explain, it's probably too complicated. If it's too complicated it's probably slow and buggy too. One thing I hate is that a lot of OSS projects require certain libraries that are unavailable. After development it helps to test them on a plain vanilla distro w/o a bunch of development libs just to see if they still work and if the required libs can be installed without breaking the rest of the system.
Apocalypse Cancelled, Sorry, No Ticket Refunds
you should at least be able to hit cancel, clean out a couple 100MB of GoatPorn and resave without losing anything except your patience.
If the program crashes, loses all of your work and corrupts the OS, at least others can use your program as a bad example. Actually I remember when a 1.44 MB floppy was big and fast compared to a 1500 baud cassette tape for storage.
Apocalypse Cancelled, Sorry, No Ticket Refunds
I take that you're using Linux or something similar. Suppose you were running ./foo and it went belly up (core dumped). Simply "gdb ./foo core" and you can see what the program was doing when it died. Usually simple "bt" for backtrace is enough to find the reason. If you didn't get core file you might want to check your limits (man ulimit). Usually segmentation fault is caused by incautious pointer usage - programmer made a copy of pointer and used it after original resource was freed or something like that. If you see segmentation fault in malloc() you can be pretty sure that the problem is some extra free().
_________________________
Spelling and grammar mistakes left as an exercise for the reader.
1 I want to work, not run exotic diagnostic programs on other people's software.
2 I considered the disk writing dialog disappearing after the program finished writing to the disk buffer but before the floppy was written to, to be the first indication that Windows 95® had a problem.
This is exactly the kind of thing that the topic is talking about: error and exception handling. If you can't anticipate a common user error and make your software robust enough to handle it, then your reputation is going to suffer.
Apocalypse Cancelled, Sorry, No Ticket Refunds
I wonder how many of those "error" messages really indicate errors? When I am programming, I will put in lots of messages to make debugging easier later on. I will disable them on the final compile, but there have been times when this got forgotten in the rush to release on the deadline. I wonder how often that happens with OSS -- especially since OSS releases are usually not the end of the project.
Also, messages that were intended simply to show the progress of the program or confirm it went down the correct path often inadvertently sound threatening: "Cannot find file xxxx.xxx", when what you really meant was "No initialization file xxxx.xxx found, using defaults."
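A small sketch of the difference, assuming (purely for illustration) a JSON settings file holding one object, and a made-up function called read_init_file. The missing-file case gets the informational wording; only a file that exists but can't be used gets a warning.

```python
import json


def read_init_file(path, defaults):
    """Return (settings, message); the message says what actually happened."""
    try:
        with open(path, "r") as handle:
            loaded = json.load(handle)
    except FileNotFoundError:
        # Expected on a first run: informational, not an error.
        return dict(defaults), f"No initialization file {path} found, using defaults."
    except (OSError, ValueError) as err:
        # The file is there but unreadable or malformed: that deserves a warning.
        return dict(defaults), f"Warning: could not read {path} ({err}), using defaults."
    return {**defaults, **loaded}, f"Loaded settings from {path}."
```

Either way the program keeps running with sane values, and the user can tell whether anything actually went wrong.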
Of course, as the author said, the problem isn't that OSS is worse than commercial software, but that it should be better. Is there anything in OSS as bad as the error message I sometimes got from Win95, "Cannot find file", without the file name and path? Not to mention how Windows allows an application to silently malloc some memory, forget to free it, and repeat until it crashes a different application or the OS itself...
Error Handling needs to be part of OO analysis and design. The analysis needs to understand the scope of the class being designed, and error conditions need to be part of that. To the extent that the class analysis suggests that the class can deal with the errors, the design should specify that. All others should be part of the class interface. A clean reusable class has no business outputting text to stderr (unless the basis of the class is to interface with the user or administrator). All error conditions should be given back to the program using the class, with appropriate supporting information. The application then deals with it in some way more appropriate for the user. If an object cannot allocate memory, it should tell the application, not the user. The application can then tell the user.
There is one danger in this. If Microsoft follows this practice, when a class encounters an out of memory condition, the next day you'll end up with a Fedex arriving labeled "Here is the new RAM your computer ordered for you, courtesy of .NET and Passport. Your account has been dinged".
now we need to go OSS in diesel cars
in pseudo-Eiffel:
allocate_3 is
    require
        SOME_NUMBER>=0
    local
        p1,p2,p3: INT_POINTER
    do
        p1 := malloc(SOME_NUMBER*sizeof(int))
        p2 := malloc(SOME_NUMBER*sizeof(int))
        p3 := malloc(SOME_NUMBER*sizeof(int))
        /* do something with p1,p2,p3*/
        free(p1)
        free(p2)
        free(p3)
    rescue
        free(p1)
        free(p2)
        free(p3)
    end
Notice how little changed from the original program. You can have a similar C solution and a discussion of the problem (as an example on error-handling) at this document.
Note that this solution does all these things (and compare with other solutions posted):
* frees all memory, no matter if things succeed or fail, and even if things fail in the do_something part
* checks that SOME_NUMBER is valid (non negative) and does not overflow when multiplied by sizeof(int)
* Does not have a deeply nested structure
* Has an obvious and visible flow control
* Works as a non-error when SOME_NUMBER is 0
* Allows calling routines to get the same kind of clean error-handling
* works robustly when other error conditions I haven't thought of happen.
Yes, C allows all this, but it is a pain in the neck, the code gets big and messy, and hard to maintain. So error-checking in C comes at a great cost...
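For comparison, the same shape (one cleanup path that runs whether the body succeeds or fails) can be sketched in Python with contextlib.ExitStack. Python has no malloc/free, so an invented Buffer class stands in for the allocations, and the work to be done is passed in as a function. This is an illustration of the idea, not a translation of the Eiffel.

```python
from contextlib import ExitStack


class Buffer:
    """Stand-in for a malloc'd block; records whether it has been released."""

    def __init__(self, size):
        if size < 0:
            raise ValueError("size must be non-negative")
        self.data = [0] * size
        self.freed = False

    def free(self):
        self.freed = True


def allocate_3(size, work):
    """Acquire three buffers, run work(p1, p2, p3), and always release them."""
    with ExitStack() as cleanup:
        buffers = []
        for _ in range(3):
            buf = Buffer(size)
            cleanup.callback(buf.free)  # registered as soon as the buffer exists
            buffers.append(buf)
        return work(*buffers)
```

If the second or third allocation fails, or work() raises, the buffers created so far are still released and the exception carries on to the caller, so calling routines get the same treatment.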
Two things I've learned are that (a) every "if" has an implied "else" clause that often represents an unconsidered error, and (b) those else cases, and other unexpected situations shouldn't be logged, they should be "asserted" in a way that makes the program stop dead, now. That forces you to fix them when they happen. The business the author cites of getting all these messages is truly evil, as it really helps no one, neither the programmer nor the end-user.
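A tiny illustration of point (a) and (b) together, with invented shipping methods. The else branch is the case nobody planned for, and it stops the program instead of logging and guessing. It raises AssertionError directly because plain assert statements in Python are skipped when run with -O.

```python
def shipping_days(method):
    """Days in transit for a shipping method (the methods are made up here)."""
    if method == "ground":
        return 5
    elif method == "air":
        return 2
    else:
        # The implied else: stop dead rather than carry on with a guess.
        raise AssertionError(f"unhandled shipping method: {method!r}")
```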
-dB
"It if was easy to do, we'd find someone cheaper than you to do it."
Would opensource programmers thrive, if they use a language, that requires them to provide a logical step-by-step proof of their code, side by side with the code?
Example of a function declaration, and the mathematical specification it MUST abide, with the logical proof it abides it (In plain English, as the syntax is not thought out yet):
- Define function sort.
- Function sort takes a sequence, and returns a sequence of the same type.
- The returned sequence is of the same size as the given sequence.
- For-any-element in the given sequence, there exists an identical element in the returned sequence.
- For-any-element in the returned sequence, but the first, the element before it is smaller-than or equal-to it (polymorphic smaller-than or equal-to)
With this mathematical specification, and code that sits next to the logical steps required to prove it abides this specification, we can know for sure that sort() works correctly. Whether or not it leaks memory, is another issue, but disallowing allocation of "global" memory (side-effect allocation), and mathematically specifying memory requirements, you can ensure 0-bugs there too.
Bugs in mathematical specifications will remain the only source of problems, but those would be rare, because the mathematical code is much more trivial.
As for performance, there is nothing that the semantics of the actual code must abide to, as long as it is proven to provide the mathematical requirements. Therefore, the performance of the code should be at least as high as any other language, and depending on implementation, and the chosen semantics.
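This is nowhere near a proof, but the clauses of the sort specification above can at least be written down as a runtime check on one input/output pair. A sketch, with a made-up function name, assuming the sequences are lists or tuples:

```python
def meets_sort_spec(given, returned):
    """Check one input/output pair against the clauses of the specification."""
    same_type = type(returned) is type(given)
    same_size = len(returned) == len(given)
    # "For any element in the given sequence there exists an identical
    # element in the returned sequence."
    all_present = all(item in returned for item in given)
    # "Every element but the first is preceded by one smaller than or equal to it."
    ordered = all(a <= b for a, b in zip(returned, returned[1:]))
    return same_type and same_size and all_present and ordered
```

Writing it out shows one gap: as worded, the clauses accept [1, 2, 2] as a sort of [1, 1, 2], because "there exists an identical element" does not count duplicates. Comparing element counts (for instance with collections.Counter) would close it.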
Test-First Programming (TFP) is a key part of the Extreme Programming methodology. The JUnit unit testing library has been ported from Java to pretty much every widely used language. So the tools are there to produce robust code.
Here's how it works... BEFORE you write the body of a method or function, you write a unit test(s) for that function, to make sure it provides correct results for whatever inputs you might encounter. All of those tests should fail. THEN you write the body of the method/function. All the tests then should pass. If the tests don't pass, fix until they do. If bugs are encountered later that aren't caught by the unit tests, use test-first for the repairs - that way, you know your fix actually works. Just keep adding tests as you learn more.
Now put calling those unit tests into a framework and call it from your makefile. Unit test every time you compile.
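A minimal sketch of the cycle using Python's unittest (one of the JUnit-style ports). The function parse_port and its rules are invented for the example. The test class is what gets written first; with an empty parse_port body all three tests fail, and with the body below they pass.

```python
import unittest


def parse_port(text):
    """Convert text to a TCP port number, raising ValueError on bad input."""
    port = int(text)  # int() itself raises ValueError for non-numeric text
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortTest(unittest.TestCase):
    # Written before parse_port had a body.
    def test_plain_number(self):
        self.assertEqual(parse_port("8080"), 8080)

    def test_rejects_garbage(self):
        with self.assertRaises(ValueError):
            parse_port("http")

    def test_rejects_out_of_range(self):
        for bad in ("0", "65536", "-1"):
            with self.assertRaises(ValueError):
                parse_port(bad)
```

Saved in a file of its own, this can be run with python -m unittest from a makefile target, so the tests run on every build.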
Here are some of the benefits...
1. If new code breaks old code, the unit tests catch the error, and you can fix it appropriately right away.
2. You code with far greater confidence.
3. You keep your APIs very clean, because you have to test them right away.
4. Your APIs are thoroughly documented by the unit tests themselves.
5. Maintenance, especially by other programmers, is far easier, because they have the unit tests for reference and can easily narrow down where any bugs occur.
6. Refactoring is much easier, as any errors caused by refactoring are caught by the tests.
TRY THIS. It will change your whole approach to programming!
Hand me that airplane glue and I'll tell you another story.
I've been asked before in projects, 'What kind of error handling mechanism will you/we be using?'. My response is usually a cocky, "Pft, we don't put errors in our code, so why would we look for them?".
I disagree. Code reviews do an excellent job of catching errors. However you need a code review as soon as it compiles. Code reviews will catch those bugs that you spend the first couple weeks getting rid of. [if (x=y) instead of if (x==y), and some logic errors if(Error) { normal case} else { error case}]
I just committed a piece of code that cannot be checked any other way. Code that checks for hardware errors, but the hardware modifications to reproduce it are not worth the cost, after I was done coding. Code reviews are my only chance of getting the code to work right.
The only real upshot that I see to exceptions is that they allow the error to traverse back up the calling stack (or down, however you look at it) until somebody catches the thing. All of this adds overhead to the entire program, though, once it's compiled to be aware of exceptions (in the case of C++... Java just keeps track of it all the time).
Exceptions certainly -can- be used in a proper manner but they can be abused too. One thing I'm not fond of is code like this:
Sure, in the above you should be trying to catch different exceptions (one for file IO, one for the db, perhaps one for the recordset). Once you start really getting down to a line-by-line error handling mechanism things just get awkward. Large code blocks leave you wondering how control got to the catch{} block in the first place when you're not sure which line actually tossed the error out. To do things properly you almost need to be trying{} and catching{} every single line of code IMHO. Guess what? We're back to C style error-return value handling now. I think that was my point to begin with...
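For what it's worth, there is a middle ground between one giant try block and a try around every line. A rough Python sketch, with made-up names (FileStepError, RecordStepError, load_records), where a file read and a JSON parse stand in for the file/db/recordset steps: each step converts its low-level failure into its own exception type, so whoever catches it knows which step failed.

```python
import json
import os
import tempfile


class FileStepError(Exception):
    """Hypothetical: raised when the file cannot be read."""


class RecordStepError(Exception):
    """Hypothetical: raised when the file contents cannot be parsed."""


def load_records(path):
    # Step 1: file I/O. Wrap only the lines that can fail this way.
    try:
        with open(path, encoding="utf-8") as handle:
            text = handle.read()
    except OSError as exc:
        raise FileStepError(f"cannot read {path}: {exc}") from exc
    # Step 2: parsing. A different failure gets a different exception type.
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise RecordStepError(f"{path} is not valid JSON: {exc}") from exc


def describe_failure(path):
    # The caller has a single try, yet still knows which step threw.
    try:
        load_records(path)
    except FileStepError as exc:
        return f"I/O step failed: {exc}"
    except RecordStepError as exc:
        return f"parse step failed: {exc}"
    return "ok"


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        bad = os.path.join(tmp, "records.json")
        with open(bad, "w", encoding="utf-8") as handle:
            handle.write("{not json")
        missing = os.path.join(tmp, "absent.json")
        return describe_failure(missing), describe_failure(bad)
```

It still costs a try per step inside load_records, so the complaint about verbosity stands; the difference is that the bookkeeping lives in one place instead of at every call site.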
Your third paragraph, "If the mySQL people release a free database and it loses the occasional record, and it's later determined that mySQL was used to track nuclear arms and now some are missing, should the guy who wrote that code be put in jail or fined billions? I mean, he wrote the software he should provide a warranty!" does a great job of reinforcing my point. When software vendors are pressured to guarantee a product they have much more reason to ensure that it works.
Warranties generally include conditions under which they apply or are void. If the product isn't suitable for a particular use then consumers can make an informed decision about that use.
You never really know how close to the edge you can go until you fall off.
If you can't hold on to the user's data if and when your I/O fails then it's time to take a look at the design.
OK. Yank the hard drive from the computer while it's still on. Now lock the hard drive in a safe. Now try to recover your last hour's worth of changes. Are you implying that all programs should always transparently back up off-site? That would result in unacceptable latency for users on 56K or slower connections who try to edit large documents.
OK. Now do something to make the computer swap a lot. Now yank the hard drive. How is the OS supposed to continue in such a situation?
Will I retire or break 10K?
That's why you make sure that something is watching the system all along, checking that certain things stay within limits.
Ever notice how Windows will occasionally say 'Your system is low on virtual memory'? Same thing. Presumably, you can get the app to shut itself down to prevent such a catastrophic failure.
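A minimal sketch of that kind of limit check in Python, using free disk space instead of virtual memory. The function names and the 5% threshold are assumptions for illustration; shutil.disk_usage is the standard-library call that reports total, used and free bytes.

```python
import shutil


def low_space_warning(free_bytes, total_bytes, min_free_fraction=0.05):
    """Return a warning string when free space drops below the limit, else None."""
    if total_bytes <= 0:
        return "cannot determine disk capacity"
    fraction = free_bytes / total_bytes
    if fraction < min_free_fraction:
        return f"low disk space: {fraction:.1%} free"
    return None


def check_disk(path="."):
    """Query the filesystem holding `path` and apply the limit check."""
    usage = shutil.disk_usage(path)
    return low_space_warning(usage.free, usage.total)
```

An app could call check_disk() on a timer and, when it returns a warning, save its state and refuse further writes rather than waiting for the write that fails.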
-- I'm the root of all that's evil, but you can call me cookie..
The time for numbers has passed. Use a short mnemonic keyword instead; computers handle them just as well as numbers these days, and humans handle them way better.
Open source, like everything else in life, strictly follows Sturgeon's Law: Ninety percent of everything is crud.
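Something like this, as a small Python sketch. The error names and messages are invented for illustration; the idea is just that the keyword is what shows up in logs and dialogs, not "error 17".

```python
from enum import Enum


class AppError(Enum):
    # Hypothetical error keywords; each value is the human-readable text.
    CONFIG_NOT_WRITABLE = "config file is not writable"
    DISK_FULL = "no space left on device"
    NETWORK_DOWN = "network is unreachable"


def format_error(error, detail=""):
    """Build a message that leads with the mnemonic keyword."""
    message = f"{error.name}: {error.value}"
    return f"{message} ({detail})" if detail else message
```

format_error(AppError.DISK_FULL) gives "DISK_FULL: no space left on device", which a user can read and a developer can grep for.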
I use KMail (under Gnome no less) for my email. It's a great client, and handles tens of thousands of emails without much fuss.
24 days ago yesterday, I transferred all of my account settings to a new username. Somehow I managed to forget to chown 'kmailrc', a few directories deep. I didn't notice this until I closed and re-opened KMail 24 days later...
So after 24 days of adding POPs, tweaking filters, etc, I find out these things never were written to the config file. I found this out NOT by an error message -- KMail pretended everything was fine. I only found the problem after losing the settings that had apparently been in memory for the last month...
Frustrating to say the least; I would have appreciated even "Can't open 'kmailrc': permission denied" or better yet a chance to chown and retry. Nonetheless, I haven't found anything better (and it was my screw-up), and I don't have time to try and get Evolution to compile... and anything beats going back to Windoze...
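To be clear about what I'm asking for, here is a rough Python sketch of the behaviour. This is not how KMail does anything; save_settings and demo are made-up names. The only point is that a failed write comes back to the caller as a message that can be shown to the user.

```python
import os
import tempfile


def save_settings(path, text):
    """Write settings and report failure instead of pretending it worked."""
    try:
        # The with-block also closes the file, so flush errors are caught too.
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
    except OSError as exc:
        return False, f"Can't write '{path}': {exc.strerror}"
    return True, f"Saved '{path}'"


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        good = save_settings(os.path.join(tmp, "kmailrc"), "[General]\n")
        # The parent directory does not exist, so this write must fail.
        bad = save_settings(os.path.join(tmp, "missing-dir", "kmailrc"), "[General]\n")
        return good, bad
```

With the message in hand, the caller can put up a dialog and offer a retry once the user has fixed the ownership, rather than carrying on with settings that exist only in memory.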
NGWave - Fast Sound Editor for Windows
You can if you are a programmer, and familiar with the language in which the program is written, and familiar with the theory behind the program (if any).
If "submit the bug" means "report the problem to the developers" (they may not have a formal bug-tracking system; perhaps they should, but that doesn't mean they necessarily will), then, yes, one should do that, regardless of whether the software is open-source. If the developers don't know about it, they aren't likely to fix it, except by accident.
However, all too often the "fix it yourself, or submit a bug" response to complaints gets over-simplified to "fix it yourself, you have the source", which is an error - there may be users who don't have the knowledge to fix the problem themselves. If that's what you really meant, that's what you should have said, clarifying "do something useful", which, in your previous message, was preceded by "Open source means you can fix the code", a statement, that, as noted, is not necessarily true unless you use a rather unrealistic definition of "can" (e.g., "can, if you learn a programming language and spend a lot of time studying the code and the theory behind it first").
And if your posting was responding to Petreley's article, accusing him of "whining and doing nothing", note that he wasn't just complaining about specific problems (which he should report to the GNOME, KDE, etc. developers), he was complaining about an apparent general attitude, and "filing bugs" on that may mean filing bugs on projects that don't even exist yet, so that when people write some new project they take more care handling errors, including those that "can't happen".
I don't think open source software is more error prone. I just think it's more likely to *tell* you when an error happens rather than just sweep it under the rug and pretend it never happened. OSS doesn't lie. If something went wrong, it TELLS you. I'd love to see that kind of behaviour out of my Win98 desktop, so I could actually figure out why it keeps launching goofy things at startup that I don't even have installed (resulting in blank windows I have to close by hand.)
Don't label something "offtopic" unless you know the topic well enough to tell what's on topic.
Titling an article "Open source programmers stink at error handling" is an inflammatory statement. Regardless of what is actually in the body of the article, you can't place it on a Linux/OSS oriented site without expecting exactly that type of reaction from the /. crew. Hell, he probably called it what he did -hoping- to get a front page mention on slashdot.
my sig's at the bottom of the page.