Open Source Programmers Stink At Error Handling

Open source BSOD library. by aralin · 2001-10-25 10:02 · Score: 2, Funny

We really need this open source BSOD library
that would make our life more convenient and
our applications more commercial-like.

--
If programs would be read like poetry, most programmers would be Vogons.

Re:Open source BSOD library. by xmedar · 2001-10-25 10:40 · Score: 2

I always preferred the Amiga's "Guru Meditation" error messages. I just wish they had combined them with a Zen koan generator or quote of the day; it would have made figuring out what went wrong even more fun!

--
Any sufficiently advanced man is indistinguishable from God
Re:Open source BSOD library. by nomadic · 2001-10-25 17:37 · Score: 2

We really need this open source BSOD library
that would make our life more convenient and
our applications more commercial-like.

So you're attributing a Windows-specific problem to all commercial software everywhere?

In the case of some of us... by brunes69 · 2001-10-25 10:03 · Score: 4, Funny

Who spend days at a time at work (read: Stallman) without showers, removing the last 3 words provides a better description :o)

No, its not limited to OSS by ArcadeNut · 2001-10-25 10:03 · Score: 4, Insightful

Commercial Developers are just as bad. If you ever see an "Access Violation" or "Unhandled Exception in blah...", chances are, the programmer didn't do proper error handling or checking for error conditions.

Things like checking pointers to see if they are NULL before using them. Simple basic things that could prevent errors.

Error handling doesn't just mean catching the error after it's already happened. It also means being proactive about it before it happens.

A lot of programmers do not do that.

--
Visit the Arcade Restoration Workshop @ http://www.arcaderestoration.com

Re:No, its not limited to OSS by deranged+unix+nut · 2001-10-25 10:23 · Score: 3, Insightful

No, but frequently error cases are only added because good testers force the programmer to take care of all of the edge cases.

I have forced the developers that I work with to add hundreds of error case checks in the last year. :)
Re:No, its not limited to OSS by xmedar · 2001-10-25 10:59 · Score: 2

Good for you, and no, that's not sarcasm. We need better testing regimes in commercial software development companies. Developers, you should ask for some metrics on code before you join a team. Testing is crucial if you're going to ship a good product; code reviews, test harnesses etc. should all be in abundance. Sadly, what happens in a lot of cases is that Marketing says deliver on day X, where day X is some random day that they thought up rather than a "real deadline" (a "real deadline" being a competitor about to trample you, etc.). Then you have to massage the code, work 18 hours a day, and skip testing just so everything fits in with when Marketing wants to launch their ad campaign (usually about a month after their Kenyan safari). I have personally experienced this, and it's not fun at all. If you want a professional-level product you need to plan, design, implement and TEST. The most recent high-profile case I can think of is Oracle's CRM offering, which sucks and doesn't have functionality that corporations were sold on. So now I know many people looking for alternatives to Oracle, not just for CRM but for DBMS as well. Give a dog a bad name and all that.

--
Any sufficiently advanced man is indistinguishable from God
Re:No, its not limited to OSS by susano_otter · 2001-10-25 11:43 · Score: 2

Funny E.piphany story:

My company contracted them to implement a site metrics solution. . . the first inkling I had that something was wrong came as a result of the following:

SENIOR E.PIPHANY CONSULTANT: Hey, this server you've given us? Does it have any utilities on it to run periodic processes? You know, like every day at the same time, run a process?

[The consultant goes on to name two or three 3rd-party process-scheduling apps that I've never heard of. This doesn't bother me too much though, because I'm thinking:]

ME (out loud): Well, there's cron, right?

CONSULTANT: "cron"? What's "cron"? Does it schedule processes?

. . .

Two days later their senior DBA tried to tell me that I couldn't get RAID-5 out of a three-drive array...

--
Any sufficiently well-organized community is indistinguishable from Government.
Re:No, its not limited to OSS by jweatherley · 2001-10-25 20:29 · Score: 2, Insightful

Things like checking pointers to see if they are NULL before using them. Error handling doesn't just mean catching the error after it's already happened. It also means being proactive about it before it happens.

There are two ways of handling this kind of error, and which one you choose depends on circumstance:

1
if (ptr != NULL)
{
    // Do Stuff
}
else
{
    // Attempt to recover
}

2
assert(ptr);
// Do Stuff

The advantage of option one is that the program won't stop, but you also won't get as many bug reports. What if some code hundreds of lines away expected this to have worked? When the program does eventually fail you may be a long way from the real cause - debugging hell...
Option two will let you know straight away when that pointer is unexpectedly NULL, so the bug is more likely to be caught during development - a good thing!
It can be a good thing to have code that stops as soon as it hits a problem - especially development code. Though stuff like autopilots had better keep going - and have been through a good testing cycle in the first place.
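For comparison, here is the same trade-off as a minimal Python sketch. The function names and the "treat a missing list as empty" recovery policy are made up for illustration; they are not from the post above.

```python
def total_length_recovering(items):
    """Option 1: check for a missing value and recover from it."""
    if items is None:
        # Hypothetical recovery policy: treat "no list" as an empty list.
        return 0
    return sum(len(s) for s in items)


def total_length_strict(items):
    """Option 2: stop at once if the caller passed a missing value."""
    assert items is not None, "items must not be None"
    return sum(len(s) for s in items)
```

Calling total_length_recovering(None) quietly returns 0, while total_length_strict(None) stops with an AssertionError at the call that went wrong. Note that running Python with -O strips assert statements, much like compiling C with NDEBUG defined.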

--
Reverse outsourcing: it's the future
Re:No, its not limited to OSS by mr3038 · 2001-10-25 23:34 · Score: 2

[Speaking about the problems with error recovery...] What if some code hundreds of lines away expected this to have worked?
In that case this part of code is correct and the one that "expects" this part executed successfully is the buggy one. You're missing similar check(s) there!
The assert()-way is for programmers who don't want to handle exceptions but trust that they don't happen. Assert() clauses are used during development to catch programming errors but they are usually removed from the final release "because it runs faster that way". Even so, I think assert() is better than nothing. If you don't want to do real exception handling you should at least use assert()s.

--
_________________________
Spelling and grammar mistakes left as an exercise for the reader.
Re:No, its not limited to OSS by Grab · 2001-10-26 06:09 · Score: 2

Amen. Matlab is my current Private Enemy #1. In the 6 months from getting this PC loaded with Win98 (yes, I know, but this is work) to getting Matlab installed, it shut down properly 9 times out of 10. Since using Matlab, I have NEVER had the PC shut down correctly. Never. Every time, I have to do the hard poweroff and sit through ScanDisk the next morning. And this isn't just me - every engineer in my office experiences exactly the same thing. A 100% record of failure is not what I'd call acceptable for _any_ software, never mind commercial stuff that's selling for umptitum thousand dollars a seat!

Grab.

Errors? by zaius · 2001-10-25 10:04 · Score: 2

What are these "errors" you speak of? Open source has no errors...

Re:Errors? by Tachys · 2001-10-25 10:11 · Score: 2

What are these "errors" you speak of? Open source has no errors...

I think that is what he means:

Program has shut down, "error"? What error? Open source has no errors.
Re:Errors? by Xerithane · 2001-10-25 11:19 · Score: 2

There are no errors, only segfaults.

--
Dacels Jewelers can't be trusted.

New /. speed record? by MrResistor · 2001-10-25 10:04 · Score: 2

The page cannot be displayed

That's the error I'm getting. Could it possibly be slashdotted in only 3 minutes?

Too bad, I was hoping I could say something meaningful, or maybe even relevant...

--
Under capitalism man exploits man. Under communism it's the other way around.

It's not an error... by krmt · 2001-10-25 10:06 · Score: 2

it's a feature.

--

"I may not have morals, but I have standards."

That all depends on... your selection of course! by noahbagels · 2001-10-25 10:06 · Score: 3, Interesting

Why does it seem like there are as many people in the "community" criticizing open source as there are supporting it?

Two Words: Apache and Tomcat

I'm a professional who works with the closed source equivalents all the time: Netscape iPlanet server, IIS and WebLogic.

Now: before you flame - I like working with WebLogic, but it is no better than Tomcat in my opinion (as far as error reporting goes). And IIS is a piece of crap! Not to mention Netscape's overly complicated UI that blasts every change you've ever made and is completely out of sync with the flat file configs.

Need I mention that Tomcat error logging is set-up in an XML file that is easy to read, modify, and translate into a simple report for management (IT that is).

When was the last time Windows gave you a nice error.log when it blue-screened, or how about IIS on a buffer overflow?

I'm sick of the bashing of the free stuff out there. Sure, the fact that I can release one of my college projects as open source may mean that statistically there are more projects without good error reporting, but the real projects are pretty darn good.

Commercial programmers are worse! by shoppa · 2001-10-25 10:08 · Score: 2, Insightful

Error handling in a C/Unix environment is, by its nature, very difficult. But at least some open source tools have become very refined over the years and are quite good at it.

My textbook example:

The pwd command.

It takes no argument, and only produces one line of output. Despite this apparent simplicity, I've been able to get each and every pwd that ships with a commercial Unix to dump core (almost always by executing in an exceedingly deep directory.)

The GNU shellutils version of pwd, on the other hand, has never dumped core on me.

I will admit, the fact that it took two decades for a non-crashable version of pwd to become available doesn't bode well for the many other vastly more complicated programs out there in any environment. But it does speak very highly of the GNU utilities in general, and I haven't even begun to praise the thousands of folks who have worked on making these tools quite portable!

Blame this on Open Source Programmers only? by ackthpt · 2001-10-25 10:09 · Score: 3, Interesting

I've been coding for over 20 years and I've seen some beauties, and I'm sure others have as well. Like the guy who put about 500 lines of Java in one Try - Catch. I'd suggest they screen their contributors better. Use a carrot and very gentle stick approach and be certain to encourage coders to think "what could happen here and how should I handle it?" whenever writing.
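To make the contrast with the 500-line try block concrete, here is a minimal Python sketch (rather than Java) in which each step gets its own small handler. The load_settings function and the JSON settings file are hypothetical, chosen only to have two steps that can fail in different ways.

```python
import json


def load_settings(path):
    """Read a JSON settings file, reporting which step failed."""
    # Step 1: only the file access is inside this try block.
    try:
        with open(path, encoding="utf-8") as handle:
            text = handle.read()
    except OSError as exc:
        raise RuntimeError(f"cannot read settings file {path!r}: {exc}") from exc

    # Step 2: only the parsing is inside this one.
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"settings file {path!r} is not valid JSON: {exc}") from exc
```

Because each try covers one operation, the message can say exactly what went wrong instead of a generic "something failed somewhere in here".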

--

A feeling of having made the same mistake before: Deja Foobar

Testing by Taral · 2001-10-25 10:09 · Score: 3, Insightful

The real problem, IMHO, is that nobody likes to do the intensive testing that is necessary to get a program to be truly robust. We do it here at IBM, and I promise you -- it's not something I would do if I weren't being paid to do it.

--
Taral

WARN_(accel)("msg null; should hang here to be win compatible\n"); -- WINE source code

Re:Testing by sqlrob · 2001-10-25 12:50 · Score: 2

Not true. It is exceedingly difficult to see some of your own code flaws.

I think I've got it down by SanLouBlues · 2001-10-25 10:12 · Score: 5, Funny

As a professional programmer I adhere to a strict stylesheet which I think the Open Source community may appreciate a copy of:

main( arguments ){
try{
--code goes here--
}catch( exception ){
printout "I'm sorry to do that you need our $50k/year support plan. \n Thank you!"
}}

No need to thank me.

Re:I think I've got it down by selectspec · 2001-10-25 12:17 · Score: 2

// low budget version....

try {
--code--
} catch(exception e) {
// TODO....
throw e;
}

--
Someone you trust is one of us.
Re:I think I've got it down by elflord · 2001-10-25 13:20 · Score: 2

That's not legal C++ (though g++ will compile it unless you use -pedantic)

Error handling... by mcpkaaos · 2001-10-25 10:14 · Score: 5, Interesting

...is only as solid as the engineer behind it (and the design behind him/her). A poor design often results in a flaky system, difficult to implement and nearly impossible to predict. That, in turn, can result in very thin error handling. Whether or not a product is commercial has nothing to do with it. The only argument for that could possibly be that in many cases, more careful attention (in the form of testing and code reviews) is taken when a product is a revenue generator (or anything that will affect the perception of the quality of a company's engineering ability).

Ultimately, if the engineer (or team of engineers) is inexperienced, error handling will be weak and error recovery nearly nonexistent. However, a more senior engineer will generally start from error handling on up, making sure the code is robust before diving too deeply into business logic. The time taken for unit testing plays an especially large role here. The more time spent trying to break the code (negative test cases), the more likely you will have a system that has been revised throughout development to have rock-solid error handling/reporting/recovery.

[McP]KAAOS

--
It goes from God, to Jerry, to me.

yes, and most programmers suck by daveking · 2001-10-25 10:17 · Score: 2, Insightful

which is why Open Source is required. So we can all see exactly where the code stinks and we can fix it. Too bad legacy development models don't provide such advantages. Wouldn't it be cool if you could just buy quality software?

--
------DO NOT WRITE BELOW THIS LINE------

Re:yes, and most programmers suck by MrBlack · 2001-10-25 12:13 · Score: 2

I love the term "Legacy Development Models" - I'm gonna try and use that one today at our meeting.

Re:That all depends on your point of view by LordNimon · 2001-10-25 10:19 · Score: 3, Insightful

Any Linux user who tries a stunt like that deserves a seg fault (or worse)

And attitudes like that, ladies and gentlemen, are the reason why we're all going to be old and grey before Linux is accepted on the desktop.

--
And the men who hold high places must be the ones who start
To mold a new reality... closer to the heart

Re:Of Course by TheRain · 2001-10-25 10:20 · Score: 2, Funny

I think you are projecting some of your real-life experiences onto web browsers, buddy ;)

--
Please help! I'm stuck inside my virtual reality headset!

Too specific: Programmers Stink at Error Handling by victim · 2001-10-25 10:21 · Score: 3, Insightful

I was just re-re-reauthenticating my Cubase installation. The key CD is now scratched, which hung the authenticator, forced a quite ungraceful reboot, and corrupted my hard drive. (Perhaps a $150 upgrade will help. I'll never know.)

The last time I used Word a drive filled during a save operation and left me with just a mutilated copy of the original file. (I will not use it again.)

My HP PSC 750xi software informs me every morning that its controlling software was exploded and I should reboot the host computer. (I'll wait for the OS-X drivers. If they are still bad the PSC goes out the door.)

The most amazing part is that this state of affairs doesn't surprise me. If my refrigerator intermittently defrosted and melted ice cream all over the kitchen, I'd be ticked. If my car mysteriously died at stop signs, I'd get it fixed.

Programmers have managed to beat down everyone's expectations to the point where half-assed is pretty good.

The only way I see to fix it is for consumers to refuse to buy flawed products, or legislators to pass laws allowing redress for flawed products.

I don't think either is likely.

I now use OSS for my mission critical work and fix what needs it.

What's the point... by Coniine · 2001-10-25 10:22 · Score: 2, Interesting

of handling errors that should never happen? You just double the size of your code, cause schedules to be missed, make maintenance more difficult and increase the probability of a grotesque coding error. I expected more macho stuff from the slashdot audience, not namby-pamby whimpering! Sheesh! Welcome to the real world, get a thicker skin man!

8)

On a serious note : I've written commercial and non-commercial code. Sometimes I'm obsessive about completeness, sometimes I'm pragmatic. No point in generalizing about OSS vs. commercial.

slashdot effect - slashdot mirror by TV-SET · 2001-10-25 10:23 · Score: 5, Informative

I guess if /. killed the site, it should mirror it :)
Here is a select-n-middlemousebuttonclick(with my formatting):

Title: Open source programmers stink at error handling.

Outline: Commercial programmers stink at it too, but that's not the point. We should be better.

Summary: Why are we subjected to so many errors? Shouldn't open source be better at this than commercial software? Where are the obsessive-compulsive programmers? Plus, more reader PHP tips. (1,400 words)

Author: By Nicholas Petreley

Body: (LinuxWorld) -- Thanks to my very talented readers I've been able to start almost every recent column with a reader's PHP tip. I'm tempted to make it a regular feature, but with my luck the tips would stop rolling in the moment I made it official. So I want you to be aware that this week's tip is not part of any regular practice. It is purely coincidental that PHP tips appear in column after column. Now that I've jinx-proofed the column, I'll share the tip.

Reader Michael Anderson wrote in with an alternative to using arrays to pass database information to PHP functions. As you may recall from the column Even more stupid PHP tricks, you can retrieve the results of a query into an array and pass that array to a function this way:

<?PHP
$result = mysql_query("select name, address from customer where cid=1");
$CUST = mysql_fetch_array($result);
do_something($CUST);
function do_something($CUST) {
    echo $CUST["name"];
    echo $CUST["address"];
}
?>

Michael pointed out that you can also retrieve the data as an object and reference the fields as the object's properties. Here's the above example rewritten to use objects:

<?PHP
$result = mysql_query("select name, address from customer where cid=1");
$CUST = mysql_fetch_object($result);
do_something($CUST);
function do_something($CUST) {
    echo $CUST->name;
    echo $CUST->address;
}
?>

I can't help but agree with Michael that this is a preferable way to handle the data, but only because it feels more natural to me to point to an object property than to reference an element of an array using the string name or address. It's purely a personal preference, probably stemming from habits I learned using C++.

Subtitle: OCD programmers unite

Nothing could be a better segue into the topic I had planned for this week. I'm thinking about starting a group called OLUG, the Obsessive Linux User Group. Although I know enough about psychology to know I don't meet the qualifications of a person with full-fledged OCD (Obsessive-Compulsive Disorder), I confess that I went back and rewrote my PHP code to use objects instead of arrays even though there was no technical justification for doing so.

Certain things bring out the OCD in me. Warning messages, for example. It doesn't matter if my programs seem to work perfectly. If a compiler issues warnings when I compile my code, I feel compelled to fix the code to get rid of the warnings even if I know the code works fine. Likewise, if my program generates warnings or error messages at run time, I feel driven to look for the reasons and get rid of them.

Now I don't want you to get the wrong impression. My PHP and C++ code stand as testimony to the fact that my programming practices don't even come within light years of perfection. But just because I do not live up to the standards I am about to demand isn't going to stop me from demanding them. It's my right as a columnist. Those who can, do. Those who can't, write columns.

I'll be blunt. Open source programmers need to stop being so darned lazy about error handling. That obviously doesn't include all open source programmers. You know who you are.

If you want a demonstration of what I mean, start your favorite GUI-based open source applications from the command line of an X terminal instead of a menu or icon. In most cases this will cause the errors and warnings that the application generates to appear in the terminal window where you started it. (There are exceptions, depending on the application or the script that launches the application.)

Many of the applications I use on a daily basis generate anywhere from a few warnings or error messages to a few hundred. And I'm not just talking about the debug messages that programmers use to track what a program is doing. I mean warning messages about missing files, missing objects, null pointers, and worse.

These messages raise several questions. Doesn't anyone who works on these programs check for such things? Why do they go unfixed for so long? Are these problems something that should be of concern to users? Worse, what if these messages appear because of a problem with my installation or configuration, and not because the program hasn't been fully debugged? But even if it is my installation that is broken, shouldn't the application report the errors? Why do I have to start the application from a terminal window to see the messages?

Subtitle: Getting a handle on errors

At first I wondered if this was a problem that you would be more likely to find when developers use one graphical toolkit rather than another. But I see both good and bad error handling no matter which tools people use. For example, the GNOME/Gtk word processor AbiWord has been flawless lately. Not a single warning or error message appears in the console. It's possible that AbiWord simply isn't directing output to the console, but I'm guessing that it's simply a well-tested and well-behaved application.

On the other hand, GNOME itself has been a nightmare for me lately. At one point I got so frustrated that I deleted all the configuration files for all of GNOME and GTK applications in my home directory in disgust, determined never to use them again. When I regained my composure and restarted GNOME with the intent of finding the cause of the problems, the problems had already disappeared. Obviously one or more of my configuration files had been at fault. Which one, I may never know, because GNOME or some portion of it lacked the proper error handling that should have told me.

In this case I was lucky that the problems were so bad I lost my temper and deleted the configuration files. In most cases, the applications appear to function normally. Aside from being ignorant of any messages unless you start the application from a terminal, there's no way of knowing why the warnings exist, or if they are cause for concern. The warnings could be harmless, or they could mean the application will eventually crash, corrupt data, or worse.

Subtitle: Examples

Just so you know I'm not making this up, here are some samples of the console messages that appeared after just a couple of minutes of toying with various programs. By the way, did you know you can actually configure the Linux kernel from the KDE control panel? Bravo to whoever added this feature. Nevertheless, when I activate that portion of the control panel, I get the message:

QToolBar::QToolBar main window cannot be 0.

Is there supposed to be a toolbar that isn't displayed as a result? I may never know.

The e-mail client sylpheed generates this informative message after about a minute of use:

Sylpheed-CRITICAL **: file main.c: line 346 (get_queued_message_num): assertion `queue != NULL' failed.

The Ximian Evolution program generates tons of warnings, but most are repetitions. They begin with the following:

evolution-shell-WARNING **: Cannot activate Evolution component -- OAFIID:GNOME_Evolution_Calendar_ShellComponent
evolution-shell-WARNING **: e_folder_type_registry_get_icon_for_type() -- Unknown type `calendar'
evolution-shell-WARNING **: e_folder_type_registry_get_icon_for_type() -- Unknown type `tasks'

The KDE Aethera client generates even more warning messages than Evolution, but many of them are simply debug messages about what the program is doing. By the way, I finally figured out why I couldn't login to my IMAP server with Aethera. The Aethera client couldn't deal with the asterisks in my password. I could log in after I changed my password, but I still can't see my mail. The program simply leaves the folder empty and says there's nothing to sync. Here are just a few of the countless warnings I get from Aethera, including the sync message.

Warning: ClientVFS::_fact_ref could not create object vfolderattribute:/Magellan/Mail/default.fattr
Reason(s): -- object does not exist on server
Warning: VFolder *_new() was called on an already registered path
clientvfs: warning: could not create folder [spath:imap_00141, type:imap]
RemoteMailFolder::sync() : Nothing to sync!

The spreadsheet Kspread reports these errors all the time, even though what I'm doing has nothing to do with dates or times:

QTime::setHMS Invalid time -1:-1:-1.000
QDate::setYMD: Invalid date -001/-1/-1

The e-mail client Balsa popped up these messages just moments after using it:

changing server settings for '' ((nil))
** WARNING **: Cannot find expected file "gnome-multipart-mixed.png" (spliced with "pixmaps") with no extra prefixes

The Gnumeric spreadsheet only reported that it couldn't find the help file, as shown below:

Bonobo-WARNING **: Could not open help topics file NULL for app gnumeric

Many of these problems could easily have been handled more intelligently. For example, Gnumeric could have asked for the correct path to the help file, perhaps adding an option so a user can decide not to install the help files and disable the message. Unless GTK and Bonobo are a lot more complicated than they should be, it should be easy to create a generic component for handling things like this and then use the component to handle all optional help files as a rule.
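As a rough illustration of the kind of generic component described above, here is a minimal Python sketch. It is not GTK or Bonobo code; find_optional_file, its parameters, and the wording of the message are all hypothetical.

```python
import os


def find_optional_file(name, search_dirs, warn=print):
    """Return the path to an optional file, or None after one clear warning."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    # Say what is missing, where we looked, and what the consequence is.
    warn(
        f"Optional file {name!r} was not found in: {', '.join(search_dirs)}. "
        "Continuing without it."
    )
    return None
```

A caller would pass its own warn callback (a dialog, a log, or a "don't show this again" preference) so every optional file is handled the same way.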

The only conclusion I can draw is that, like most commercial software developers, many open source programmers are just plain lazy about proper error handling. But we're supposed to be better than that, and it's time we started to live up to the reputation. I realize that most of these programs are works in progress. But good error handling is not something that should be left for last. It should be part of the development process. Although I may not practice it myself, I'm not the least bit ashamed to preach it.

--
Leonid Mamtchenkov ...i don't need your civil war...

Re:slashdot effect - slashdot mirror by BlowCat · 2001-10-25 14:56 · Score: 2

The funniest error message that I've ever seen was
expect unexpected
Actually, the message was telling me that the keyword "expect" was not expected to be in that position.

Nick Petreley is a moron... by ryanwright · 2001-10-25 10:23 · Score: 3, Insightful

Nick Petreley is a moron. Intelligent people don't make blanket statements like "Open source programmers stink at error handling." Next thing you know, he'll be telling you "Closed source programmers use more descriptive variables." How the hell does he know?

Programming traits - just like preferences for pizza toppings, frequency in bathing and type of pr0n - vary from programmer to programmer. Some implement proper error handling, others could care less. It doesn't matter whether they're working on an open or closed source project. If the open-source programmers all traded places with the closed-source programmers, you'd have the same ratios of proper vs. improper error handling (although the traffic from open-source-programmers.com to goatse.cx would probably spike).

--
-Ryan, with the unoriginal sig

Re:Nick Petreley is a moron... by xmedar · 2001-10-25 11:26 · Score: 2

Remember M$ putting in non-fatal error messages in Windows to kill off DRDOS? If you don't read this

--
Any sufficiently advanced man is indistinguishable from God
Re:Nick Petreley is a moron... by krokodil · 2001-10-25 14:17 · Score: 2

I have some news for you: most open source programmers are closed source programmers as well (during their day jobs). Except few lucky ones who are working for open-source companies.

So the level of error handling is about the same, though it varies from programmer to programmer.

stop panicking! by DeadPrez · 2001-10-25 10:24 · Score: 2, Interesting

Out of all the open source software I use, my biggest complaint is with Linux and how freaking hard it is to swap a hard drive to a new machine. I can only imagine the insults that will be thrown my way, but with 98, or even NT/2000, nine times out of 10 I will have no problem with this. Sure, it takes about 30 minutes of clicking yes, install new hardware, but it works (usually). Under Linux: can't load root fs, goto panic. Grr, that bugs me like nobody's business.

I am sure there is some semi-painful way to get around this but should I really have to? If you ask me, the kernel should not panic at this "error" and should recognize it, prompt you and try to solve it (probe the new hardware and load the correct module(s)). Maybe some distros are better than others (and I shouldn't be placing this "blame" on the kernel team).

That's interesting, but how? by Perianwyr+Stormcrow · 2001-10-25 10:25 · Score: 2

I always have inaccessible boot device bluescreening problems under 2000. 98 does happily accept the other drive, but win2k seems to get far angrier. Perhaps I am missing something?

--

What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey

Last gasp error handling. by Black+Parrot · 2001-10-25 10:27 · Score: 2

Over the past few years I've used several OSS programs in pre-release versions, and the tendency I observed was for the programmers to provide "last gasp" file saves to keep you from losing work when the program crashed. For instance, I never lost a keystroke when using early versions of LyX.

I don't recall ever seeing this in a commercial product, though I haven't used any commercial products to speak of lately, so perhaps the state of the art has changed. I sure used to lose a lot of work under commercial software, though.
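For anyone wondering what a "last gasp" save might look like, here is a minimal Python sketch. It is not how LyX does it; run_with_rescue, the list-of-lines document, and the rescue file name are all assumptions, and it only covers failures that surface as Python exceptions, not a hard crash of the process.

```python
import os


def run_with_rescue(session, lines, rescue_path):
    """Run an editing session; if it fails, save the work before re-raising."""
    try:
        session(lines)
    except Exception:
        # Write to a temporary name first, then rename, so a failure while
        # saving cannot leave a half-written rescue file behind.
        tmp_path = rescue_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as handle:
            handle.write("\n".join(lines))
        os.replace(tmp_path, rescue_path)
        raise
```

The original exception is re-raised after the save, so the crash is still reported; the user just gets their text back as well.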

--
Sheesh, evil *and* a jerk. -- Jade

Re:Last gasp error handling. by Ayende+Rahien · 2001-10-26 04:23 · Score: 2

No, it had it since 95 (ver 6 or so, I think) at least, and probably before it.
They just never advertised it so much.

--
Two witches watched two watches.
Which witch watched which watch?

Nothing beats X-Box's error handling! by ekrout · 2001-10-25 10:28 · Score: 2

See my .sig for linkage!

--

If you celebrate Xmas, befriend me (538

Commercial = Competition = Usability by nihilvt · 2001-10-25 10:30 · Score: 2, Insightful

It seems a lot of open source programs do in fact have little error handling. Most open source programs seem to focus on functionality, rather than usability. It gets you from point A to point B, and doesn't give you much help in between. If the user is not a master programmer, any errors usually end up being cryptic and nonsensical. It seems that a lot of commercial software has decent error recovery, and prevents user error effectively. Now I know there are many exceptions, but I think it simply has to do with the fact that commercial programmers get paid to do their work, and competition forces companies to put out products that are competitively usable.

Excpetions are a key by ftobin · 2001-10-25 10:37 · Score: 3, Informative

Exceptions are mandatory for good programming, period. If the language you are using doesn't support exceptions (C, Perl, etc), you are going to have problems. Exceptions make sure that if an error occurs, and you aren't aware of it, your program dies, and doesn't go on its merry way, causing a security hole/unstable software.

Perl's hack at exceptions using 'die' doesn't cut it; one important thing about implementing exceptions is that your base operations (e.g., opening files, and other system functions) need to raise exceptions when problems occur. If this doesn't happen, you're only going to struggle in vain to implement good, correct code.

Exceptions are a primary reason I've moved from Perl to Python. Python's exceptions model is standard and clean. Base operations throw exceptions when they encounter problems. And my hashes no longer auto-vivify on access, thank goodness. Auto-vivification on hash access is probably one of the principal causes of bad Perl code.
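
A minimal sketch of the two behaviours described above, written for current Python 3 (where a missing file raises FileNotFoundError; older versions raised IOError). The path and the helper names try_open and try_lookup are made up for illustration.

```python
import os
import tempfile

# Hypothetical path, chosen only so that the file should not exist.
MISSING = os.path.join(tempfile.gettempdir(), "no-such-dir-for-demo", "foo.txt")


def try_open(path):
    """Report what the built-in open() does with the given path."""
    try:
        with open(path):
            return "opened"
    except FileNotFoundError:
        # The base operation itself raised; no return code had to be checked.
        return "FileNotFoundError"


def try_lookup(table, key):
    """Report what a plain dictionary lookup does with a missing key."""
    try:
        return table[key]
    except KeyError:
        # The lookup raised instead of silently creating the key.
        return "KeyError"
```

Calling try_lookup({}, "x") leaves the dictionary empty, which is the contrast with auto-vivification drawn above.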

Re:Excpetions are a key by NerveGas · 2001-10-25 10:56 · Score: 2

Interesting. I've never run across an example of using exceptions that can't be done more cleanly, quickly, and efficiently by simply using (and checking) result codes.

To show that there's been an error, I'd much rather do this:

return undef;

than this:

raise Exception.Create('This didn''t work');

And to check for an error, I'd MUCH rather do this:

die unless defined(do_something());

than this:

try
  do_something;
except
  on E: Exception do
    Exit;
end;

In fact, the programmers that I've seen use exceptions tend to be less careful than those that simply check result codes.

steve

--
Oh, you're not stuck, you're just unable to let go of the onion rings.
Re:Excpetions are a key by gizmo_mathboy · 2001-10-25 11:03 · Score: 2

Exceptions are mandatory for good programming, period. If the language you are using doesn't support exceptions (C, Perl, etc), you are going to have problems.

Well, you haven't seen Error.pm yet. It implements exceptions for Perl.

I'm not totally convinced that exceptions are necessary for good programming. A good programmer should know how to do error handling. It's nice to be able to call upon it when you need it but it should not be forced upon you, kind of like commenting your code.

Of course I love Perl and believe TMTOWTDI.
Re:Excpetions are a key by ftobin · 2001-10-25 11:05 · Score: 2

Interesting. I've never run across an example of using exceptions that can't be done more cleanly, quickly, and efficiently by simply using (and checking) result codes.

The problem with result codes is that you can't propagate the problem up to the level of scope that should be dealing with it. For example, imagine you have a GUI program. At some point, it needs to open "foo.txt", but fails. Since you're a good software engineer, you've well-separated your GUI code from logic code. The GUI needs to display an error message, but if you only check result codes, the only part that knows about the error that has happened is way down in the logic code, which has no idea how to tell the user. And propagating 'undef's all the way up through the code is uncool. Especially since return values should not be used to indicate errors; they should be used for return values.

With an exceptions model, you can let the logic code just propagate the error up to the GUI, which can then display a message to the user. It's a very clean, elegant system.

try do_something except on e.exception do exit; end; end;

(Sorry for the lack of prettiness; Slashdot's input mechanism doesn't allow <pre> tags.)

This is not how you would handle it using exceptions; you would merely say "do_something". Period. If "do_something" threw an exception, and it wasn't caught, it propagates up and the program dies automatically.
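
The GUI/logic split described above can be sketched in Python roughly as follows. Everything here is hypothetical: ConfigError, read_settings, load_profile, ui_action and the show_message callback are illustration names, not part of any real toolkit.

```python
import os
import tempfile

# Hypothetical path used only for the demonstration; it should not exist.
MISSING = os.path.join(tempfile.gettempdir(), "no-such-dir-for-demo", "foo.txt")


class ConfigError(Exception):
    """Hypothetical application-level error carrying a user-facing message."""


def read_settings(path):
    # Lowest layer: no error handling at all. open() raises on failure.
    with open(path) as fh:
        return fh.read()


def load_profile(path):
    # Middle layer: turns the low-level error into one the UI understands.
    try:
        return read_settings(path)
    except OSError as exc:
        raise ConfigError(f"Could not load {path}: {exc.strerror}") from exc


def ui_action(path, show_message):
    # Top layer: the only place that knows how to talk to the user.
    try:
        load_profile(path)
        return True
    except ConfigError as exc:
        show_message(str(exc))
        return False
```

Passing a list's append method as show_message, for example ui_action(MISSING, messages.append), collects the message without any GUI; a real front end would pass its own dialog function instead.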
Re:Excpetions are a key by ftobin · 2001-10-25 11:09 · Score: 2

Well, you haven't seen Error.pm yet. It implements exceptions for Perl.

As I stated in my post, having high-level mechanisms for exceptions doesn't cut it. Your base operations must throw them, or else you've lost out on 50% of the reasons for having exceptions. Opening a nonexistent file with open() won't raise an exception; this is a problem.

I'm not totally convinced that exceptions are necessary for good programming

Exceptions are not necessary for good programming, but they are necessary for good software engineering.
Re:Excpetions are a key by KnightStalker · 2001-10-25 11:58 · Score: 2

While this is not exactly on topic, exception-like behavior in Perl can be handled using the eval()/die()/$@ syntax.

Certainly, exception handling in C++ or Python is much more efficient and elegant.

Example:

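#!/usr/bin/perl
use strict;
use warnings;

# Run test() inside eval{} so that a die() is caught instead of ending the program.
eval { test(3) };
if ($@) {
    # $@ holds whatever was passed to die().
    print "Whoops: $@\n";
}

sub test {
    my $bob = shift;
    if ($bob == 1) {
        print "Happy\n";
    } else {
        die("Failure testing \$bob");
    }
}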

--
* And remember, it's spelled N-e-t-s-c-a-p-e, but it's pronounced "Mozilla."
Re:Excpetions are a key by ftobin · 2001-10-25 12:03 · Score: 2

Well then, Fatal.pm handles the other 50% of your quibble.

No, it doesn't, because Fatal can only sanely be applied to core operations like open(), which don't deal with objects (another must for good software engineering). For example, when I write Perl, I use IO::File to open files; Fatal doesn't help there.

Also, Fatal is very crude in that it just checks for false values. Perhaps a function fails if it returns undef, but succeeds if it returns a defined scalar, like 0, which tests as false! Fatal will flag this as an error, incorrectly.
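
The same trap exists in any language where 0 tests as false. Here is a small Python analogy (it does not reproduce Fatal.pm itself); find_index, crude_check and precise_check are hypothetical names.

```python
def find_index(items, target):
    """Return the position of target, or None if it is absent.

    Position 0 is a perfectly valid success value.
    """
    for i, item in enumerate(items):
        if item == target:
            return i
    return None


def crude_check(result):
    # "False means failure": wrongly treats a result of 0 as an error.
    return bool(result)


def precise_check(result):
    # Only the dedicated "no result" value counts as failure.
    return result is not None
```

With items = ["a", "b"], find_index(items, "a") is 0, which crude_check reports as a failure and precise_check reports as a success.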

But the "or die" idiom of Perl is nice in that it encourages the programmer to come up with a meaningful error message that is associated directly with the failing statement.

When you raise exceptions, you can associate a human-readable string with them, so your point is moot. You aren't just returning an exception; the exception is an object which, when turned into a string, is meant for human consumption. At least Java and Python are capable of doing this.

At the same time, the "or die" idiom pushes the exceptional code off to the right where it doesn't obscure the intent of the flow very much.

But with exceptions you don't even have to try to check if something failed; it automatically dies! What can be less obscuring than not being there?
Re:Excpetions are a key by ftobin · 2001-10-25 12:11 · Score: 2

But you can program badly with exceptions just as easily as without them.

I disagree with this. I think it is much easier to program badly without exceptions than with. Without exceptions, your code suddenly becomes a lot more ripe for corrupting data and causing security issues, without the user knowing there is a problem. With exceptions falling all the way back up the execution stack, it's immediately known there is a problem, and the program is halted there notifying the user, not causing hours more of run-time corrupting data.
Re:Excpetions are a key by ftobin · 2001-10-25 12:17 · Score: 2

You do realize, hopefully, that the die syntax makes it very hard to selectively catch exceptions. If I have a subroutine that does some array manipulations, and opens a file, I might want to only catch the IOError (file opening error) at the level I'm on, and pass the ArrayIndexError on up. eval() can't handle that well.

CPAN has some modules that hack some exceptions, but it's all very, very unclean. Unclean and unreadable code can lead to just as many errors.
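
For comparison, a sketch of selective catching in Python, where the handler names only the exception type it wants. The functions process and caller and the MISSING path are hypothetical; OSError and IndexError are Python's counterparts of the IOError and ArrayIndexError mentioned above.

```python
import os
import tempfile

# Hypothetical path used only for the demonstration; it should not exist.
MISSING = os.path.join(tempfile.gettempdir(), "no-such-dir-for-demo", "foo.txt")


def process(path, items, index):
    value = items[index]          # may raise IndexError
    with open(path) as fh:        # may raise OSError
        return value, fh.readline()


def caller(path, items, index):
    try:
        return process(path, items, index)
    except OSError:
        # File trouble is handled at this level.
        return None
    # IndexError is not named above, so it passes on up to our own caller.
```

So caller(MISSING, [1, 2], 0) returns None, while caller(MISSING, [1, 2], 5) lets the IndexError escape.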
Re:Excpetions are a key by iabervon · 2001-10-25 16:08 · Score: 2

Exceptions are faster, assuming that they never happen. They should never happen, therefore, unless something goes wrong that occurs only rarely and involves some real problem that has to be dealt with differently. For instance, running out of disk space or getting invalid input from the user; in these cases, the user doesn't really care about efficiency as much as correctness (if I mess up, the time spent waiting for the computer to tell me what is wrong is much less than the time it would take to redo things that have gotten messed up).

Using exceptions means that the code doesn't have to check for unusual failures in the normal case. If things are okay, it doesn't have to stop at each level and see if something went wrong; if things aren't okay, you don't care that it's slow.

Unless, of course, you're writing Oracle stored procedures, in which case exceptions are fine in the normal case.
Re:Excpetions are a key by KnightStalker · 2001-10-25 16:25 · Score: 2

I don't see why. Just test for the exceptions you're interested in; calling die($@) for the rest will pass the exception to the previous eval{} frame. It's really not any more unreadable than most perl code. Perl's lack of a switch() statement makes this a minor hassle of if/elses but that's really no trouble. It's less elegant than C++ but I certainly wouldn't call it "very hard".

You can die with objects or hash/array references as well as scalars, which adds some flexibility. Furthermore, this fulfills your requirement in another comment that internal methods use the exception handling system. eval{} will catch fatal errors from perl internals.

--
* And remember, it's spelled N-e-t-s-c-a-p-e, but it's pronounced "Mozilla."
Re:Excpetions are a key by radish · 2001-10-25 21:45 · Score: 2

You only have one layer between the OS and your file handling?? Large applications will frequently have callstacks 15, 20 or more levels deep. When the very bottom one goes "poof" you can either insert a load of GUI code at the very bottom (yeeeuch!), or write 15 loads of code propagating some error number up the stack (yuck!) or just use exceptions. Et voila, your error (together with a useful error message) is automatically thrown up the stack until something says "hey I can handle that" and displays it.

Far, far, far, far better.

--
---- Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"

No just Open Source by ozbird · 2001-10-25 10:39 · Score: 2

When you start Outlook Express, it often displays the password entry box before it has finished drawing the screen. Enter your password before the redraw has finished and Lookout locks up. (Netscape 4.7x has a similar problem, too.)

Well... error checking sucks in most languages by TrixX · 2001-10-25 10:39 · Score: 5, Informative

Most languages make error checking very hard. In particular, C and Perl, two of the most used langs in OSS development, lack good mechanisms for sane error checking. I might explain more, but it is better explained in this document.

btw, the document is part of a library that allows nicer error checking in C, called BetterC. (Yes, this is a plug, I've participated in the development).

It is modelled on Eiffel's "Design by contract", a set of techniques complemented with language support to make error checking a lot easier and semiautomatic. "Design by contract" has been described as "one of the most useful non-used engineering tools".

Re:Well... error checking sucks in most languages by TrixX · 2001-10-25 13:09 · Score: 2

> Exceptions and Design By Contract are all very
> good, but they still are bandaids on the problem

Of course, they're just tools, and in the end you need a good programmer. What I mean is: A good programmer gets a lot of help from DBC.

What would be really nice is an automatic derivation tool, so you mostly write specifications (even that doesn't solve the problem of getting the specification right). But that is too far away from current programming technology, so we must work with the best bandaids possible to get some decent quality in the software (OSS or not).

In the code example you present, I see three cases:
1) You're quite sure that y is not 0. That doesn't mean an if; it could probably be a precondition of the routine with the division. This is by far the most common case.

2) The 0 case is possible, and must be handled. Then you have 'if y=0 handle special case, else a=x/y'.

3) The 0 case is invalid, but you have no guarantee that it won't happen. This usually means bad design in DBC, but in case you must do that, you can put in an exception handler.

The range/domain situation is helped by strong typing.

In short: there are a lot of tools to help error handling. They're not perfect, but by using them we can improve software quality. Then, why not?
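
The three cases can be sketched in Python like this. The function names and the default value are made up; a plain assert stands in for a real precondition mechanism, which is a simplification.

```python
def ratio_precondition(x, y):
    # Case 1: the caller guarantees y != 0, so state it as a precondition.
    assert y != 0, "precondition violated: y must be non-zero"
    return x / y


def ratio_special_case(x, y, default=0.0):
    # Case 2: y == 0 is a legitimate input and gets its own branch.
    if y == 0:
        return default
    return x / y


def ratio_handler(x, y):
    # Case 3: y == 0 is invalid but cannot be ruled out, so catch the error.
    try:
        return x / y
    except ZeroDivisionError:
        return None
```

Only the third version pays for a handler, and only the second one mixes the special case into the normal flow.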
Re:Well... error checking sucks in most languages by TrixX · 2001-10-25 16:22 · Score: 2

> I've never had the opportunity to write any real
> software in Eiffel, but I always use a "poor man's
> DBC" in C++

Try BetterC (see my post above). You'll probably like it if you like Eiffel exception handling and DBC.
Re:Well... error checking sucks in most languages by Ed+Avis · 2001-10-25 22:14 · Score: 2

I think I should make a counter-plug for lclint (which I haven't worked on, only used), a kind of lint-on-steroids that lets you annotate your program with various kinds of assertions and have them statically checked - those most common being 'this pointer may not be null'. I haven't tried BetterC but I'll have a look. I wonder how well lclint and BetterC interoperate.

--
-- Ed Avis ed@membled.com
Re:Well... error checking sucks in most languages by TrixX · 2001-10-26 01:51 · Score: 2

I have just read it... It is nice, but it implements only half of the functionality: Definition of axioms over the abstract type (pre/post conditions and invariants), and checking of those assertions.

But it lacks the exception mechanism that makes that checking useful for writing robust programs. It is a tool just for correctness, and DBC works toward correctness and robustness.
Re:Well... error checking sucks in most languages by TrixX · 2001-10-26 04:19 · Score: 2

It's not my page. I have contributed to the project, and done some documentation translation but I am not the author.

What you recommend works just for a particular case: resource allocation/deallocation. pre/post condition+invariants, combined with a proper exception handling mechanism, allow handling this and every error handling case I've met.

I have an aversion to C++, not Eiffel ;) But people (boss/professor/etc.) sometimes force you to work in C/C++ so you need something like BetterC (or something that provides similar capabilities) to do proper error-checking without throwing up and making your code a mess. (btw, BetterC works nicely with C++ too).

Check this thread I started here, and see what people usually do. At the end I posted a DBC solution that is simple, clean, and checks a lot more than most people did WITHOUT doing almost anything special.

It is true that pre/post conditions can be implemented in C++. But it's hard to get the precise semantics, like:
* Invariant checks should be made only on calls external to the object, but not internal ones. They must be made on entry or exit
* pre/post must not be checked on function calls inside other pre/post checks.
* a precondition violation must raise an exception in the caller routine, not in the called one.

And several other things. That's why I advocate the use of a library or language facility that allows doing that with proper semantics. That is not exclusive with standard error handling techniques like auto_ptr, which help a lot too.

If we want good error checking, it's necessary to provide programmers with tools that allow them to do error checking without shooting themselves in the foot.
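
To give a rough idea of what pre/post conditions look like when a library provides them, here is a toy Python decorator. It is only a sketch: contract, ContractError and divide are hypothetical names, this is not BetterC's or Eiffel's API, and it ignores the subtler semantics listed above (invariants, and suppressing checks inside other checks).

```python
import functools


class ContractError(Exception):
    """Hypothetical exception type raised when a contract is violated."""


def contract(pre=None, post=None):
    """Attach an optional precondition and postcondition to a function."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The precondition is checked before the body runs at all.
            if pre is not None and not pre(*args, **kwargs):
                raise ContractError(f"precondition of {func.__name__} violated")
            result = func(*args, **kwargs)
            # The postcondition is checked against the returned value.
            if post is not None and not post(result):
                raise ContractError(f"postcondition of {func.__name__} violated")
            return result
        return wrapper
    return decorate


@contract(pre=lambda x, y: y != 0, post=lambda r: isinstance(r, float))
def divide(x, y):
    return x / y
```

A call like divide(1, 0) then fails with ContractError before the division is ever attempted, so the blame lands on the caller's side of the contract.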

There a no such things as errors! by toupsie · 2001-10-25 10:41 · Score: 3, Funny

The open source community should take the same stance as closed source corporations when it comes to bugs. They are not really bugs but undocumented features!

--
Strange women lying in ponds distributing swords is no basis for a system of government.

Re:That all depends on... (slightly OT) by FastT · 2001-10-25 10:41 · Score: 3, Insightful

Why does it seem like there are as many people in the "community" criticizing open source as there are supporting it?

Because the reality is that open source software is neither as good as, nor as bad as, the zealots on both sides claim. More than closed source development, open source development is subject to significant variability in the skills of its practitioners. There is some open source software out there that is complete crap, and some that is very good, and far more than either that is merely mediocre.

And I would be careful about holding up Tomcat as an (open source) triumph. It's had some major bugs all through the 3.x timeframe, and its team includes at least a few daytime professional "closed source" programmers (there's no correlation between the two, by the way).

--

The only certainty is entropy.

Them thar's fightin' words by fobbman · 2001-10-25 10:43 · Score: 2

Yeah? My code might not handle errors well but your server doesn't handle a load well and at least my code will never get /.'ed.

This ones easy. by MongooseCN · 2001-10-25 10:44 · Score: 2

Check your return values!! As simple as it sounds, so many people just don't do this. Everyone just assumes that everything will go ok. Check the return, then print out an error to stderr. To be more helpful use this define just before you print your error message to help find where the error was and debug it.
#define ERR_LOCATION fprintf(stderr, "ERROR in File: %s Line: %d :", __FILE__, __LINE__)

Then use it like so:
ERR_LOCATION;
fprintf(stderr, "foo returned %d.\n", ret);

I believe that's the correct code.

--

Outdoor digital photography, mostly in New Engl

Re:This ones easy. by PigleT · 2001-10-25 10:51 · Score: 2

The difference, speaking from personal experience of one datapoint, is that the `commercial' world employs monkeys who say "oh yeah, On Linux gethostbyaddr_r returns -2 in this case" whereas in the free world, in the next release, libc6 (note, not "Linux") will return a different code, and in the real world, people will not be so stupid as to hard-code the numbers by hand when there are perfectly good symbolic Esomething constants to be used instead.

--
~Tim
--
.|` Clouds cross the black moonlight,
Rushing on down to the circle of the turn
Re:This ones easy. by mmontour · 2001-10-25 11:07 · Score: 2

It's even easier than that - just use the built-in "assert()" macro. This is a good thing to wrap around any function call that "can't ever fail", just so that when it does fail, your program will terminate cleanly and tell you the location of the error.

(Insert standard Douglas Adams quote about things that can't ever go wrong)
Re:This ones easy. by Malc · 2001-10-25 11:32 · Score: 2

Ahhh... assert: the basis of all good pre- and post- conditions, and any sanity checks you need in between ;)
Re:This ones easy. by mmontour · 2001-10-25 11:45 · Score: 2

Good point. I stand corrected.

I don't normally use the "NDEBUG" flag, and when I looked at the man page I guess I assumed that "assert(x)" would become "x" instead of "" when NDEBUG was on. However I just tested it, and it does indeed throw away the whole statement. Not the way I'd do it, but I guess the standards committee had their reasons (or were all in a hurry to take off for a long weekend...).

Re:What errors? by Cato+the+Elder · 2001-10-25 10:46 · Score: 3, Insightful

Bullshit.

Sure, an application error in a Unix derived system is much less likely to bring down the whole system. But that's no excuse for not dealing with error conditions correctly.

Also "errors" don't just occur from bad code. Running Linux gives you no protection against drive failures, network flakiness, or plain old user error.

Sure, if the error message is misleading you can look in the source to find out what actually caused the error. Heck, even if it SEGVs you can compile it with debugging symbols and let GDB tell you what line it's failing at.

However, the open source community will NEVER attract mainstream users with that kind of attitude. Furthermore, even hardcore geeks should have better things to do than fix up crud in supposedly release quality code. Hey, it's one thing if I'm working on something clearly under development, but it's nice to be able to get stable stuff too.

That said, I don't find open source to be any worse than the commercial stuff I've worked with. With, say, Microsoft stuff, it is just much harder to distinguish bad error handling code from bad code even when no error conditions are encountered.

Commercial software does it differently by Philbert+Desenex · 2001-10-25 10:47 · Score: 2

I don't think that a straight comparison of open source to commercial software, in the context of error handling, has any merit.

I'll try to illustrate with an example. I'm running IE 5.00.2920.00 on Windows 2000. I get a huge number of "Cannot find server or DNS error" pages from IE. You know, those are the stock HTML files that IE displays that say "The page cannot be displayed", and it has a whole boatload of gibberish on it about clicking the Refresh button, contacting your network administrator, checking URL spelling, etc etc etc.

Unless the host machine is truly unreachable, I can click "Refresh" and get the appropriate page almost instantly about 80% of the time. Does that make you smell a fish? It makes me smell a fish.

The fish that I smell is commercial software handling errors in such a way as to blame anything other than itself when it encounters an error. I'm sure this works on most Windows users, because they've never used anything else, and their desktops crash all the time. Why shouldn't web sites just arbitrarily refuse to give up a page now and then? But if I'm debugging a web server that I'm telnetted to from my SPARCStation, and IE on Win2K claims that the web server can't be found 12% of the time, yet finds it instantly on refresh, I begin to see a pattern.

If you write commercial software, the pattern is to include fairly complete error handling, but make the error handling blame something else. IE didn't choke, DNS or the remote server did, or you typed the URL wrong. Anything but admit that IE had the problem.

Open source programmers don't experience pressure from marketeers and PR people and "product managers" to appear blameless. Open source programs tell it like it is, up to the limits of the programmer's articulation. That's why it's useless trying to compare the two: commercial software handles errors in order to shift the blame. Open source software handles errors in order to provide debugging information.

Re:Commercial software does it differently by zulux · 2001-10-25 10:55 · Score: 2

There is a well documented case of how Microsoft's BSOD changed from pointing out that "Windows Has Become Unstable" to "msdcex.dll has caused a fault" between Windows 95 and Windows 98. The change was made so as to make it appear that Windows was not to blame.

--
Moneyed corporations, non-working 'poor' and criminal prisoners are turning productive citizens into tax-slaves.
Re:Commercial software does it differently by zulux · 2001-10-26 08:50 · Score: 2

And in this case, I agree with them. That DLL could just as well have been a flakey 3rd party device driver, in which case the fault isn't Windows. The new msg tells what part of the OS caused the error, and if it's a Windows DLL that will be apparent (or quickly pointed out on newsgroups).

It's still the fault of Windows - a device driver should not be able to bring down the kernel of an operating system. Failure of a device driver is ok and to be expected, but to have that failure cascade down to ring zero is unacceptable.

--
Moneyed corporations, non-working 'poor' and criminal prisoners are turning productive citizens into tax-slaves.

Blanket statements by Wonko42 · 2001-10-25 10:47 · Score: 3, Funny

"Intelligent people don't make blanket statements..."

Wait for it...wait for it...ahhhhh!

This isn't really about error handling. by Nindalf · 2001-10-25 10:50 · Score: 2

This is about detected bugs which haven't been fixed yet.

Basically, his complaints boil down to, "bugs exist, causing error messages, why aren't all the ones that cause error messages fixed yet?"

Then he goes off on a confused tangent, apparently suggesting that "error handling" be added to work around any bugs. After all, if it can log the errors caused by bugs, it can respond to them in any way, up to and including fixing the problem (i.e. doing what the code should have done, except for the bug)! For example, if a system file is missing (meaning either a bug in the install, a bug in the program requesting something that isn't really a required system file, or an externally damaged system that can't be expected to work at all), just pop up a dialog to let the user search for it! Because of course the user should attempt to patch things up with his intimate knowledge of system internals instead of just seeing that there's a bug to report.

Hooooo boy....

I didn't see a single example of a genuine external error that wasn't handled properly, just bugs which should be fixed.

Looks OK to me. by WasterDave · 2001-10-25 10:53 · Score: 2

He chose some pretty bad examples of bad error handling - they all gave the module and direct cause of the error, and provided ample clues for the programmer in each case to go find out what went wrong. If we're looking for anything that open source programmers do that stinks, it's making GUI apps that pretend nothing's wrong.

Way to go, I say. Would rather have hugely detailed warnings any day.

Dave

--
I write a blog now, you should be afraid.

Re:Looks OK to me. by Enigma2175 · 2001-10-25 11:36 · Score: 2

He chose some pretty bad examples of bad error handling
In addition, most of the examples he gave were not programs crashing. I think the problem is open-source software generally is more verbose in error checking. Proprietary software generally gives you NO ERROR MESSAGES, it just crashes indiscriminately. At least OSS programs give me an explanation of what might have happened, and often directions on how to fix the problem. Windows, for example, has given me error messages like: "An unknown error has occurred in <unknown application>. The program will be terminated." I never get error messages like this in Linux. The error messages being verbose doesn't mean it is bad software, it means it is good software.

--
Enigma

Error messages need to be have error numbers by Khalid · 2001-10-25 10:55 · Score: 3, Interesting

Error messages need to have numbers associated with them. For instance, when I get ORA-1241 in Oracle, a quick search in groups.google.com will give me a lot of information about this error: why it occurred and what I can do about it. Alas, there is no such thing in most Open Source software; you just have plain text, so the search is less effective: which search keywords are you going to choose? The situation is even worse for people who use localised versions of the software, as you don't have the English translation with which to search the English archive in groups.google.com, which accounts for 80% of the posts.

What might be cool is a codified error numbering a la Oracle. I would love to have a KDE-2345 error, or a GNOME-1234 error, or Koffice-567, etc. That would make searches far more effective.
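
A minimal sketch of what such numbering could look like in Python. CodedError is a hypothetical class, and the component names and numbers are just the made-up examples from above, not real KDE or GNOME error codes.

```python
class CodedError(Exception):
    """Hypothetical error type with a stable, searchable code.

    The code stays the same in every locale; only the message text
    would be translated.
    """

    def __init__(self, component, number, message):
        self.code = f"{component}-{number}"
        super().__init__(f"{self.code}: {message}")


def describe(error):
    # What would end up on the user's screen or in the log.
    return str(error)
```

For example, describe(CodedError("KDE", 2345, "could not open config")) gives "KDE-2345: could not open config", and the "KDE-2345" part is what you would paste into a search engine whatever language the rest is in.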

Re:How about Apple? by Phroggy · 2001-10-25 10:58 · Score: 2

Seriously, I once booted a Macintosh and the only thing that came on the screen was a little "Sad Macintosh." Apparently means that your system folder is corrupt. How's that for error handling?

Actually the Sad Mac usually indicates hardware failure (failed POST). You'll notice a hex code underneath the icon; the hex code indicates the actual error. One of the reasons it doesn't give plain English errors is, the Sad Mac is in the ROM. Text strings would take more space (think back to 1984 when that was an issue). Also, the Mac hardware isn't supposed to be language-specific (notice that there are no text labels on the ports) - if English isn't your native language, you shouldn't have English error messages. On top of that, Apple originally intended the Mac not to be user-serviceable. If you got a Sad Mac, you were supposed to take it into an Apple-authorized repair center and have an Apple-certified technician (who would have a list of error codes) take a look at it.

Fortunately, Apple has changed their attitude, but the legacy Sad Mac remains. Personally I agree with you, it would be really helpful to have some idea of what the problem is without having to look up a number.

--
$x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
$x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;

Maybe people expect too much from software. by Carnage4Life · 2001-10-25 11:04 · Score: 5, Insightful

One of my friends counters arguments about software being too sloppy with the point that there is practically no other field where a product is designed to be used in such a variety of ways and expected to still be robust. For instance, let's use a car as an analogy to your complaints:

The key CD is now scratched which hangs the authenticator forced a quite ungraceful reboot and corrupted my hard drive. (Perhaps a $150 upgrade will help. I'll never know.)

How is a programmer expected to deal with the CD being scratched? Does your car still work if the transmission is damaged or half the engine has been riddled with bullet holes?

The last time I used Word a drive filled during a save operation and left me with just a mutilated copy of the original file. (I will not use it again.)

Again, a very unexpected and unnatural scenario. How well do cars function when they run out of fuel?

The most amazing part is that this state of affairs doesn't surprise me. If my refrigerator intermittently defrosted and melted icecream all over the kitchen I'd be ticked. If my car mysteriously dies at stop signs I get it fixed.

But how well would your refrigerator react if you treated it shoddily, such as by leaving it outdoors intermittently or disconnecting and reconnecting the power several times a day?

Now, I'm not trying to excuse sloppy software development but the fact of the matter is that software is constantly expected to work perfectly under situations completely outside its specifications yet we don't expect this from other items or appliances that we use.

Re:Maybe people expect too much from software. by Xerithane · 2001-10-25 11:11 · Score: 2

You hit it straight on the head - I wish I had mod points because you would definitely go up there. I was about to post in response to that guy saying mostly the same things you said, except much less well-spoken.

Programmers are people, people make mistakes. Users are people, people make mistakes. Most often I've found problems arising from stable software coming from people using the software in a way it shouldn't be used. Using a full disk and complaining because it saved a mangled version of your original is just asinine. You are saving OVER a file. It is just extra code and bloat to ensure that your drive still has enough space every time you do a file save operation. I could understand being upset if he was doing a Save As operation.. but get real, it's his fault - not the software.

This mid-afternoon rant session has been brought to you by a slow day of engineering as my current project nears delivery.

--
Dacels Jewelers can't be trusted.
Re:Maybe people expect too much from software. by p3d0 · 2001-10-25 11:23 · Score: 2

One of my friends counters arguments about software being too sloppy with the point that there is practically no other field where a product is designed to be used on such a varying degree of ways and expected to still be robust.
But there is no other field where it is so straightforward to be robust: just check for error codes. I think that cancels the flexibility issue, and so software shouldn't be any less reliable than any other device.

The last time I used Word a drive filled during a save operation and left me with just a mutilated copy of the original file. (I will not use it again.)
How well do cars function when they run out of fuel?
I don't know about yours, but mine doesn't stop dead and leave the passengers mutilated.
Now, I'm not trying to excuse sloppy software development...
Yes you are.
but the fact of the matter is that software is constantly expected to work perfectly under situations completely outside its specifications yet we don't expect this from other items or appliances that we use.
No, but I think it's fair to expect that software can tell when it is beyond its specification and fail gracefully.
If you don't like checking the result of malloc, then write a function that does it for you. When you run out of memory, do your best to exit gracefully. It's not rocket science. It's just tedious, so people don't like it.
It's also hard to test, because some error conditions may be hard to reproduce. We may be able to take a hint from the hardware crowd, who have been using "design for testability" for some time now.
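The malloc wrapper suggested above could look something like the following. This is only a minimal sketch: the name `xmalloc` is a common convention rather than anything the poster specified, and exiting via `exit()` is one assumed policy for "exit gracefully" among several.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocate or exit with a message.
 * It never returns NULL, so callers need no check of their own. */
void *xmalloc(size_t size)
{
    /* malloc(0) may legitimately return NULL, so ask for at least 1 byte. */
    void *p = malloc(size ? size : 1);

    if (p == NULL) {
        fprintf(stderr, "out of memory allocating %zu bytes\n", size);
        exit(EXIT_FAILURE);   /* runs atexit() handlers and flushes stdio */
    }
    return p;
}
```

Any cleanup the program needs (closing files, removing locks) can be registered once with `atexit()` instead of being repeated at every allocation site.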

--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
Re:Maybe people expect too much from software. by Ardax · 2001-10-25 12:28 · Score: 2

I think that you're trying to link together things that don't really go together in the real world.

A computer should deal with a scratched CD by going "Uh, garbage data. Don't like that. Shut down system gracefully, don't try to find out the sound of heads on platters."

If the transmission is screwed up in your car, you don't expect it to crack the drivetrain.

EVERY piece of software should be checking for sufficient disk space during a WRITE operation. Only fools and deities do otherwise. You shouldn't blindly try to overwrite the old file with the new and hope it works unless you've got a damn good reason.
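One common way to avoid blindly overwriting the old file is to write the new contents to a temporary file and only swap it in once every write has succeeded. The sketch below assumes a POSIX-like system where `rename()` replaces an existing file; `save_atomically` and the `.tmp` suffix are made-up names for illustration, and a production version would probably also want to force the data to disk before renaming.

```c
#include <stdio.h>

/* Hypothetical helper: replace `path` with `len` bytes from `data`,
 * leaving the old file untouched unless every step succeeds.
 * Returns 0 on success, -1 on failure. */
int save_atomically(const char *path, const void *data, size_t len)
{
    char tmp[4096];
    FILE *f;
    int n;

    /* Write next to the target so rename() stays on one filesystem. */
    n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    f = fopen(tmp, "wb");
    if (f == NULL)
        return -1;

    if (fwrite(data, 1, len, f) != len) {   /* e.g. the disk filled up */
        fclose(f);
        remove(tmp);
        return -1;
    }
    if (fclose(f) != 0) {                   /* buffered data can fail to flush here */
        remove(tmp);
        return -1;
    }
    if (rename(tmp, path) != 0) {           /* the original is still intact */
        remove(tmp);
        return -1;
    }
    return 0;
}
```

If the disk fills during the save, the caller gets -1 and the previous version of the file is still there.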

When your car runs out of fuel, you certainly don't expect it to ruin the engine. Give it fuel (or disk space, as it were), and it's happy again.

There's a difference between dying heroically and taking everyone with you. I don't mind if a program barfs on occasion because of slight variance in the phases of the moon. The software industry is still young and needs to learn more lessons from the engineering industry. I mind when it barfs and takes my data along with it.

--
Pax, Ardax
Re:Maybe people expect too much from software. by gilroy · 2001-10-25 18:39 · Score: 2

Blockquoth the poster:

as they provide a guarantee that the car is functional and safe for normal use.

Ah, only because your garage hasn't discovered the joys of the End User License Agreement.

Yet.

--
The Mongrel Dogs Who Teach
Re:Maybe people expect too much from software. by armb · 2001-10-25 23:39 · Score: 2

> > The key CD is now scratched which hangs the authenticator
> how is a programmer expected to deal with the CD being scratched?
By not requiring one specific non-copyable CD as a dongle? At the very least, by putting up an error message saying "sorry, I couldn't read that". _Not_ by rebooting and corrupting the hard disk.

> > a drive filled during a save operation
> Again, a very unexpected and unnatural scenario.
No it isn't. That's the sort of attitude that leads to crap code in the first place.

> How well do cars function when they run out of fuel?
With most cars, you fill them with fuel again and then they continue working (sometimes you might need your injection system cleaned). Word didn't just fail to save his file, it mangled what he already had.

--
rant
Re:Maybe people expect too much from software. by Herbmaster · 2001-10-26 01:35 · Score: 2

One of my friends counters arguments about software being too sloppy with the point that there is practically no other field where a product is designed to be used on such a varying degree of ways and expected to still be robust. For instance, let's use a car as an analogy to your complaints

That's okay, software more than makes up for this bit by being about the only field where you can be so certain about the environment it's used in. Cars are subject to various terrain conditions that the manufacturer never could have guessed (this could be half the user's fault), and environmental conditions which are rare and possibly very difficult to simulate during the development process (this is not). To prepare a car to theoretically resist any kind of failure you need a really good physics model and lots of really good test cases (lots = billions). Recompiling and testing 1000 times does not even begin to compare to building a car from a new design 1000 times, and testing. And I suspect 1000 tests on a piece of software would probably find/solve many more of the bugs in it.

Computer software has all kinds of luxuries here. For one, all input to a piece of software is digital. I hope you understand how much easier this makes bounds/value checking. Software gets to come with hardware and software requirements which it gets to enforce: Oh, you aren't running at least Win98 on at least a PentiumII? I won't run then. Cars come with a manual that says "put 89 octane gas in me. Don't drive me on roads which are unpaved, save the pointy nails littered all over them. Don't drive around on Antarctica cuz it's cold there." But if you defy any of these rules, you'll succeed at least part way, and then you can complain to the manufacturer, who will tell you to piss off. It would be really cool to see a car that actually tested the octane of the gas you put in it, and complains immediately if it's incorrect, but I don't see anyone writing an article that auto makers owe us this feature.

We're not talking about physical damage here. If you throw your hard drive at a lamp post, and then complain to your software vendor that the software doesn't work, you've got issues. But yes, software that corrupts your hard drive (???) because a CD is scratched is totally negligent.

--
I'm not a smorgasbord.
Re:Maybe people expect too much from software. by Shotgun · 2001-10-26 01:42 · Score: 2

Does your car still work if the transmission is damaged or half the engine has been riddled with bullet holes?

If the transmission is damaged, should the tire go flat for good measure?

Again, a very unexpected and unnatural scenario. How well do cars function when they run out of fuel?

How the hell is running out of fuel or disk space unnatural or unexpected? By your analysis, if you run out of fuel, your car should explode and the manufacturer could just say, "Heh, he let it run out of gas. He should've gotten a bigger tank."

But how well would your refrigerator react if you treated it shoddily such as by leaving it outdoors intermittently or disconnecting and reconnecting the power several times a day?

I take it outside, the pretty white paint turns yellow. I unplug it, frozen stuff melts...eventually. But I would be PISSED if I unplugged it and the door fell off.

We can expect more intelligence from computer software, because computer software is more intelligent. A program can take reasonable precautions to check if a step succeeded before trying to use the results of that step, and then give a reasonable error message or take an alternative course if the results were unexpected. Instead we get even less with software than we do with other things. If my car won't start, I can have my wife try to crank it while I listen for strange noises under the hood. If my Windows computer won't start I just get a message about an unexpected operation. (My Mandrake-Linux boxen tell all sorts of things about what's going on as they boot up, so I was easily able to find that IPForwarding wasn't working because I hadn't compiled the correct modules.)
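As a rough illustration of "check that a step succeeded, then report or take an alternative course", here is a sketch of a function that reads one integer setting from a file. The name `read_setting` and the fall-back-to-default policy are assumptions made for the example, not anything described in the post.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read an integer setting from `path`.
 * On any failure, say what went wrong and return `fallback` instead. */
int read_setting(const char *path, int fallback)
{
    FILE *f = fopen(path, "r");
    int value;

    if (f == NULL) {
        /* strerror(errno) turns the error code into readable text. */
        fprintf(stderr, "cannot open %s: %s; using default %d\n",
                path, strerror(errno), fallback);
        return fallback;
    }
    if (fscanf(f, "%d", &value) != 1) {
        fprintf(stderr, "%s: no integer found; using default %d\n",
                path, fallback);
        value = fallback;
    }
    fclose(f);
    return value;
}
```

The message names the file and the reason, which is the difference between "an unexpected operation" and something the user can act on.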

That said, I've found that the commercial software I've used just dies without ever telling me anything, so I usually have no clue what happened or how to fix it. OSS tends to report a ton of cryptic garbage that I can do a grep on the source code with. I'd be no better off with OSS in most cases if it wasn't for the fact that I'm a software engineer by trade.

--
Aah, change is good. -- Rafiki
Yeah, but it ain't easy. -- Simba
Re:Maybe people expect too much from software. by scrytch · 2001-10-26 04:38 · Score: 2

how is a programmer expected to deal with the CD being scratched? Does your car still work if the transmission is damaged or half the engine has been riddled with bullet holes?

The authenticator software can return "Error Reading CD" and not completely lock up and hose everything on the system, perhaps?

How well do cars function when they run out of fuel?

They stop. When you fill them up with gas, everything is fine again, no permanent damage is done. This is a simple and extremely common boundary condition that the automobile manufacturers account for. Running out of space is also a common condition that should simply result in a denied operation, not corruption of data.

But how well would your refrigerator react if you treated it shoddily such as by leaving it outdoors intermittently or disconnecting and reconnecting the power several times a day?

Your analogies are getting very thin. The fact that computers are designed for more varied conditions of their runtime environment should not excuse them from being robust in the face of that fact.

--
I've finally had it: until slashdot gets article moderation, I am not coming back.

Hrmm by NitsujTPU · 2001-10-25 11:04 · Score: 2, Insightful

First let me say that I am a Linux user and an open source advocate.

Now let me compare this to a judge I once met, who said that men have more tickets in general, but women always follow too close.

This is interesting, but if we further evaluate, one could conclude that women are just as bad (equally so), but perhaps people were lighter on them along the way. A police officer might have let her off, and so forth (this isn't to sound misogynist of course, but I know women who get let off all of the time).

Instead, following too close is an easy prelude to... an accident. After all, when your bumpers are crushed together, you're too close.

Now think of error handling. "Open Source Software handles errors poorly," is another way of saying that it too crashes a lot. Perhaps other people get caught for other things, but we only rag on open source when it crashes.

This isn't to say ALL open source software though.... but let's be perfectly honest. Programming is a difficult profession that a lot of people think they can just pick up. How many people would volunteer to do surgery without med school because they read a book on the subject? How many people get offended when you flash some important programming credentials in front of them that they don't have?

The trick is sifting the wheat from the chaff. Sure, a 14 year old with a little ambition can whip up a pretty impressive looking windowed program in X... but he doesn't have the sophistication of a well educated programmer... generally. There are plenty of good programmers and bad programmers in open source. The key is to know what's good and what's bad. If you can't figure that out, then buy a distro made by people who do.

No kidding... by X · 2001-10-25 11:05 · Score: 2

Heh, the link to the site gives me a Proxy Error. ;-)

--
sigs are a waste of space

THAT is your answer? by Nindalf · 2001-10-25 11:06 · Score: 2

Exceptions make sure that if an error occurs, and you aren't aware of it, your program dies, and doesn't go on its merry way, causing a security hole/unstable software.

You mean like that Ariane rocket that blew up when its double-redundant computer system was halted because of an utterly irrelevant uncaught exception? Yeah, that's definitely a superior error-handling philosophy.

Aside from the conceptual problems of what are essentially COMEFROM statements with scope management, there's no reason to assume that halting the program is better than just allowing it to run.

Re:THAT is your answer? by ftobin · 2001-10-25 11:15 · Score: 2

You mean like that Ariane rocket that blew up when its double-redundant computer system was halted because of an utterly irrelevant uncaught exception? Yeah, that's definitely a superior error-handling philosophy.

I'm not familiar with the rocket you describe, but yes, it is a superior error-handling philosophy. Imagine if there was an unchecked error, and the rocket, instead of detonating, landed in civilian housing? That's precisely what not using exceptions allows for: programs that become destructive because of lack of error management.

Aside from the conceptual problems of what are essentially COMEFROM statements with scope management, there's no reason to assume that halting the program is better than just allowing it to run.

That's like saying there's no reason to assume knowing about a bug is better than just allowing a program to go on its merry way. Uncaught bugs are the cause of 99% of the security holes out there. It's always better to know when there is a problem.
Re:THAT is your answer? by Nindalf · 2001-10-25 12:17 · Score: 3, Insightful

I'm not familiar with the rocket you describe, but yes, it is a superior error-handling philosophy. Imagine if there was an unchecked error, and the rocket, instead of detonating, landed in civilian housing?

Why would you assume the rocket was intentionally detonated by the computer? Its computers went down and it went completely out of control. It was only blown up after it broke apart because it happened to go into a spin. There is no upside to this computer failure.

You call blowing up a commercial satellite launch vehicle non-destructive? If this error was ignored the rocket would not have been affected by it, it was an utterly irrelevant mathematical value overflow error in a program that only did anything before launch.

This program became destructive because of the "error management." In particular, the error management philosophy that halting a suspicious system is always safer than allowing it to run.

The point you seem to have missed is that halting the program is often more destructive than ignoring the error. Data loss, control loss, vital services suspended, etc.

That's like saying there's no reason to assume knowing about a bug is better than just allowing a program to go on its merry way. Uncaught bugs are the cause of 99% of the security holes out there. It's always better to know when there is a problem.

I'm sure the European Space Agency found it worth every penny of the estimated half-billion dollars lost to find this otherwise irrelevant bug. After all, it's always better to know, whatever the cost of halting the system, right?
Re:THAT is your answer? by dvdeug · 2001-10-25 12:43 · Score: 2

The code was designed for the Ariane 4; due to money and political reasons, none of the programmers knew it was going to be used for the Ariane 5, and no one ever checked to see if it would work right for the Ariane 5. Pretty nasty circumstances. If I understand the case correctly, it was a hardware exception, not a software exception, in any case.

what are essentially COMEFROM statements with scope management

And for loops are essentially if-goto statements. The added structure is what makes them useful.

there's no reason to assume that halting the program is better than just allowing it to run.

So you would prefer a hacker gain root because of a buffer overflow, rather than have the attempt crash your webserver? A lot of the exception-causing problems, if just left to run, will spew garbage and crash, exchanging clear debugging info for a few microseconds of trash. If it's important that it continue to run, I can put an exception net at any level (esp. the highest level) and restart whenever an exception is caught, possibly taking different code routes.
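C has no exceptions, but the "exception net at the highest level, restart when something is caught" idea can be approximated with `setjmp`/`longjmp`. The sketch below is only an analogy under that assumption: `serve` and `run_with_restart` are invented names, the worker is rigged to fail on its first attempt, and unlike real exceptions `longjmp` releases no resources on the way out.

```c
#include <setjmp.h>
#include <stdio.h>

static jmp_buf net;              /* the top-level "exception net" */

/* Hypothetical worker: fails on its first attempt only. */
static void serve(int attempt)
{
    if (attempt == 0)
        longjmp(net, 1);         /* "throw": unwind straight to the net */
}

/* Run serve(), restarting after a failure, up to max_tries times.
 * Returns the number of the attempt that succeeded, or -1. */
int run_with_restart(int max_tries)
{
    /* volatile so the value is reliably preserved across longjmp */
    volatile int attempt = 0;

    while (attempt < max_tries) {
        if (setjmp(net) == 0) {
            serve(attempt);
            return attempt;      /* finished without an error */
        }
        fprintf(stderr, "attempt %d failed, restarting\n", attempt);
        attempt++;
    }
    return -1;
}
```

With a budget of three tries the first attempt fails and the second succeeds; with a budget of one the function gives up and reports failure to its caller.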
Re:THAT is your answer? by Nindalf · 2001-10-25 12:52 · Score: 2

When the value exceeded the max size of int, the value went negative

Yes, but...

and the computer thought the rocket had flipped and auto-destruct was triggered.

...no, this is simply wrong. Where did you get this idea? A simple search on google will let you confirm the inaccuracy of this claim with a dozen independent sources. The code in question served no purpose in the air, it was used to align the rocket on the ground. The flight software worked perfectly.

When it went negative, it failed an assertion and threw a math exception. The system read this as "this chip is fried" and shut it down. Then, because the 2 redundant backups were running the exact same programs with the exact same data, it did this twice more. With the computers down, the rocket went wild, and then blew up.

Like most disasters on this scale, many mistakes had to be made simultaneously. If they had stopped running this irrelevant code on launch, it would have been fine, if they had used a larger integer, it would have been fine, if they had used Ada's integer protection, it would have been fine, if they had caught the exception, it would have been fine.

But the things that get me are 1) that the system was set up not to pass control from a seemingly defective chip to a seemingly good chip, but to yield control from any seemingly defective chip, whether there was another good chip or not and 2) any uncaught exception (not just a specific "We're really sure there's a hardware problem." exception) was taken as proof that the chip was defective. These were both well-known, meaning that there was a conscious, considered decision that it was better to halt the system, and completely shut down the guidance system (the only possible result of an uncaught exception due to a software bug) than to face the unknown consequences of running despite an exception they didn't think to catch.

The blame lies with all of those flaws, not just one or two of them. It indeed also lies with the testing that didn't catch it and the decision to reuse legacy code of an older, slower rocket. But note that most of the direct causes were consequences of the exception-handling style of error-handling, and the single most direct cause was a mechanism inspired by the specific belief that halting the system was better than allowing it to run with any one unknown flaw.
Re:THAT is your answer? by dvdeug · 2001-10-25 16:48 · Score: 2

This program became destructive because of the "error management."

This program became destructive because of poor managers. Any test would have shown that the rocket would tilt more than the computer could handle. The reason the computer couldn't handle it was because it wasn't designed to run on that rocket, and no one bothered to check to see if it could run on that rocket. (That exception couldn't have been raised on the Ariane 4 so the check was removed because they didn't have the spare cycles.) On the Ariane 4 (the rocket it was designed for) it worked fine.

It's like ripping a heat sink off a CPU and blaming the CPU for melting down. Maybe the CPU should run cooler, but the big problem was that the CPU was never designed to run without a heat sink.

Data loss, control loss, vital services suspended, etc.

Like you can't get all of those by ignoring exceptions. When a hacker tries a buffer overflow, I'd prefer my webserver to crash rather than give him root. I'd rather a program crash with a pointer to the bug (where the exception was thrown) than just spit garbage and hang or crash. Either way it's dead, but one way is a clean, safe, informative death.
Re:THAT is your answer? by radish · 2001-10-25 21:40 · Score: 2

When you're dealing with something which could quite easily kill 500 people if it landed in the right place, you bet you need to err on the side of caution. I studied this exact case a few years back, and the rocket was auto-destructed by the safety systems directly because of the exception. This was 100% correct - if you are not totally sure that everything is perfect, you abort. The actual reason for the failure was bad management and testing practise, which allowed a system designed & tested for a different launch vehicle to be used to "save money". Blame accountants & PHMs!

--
---- Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"
Re:THAT is your answer? by TrixX · 2001-10-26 01:40 · Score: 2

The Ariane exploded because the "horizontal bias", a 64 bit floating point number, was converted to a 16 bit signed integer producing an exception, because the conversion overflowed.
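For readers who want to see the class of conversion being discussed, here is a range-checked conversion from a double to a 16 bit signed integer. This is not the Ariane code (which was written in Ada); `to_int16_checked` is a hypothetical helper that only illustrates checking the range before narrowing.

```c
#include <stdint.h>

/* Hypothetical helper: convert a double to int16_t only if it fits.
 * Returns 0 and stores the result on success; returns -1 and leaves
 * *out untouched if x is out of range or NaN. */
int to_int16_checked(double x, int16_t *out)
{
    /* Written as a negated range test so that NaN, for which every
     * comparison is false, is rejected as well. */
    if (!(x >= INT16_MIN && x <= INT16_MAX))
        return -1;
    *out = (int16_t)x;   /* in range, so this just truncates toward zero */
    return 0;
}
```

What the caller does with the -1 is the real design question in this subthread: halt, substitute a saturated value, or carry on without the result.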

That exception was CAUGHT BY A HANDLER. The program didn't stop. But the handler above the faulty routine had no information about how to correct the problem, and given that the rocket was going off course, it self-destroyed (yes, that's a security measure).

So, the exception handling was used, and served its purpose of error-handling. The error was using a 16 bit integer, which was not enough (they used that because they reused the code that they had used for the Ariane 4, whose trajectory had a bias that fitted into a 16 bit signed int). If the rocket program had continued, the Ariane could have crashed onto a populated area, killing a lot of people.

So:
1) The program didn't stop abruptly (that's what exceptions are for, controlling program stop)
2) The exception mechanism possibly saved a lot of lives
3) The program did a reasonable work and ended in an expected way very well for a program with critical bugs in it.

My Ass by Greyfox · 2001-10-25 11:08 · Score: 2

I've seen (and done) a lot of commercial and custom programming for business. I've dug through a lot of open source software. Open source software wins hands down every time. Open source programmers like to code. You will find that a lot of people who are programming professionally do so because the salaries are good. They're mediocre programmers at best and their code reflects that.

Open source programmers may suck at handling errors, but commercial programmers suck much more.

--

I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

Zen GET by isomeme · 2001-10-25 11:10 · Score: 2, Funny

What is the sound of one LinuxWorld story illustrating its own point, grasshopper?

Proxy Error
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request
GET /site-stories/2001/1025.errorhandling.html.
Reason: Could not connect to remote machine: Connection refused
Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.

...and in that moment, he became enlightened.

--
When all you have is a hammer, everything looks like a skull.

Graceful degradation by PollMastah · 2001-10-25 11:21 · Score: 3, Insightful

I agree with you that software in general is a lot more complex, and used in a lot more unexpected ways, than something like a car.

OTOH, there is such a thing called graceful degradation -- that is, if you push the limits of the software, it shouldn't just suddenly barf and die on you, but degrade gracefully. Too much code I've seen (both open and non-open source) assumes too much -- and dies badly when the assumptions fail.

It is possible and not overly difficult to design software such that it degrades gracefully. Sadly to say, sloppy programming (programmers), deadline pressure, or disinterest in handling error conditions, dominate the world of software. Not many would put in the extra work to make a program degrade gracefully, because it doesn't have very visible effect -- until things start to fail. And too many programmers have this "test only the cases that work" syndrome.
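One small, concrete form of graceful degradation is to ask for less when the ideal amount of a resource is unavailable, rather than giving up outright. The following sketch halves a buffer request until it succeeds or reaches a floor; `alloc_degrading` and its halving policy are assumptions chosen for the example.

```c
#include <stdlib.h>

/* Hypothetical helper: try for `want` bytes, halving the request on
 * failure until `min` is reached. Stores the size obtained in *got.
 * Returns NULL only if even `min` bytes cannot be allocated. */
void *alloc_degrading(size_t want, size_t min, size_t *got)
{
    size_t n = want;

    if (min == 0)
        min = 1;
    if (n < min)
        n = min;

    for (;;) {
        void *p = malloc(n);
        if (p != NULL) {
            *got = n;            /* caller works with whatever it got */
            return p;
        }
        if (n == min)
            return NULL;         /* nothing smaller left to try */
        n /= 2;
        if (n < min)
            n = min;
    }
}
```

The program runs slower with a smaller buffer, but it keeps running, which is the behaviour being argued for here.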

--

Poll Mastah

Re:Graceful degradation by pmz · 2001-10-26 01:18 · Score: 2

A good example of graceful degradation is the behavior of a good UNIX kernel as virtual memory is exhausted. It still works, although slowly, since there are well-thought-out mechanisms to keep the system going.

Another example would be the Internet. It is designed to still work after much of it has been destroyed.

--
Healthcare article at Kuro5hin

Re:That all depends on your point of view by darkonc · 2001-10-25 11:21 · Score: 2

The cluelessness of an error response will change from programmer to programmer, and even from program to program. Programs that I write will vary from a reasonably complete attempt to catch and explain errors, through terse "bad stuff" messages, to ignoring the error and producing (obviously?) bad results.

The more likely that I think someone other than me is going to use a program, the more likely I'm going to put work into error support. If I'm expecting really techie types to use it, I may be more oriented to super-terse responses.

For stuff that goes to the general public, I'm more likely to do nice error work, but even that may be cut short if I'm really short on time. Of course, there's the problem of systems that are originally intended for me but then get released to an unintended audience... These are the ones most likely to have what I consider inappropriate error handling for the audience. I think that this re-targeting of programs originally intended for internal consumption is also the source of sub-optimal error handling in some open source programs.

Some principles that I've learned for error reporting are (in roughly decreasing order of importance):

it shouldn't do the unexpected in the face of the unexpected/unwanted
the response should be reasonable and predictable
error messages should indicate (as much as possible) where the error originated
recovery should be as easy as possible
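For the third principle in the list above, C's `__FILE__`, `__LINE__` and `__func__` make it cheap to say where an error originated. The sketch below is one possible shape; `format_error` and `ERROR_HERE` are names invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper: format "file:line: function: message" into buf.
 * Returns what snprintf returns (the length the full text needs). */
int format_error(char *buf, size_t size, const char *file, int line,
                 const char *func, const char *msg)
{
    return snprintf(buf, size, "%s:%d: %s: %s", file, line, func, msg);
}

/* Fills in the call site automatically. */
#define ERROR_HERE(buf, size, msg) \
    format_error((buf), (size), __FILE__, __LINE__, __func__, (msg))
```

A message such as `save.c:42: save_file: disk full` is still terse, but it is consistent and greppable, which fits the ordering of priorities given above.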

Now I think that I'm going to get flamed for putting easy recovery at the end, but that's really where I have it. I'd rather have consistent and reasonable (but cryptic) responses than really nice messages and erratic behaviour. If it happens with any regularity, I believe that erratic behavior is going to be more off-putting to users than terse messages.

Now ideally, I'm going to achieve all of the above, but that's going to depend on how much time I have to put into the project in question. I guess that the listing I gave is the order in which I plan for error reporting depending on the 'techiness' of the expected user.. (This may also include the techiness of my own expected mood when I'm going to be running the program).

To be honest, I believe that if other users are going to be playing with my program, it's usually worth my while to go the full gamut for error response/ recovery. I know that if I don't take the time to make error recovery as easy as possible, I'm ultimately going to end up spending more than that amount of time responding to users who don't understand/like the errors that they get out of my program.

Short term investment, long term gain

--------

Microsoft has had a history of going to extremes with their error response. The response ranges from lacking all but (possibly) the most basic error handling that may not even achieve my first intent (e.g. BSOD) to things like the paper clip that are so damned helpful that they're annoying. Part of the problem, I think, is that the Microsoft culture encourages it.

When program failures are cryptic and unpredictable, it encourages support calls to Microsoft that Microsoft actually gets paid for. The other reasonable response is to go to Microsoft for training -- once again paying them for an MCSE that spends lots of time on how to placate customer dissatisfaction with Microsoft's problems. In other words, Microsoft gets paid for bad programming practices.

This seems to change, however, when Marketing decrees that problems need to be handled better. As far as I can tell, marketing seems to drive Microsoft, so when they decree that things need to change, they will.

An early example of this was when Windows 95 came out and decided to check the filesystem if the system was brought down badly. This was something that Unix and Mac boxes already did, and Microsoft probably wanted to jump on the "stability bandwagon". The result was the annoying wait at the "we're about to clean the filesystem, you naughty boy" prompt.

I expect that the paperclip was similarly mandated by Marketing (though I sometimes think that it may have started out as some programmer's prank and picked up by marketing as a 'good idea'). That feature was so annoying that it was ultimately dropped -- I think because it broke the principle of 'reasonable response'.

Something that distracts the user isn't helpful. The constant (nervous) motion of the paper clip, and its obtrusive location on top of the main screen are tricks learned by advertisers to get people's attention -- unfortunately, most of the time the user's attention is attempting to focus on the document being created. Had the paper clip been an unobtrusive text box in the toolbar, people probably would have welcomed it.

In any case (having rambled), I think that Microsoft error responses are more oriented towards making money for Microsoft than making life easier for the user (a common subtext on slashdot).

--
Sometimes boldness is in fashion. Sometimes only the brave will be bold.

Depends upon the application by Adam+Wiggins · 2001-10-25 11:29 · Score: 2

The "proper" sort of error handling varies wildly depending on the application type. For example:

A game should pretty much always abort if anything goes wrong. Missing or corrupted datafiles, etc. Diagnostic information will only be useful to programmers, so no need to make it terribly user friendly - just tell them to reinstall the game.
A server application should give detailed messages which go to a logfile, pinning down exactly what it was trying to do, and what line it was on if parsing a file.
A productivity application can go easy on the diagnostic information, like a game, but being able to recover gracefully is far more important. At the very least, allowing the user to save their data before they restart the program.
A viewer-type program should be able to recover gracefully and just fill the area it can't parse or that is causing a problem with a blank space or something. Like a web browser that can't load a picture. If it's something like a file viewer that can't load a certain font, it should be able to fall back to a default font so that the document is still readable.
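To make the server case above concrete, here is a sketch of a config check that logs the file name and line number of the first line it cannot parse. The "key=value" format, the `#` comment convention and the name `check_config` are all assumptions made for the example; lines longer than the buffer are not handled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical checker for a "key=value" config file.
 * Returns 0 if every non-blank, non-comment line contains '=',
 * the 1-based number of the first bad line otherwise, or -1 if the
 * file cannot be opened. Problems are written to `log`. */
int check_config(const char *path, FILE *log)
{
    char line[512];
    int lineno = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL) {
        fprintf(log, "config: cannot open %s\n", path);
        return -1;
    }
    while (fgets(line, sizeof line, f) != NULL) {
        lineno++;
        line[strcspn(line, "\r\n")] = '\0';      /* strip the line ending */
        if (line[0] == '\0' || line[0] == '#')
            continue;                            /* blank line or comment */
        if (strchr(line, '=') == NULL) {
            fprintf(log, "config: %s:%d: expected key=value, got \"%s\"\n",
                    path, lineno, line);
            fclose(f);
            return lineno;
        }
    }
    fclose(f);
    return 0;
}
```

A game could call the same function and simply abort on a nonzero result; the difference between the application types is mostly in what gets done with the information.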

The error handling challenge by TrixX · 2001-10-25 11:31 · Score: 2

You think checking return codes is the solution? Well, it is but at a cost.

Exercise for /. readers: add errorchecks to the following C function. 'return' and exception handling pseudocode allowed:

int allocate_3(void)
{
    int *p1, *p2, *p3 ;

    p1 = malloc(SOME_NUMBER*sizeof(int)) ;
    p2 = malloc(SOME_NUMBER*sizeof(int)) ;
    p3 = malloc(SOME_NUMBER*sizeof(int)) ;

    /* Here we do something with p1, p2, p3 */

    free ( p1 ) ;
    free ( p2 ) ;
    free ( p3 ) ;
    return 0 ;
}

Let the game begin...

Re:The error handling challenge by Mr.+Barky · 2001-10-25 12:12 · Score: 2

If the alloc of p2 or p3 fails, then you have a memory leak.
Re:The error handling challenge by TrixX · 2001-10-25 12:14 · Score: 2

The big deal is: you have an intermediate non-stable state.

So, suppose the first malloc succeeds, but the second one fails.

In that case, you have allocated p1, but then 'return -1'. That results in a memory leak, because you never freed p1.

There's one big point for garbage collection, btw. But, the same would happen with fopen()/fclose()
I posted this to show what the more common mistakes are. Yours is on the list (I've seen it zillions of times)
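The fopen()/fclose() variant of the same leak, and one way around it, might look like the sketch below: every early exit goes through a single cleanup label, so a failed malloc cannot leave the file open. The function `count_bytes` is a made-up example, not code from the thread.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: count the bytes in a file using a scratch buffer.
 * Returns the byte count, or -1 on any failure. */
long count_bytes(const char *path)
{
    long total = -1;
    char *buf = NULL;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        goto done;
    buf = malloc(4096);
    if (buf == NULL)
        goto done;            /* f is open here; the cleanup below closes it */

    total = 0;
    for (;;) {
        size_t n = fread(buf, 1, 4096, f);
        total += (long)n;
        if (n < 4096)
            break;            /* short read: end of file or error */
    }
    if (ferror(f))
        total = -1;

done:
    free(buf);                /* free(NULL) is a no-op */
    if (f != NULL)
        fclose(f);            /* fclose(NULL), unlike free(NULL), is not allowed */
    return total;
}
```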
Re:The error handling challenge by treat · 2001-10-25 13:06 · Score: 2

I must say that this is so completely and utterly classic. It is similar to when people correct someone's "grammer".
Re:The error handling challenge by TrixX · 2001-10-25 13:14 · Score: 2

> would probably be better off using a higher level
> language with exceptions and a GC or something.

That's right. That's why I started this thread as a critique to all the people that said "well, checking return codes is enough".

When you run out of memory, you probably will quit, but it would be nice to do some cleanup, like closing connections and files, saving the user's unsaved work (or doing a backup), removing locks, etc. An "if (broken) exit(1)" approach is terrible at this.
Re:The error handling challenge by csbruce · 2001-10-25 13:22 · Score: 2

#include <stdlib.h>

void SaneFree( void *ptr );

int allocate_3( void )
{
    int *p1=NULL, *p2=NULL, *p3=NULL;

    p1 = malloc( SOME_NUMBER * sizeof(int) );
    p2 = malloc( SOME_NUMBER * sizeof(int) );
    p3 = malloc( SOME_NUMBER * sizeof(int) );
    if (p1==NULL || p2==NULL || p3==NULL) goto ERR_EXIT;

    /* Here we do something with p1, p2, p3 */
    if (ran_into_error) goto ERR_EXIT;

    SaneFree( p1 );
    SaneFree( p2 );
    SaneFree( p3 );
    return 0;

ERR_EXIT:
    /* Single clean-up path for every failure. */
    SaneFree( p1 );
    SaneFree( p2 );
    SaneFree( p3 );
    return -1;
}

/* Free a pointer only if it is non-NULL. */
void SaneFree( void *ptr )
{
    if (ptr != NULL) free( ptr );
}

Here is some text to avoid stupid 'Your comment violated the "postercomment" compression filter. Try less whitespace and/or less repetition. Comment aborted.' lameness filter.
Re:The error handling challenge by TrixX · 2001-10-25 13:24 · Score: 2

I like this one a lot more... A lot of people will think the macros/goto are dirty, but they are actually providing a little exception handling capability. What you've done is similar to BetterC (see here).

But that supports my point that some exception handling is needed to do clean error checking in almost all non-trivial cases.

I like the initializing of p1,p2,p3. Initializing all vars helps to avoid heisenbugs, at a minimal efficiency cost.

I would've done free(p1), without the if(); free(NULL) is perfectly valid. But that's hard to know without a DBC approach (where you have info on pre/postcondition of your functions).
Re:The error handling challenge by TrixX · 2001-10-25 13:32 · Score: 2

One note: your "SaneFree" is 100% equivalent to ANSI free() (whose standard behavior on a NULL pointer is to do nothing).

Another "goto" approach. Dijkstra would throw up here, but what you are doing is like crude exception handling. And people said "checking return codes is enough"...
Re:The error handling challenge by csbruce · 2001-10-25 14:00 · Score: 2

Another "goto" approach. Dijkstra would throw up here, but what you are doing is like crude exception handling.

I prefer "simulated exception handling". ;-)

W.r.t. Dijkstra, the alternative would be bullshit condition variables, extra control structures, and multiple instances of clean-up code, which would be significantly less readable and maintainable.
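
For comparison, here is a rough sketch of what a goto-free variant driven by a condition variable might look like. It is not code from this thread: allocate_3_flags, the simulate_error parameter and the value of SOME_NUMBER are assumptions made for the illustration.

```c
#include <stdlib.h>

#define SOME_NUMBER 16          /* placeholder size, assumed for the demo */

/* Every step is guarded by the "ok" flag instead of jumping to a label. */
int allocate_3_flags(int simulate_error)
{
    int ok = 1;
    int *p1 = NULL, *p2 = NULL, *p3 = NULL;

    if (ok) {
        p1 = malloc(SOME_NUMBER * sizeof *p1);
        ok = (p1 != NULL);
    }
    if (ok) {
        p2 = malloc(SOME_NUMBER * sizeof *p2);
        ok = (p2 != NULL);
    }
    if (ok) {
        p3 = malloc(SOME_NUMBER * sizeof *p3);
        ok = (p3 != NULL);
    }
    if (ok) {
        /* Here we would do something with p1, p2, p3 */
        if (simulate_error)
            ok = 0;
    }

    /* Single cleanup path; free(NULL) is a no-op. */
    free(p3);
    free(p2);
    free(p1);
    return ok ? 0 : -1;
}
```

The cleanup code appears only once here, but each step pays for it with an extra test of ok, which is the kind of added control structure the comment above complains about.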
Re:The error handling challenge by TrixX · 2001-10-25 14:32 · Score: 2

> The initializing of variables is required (except
> maybe p1). Otherwise if the p1 or p2 allocations
> fail, the program will pass an uninitialized
> variable to free().
They wouldn't be uninitialized. They would be NULL, because that's what malloc returns in case of failure, and you assign the result of a malloc to the 3 pointers (that means, you have valid pointers xor NULL).
The repetition is frequent, that's true. But it's a better solution than creating a language pseudostructure for using when I need the same code, and another for when I don't need different code.

One usual case of different code is object constructors. The constructor starts creating a structure, and if it fails in the middle, it has to free it all; but when it finishes correctly it has to free nothing. The approach above works. I've been using it for years and I don't use a debugger.
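
A small sketch of the constructor case, assuming a made-up record type: on failure everything built so far is freed, on success nothing is. The names struct record, record_create and record_destroy are hypothetical and not taken from any post in this thread.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, used only for this illustration. */
struct record {
    char   *name;
    int    *values;
    size_t  count;
};

/* Frees a partially or fully built record; NULL members are fine. */
void record_destroy(struct record *r)
{
    if (r == NULL)
        return;
    free(r->values);            /* free(NULL) is a no-op */
    free(r->name);
    free(r);
}

struct record *record_create(const char *name, size_t count)
{
    struct record *r;
    size_t len;

    if (name == NULL)
        return NULL;

    r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    r->name = NULL;             /* known state before anything can fail */
    r->values = NULL;
    r->count = 0;

    len = strlen(name) + 1;
    r->name = malloc(len);
    if (r->name == NULL)
        goto fail;
    memcpy(r->name, name, len);

    /* calloc checks count * size for overflow and zeroes the array. */
    r->values = calloc(count ? count : 1, sizeof *r->values);
    if (r->values == NULL)
        goto fail;
    r->count = count;

    return r;                   /* success: nothing is freed */

fail:
    record_destroy(r);          /* failure: free whatever was built */
    return NULL;
}
```

Because the members are set to NULL before the first step that can fail, the same record_destroy() serves both as the normal destructor and as the error path.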
Re:The error handling challenge by epeus · 2001-10-25 19:39 · Score: 2

Close, but still messy.
#include <stdlib.h>
#include "MacErrors.h"

OSErr allocate_3( void )
{
    OSErr err=noErr;
    int *p1=NULL, *p2=NULL, *p3=NULL;

    p1 = malloc( SOME_NUMBER * sizeof(int) );
    p2 = malloc( SOME_NUMBER * sizeof(int) );
    p3 = malloc( SOME_NUMBER * sizeof(int) );
    if (p1==NULL || p2==NULL || p3==NULL) {
        err=memFullErr;
        goto bail;
    }

    /* Here we do something with p1, p2, p3 and set err */
    if (err)
        goto bail;

bail:
    free( p1 );
    free( p2 );
    free( p3 );
    return err;
}

In MacErrors.h OSErr is typedef'd to short, and the error constants are defined.
Re:The error handling challenge by TrixX · 2001-10-26 01:58 · Score: 2

> I just say "gee, I feel sorry for those C
> programmers." Of course, garbage collection
> (Java's garbage collection is not conservative...
> but do even conservative garbage collectors have
> problems with non-circular structures?)

There are complete garbage collectors that get anything that's not referenced directly or indirectly from the "roots" (global vars and stack).

Anyway, try that again with an open/close on any resource... a GC works just for memory. Files are usually closed automatically at exit. But open/close transactions with more complex objects, possibly external ones, would bring the same problem: nice error checking is very hard to do in C. It's possible, but the language makes it a lot harder than it should be. So most people get lazy and don't do it at all. Given that C is one of the main languages in OSS, the title of this article didn't surprise me at all.
Re:The error handling challenge by TrixX · 2001-10-26 02:10 · Score: 2

wow, you got me! I've posted the problem and haven't thought about that.

I hope what has happened in this thread raises my point for exceptions. The only clean, not deeply nested solutions were using goto's in a form of exception handling. The "die if xxx"/"if (xxx) return -1" school is not enough for good error handling.

With exception handling and a good DBC mechanism, you would have:
* overflow raises an exception, which would have been caught, and the memory freed. That is, I'm appropriately handling an error I hadn't thought of.
* not enough memory, or other possible errors also produce exceptions and are nicely cleaned up.
* using a different channel to report success/failure and to return results leads to less confusion:

Consider the case where SOME_NUMBER is 0. Perhaps, if the "do something" part loops over the arrays from 0 to SOME_NUMBER - 1, that is valid (it would do nothing). But malloc(0) may return NULL (the ANSI standard leaves the result implementation-defined), so we would have treated that as an error (by checking if (!malloc()){... ) when it is not.
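
Without exceptions, the last two points can still be approximated in C by reporting status and result through separate channels. The sketch below is only an illustration under that assumption: alloc_ints and the alloc_status enum are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

/* Status travels in the return value, the pointer in an out-parameter. */
enum alloc_status { ALLOC_OK = 0, ALLOC_OVERFLOW, ALLOC_NO_MEMORY };

enum alloc_status alloc_ints(size_t count, int **out)
{
    *out = NULL;

    if (count == 0)
        return ALLOC_OK;        /* empty array: success, *out stays NULL */

    if (count > SIZE_MAX / sizeof(int))
        return ALLOC_OVERFLOW;  /* count * sizeof(int) would wrap around */

    *out = malloc(count * sizeof(int));
    return (*out != NULL) ? ALLOC_OK : ALLOC_NO_MEMORY;
}
```

With this shape a NULL pointer no longer has to mean failure, so a zero-length request and an out-of-memory condition can be told apart, and the multiplication overflow is caught before malloc() is ever called.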
Re:The error handling challenge by statusbar · 2001-10-28 17:20 · Score: 2

Very good question. Some people here actually got good solutions that appear to work without memory leaks.

HOWEVER

The next step is to take the platform into account!

Let us run the code on Linux.

GUESS WHAT? malloc() will almost never ever return 0 under Linux!!! Even when there is NOT enough memory (phys+swap) available! Try it! Allocate 100 megs 5 times on a system with 256 megs of ram+swap. It works.

It is a 'feature' called memory overcommit.

malloc() returns a pointer to a bunch of virtual memory space. The actual memory pages are not allocated until you TOUCH EACH MEMORY PAGE!

So, in linux you may as well just ignore the return values p1,p2 and p3 and just go ahead and use them. If there isn't enough memory available, the code you have at the point /* Here we do something */ will cause the application to be immediately terminated. No error codes, no way to handle it. In fact, other apps running at the same time may get terminated as well if they need to allocate memory.

Eiffel and C++ exceptions don't help either. The c++ 'new' operator WILL NOT THROW any exception. It will not return 0. Your program will just spontaneously terminate, LATER.

Do a search for 'Endless overcommit memory thread' in the linux kernel archives to read more about this behaviour.

Error handling is unfortunately more complex than just handling return codes properly.

--jeff

--
ipv6 is my vpn

/. error by xmedar · 2001-10-25 11:32 · Score: 2

I must have been transported into a parallel universe, first a story that's negative towards Open Source on /. and then I cannot find any of the usual "Imagine a Beowulf cluster of errors", or "Is a Beowulf cluster of errors a cluster fsck?" and where oh where is "All your errors are belong to us", if anyone has directions back to my normal reality please help me.

--
Any sufficiently advanced man is indistinguishable from God

take a look at the standards by jafac · 2001-10-25 11:32 · Score: 2

Open Source sucks at error handling? Look at the standards in the PC industry.

They've been declining in general for the past 10 years, and before that they sucked as well. I think the standard is really set by the hardware itself.

Typically drive errors can have symptoms of software running more slowly as the drive retries - or applications will simply appear to hang, or if it's an error reading code into memory, well, anything goes.
Network errors can go completely unknown until you haul out the crusty old hacker with a sniffer - oh gee, did you know that your card is dumping half its packets?
Oh - especially network problems - where the software at the user level 90% time just sits there and goes "Duh!" for simple things like pulling the cable out.

Error checking and handling, in general, SUCKS and it's the main reason why computers suck - why the software industry spends billions of dollars chasing problems during the development phase that they never really get to pin down, so the problem ends up going into shipping products.

I blame the lax standards on the platform, and the dumbing down of programming in general (the over-reliance on high-level languages that remove the programmer progressively further and further from the hardware their programs run on).

If PC's had better standards for this sort of thing at the hardware level - and if the vendors adhered to those standards, then the software people could write software that handles errors better, and it would bubble up to the user level as more reliability, and much simpler troubleshooting, probably tens of billions of dollars saved in productivity alone, and probably the PC industry would be 10 times the size it is today, because people would actually trust them for important tasks, rather than the next nifty home killer-app like pirating music. (not meant to be a troll against MP3 trading - meant to be a troll against the apparent purpose and direction of the PC industry in general).

--

These are my friends, See how they glisten. See this one shine, how he smiles in the light.

Exceptions don't save you much of the time. by Christopher+Thomas · 2001-10-25 11:35 · Score: 2

If the language you are using doesn't support exceptions (C, Perl, etc), you are going to have problems. Exceptions make sure that if an error occurs, and you aren't aware of it, your program dies, and doesn't go on its merry way, causing a security hole/unstable software.

Unless it's an uninitialized-memory error or a buffer overrun that overwrites some other program variables, in which case a C++ program will still keep going on its merry way without throwing an exception, causing difficult-to-duplicate and hard-to-trace bugs.

If it's possible to check for the error at all, then anything that you can implement with exceptions you can implement without exceptions (though I agree that exceptions are a _neater_ way of doing it).

If your program can't check for the error (as is common for memory errors without extensive and slow wrapping on memory accesses), then exceptions won't be triggered and you're still screwed.

[Aside: You can propagate error codes up between levels either by making error codes bit vectors and masking subcall errors on to the parent call's failure code, or by implementing your own error stack (if you anticipate using deep recursion). Messy, so exceptions are still _preferable_, but it can still be _done_ without exceptions. Almost as cleanly, if you wrap error-handling helper functions nicely.]
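
A minimal sketch of the bit-vector idea from the aside, assuming a toy validation task: each subcall returns its own error bit and the parent ORs them into its own failure code. All names here (ERR_NAME, ERR_AGE, check_name, check_age, check_person) are made up for the example.

```c
#include <stddef.h>

/* Hypothetical error bits; each failure kind gets its own bit. */
enum {
    ERR_NONE = 0,
    ERR_NAME = 1 << 0,
    ERR_AGE  = 1 << 1
};

/* Subcall: reports ERR_NAME for a missing or empty name. */
unsigned check_name(const char *name)
{
    return (name == NULL || name[0] == '\0') ? ERR_NAME : ERR_NONE;
}

/* Subcall: reports ERR_AGE for an implausible age. */
unsigned check_age(int age)
{
    return (age < 0 || age > 150) ? ERR_AGE : ERR_NONE;
}

/* Parent call: masks the subcall errors onto its own failure code. */
unsigned check_person(const char *name, int age)
{
    unsigned err = ERR_NONE;

    err |= check_name(name);
    err |= check_age(age);
    return err;
}
```

A caller can test for one particular failure with something like (err & ERR_AGE), or treat any nonzero value as "something went wrong", which is roughly what catching a base exception class would give you.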

Re:you didn't read the article by egomaniac · 2001-10-25 11:35 · Score: 5, Insightful

News flash: Technology pundit seemingly insults open source, Slashdot up in arms. None of them actually read the article. Story at 11.

The article does not say "open source doesn't handle errors as well as closed source". What the article does say is "like most commercial software developers, many open source programmers are just plain lazy about proper error handling. But we're supposed to be better than that...".

I don't see a problem with this statement. The fact is, most open-source software sucks donkey balls. Petreley is merely saying it's time to put your money where your mouth is -- if you want open source to be considered better than closed source software, it better stop being so danged flaky.

--
ZFS: because love is never having to say fsck

Re:That all depends on your point of view by mmontour · 2001-10-25 11:36 · Score: 2

For instance, I know many "average" users who eject floppy disks and CD-ROMs from the drive while they are being read. Any Linux user who tries a stunt like that deserves a seg fault (or worse)

Well, an Amiga would give you a "You MUST replace volume <disklabel> in drive <device>!!!" if you ejected a disk while it was in use. It was a good reminder to the user that he had just done a Bad Thing, but the program could (usually) continue once it got the disk back. I don't think the application program even knew that this had happened; the read() call just blocked until the disk was put back.

If the OS isn't able to handle this sort of situation, the application program should get an EIO error on its read(). However this shouldn't translate into a segfault. Nothing should translate into a segfault - at worst an abort() if the program doesn't feel like recovering from the error.
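
As a rough POSIX-flavoured sketch of that point: a read() failure such as EIO shows up as a return value of -1 with errno set, and the program can report it and let the caller choose between recovering and calling abort(). The helper name read_fully is an assumption for this example.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Reads up to len bytes from fd. Returns the number of bytes read
 * (short only at end of file), or -1 after reporting the error. */
ssize_t read_fully(int fd, void *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, (char *)buf + done, len - done);

        if (n > 0) {
            done += (size_t)n;
            continue;
        }
        if (n == 0)
            break;              /* end of file */
        if (errno == EINTR)
            continue;           /* interrupted by a signal: retry */

        /* A real error, e.g. EIO when the medium has gone away. */
        fprintf(stderr, "read failed: %s\n", strerror(errno));
        return -1;              /* caller decides: recover or abort() */
    }
    return (ssize_t)done;
}
```

Nothing here can turn an I/O error into a segfault on its own; that only happens when the caller ignores the -1 and goes on to use a buffer it believes was filled.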

Of course a "real OS" will just lock the CD-ROM or floppy drive [hardware permitting], thus preventing the user from ejecting a disk that's in use (unless the user has a paperclip, in which case he does deserve whatever he gets).

It is not the programmers, it is the projects by dsfox · 2001-10-25 11:37 · Score: 2

Open source programmers are basically the same people as commercial programmers, maybe by night or maybe as different jobs come and go. The difference is that most open source projects arise from a person's need, and it is natural to ease up on the effort once that need is filled, i.e. once the program is good enough for your personal use.

Re:It is not the programmers, it is the projects by swordgeek · 2001-10-25 11:43 · Score: 2

Hmm. Implying that open source software will never reach the same quality or build standards as professional software.

--

"People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban

Commercial error handling by Moonshadow · 2001-10-25 11:46 · Score: 2

Commercial apps are better?

Well, we all know how bug-free Internet Expl...<This program has caused an illegal operation in module kernel.dll and will now be terminated>

The problem with this: 1 return value. by Nindalf · 2001-10-25 11:47 · Score: 2

One thing that really bugs me about most programming languages is that they only allow 1 return value by their most natural idiom. So you get these stupid hacks where some settings of the returned value mean errors and some are useful results, or you have to define a new named data structure just for the return value of this one function, or you end up having to mix output variables with the inputs for a function.

This is one thing I like about Forth-style languages, where it's just as natural for a function to return multiple results as to receive multiple arguments, letting you do either:
A B / on_error{ log_error cleanup exit }else{ use_result } return
or
A B / on_error{ store_exception drop_result push_unhandled_exception_errcode }else{ use_result } return
or
A B / drop_error use_result return

Unlike with exceptions, the possibility of an error isn't hidden away somewhere; if you ignore it, or hand it down to reach exception handling code, you have to do so right there and then, explicitly at every step. Actually, that's a general plus: with a stack language, you have to explicitly dispose of everything, which makes it harder to ignore return values, and impossible to write programs without knowing whether a function returns anything ("What do you mean it can return an error code? I thought it was void!").
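
For contrast, this is roughly what the "new named data structure just for the return value" workaround looks like in C for the same division example. The names struct div_result and checked_div are hypothetical.

```c
#include <limits.h>

/* Hypothetical result type: value and error travel together,
 * but in separate fields. */
struct div_result {
    int value;                  /* meaningful only when error == 0 */
    int error;                  /* 0 = success, nonzero = failure */
};

struct div_result checked_div(int a, int b)
{
    struct div_result r = { 0, 0 };

    /* Division by zero and INT_MIN / -1 are both undefined in C. */
    if (b == 0 || (a == INT_MIN && b == -1)) {
        r.error = 1;
        return r;
    }
    r.value = a / b;
    return r;
}
```

It works, but unlike the stack-language version nothing forces the caller to look at r.error before using r.value, and every function with a different result type needs its own struct.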

Agreed, death to pundits by jorbettis · 2001-10-25 11:53 · Score: 2

Linuxworld is having issues, so I can't read the article, but I remember Petreley from when I used to get InfoWorld Magazine.

He's the stereotypical technology pundit. He learns just enough about technology to have an uninformed opinion about it.
The worst thing is that we on the internet have truckloads of people like him. Every mailing list, newsgroup, web log, IRC channel, or any other group in which people are trying to get things done will have a crew of wankers spouting their opinions with no attempt to actually contribute anything useful.

What really burns me about pundits is that they're getting paid to do what a couple million monkeys on the internet do for free.

Take Petreley. One time, he wrote an article about how maverick programmers don't write good code. I guess I can believe that. Then he went on to say that all brilliant programmers are mavericks, and Microsoft etc all hire them so they'll write bad code and people will have to buy bug fixes. Um, right. He then finished off by claiming that he used to be an absolutely outstanding programmer and that he had to quit because he was so amazingly good that writing decent code wasn't fun for him.

He has, to the best of my knowledge, never actually contributed anything at all even remotely useful to Free Software, or computing in general. He's even worse than Fred Langa, the guy who helped invent ethernet in 1976, then spent the rest of his career punditing, developing more and more bizarre opinions as his practical knowledge became antiquated.

So here's a message to Petreley: Do something useful, anything. If all you have to contribute is your opinion, then go home. Free Software writers are mostly volunteers, we don't have to put up with your wanking. If you have a problem with a program, file a fucking bug report. Actually, if you're such an amazing programmer, SHOW US SOME CODE! I don't care how much InfoWorld pays you, to us, your opinions are worthless. So do something useful or I'll have to dig out my cluestick and use it to bash you into a profession that benefits humanity in some conceivable way.

--

Jordan Bettis

``Wherever you go, there's another stupid sigfile quote.''

Not just error handling--EVERYTHING! by swordgeek · 2001-10-25 11:57 · Score: 2

That's right, open source software sucks at nearly everything it does![1]

Open Source as it stands today is great at bashing together a really "neat" program which gets the job done in a specific manner. Soon enough, lots of cool little features are added in, and before long you have a 'perpetual-beta application.'

Programming, however, requires some discipline which doesn't often get put towards OSS. Programs require good error handling (and error trapping, for that matter), usability (That means intuitive interfaces), and documentation. Oh yes, and freedom from bugs. However, these things are BORING to produce, compared to the original plan of bashing out a neat routine.

Ironically, the only way to achieve such things in a distributed and open development model, is to have a central administrative point. Without it, large projects are just impossible. Funny, eh?

[1] (Of course, so does commercial software, but in different ways.)

--

"People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban

Re:Not just error handling--EVERYTHING! by MikeBabcock · 2001-10-25 13:41 · Score: 2

Actually, if the people who wanted to bash out the routine would do so in a library with a proof-of-concept app to go with it, then others could pick that up and run with documenting and interfacing it.

PS, (harps again), I'm tired of poor NULL handling in glibc.

strcpy(NULL, "bah") segfaults instead of doing something handlable ...

--
- Michael T. Babcock (Yes, I blog)
Re:Not just error handling--EVERYTHING! by swordgeek · 2001-10-25 13:47 · Score: 2

But that's just it--who, without their paycheque riding on it, is going to "pick that up and run with documenting and interfacing it?" Because it's not fun, it gets left behind. That's where discipline and central organisation tend to help a project.

--

"People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
Re:Not just error handling--EVERYTHING! by MikeBabcock · 2001-10-25 15:01 · Score: 2

You make too many assumptions.

There are plenty of people who prefer to write English (or their own native tongue) than code. The people who document current interfaces in the Linux kernel, for example, are often not those who wrote them. This might not be ideal, but it happens.

Welcome to the bazaar.

PS, I often _only_ document projects I dream up and allow those who are able to hack them together in a week to do so.

--
- Michael T. Babcock (Yes, I blog)
Re:Not just error handling--EVERYTHING! by scrytch · 2001-10-26 05:20 · Score: 2

strcpy(NULL, "bah") segfaults instead of doing something handlable

You're trying to write a null pointer. That has the well-defined property of raising a SIGSEGV which can be caught and handled. What do you want it to do, fprintf(stderr, "no, I can't let you do that")?

--
I've finally had it: until slashdot gets article moderation, I am not coming back.
Re:Not just error handling--EVERYTHING! by MikeBabcock · 2001-10-26 05:53 · Score: 2

Actually, you're wrong. The reason glibc segfaults is because the C specs state that the results are 'undefined'. Personally, if they're undefined, we should return the standard error condition for these functions (return NULL) instead of segfaulting.

I know, "if (!c) return NULL;" is a lot of code to add to a string function ...
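
A sketch of the check being asked for, written as a wrapper rather than as a change to the library. The name checked_strcpy is hypothetical; it is not part of glibc or the C standard, and returning NULL on bad input is a convention invented here (plain strcpy() always returns its destination).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: refuses NULL arguments instead of
 * invoking undefined behavior. */
char *checked_strcpy(char *dest, const char *src)
{
    if (dest == NULL || src == NULL)
        return NULL;            /* report the error to the caller */
    return strcpy(dest, src);   /* same contract as strcpy otherwise */
}
```

This only catches NULL. A destination buffer that is too small, or a pointer that is non-NULL but invalid, still leads to undefined behavior, so it is a partial safety net at best.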

--
- Michael T. Babcock (Yes, I blog)

Re:Of Course by Amazing+Quantum+Man · 2001-10-25 12:01 · Score: 2, Funny

Well, then obviously, what OSS programmers need to do is have a dialog that pops up:

$APPLICATION has found something wrong. It is obviously not $APPLICATION's fault, it is most likely something else on your computer, perhaps $RANDOM_CLOSED_SOURCE_APPLICATION.

We highly recommend that you reinstall your entire system, but do not install $RANDOM_CLOSED_SOURCE_APPLICATION this time.

Thank you

--
Fascism starts when the efficiency of the government becomes more important than the rights of the people.

Blame it on von Neumann by dha · 2001-10-25 12:02 · Score: 3, Interesting

Inexperienced and lazy programmers are usually poor at error-handling, and it's easy to lay the blame there -- but at a deep level that misses the point.

This is not about open-source vs closed-source programs, nor for-fun vs for-money programmers. It's about computational models such as von Neumann machines that, at their deepest roots, assume there will be no errors. That chain-of-falling-dominos style of thinking so permeates conventional programming on conventional machines that it's almost surprising that any code has any error handling at all.

Of course it's possible to hand-pack error-handling code all around the main functional code in an application.. and of course quality designers and programmers in and out of open-source will do just that.. but viewed honestly we must admit it's a huge drag having to do so, and typically fragile to boot, because the typical underlying computational and programming models provide no help with it. Error-handling code tends to be added on later to applications just as try/catch was added on later to C++.

Lest we think this sad state must be inevitable, let's recall that other computational models, like many neural network architectures for example, are inherently robust to low level noise and error. Then, that underlying assumption colors and shapes all the `programming' that gets built on top of it. We're to the point where trained neural networks, for all the limitations they currently have, can frequently do the right thing in the face of entirely novel and unanticipated combinations of inputs. Now that's error handling.

The saddest part is that von Neumann knew his namesake architecture was bogus in just this way, and expressed hope that future architectures would move toward more robust approaches. Fifty years later and pretty much the future's still waiting..

Re:Blame it on von Neumann by scruffy · 2001-10-25 14:32 · Score: 2

While neural networks and the like might be robust to noise, they have the problem of doing the job imperfectly.
Do you want your data almost sorted or sorted perfectly?
Do you want your satellite to make the right calculations to orbit and point its antenna, or is it ok to be 1% off?
I'm also sure the bank won't mind if its accounting software is off by 1% or so. The IRS won't mind either.
Unfortunately, there are too many operations that we want to be exact. Even very small floating point errors can cause problems, so numerical algorithms have to be written with this issue in mind.

I got your fix RIGHT HERE! by Dast · 2001-10-25 12:05 · Score: 2, Troll

I'll be blunt. Open source programmers need to stop being so darned lazy about error handling. That obviously doesn't include all open source programmers. You know who you are.
If you want a demonstration of what I mean, start your favorite GUI-based open source applications from the command line of an X terminal instead of a menu or icon. In most cases this will cause the errors and warnings that the application generates to appear in the terminal window where you started it. (There are exceptions, depending on the application or the script that launches the application.)
Many of the applications I use on a daily basis generate anywhere from a few warnings or error messages to a few hundred. And I'm not just talking about the debug messages that programmers use to track what a program is doing. I mean warning messages about missing files, missing objects, null pointers, and worse.

I'll be blunt, too. I got your fix RIGHT HERE! I have whipped up some open source magic that uses a powerful error-finding heuristic in combination with a correction algorithm. It should fix all of these problems you have described.

----CUT HERE----

#!/bin/bash
if [ "$#" -lt "1" ]; then
    echo "Usage: $0 <program> {<args>}"
    exit 1
fi
"$@" 2>/dev/null
echo "All errors corrected!"

----CUT HERE----

You are not expected to understand how this works. Send me beer, we open source guys like that.

--

This sig is false.

Re:Linus' changes to recent kernels by sydb · 2001-10-25 12:06 · Score: 2

no errors
Nothing to handle then stupid.

--
Yours Sincerely, Michael.

Follow the example of strerror() by yerricde · 2001-10-25 12:13 · Score: 2

Error messages need to have numbers associated with them. For instance when I have ORA-1241 in oracle, a quick search in groups.google.com will give me a lot of information about this error, and why it occurred and what I can do about it.

C's errno convention (the codes that strerror() describes) uses another approach: a short symbolic name for each error ("no such file or directory" is ENOENT, etc.) that stays constant across localizations.

The situation is even worse for people who used localised versions of the software, as you don't have the English translation

Whether you get "Non ci è tale archivio o indice (ENOENT)" or "Es gibt keine solche Datei oder Verzeichnis (ENOENT)", you can still search on the ENOENT. (Translations by Babel Fish.)
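
A small sketch of how an application might produce messages in that form. Note that strerror() itself returns only the (possibly localized) message text; standard C has no function that returns the symbolic name, so errno_name below is a hypothetical helper covering just a handful of codes, and format_error is likewise made up for the example.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: maps a few errno values to their names. */
const char *errno_name(int err)
{
    switch (err) {
    case ENOENT: return "ENOENT";
    case EACCES: return "EACCES";
    case EIO:    return "EIO";
    case ENOMEM: return "ENOMEM";
    default:     return "E???";  /* not in this small table */
    }
}

/* Writes "<message> (<NAME>)" into buf; returns what snprintf returns. */
int format_error(char *buf, size_t size, int err)
{
    return snprintf(buf, size, "%s (%s)", strerror(err), errno_name(err));
}
```

The message part changes with the locale, while the name in parentheses stays the same and remains searchable.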

Now if only the popular apps did this...

--
Will I retire or break 10K?

A core dump *is* an error message! by Anonymous Coward · 2001-10-25 12:14 · Score: 2, Funny

A very smart guy from SGI once told me "A core
dump is the best possible error message because
it contains ALL the information you need to
diagnose why the program had to stop running."

Mmmm'K

:-)

Re:The error handling response by TrixX · 2001-10-25 12:19 · Score: 2

This one shows another of my points:

The problem itself has a very linear structure, but the solution here has a lot of nesting. If I had more blocks, it would have even deeper nesting.

If the allocation was non linear (for example, a tree or a graph), and failed in the middle, deallocation would be really a mess. You would have to exit some mix of loops/recursion in the middle, and refree all before exiting.

If you want a better solution, see my comment about BetterC. Or use Eiffel

I don't see the problem... by ffatTony · 2001-10-25 12:32 · Score: 2

Ever use a commercial unix? I use HP-UX at work and Tru64/Solaris at school. Of those, linux programs are the most understandable when things go awry.

Linux is a recreation of a system that has historically been 'terse'. Can we expect it to be very different?

Yet... by gnovos · 2001-10-25 12:40 · Score: 2

...Amazingly, with all these errors and warnings, most of that software continues to run. Compare that with the way typical windows applications work (Crash on the first error and take out something important on your way down), that sounds like excellent error handling to me....

Just my $0.02

--
"Your superior intellect is no match for our puny weapons!"

error handling and such.... by scoobywan · 2001-10-25 12:49 · Score: 2, Informative

I noticed that a lot of the errors he was talking about were missing files, blah blah blah. This I have had problems with in X using various managers. Here are some answers... a lot of the time missing files are due to a lack of checking for the required packages. Also it can have to do with the way different Window Managers handle different things. I would like to see the results of these little tests done on machines that were running every available WM. I bet most of the problems with the GUI based programs that he is reporting are due to the fact that they were written on a machine running a different WM. Just my thoughts.... if you don't like it don't read it :P.

Later

It's a serious problem. by aebrain · 2001-10-25 12:51 · Score: 2, Interesting

Error Handling and Appropriate Technology

This article is right on target. It's much more important for open source software to be of a higher standard than closed-source, simply because with open source, shoddiness can't be hidden and swept under the carpet (to state the bleedin obvious). If we make shoddy open-source code, then up-and-coming programmers will see it, learn from it, and treat this very ordinary code as the 'norm'. Worse, they will treat it as a target to be aimed for, and cut corners so even this low standard isn't met.

FWIW I've worked in safety-critical areas for some 20 years. I've managed to dodge being assigned to management, and am doing neat stuff like spaceflight avionics for interest rather than chasing dollars doing yet-another-b-2-b system. The biggest problem I've found with re-educating interns is getting them to be paranoid enough. It's a matter of culture.

Quick N Dirty is an appropriate culture for some systems.

If you're writing throwaway code for a specific purpose (such as a simple script) then quality isn't an important issue.

If a deadline is approaching fast, your budget is zero, your team burnt out, then damn the long-term costs, hack it so it kinda works and ship it on time. It's crap, but they only paid for less than crap, so don't worry, be happy.

There's a major financial incentive to write high-maintenance code, both for programmers and companies. You make pennies on the initial sale, megabucks on the maintenance. What do you call a programmer who writes superb code that's maintenance-free? Unemployed.

This is true for design and requirement analysis, as well as code.

But... it's important to realise this is not the only way of doing things. It has its place. But not in open source. (And there's such a thing as professional pride too, but I digress)

If you're writing a re-useable module, you should treat all inputs as being guilty until proven innocent, always check any outputs from your area, and be honest regarding what side-effects your module has. In some languages, this is easy (e.g. Ada ), in others darn near impossible (e.g. C), but it has to be done. It's obvious that you have to do it when lives are at stake. It's less obvious when you're writing some device driver for Linux - but literally tens of billions of dollars may be riding on how well you do your job. Even if you're not getting paid for it.

I'm not asking for the degree of robustness typically shown by safety-critical systems ("What? Your code failed just because half the memory was corrupted and a CPU was on fire? Unacceptable! Failure is Not an Option!") but enough so that BSODs or their equivalent lead to puzzlement ("I've never seen one of those before!").

--
Zoe Brain - Rocket Scientist

Yes, $$ software probably is better at this by jmay · 2001-10-25 12:56 · Score: 2, Interesting

I expect that it is true that commercial software does a generally better job than open source software at error handling. This is probably even measurable: run comparable software products under similar usage load, and collect the errors generated (reported or experienced by users). I haven't tried such an experiment myself.

I don't see why we (we = open source enthusiasts) should feel particularly worried about it at the moment. For most applications, the commercial alternative has a vastly larger user base than the open-source equivalent. And since these users are paying for their software, they are going to expect that there is a vendor who will respond to their concerns.

Customer service operations are very expensive. So it is very much in the interest of the commercial vendor to reduce the error rate to the level where the load on customer service is financially tolerable (this is not the same as zero).

Fixing obscure bugs is not the most exciting technical endeavor, and skilled engineers are more willing to do this work if you pay them...

As I said, I don't think we should feel bad about this state of affairs. A more appropriate line of discussion would be: What can the open source community do to create an environment where a higher degree of quality is met consistently for open source software products? Are there tools that we could build that will help? Non-intrusive processes that we can impose on one another? The Perl community in particular has an effort underway to establish a consistent level of testing for all the modules that are released on CPAN. Is that a worthwhile model?

Re:Three true things by Guy+Harris · 2001-10-25 13:13 · Score: 2

3. Open source means you can fix the code. So stop complaining and do something useful.

I presume that's either a troll or a parody of the sort of nonsense that all too often gets put out in response to complaints. If it isn't, I'll simply note that you probably can't fix the code if you are not a programmer, which a lot of users aren't.

(On the other hand, users of open-source software should bear in mind that the developers are often not working full-time on the program - they may have day jobs, or may be going to school, etc. - and therefore that the feature that they really really really want may not have been implemented yet because nobody's had time to implement it, so they may Just Have To Wait.)

Error Checking and Recovery are Requirements. by blair1q · 2001-10-25 13:13 · Score: 2

Requirements are specified in design documentation.

Each instance of Error checking or Recovery must be specified, just as the format of each element of output or the formula for each calculation must be specified.

Without that specification, who cares if you think the code is wrong? You can't prove it's wrong because you don't have the spec and didn't pay for its development. You bought (or five-finger GPL'ed) a license to operate the software. On an as-is basis, for every piece of software anyone posting to /. is ever likely to run.

You want it improved? Write the Engineering Change Request specifying the improvement, and send it along with the money necessary to get it done.

Design and validation of "bug-free" code is the most expensive software process there is. Just the paperwork on the validation process will double or triple the cost of the software. The problem is provably impossible to solve, and the best efforts on nontrivial code (and sometimes on what appears to be trivial code) end up with unresolved errors that are signed off as calculated risks, the costs of which will be borne by insurance, government, lucky avoidance of catastrophe, and the bottom line.

--Blair
"And it pays my bills, in spades."

Not TOO much by victim · 2001-10-25 13:22 · Score: 3

ardax makes the point above, but is only scored 1, so...

First, about your analogies...
If I wear out the door key to my car, the car should not burst into flames when I try to open the door.

If my car runs out of fuel, I expect that after rectifying that little problem (and bleeding the injectors) it will be just like new. I do not expect that it will ruin my tires.

And yes, I have kept my refrigerator outdoors. I kept it on the front porch for two months while the house was being renovated 10 years ago. It worked just fine. It is 30 years old now (That's 15 PC generations to you young whippersnappers. Moore's law says the new fridges should be 1,000,000 times colder now. :-) and I expect it to continue running indefinitely. About every 5 years I put a drop of oil on the fan shaft behind the freezer to keep it from squealing. Software has a ways to go.

About the cubase incident...
Yes, the CD is scratched. I expect that I won't be able to re-authorize my copy of the software, but don't ruin ALL the data on my hard drive! (It's actually worse. It was my wife's laptop. You do NOT want to have to tell my wife that you just wiped out her laptop.)

About Word...
Destroying the on-disk copy of a document before successfully writing out the new copy is just plain stupid. Particularly on a Mac where there is a special file system function to swap two files. You write the new copy under a fake name, swap it atomically (even over file servers) with the original file, then delete the fake named file (which now contains the old data). No one gets hurt in error conditions, no one can ever have bad luck timing and read a partially written file off the file server. Life is good.
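
For readers who want to see the shape of that save sequence, here is a minimal sketch in C. It is not the Mac file-exchange call mentioned above; it swaps in POSIX rename(), which atomically replaces the target on POSIX file systems. The function name save_atomically and the ".tmp" scratch-name scheme are assumptions made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: replace `path` with `data` without ever leaving a
 * half-written file under the real name. Returns 0 on success, -1 on error.
 * Assumes POSIX semantics, where rename() atomically replaces the target. */
int save_atomically(const char *path, const char *data, size_t len)
{
    char tmp[4096];
    FILE *f;
    int n;

    /* Build the scratch name "<path>.tmp" (naming scheme is an assumption). */
    n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    f = fopen(tmp, "wb");
    if (!f)
        return -1;

    /* Any failure from here on only removes the scratch file; the
     * original document has not been touched yet. */
    if (fwrite(data, 1, len, f) != len || fflush(f) != 0) {
        fclose(f);
        remove(tmp);
        return -1;
    }
    if (fclose(f) != 0) {
        remove(tmp);
        return -1;
    }

    /* Only now does the new copy take the place of the old one. */
    if (rename(tmp, path) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

If the disk fills during the write, the function fails and the old document is still there under its real name. A more careful version would probably also call fsync() before the rename so the data is on disk, which this sketch leaves out.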

The third case (The PSC) you don't mention, but it isn't really a case of graceful degradation. It's just an irritating bug. Honestly I'd dump the device because of the irritation, but it actually feeds card stock out of its paper tray! A rare quality in a printer.

I suppose the more explicit point I should have made is that bad things are going to happen to software and it requires effort from the programmer to deal with it. Sometimes just a tiny bit of effort. Cubase performed so badly with a bad CD that I suspect they never tested it. They write about it in their documentation, but they probably didn't test it. The Word example is just careless programming which could have been trivially avoided if the programmers understood the platform's file system calls.

How about the cost? I estimate that it probably doubles the engineering effort to handle the exception cases to a degree that would cover the incidents I note above. In the calculus of software development the benefits do not outweigh that cost.

Re:Not TOO much by Ayende+Rahien · 2001-10-26 01:51 · Score: 2

About Word on Macintosh, you are talking about a *near full* disk, right?
There isn't enough *space* for a new file, it *has* to destroy the old file before writing to the disk.
Frankly, yes, it *should* make some check that the size_of_free_space - size_of_old_file >= size_of_new_file, and inform you about it, but that isn't quite what you are talking about.
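
One way to read that pre-save check is as a three-way decision made before anything is written. The sketch below is only an illustration: the enum and the function plan_save are hypothetical names, and it ignores real-world details such as block rounding and file system metadata.

```c
#include <stdint.h>

/* Possible outcomes of the pre-save space check (names are hypothetical). */
enum save_plan {
    SAVE_SAFE,        /* room for the new copy alongside the old one */
    SAVE_MUST_WARN,   /* fits only if the old copy is destroyed first */
    SAVE_IMPOSSIBLE   /* does not fit even after deleting the old copy */
};

enum save_plan plan_save(uint64_t free_bytes, uint64_t old_size,
                         uint64_t new_size)
{
    if (new_size <= free_bytes)
        return SAVE_SAFE;
    /* new_size > free_bytes here, so the subtraction cannot wrap, and
     * it avoids the overflow that free_bytes + old_size could cause. */
    if (new_size - free_bytes <= old_size)
        return SAVE_MUST_WARN;
    return SAVE_IMPOSSIBLE;
}
```

Only the middle outcome needs the user's attention: that is the case where the program would have to tell you it is about to destroy the old copy to make room.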

--
Two witches watched two watches.
Which witch watched which watch?

Re:That all depends on your point of view by elflord · 2001-10-25 13:23 · Score: 2

And attitudes like that, ladies and gentlemen, are

... largely irrelevant if the only people who subscribe to them are slashdot trolls. Clearly, this is not the kind of viewpoint held by coders in the KDE and GNOME projects.

This guy has never used Oracle ... by dhogaza · 2001-10-25 14:08 · Score: 2

Or else he wouldn't think that only open source software has lousy error messages...

The user should not see errors unless they want to by Codifex+Maximus · 2001-10-25 14:26 · Score: 3, Insightful

The user should not see errors unless they want to. I agree with sending errors to STDOUT. If the user wants to see the errors then they can either start the app in an XTERM so they can see the errors or switch virt terms over to VT1 and see the STDOUT output.

Also, I use error reporting to a logfile rather than alarming the user. Most applications should be able to survive the average error. Those applications should prompt the user for proper input - even to the point of placing the cursor in the proper field. Each field should be intelligent and be able to validate its own input data.
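
A rough sketch of what a self-validating field plus a programmer-only log could look like in C. Everything here is hypothetical (the function name, the messages, the idea of passing the log as a FILE pointer); the only library calls are standard strtol() and fprintf().

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical validator for a numeric form field. On success it stores
 * the value in *out and returns NULL. Otherwise it returns a friendly
 * message for the user and writes a terse line to `log` for the
 * programmer; *out is left untouched. */
const char *validate_int_field(const char *text, long min, long max,
                               long *out, FILE *log)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(text, &end, 10);

    /* No digits consumed, or junk after the number. */
    if (end == text || *end != '\0') {
        fprintf(log, "validate_int_field: not a number: \"%s\"\n", text);
        return "Please enter a whole number.";
    }
    /* Overflowed a long, or outside the range this field accepts. */
    if (errno == ERANGE || v < min || v > max) {
        fprintf(log, "validate_int_field: out of range: \"%s\"\n", text);
        return "That number is out of range.";
    }
    *out = v;
    return NULL;
}
```

The user only ever sees the returned prompt; the detail goes to whatever stream the caller passes as the log, which could be stderr or an open logfile.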

Those error logs I spoke of should be used by the programmer to debug his/her application - don't alarm the user ok?

--
Codifex Maximus ~ In search of... a shorter sig.

Re:Too specific: Programmers Stink at Error Handli by Broccolist · 2001-10-25 14:28 · Score: 2

The last time I used Word a drive filled during a save operation and left me with just a mutilated copy of the original file. (I will not use it again.)

Something similar but worse once happened to me. I was editing something with Word while browsing the web; not doing anything out of the ordinary. I saved the file and logged off for the day. When I tried to open it again, Word refused, claiming the file format was incorrect.

I looked into the .doc with a hex editor and found that some HTML source had somehow found its way into the .doc! I was using win95, so I guess this can be chalked up to buggy filesystem code. The weirdest and most frustrating bug I've ever seen. I didn't manage to recover any of my work.

Re:The user should not see errors unless they want by Codifex+Maximus · 2001-10-25 14:29 · Score: 2

doh! heh make that STDERR heh :)) Must remember that Preview button.

--
Codifex Maximus ~ In search of... a shorter sig.

Re:That all depends on... your selection of course by InsaneGeek · 2001-10-25 14:38 · Score: 2

I don't think people are bashing the free stuff, but more along the lines of giving it the same type of scrutiny that everything else is given. Honestly if people can't take any criticism at all, you better crawl into a hole, because the *real* world is a scary place.

There is a famous mantra, all programs suck, some more than others (I'm replacing the original word OS's with programs, cause it still fits perfectly). That goes for closed, open, free, expensive, everything; all programs suck, and being able to openly talk about deficiencies in them is the only way to make them suck less. It strikes me as rather two-faced complaining about "people bashing free software", when just before that *you* bashed other software, for the same legitimate reasons others supposedly "bashed" free software. Again all programs suck, being critical of them makes them suck less.

That's exactly what Preterley said by Pseudonym · 2001-10-25 15:12 · Score: 2

Read the article. That's exactly what he said. Here's the title and the subtitle:

Open source programmers stink at error handling

Commercial programmers stink at it too, but that's not the point. We should be better.

--
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});

Sad....but true.... by Chanc_Gorkon · 2001-10-25 15:37 · Score: 2

I agree with Nick. Programmer error handling sucks, but not just in Linux and Open Source. An example from work: we had a programmer write a batch file to concatenate two to five files together into one big file. Only thing is it depended on a network drive mapping (on a volume up on a Novell server...yech) and on the files being there. If the batch file failed, there was no way to know it failed because of a network drive mapping error because bloody DOS has no frickin return codes. I WISH they'd let me and the systems programmer set these dang things up on a Linux box so we could write a BASH or TCSH script with proper error checking so we could provide a return code back to the mainframe that triggers the script. That way the mainframe could holler at the operator that there's a problem. Right now, if the batch file fails it just drops through.

If it wasn't going to be replaced soon, I would rewrite the damn things, but since a new system will be entering implementation soon, we will be freezing all development except for fixing errors and fulfilling state/federal mandates. Hopefully the package we are going to does better (anyone ever heard of the education-only package called Colleague by Datatel?.....it runs the business side as well as the scheduling, record keeping and all of the stuff a college computer system is doing....). Anyway, at least they picked the right OS and, in my opinion, the right DB for it (AIX for OS, and Oracle for DB......the other choice was....shudder....NT/2000 for OS and I believe SQL server, but it may have been something like DB/2 or something weird).

--

Gorkman

Re:That all depends on your point of view by crucini · 2001-10-25 15:42 · Score: 2

I want Linux to continue to run on new, interesting hardware. There is an ongoing battle to get vendors to release their specs. There are also several ominous clouds on the horizon, SSSCA and TCPA (trusted computing platform architecture) which threaten to rain on our parade of cheap, open commodity hardware.

If even a small percentage of "normal" users use Linux, it will be nearly impossible for anyone to marginalize Linux and lock it out of new hardware.

Also, consider protocols. Imagine if Microsoft pushes us to a point where you need .NET to buy airline tickets, make hotel reservations, or file your tax returns. A substantial base of Linux users can apply enough pressure to keep these protocols open.

Re:That all depends on... your selection of course by MikeFM · 2001-10-25 15:59 · Score: 2

The author does have a point in his article. A lot of programs do spout nasty pointless error messages both at compile-time and at run-time. This is fine in development but stable versions should catch and properly handle such errors. That goes for any program regardless of the license it comes under. I think the main reason we notice it more on open source apps is because they are public during development and a lot of times are already being included in your favorite distros. While the extra use does help the debugging process it can leave an impression of lack of polish.

--
At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.

Re:There is a significant difference by crucini · 2001-10-25 15:59 · Score: 2

(2) run-away processes that consume resources to no end, until the system crawls...

You could set up a process monitor script that either runs as a daemon or via cron. If the same process is using 80% CPU or 80% memory two samples in a row, it would kill the process and pop an xmessage saying "$program was using too much memory, so I killed it."
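
The decision rule in that monitor (over the limit two samples in a row) is small enough to sketch in C. This is only the bookkeeping; how the samples are collected, and the actual kill and xmessage steps, are left out. The struct and function names are hypothetical, and treating exactly 80% as over the limit is an assumption.

```c
#include <stdbool.h>

#define USAGE_LIMIT_PERCENT 80   /* threshold quoted in the post */
#define STRIKES_BEFORE_KILL 2    /* "two samples in a row" */

/* Per-process bookkeeping; the names are hypothetical. */
struct watch {
    int strikes;   /* consecutive samples at or over the limit */
};

/* Feed one sample; returns true when the process should be killed. */
bool watch_sample(struct watch *w, int cpu_percent, int mem_percent)
{
    if (cpu_percent >= USAGE_LIMIT_PERCENT ||
        mem_percent >= USAGE_LIMIT_PERCENT)
        w->strikes++;
    else
        w->strikes = 0;   /* a single calm sample resets the count */

    return w->strikes >= STRIKES_BEFORE_KILL;
}
```

Requiring consecutive strikes is what keeps a short, legitimate burst (a compile, say) from getting a process killed.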

Re:That all depends on... your selection of course by david+duncan+scott · 2001-10-25 16:05 · Score: 2

Well, yeah, because real core dumps go to the line printer. Everybody knows that.

--

This next song is very sad. Please clap along. -- Robin Zander

Explanation via Analogy by jayed_99 · 2001-10-25 16:25 · Score: 3, Insightful

The main difference between a great systems administrator and a technically competent sysadmin is paranoia.

A great sysadmin would cut out their own heart before operating without known good backups. A great sysadmin would chew their own arm off before putting something into production without testing it first in a development environment. A great sysadmin *always* has a backout plan.

And how does a lowly admin reach this amazing level of greatness, you ask?

Admins get paranoid after making hideous, terrible mistakes that immediately result in Bad Things Happening.

I have personally: killed the email server for 2 days...shut down distribution for the world's largest distributor of widgets (every Thursday for 3 weeks)...destroyed all connectivity (voice and data) to the world for 12 hours...hosed the upgrade on a 700GB Oracle database (and our backups were no good). And any semi-experienced administrator will have, at minimum, two stories that are at least this bad (like my friend who shut down trading at Fidelity for a day).

And for every one of these instances, I immediately felt the wrath of: my manager, my manager's manager, other people's managers, other people who were affected, stray people wandering by my cube who weren't affected...I also became a part of the "mythical sysadmin storybook"--"I once worked with this guy, and (you won't believe this) he..."

I submit the hypothesis that: generally, most developers are not subject to this type of immediate and extremely negative form of feedback for their mistakes. Therefore it takes a developer a long time to develop an aversion reflex that conditions them to do "the right thing -- error handling, code documentation" instead of doing "the easy, interesting, enjoyable and sexy thing -- making spiffy algorithms, writing tight code".

Drifting into another analogy, error handling is like code documentation. Why do most developers get good (and a little obsessive) about documenting code? Because they finally spent some years trying to maintain someone else's tight, sexy code that is virtually incomprehensible.

So, my point is, developers take a long time to viscerally learn the need for good error handling by repeatedly getting whacked on the head for lack of error handling. It's like evolution in action.

Another solution by KnightStalker · 2001-10-25 16:43 · Score: 2

This will probably annoy programmers who started with "pure" C++, Java, or VB.

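#include <stdlib.h>

#define SOME_NUMBER 100 /* placeholder size; the original post leaves it undefined */

int allocate_3(void){
    int *buf, *p1, *p2, *p3 ;

    /* One allocation for all three arrays means one failure
       check here and one free() at the end. */
    buf = malloc(3*SOME_NUMBER*sizeof(int)) ;
    if (!buf) { return -1; }

    p1 = buf ;
    p2 = buf + SOME_NUMBER;
    p3 = buf + SOME_NUMBER*2 ;

    /* Here we do something with p1, p2, p3 */

    free ( buf ) ;
    return 0 ;
}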

--
* And remember, it's spelled N-e-t-s-c-a-p-e, but it's pronounced "Mozilla."

Just say no... by j3110 · 2001-10-25 17:47 · Score: 2, Interesting

to bugs. Bugless programs don't need error checking, just input bounds checking. Use Lisp and prove your program with mathematical induction, or optionally, you can keep the same mindset in C. Just don't let the user screw up. If you have a finite set of inputs, you can very easily see that your program won't fail. I find it much easier to create functions to test that certain things work as expected than putting try blocks around things.

Now that I've given all the tips that I'm aware of, it's time for the justification of my own faulty behavior that can't be justified :)

I think open source software does well for bug handling though. The biggest things I can think of that a lot of open source projects have faults with were never meant to be mission critical || are v1.0 || miscoordination caused some negative synergy. As for the first two, you should expect failures. The last is going to happen to even the best. I think it is a testament to OSS still. With such little time to invest, all the products I've seen get better every day.

And here come the excuses... :)

I really wouldn't call it laziness, but more a lack of motivation. The bulk of OSS is written in a geek's spare time, which in itself is small if the geek attends college and works. You have to account for all the reading a geek has to do on a daily basis. (Slashdot, Freshmeat, Changelogs, Anandtech et al, Pricewatch & EBay) Then account for all the time a geek spends perfecting his own system. (New kernel, apt-get, compiling his special favorite programs (MySQL, Apache, PostgreSQL, XBill)) By the time you get done with all the things you try to stay on top of, you really don't have much time left. From there on out, you're sleepy and are working purely on caffeine. You will inevitably make a few mistakes. :) It's not laziness... just tired, and you aren't really getting paid for the work, so why try as hard as you do for paid work? I'm not even mentioning games, which I believe are essential. All programmers need to take their frustrations out on some helpless AI creature, or else they would buckle under the stress.

Before someone says it, I know the rewards of OSS programming. If there were no rewards, then no one would do it in the first place.

--
Karma Clown

Something to be said for baby-sitting mainframes by dinotrac · 2001-10-25 17:50 · Score: 3, Interesting

I'm astonished at the poor error-handling in most software these days.

The biggest problem is not whether your language has exceptions (good error-handling has been done for years without them) or whether programmers are lazy. It's a matter of making it a priority. In fact, laziness caused a lot of us old-timers to take a major interest in error-handling.

Picture the days before internet access, running mainframe systems, probably with overnight batch cycles.

Good error handling might mean that you don't get a phone call at 3:00 am.
If that phone call comes, good error messages might mean that you can diagnose the problem over the phone and walk the operator through recovery.
In either case, you don't have to drive down to the data center.

Sleep. Now there's a motivator.

Not a competition by Dwonis · 2001-10-25 18:09 · Score: 2

Do you think commercial software handles errors better?

Dammit! This isn't a bloody pissing match! If we're going to set the bar so low that "it's okay as long as it's a little better than closed source", then we're destined for failure.

Instead, why don't we take this criticism at face value? "Open source programmers stink at error handling." Fine. Let's start disciplining ourselves and write our code with meticulous care. After all, we have no deadlines, we don't need to cut corners, we collectively have more time on our hands, so why couldn't we write excellent code if we trained ourselves to be careful? I think it's possible.

Re:Of Course by Malc · 2001-10-25 18:49 · Score: 2

I'm not running XP, and I highly doubt I will in the near future. Win2K serves my needs perfectly well thank you ;) The error dialog doesn't come up for every app: the IE6 one is different to the one in Office XP, and nothing comes up after the frequent crashes of my own programmes that I'm testing. As for IE6 auto-restarting, I don't recall changing anything to make it do that. Office XP apps also restart automatically after one of these crashes.

C HAS exceptions. by epeus · 2001-10-25 19:12 · Score: 2

Exceptions are mandatory for good programming, period. If the language you are using doesn't support exceptions (C, Perl, etc), you are going to have problems. Exceptions make sure that if an error occurs, and you aren't aware of it, your program dies, and doesn't go on its merry way, causing a security hole/unstable software.

C++ is implemented in C. Get out your copy of K&R and look up setjmp and longjmp. Do they sound scary? They should.
That is how C++ exceptions work too. Throwing an exception without catching it is calling longjmp without setjmp.
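
For anyone who has not met setjmp and longjmp, here is a small sketch of using them as a try/catch in plain C. The names (recovery_point, parse_record, run) and the error code are made up for illustration; the two library calls are standard C from <setjmp.h>.

```c
#include <setjmp.h>

#define ERR_BAD_RECORD 1         /* arbitrary nonzero error code */

static jmp_buf recovery_point;   /* hypothetical name */

/* Deep in the call chain: bail out instead of returning an error. */
static void parse_record(int value)
{
    if (value < 0)
        longjmp(recovery_point, ERR_BAD_RECORD);   /* the "throw" */
}

static void process(int value)
{
    parse_record(value);   /* no error check needed at this level */
}

/* Returns 0 on success or the error code that was "thrown". */
int run(int value)
{
    /* setjmp returns 0 when called directly, and the value given to
     * longjmp when control arrives here by a jump. */
    switch (setjmp(recovery_point)) {
    case 0:
        process(value);            /* the "try" body */
        return 0;
    case ERR_BAD_RECORD:
        return ERR_BAD_RECORD;     /* the "catch" branch */
    default:
        return -1;
    }
}
```

The scary part is everything the jump skips: nothing allocated or opened between the setjmp and the longjmp is released for you, and non-volatile locals modified in between have indeterminate values afterwards.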

It is your job as a programmer to check error return values, and write your code to clean up after itself if an error is returned. Throwing an exception is a cop out from cleaning up properly.
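
One common way to do that in C is the single-exit cleanup idiom: check every return value and funnel all failures to one label that releases whatever was acquired. The sketch below is only an illustration; copy_head and its error numbers are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical example: copy at most `n` bytes of `src` into `dst`.
 * Returns 0 on success, or a negative code naming the step that failed. */
int copy_head(const char *src, const char *dst, size_t n)
{
    int rc = 0;
    FILE *in = NULL, *out = NULL;
    char *buf = NULL;
    size_t got;

    buf = malloc(n ? n : 1);
    if (!buf) { rc = -1; goto done; }

    in = fopen(src, "rb");
    if (!in) { rc = -2; goto done; }

    out = fopen(dst, "wb");
    if (!out) { rc = -3; goto done; }

    got = fread(buf, 1, n, in);
    if (ferror(in)) { rc = -4; goto done; }

    if (fwrite(buf, 1, got, out) != got) { rc = -5; goto done; }

done:
    /* Single exit: release whatever was acquired. A failed close of the
     * output counts as a write error, since buffered data may be lost. */
    if (out && fclose(out) != 0 && rc == 0) rc = -5;
    if (in) fclose(in);
    free(buf);
    return rc;
}
```

Every early exit goes through the same cleanup, so adding a new failure case cannot leak the buffer or a file handle. Note that this sketch can still leave a partial dst behind on failure; removing it would be one more line at the done label.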

If your app aborts when memory or disk space is low, you could lose hours of work for your user. This is not going to make the user think your app is stable.

Re:you didn't read the article by Malcontent · 2001-10-25 19:15 · Score: 2

This is Slashdot; any post criticizing Linux or open source gets modded up. Where have you been?

--

War is necrophilia.

That IS what error codes are for. by epeus · 2001-10-25 19:27 · Score: 2

The problem with result codes is that you can't propagate the problem up to the level of scope that should be dealing with it. For example, imagine you have a GUI program. At some point, it needs to open "foo.txt", but fails. Since you're a good software engineer, you've well-separated your GUI code from logic code. The GUI needs to display an error message, but if you only check error calls, the only part that knows about the error that has happened is way down in the logic code, which has no idea how to tell the user. And propagating 'undef's all the way up through the code is uncool. Especially since return values should not be used to indicate errors; they should be used for return values.

That last sentence is stupid dogma. Take a look at the Mac OS APIs sometime. Almost all routines return an error value, of type OSErr. 0 means noErr, negative error values are well-defined by the OS. Positive errors above a certain range are left for applications to use.

With this convention, an error can be passed up the chain, and interpreted or transformed at each stage into something meaningful for the stage above.

At the GUI level, you can map error codes to strings based on these well-known values.
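
A minimal sketch of that convention in C, for the "foo.txt" style of failure discussed above. To be clear, the type app_err, the constants and the function names are all hypothetical; these are not the real Mac OS constants, only the same shape (0 for success, negative for errors).

```c
#include <stdio.h>

/* OSErr-style convention: 0 is success, negative values are errors.
 * These names and numbers are invented for illustration. */
typedef int app_err;
enum {
    APP_OK            =  0,
    APP_OPEN_FAILED   = -1,   /* low level: the file could not be opened */
    APP_CONFIG_FAILED = -2    /* higher level: settings could not be loaded */
};

/* Lowest layer: reports exactly what went wrong. */
static app_err open_file(const char *path, FILE **out)
{
    *out = fopen(path, "r");
    return *out ? APP_OK : APP_OPEN_FAILED;
}

/* Middle layer: transforms the low-level code into one that means
 * something to its own caller. */
static app_err load_config(const char *path)
{
    FILE *f;
    app_err e = open_file(path, &f);
    if (e != APP_OK)
        return APP_CONFIG_FAILED;
    fclose(f);
    return APP_OK;
}

/* GUI layer: maps well-known codes to user-facing strings. */
const char *describe(app_err e)
{
    switch (e) {
    case APP_OK:            return "OK";
    case APP_OPEN_FAILED:   return "The file could not be opened.";
    case APP_CONFIG_FAILED: return "Your settings could not be loaded.";
    default:                return "An unknown error occurred.";
    }
}
```

The logic code never needs to know how to talk to the user; it only has to return a code, and the GUI decides what to say about it.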

My favorite MS error messages. by Malcontent · 2001-10-25 19:28 · Score: 2

"When was the last time Windows gave you a nice error.log when it blue-screened, or how about IIS on a buffer overflow?"

First of all, logging in Windows pretty much sucks ass no matter what you are dealing with. I suspect this is due to a severe lack of any decent text tools like awk, grep, tail etc. Windows admins would get too confused with utilities like that.

That aside here are my favorite error messages I deal with pretty routinely.

From Access "there is no message for this error". Oh yeah, that's real helpful.
When importing data into SQL server "Overflow". No mention of line numbers or data types or field names. All you know is that one line of thousands had some data that SQL server did not like. Good luck finding it. What I do here is to create the same structure in postgres and import it into there. Postgres tells me what line and what data is bad. Postgres is a great debugging tool for SQL server and in many ways a much better database.

And in ASP pages sometimes it pukes with a number (no message). A search on this number on the MS web site reveals that the error message means "exception occurred". Wow, that's real helpful, huh? A search on Google shows many people with this problem and nobody giving solutions. My answer? Re-do the page in php.

--

War is necrophilia.

Re:My favorite MS error messages. by Malcontent · 2001-10-26 18:14 · Score: 2

Not this particular error. I checked; it's a mystery error whose error message is really "exception occurred". Nobody seems to know why it happens or how to fix it. I wish I had the number handy but it's at work.

--
War is necrophilia.

Its not detecting the errors, its handling them. by Convergence · 2001-10-25 19:58 · Score: 2

It's usually pretty easy to detect errors like this, for example, the program dies with a SEGV. The trick with errors is not in detecting the error, but rather in figuring out what to do when you detect it.

Is this error correctable, ignorable, or fatal?

If it is correctable, what is the correct action that corrects it? This can be more subtle than you think. And this correction code adds complexity and needs to be tested.

Which errors are minor and ignorable? I.e., which are actually conditional status messages, not actual errors?

What to do in a fatal error? What is the definition of a fatal error? A lot of code does not deal with resource starvation and treats running out of RAM as a fatal error. Should it? It doesn't have to, but that would make the program orders of magnitude more complicated; it would turn every allocation into a potential exception-causing step.

By avoiding these problems and making more things into fatal errors, we make software cheaper and more plentiful. Would you rather have a netscape that crashes a couple times a month, or no netscape at all?

To respond to the article, IMHO, I'd treat the complaints that those applications print out as being debugging notifications. The computer warning about possible situations that might cause problems. By the same token, that code may not be robust, but making it robust introduces complexity and thus more risk for errors.

Apples and Oranges by jurgen · 2001-10-25 20:00 · Score: 2, Insightful

Without even having read the article (but I've read some of his previous stuff) I'm sure that Petreley didn't base his statements on actually looking at code. No doubt he has some examples of errors that are not handled from the user perspective.

But that has nothing to do with the programmers. The difference between Open Source and commercial software here is simply that companies can afford dedicated testing staff... QA departments. Most of the errors that an idiot like Petreley will be able to find will be caught by the QA department before release. Unfunded Open Source projects can't afford that kind of QA... and with time, widely used Open Source packages tend to become higher quality than much proprietary commercial software (the thousand eyeballs effect). But early releases do tend to have errors that a QA department at a company would have caught before release. That has nothing to do with the quality of the programming.

Re:That all depends on... your selection of course by malfunct · 2001-10-25 20:12 · Score: 2

How many of those users are using an ISAPI module of some sort? How many of those are written well?

Not to absolve IIS of blame, but when you run a .dll in proc and it hangs, the whole process hangs and there isn't much that IIS can do about it.

In my experience that's where I get most of the errors with IIS hanging.

--

"You can now flame me, I am full of love,"

Checking pointers - or good design by codemonkey_uk · 2001-10-25 22:41 · Score: 2

Endless checking of pointers is pointless, and wastes CPU. A much better approach is to use good design. Simple idioms such as Resource Acquisition Is Initialisation (RAII) are much more reliable than manual pointer checks.

--

Thad

Open Source Error Handling *DOES* stink... by maroberts · 2001-10-25 23:02 · Score: 2, Interesting

...but the same is equally true of the vast majority of commercial and closed source programs too. The sad fact is that jobs like reducing the number of warnings from the compiler and testing can be incredibly boring jobs that no one wants to do, so no one does them except in the most perfunctory manner.

It's been said that a lot of open source development projects ought to have some form of Audit person or team whose job it is to look at the project code and then when they find problems to go and reeducate the person who wrote the faulty code, preferably teaching him not to do it again [with a large hammer if necessary!]

--

Donte Alistair Anderson Roberts - hi son!
Karma: Chameleon

most developers stink at error handling by mj6798 · 2001-10-25 23:18 · Score: 2

Open source is hardly alone in this. Commercial software may detect errors with greater regularity, but it, too, rarely does the right thing when it actually finds an error (a dialog box is not usually the right thing). Languages also often do the wrong thing: C has no exception handling or automatic cleanup, Java encourages programmers to handle exceptions poorly, and only very few languages have restartable operations. I think to address this, we need a lot more training and education, but what else is new.

Speaking of Neural Networks ... by Aceticon · 2001-10-25 23:21 · Score: 2

I once implemented a Neural Network for a school project (from the bottom up in C++).

The thing was trained to recognize numbers, but we never got a success ratio bigger than 85%.

Supposedly we should've managed to get more than 90%, but there was a programming error in the code for the NN implementation.

The interesting thing is that the Neural Network actually adjusted to compensate for a bug in itself and achieved an 85% success ratio ...

Now that's error handling

Re:Killed accting server during merger & yeare by budgenator · 2001-10-26 00:17 · Score: 2

Try explaining the code to the receptionist at your dentist's office. Can you do it? If not, maybe you don't really understand it yourself. Many problems in personal programming projects of mine were solved by explaining it to my wife. People like these tend to ask stupid questions, which generally point out your stupid assumptions.

If the code is too hard to explain, it's probably too complicated. If it's too complicated it's probably slow and buggy too. One thing I hate is that a lot of OSS projects require certain libraries that are unavailable. After development it helps to test them on a plain vanilla distro w/o a bunch of development libs just to see if they still work and if the required libs can be installed without breaking the rest of the system.

--
Apocalypse Cancelled, Sorry, No Ticket Refunds

if the program doesn't crash and burn, by budgenator · 2001-10-26 00:30 · Score: 2

you should at least be able to hit cancel, clean out a couple 100 MB of GoatPorn and resave without losing anything except your patience.
If the program crashes, loses all of your work and corrupts the OS, at least others can use your program as a bad example. Actually, I remember when a 1.44 MB floppy was big and fast compared to a 1500 baud cassette tape for storage.

--
Apocalypse Cancelled, Sorry, No Ticket Refunds

Re:That all depends on your point of view by mr3038 · 2001-10-26 00:41 · Score: 2

This is something that annoys me - programs that just go 'Segmentation Fault' in the night... What do you do with a seg fault? How do you find the cause of a seg fault? What causes them? Unmatched libraries and application code? Null / Uninitialised pointer exceptions?

I take it that you're using Linux or something similar. Suppose you were running ./foo and it went belly up (core dumped). Simply run "gdb ./foo core" and you can see what the program was doing when it died. Usually a simple "bt" for backtrace is enough to find the reason. If you didn't get a core file you might want to check your limits (man ulimit). Usually a segmentation fault is caused by incautious pointer usage - the programmer made a copy of a pointer and used it after the original resource was freed, or something like that. If you see a segmentation fault in malloc() you can be pretty sure that the problem is some extra free().

--
_________________________
Spelling and grammar mistakes left as an exercise for the reader.

Get a life by budgenator · 2001-10-26 00:46 · Score: 2

1 I want to work, not run exotic diagnostic programs on other people's software.

2 I considered the disk-writing dialog disappearing after the program finished writing to the disk buffer, but before the floppy was written to, to be the first indication that Windows 95® had a problem.

This is exactly the kind of thing the topic is talking about: error and exception handling. If you can't anticipate a common user error and make your software robust enough to handle it, then your reputation is going to suffer.

--
Apocalypse Cancelled, Sorry, No Ticket Refunds

Is OSS ever finished? by markmoss · 2001-10-26 01:42 · Score: 2

I wonder how many of those "error" messages really indicate errors? When I am programming, I will put in lots of messages to make debugging easier later on. I will disable them on the final compile, but there have been times when this got forgotten in the rush to release on the deadline. I wonder how often that happens with OSS -- especially since OSS releases are usually not the end of the project.

Also, messages that were intended simply to show the progress of the program or confirm it went down the correct path often inadvertently sound threatening: "Cannot find file xxxx.xxx", when what you really meant was "No initialization file xxxx.xxx found, using defaults."
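
A minimal Python sketch of the distinction drawn above, assuming a hypothetical `load_settings` helper, a made-up `DEFAULTS` table and a simple `key = value` file format (none of these come from the post). A missing initialization file is reported as what the program did about it, not as a bare failure:

```python
import logging

log = logging.getLogger("settings-demo")  # hypothetical logger name

DEFAULTS = {"color": "blue", "autosave": "on"}  # made-up default settings


def load_settings(path):
    """Return settings from `path`, falling back to DEFAULTS if it is absent."""
    try:
        with open(path, encoding="utf-8") as fh:
            pairs = (line.strip().split("=", 1) for line in fh if "=" in line)
            return {**DEFAULTS, **{k.strip(): v.strip() for k, v in pairs}}
    except FileNotFoundError:
        # Expected situation: say what happened AND what was done about it.
        log.info("No initialization file %s found, using defaults.", path)
        return dict(DEFAULTS)
```

Other failures, such as a permission problem, are deliberately not caught here, so they still surface as real errors.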

Of course, as the author said, the problem isn't that OSS is worse than commercial software, but that it should be better. Is there anything in OSS as bad as the error message I sometimes got from Win95, "Cannot find file", without the file name and path? Not to mention how Windows allows an application to silently malloc some memory, forget to free it, and repeat until it crashes a different application or the OS itself...

Error Handling needs to be part of OO analysis by Skapare · 2001-10-26 02:11 · Score: 2

Error Handling needs to be part of OO analysis and design. The analysis needs to understand the scope of the class being designed, and error conditions need to be part of that. To the extent that the class analysis suggests that the class can deal with the errors, the design should specify that. All others should be part of the class interface. A clean reusable class has no business outputting text to stderr (unless the basis of the class is to interface with the user or administrator). All error conditions should be given back to the program using the class, with appropriate supporting information. The application then deals with it in some way more appropriate for the user. If an object cannot allocate memory, it should tell the application, not the user. The application can then tell the user.

There is one danger in this. If Microsoft follows this practice, when a class encounters an out of memory condition, the next day you'll end up with a Fedex arriving labeled "Here is the new RAM your computer ordered for you, courtesy of .NET and Passport. Your account has been dinged".
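
The division of labour described in this post can be sketched in Python roughly as follows. `SlotPool`, `AllocationError` and `application_main` are hypothetical names invented for illustration: the reusable class never writes to stderr, it only hands the error and its supporting information back to the caller.

```python
import sys


class AllocationError(Exception):
    """Raised by the reusable class; carries supporting information."""

    def __init__(self, requested, available):
        super().__init__(f"cannot reserve {requested} slots ({available} available)")
        self.requested = requested
        self.available = available


class SlotPool:
    """Hypothetical reusable class: it never prints, it only reports."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def reserve(self, count):
        available = self.limit - self.used
        if count > available:
            raise AllocationError(count, available)
        self.used += count


def application_main(pool, count):
    """The application, not the class, decides how to tell the user."""
    try:
        pool.reserve(count)
        return "ok"
    except AllocationError as err:
        print(f"Could not complete the request: {err}", file=sys.stderr)
        return "failed"
```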

--
now we need to go OSS in diesel cars

The error handling challenge: official solution by TrixX · 2001-10-26 02:33 · Score: 2

in pseudo-Eiffel:

allocate_3 is
    require
        SOME_NUMBER>=0
    local
        p1,p2,p3: INT_POINTER
    do
        p1 := malloc(SOME_NUMBER*sizeof(int))
        p2 := malloc(SOME_NUMBER*sizeof(int))
        p3 := malloc(SOME_NUMBER*sizeof(int))
        /* do something with p1,p2,p3 */
        free(p1)
        free(p2)
        free(p3)
    rescue
        free(p1)
        free(p2)
        free(p3)
    end

Notice how little changed from the original program. You can find a similar C solution and a discussion of the problem (as an example on error-handling) at this document.

Note that this solution does all these things (compare with the other solutions posted):
* Frees all memory, no matter if things succeed or fail, and even if things fail in the do_something part
* Checks that SOME_NUMBER is valid (non-negative) and does not overflow when multiplied by sizeof(int)
* Does not have a deeply nested structure
* Has an obvious and visible flow control
* Works as a non-error when SOME_NUMBER is 0
* Allows calling routines to get the same kind of clean error-handling
* Works robustly when other error conditions I haven't thought of happen.

Yes, C allows all this, but it is a pain in the neck, the code gets big and messy, and hard to maintain. So error-checking in C comes at a great cost...
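
For comparison, here is a rough Python analogue of the same shape, not a translation of the Eiffel. It assumes a hypothetical `Buffer` class standing in for a malloc'd block and a caller-supplied `work` function; `contextlib.ExitStack` plays the part of the rescue clause by running every registered cleanup whether the body succeeds or fails. Python integers do not overflow, so that check from the list has no counterpart here.

```python
from contextlib import ExitStack


class Buffer:
    """Hypothetical stand-in for a malloc'd block; records whether it was freed."""

    def __init__(self, count):
        if count < 0:
            raise ValueError("count must be non-negative")
        self.data = [0] * count
        self.freed = False

    def free(self):
        self.freed = True


def allocate_3(count, work):
    """Allocate three buffers, run work(p1, p2, p3), always free what was allocated."""
    buffers = []
    with ExitStack() as stack:
        for _ in range(3):
            buf = Buffer(count)
            stack.callback(buf.free)  # runs on success and on any exception
            buffers.append(buf)
        work(*buffers)
    return buffers
```

If the second or third allocation fails, only the buffers already registered are freed, and any exception still reaches the caller, which can apply the same pattern one level up.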

why is error handling rotten? by dbrower · 2001-10-26 02:42 · Score: 2, Insightful

often because it visually bloats code, and obscures the "beauty" of the underlying algorithm. How often have you seen a published fragment that says, "error handling omitted for clarity"? The big difference I've seen in good commercial application code is that the error handling code is often 1/4 to 1/2 the code bulk. Yes, exceptions can help reduce that. But the key point is being willing to go in and "uglify" your code to do all the error handling. It is unglamorous, and appears to be non-functional, so it is easy to blow off.

Two things I've learned are that (a) every "if" has an implied "else" clause that often represents an unconsidered error, and (b) those else cases, and other unexpected situations shouldn't be logged, they should be "asserted" in a way that makes the program stop dead, now. That forces you to fix them when they happen. The business the author cites of getting all these messages is truly evil, as it really helps no one, neither the programmer nor the end-user.
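
A small Python illustration of point (a) and (b), using a hypothetical `shipping_cost` function and made-up rates. The "implied else" is written out and made fatal. It raises `AssertionError` directly rather than using the `assert` statement, because `assert` statements are skipped when Python runs with `-O`.

```python
def shipping_cost(region, weight_kg):
    """Hypothetical pricing rule with every branch accounted for."""
    if region == "domestic":
        return 5.0 + 1.0 * weight_kg
    elif region == "international":
        return 20.0 + 2.5 * weight_kg
    else:
        # The "implied else": not logged and ignored, but fatal right here.
        raise AssertionError(f"unhandled region: {region!r}")
```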

-dB

--
"It if was easy to do, we'd find someone cheaper than you to do it."

Bugless code. by Peaker · 2001-10-26 02:45 · Score: 2

Would open source programmers thrive if they used a language that required them to provide a logical step-by-step proof of their code, side by side with the code?

Example of a function declaration, the mathematical specification it MUST abide by, and the logical proof that it does (in plain English, as the syntax is not thought out yet):

- Define function sort.
- Function sort takes a sequence, and returns a sequence of the same type.
- The returned sequence is of the same size as the given sequence.
- For-any-element in the given sequence, there exists an identical element in the returned sequence.
- For-any-element in the returned sequence, but the first, the element before it is smaller-than or equal-to it (polymorphic smaller-than or equal-to)

With this mathematical specification, and code that sits next to the logical steps required to prove it abides this specification, we can know for sure that sort() works correctly. Whether or not it leaks memory, is another issue, but disallowing allocation of "global" memory (side-effect allocation), and mathematically specifying memory requirements, you can ensure 0-bugs there too.

Bugs in mathematical specifications will remain the only source of problems, but those would be rare, because the mathematical code is much more trivial.
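
As a rough illustration only, the plain-English conditions above can be written as runtime checks in Python. This is a test, not the proof the post asks for, and `meets_stated_spec` and `is_permutation` are names made up here. Read literally, the listed conditions seem to accept `[1, 2, 2]` as a "sort" of `[1, 1, 2]`, since every given element appears somewhere in the result, so the sketch adds a stricter multiplicity check alongside.

```python
def meets_stated_spec(given, returned):
    """Check the plain-English conditions listed in the post, read literally."""
    same_type = type(returned) is type(given)
    same_size = len(returned) == len(given)
    all_present = all(x in returned for x in given)
    ordered = all(returned[i - 1] <= returned[i] for i in range(1, len(returned)))
    return same_type and same_size and all_present and ordered


def is_permutation(given, returned):
    """Stricter condition: same elements with the same multiplicities."""
    remaining = list(returned)
    for x in given:
        if x not in remaining:
            return False
        remaining.remove(x)
    return not remaining
```

For example, `meets_stated_spec([1, 1, 2], [1, 2, 2])` is true while `is_permutation([1, 1, 2], [1, 2, 2])` is false.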

As for performance, there is nothing that the semantics of the actual code must abide by, as long as it is proven to meet the mathematical requirements. Therefore, the performance of the code should be at least as high as in any other language, depending on the implementation and the chosen semantics.

Test-First Programming by Frank+Sullivan · 2001-10-26 02:46 · Score: 2

Test-First Programming (TFP) is a key part of the Extreme Programming methodology. The JUnit unit testing library has been ported from Java to pretty much every widely used language. So the tools are there to produce robust code.

Here's how it works... BEFORE you write the body of a method or function, you write a unit test(s) for that function, to make sure it provides correct results for whatever inputs you might encounter. All of those tests should fail. THEN you write the body of the method/function. All the tests then should pass. If the tests don't pass, fix until they do. If bugs are encountered later that aren't caught by the unit tests, use test-first for the repairs - that way, you know your fix actually works. Just keep adding tests as you learn more.

Now put calling those unit tests into a framework and call it from your makefile. Unit test every time you compile.
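
A minimal sketch of what that looks like with Python's `unittest`, one of the JUnit-style libraries. The function under test, `parse_port`, and its rules are hypothetical, chosen only to have something to test; in test-first order the three tests would be written, and seen to fail, before the function body exists.

```python
import unittest


def parse_port(text):
    """Hypothetical function under test: parse a TCP port number from text."""
    try:
        port = int(text)
    except (TypeError, ValueError):
        raise ValueError(f"not a number: {text!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortTest(unittest.TestCase):
    # In test-first style these are written before parse_port itself.
    def test_accepts_valid_port(self):
        self.assertEqual(parse_port("8080"), 8080)

    def test_rejects_non_numeric_input(self):
        with self.assertRaises(ValueError):
            parse_port("http")

    def test_rejects_out_of_range_port(self):
        with self.assertRaises(ValueError):
            parse_port("70000")
```

Saved in a file such as test_ports.py (the name is an assumption), this runs with `python -m unittest test_ports`, and a makefile target can invoke the same command on every build.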

Here are some of the benefits...
1. If new code breaks old code, the unit tests catch the error, and you can fix it appropriately right away.
2. You code with far greater confidence.
3. You keep your APIs very clean, because you have to test them right away.
4. Your APIs are thoroughly documented by the unit tests themselves.
5. Maintenance, especially by other programmers, is far easier, because they have the unit tests for reference and can easily narrow down where any bugs occur.
6. Refactoring is much easier, as any errors caused by refactoring are caught by the tests.

TRY THIS. It will change your whole approach to programming!

--
Hand me that airplane glue and I'll tell you another story.

Pft, Error handling is for wusses. by pi_rules · 2001-10-26 03:13 · Score: 2

I've been asked before in projects, 'What kind of error handling mechanism will you/we be using?'. My response is usually a cocky, "Pft, we don't put errors in our code, so why would we look for them?".

Re:Code Review by bluGill · 2001-10-26 03:35 · Score: 2

I disagree. Code reviews do an excellent job of catching errors. However you need a code review as soon as it compiles. Code reviews will catch those bugs that you spend the first couple weeks getting rid of. [if (x=y) instead of if (x==y), and some logic errors if(Error) { normal case} else { error case}]

I just committed a piece of code that cannot be checked any other way. It checks for hardware errors, but the hardware modifications needed to reproduce them are not worth the cost now that the coding is done. Code reviews are my only chance of getting the code to work right.

Exceptions are overated. by pi_rules · 2001-10-26 04:09 · Score: 2

Exceptions are little more than syntactically nicer glorified GOTO statements. Take a peek into the Linux kernel and you'll see gotos are actually used fairly often for exception handling. This isn't saying that you -should- use gotos, because they are harder to trace through in code, but programmers should at least recognize just how pithy they really are. They're a flow-control mechanism more than a fix-all error handling mechanism, like I see them being used all too often by beginning programmers. ie:

try {
    function_A();
} catch (...) {
    // ignore it
}

The only real upshot that I see to exceptions is that it allows the error to traverse back up the calling stack (or down, however you look at it) until somebody catches the thing. All this adds overhead to the entire program though when it's compiled to be made aware of exceptions (in the case of C++... Java just keeps track of it all the time).

Exceptions certainly -can- be used in a proper manner but they can be abused too. One thing I'm not fond of is code like this:

try {
    // Open file for reading
    // read password
    // open db connection using password
    // pull back recordset from db
} catch (...) {
    // Now what?
}

Sure, in the above you should be trying to catch different exceptions (one for file IO, one for the db, perhaps one for the recordset). Once you start really getting down to a line-by-line error handling mechanism things just get awkward. Larger code blocks leave you wondering how control got to the catch{} block in the first place when you're not sure which line actually tossed the error out. To do things properly you almost need to be trying{} and catching{} every single line of code IMHO. Guess what? We're back to C style error-return value handling now. I think that was my point to begin with...
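
For readers who want to see the "different exceptions" variant spelled out, here is a hedged Python sketch. `PasswordFileError`, `DatabaseError`, `fetch_records` and the hard-coded password are all hypothetical stand-ins. One try block still covers the whole sequence, and the exception type, rather than a per-line try, says which step failed.

```python
class PasswordFileError(Exception):
    """Hypothetical: the password file could not be read."""


class DatabaseError(Exception):
    """Hypothetical: connecting to or querying the database failed."""


def read_password(path):
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read().strip()
    except OSError as err:
        raise PasswordFileError(f"cannot read {path}: {err}") from err


def fetch_records(password):
    # Stand-in for opening a connection and pulling back a recordset.
    if password != "s3cret":
        raise DatabaseError("login rejected")
    return ["row1", "row2"]


def load_report(path):
    """Return (status, payload); the status names the step that failed."""
    try:
        return ("ok", fetch_records(read_password(path)))
    except PasswordFileError as err:
        return ("configuration", str(err))
    except DatabaseError as err:
        return ("database", str(err))
```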

Re:Exceptions are overated. by scrytch · 2001-10-26 05:09 · Score: 2

} catch (...) {
// Now what?
}

I see that form misused too, but it hardly indicates the opposite extreme of catching in every single line of code. It's simple: if you don't know what to do with an error, let it fall through and catch it at the level of its most basic subsystem. If you don't know how to handle a particular DB error, let it propagate to the top of the DB subsystem, and it'll probably involve recycling the current connection. And if you get some totally insane error, let it fall all the way through to the top, generate a user-visible error, consider it a failure of the entire subsystem that generated it, and see if you can gracefully restart it without losing too much state.
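
A small Python sketch of that layering, under the assumption of a toy `DbSubsystem` class with a made-up failure rule. Low-level code catches nothing, the top of the subsystem recycles its connection and re-raises, and the outermost loop turns whatever reaches it into a user-visible message.

```python
class DbError(Exception):
    """Hypothetical error type for everything raised inside the DB subsystem."""


class DbSubsystem:
    def __init__(self):
        self.connections_opened = 0
        self._connect()

    def _connect(self):
        self.connections_opened += 1  # stand-in for opening a real connection

    def _run(self, query):
        # Low-level code does not catch anything; errors simply fall through.
        if "DROP" in query:
            raise DbError("query refused")
        return f"result of {query}"

    def query(self, text):
        """Top of the subsystem: the one place that knows how to recover."""
        try:
            return self._run(text)
        except DbError:
            self._connect()  # recycle the connection, then report upward
            raise


def main_loop(db, queries):
    """Outermost level: anything that got this far becomes a user-visible error."""
    results = []
    for q in queries:
        try:
            results.append(db.query(q))
        except DbError as err:
            results.append(f"error shown to user: {err}")
    return results
```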

Everything is just syntactic sugar on UTM's or S+K combinator application. Just because it doesn't compute anything differently doesn't mean it doesn't express the concept differently.

--
I've finally had it: until slashdot gets article moderation, I am not coming back.

Re:Once Again... by trongey · 2001-10-26 04:22 · Score: 2

Your third paragraph," If the mySQL people release a free database and it loses the occasional record, and its later determined that mySQL was used to track nuclear arms and now some are missing, should the guy who wrote that code by put in jail or fined billions? I mean, he wrote the software he should provide a warranty!" does a great job of reinforcing my point. When software vendors are pressured to guarantee a product they have much more reason to ensure that it works.

Warranties generally include conditions under which they apply or are void. If the product isn't suitable for a particular use then consumers can make an informed decision about that use.

--
You never really know how close to the edge you can go until you fall off.

Recover from this by yerricde · 2001-10-26 04:25 · Score: 2

If you can't hold on to the user's data if and when you I/O fails then it's time to take a look at the design..

OK. Yank the hard drive from the computer while it's still on. Now lock the hard drive in a safe. Now try to recover your last hour's worth of changes. Are you implying that all programs should always transparently backup off-site? That would result in unacceptable latency for users on 56K or slower connections who try to edit large documents.

OK. Now do something to make the computer swap a lot. Now yank the hard drive. How is the OS supposed to continue in such a situation?

--
Will I retire or break 10K?

Re:An alert box requires allocating memory by Thomas+Charron · 2001-10-26 04:49 · Score: 2

That's why you ensure that something is watching the system all along, and ensuring that certain things are within limits.

Ever notice how Windows will occasionally say 'Your system is low on virtual memory'? Same thing. Presumably, you can get the app to shut itself down to prevent such a catastrophic failure.

--
-- I'm the root of all that's evil, but you can call me cookie..

Error numbers are pure evil by Per+Abrahamsen · 2001-10-26 05:33 · Score: 2

The time for numbers has passed. Use a short mnemonic keyword instead; computers handle them just as well as numbers these days, and humans handle them way better.
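
In Python, for instance, this could look like the following sketch. The `ErrorCode` catalogue and its three entries are invented for illustration.

```python
from enum import Enum


class ErrorCode(Enum):
    """Hypothetical error catalogue keyed by mnemonic rather than number."""
    DISK_FULL = "no space left on the target volume"
    CONFIG_UNREADABLE = "configuration file could not be read"
    NET_TIMEOUT = "remote host did not answer in time"


def describe(code):
    """Format an error for a log line or a dialog."""
    return f"{code.name}: {code.value}"
```

The keyword is what the user can read, search for and report, and the program can still compare codes by identity or look one up by name with `ErrorCode["NET_TIMEOUT"]`.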

Re:That all depends on... your selection of course by Krelnik · 2001-10-26 05:53 · Score: 2

Open source, like everything else in life, strictly follows Sturgeon's Law: Ninety percent of everything is crud

No indication of error by sfe_software · 2001-10-26 06:57 · Score: 2

I use KMail (under Gnome no less) for my email. It's a great client, and handles tens of thousands of emails without much fuss.

24 days ago yesterday, I transferred all of my account settings to a new username. Somehow I managed to forget to chown 'kmailrc', a few directories deep. I didn't notice this until I closed and re-opened KMail 24 days later...

So after 24 days of adding POPs, tweaking filters, etc, I find out these things never were written to the config file. I found this out NOT by an error message -- KMail pretended everything was fine. I only found the problem after losing the settings that had apparently been in memory for the last month...

Frustrating to say the least; I would have appreciated even "Can't open 'kmailrc': permission denied" or better yet a chance to chown and retry. Nonetheless, I haven't found anything better (and it was my screw-up), and I don't have time to try and get Evolution to compile... and anything beats going back to Windoze...
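
The behaviour asked for here is cheap to provide. A minimal Python sketch, which is not KMail's actual code; `save_config` and `ConfigSaveError` are hypothetical names:

```python
class ConfigSaveError(Exception):
    """Hypothetical error raised when settings cannot be written to disk."""


def save_config(path, text):
    """Write settings to `path`, or raise an error naming the file and the cause."""
    try:
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(text)
    except OSError as err:
        # err.strerror is e.g. "Permission denied"; surface it instead of hiding it.
        raise ConfigSaveError(f"Can't open '{path}': {err.strerror}") from err
```

A caller that catches `ConfigSaveError` can show the message and offer a retry once the user has fixed the ownership, instead of carrying on as if the write had succeeded.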

--
NGWave - Fast Sound Editor for Windows

Re:Three true things by Guy+Harris · 2001-10-26 07:13 · Score: 2

The point is that 1 and 2 may be true, but the whole point of open source is that one can fix the code. This does not mean that you have to, but it means that you can.

You can if you are a programmer, and familiar with the language in which the program is written, and familiar with the theory behind the program (if any).

If you can fix it, do so. If you can't, submit the bug. But whining and doing nothing are not acceptable.

If "submit the bug" means "report the problem to the developers" (they may not have a formal bug-tracking system; perhaps they should, but that doesn't mean they necessarily will), then, yes, one should do that, regardless of whether the software is open-source. If the developers don't know about it, they aren't likely to fix it, except by accident.

However, all too often the "fix it yourself, or submit a bug" response to complaints gets over-simplified to "fix it yourself, you have the source", which is an error - there may be users who don't have the knowledge to fix the problem themselves. If that's what you really meant, that's what you should have said, clarifying "do something useful", which, in your previous message, was preceded by "Open source means you can fix the code", a statement, that, as noted, is not necessarily true unless you use a rather unrealistic definition of "can" (e.g., "can, if you learn a programming language and spend a lot of time studying the code and the theory behind it first").

And if your posting was responding to Petreley's article, accusing him of "whining and doing nothing", note that he wasn't just complaining about specific problems (which he should report to the GNOME, KDE, etc. developers), he was complaining about an apparent general attitude, and "filing bugs" on that may mean filing bugs on projects that don't even exist yet, so that when people write some new project they take more care handling errors, including those that "can't happen".

The difference is in error reporting. by DunbarTheInept · 2001-10-26 08:40 · Score: 2

I don't think open source software is more error prone. I just think it's more likely to *tell* you when an error happens rather than just sweep it under the rug and pretend it never happened. OSS doesn't lie. If something went wrong, it TELLS you. I'd love to see that kind of behaviour out of my Win98 desktop, so I could actually figure out why it keeps launching goofy things at startup that I don't even have installed (resulting in blank windows I have to close by hand.)

--

Don't label something "offtopic" unless you know the topic well enough to tell what's on topic.

Re:you didn't read the article by ameoba · 2001-10-26 10:29 · Score: 2

Titling an article "Open source programmers stink at error handling" is an inflammatory statement. Regardless of what is actually in the body of the article, you can't place it in a Linux/OSS oriented site without expecting that exact type of reaction from the /. crew. Hell, he probably called it what he did -hoping- to get a front page mention on slashdot.

--
my sig's at the bottom of the page.
