Calling Software Reliability Into Question
phillymjs writes "CNN is running a story on software reliability, and how the lack of it may cost more and more lives as technology creeps further into everyday products. It appears a debate is finally starting amongst everyday (read: non-geek) people about vendor liability for buggy software. Some opponents of the liability push are unsurprising: Says the story, 'Microsoft contends that setting [reliability] standards could stifle innovation, and the cost of litigation and damages could mean more expensive software.' The article also says, however, that consumers' favoritism of flashy products over reliable ones is partly to blame for the current state of software."
I like what Microsoft has been doing with security these days, quite frankly. The new security features in Windows Server 2003 look innovative and very modern, and quite easy to use.
Linux may be secure when configured correctly, but Windows Server 2003 looks to be the most secure OS out of the box at the moment.
Abortion is advocated only by persons who have themselves been born.
--Ronald Reagan
Remember, one thing M$ does well is pay lawyers.
"
:P
"OTOH, if computers were reliable enough to crash only once every few years, then users might report every crash that happens, the vendor can diagnose it, and fix the bug or family-of-bugs so that it never happens again. This is roughly what happens when a mainframe crashes, I believe - it's a big event."
I think that has a lot more to do with the critical, often costly, tasks that mainframes are used for than with crashes being an infrequent occurrence.
In my experience, infrequent crashes are much easier to ignore than ones that occur constantly.
Why can't there be a "cutting edge" in reliability?
Because software needs to be thoroughly tested before it can be called reliable. "Cutting edge" software tends to be poorly (relatively speaking) tested, since it hasn't had that much time in the real world.
Therefore, for instance, Debian stable still uses kernel 2.2 by default (although there's a 2.4 installation flavour), because it's well tested and reliable. As a result, I've never experienced inconsistency or crashes with a Debian stable release.
(Now, if you want cutting edge Debian, there's always Debian Sid (also known as unstable)).
Circular thinking. Find the golden tool required to reduce (dramatically) the time and expense to "debug it, redesign it, retest it, certify it, and release it". If you think it can be done, then you may not be the person to do it. I'm not either but problems should be seen as opportunities. How do I open all these cans quickly...
There is no golden tool, or to quote a more famous source (Frederick P. Brooks, Jr.), there's no silver bullet.
The article discusses the magic tool idea saying the Sustainable Computing Consortium "wants to create automated tools that analyze software and rate its reliability." But the problem with reliability is it's BIG. Not only must your program not crash, not leak resources, not have race conditions, and not degrade in performance over time, but it also needs to be doing the right thing during all of this (you know, what the software was written for in the first place). Finally the software has to be able to do all of this in the face of all the possible failures the system could throw at it. That's no small feat. Now what type of tool will work across the board? I won't stand in the way of someone trying to make one, but honestly I don't see it happening.
There are a lot of different testing techniques that solve different parts of these problems. And each application is going to need its own unique testing methods in addition to tried and true techniques. But all of this testing is going to take time to develop and/or adapt to the product at hand, and then to run. And as you discover reliability bugs, you may be blocked before you can find (and fix) more. So it will take a long time to work through the issues.
But your original question is "Why can't there be cutting edge reliability?". And the answer is that there can be, but it has to be what the consumer demands. If the consumer will choose features over reliability, that's what the market will deliver. This is simple: if you spend time doing all of this testing and improvement to your product, someone else will ship the unreliable product first and everyone will use that. They may curse the developers' names while they're using it, but they'll use it. Meanwhile the competitor is improving their product AND getting revenues, and you're just improving your product. They end up having more money to improve their product, and they win.
In cases like NASA the customer demands it, and they get it. A magic tool is a long way off - our code needs a lot more metadata before a magic tool can even begin to tell what's going on.
Because software needs to be thoroughly tested before it can be called reliable.
This is not strictly true. I know that my Java program will never have a buffer overrun because it is impossible for me to produce JVM instructions that corrupt buffers or alter pointers. Therefore, I can download and run any Java program to my Java smartphone without invalidating the phone's network certification.
Throughout this discussion, I've noticed that /. contributors have consistently ignored the role played by trusted components such as VMs and safe compilers. Bottom line is that we all need to get away from the mindset engendered by years of Unix and C hacking and recognise that not all problems are going to be solved by employing programming whizzes or spending a fortune on testing.
But what's the difference between a heap and a stack, and why should I care?
Basically, you need to know the difference if you ever want to write really good, efficient code, particularly in C/C++. To do that, you really need to know what is going on "under the hood" with the compiler. You can't write good, highly optimized C++ code without at least a solid understanding of how the compiler turns your code into assembly code, and how the CPU executes that assembly code (e.g. stack, registers, cache). If you don't truly understand what is happening behind the scenes, you not only end up making design mistakes that hurt performance, but you will also never really be able to do as good a job of optimizing the C++ code. (This could be seen as a problem with C++, i.e. that you basically need to know assembly to use it properly, but that's the way it is.)
The stack is temporary storage space where function parameters and variables with function scope are created. When the function returns, the variables are popped from the stack (not physically; the stack pointer just gets moved back). Anyway, the point is, stack memory is rather limited (typically on the order of 1 to 8 MB by default, depending on the platform). So, a fairly common programming error that you see from not-so-clued-up programmers is to use the stack (variables local to a function, function parameters) for things that really belong on the heap (new/delete/malloc/free). Heap memory is effectively "unlimited", and persists across functions until manually freed by the programmer. Now, the clueless programmer might know new, delete, malloc, free etc, but doesn't understand when he is using the stack or the heap. So they make mistakes like the following:
void foo() {
    char toobig[10000000];
    ...
}
And next thing they are knocking on your door asking why their program crashes on that otherwise innocent-looking (or so they think) code. The array is allocated on the stack, and the stack basically overflows immediately. They should have used new/delete or malloc/free.
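For what it's worth, here is a rough sketch of what the heap version of that function might look like. The names (kBigSize, foo_heap, foo_vector, make_buffer) are made up for illustration and are not from the original post, and the std::vector and std::unique_ptr variants are my own substitution for raw new/delete, not something the poster proposed. Treat it as one possible way to do it under those assumptions.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical constant: same size as the toobig array in the post above.
constexpr std::size_t kBigSize = 10000000;

// Heap version using new[]/delete[], as the post suggests.
// Returns the sum of two bytes written, just to show the buffer is usable.
int foo_heap() {
    char* toobig = new char[kBigSize];   // storage comes from the heap
    toobig[0] = 1;
    toobig[kBigSize - 1] = 2;
    int sum = toobig[0] + toobig[kBigSize - 1];
    delete[] toobig;                     // must be freed manually
    return sum;
}

// Same idea with std::vector (my substitution): the buffer still lives
// on the heap, but it is released automatically when the vector goes
// out of scope.
int foo_vector() {
    std::vector<char> toobig(kBigSize);  // elements start at zero
    toobig.front() = 1;
    toobig.back() = 2;
    return toobig.front() + toobig.back();
}

// Heap storage outlives the function that created it. The unique_ptr
// hands ownership of the buffer to the caller.
std::unique_ptr<char[]> make_buffer() {
    std::unique_ptr<char[]> buf(new char[kBigSize]);
    buf[0] = 42;
    return buf;
}
```

Only the pointer or the small vector object sits on the stack in each case; the 10 MB of data does not, which is why none of these should trip the stack limit the way the local array does.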
So why is this so bad? Not because it is bad per se, but because it reveals a fairly deep lack of understanding of 'how to program'. Seeing this mistake is, in a way, a very useful benchmark of a programmer's level of knowledge. Basically, if you are seeing a graduate comp sci student on your team make such a beginner mistake, you know that they are very far behind, and will need a LOT more training and experience to become a really good programmer. But more importantly, if a comp sci graduate doesn't know the difference between a stack and a heap, it also usually means that they are lazy: they have never wanted to have a 'deep' level of knowledge of programming. And this laziness is an indication that you probably don't want that person working for you.
This isn't all necessarily that important for many types of applications. For "typical" VB or Java apps, it isn't. But for any serious C++ project, you generally do not want such a programmer on your team; you rather want the sort of person who is actually willing to learn these basics either in university/college or on their own time.
An analogy: it's like a car repair place hiring a mechanic who has a general idea of how an internal combustion engine works, but doesn't understand details (e.g. doesn't understand what distributor timing is about). It's not as if you won't be able to get useful work out of that person -- but you are always going to have to do a fair amount of "hand-holding" in order to get it ("bring that here, put that there, unscrew this" etc).
Yes, Ada was designed from the ground up for reliability, and experience has shown that it substantially reduces bugs, particularly post-deployment bugs, the most expensive kind. I'm amazed that nobody else mentioned this. Oh well, nobody will read this comment anyway.
I watch Brit Hume on Fox News