Ultra-Stable Software Design in C++?
null_functor asks: "I need to create an ultra-stable, crash-free application in C++. Sadly, the programming language cannot be changed due to reasons of efficiency and availability of core libraries. The application can be naturally divided into several modules, such as GUI, core data structures, a persistent object storage mechanism, a distributed communication module and several core algorithms. Basically, it allows users to crunch a god-awful amount of data over several computing nodes. The application is meant to primarily run on Linux, but should be portable to Windows without much difficulty." While there's more to this, what strategies should a developer take to ensure that the resulting program is as crash-free as possible?
"I'm thinking of decoupling the modules physically so that, even if one crashes/becomes unstable (say, the distributed communication module encounters a segmentation fault, has a memory leak or a deadlock), the others remain alive, detect the error, and silently re-start the offending 'module'. Sure, there is no guarantee that the bug won't resurface in the module's new incarnation, but (I'm guessing!) it at least reduces the number of absolute system failures.
How can I actually implement such a decoupling? What tools (System V IPC/custom socket-based message-queue system/DCE/CORBA? my knowledge of options is embarrassingly trivial :-( ) would you suggest should be used? Ideally, I'd want the function call abstraction to be available just like in, say, Java RMI.
And while we are at it, are there any software _design patterns_ that specifically tackle the stability issue?"
I'd hate to say it, but you might want to SERIOUSLY consider managed code. You could build some of the parts in C++ if need be, but doing it purely in C++ seems like a bad idea to me. You're asking for a silver bullet that just doesn't exist... but managed code is getting faster and can be pretty stable.
Write with them.
> Sadly, the programming language cannot be changed due to reasons of efficiency and availability of core libraries.
You can easily embed C/C++ in other languages. Take a look at Inline::CPP, for example. With code like:
use Inline CPP;
print "9 + 16 = ", add(9, 16), "\n";
print "9 - 16 = ", subtract(9, 16), "\n";
__END__
__CPP__
int add(int x, int y) {
    return x + y;
}
int subtract(int x, int y) {
    return x - y;
}
you can put the parts that need to be fast in C++, and the parts that need to be easy in Perl. (If you do the GUI in Perl, you won't have to worry about portability or memory allocation. And the app will be fast, because the computation logic is written in C++.)
> The application can be naturally divided into several modules, such as GUI, core data structures, a persistent object storage mechanism, a distributed communication module and several core algorithms.
Yup. There's no need for the GUI to know how to do computations, remember. The more separate components you have, the more reliable your application (can) be. Make sure you have good specs for communication between components. Ideally, someone will be able to write one component without having the other one to "test" with. For testing, write unit tests that emulate the specs... and make sure your tests are correct!
My other car is first.
try not to de-reference any NULL pointers and you should be ok..
If you're willing to compromise performance to the point that you can use CORBA for IPC, then you should be more than willing to write it in the language of your choice, within reason. C, C#, C++, Java, all are far faster than your CORBA transport.
If you can provide more details about the specific requirements, you might get more informed responses. As it is, though, your stated goals really don't seem to add up.
Even as stated, I would write the core in a highly tuned fashion (although C++ might not be my best choice for this), then write the GUI in the language of your choice, quite frankly. Optimise the bottlenecks (ie: your core processing) for speed, optimise everything else for maintainability and ease of development.
You're special forces then? That's great! I just love your olympics!
There is no silver bullet for what you describe other than sound development practices. The best results in this area are achieved by teams who are constantly refining their processes based on lessons learned in previous software iterations.
Bulletproof code isn't cheap, but it can be done.
1. Write the whole thing in Python.
2. Once it's bullet-proof, replace each function and object with C++ code.
3. Profit.
Follow NASA's advice... http://www.fastcompany.com/online/06/writestuff.html
You can use the Boehm garbage collector to eliminate a huge class of typical memory errors:
http://www.hpl.hp.com/personal/Hans_Boehm/gc/
This isn't necessarily something you'd have to design around, either. You can add it later.
If moderation could change anything, it would be illegal.
Good people
lots of time
lots of money
then you have a chance of pumping out the good product
First, consider how complex you want to make the system. The decoupling is a good idea, I think. However, I don't think that having modules automatically restart one another is a good idea; it introduces a whole slew of other problems. At most I'd say use a watchdog process (principle of single responsibility).
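A minimal sketch of the watchdog idea, assuming a POSIX system (the original poster targets Linux first). The function name `supervise` and the restart budget are hypothetical, chosen only for illustration. The parent does nothing except start a worker process, wait for it, and restart it if it exits abnormally, which keeps the supervisor small enough to inspect by eye.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <functional>

// Hypothetical helper: runs `worker` in a child process and restarts it when
// it crashes or exits with a non-zero status. Returns the number of restarts
// that were needed, or -1 if the budget ran out or a system call failed.
int supervise(const std::function<int()>& worker, int max_restarts) {
    assert(max_restarts >= 0);
    for (int restarts = 0; restarts <= max_restarts; ++restarts) {
        pid_t pid = fork();
        if (pid < 0) {
            return -1;  // could not create the child at all
        }
        if (pid == 0) {
            // Child: run the module and leave without returning to the caller.
            int code = 1;
            try {
                code = worker();
            } catch (...) {
                code = 1;  // treat an escaped exception as a failure
            }
            _exit(code);
        }
        // Parent: the watchdog's single responsibility is to wait and decide.
        int status = 0;
        if (waitpid(pid, &status, 0) < 0) {
            return -1;  // simplification: EINTR is not retried in this sketch
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            return restarts;  // clean exit, nothing more to do
        }
        // Killed by a signal (e.g. SIGSEGV) or non-zero exit: loop to restart.
    }
    return -1;  // restart budget exhausted
}
```

A caller might write `supervise(run_comm_module, 5)`, where `run_comm_module` is whatever function hosts the module. A real watchdog would also want back-off between restarts and logging of the exit status, both omitted here.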
Furthermore, you're crunching large amounts of data, so I'm guessing batch processing. If you can have the application not be a server, then you simplify things a lot. Make it a utility that takes data on standard input and runs whatever analysis you need, and duct tape it together with cron or a simple program that watches for new input files.
Also, I'd like to suggest that you consider whether other languages could be efficient for the task. For example, Java is pretty good numerically, and as far as your libraries go, see if you can use SWIG to generate JNI wrappers. Also, then you get Java RMI.
Next, get them down to one platform. It's *way* easier to develop software with tight constraints on a single platform (versus multiple platforms). Investigate QNX: a reliable operating system (though admittedly quirky) with a beautiful IPC API. In any case, make sure you get a well-tested library with message queues, etc. You don't want to be using raw sockets; you could but that's just another pain in the ass on top of everything else.
Last, figure out what the cost of a failure is. Getting that last few percent of reliability is very very expensive. Unless you're a pacemaker or respirator, the cost of failure is probably not as high as the cost of five nines of uptime.
When coding something that needs to be stable, you need to set your ego aside and concentrate on the task at hand. Stick with tried and true methods; don't use any algorithm that you are not 100% comfortable with, even if it would make the code less ugly. Follow good practices: write many functions/methods and make each one as simple as possible, because simple functions are easier to check for bugs. Secondly, document it like you never want to touch the code again (in code and out of code). You want to know what is going on at all times, and the bigger the code gets, the larger the chance that you could get lost in it. When working in a team, if you change someone else's code, document that you made the change.
Next, take into account what causes most crashes:
Bad/Overflow memory allocation.
Memory leaks.
Endless loops.
Bad calls to the hardware.
Bad calls to the OS.
Deadlock
If you are going to decouple modules, keep in mind that you will need to do as much processing as possible with minimum message passing, and allow for mirrors so that if one system is down another can take its place without killing the network.
For IPC I tend to like TCP/IP client-server, because it offers platform independence and allows for expansion across the network. Or try another server approach, such as a good SQL server where you can put all the shared data in one spot and get it back. But without knowing the actual requirements, this may just be a stupid idea.
I would suggest that you also ask in places other than Slashdot. While there are many experts on this topic here, there is an equal if not greater number of kids who think they know what they are talking about, or who have their ego invested in a particular technology or method.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
before you even write it. I mean get your idea clear and then write the code.
Check your input _always_ and get clear on error signaling. Any module can cause an error, but the error should be communicated efficiently to the other modules so they can handle it.
Create a universal error trap that will catch any error you don't expect, process it, and allow the program to keep running.
That should do it.
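A rough sketch of those two ideas, input checking at the boundary plus a catch-all trap around a unit of work. All names here (`Outcome`, `parse_node_count`, `run_guarded`) are hypothetical. Note that a C++ catch-all only traps exceptions; it does not catch a segmentation fault, which is why process-level isolation is discussed elsewhere in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical outcome codes a module could report to its neighbours.
enum class Outcome { Ok, InvalidInput, UnexpectedError };

// Validate input where it enters the module and signal problems explicitly.
int parse_node_count(const std::string& text) {
    std::size_t used = 0;
    int value = 0;
    try {
        value = std::stoi(text, &used);
    } catch (const std::exception&) {
        throw std::invalid_argument("node count is not a number: " + text);
    }
    if (used != text.size() || value <= 0) {
        throw std::invalid_argument("node count must be a positive integer: " + text);
    }
    return value;
}

// The "universal trap": run a task, turn any exception into an Outcome,
// and let the caller carry on.
Outcome run_guarded(const std::function<void()>& task) {
    try {
        task();
        return Outcome::Ok;
    } catch (const std::invalid_argument& e) {
        std::cerr << "invalid input: " << e.what() << '\n';
        return Outcome::InvalidInput;
    } catch (const std::exception& e) {
        std::cerr << "unexpected error: " << e.what() << '\n';
        return Outcome::UnexpectedError;
    } catch (...) {
        std::cerr << "unexpected non-standard exception\n";
        return Outcome::UnexpectedError;
    }
}
```

Whether it is safe to keep running after an unexpected error depends on what state the failed task left behind, so the trap is best placed around work that can be discarded or retried as a whole.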
Your program's only as stable as the "core libraries" your company wants you to use.
REM Old programmers don't die. They just GOSUB without RETURN.
First, there is no silver bullet design that makes a program 100% crashproof. Even if there were, there would need to be a corresponding crashproof operating system, which there really isn't. Linux and some Unixes have very high uptime (99.997%), as do mainframe OSes, but Windows certainly is not normally in that category.
To make your program as crash proof as YOU can control you should validate your requirements using Use Cases, minimize Design Complexity, use good C++ programming practices, and do extensive testing at every level using white box and black box testing techniques. Testing is key, and regression testing after changes is even more key. Don't assume fixing this didn't break that. Test with REAL data if you can. Test with invalid data so you will test your error handling, test at maximum usage levels to validate no memory leaks, resource contentions, deadlocks, etc.
However, at some point things get out of your control, as you don't write the C++ system calls, or the compiler code, or any OS features the code uses. So bugs in those can cause your program to crash. It wasn't your code that crashed, but you'll get the blame. So to be crashproof it takes a "system" that is crashproof; your program is just one part of that.
Given the advice you are giving the original poster, would the D programming language be a good alternative choice for him? Programming by contract, binary compatibility with C libraries, and actually compiled as CPU instructions.
State machines help make sure you cover (almost) all possible cases your app may encounter.
Here's a great framework to start with:
http://www.quantum-leaps.com/products/qf.htm
And the book:
http://www.quantum-leaps.com/writings/book.htm
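To illustrate how a state machine forces every case to be considered, here is a small table-driven sketch. It is not taken from the framework linked above; the states and events are invented for a hypothetical worker module. Because the table has one entry per (state, event) pair, a forgotten combination shows up as a missing cell rather than as undefined behaviour at run time.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical states and events for a worker module. `Count` is a sentinel
// used only to size the table.
enum class State { Idle, Running, Failed, Count };
enum class Event { Start, Finish, Error, Reset, Count };

const std::size_t kStates = static_cast<std::size_t>(State::Count);
const std::size_t kEvents = static_cast<std::size_t>(Event::Count);

// One row per state, one column per event, so every combination is spelled out.
const State kTransitions[kStates][kEvents] = {
    //               Start           Finish         Error          Reset
    /* Idle    */ {State::Running, State::Idle,   State::Failed, State::Idle},
    /* Running */ {State::Running, State::Idle,   State::Failed, State::Idle},
    /* Failed  */ {State::Failed,  State::Failed, State::Failed, State::Idle},
};

// Look up the next state; the asserts reject the sentinel values.
State next_state(State current, Event event) {
    assert(current != State::Count);
    assert(event != Event::Count);
    return kTransitions[static_cast<std::size_t>(current)]
                       [static_cast<std::size_t>(event)];
}
```

In this sketch a failed module ignores everything except `Reset`, which is one possible policy; the point is that the policy is visible in a single table.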
Extreme programming is your friend on this one. Doesn't matter what language you use, test and retest at every change. Testing is the only, only, only way to get extremely stable software outside of formal verification methods.
I think the people you have working on this particular project will have the most influence on whether you have a stable design in the end; especially when working with C++. Put together a team of top-notch engineers, read the Mythical Man Month, then start to think about the design. If you gave three different teams the same task, most likely in the end these three teams would produce three different, yet functional designs; with one of these designs being the most stable. The success of many large projects hinge on the skill set of the engineers, communication, project management, and process (CMM, etc.).
You talk about making an "ultra stable, crash free" system and then talk about crash recovery. I'm guessing from this you don't want to protect the application from harm (e.g. full of exception catching and internal recovery from those evil buggy 3rd party libraries or whatnot), but how important is your data? Is that what you mean by ultra stable, that you end up getting the right results at the end? Maybe you should think about redundancy, traceability of results, etc.
I think perhaps what you REALLY mean here by stability is Fault Tolerance. It's impossible to write code that has zero defects, outside of any trivial examples. Real Code Has Real Defects. Now, as you talk about modular design and being able to restart modules, you're talking about, not stability, but fault tolerance; the ability of the application to recognize and recover from faults. For instance, you can't necessarily guarantee that the module on machine A running task B won't die, hell the computer could accidentally fry, but if your application was Fault Tolerant then the application would kick off another process somewhere else on computer C to rerun job B. Stable systems aren't built necessarily by trying to write defect-free code, but by recognizing that defects will occur and architecting the system in such a way that it can recover from them. Here you need to be concerned about things like transactions, data roll-back, consistency, techniques (active vs. passive, warm vs. cold). The key thing is before you even write a LINE of this C++ code, make sure that you have a complete, comprehensive ARCHITECTURE for your application that will gracefully handle faults.
be assertive
...that you are about to board.
I've spent over a decade refining how best to create stable, great software. And guess what? I still learn things every day. If you are really new to enterprise-grade software, the best thing you can do is search amazon and choose 3 to 5 great books about writing stable, bug-free enterprise code and just start reading and scheming. Give yourself lots of time. Be neurotic, type-A, attention to every detail, stay up at night wondering how your system could fail and what you can do to prevent it. Some immediate thoughts, however:
1. Good hardware. Obviously. Redundant everything, self-diagnosing, etc. How can things go wrong? What will go wrong? How can I know when something is going wrong? How can I fix it quickly without impacting the system? Etc.
2. Enterprise grade (n-tier) architecture: You'll definitely want to do something where you have a database running on one or two (or more) machines, at least two business servers and at least two web servers. Redundancy is good. As you suggested, a setup like this lets you isolate problems (and provides for better security in general).
3. Test, test, test. From the very start, every day to the very end. Start coding by writing test suites for your code. Learn about unit testing, black box testing, user testing, regression testing, etc. And hire developers and QA whose sole job is test, test, test using great automated testing software.
4. Profile. Stress-load-test. Know how your system responds to all scenarios. Feel comfortable knowing the limits of your system. There should be no surprises.
5. Assert. Learn the magic of assert(). If your code isn't at least 25% asserts, you are not trying hard enough.
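As a small illustration of point 5, here is a sketch of a bounded queue whose preconditions and invariants are stated with `assert()`. The class `BoundedQueue` is hypothetical. Keep in mind that the standard `assert` macro compiles to nothing when `NDEBUG` is defined, so asserts document and check assumptions during development and testing; they are not a substitute for validating external input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity FIFO queue of ints with asserted contracts.
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity)
        : buffer_(capacity), head_(0), size_(0) {
        assert(capacity > 0 && "capacity must be positive");
    }

    bool empty() const { return size_ == 0; }
    bool full() const { return size_ == buffer_.size(); }
    std::size_t size() const { return size_; }

    void push(int value) {
        assert(!full() && "precondition: caller checks full() first");
        buffer_[(head_ + size_) % buffer_.size()] = value;
        ++size_;
        check_invariant();
    }

    int pop() {
        assert(!empty() && "precondition: caller checks empty() first");
        int value = buffer_[head_];
        head_ = (head_ + 1) % buffer_.size();
        --size_;
        check_invariant();
        return value;
    }

private:
    // Invariant: the read position and element count stay within the buffer.
    void check_invariant() const {
        assert(head_ < buffer_.size());
        assert(size_ <= buffer_.size());
    }

    std::vector<int> buffer_;
    std::size_t head_;
    std::size_t size_;
};
```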
I told myself I was only going to write the first five thoughts that came to my mind, otherwise I could spend weeks trying to answer your question!
If you consider CORBA, check out TAO (http://www.cs.wustl.edu/~schmidt/TAO.html); it is a reliable open-source implementation that is widely used. You can get commercial support for it from OCI (http://www.theaceorb.com/).
If CORBA is too heavyweight for you take a look at ACE (http://www.cs.wustl.edu/~schmidt/ACE.html). ACE is an open-source portable framework that is used within the TAO real-time CORBA. It allows you to write portable networked applications in C++ but is a lot smaller than CORBA as it does not implement the ORB etc. Several companies use ACE as a lightweight middleware as it has a very permissive license. You can get commercial support for it from Riverace (http://www.riverace.com/).
If you're aiming for performance you might check out ICE (http://www.zeroc.com/). ICE is available under the GPL and commercially, and is a really fast middleware that is not CORBA but is portable between several languages.
All options provide you with a good framework to develop reliable and maintainable code.
Use TPS reports. You'll thank me later.
Take the cheese to sickbay, the doctor should see it as soon as possible - B'Elanna Torres, "Learning Curve"
I don't know how complex your system has to be, but I'd strip out anything that isn't 100% necessary. No convenience code. No pretty, easy-to-use, fully featured GUI. Just the basics required to get the job done. At that short a length you should be able to reliably verify a VERY low coding error rate per KLOC. Also, I would recommend 2-person coding if you have the say in it. Have 2 people working at the same time: 1 codes, the other checks. It will save you a lot when you go to testing and check out.
I do security
It doesn't have anything to do with using C++, because you are ultimately at the mercy of how the libraries you're accessing are going to interact with whatever systems you're doing. Because you have that dependency you can make nothing rock solid without putting a strong layer of security between those libraries and you.
If I were to make this as secure and stable as possible that's where I'd start - by wrapping those libraries in some strong error handling systems. Probably even go one step further and use some managed code wrappers (JNI, COM, CORBA, whatever) so that you can interact with the libraries using managed code. That will save you any number of headaches in the long run and will be immediately more testable, as you can separate easily the libraries you're using from the code you're writing and use stubs to test both fairly trivially.
In short, have each component loosely coupled with the whole system, and make each component crash and restart (to a recent good state) on failure. When shit happens the whole system can go on working, and the component that crashed resumes work quickly.
c.f. http://crash.stanford.edu/
Try Corewar @ www.koth.org - rec.games.corewar
Use formal methods. If you really need all of the things you mentioned, then I'm sure your client will be happy to pay the premium. Unless, like every other frakking client I've ever seen, they want it yesterday and for peanuts :).
I wish I could mod the article +5 Funny.
At some level, the definition of stability is that it does not change often. Thorough testing is of absolute importance. Make sure every block of code is tested with a large enough variety of data. Once you have it in production, make sure you go after any problems that you have - never let any bug bite you twice. Don't add unnecessary features; keep it as simple as possible.
I've been out of C++ programming for several years, but I do remember a couple of basic rules I followed that saved me from a lot of memory problems and invalid state problems. This may not be the kind of thing you're looking for, but here it goes...
1) Never allocate memory to a raw pointer. Never. That is, if you allocate memory to something, it had better be allocated to a smart pointer like auto_ptr or a reference counting pointer (boost.org at the time had a family of these). The only exception to this rule is in the implementation of the smart pointers themselves. You should be able to find a good number of articles on this.
2) Always follow the "strong exception safety guarantee". Classes that provide this guarantee promise that they will not change their state if they throw an exception. Again, there are many articles. Here's an example of an assignment operator providing the guarantee (please forgive me if my C++ is not quite right - I'm rusty):
Whole& Whole::operator=(const Whole& that) {
    // Clone both parts first; if either "new" throws, *this is untouched.
    auto_ptr<Part> tempPart1(new Part(*that.part1));
    auto_ptr<Part> tempPart2(new Part(*that.part2));
    // auto_ptr assignment transfers ownership and does not throw.
    this->part1 = tempPart1;
    this->part2 = tempPart2;
    return *this;
}
The example is a class Whole with two dynamically allocated Parts. The assignment operator, instead of having two lines of code (i.e. cloning the two parts and assigning them directly to the member variables), has four. It first clones the parts to temporary variables and then assigns them to the member variables. Why? Without the temporary variables, if the second "new" operation throws an exception (such as bad_alloc), the state of the object would be different from, and inconsistent with, its state before the call. It would have one original part, and one part from the cloned object.
There are lots of other simple rules like this that can make code more solid, easier to read, and easier to maintain. If I remember right, the C++ FAQ from the C++ newsgroup contains a lot of them.
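For readers on a newer compiler, the same guarantee can be sketched with `std::unique_ptr` and the copy-and-swap idiom (`auto_ptr` was deprecated in C++11 and removed in C++17). This is an illustrative variant, not the poster's code: the `Part::fail_after` counter is a hypothetical test hook that makes a copy throw on demand so the guarantee can be observed.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

struct Part {
    int value;
    // Hypothetical test hook: number of copies that succeed before one
    // throws; a negative value disables the failure.
    static int fail_after;

    explicit Part(int v) : value(v) {}
    Part(const Part& other) : value(other.value) {
        if (fail_after == 0) {
            fail_after = -1;
            throw std::runtime_error("simulated copy failure");
        }
        if (fail_after > 0) {
            --fail_after;
        }
    }
};
int Part::fail_after = -1;

class Whole {
public:
    Whole(int a, int b) : part1_(new Part(a)), part2_(new Part(b)) {}

    // If the second clone throws, the already-built part1_ is released
    // automatically, so nothing leaks.
    Whole(const Whole& that)
        : part1_(new Part(*that.part1_)), part2_(new Part(*that.part2_)) {}

    // Copy-and-swap: all throwing work happens on a temporary, then the
    // non-throwing swap commits the change.
    Whole& operator=(const Whole& that) {
        Whole temp(that);
        swap(temp);
        return *this;
    }

    void swap(Whole& other) noexcept {
        part1_.swap(other.part1_);
        part2_.swap(other.part2_);
    }

    int first() const { assert(part1_); return part1_->value; }
    int second() const { assert(part2_); return part2_->value; }

private:
    std::unique_ptr<Part> part1_;
    std::unique_ptr<Part> part2_;
};
```

If cloning fails part-way through `a = b`, `a` still holds both of its original parts, which is exactly the strong guarantee described above.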
If you develop safety-critical code, or anything that requires hi-rel, you need to break down the application into functional, testable units, with test fixtures to test each module, and then an integration test framework. You can't create a "verified" correct system with ad-hoc testing. Unless you're very good and you own the whole thing, and then it's just you that knows it's right. Yeah, right.
JimD.
> what strategies should a developer take to insure that the resulting program is as crash-free as possible? First, avoid using C++.
*Get a coverage testing tool
*avoid pointer arithmetic
*declare your copy constructors private (with no body) if you don't plan to use them. With this you'll catch unintentional use of the copy constructor through parameter passing.
*Use unit testing and make sure you can regression test your system
*Get a tool such as purify to find memory leaks and use of uninitialized memory
*turn on compiler warnings to its most anal setting
*Create a system to give you a call stack in case of errors (to quickly squash bugs because you will have bugs).
*Only write multi-threaded code if you have to. If you must, have a well-thought-out strategy to avoid race conditions.
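A short sketch of the copy-constructor tip from the list above. Since C++11 the same effect as "private with no body" can be had with `= delete`, which reports an accidental copy at compile time instead of at link time. The class `NodeConnection` and the function `describe` are hypothetical. On the warnings tip, GCC and Clang both accept `-Wall -Wextra`, and `-Werror` turns warnings into errors.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class that owns a resource and must never be copied.
class NodeConnection {
public:
    explicit NodeConnection(int socket_fd) : socket_fd_(socket_fd) {
        assert(socket_fd >= 0);
    }

    // Copying is forbidden outright.
    NodeConnection(const NodeConnection&) = delete;
    NodeConnection& operator=(const NodeConnection&) = delete;

    int fd() const { return socket_fd_; }

private:
    int socket_fd_;
};

// Taking the parameter by reference compiles; taking it by value would not,
// which is how unintentional copies through parameter passing are caught.
int describe(const NodeConnection& connection) {
    return connection.fd();
}

static_assert(!std::is_copy_constructible<NodeConnection>::value,
              "NodeConnection must not be copy-constructible");
static_assert(!std::is_copy_assignable<NodeConnection>::value,
              "NodeConnection must not be copy-assignable");
```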
The Internet is full. Go Away!!!
You sound like you are new at this, probably just graduated from college a few months ago. But I would suggest that you know your clients and what their expectations are. Every client will say that they need the application bullet proof and fast, and it is the most important program on earth.
But what they really need is a simple solution that is better than what they currently have. This is not an excuse to write sloppy code, but a reminder to keep in mind what is needed. If you can get the job done with something simple, do it. If you make it more complex than you need, trouble will just occur in the future. If the program takes 100 minutes to run vs. 90 minutes, they will learn to deal with it. Remember, it is often cheaper to buy a computer that is twice as fast than it is for the programmer to rewrite their code to run 10% faster.
This sounds like an issue my friend from college was talking to me about when he first started working. The client wanted a high-performance method of sending messages to the company and managing the data. So he spent months working with low-level programming calls to produce an almost working solution that did what the management said they wanted. Then it turned out that what they really wanted was a mailto: link on the page, with Outlook filtering the data.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
You say "for reasons of efficiency". How do you know that some other safer language (like Java) wouldn't be efficient enough? Have you done some tests? Have you analyzed the business case? The business case would look like this: "We could write it in C++, which would be efficient enough to run on Hardware X, or we could write it in Java, which would require more expensive Hardware Y. The cost difference between the two is Z, and the programming time to get the same level of safety between C++ vs Java is Q, so clearly it's a lot cheaper to do it in C++ and save money on the hardware."
Most of the projects that worry about "efficiency" haven't done an analysis like this, and 99% of the time, if they did such an analysis, they would find out that they are blowing $50k in programmer-hours to save $5k in hardware.
Then if you do another step of the analysis and put in a term like this: "A one-day outage or security breach could cost us $500k in lost business. Java has no buffer overflows, direct memory access or other common causes of security problems. Which is the cheaper option?"
Optimize last, usually.
I know, I hate it when I ask a question, "how do I do this with this certain tool" and someone says, "you shouldn't be using that tool", but, unless what you are doing is a ray-tracing cluster or similar, it sounds like you are on the wrong track.
If you want an ultra-stable crash-free system, you will need to avoid both Linux and Windows. The choice of programming language and methodology is way down in the noise compared to that.
Use FreeBSD or stay home.
Honestly, everyone seem to believe that all C/C++ code is unstable (probably because of all those people working for companies like Sun/Microsoft, who are promoting The Next Big Thing in Software Development), but it's far more to do with how you go about using the language and its features.
However, it is honestly an improvement on C. I think Bjarne Stroustrup (I'm almost certain I spelled that wrong) said it best: "C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off." (http://public.research.att.com/~bs/bs_faq.html; apart from the quote, it has a whole lot of useful tips).
So, here's my advice, from my experience working in all sorts of languages (professionally I've used everything from TCL through PHP to Java/C#, but at home I use C++ exclusively). This is just my experience, though; take this with the requisite grain of salt:
STL is your friend. I cannot stress this enough. STL allows you to leverage complex containers (automatically resizing lists, hash tables, ropes - which are mega-long strings, etc) with complete type safety: you can create a container for a particular type, and the compiler will balk if you ever try to put something incompatible in. Also, the generated code will be optimized for your particular storage type. In this respect, C++ is actually better than most other languages (only with Java 1.5 and .NET 2.0 do Java and C# implement Generics, and in the case of Java, it's only implemented in the compiler).
Pointers are not always your friend. When you allocate data structures on the stack (e.g. "string blah;"), they will automatically be taken care of by the language. Even if an exception is thrown, these objects will still have their destructor called (which in turn will call all other necessary destructors) and the memory will be deallocated freely.
Of course, this will not work everywhere (large numbers of polymorphic, dynamically allocated objects). But in these cases, you can use helper classes (such as auto_ptr in the STL, or something like shared_ptr from Boost or nsCOMPtr from Mozilla). Look around; lots of other people have already solved this problem. In fact, there are even Garbage-Collection libraries available for C++!
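A minimal sketch of the behaviour described above, using a hypothetical `Resource` type that counts live instances. Both the stack object and the heap object held by a smart pointer are destroyed when the exception propagates. It uses `std::unique_ptr` from C++11 in place of the `auto_ptr` mentioned in the comment.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical resource that counts how many instances are alive.
struct Resource {
    static int live;
    Resource() { ++live; }
    ~Resource() { --live; }
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
};
int Resource::live = 0;

// Creates one object on the stack and one on the heap, then optionally fails.
void crunch(bool fail) {
    Resource on_stack;                                // destroyed automatically
    std::unique_ptr<Resource> on_heap(new Resource);  // owner deletes it
    std::string label = "batch";
    assert(Resource::live >= 2);
    if (fail) {
        throw std::runtime_error("crunch failed for " + label);
    }
    // No explicit cleanup on either path: the destructors do it.
}
```

After `crunch(true)` throws and the exception is caught, `Resource::live` is back to zero with no cleanup code in the caller.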
Use exceptions instead of value checking, dammit! Every time you call a function that "returns 0/false/-1 on error" you are exposing yourself to possible bugs. Try to avoid this wherever possible, and try to keep all your OS-specific calls in one spot.
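One way to follow this advice is to wrap each "returns NULL/-1 on error" call once, in a single place, and throw from the wrapper. The sketch below does this for `std::fopen`; the names `open_file`, `FileCloser` and `FileHandle` are hypothetical. It assumes `fopen` sets `errno` on failure, which POSIX specifies but the C standard does not guarantee.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <memory>
#include <string>
#include <system_error>

// Deleter so the smart pointer closes the file when it goes out of scope.
struct FileCloser {
    void operator()(std::FILE* file) const {
        if (file != nullptr) {
            std::fclose(file);
        }
    }
};
typedef std::unique_ptr<std::FILE, FileCloser> FileHandle;

// Hypothetical wrapper: callers never see a null FILE*, so there is no
// return value they can forget to check.
FileHandle open_file(const std::string& path, const char* mode) {
    assert(mode != nullptr);
    std::FILE* raw = std::fopen(path.c_str(), mode);
    if (raw == nullptr) {
        // Assumption: errno was set by fopen (true on POSIX systems).
        throw std::system_error(errno, std::generic_category(),
                                "cannot open " + path);
    }
    return FileHandle(raw);
}
```

Keeping wrappers like this in one translation unit also serves the second half of the advice, since the OS-specific calls end up in one spot.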
Check out Boost: (www.boost.org) There is a LOT of useful stuff in here, and it will certainly speed up the development process.
Finally, make sure you design it properly! A couple of well-defined interfaces to separate things out will go a LONG way towards simplifying code, as it will ensure there is less coupling between different code modules (i.e. they don't depend on each other as much, so you can rewrite one without affecting others). This goes for ANY language. C++ doesn't directly support interfaces, but abstract classes will do the same thing.
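A sketch of an abstract class used as an interface, with hypothetical names throughout (`ObjectStore`, `InMemoryStore`, `sum_and_store`). The algorithm depends only on the interface, so the persistent storage module can be replaced or stubbed out in tests without touching the computation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// The interface: pure virtual functions and a virtual destructor only.
class ObjectStore {
public:
    virtual ~ObjectStore() {}
    virtual void save(const std::string& key, double value) = 0;
    virtual bool load(const std::string& key, double& value) const = 0;
};

// A trivial in-memory implementation, useful as a stand-in during tests.
class InMemoryStore : public ObjectStore {
public:
    void save(const std::string& key, double value) override {
        values_[key] = value;
    }
    bool load(const std::string& key, double& value) const override {
        std::map<std::string, double>::const_iterator it = values_.find(key);
        if (it == values_.end()) {
            return false;
        }
        value = it->second;
        return true;
    }

private:
    std::map<std::string, double> values_;
};

// A core algorithm that knows nothing about how results are stored.
double sum_and_store(ObjectStore& store, const std::vector<double>& samples) {
    assert(!samples.empty());
    double total = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        total += samples[i];
    }
    store.save("total", total);
    return total;
}
```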
As for separating modules out, you could try CORBA (as you mentioned): ORBit (orbitcpp.sourceforge.net) is a free implementation (that happens to be used by GNOME). But to be honest you probably won't need to go to this effort. If written properly, your code should be stable enough that you don't even need to separate code out into separate processes. The furthest I'd go in your situation would be to write a command-line application that does all the work, then have a graphical client (although you could write this in a different language; don't get me started on graphics library support under C++) that uses a socket or TCP/IP to communicate with the worker thread. In this case, you could just use a very simple protocol to
-- Dramatisation - May Not Have Happened
If you really want stable programs using C++, be sure of the following basics -
1. Hire good programmers.
2. Make sure that EVERY function is defined with a specification, describing everything within the function. This makes debugging much easier.
3. Make sure that you've got all requirements written.
4. Try not to use fancy stuff such as function pointers, de-referencing pointers, etc. Not all programmers are geniuses.
5. One good practice: if you allocate memory in an object, make sure that the same object is responsible for de-allocating it. This is usually practical.
6. For IPC, try not to use shared memory. Using a message queue makes your work easier because of its guaranteed delivery. Try to use MQ Series or something similar. They provide a robust mechanism for transferring and retrying data. It is money worth spending. It is also compatible with both Windows and Linux.
7. Stick to ANSI C++ functions to ensure compatibility.
8. Use a portable UI toolkit such as Qt.
9. Test, test and test. Peer review code.
10. Establish a naming convention for variables and classes.
should be portable to Windows without much difficulty."
...
insure that the resulting program is as crash-free as possible?
errrror.... eeeeeeeeeror... (computer explodes)
I work for the Department of Redundancy Department.
But there are ways to approach this problem. Here are a couple:
....
1. Read absolutely everything you can find on Software Reliability.
2. Experienced Software Engineers opinions should count for more than Joe Q. Random here on Slashdot.
3. The fact that you are doing this in C++ is not the problem. More to the point, the quality of your development staff is paramount. One bad programmer set loose on a good project can wreck it.
4. Learn and use the principles of Extreme Programming. This should take care of many of your reliability problems.
5. Define *exactly* what you mean by "never crash". What are the risks involved? If it crashes, do people die or go to jail? If that's the case then you need to seriously consider correspondingly large funding levels, and
6. Testing. Addressed somewhat by Extreme Programming. You absolutely positively need to define exhaustive tests for each module, before you even bother coding it. This means you need to encode exactly what each module should be producing for a given set of inputs and use good sense in your OO programming to isolate each object/class from side effects and cascade failure. Furthermore, test not just for inputs and outputs, but for timing, and also consider security implications. What if the system doesn't normally fail unless someone is trying to push it into a failure mode (known as a morbidity state)? Make sure you test for input validity. Buffer overflows in your code can open the entire system up to hacking. If this is a web app, scan it with WebInspect. Hire a penetration team to do application vulnerability testing on it. Have the platform built from a known good system image. Lock the machine down. Test the platform for vulnerabilities.
7. Test, Test, Test. Run the test suite with each and every build. Make each programmer responsible for writing the test cases against his partner's code. If the test cases fail, make the responsible programmer fix the code (or the tests if they are broken) before the new module/class can be checked in.
8. You are using a decent version control system right? Anything is better than nothing. RCS, CVS, SVN whatever.
9. Use a proper software engineering life cycle. Make sure you never push code directly into production before Q/A testing. This can suicide a project (and a company) faster than you might imagine. If the tests are properly written, and positive test results are required before code check-in, then Q/A should be a very fast process and they will thank you.
go!
Executive summary of this post: Keep it simple. As simple as it can be while getting the job done. The more buzzwords you think about implementing, the more you need to reconsider whether you really need that whiz-bang feature.
You need to abstract your design into really independent layers, such that the backend processing can be done across linux, windows and even beos slaves simultaneously, and the frontend is viewable via a web interface, fed into excel or whatever. You can't look at this as one big project, but many independent (and more easily verifiable!!) applications cooperating with each other.
My impression from the description is that you want a system like folding@home for corporate customers - they have a whole heap of data they want analyzed (parallel workload across many clients) and a small subset of results they're interested in. Don't make things any more complicated than they have to be - the data sets could simply be files that are partitioned by a master, sent out when requested to client workhorse computers, getting there by http, nfs or whatever, processed, and the results returned into an incoming directory for a simple frontend to tabulate.
The biggest mistake you could make is having one gargantuan application in charge of everything. The race conditions will drive you mad, be they in data access, allocation, retrieval, dispatch or anything else you're trying to manage that the OS could do for you.
Just look at Froogle. Their millions upon millions of store/price listings are fed by people ftp'ing a feed of tab-separated text values.
Don't Hate, Gestate
How well does a Boehm garbage collector work with the Resource Acquisition Is Initialization pattern? In Boehm's library, do destructors or finalizers get called in any sort of predictable order?
I suggest you plan on knowing what is happening. Choose a logging system to use for reporting informational, debugging, and error conditions. Then use it generously. As reliability is important, you'll probably be testing input for sanity, and you should have messages available so people can figure out why data is rejected. Also have available informational messages about decisions being made, so it can be found that, umm... no widgets are being emitted because a gadget needs to be supplied.
valgrind -v ./myapp [args]
It gives you massive amounts of great information about the memory usage of your program.
The other day I spent nearly 3 hours trying to decode what was happening from walking the backtrace in gdb. Couldn't for the life of me figure out what was happening. Valgrind figured out the problem on the first run and after that, I had a solution in a few minutes.
Highly recommended software, and installed by default on several distributions, AFAIK.
Enjoy!
std::disclaimer<std::legalese> sig=new std::disclaimer; sig->dump(); delete sig;
I know you asked about C++, but for fault-tolerant network applications, nothing beats Erlang.
I know: -1 Flamebait. But really, this is Slashdot. A story with such a minor reference to Windows going without a Windows-bashing comment for this long is just inexcusable.
-William Brendel
A variable in Python is a variable as in anything else, but a variable is a reference to an instance of a type that could be anything -- the referenced instance has a type as opposed to being some universal type like a string, but it can be assigned on the fly, and it can be a number, a string, an object instance, a class, or a function. I suppose assignment is just copying the reference, but as soon as you do anything with it, you have to somehow look up the dynamic type and decide to do something legal. And fail gracefully if called to do something illegal.
And as to memory allocation, every time you touch a reference you are doing something with a reference count, and I believe there is some kind of primitive mark-scan garbage collection layered on top of the reference count to break circular chains. Are you going to hand translate that into C++ as well?
Greenspun's 10th law is this inside joke among Smug Lisp Weenies (TM) that any sufficiently complicated Fortran program (back in the day, today substitute C/C++ program) implements a good chunk of Common Lisp, only slower and with a lot of bugs. I may offend people by comparing meek Python to mighty Common Lisp, but Python has sufficiently dynamic behavior to make the connection.
My advice on the reliable C++ program is 1) design to the level of having a clear idea of the architecture of your app before coding -- the classes, their purpose, their containing other objects, 2) for each object where a reference is contained in another object, do some kind of code reading/verification/check of conformance to a standard you have established as to how that object gets deleted in a safe way. It could be reference counts, auto-objects, caller deletes/callee deletes -- just decide on what you are going to do and read code to see that you are consistent about it.
Sounds like the poor soul is in over his/her eyeballs.
The simple truth is that interstellar distances will not fit into the human imagination
- Douglas Adams
However, the answer to your question is: DESIGN (Requirements Doc), DESIGN (High Level Design doc and Test Plan), DESIGN (Detail Design doc). CODE. Then TEST, TEST, TEST, TEST and RETEST. After that's done, TEST some more.
Get customer signoff at every stage of design so as to have a stable target. Nothing screws with stability more than a customer/client who is allowed to change the requirements on the fly.
Following this pattern I've designed and built communications servers for credit card authorizations and N-tiered communication servers for claims submissions that ran error free for five or more years. But in C and UNIX or DOS - never Winblows. Or C++. But the design, code, test 'til you puke paradigm will work all the same.
Good Luck.
Too lazy to create a sig...
...although not strictly true, it's just too good of an Ask Slashdot motto to pass up :)
Ask Slashdot: impossible questions with impossible answers!
Ref:
"I need to create an ultra-stable, crash-free application in C++"
"...due to reasons of efficiency and availability of core libraries."
"...but should be portable to Windows without much difficulty."
Lots of posts with interesting advice though so best of luck! Would make for an interesting Slashback entry when/if you make it succeed (and possibly even if you don't).
--
this additional sig includes a portrait of Mohammed in support of freedom of expression, feel free to reproduce it
this comment is provided "as is" and without any express or implied legibility or congruity [...]
First off, not sure why a big deal is made that it's in c++. c++ was developed to make more stable code. Sloppy programming in any language will cause a crash. I can write python code that will come to a halt, it's not that hard.
It sounds like part of the problem is you don't *know* the c++ language. I suggest your first move is to get a book. Bjarne Stroustrup's book is pretty decent, and he goes into design issues.
Here are a couple features that can help your code be stable:
* Object-oriented design allows you to protect your variables by providing a protective layer. Provide access functions to change these variables. This makes debugging a cinch too, because there will be very few places you need to look that directly change variable 'foo'. Also, the constructors allow making sure that your variables are properly initialized. Humans err, so it's not perfect, but what is? You can easily come up with a system to make sure every new variable introduced will be initialized properly.
* Templates allow you to write code once, and use it many times. The less code you have, in theory the fewer errors you will introduce.
* Smart pointers are your friend.
* STL - standard template library, provides almost all the standard container classes one would ever wish for. Less coding on your part, fewer errors.
* You can actually find garbage collectors for c++ (I am assuming this is why you might think other languages might be better than c++). The advantage, you don't have to worry about memory allocation. The disadvantage? You will lose some of your precious speed.
* Exceptions. The biggest reason for a program crashing, besides just plainly bad code (i.e. overwriting memory locations etc.) is not handling error conditions correctly. Exceptions are a huge leap from standard C in that they allow you to manage errors in a much more sane way. It will make your code a little more ugly, but if crashing is a major concern, use them.
* RTTI - run time type identification, yes that's right, c++ can do run time inspection if you want. You can use this to make sure that functions are receiving the correct types.
* '#ifdef __DEBUG__ ... #endif' blocks make code a little ugly, but they are a great way to keep production and testing code side by side. Put code in to check the sanity of things.
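To make a few of those bullets concrete, here is a minimal sketch (not a definitive recipe) that combines an access function, a constructor that guarantees initialization, an exception for bad input, and a debug-only sanity check. The class name Thermostat and its limits are made up for illustration, and the sanity check uses the standard assert/NDEBUG mechanism rather than a hand-rolled __DEBUG__ macro.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical example class: the name and the limits are illustrative
// assumptions, not something taken from a real project.
class Thermostat {
public:
    // The constructor guarantees the member always starts in a valid state.
    explicit Thermostat(double target_celsius) : target_(kMin) {
        set_target(target_celsius);
    }

    // The only function that ever modifies target_, so a bad value can be
    // traced to one place. Written so that NaN is rejected as well.
    void set_target(double celsius) {
        if (!(celsius >= kMin && celsius <= kMax)) {
            throw std::out_of_range("target temperature out of range");
        }
        target_ = celsius;
        check_invariant();
    }

    double target() const { return target_; }

private:
    // Debug-only sanity check: compiled out when NDEBUG is defined.
    void check_invariant() const {
        assert(target_ >= kMin && target_ <= kMax);
    }

    static constexpr double kMin = 5.0;
    static constexpr double kMax = 35.0;
    double target_;
};
```

A caller that passes 100.0 gets a std::out_of_range exception it can handle, and the object keeps its previous valid value.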
Other posters have suggested that you avoid pointer arithmetic. Generally, it's not a bad practice, but just like fire, pointers aren't bad. Sometimes you *need* fire to do certain things. No getting around it. Just remember, you are playing with fire.
As for design, you should read books. There's a couple good ones out there. You should also read the Linux guidelines. One of the best ones:
* Keep all your functions below 20 lines of code. I adhere strongly to this, and it has kept my code relatively bug free (although, code practices cannot save you from your own stupidity).
In general, keep your code small, and modular. This allows you to test out portions and check the sanity of things.
It is possible to use c++ with other languages. For example, there is a library called boost::python, which allows you to very easily create python modules.
On a final note, I've written tons of simulation code in c++. The only time I really encountered crashing code was when I had code that sent out data to be crunched over TCP/IP and then received back the results. The crashing was simply because I didn't have enough time to write all the error checking code. The more distributed or complex the design, the more errors that can arise, and you have to think of what they are and be able to catch them all.
Heavy use of code generators is always a good place to start--the less code you write, the fewer bugs you will create.
:-)
Distributed applications are very, very hard. They have all the joy of multi-threaded code with latency and communications issues added in. Stability of the overall system can only be achieved by a layered design: I've never seen the design pattern described, but there is a "Manager Pattern" in which one process takes responsibility for controlling another process or set of processes. Autonomous restart is not a good idea because the single node that has experienced a crash does not have all the information required to make a good judgement about what to do. An external manager process that has an overview of the whole system status will do better.
Also, restarting a process and hoping the crash does not happen again is not in general the right thing to do, as students of the Ariane V disaster will realize. In that case there were multiple redundant processors that all had the same bug (relative to the inputs they were getting from the new vehicle). In most cases restarting after a crash will just result in another crash. Realistically, you need to be able to inform the user that something bad has happened and ideally give the user the opportunity to intervene (change parameters, for example) before restarting the process. This may require that the whole data analysis run be restarted, again indicating the need for an external manager process to co-ordinate everything.
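As a rough sketch of the manager idea above, assuming the manager can start a worker and learn whether it exited cleanly: the callback run_worker stands in for "launch the process and wait for it", and max_restarts is a made-up policy knob. The point is only that the restart decision lives outside the worker and that the manager stops and escalates instead of restarting forever.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Outcome of one run of a worker, as reported to the manager.
enum class Exit { Ok, Crashed };

struct Decision {
    bool completed;                // true if some run finished cleanly
    int attempts;                  // how many runs were started
    std::vector<std::string> log;  // what the manager would report to the user
};

// Hypothetical manager loop. run_worker is an assumed stand-in for starting
// the real process and waiting for its exit status.
Decision supervise(const std::function<Exit()>& run_worker, int max_restarts) {
    assert(max_restarts >= 0);
    Decision d{false, 0, {}};
    for (int attempt = 0; attempt <= max_restarts; ++attempt) {
        ++d.attempts;
        if (run_worker() == Exit::Ok) {
            d.completed = true;
            return d;
        }
        d.log.push_back("worker crashed on attempt " + std::to_string(d.attempts));
    }
    // Same crash every time: stop and let a human change parameters.
    d.log.push_back("restart budget exhausted; asking the operator to intervene");
    return d;
}
```

In a real system run_worker would wrap something like process creation plus a wait on the exit code, and the log would go to the user interface rather than a vector.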
For IPC, if you are using a common language on all platforms I strongly favour XML serialization and sockets. Any good code generator will generate serialization code to dump your classes to an XML string, and you can then send the string through a socket. It is relatively easy to do this, and avoids the huge overheads that CORBA involves (the only large project I've used CORBA on has since stripped it all out as being too heavy-weight, a decision I think is quite reasonable.)
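For illustration, this is roughly what generated (or hand-written) XML serialization code for one message could look like. The JobResult struct and its element names are invented for the example, and the socket part is left out: the resulting string would simply be written to whatever connection you already have.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical message type; the field names are illustrative only.
struct JobResult {
    int job_id;
    std::string node;
    double value;
};

// Escape the five characters that are special in XML text and attributes.
std::string xml_escape(const std::string& in) {
    std::string out;
    for (char c : in) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;
        }
    }
    return out;
}

// Dump one message to an XML string ready to be sent over a socket.
std::string to_xml(const JobResult& r) {
    assert(r.job_id >= 0);  // assumed convention: job ids are non-negative
    std::ostringstream os;
    os << "<job_result id=\"" << r.job_id << "\">"
       << "<node>" << xml_escape(r.node) << "</node>"
       << "<value>" << r.value << "</value>"
       << "</job_result>";
    return os.str();
}
```

The receiving side needs a matching parser, which is where a code generator or an existing XML library earns its keep.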
Using a solid framework like Qt or wxWidgets (which I've honestly found to be superior to Qt in many respects) will help reduce the amount of code you write. For crash-free code you must use open-source frameworks as much as possible, because every set of libs has bugs, and the only way you can track them down and fix them is if you have the source.
Finally, you should think about hiring someone who's done it before.
Blasphemy is a human right. Blasphemophobia kills.
Here's a better request: "I want an ultra-stable, crash-free application in C++ and a pony."
Anyone that would think it'd be a good idea to Ask Slashdot(tm) for advice on how to write the program you described isn't smart enough to write said program. Seriously. Call your boss/manager/lab supervisor/cult leader and tell them to find somebody else for the job, because you will fuck it up just as sure as the sun will rise.
And for all of you folks suggesting this guy/gal writes it in Python/Perl/.Net/Whatever instead of C++, give it a rest. Please. Does the questioner sound like the kind of person that would bother to write exception handlers? That would even bother to buy a frickin' book already to find out what an exception was? No, they do not.
Christ. I'm sick of this sea of idiots.
If you really want the reliability you say you want, you probably need something like QNX with the High Availability Toolkit. That's what drives the newer Cisco routers. Or a Tandem system from HP. Or some kind of fault-tolerant cluster architecture.
But you probably don't, or you would have mentioned MTBF requirements and allowed restart times.
If you have to write multithreaded code, plan out the design before implementing it, and always put the locks in as part of the design rather than trying to shoehorn them in later.
Those are low-level programming-jock languages disguised as high-level languages. As long as the punks who program them will have pissing contests in code obfuscation, you can count on having buffer overflows and memory leaks.
The reasons? A unit test suite that implements several million test cases (mostly pseudo-random probes -- the actual test code is about 1/3 the size of the functional code). In fact, the "defects" that hit production were more "oversights"; stuff that didn't get accounted for and hence didn't get implemented.
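A small sketch of what a pseudo-random probe test can look like, assuming you have a slow-but-obvious reference to compare against. The function under test (saturating_add) is just an invented stand-in, and the fixed seed is there so that any failure can be reproduced.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <random>

// Function under test (illustrative): 32-bit addition that clamps at the
// limits instead of overflowing.
std::int32_t saturating_add(std::int32_t a, std::int32_t b) {
    const std::int32_t max = std::numeric_limits<std::int32_t>::max();
    const std::int32_t min = std::numeric_limits<std::int32_t>::min();
    if (b > 0 && a > max - b) return max;
    if (b < 0 && a < min - b) return min;
    return a + b;
}

// Runs `probes` pseudo-random cases against a 64-bit reference computation
// and returns the number of mismatches.
int probe_saturating_add(int probes, unsigned seed) {
    assert(probes > 0);
    const std::int64_t max = std::numeric_limits<std::int32_t>::max();
    const std::int64_t min = std::numeric_limits<std::int32_t>::min();
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::int32_t> dist(
        std::numeric_limits<std::int32_t>::min(),
        std::numeric_limits<std::int32_t>::max());
    int mismatches = 0;
    for (int i = 0; i < probes; ++i) {
        const std::int32_t a = dist(rng);
        const std::int32_t b = dist(rng);
        std::int64_t expected = static_cast<std::int64_t>(a) + b;
        if (expected > max) expected = max;
        if (expected < min) expected = min;
        if (saturating_add(a, b) != expected) ++mismatches;
    }
    return mismatches;
}
```

Random probes complement, rather than replace, hand-picked edge cases such as the exact limits.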
Just as importantly; every dynamically allocated object just got assigned to a "smart pointer" (see Boost's boost::shared_ptr implementation).
Quite frankly, compared to any Java implementation I've seen, I can't say that "Garbage Collection" would give me anything I didn't get from smart pointers -- and I had sub-millisecond determinism, and objects that destructed precisely when the last reference to them was discarded. The only drawback: loops of self-referencing objects, which are very simple to avoid, and dead trivial if you use Boost's Weak Pointer implementation.
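Here is a minimal sketch of the self-reference problem and the weak pointer fix. It uses std::shared_ptr and std::weak_ptr from C++11, which are the standardized descendants of the Boost classes mentioned above; the Node type and the live counter are invented for the demonstration.

```cpp
#include <cassert>
#include <memory>

struct Node {
    static int live;               // number of Node objects currently alive
    std::shared_ptr<Node> child;   // owning link: parent keeps child alive
    std::weak_ptr<Node> parent;    // non-owning back link: breaks the cycle
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Builds a parent/child pair that refer to each other, then drops it.
// Returns how many nodes are still alive afterwards.
int build_and_drop() {
    {
        auto parent = std::make_shared<Node>();
        parent->child = std::make_shared<Node>();
        parent->child->parent = parent;  // back reference does not own
        assert(Node::live == 2);
        assert(parent->child->parent.lock() == parent);
    }  // last strong reference to parent goes away here
    return Node::live;
}
```

If the back link were a shared_ptr instead, the two nodes would keep each other alive and build_and_drop() would report 2 leaked nodes rather than 0.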
We didn't have access to Boost (which I Highly Recommend using, instead of our reference counted pointer) when we first started the project, so we implemented our own Smart Pointers and Unit Testing frameworks.
I've since worked on "Traditional" C++ applications, and it is literally "night and day" different; trying to do raw dynamic memory allocation without reference counting smart pointers is just insane (for anything beyond the most trivial algorithm). And developing without Unit Testing feels like being beaten with a bat, with a sack tied around your head...
-- -pjk Perry Kundert perry@kundert.ca http://kundert.2y.net
Linux never crashes unless you try to upgrade something. :o)
Don't use Windows. :)
First, you want to get some very experienced engineers who have done this type of thing before. Try ones with a background in either Avionics or medical devices, since both are life-critical / mission critical arenas. Second, you may want to look at companies which make fail-safe systems as these usually require special purpose hardware. HP has a computer line called NonStop which may be worth looking into (no, I don't own any HP stock :)).
In terms of techniques:
1. NEVER, NEVER, NEVER, NEVER -- NEVER execute a loop waiting for some event
to happen, that does not have a bailout mechanism, even if it's just
counting a variable up to (or down from) a few million or so (however
long you've determined would be the maximum wait interval). If a piece
of hardware breaks or a sibling thread crashes you'll be out to lunch.
2. Try to use a real-time system that is used on fail-safe systems
commercially.
3. Don't use Windows. No matter how defect-free / error-free you make
your system, it won't matter, because Windows will have more than
enough defects and flaws to make your system fail in weird and
mysterious ways.
4. Use a journalling file system like ext3 or reiserfs.
5. keep a recent copy of your operational state / data somewhere safe,
like in non-volatile memory. If your system has to restart itself,
this data will help you become operational again much faster.
6. Use a watchdog timer. Basically, this is a piece of hardware that
your code has to "feed" on a periodic, repeated basis. If your
code gets hung up in an infinite loop somewhere, the watchdog timer
will assert the reset line and start things up again. That's where
your "warm" data comes into play.
7. As many here have mentioned, try to partition your system in such a way
that you can stay away from C++ as much as possible.
8. As some here have mentioned, real-time java or a commercial garbage
collector library service could help a lot in avoiding pesky memory
leaks.
9. Assume you will mess up the first time. It's a much more realistic
assumption than assuming you'll get it right the first time. Hey,
most of us didn't even get our first KISS right the first time, and
what you are looking at is a lot more complicated than that :)). So,
schedule enough time to do so (call the first one an R&D program),
collect enough information about your design decisions and rationale
that they will help you to understand where you went wrong, and help
you to do better the second time around. Good Luck.
10. You've gotten a lot of good comments from a lot of very intelligent
and experienced people on this list. Read them over carefully.
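Item 1 in the list above (never wait in a loop without a bailout) can be sketched like this. It is only one way to do it, using a deadline from std::chrono instead of a raw counter; the function name wait_until_ready and its parameters are invented for the example.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Polls is_ready until it returns true or max_wait has elapsed.
// Returns true if the event happened, false if we bailed out.
bool wait_until_ready(const std::function<bool()>& is_ready,
                      std::chrono::milliseconds max_wait,
                      std::chrono::milliseconds poll_interval) {
    assert(poll_interval.count() > 0);
    const auto deadline = std::chrono::steady_clock::now() + max_wait;
    while (!is_ready()) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // bailout: the caller decides how to recover
        }
        std::this_thread::sleep_for(poll_interval);
    }
    return true;
}
```

The caller then has to treat a false return as a real error path (log it, reset the device, tell the manager process) rather than simply calling the function again forever.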
Good Luck
dennis
"I need to create an ultra-stable, crash-free application in C++. Sadly, the programming language cannot be changed...
From zero to flame war in under 20 words. Well done!
But at the end of the day, you have to assume your program is going to crash, either because of intrinsic uncaught bugs, or more likely, unexpected system problems (power outages, network wires accidentally unplugged, etc.).
What to do? Concentrate as much on recovery mechanisms as you do on code correctness, in case your program (or an entire node) does crash.
This has less to do with C++ (or any language) and more to do with thinking through how to journal your program's state (perhaps you are running on top of a file system or database that has transactional semantics and can help you here) and how to have the nodes coordinate after a failure.
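As a very small sketch of journaling state so that a restarted node can pick up where it left off: write the checkpoint to a temporary file and rename it over the old one. The file layout (a single record counter) and the function names are invented for the example. Note the assumptions: replacing an existing file with rename is atomic on POSIX systems but not guaranteed to succeed on Windows, and a truly durable version would also need to flush the data to disk (fsync or similar), which is omitted here.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Writes the position to "<path>.tmp" and then renames it over path, so a
// crash in the middle of the write leaves the previous checkpoint intact.
bool save_checkpoint(const std::string& path, long next_record) {
    assert(next_record >= 0);
    const std::string tmp = path + ".tmp";
    {
        std::ofstream out(tmp, std::ios::trunc);
        out << next_record << '\n';
        out.flush();
        if (!out) return false;
    }  // file is closed here, before the rename
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}

// Returns the saved position, or fallback when no usable checkpoint exists.
long load_checkpoint(const std::string& path, long fallback) {
    std::ifstream in(path);
    long value = 0;
    return (in >> value) ? value : fallback;
}
```

On startup a node would call load_checkpoint and resume from the returned record; a database with transactional semantics gives you the same property with less hand-rolled code.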
There are no karma whores, only moderation johns
I could tell you, but then I'd have to bill you.
I don't know how to make a program crash free. What I do know is - if you have enough logging in your application, when your program does crash, you can quickly look at the logs and find the exact class/method that caused the fault and fix it.
Break down the big project into small components. Have your programmers write unit tests for each component and also add instrumentation/logging code, with an option to turn the logging on/off.
Many people are suggesting that you move to a managed language, even though the OP has stated that it is not really an option. I think moving to a managed language can help with stability but it can't eliminate all the crashes or memory leaks. You can still have things like null pointer exceptions or an ever growing array hogging lots of memory.
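Something along these lines is what I mean by logging you can turn on and off. It is a bare-bones sketch, not a replacement for a real logging library; the Logger class, the level names, and the message format are all made up for illustration.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

enum class Level { Debug = 0, Info = 1, Error = 2 };

class Logger {
public:
    Logger(std::ostream& sink, Level threshold)
        : sink_(sink), threshold_(threshold) {}

    // Raising the threshold switches the chattier messages off at run time.
    void set_threshold(Level t) { threshold_ = t; }

    // `where` names the class/method so a crash log points at the culprit.
    void log(Level level, const std::string& where, const std::string& msg) {
        if (level < threshold_) return;
        sink_ << '[' << name(level) << "] " << where << ": " << msg << '\n';
    }

private:
    static const char* name(Level l) {
        switch (l) {
            case Level::Debug: return "DEBUG";
            case Level::Info:  return "INFO";
            case Level::Error: return "ERROR";
        }
        assert(false && "unknown log level");
        return "?";
    }

    std::ostream& sink_;
    Level threshold_;
};
```

In production the sink would be a file stream that is flushed after every error, so the last lines before a crash actually reach the disk.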
>> As it is, though, your stated goals really don't seem to add up.
I agree entirely with your reply here. The poster's statement (below) is frankly ludicrous:
>> Sadly, the programming language cannot be changed due to reasons of efficiency and availability of core libraries.
Well in that case, sadly, the inherent unreliability of the programming language and core libraries cannot be changed either. Efficiency is the *primary* inverse determinant of reliability.
This is ENGINEERING we're talking about here, ie. a practical discipline that's all about making tradeoffs in one area in order to reap benefits in another. He's not willing to make any key tradeoffs, so he's not going to gain what benefits he seeks either.
"The question of whether machines can think is no more interesting than [] whether submarines can swim" - Dijkstra
Assert every input parameter. Assert every returned value. Shred your garbage (set pointers=null after deleting them). Take a stick to programmers who use #if's. Take a bat to programmers who use templates. Have unit tests for every public method. Have stress tests that check for memory leaks.
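A tiny sketch of the first three rules, with invented names (Sample, destroy, mean). Keep in mind that assert is compiled out when NDEBUG is defined, so anything that must also hold in production needs a real check.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical heap-allocated type used only for the demonstration.
struct Sample {
    double value;
};

// "Shred your garbage": delete the object and null the caller's pointer,
// so a later use fails fast instead of touching freed memory.
void destroy(Sample*& p) {
    delete p;     // deleting a null pointer is a harmless no-op
    p = nullptr;
}

// Averages n samples. Inputs and the returned value are asserted.
double mean(const double* samples, std::size_t n) {
    assert(samples != nullptr);
    assert(n > 0);
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += samples[i];
    }
    const double result = sum / static_cast<double>(n);
    assert(result == result);  // fails only if the result is NaN
    return result;
}
```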
1. Write a coding specification that specifies which portions of C++ are usable on the project.
2. Read Code Complete 2
3. Only use Autopointers. No real pointers. Not for anything.
4. Spec the program out down to the routine level. Have this working before you write any code.
5. Code reviews, code reviews, code reviews.
6. Read the section in Effective C++ on interface/model separation. It will help you make each section of your code loosely coupled enough that you have a chance to make your self-restarting idea work.
7. While you think you're going to be able to write this on windows, I think if you're going to try your "making sure nothing's dead for too long" scheme, you should look at Hard Real time linux to put under a linux installation, THEN port the program to windows if they demand it. Either that or make a hard real time linux monitor for the entire system to make sure the system is going.
8. Make this system LEAN LEAN LEAN. Cut out every feature imaginable. Develop the baseline system and a fully featured one. Have metrics on how every feature changes the baseline so you can pull things in and out.
Want to see every step I took to start my company? http://www.rowdylabs.com/blogs/pitchtothegods
Electronic components are far too fickle for anything that must be ultra reliable. For such applications, you must build a Babbage analytical engine out of titanium, like we do here at .. oh, unexpected visitors .. I have to g............
I started writing an application in C++ a few months ago. I have similar goals, specifically stability and data integrity. Obviously the language doesn't really help with those two goals, but a strong flexible design combined with a large test-case framework is a good place to start.
As for the middleware, I've been looking at the Ice middleware library from ZeroC. It can be commercially licensed for closed-source development, or used freely in an open-source project.
From the documentation, their protocol looks like it's very well designed, and very full-featured. I also like their grid, proxy, and batch capabilities. All in all, Ice looks much better than CORBA and far easier than trying to roll my own middleware.
Does anyone reading Slashdot have any hands-on experience with Ice? I'd appreciate any comments/advice/email you care to share on this topic.
LOAD "SIG",8,1
LOADING...
READY.
RUN
Maybe you can find some jewels of wisdom about stable code in general from this story
No sig for you!!
Read it and follow it. Beyond that, expect errors while testing and don't be discouraged by them. Don't be afraid to start completely over. The best way to get an error free system is to make all the errors in versions 1,2, and 3, then put version 4 into production.
v1: Code prototype (after designing of course). When you find an error in design, make a kludgy work-around. Finish entire prototype. TEST TEST TEST
v2: Re-design while keeping in mind all the errors you ran into, design them out. Code prototype and make a work-around for any new errors you find. TEST TEST TEST
v3: Re-design while keeping in mind all errors you ran into in v2, design them out. Code program and make a work-around for any new errors you find. TEST TEST TEST
v4: Re-design while keeping in mind all errors you ran into in v3, design them out. Code program and you shouldn't have any errors BUT TEST TEST TEST. If you do find errors, fix them elegantly. If a kludge is required for any fix, redo v4.
Use your brain. C++ is a very commonly used language. If you can't write an application that is stable, consider another career. The only people I know who criticize a widely used language like c++ are posers and hacks. Think about it, the linux kernel is written in an even lower level language (C), and it's fairly stable. Sorry for the harshness, but I have the displeasure of working with a guy (like you), that complains about having to do serious design, and prefers to write in PERL, becuasue it's easy to use. Designing fast, scalable, stable programs is part of our profession. Either get with the program, or get out. Sorry if I didn't answer your question, but I don't consider it honest.
I do think there is potential in the overall design you are choosing, with a focus on expecting stuff to break and then simply making the system robust enough that if one module fails it is either restarted or another duplicate takes over and processing still continues.
If you go with restarting, and performance is critical, one of the important aspects will be to help you restart a downed module efficiently. From the GoF, the Memento pattern for storing an object's state may help you. The Command pattern may also be of benefit. For example, if processing something fails, you may only want to retry a certain number of times, in case there is a systematic reason for the failure. This type of functionality can be easily implemented using the Command pattern.
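A rough sketch of how the two patterns could fit together, under the assumption that a module's state is small enough to snapshot cheaply. The names Module, CounterState and RetryingCommand are invented; the GoF book describes the patterns, not this particular code.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <utility>

// Hypothetical module state; the single field is illustrative only.
struct CounterState {
    long processed = 0;
};

class Module {
public:
    CounterState save() const { return state_; }         // create a memento
    void restore(const CounterState& m) { state_ = m; }  // roll back to it
    CounterState& state() { return state_; }

private:
    CounterState state_;
};

// A command object bundles an action with its retry policy.
class RetryingCommand {
public:
    RetryingCommand(std::function<void(Module&)> action, int max_attempts)
        : action_(std::move(action)), max_attempts_(max_attempts) {
        assert(max_attempts_ > 0);
    }

    // Returns true on success. After each failed attempt the module is
    // restored to the snapshot taken before the command started.
    bool execute(Module& module) const {
        const CounterState snapshot = module.save();
        for (int attempt = 1; attempt <= max_attempts_; ++attempt) {
            try {
                action_(module);
                return true;
            } catch (const std::exception&) {
                module.restore(snapshot);
            }
        }
        return false;  // probably a systematic failure: stop retrying
    }

private:
    std::function<void(Module&)> action_;
    int max_attempts_;
};
```

A transient failure is absorbed by the retries, while a systematic one hits the attempt limit and is reported to the caller with the module state unchanged.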
Also, although I have not read it myself, you may get some good ideas out of "Pattern-Oriented Software Architecture, Volume 2, Patterns for Concurrent and Networked Objects". I read the first volume of this series and found it to be a great reference.
My comments are focussed on the "Design Patterns" part of your question. What references have others found useful specifically for building robust software?
I have all too often seen these supposedly robust custom written systems degenerate into a big pile of sh*t which then requires much work to get fixed up. I would strongly recommend keeping close track on the status and quality of your project, and, at the first sign of slippage, particularly if you hear people saying "stuff just doesn't work", get some serious professional help if you want the project to succeed.
Good luck and I hope your project turns out well.
FREE - Java, J2EE and Ajax Audiobooks for Software Developers - www.DeveloperAdvantage.com
Create comprehensive unit tests, and create an integration test harness as you create the product. Not only will this enforce decoupled design, it will ensure continuous confidence in the quality and robustness of your product. Doesn't matter what language you use, as long as you test, test, and test again.
Want stability? Test it and keep testing it. Create a LARGE test suite. Per class testing, then per module, then whole app. Test 'till you're blue, then test it some more.
Good God didn't you just make the program take twice as long to create? You write it in Python, debug, test, debug, test and get it working. Then replace the Python with C++ code which you then have to test and debug (at least once). If you can model it in Python in the first place, why not just start with C++?
If you don't have enough knowledge to know how to approach this now (it's obvious you don't) then you are bound to not get it right for several iterations. Writing reliable software is going to take a lot of experience so you really get your head around how thorough you need to be.
Can your task afford to have a very long period of non-productivity while you play around with ideas and learn the rather hard lessons of writing high-reliability software?
Ideas that spring to mind:
- NASA and their shuttle software. They have enormous resources and get it right by being very careful and formal. Go read about it. Starting point:
http://www.fastcompany.com/online/06/writestuff.h
- Formal methods. Proving your design. Or at least formal coding/testing of the interfaces.
- Aerospace software. Like the fly-by-wire systems. They have less resources than the shuttle and make occasional mistakes.
Also, I would reexamine your requirements. Is it medical? Do people's lives depend on it? If not, it seems like someone is overstating the need for reliability.
Finally, I have had some small experience with people attempting to "add" reliability to a system - like automatic restarting of modules. The usual result is to make the system MORE complex, and LESS reliable, or prone to fail in more severe ways, because simple ideas don't have the effect you'd think.
For every expert, there is an equal and opposite expert. - Arthur C. Clarke
don't split things up and string them on an untested distributed "restart" architecture. this will fuck you over and be a constant source of pain.
get the core code right through testing, purify, using safe libraries where possible, ensuring that data meets constraints before commits, etc. i don't think C++ is noticeably more error-prone than any other language (although memory alloc is harder without GC, it's still a solved problem - smart pointer abstractions to obviate problems, purify to catch things you miss... etc).
Lots of people will try to work around your requirement for C++, but I'll assume your reasons are solid and let it stand.
What you are looking for are not design patterns but software engineering practices. Specifically, you're interested in what would be called critical systems (think things like air-traffic control where failures can cost lives). These sorts of systems exist, and are written in all sorts of languages, but writing them is not a small undertaking. To get an idea of what you're undertaking, have a look at how reliability (RELY) affects things in the COCOMO II model.
http://sunset.usc.edu/research/COCOMOII/expert_co
The next step is to look at some of the literature. I'd suggest starting with Sommerville's 'Software Engineering', where you'll find part 4 dedicated to critical systems and part 5 dedicated to verification and validation. The chapter on critical systems validation is probably the meat of what you need, but the rest is likely needed for a solid background. Suffice to say that those saying 'test driven development' are on the way to enlightenment, but are missing a large part of the story.
http://www.amazon.com/gp/product/0321210263/102-9
Best of luck though. It sounds like you'll be in for an interesting project.
"becuasue"
Look, development of a large (esp. disturbed) app can pose many unexpected problems,
I suggest if you cannot comprehend the task at hand (even as vague as it was posted)
then maybe you should refrain from posting or better yet, "consider another career".
Look, writing in managed code may be great for YOUR code's stability -- but only because you're writing less of it and trusting more and more to other people. In all candor, we're talking about either the JVM or the CLR (which are nearly the same thing in many respects).
Sure, you write less lines so you write less bugs. Big deal. If you take your time and treat your work as a CRAFT and yourself as a CRAFTSMAN, you'll write good code in nearly any language.
Personally, if I were after "Ultra Stability" (a made up phrase, by the way, with no real definition) I'd be looking first at the hardware and operating system and pulling out absolutely everything it does not need, then moving to the custom software and paring everything down to its absolute purest, most minimal set of features and functions.
Think in terms of simplifying the environment, the platform, and then your code. It's the external factors that are going to kill you. Get rid of as many as you can.
The problem with quotes on the internet, is that nobody bothers to check their veracity. -- Abraham Lincoln
You may want to check out OCaml. You can use C++ libraries from it using something like SWIG (and the libraries of other languages too, using something like Pycaml for Python, and there's an interface for Perl too).
:)
Here's a nice comparison of a ray tracer written in C++ and one written in OCaml.
And here are many more comparisons and information on why OCaml is so great.
Oh, by the way, OCaml is much safer than C++, infinitely more elegant, faster to develop in and more readable.
"disturbed"
If you're going to point out spelling errors, it's generally best to avoid making any of your own...
First, if you need multiplatform, then code for that from the start. That kind of stuff can't be done in the end of the project. Your best shot is to use the wxWidgets C++ library.
Then, some things in C++ are boring and prone to errors. So learn lisp and make C++ generators in lisp. I did it to develop some C++ core data structures and their corresponding persistent object storage mechanism.
And the payback is great!
Most errors come from refactoring, where you change one thing in one place, but forget to change the same thing in the myriad places that should be updated. With lisp, you just change a line in your code, tell lisp to recreate your C++ files and you can be sure everything works right.
You can test and change stuff faster with this approach. Hey, I can give you my lisp files if you want to try.
HTH, YMMV.
We are Turing O-Machines. The Oracle is out there.
1) Pass references into functions/methods as much as possible (since they can't be null... in theory) There is very little need to pass pointers into methods _if_ you take the time to think about things.
2) Properly using const will help a lot.
3) At the cost of speed, bounds check all operator [] overloads. Use pure arrays as little as possible.
4) Don't use loops that don't have a determined stop/abort condition.
5) Check and handle all return codes from standard library calls or other function/method calls for that matter.
6) You could put a global try/catch block around everything for good measure, but there isn't much difference between this and just bombing unless you can actually handle the global error condition and somehow recover.
7) Don't throw from a destructor.
8) Actually use destructors.
9) The list goes on and on...
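A minimal sketch of points 1 to 4 from the list above. The `Samples` class and the `sum` function are made-up names for illustration; the only library behaviour relied on is that `std::vector::at()` throws `std::out_of_range` on a bad index.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical example type; not from the original post.
class Samples {
public:
    explicit Samples(std::size_t n) : data_(n, 0.0) {}

    // Point 3: operator[] is bounds checked through vector::at(), which
    // throws std::out_of_range instead of reading past the end.
    double& operator[](std::size_t i) { return data_.at(i); }
    const double& operator[](std::size_t i) const { return data_.at(i); }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<double> data_;
};

// Points 1 and 2: a const reference cannot be null (in theory) and cannot
// be modified by accident inside the function.
double sum(const Samples& s) {
    double total = 0.0;
    // Point 4: the loop has an explicit, bounded stop condition.
    for (std::size_t i = 0; i < s.size(); ++i) {
        total += s[i];
    }
    return total;
}
```

The checked `operator[]` costs a comparison per access, which is the speed trade-off point 3 mentions.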
Tackle this with a bottom up approach.
Try not to use the heap. Try not to use pointers.
If you have to use the heap use a rock solid smart pointer implementation.
Do you have to use C++? Could you just use C?
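For the "rock solid smart pointer" case, one option in current C++ is `std::unique_ptr` from the standard library (C++14 or later for `std::make_unique`). This is only a sketch; `Node`, `make_node` and `store` are hypothetical names. Note that `std::auto_ptr`, which older books recommend, was removed in C++17.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical node type used only for illustration.
struct Node {
    int value;
    explicit Node(int v) : value(v) {}
};

// Ownership is explicit in the signature: the caller gets the only owner,
// and the Node is deleted automatically when that owner goes away.
std::unique_ptr<Node> make_node(int v) {
    return std::make_unique<Node>(v);
}

// Taking a unique_ptr by value means the container takes over ownership.
void store(std::vector<std::unique_ptr<Node>>& nodes, std::unique_ptr<Node> n) {
    nodes.push_back(std::move(n));
}
```

There is no `delete` anywhere in this code, so there is no place to forget one or to run one twice.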
td
hard core geek-ware
Yes, some of those do conflict. How to keep things simple AND have fault-tolerance, for example. That's where a good design comes in handy, because you can get a better feel for where you should make the trade-off between certainty of working, certainty of working later on and getting some sleep this side of 2008. It's all a matter of weighing the options and investing time in the place most likely to benefit.
(Because everything is a trade-off, anything listed above may not apply. But then, it may not need to. If you've tested a component thoroughly along all boundaries, a good sample of valid conditions and a good sample of erroneous conditions, AND everything has been kept as simple as possible so that really weird cases are unlikely to crop up, then you may decide you can simplify or eliminate fault-tolerant components. There is no point in catching errors that won't occur. In fact, that adds complexity and violates the Keep It Simple rule.)
Oh, and as this is a networked system, testing should include testing network I/O. Use packet generators if necessary, to see how the system handles erroneous packets or massive packet floods. You don't want "perfect" responses (unless you can define what "perfect" means), you want reliable responses. If X occur
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Stable programming can be achieved, and achieved quite well. No single method is the "silver bullet", so to say, but using a VM'd language or garbage-collected language won't do you any good either. Here are some basic principles for any language that can help make your programs stable.
1. Never trust anything coming in. Always check every parameter to a function and make sure that each is within the expected value ranges. For pointers, you can't tell whether they point to the right place, but you can make sure they are valid. If something doesn't look right, then return an error code - don't continue. Also check everything returned by any function you call - check for the error codes, and actually handle them; if a caller needs to handle it, then return a consistent error code and let it handle it. But, most importantly, HANDLE the errors.
2. HANDLE all the errors from any function or procedure - whether hardware or software. Throwing an "exception" is a bad method of handling the error and really does nothing for the actual problem - it's too generic, and too easily turned into giving the error to a higher level, which just prohibits the higher level from being able to actually overcome the error (fix it, handle it, etc.). Handle everything, and do so meaningfully, and when an error occurs, clean up anything that needs to be cleaned up before returning to the caller or exiting/terminating.
3. NEVER ASSERT!!!! Asserts will only lead to faulty, crashing programs. HANDLE THE ERRORS! (See #1 & #2)
4. Don't trust that system calls succeed - always check that memory is allocated, clear it, and then use it; and check that the correct amount of data got written out to disk or read in.
5. Be explicit about everything in your code. Don't assume that an "int" is signed - declare it as a "signed int"; this will help with reading and understanding the code as well.
6. Be explicit in your logic - use an explicit structure with single-entry-single-exit principles to do the above.
7. Since you're using C++, use objects to control your resources efficiently. For example, don't just allocate buffers using "new" - create an entity in the program that is to do so, and then use it to do it. Return the buffer to it when you're done, and let it also destroy them. Same goes for threads, and any other resource. Don't rely on the OS, a VM, or a GC to clean up after you. DO IT YOURSELF!
8. Reuse proven code as much as possible, and use efficient algorithms. Sometimes an STL vector class may be good for you; other times it won't be. Be aware of what you're using - its benefits AND costs, and consider them all in the process.
9. Be able to create and destroy, load and unload any object in the system - e.g. libraries. Make sure you can unload a library to re-initialize it if you need to. It would be great to be able to do this even for standard objects, but this must apply especially to libraries - first AND third party - so that you can clean out their errors or reset them if needed.
10. Design and Architect the software to achieve its goals and requirements. After each design and architecture is proposed, revisit the requirements and (a) check that the design meets the requirements, and (b) see what other requirements are then inferred by the design and architecture. Repeat as needed until a version that can be extended is viable. Do not try to do it all in one version.
11. Document. Document. Document. Document your code. Document your design. Document your algorithms. Document your architecture. Write it all down and make sure that anyone can pick up the documents and be able to understand exactly what it is you are doing - write each document to the necessary type of person, e.g. an SDK document should assume software programming background; overviews should be aimed at your manager or your manager's manager and Joe Smoe off the street (not that Joe Smoe will get it - just anyone should be able to understand it given th
Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
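Points 1 and 4 of the numbered list above (validate what comes in, and confirm that the full amount of data was written) might look roughly like this. It is a sketch only: `IoStatus` and `write_block` are invented names, and a real system would likely need more detailed error information than one enum.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical error codes for this sketch.
enum class IoStatus { Ok, BadArgument, OpenFailed, ShortWrite, CloseFailed };

IoStatus write_block(const char* path, const void* data, std::size_t size) {
    // Point 1: validate every parameter before using it.
    if (path == nullptr || (data == nullptr && size > 0)) {
        return IoStatus::BadArgument;
    }
    std::FILE* f = std::fopen(path, "wb");
    if (f == nullptr) {
        return IoStatus::OpenFailed;
    }
    // Point 4: confirm that the full amount of data was accepted.
    const std::size_t written = (size > 0) ? std::fwrite(data, 1, size, f) : 0;
    // fclose flushes buffered data, so its result matters too. It is called
    // on every path so the file handle is never leaked.
    const bool closed = (std::fclose(f) == 0);
    if (written != size) {
        return IoStatus::ShortWrite;
    }
    if (!closed) {
        return IoStatus::CloseFailed;
    }
    return IoStatus::Ok;
}
```

The caller is expected to act on every non-`Ok` value rather than ignore it, which is the whole point of rule 2.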
One of the most powerful techniques i've seen for c++ is to implement a thin memory management layer (gc, reference counting, caching and cache warming). This will make it stable to the level of JVMs and CLRs without a performance loss, assuming that all code uses this api (or simply preprocess on malloc and free). This can even speed up code significantly with hardware cache aware code.
Python is the second best thing in the world, right after love.
The restart logic to bring parts of the system up that fail also seems like it could have a net-effect of reducing reliability. Such restart logic will almost certainly be complex and thus problematic. I recommend instead a different error handling paradigm: whenever anything pathological is detected your program should "panic" (quit in a graceful way) and tell you exactly why it panicked (perhaps by printing out a stack trace, perhaps by printing out the exception's .what() string and making sure all possible exceptions produce uniquely identifiable strings.) In the course of development every time a panic happens you will know why and so will be able to resolve them quickly. Given this sort of discipline, some good engineers and the proper development schedule by release time you will have worked all the common panics out of the system.
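A rough sketch of that panic paradigm, assuming every failure site throws an exception carrying a unique identifier. `Panic`, `Outcome` and `run_or_panic` are names invented here, not an existing API. Printing a stack trace is platform-specific and is left out.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical exception type: each panic site passes its own unique id,
// so the .what() string identifies exactly where the panic came from.
class Panic : public std::runtime_error {
public:
    Panic(const std::string& id, const std::string& detail)
        : std::runtime_error("[" + id + "] " + detail) {}
};

struct Outcome {
    int exit_code;
    std::string message;  // empty on success
};

// Runs the application body. On any exception it records why and hands back
// a nonzero exit code instead of trying to limp on.
Outcome run_or_panic(const std::function<void()>& body) {
    try {
        body();
        return {0, ""};
    } catch (const std::exception& e) {
        return {1, std::string("PANIC ") + e.what()};
    } catch (...) {
        return {2, "PANIC unknown exception"};
    }
}
```

In `main` you would call `run_or_panic` once around the whole application, write `message` to stderr or a log, and return `exit_code`. Signals such as SIGSEGV are not exceptions and are not caught by this.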
The advice to use some sort of agile software development process and automated unit-testing is spot-on. Check out cppunit (http://cppunit.sourceforge.net/cppunit-wiki). I use cppunit extensively and it helps me greatly. Making the application unit-testable will increase its complexity somewhat, but this complexity increase is usually not too great and will prove to be insanely worthwhile given the advantages of having automated unit-tests checking that your software still works at every build.
Think carefully about your Makefiles/directory structure. Again you want something simple. But you also want the build system to do some things for you like auto-generate the dependencies of your .cpp files and invoke your unit-tests as part of the build. Make sure your build system refuses to output an application binary if any unit-test fails (this is quite easy to do with make and cppunit.)
The advice to utilize STL and exceptions is also very good. There is no need to reinvent the wheel when you can leverage STL. Make sure you do some reading about exceptions to avoid making a big mess, but they are absolutely worth learning and using.
Really, really try to stay away from threads. Threads imply complexity and it is hard to get them right without great discipline and some experience.
Reading:
The C++ Programming Language by Bjarne Stroustrup
This is worth having around as a reference. It is a bit heavy as a tutorial but it will be able to answer many of the strange questions that come up about C++.
Effective C++ Third Edition by Scott Meyers
(Getting the THIRD (newest) edition in particular is important.) C++ has a learning curve that's a bit steeper than many of the other modern OO languages. This book helps you up that curve as quickly as possible (assuming you already know the basics of OO software design and languages.) Very good advice about why you should use the "auto_ptr" (that others mention) when you allocate memory on the heap. Advice about exceptions. Lots and lots of other lessons can be learned from a quick read that will save you so much time and pain in the long run.
Agile Software Development by Robert C. Martin
Others may disagree with this recommendation. I think there are several similar books. This is the one that I happened to encounter and I do not know if it's the best. I do know it is sufficient at giving a broad overview of many important concepts like: agile development processes, unit testing, OO design principles, design patterns.
Good luck with the system!
Consider: 30 years ago, software was written in assembler. Nowadays, it's written in a variety of higher level languages. In that time, has software reliability increased, or decreased?
Note to the truly clueless: I'm not suggesting that assembler is better.
I wonder if that was included in the question not as a bash on C++ being unstable, but to try (unsuccessfully, as it turned out. This IS slashdot, after all) to ward off some of the inane "Use Python/.NET/Ruby/Visual Basic/etc..." posts.
Unit Testing anyone?!?!
I've dealt with software that automatically restarts a dead process, and in my experience, it doesn't work so good. If you want ultra-stable software, you want to know what caused the crash and why.
For your situation, where I guess you're doing lots of time consuming computing, I'd think you should also set checkpoints, save intermediate results, or something, so if it does crash, you can restart in the middle instead of going back to 0. (A standard practice when I was analyzing large databases for corruption, a task that could take days)
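One way to sketch such checkpoints, assuming the job's progress fits in a small struct. `Checkpoint`, `save_checkpoint` and `load_checkpoint` are made-up names. The write goes to a temporary file that is then renamed over the old checkpoint, so a crash in the middle of saving leaves the previous checkpoint usable. That replace-by-rename is atomic on POSIX; `std::rename` onto an existing file is implementation-defined in general, so the Windows port would need a different call.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical state for a long-running job: how far we got and the
// running result so far.
struct Checkpoint {
    unsigned long next_item = 0;
    double partial_sum = 0.0;
};

// Writes to "<path>.tmp" first and then renames it into place.
bool save_checkpoint(const std::string& path, const Checkpoint& cp) {
    const std::string tmp = path + ".tmp";
    {
        std::ofstream out(tmp, std::ios::trunc);
        out.precision(17);  // enough digits to round-trip a double
        out << cp.next_item << ' ' << cp.partial_sum << '\n';
        out.flush();
        if (!out) {
            return false;
        }
    }  // file is closed here, before the rename
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}

// Returns false (leaving cp untouched) if there is no usable checkpoint.
bool load_checkpoint(const std::string& path, Checkpoint& cp) {
    std::ifstream in(path);
    Checkpoint loaded;
    if (!(in >> loaded.next_item >> loaded.partial_sum)) {
        return false;
    }
    cp = loaded;
    return true;
}
```

At startup the job calls `load_checkpoint` and resumes from `next_item` if it succeeds, otherwise it starts from 0. This sketch does not force the data to disk (no fsync), so it protects against a process crash but not necessarily against power loss.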
Do you even lift?
These aren't the 'roids you're looking for.
So your question is, "How can I write code that doesn't crash?" First of all, it doesn't matter which language you use. There is no language that will guarantee uncrashability. There is no "design pattern" that will guarantee uncrashability. If you are asking such questions, I think your best course of action is to find an experienced programmer to do it for you. They are not cheap, but it will be worth it in the long run.
I've seen the most stable code written in C, and I've seen horrible mess in PHP and Python. It's like asking, "Which brand of circular saw should I use to make the best kitchen table?" A true hacker can make a work of art with an axe, and not because he is forced to, but because that happened to be the best tool for the job.
Read all of Scott Meyers's and Herb Sutter's books. Practice what they preach. Investigate tools for finding memory leaks and use them from day one. Invest in infrastructure that helps you test code. Write unit tests. Insist on 100% block coverage on every piece of code that gets checked in, even if it means that you have to write a lot of stub functions that force failures (e.g. a version of operator new which is designed to throw an out-of-memory exception). Have at least one tester for each coder, if not more, and hire people who would easily be as good as your developers. Have the testers do code reviews as well as developers. Spend time refactoring code every time you go to write new code - make sure the existing code that you're attempting to modify actually makes sense in light of the changes you're making to the system. Use prepackaged code rather than inventing your own - too many teams rewrite pieces of the standard library or other libraries because they think they can do it better, or think that the existing code isn't efficient (although they haven't actually profiled it and don't know if it matters for their application). Basically - care about what you're doing and make sure every person on the project does, and you'll have much better success than the average team. Having people who care what they are doing and stay around long enough to see the job get done is worth more than many people give it credit for.
1. Use Web 2.0 for your app!
2. ???
3. Profit!
A few simple steps.
1. Favor creating objects on the stack, rather than on the heap. Stack objects are automatically destroyed when they go out of scope. If you can't do it on the stack, try to destroy it in the same function in which it is created, barring that, destroy it in the calling object's destructor.
2. If a library function exists that can do what you want done, use it.
3. Keep functions as compact as possible. A short simple function that does just one thing is easier to get right/debug.
4. Set your compiler to maximum nagging (-Wall in gcc), and fix your code so that it doesn't complain.
5. Use structured exception handling. It will greatly improve the readability of your code.
6. If you have to use allocated (malloced) memory buffers, create a class to "own" the memory buffer and access the memory via that class's member functions. That way you can take advantage of rule 1 for managing buffers as well.
7. Code reviews. Have a group of peers review each and every line of code you write, preferably with you out of the room.
I'm sure there are other things, but that is a good start.
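Rule 6 combined with rule 1 could look something like the sketch below. `Buffer` is a hypothetical class written for this example; in practice `std::vector<unsigned char>` already gives you most of this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical owner for a malloc-style buffer.
class Buffer {
public:
    // calloc zero-fills the block; a zero-size request still allocates one
    // byte so that data_ is never null for a live object.
    explicit Buffer(std::size_t size)
        : data_(static_cast<unsigned char*>(std::calloc(size ? size : 1, 1))),
          size_(size) {
        if (data_ == nullptr) {
            throw std::bad_alloc();
        }
    }

    // Rule 1: when a Buffer on the stack goes out of scope, the memory is
    // released here, on every exit path including exceptions.
    ~Buffer() { std::free(data_); }

    // Copying is disabled so two objects can never free the same block.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    // All access goes through a checked member function.
    unsigned char& at(std::size_t i) {
        if (i >= size_) {
            throw std::out_of_range("Buffer::at");
        }
        return data_[i];
    }

    std::size_t size() const { return size_; }

private:
    unsigned char* data_;
    std::size_t size_;
};
```

Declared as a local (`Buffer b(4096);`), the buffer needs no matching `free` call anywhere in the calling code.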
Jesus christ people, he is asking you how he should go about building an ultra-stable application in C++. He told you he *has* to build it in C++ because there are critical libraries and other components that aren't available outside C++. Telling him he shouldn't build it in C++ anyway just isn't helpful.
I hate to break it to people but there *are* libraries, especially for types of scientific computing, that are only (reasonably) available in C++ or sometimes FORTRAN. Not only would abandoning these libraries mean he would completely have to reinvent the wheel but also might cause serious compatibility problems not to mention a much greater ongoing maintenance responsibility (he can't just check his program to make sure things still work when someone fixes a library bug).
Moreover, the idea that because he is considering using CORBA, IPC or whatever else speed can't matter enough to require C/C++ is dead wrong. It is true that whatever *parts* of the process are done using these components may not require huge amounts of speed but this doesn't mean one of these components isn't doing something very processor heavy.
In particular what he says sounds like the situation in some areas of scientific computing. If one is writing a program to do some sort of simulation or similar math intensive operations speed can be *very* important in the critical parts of the code but (in some cases) transferring information to the GUI or other components need not be particularly speedy (increasing by an order of magnitude may make a small difference in overall runtime). Imagine a program that does some kind of weather, or nuclear detonation simulation. The cross-processor communication and the core simulation kernel need to be very fast but the GUI and data input components need not be particularly fast. Also it is my understanding that the critical libraries in this area are often only available (at least freely) with C/C++ or Fortran bindings.
Anyway I think it is important to distinguish several different goals: ultra-stability, minimal downtime, and minimal data/computation loss. For instance, for a climate simulation that may run on a supercomputer for months it is very important to have minimal data/computation loss (i.e. if something goes bad you don't lose months of very valuable supercomputer time) but you need not have ultra-stability or minimal downtime. As long as when any node crashes the simulation can easily be restarted without loss of data there is no problem. On the other hand if you are running a website like slashdot it is minimal downtime that is important; it doesn't really matter if some of the web server processes are rebooted once in a while. If, on the other hand, you are writing code to monitor a nuclear power plant it is ultra-stability that is important (though I can't at the moment think of something that requires distributed processing and ultra-stability but I'm probably just missing something).
So I think the answer depends on what sort of stability you want. If it is important that no individual *node* crashes (though the GUI/other non-core components can crash) then you should pursue the separation you described above. I have to admit I'm not an expert here but the client-server model (like mysql, X etc.) seems to work well in this context. However, this depends a lot on what sort of data you need to transfer. If you just need to send the core setup commands and get back mostly unstructured info (say a grid of temperatures or other simple datasets) then I would suggest sticking with one of the simpler abstractions and don't get lost in CORBA. On the other hand if you need to send back and forth real objects with significant structure then creating your own serialization system/bindings is just asking for bugs.
On the other hand if what you want is minimal data/computation loss, downtime, or any other property where it is the overall system you care about not a crash at any particular node then I suggest concentrating less on dividing any one node into comp
If you liked this thought maybe you would find my blog nice too:
Avoid the latest "big thing" for the core of your project. It's usually specialized, non-portable, etc. The standard template library for C++ (for example) is here to stay, with tested algorithms that are safer and faster than you can usually write (because they are optimized for the platform you compile on). For the GUI, on the other hand, you may be better off with a GUI-based language/tool. That's less likely to be portable, but that's the way GUIs work.
Next, spend some time upfront on your design, with things like use cases, sequence diagrams, and other visualization tools to help you understand just what you want to happen in best case situations as well as failures. The level of detail/formality required is a moving target, so update as needed. You should have a solid error detection/correction plan so that you can design each component to follow it. Also design for test and with logging - it will help you while debugging, while testing, and while fixing the bug the customer is seeing.
Make sure management will allow sufficient time for testing. A lot more lip service goes into support for testing than actual schedule and money. Your test plan should be as bulletproof as your design.
That's my 2 cents. And a random book recommendation: books like Scott Meyers' "Effective " provide info on effective/error reducing ways to use the language/libraries, but won't help you get started with the architecture.
1) write an api that you use to do ALL of your memory allocating/freeing. Have it track allocations and frees and warn you when they don't match up at app shutdown. Have it fill memory blocks with fefe or some other pattern on release so that late references are crashes.
USE IT FOR ALL ALLOCATIONS. Fix problems as soon as they occur.
For the most stable possible code, DON'T validate pointers - crashing hard during development is much better than having your parameter validation mask an error or bad assumption. Unless the code on the other end isn't your own code - in which case trust nothing. Always fully validate at the interface anything that you didn't create. Better yet, don't use it at all.
2) Don't use 3rd party components. Insist on source code for anything that you call (with the obvious exception of OS functions).
Managed code doesn't guarantee that you won't leak memory, it just makes the causes of leaks a bit rarer while at the same time making them MUCH harder to track.
No kidding. It will catch things most humans will miss. Check it out here. If you want to get an idea of the kinds of things it'll catch, check out their "bug of the month" page - it's a fascinating read. You'd be surprised how many nearly-impossible to detect errors you can make in C++.
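A bare-bones sketch of the allocation API from point 1. `TrackedHeap` and its member names are invented for illustration, and it is not thread safe. One caveat: an optimizer is allowed to drop a fill that is immediately followed by `free`, so a real implementation would probably keep released blocks in a quarantine list for a while instead of freeing them right away.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <map>

// Hypothetical tracking layer; every allocation in the app goes through it.
class TrackedHeap {
public:
    void* allocate(std::size_t size) {
        void* p = std::malloc(size ? size : 1);
        if (p != nullptr) {
            live_[p] = size;  // remember the block and its size
        }
        return p;
    }

    // Returns false if p was not allocated here or was already released,
    // which is exactly the mismatch the tracker is meant to expose.
    bool release(void* p) {
        const auto it = live_.find(p);
        if (it == live_.end()) {
            return false;
        }
        // Poison the block so late references read obvious garbage.
        std::memset(p, 0xFE, it->second);
        live_.erase(it);
        std::free(p);
        return true;
    }

    // Call at shutdown: anything other than 0 means a leak.
    std::size_t outstanding() const { return live_.size(); }

private:
    std::map<void*, std::size_t> live_;
};
```

At app shutdown you would check `outstanding()` and warn loudly if it is not zero.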
Disclaimer: I don't work for Gimpel, YMMV, etc.
Weaselmancer
rediculous.
It's hard to gauge exactly how stable this application is meant to be. If you write an application for almost any company they will say "yes it needs to be very very stable." But there is actually a large variance in that.
For some companies, an application has to be very stable because the company might lose, say, $100/minute when it goes down. But this is actually toward the bottom end of needing to be stable, and is pretty typical for most business software. Real stability is needed when a crash could mean a loss of millions of dollars, or loss of life.
Because you state portability to Windows as a requirement, I tend to doubt your application really needs to be that stable. Applications that need very high stability usually have one specified target platform.
Whatever your requirements are, the number 1 way to achieve stable software is by using the KISS principle as much as possible. However, if your requirements are in the latter category, you will absolutely need to use KISS to the maximum. Don't allow new features. Make the application as stupidly simple as possible.
Finally, you need to carefully consider what kind of 'stability' is needed. For software that needs to be super 'stable', producing the incorrect results can actually be worse than a crash. Writing code that is completely bug-free is much harder than writing code that simply doesn't crash.
In C++, the best approach is to take advantage of the type safety features of the language. Don't use things like void pointers, and use STL containers and so forth rather than allocating memory yourself. Keep the programming team as small as possible, and carefully vet the code.
If the program is truly in the "needs very very high reliability" category, then yes, write unit tests. If it's actually in the "complex program that we're going to add lots of features to, but we'd kinda like it to be stable too" category then unit tests are probably going to be a waste of time.
-Sirp.
* Minimize logic redundancy.
* Minimize use of unsafe data structures and operations. Write or use existing checked wrappers around arrays, pointers, dynamic memory allocation, casting, etc. So that it just isn't possible to run into the problems that these things can cause.
* Try to make it impossible to write things incorrectly by taking advantage of the compiler. Use classes with limited interfaces so you can't call the wrong methods.
* For things you can't check at compile time, check them at run time. Use lots of assertions. Constantly check that pre-conditions, post-conditions, and invariants are maintained.
* Do unit testing, system testing, and every other kind of testing for anything you can't check at compile time.
* Make the assertions be exceptions for production releases and use good exception handling techniques.
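The last two assertion bullets above could be sketched with a macro that aborts in development builds and throws in production builds. `CHECK`, `CheckFailed`, `check_failed` and the `CHECKS_ABORT` build flag are all hypothetical names chosen for this example; `mean` is just a function to hang the checks on.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical exception thrown when a check fails in a production build.
class CheckFailed : public std::logic_error {
public:
    explicit CheckFailed(const std::string& what) : std::logic_error(what) {}
};

[[noreturn]] inline void check_failed(const char* expr, const char* file, int line) {
    const std::string msg =
        std::string(file) + ":" + std::to_string(line) + ": check failed: " + expr;
#ifdef CHECKS_ABORT
    // Development build (assumed flag): stop right at the failure.
    std::cerr << msg << '\n';
    std::abort();
#else
    // Production build: turn the failed check into an exception.
    throw CheckFailed(msg);
#endif
}

#define CHECK(cond) \
    ((cond) ? static_cast<void>(0) : check_failed(#cond, __FILE__, __LINE__))

// Example use: preconditions are checked at run time on every call.
double mean(const double* values, int count) {
    CHECK(values != nullptr);
    CHECK(count > 0);
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        total += values[i];
    }
    return total / count;
}
```

Unlike the standard `assert`, these checks are not compiled out by `NDEBUG`, so the production binary still detects the violation and can unwind through normal exception handling.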
This statement is just ignorant and wrong. I don't want to flame the original poster, he probably just didn't think about it too much and made a mistake, but it is even worse to mod this up to 5.
It is a well known fact that a huge percentage of processor time is used by a small percentage of the code. It is very important for that code to be fast because it is inside a loop being executed millions of times, and perhaps this requires a language like C/C++. However, other parts of the code, which execute once for every million times the performance-critical code runs, can often be orders of magnitude slower without affecting overall performance.
Just as an example consider something like the GIMPS project or one of the RC whatever challenges. It is vitally important that the tight inner loops be very fast but you could write the graphic output and network components in a shell script that calls command line packet/output programs written in java and it wouldn't slow the overall app much.
Way back in 1993, thanks to a three month schedule delay in shipping the original Apple Power PC hardware, Graphing Calculator 1.0 had the luxury of four months of QA, during which a colleague and I added no features and did an exhaustive code review. It was also the only substantial PowerPC native application, so everyone with prototype hardware played with it a lot, and as a result that product had a more thorough QA than anything I had ever worked on before or since. It also helped that we started with a mature ten year old code base which had been heavily tested while shipping for years. On top of that, a complete lack of any management or marketing pressure on features allowed us to focus solely on stability for months.
As a result, for ten years Apple technical support would tell customers experiencing unexplained system problems to run the Graphing Calculator Demo mode overnight, and if it crashed, they classified that as a *hardware* failure. I like to think of that as the theoretical limit of software robustness.
Sadly, it was a unique and irreproducible combination of circumstance which allowed so much effort to be focused on quality. Releases after 1.0 were not nearly so robust.
With the security fixes put into OpenBSD, it would be helpful to make your programme run on it natively. The fixes they have make programmes crash rather than allowing them to be exploitable, so you'll be finding bugs much more easily on it.
A few areas I haven't seen covered:
* If you are going to be consuming code you don't own, for the love of god run it in an external process. Create a defined plug-in like interface that your customers will use, and have a shim process designed to load the plug-in and perform communication back and forth with your real app. Your real app should assume that every attempt to communicate with the external process will fail and respond accordingly.
* Don't assume that hardware is fault free; assume that any and every operation that can fail will. I have personally tracked down application crashes that were the result of faulty hardware; in one instance, a machine had failure rates 20% higher than the rest of the machines -- turned out the mobo was faulty. In another instance, a harddrive was failing. I had one case where an object was allocated, verified not null, and passed in as a reference to a function -- which failed because the object was null...
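On Linux, the crash-containment half of the external-process idea can be sketched with `fork` and `waitpid`. This is POSIX only and deliberately simplified: `run_isolated` and `ChildResult` are invented names, and a real shim would `exec` a separate binary and talk to it over a pipe or socket rather than run a callback in the forked child.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum class ChildResult { Ok, Failed, Crashed, SpawnError };

// Runs `work` in a forked child so that a crash there cannot take the
// parent down. The parent only learns how the child ended.
ChildResult run_isolated(const std::function<int()>& work) {
    const pid_t pid = fork();
    if (pid < 0) {
        return ChildResult::SpawnError;
    }
    if (pid == 0) {
        // Child process. _exit skips the parent's atexit handlers and
        // buffered output, which must not run a second time here.
        int code = 1;
        try {
            code = work();
        } catch (...) {
            code = 1;
        }
        _exit(code);
    }
    // Parent process: wait for the child, retrying if a signal interrupts.
    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);
    if (r < 0) {
        return ChildResult::SpawnError;
    }
    if (WIFSIGNALED(status)) {
        return ChildResult::Crashed;  // e.g. SIGSEGV or SIGABRT
    }
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        return ChildResult::Ok;
    }
    return ChildResult::Failed;
}
```

The caller should treat `Crashed`, `Failed` and `SpawnError` as normal, expected outcomes, in line with the advice to assume every attempt to talk to the external process will fail.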
He didn't complain about anything... you added that part. He asked for advice. Were more people to do so, instead of relying on their overconfidence (like you) to get jobs done, maybe we wouldn't have computer instability be such an issue.
You want extreme reliability? Put the app on an HP Nonstop, use some system calls to checkpoint the data to the app's backup process at key points in the processing, and voila! If the app crashes or the processor goes down, the backup process will takeover in less than 15 milliseconds, and continue processing. And yes, you can do it in C++.
A potential drawback: last time I knew, the cheapest two-processor Nonstop you could buy would set you back $250k.
The Nonstop is designed from the ground up to be extremely robust and reliable. That's why 80+ percent of all the ATM transactions in the world are handled by a Nonstop computer back at the bank's data center. 90+ percent of all the world's stock exchanges run on Nonstop machines.
(Full disclosure: I have never worked for HP. I spent 12 years working on Tandem computers, 6 at Tandem HQ. Compaq bought Tandem, then (as we all know) HP bought Compaq, and so through them acquiring a company founded in 1978 by 3 ex-HP engineers. The formerly Tandem platform is now called the HP Nonstop.)
Cheers, Tim -- Tim Janke Part mad scientist, part lion tamer: sr. software engineer, global team leader, project mana
Crash only
You are not entitled to your opinion. You are entitled to your informed opinion. -- Harlan Ellison
A lot of good stuff has been posted in this thread, but let me add a little more.
Success in a project such as this one requires an approach that pays attention to three factors. Ignoring even one of them will guarantee failure.
1. People: Can you use a design & development team of one person? If not, communication will be crucial. Get the best people you can who are willing to work together. Your people need experience working together.
2. Environment: Your team will need to be intimately familiar with both the development and live environments where this application will run. No one can be permitted to alter these environments once development starts. Don't even let them change the Uninterruptible Power Supply. I've seen a defective UPS introduce seemingly random errors into a rock solid production system! No other applications. No OS upgrades. Nothing!
3. Subject: You can neither design nor write good code if you don't fully understand the subject matter. What does the application do? Why is it important that this be done? Where does the input data come from? How and why is it generated? Who will use the outputs from your system? What will they do with those outputs? Why?
If your requirements are as stringent as you claim, select your team up front. Do not add anyone to the team (designers, coders, testers, documenters, project managers, etc. ) once you start the process outlined below. Everyone needs to be in it "for the duration".
Don't plan to do just one project. Plan for the team to do five projects using the environment. Plan for enough time to do it right. Plan for enough time for your people to get enough sleep. Plan for your team to not do any other applications until this one is finished. Do not reuse any code from project to project. The first four projects are learning projects.
Project 1: Team Building Project - Pick something relatively easy. The primary purpose of this project is for the team to become familiar with each other. The secondary purpose is for them to become familiar with the environments (OS, development tools, test tools, etc.) But this project (and all the others) must be taken seriously and produced to the same standard as your final product.
Project 2: Environment Test - Pick something that will push the environment to its limits. Use every feature of the languages. Process numbers that are extremely small and extremely large. Process strings that are empty. Process strings that are huge and that contain all possible characters. Include all the "corner cases" you can imagine. Run it under high load, light load, no load and variable load. The purpose is not just to prove the environment, it's also for the team to gain an understanding of how the environment pushes back.
Project 3: Subject Matter Test - Pick something relatively simple to give your team experience with the subject matter and data that will be used in your ultimate project.
Project 4: Trial Run - Do a simplified version of your ultimate project. This is the place where your team gains experience with the core of the problem they will be addressing in the ultimate project.
Project 5: The Real Thing - This is where your team finally delivers what is wanted. Use a minimum of language features and system resources. Avoid doing anything "fancy". Keep it simple and organized. And remember, don't reuse any of the code from the earlier projects. Just use what you learned.
See, it's really easy. I've been producing high quality code for more than 35 years. And I learn something new on every project. This methodology won't guarantee success, but it sure will help you get there. Good luck.
Morris
I completely agree with the parent, but I want to add a few things.
Before you do your design, you should try to understand all of the system's requirements. Just how much reliability is really needed? Are lives at stake? Is a lot of money at stake? Is a system failure just inconvenient? Remember that each extra "9" in reliability multiplies the cost of the project by 10, so make sure you understand just how reliable your system must be before you start.
Once you understand the system's requirements, make sure that your fault-tolerant design is testable. Then design your tests (before you write a single line of code). Don't make the mistake of leaving testing until after implementation. Make sure all of your interfaces are specified and that the subsystems are decoupled enough such that you can test each unit individually and thoroughly. What happens if a subsystem receives bad input? What happens if a subsystem takes longer than expected to respond? How long is too long? What happens if a subsystem returns bad output? You should design your system such that you can replace any subsystem with a misbehaving (test) subsystem and that your overall system responds appropriately. In the process of designing your test, you should expect to find many design defects (or design glosses).
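To make the "replace any subsystem with a misbehaving (test) subsystem" idea concrete, here is a rough sketch. The Storage interface, the Reply type and the lookup function are invented for this example, and the elapsed time is simply reported by the fake rather than measured, so treat it as an illustration of the shape only:

```cpp
#include <cassert>
#include <chrono>
#include <optional>
#include <string>

// Hypothetical subsystem interface; all names here are illustrative.
struct Reply {
    std::string payload;
    std::chrono::milliseconds elapsed;   // reported by the subsystem
};

class Storage {
public:
    virtual ~Storage() = default;
    virtual Reply fetch(const std::string& key) = 0;
};

// Test double that misbehaves on demand.
class MisbehavingStorage : public Storage {
public:
    enum class Mode { Ok, Slow, Garbage };
    explicit MisbehavingStorage(Mode m) : mode_(m) {}

    Reply fetch(const std::string& key) override {
        switch (mode_) {
            case Mode::Slow:    return {"value:" + key, std::chrono::milliseconds(5000)};
            case Mode::Garbage: return {"\x01\x02", std::chrono::milliseconds(1)};
            case Mode::Ok:      break;
        }
        return {"value:" + key, std::chrono::milliseconds(1)};
    }

private:
    Mode mode_;
};

// Caller under test: "how long is too long?" is an explicit parameter,
// and the output is validated before it is used.
std::optional<std::string> lookup(Storage& s, const std::string& key,
                                  std::chrono::milliseconds deadline) {
    assert(deadline.count() >= 0);
    const Reply r = s.fetch(key);
    if (r.elapsed > deadline) return std::nullopt;        // took too long
    const std::string prefix = "value:";
    if (r.payload.compare(0, prefix.size(), prefix) != 0)
        return std::nullopt;                              // bad output
    return r.payload.substr(prefix.size());
}
```

Because lookup only sees the Storage interface, the same tests can be run against the real subsystem later.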
You should also get someone to peer-review your design (and your tests) with the mindset of making your system fail. People with different backgrounds will have different experiences of what can go wrong; don't expect that you have thought of everything yourself.
By now you may be thinking "that's a lot of extra work". You're right, it is. But it's all necessary. You can scale back some of this depending on how much reliability you actually need, which is why it's essential that you understand your requirements. You also don't have to do all the work yourself. In fact, you probably shouldn't. You should get someone else to work on the test side of things.
By the way, one essential subsystem to modularize is the allocation of resources. You will find a lot of defects just by inserting a memory allocator that occasionally simulates out-of-resource conditions. There are tools to do this, but they don't seem to be portable to different operating systems.
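One portable way to get such an allocator is to write a small one yourself and plug it into the standard containers. The sketch below is only an assumption about how that might look; FaultPlan, FaultyAllocator and append_all are names invented for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical test-only knob: every fail_every-th allocation throws
// std::bad_alloc, simulating an out-of-memory condition.
struct FaultPlan {
    std::size_t fail_every = 0;   // 0 disables fault injection
    std::size_t calls = 0;
};

template <typename T>
struct FaultyAllocator {
    using value_type = T;
    FaultPlan* plan;

    explicit FaultyAllocator(FaultPlan* p) : plan(p) { assert(p != nullptr); }
    template <typename U>
    FaultyAllocator(const FaultyAllocator<U>& other) : plan(other.plan) {}

    T* allocate(std::size_t n) {
        ++plan->calls;
        if (plan->fail_every != 0 && plan->calls % plan->fail_every == 0)
            throw std::bad_alloc();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const FaultyAllocator<T>& a, const FaultyAllocator<U>& b) {
    return a.plan == b.plan;
}
template <typename T, typename U>
bool operator!=(const FaultyAllocator<T>& a, const FaultyAllocator<U>& b) {
    return !(a == b);
}

using TestVector = std::vector<int, FaultyAllocator<int>>;

// Code under test: must leave `out` untouched if memory runs out.
bool append_all(TestVector& out, const std::vector<int>& in) {
    try {
        TestVector copy = out;                           // work on a copy
        copy.insert(copy.end(), in.begin(), in.end());
        out.swap(copy);                                  // commit
        return true;
    } catch (const std::bad_alloc&) {
        return false;                                    // `out` is unchanged
    }
}
```

Setting fail_every to different values in a loop walks the failure through each allocation the code under test performs.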
If you focus on writing the code right the first time, then it will be more reliable. Try programming it as though the user is a homicidle maniac that knows where you live.
But seriously, use lint, Electric Fence, and all the tools you can to test the code.
Fight Spammers!
disturbed
please, that is clearly not a typo, but an acte manqué
You don't want it to be crash proof, you know in your heart of hearts it will crash.
You want to design it to keep working regardless of the failures you know will happen.
Assume there will be failures and design it to work anyway. That is the correct starting point. Then worry about the technical side of things and for that I leave it to others to contribute.
First advice: Use modularisation at its extreme. Second advice: After finishing your UML model, create a mathematical model too using automata such as finite state machines and formal methods or other software engineering techniques. If the model works well mathematically, there is good chance that the software conforming to the model will work ok too... in theory. In practice, I advise you to hire a good lawyer to write you a thorough disclaimer.
write a perl interpreter in C++, then have the perl program stored in a string within the source file. that way it is still in c++.
Snowden and Manning are heroes.
Both your reasons for staying with C++ are insane. There are plenty of other execution-efficient languages, and interfacing with libs is something every language pretty much has to be able to do.
Your other ideas also cut directly against simplicity and execution efficiency in a lot of cases, so dogmatically insisting on C++ is stupid.
That said, language is not your largest problem. Investigate provably correct software, less than perfect proofs, tools for evaluating code for correctness in automated manner, test first development, and lots of other things. And hire someone more experienced than yourself, you clearly don't have the background to pull it off.
-josh
For the distributedness, CORBA has always seemed like a good way to end up in a mess. Is it possible for you to split the computations on process borders, and maybe use ssh to distribute the work? At least for prototyping?
Gee! All that, and there's no modeling? Is everyone around here a cowboy programmer?
I have written applications that run on low memory footprint devices, offer complete CORBA Interfaces for CSAF-Configuration, Status, Alarms and Faults integration for a few dozen EMS.
I use ORBit2 for both the client and server side... it's almost natural once you know how.
I have a few pointers for stability:
1. A function should be of return type void if and only if the failure inside that function is non-critical.
2. Every possible function should have a return -- and failures should be trapped gracefully and all the way to the top.
3. try-catch-exceptions --- wrap a lot of your code to catch exceptions... and use these consistently.
4. And finally... do not trust any entity. Do not make any assumptions about things like a function succeeding, memory getting allocated etc. I call this a "failure assumption mode" where I trap failures first, then handle successes later.
Also, if possible go through some general linux kernel code. Though it is coded in c, and is a monolithic kernel, you can see a lot of graceful handling of failures.
Erlang is used in Ericsson's phone systems, is now FOSS, runs under Windows & Linux and is rock-solid, unlike C++. You can even buy support for Erlang if you so desire.
Actually, IIRC the Ariane software didn't attempt a restart, and it wasn't really a crash per se, although if it had attempted a restart, the result would have been the same. Anyway, as I recall, each control system (two redundant systems) was designed to assume a hardware failure and shut down upon receiving inputs that were outside a certain range, and thus leaving the second (working) system in charge. Except that they slapped an old system in from the Ariane IV and didn't bother to test it against the flight profile of the new rocket, so upon getting out of range inputs, both systems duly shut down as designed, thereby leaving nobody to drive the bus ;)
ABSURDITY, n.: A statement or belief manifestly inconsistent with one's own opinion.
DESIGN DESIGN DESIGN DESIGN
TEST TEST TEST TEST
Every function, every class, every file, EVERYTHING, needs to have automated tests associated with it. Keep in mind that it's really tough to write tests after the fact. You need to do it as soon as possible, and you need to enforce their use, preferably by automated means.
Use revision control, even if it's just you doing the coding, so you can add a post-commit hook on the server end to compile and run your tests, and raise some sort of red flag if anything fails.
One of the most common causes of crashes in C/C++ applications is quite obviously memory management (bad allocs/deallocs, overflows, etc.)
You can make this area much safer if you use STL tools as much as possible, and let strings and containers handle the memory allocation issues.
You gain a lot of time, more readable code, and in most cases faster code too, because the STL algorithms are so highly optimized (I've more or less given up trying to outsmart the STL with hand-crafted code...)
Reducing the number of explicit 'new' calls in the program is, in my experience a significant factor in stability.
MySQL, for example, is STL-based, and I think this is one of the reasons why it has been extremely stable since the beginning (at a time when it was not obvious to make this choice because of compiler issues).
Another positive aspect is that, because containers are so easy to use, you don't hesitate to do complicated data management tasks that might seem daunting when hand-coded. In this respect, C++ comes closer to interpreted languages, with, for example, ready-made associative arrays and automatic memory management.
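As a small illustration of the points above (the Sample type and the function names are made up for the example), here is an associative array of growable arrays with no explicit new or delete anywhere:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type for illustration.
struct Sample {
    std::string node;
    double value;
};

// Groups readings by node name. The map, the vectors and the strings
// manage all of their own memory.
std::map<std::string, std::vector<double>>
group_by_node(const std::vector<Sample>& samples) {
    std::map<std::string, std::vector<double>> grouped;
    for (const Sample& s : samples)
        grouped[s.node].push_back(s.value);
    return grouped;
}

// When a heap object really is needed, a smart pointer owns it, so
// there is still no matching delete to forget.
std::unique_ptr<Sample> make_sample(std::string node, double value) {
    assert(!node.empty());
    return std::make_unique<Sample>(Sample{std::move(node), value});
}
```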
JF
First, this is a discipline issue. Your language or tools won't make your code rock solid, only you can. Based on the wording of the question I doubt you are experienced enough to achieve your goal without a lot of self-education.
For example you say hiding of the abstractions like Java RMI is desirable. But what Java RMI hides is what makes it unsuitable. The fact is a remote method call could fail for a number of reasons and pretending a remote method call is just like a local method call won't help you write rock solid code. Use a messaging system that doesn't pretend things are simpler than they are.
Don't use cutting edge, buzzword-worthy, technology. A mature technology will be more robust and have fewer bugs.
Humble yourself some. Unless you are a programming god, which you aren't because you're asking slashdot for help, your code will have many more bugs than the mature libraries you use. I love how your decoupling statement is based on errors in "say, the distributed communication module" as opposed to your own code where it's much more likely.
Two tips:
One, use references not pointers. The compiler is very good at detecting wonky things when you're actually dealing with objects and not just pointers to objects.
Two, use STL wherever possible.
First and foremost, use good coding techniques. This means use exception handling where appropriate, use standard containers over hand-rolled data structures (prefer std::string over char arrays; this alone will help prevent almost all common string-based buffer overflows), and follow good style guidelines.
As for GUI programming, if you are strictly tied to C++, I would recommend Qt (www.trolltech.com); they have a fabulous API (takes getting used to, but it makes sense once you do). The nice part about Qt is that it is source portable to just about every major platform (X11, Win32, Mac).
It is possible to write reliable, fault-tolerant code in C++ (realize please that perfect code is impossible in any language); it just has to be well thought out and done right.
proxy
I had to find a bug that caused a crash approximately every 2 weeks. It was terrible to debug because we couldn't replicate it initially. I eventually wrote a system test that shunted about 2 weeks data through the system in an hour but using random rates and delays and I simulated comms failures by killing the test process and restarting randomly. I found the bug in a comms error handling routine using purify (you could use valgrind). It was a double-free that corrupted the stack and caused a crash somewhere else later on, sometimes. The customer account was saved.
The lesson, to me, is to write a test program which will simulate heavy but randomised use of your system and then to use this against the system whilst running it in a memory debugger.
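A toy version of such a randomised test, for illustration only: the BoundedFifo class stands in for "your system", and std::deque serves as a trusted reference model. The seed is a parameter so that a failing run can be replayed exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <random>
#include <vector>

// Stand-in "system under test": a fixed-capacity FIFO.
class BoundedFifo {
public:
    explicit BoundedFifo(std::size_t capacity) : buf_(capacity) {
        assert(capacity > 0);
    }
    bool push(int v) {
        if (size_ == buf_.size()) return false;          // full
        buf_[(head_ + size_) % buf_.size()] = v;
        ++size_;
        return true;
    }
    bool pop(int& out) {
        if (size_ == 0) return false;                    // empty
        out = buf_[head_];
        head_ = (head_ + 1) % buf_.size();
        --size_;
        return true;
    }
private:
    std::vector<int> buf_;
    std::size_t head_ = 0;
    std::size_t size_ = 0;
};

// Drives the FIFO with a seeded random mix of operations and compares
// every result against the model. Returns the number of mismatches.
std::size_t stress(unsigned seed, std::size_t steps) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> op(0, 2), val(-1000, 1000);
    const std::size_t cap = 8;
    BoundedFifo fifo(cap);
    std::deque<int> model;
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < steps; ++i) {
        if (op(rng) != 0) {                  // pushes outnumber pops, so "full" gets hit
            const int v = val(rng);
            const bool expect = model.size() < cap;
            if (expect) model.push_back(v);
            if (fifo.push(v) != expect) ++mismatches;
        } else {
            int got = 0;
            const bool expect = !model.empty();
            const bool ok = fifo.pop(got);
            if (ok != expect || (ok && got != model.front())) ++mismatches;
            if (expect) model.pop_front();
        }
    }
    return mismatches;
}
```

Running the resulting binary under valgrind or a similar memory debugger is what catches the double-free style of bug described above; the mismatch counter only catches wrong answers.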
In addition, I would suggest that you don't really need to split the program into several processes but would probably benefit from restarting the single process every day at some low usage period or by starting a new process and handing off to it from the old one. This idea always invites derision from my colleagues but they do make just as many mistakes as me and most of these mistakes wouldn't result in costly support time and patch creation if the software was restarted often.
Regards,
TIm
This is all just my personal opinion.
I take it by 'stable' you mean 'reliable' rather than 'someone's life actually depends on it'??
If someone's life depends on it, for Christ's sake don't use any of the high level languages; that's what languages like Ada were written for - but it'll mean a long development cycle and serious cash.
If you just want cast iron reliability (banking sector reliability) the best advice is to get back to basics with the development - don't try to do it on the cheap, because it'll fail - don't let anyone you don't trust do it. It will go over budget and it will take longer than planned but use a proper development model (Booch, for example) and proper integrity testing (Z for example) and be prepared for a long haul.
I really think you had better qualify this. IMO, assertion failures do not *cause* problems; they are messengers, and the message is always this: "Your program is broken."
I don't think you want to *recover* from a broken state. I think you want to debug it -- find out what went wrong, fix the code, recompile, test, and re-deploy.
Because, if you get to the point where an assertion fails, it means the state of the program is corrupted, and therefore you can't trust any part of it; e.g., you can't trust error-recovery code to be well-behaved. The best you can do is bring everything to a halt and fix the bug.
There are rare exceptions (no pun intended) to this rule, but for the most part, if you write out a condition and say, "if this is false, then the program has a bug", then you have some explaining to do if you *don't* want to use an assertion.
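To illustrate the distinction with a sketch of my own (the function and class names are invented): bad input from outside is an expected error and gets reported, while a broken internal invariant means the program has a bug and gets an assertion.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <vector>

// Expected error: the caller passed bad data. Report it and let the
// caller decide how to recover.
double ratio(int numerator, int denominator) {
    if (denominator == 0)
        throw std::invalid_argument("denominator must be non-zero");
    return static_cast<double>(numerator) / denominator;
}

// Invariant: data_ is always sorted. If that ever stops being true,
// this class itself is broken, and no recovery code can be trusted.
class SortedInts {
public:
    void insert(int v) {
        data_.insert(std::lower_bound(data_.begin(), data_.end(), v), v);
        assert(std::is_sorted(data_.begin(), data_.end()));  // "if false, we have a bug"
    }
    int smallest() const {
        assert(!data_.empty());   // calling this on an empty set is a caller bug
        return data_.front();
    }
private:
    std::vector<int> data_;
};
```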
I don't see one post rated above 2 that isn't a language flame.
How about the same it's always been.
1. Design it well.
2. Hire good people (yes this rules out outsourcing of any kind, they don't give a damn about your code, only your money).
3. Test it.
4. Test it again.
Language doesn't matter, testing does. But testing costs money, and that's bad, so just use Java, no one EVER makes an error in Java - or Python/.Net/Ruby or whatever the hype of the day is when you read this.
Good grief.
- Adam L. Beberg - The Cosm Project - http://www.mithral.com/
By the way: to get good formatting for code, use the "Code" format (see the drop-down menu next to the "preview" and "submit" buttons).
The suggestions I have seen here so far seem to boil down to "Don't do it that way". Sometimes that's not possible. If it truly has to be C++, and it truly has to be as fast as possible and as bug free as possible, there are a few guidelines that can help:
1. Unless the GUI will be I/O bound, and that's unlikely, try to write it in a safer language that has better GUI support.
2. Make all your classes small and simple, and create test harnesses that are as complete as possible. Try to make the classes simple enough that they can be individually tested in such a way that all code paths are exercised.
3. Check your arguments. This includes checking for invalid combinations, and arguments that are invalid given the state of the object.
4. Don't use new or pointers directly. If there may be multiple references to an object, then reference count it and create handle classes that hold the references so all instantiation is controlled, and all destruction is implicit. Make these handles STL compatible, and never pass around pointers to them.
5. Try to design the application to fail fast and recover from failure. For example, maintain the state of work being done in discrete transactions that can be aborted if a failure is detected. This can be on disk or in memory depending on your performance needs. This could be combined with the ability to restart the app in a new process and have it pick up where the last one left off.
6. Have the app keep track of its memory usage, and be prepared to recover from memory leaks, possibly by restarting as in item 5.
7. If the compiler you're using supports structured exceptions, then use them. They can degrade performance a bit, but they can also enable you to recover from NULL pointer exceptions.
8. If you have multiple threads, then to avoid both the performance hit from context switches and the chance of deadlocks, don't let them access the same data directly. Instead, have them communicate through lock free queue structures. That way, all your main threads can pretty much spin freely. Spawn worker threads for any I/O or other operations that can block. A context switch can take as much time as thousands of instructions. You want to use as much of every time slice as possible.
9. Keep the number of main threads down to the number of CPUs or less. That way, except for the times when the CPU is being used by the OS or other processes, (should be relatively rare) each non blocked thread gets its own CPU.
10. Have an experienced QA team, that understands their job goes beyond unit testing.
Now here's a few that are always important, but for what you want to do, they become critical.
11. Have the design laid out at least roughly before you start.
12. If at all possible, don't let requirements change in midstream.
13. Overestimate the time it will take very generously. You will probably still be crunched.
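For item 8, a lock free queue can be quite small if you restrict it to exactly one producer thread and one consumer thread. The sketch below is one common way to do that with a ring buffer and two atomic indices; SpscQueue is a name invented here, and it assumes T is default-constructible and copyable:

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>

// Single-producer, single-consumer queue. Only one thread may call
// try_push and only one (other) thread may call try_pop.
// One slot is kept free, so it holds at most Capacity - 1 items.
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity >= 2, "need room for at least one item");
public:
    // Producer side. Returns false instead of blocking when full.
    bool try_push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire))
            return false;                                 // full
        slots_[head] = value;
        head_.store(next, std::memory_order_release);     // publish the item
        return true;
    }

    // Consumer side. Returns std::nullopt instead of blocking when empty.
    std::optional<T> try_pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire))
            return std::nullopt;                          // empty
        T value = slots_[tail];
        tail_.store((tail + 1) % Capacity, std::memory_order_release);
        return value;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};   // next slot to write (producer)
    std::atomic<std::size_t> tail_{0};   // next slot to read (consumer)
};
```

Neither side ever waits on the other: a full or empty queue is reported to the caller, who decides what to do next.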
pornking
I'd say to design a crash-free program you'd want to go back to the starting line. Deciding what language to implement a feature, module, etc. in should really be a final step in designing a program. You can design virtually every aspect of the program without implementation language ever coming up.
It's much easier to understand if you step back a bit, think of programming languages as tools. Would you rush up to an architect and say "Alright, I've got this idea for a building but before we get started let me just say it must be done using only a ball-peen hammer and a flat-head screw driver."
Separate design and implementation, decide how your program works, what it does, what it looks like... The architect does not worry about what tool the construction workers will use to accomplish each task. When you have an idea for a program, why would you worry about which language each module or component is written in? If it accomplishes your task and meets your performance and stability requirements, isn't language irrelevant?
If an architect arrives at a building site, and the structure meets his specifications, why does he care what kind of hammer the construction crew used? For all the architect should care, the construction workers could have driven the nails in with the palm of their hand.
Decoupling, design patterns, data structures, GUI, persistent object storage, Linux, portable.... you've tossed in just about every trendy technical jargon a person could say in one breath. Without a deeper understanding of the construction of software, you are paying only lip service to any of these concepts.
Write some wrappers for haskell and then use haskell's fully deterministic type-safe behaviour in the right way.
Then logically *prove* that your deterministic code does what it should.
Now you only have to worry about the non-deterministic part.
So keep it as small as possible under any circumstances!!!
Then be sure to modularize your code in a way that allows runtime replacement (of parts where you forgot some exceptional possibility, by a fully proven and unit-tested new module) for the case of some problem that *WILL* happen (but will not be a problem if you work that way)!
If you can't use haskell in *any* way, try to create fully deterministic and type-safe (think *micromanagement*) code WITHOUT state-dependent objects for everything where you would use haskell.
I've some pretty advanced design patterns here, but i can't draw them right in here....
Any sufficiently advanced intelligence is indistinguishable from stupidity.
Provides very good decoupling for anything that can be abstracted to a request-response model.
XML is used extensively (if not exclusively) for data formats, and HTTP is used as a network protocol, keeping these to known standards keeps it simple.
I recently used gSOAP (http://gsoap.sourceforge.net/) to implement some web services into a C++ API and it worked flawlessly.
I've done 2 summers' worth of internships at a software development company that specializes in various safety-critical applications. The kind of question you're asking must obviously come up when you're talking about software that keeps a plane in the sky, or life-saving medical equipment functional. Even though nobody's death seems imminent if your program crashes, the same kind of approach could be used I suspect. I think the key is really a well defined process. I'm sure there are plenty of good books written about the subject, but the long and the short of it is this in my opinion: Detailed requirements that all trace directly to code (and there is no code that doesn't trace back to a requirement), and testing that validates each and every requirement. As for the modular approach you described, the projects I worked on were generally embedded and used operating systems designed with the sort of no-nonsense approach you talked about. Maybe considering the concepts used in these OS's would yield some ideas for your own application?
I'm surprised this didn't come up in any of the other discussions. If the application requires stability, then you will need time. lots of it. it doesn't matter how many developers you have, it's going to take time to cook. if the powers that be can't give you 2-3x over what the nominal development cycle would be, then just forget it outright (or maybe stability isn't all that important?). enough time and you could possibly employ the write-it-twice strategy (which i've never known to succeed in practice)
aside from using a better environment (sounds like erlang might be perfect for you), the only other thing i could suggest that i haven't seen here would be reviews. long boring group sessions with a projector. 1 on 1 peer reviews. print the thing out and take it to the bar with you and make sure that it's obviously correct before you even bother having someone else go over it.
your suggestion about restarting failed components is really looking at the problem from the wrong angle i think. what about correctness? build it properly in the first place
No matter how you code, there will be bugs. Aside from all the good advice others have given here, what you also need is a human QA. And I don't mean two half-assed testers. I mean hiring a bunch of people with evil glances in their eyes who could code the stuff themselves and pay them on a number-of-bugs-found basis. Give them every incentive to be as mean to your code as they can possibly be. You and the other coders will hate them, but they'll find anything that can be found in testing if you can make "I found a bug" something they can be proud of.
Assorted stuff I do sometimes: Lemuria.org
when you need him.. so many experts, so few answers.
The less code there is, the fewer bugs you should have. This sounds like at least two or three programs, a GUI/controller and the nodes. Try to avoid threading, or any situation where there might be unexpected timing related interactions. And you make it clear that C++ is unavoidable, but does it all have to be C++? Or just the parts that need to be fast? The GUI could be managed. The node could be split into a managed program that starts an unmanaged program with a set of parameters and reads the results. Even if both parts were unmanaged, you'll still have a system where a single crash wouldn't crash the node entirely, but likely just the worker half of the node, which can be restarted by the monitoring half. And you mitigate a lot of memory leaks if your worker processes get restarted from time to time.
As far as how to handle crash recovery, whether the program should fight tooth and nail to overcome the error, ignore (but log) the error, crash and restart upon error, or crash until it's restarted manually, it really depends on the severity of possible errors, and the consequences of failure vs incorrect behaviour. Is this a program that you'll be unable to patch in a timely fashion if it breaks? Obviously you'll want to run a lot of stress tests before the thing goes live, and consider how it'll respond to bad data.
I have a credit card processing proxy (about 3 pages of javascript) in place to protect us from the outright dangerous credit card handling in our ERP system. The proxy just sits in the middle and blocks the unwanted transactions, returning fake replies to make the ERP system happy. Apparently our use of the ERP system is a little atypical, like sometimes shipping orders and charging cards on the same day the order was taken. Because its job is so important, if there is an error, I'd rather have it crash than take a chance that it might do something really bad. Most errors crash the proxy outright. The proxy is run by another script that mostly just sends me a message when it crashes. I like to live just a little dangerously, so the proxy gets restarted once after its first crash. But if it crashes a second time, I must restart it manually.
On the other hand, I sometimes write code that ignores most unexpected errors, so long as it's worse to completely fail than to partly fail, and it's not doing anything especially important or dangerous. For important things, it's often better to fail than to not fail when it should.
Perhaps I misunderstood, but it seems that you're advocating the use of error codes instead of exception throwing/catching.
Quoth Herb Sutter and Andrei Alexandrescu (C++ Coding Standards, item 74): "Report errors at the point they are detected and identified as errors. Handle or translate each error at the nearest level that can do it correctly."
And then there's item 72, which is pretty lengthy and contains lots of references:
"Prefer to use exceptions to report errors.
"Summary
"Prefer using exceptions over error codes to report errors. Use status codes (e.g., return codes, errno) for errors when exceptions cannot be used (see Item 63), and for conditions that are not errors. Use other methods, such as graceful or ungraceful termination, when recovery is not required or is not possible."
(Item 63 basically says, "don't let exceptions propagate beyond the boundaries between your code and the code you don't control.")
Given that these experts have written at length about error handling, and that they suggest a preference for exceptions when dealing with [non-bug] errors from which you can recover, I think you need to talk more about the kinds of cases where you think exceptions are inappropriate (and give some complete code samples).
C++ has been accused of containing some useless features, but I don't think anyone of note has said that about exceptions.
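A minimal sketch of how the quoted guidance might look in practice; the function names and status codes are invented for the example. Internal code reports errors with exceptions, one layer translates them into a more meaningful error, and the boundary to code you don't control converts everything into a status code so that nothing propagates out:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Internal code: errors are reported with exceptions.
int parse_port(const std::string& text) {
    std::size_t used = 0;
    int port = 0;
    try {
        port = std::stoi(text, &used);
    } catch (const std::exception&) {
        // Translate the low-level error at the nearest level that can
        // describe it properly.
        throw std::invalid_argument("port is not a number: " + text);
    }
    if (used != text.size() || port < 1 || port > 65535)
        throw std::invalid_argument("port malformed or out of range: " + text);
    return port;
}

// Hypothetical status codes for the module boundary.
enum PortStatus { PORT_OK = 0, PORT_BAD_INPUT = 1, PORT_INTERNAL_ERROR = 2 };

// Boundary function: callable from code we don't control, so every
// exception is turned into a status code here.
extern "C" int get_port(const char* text, int* out) noexcept {
    if (text == nullptr || out == nullptr) return PORT_BAD_INPUT;
    try {
        *out = parse_port(text);
        return PORT_OK;
    } catch (const std::invalid_argument&) {
        return PORT_BAD_INPUT;
    } catch (...) {
        return PORT_INTERNAL_ERROR;   // e.g. std::bad_alloc
    }
}
```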
Please reconsider the C++ requirement. Managed environments like Java and .Net aren't slow, and can call existing C++ libraries. You can even compile Java / .Net so that it runs as fast as C++. Also, don't forget that it's cheaper to buy a faster computer than to write a faster program.
No, I will not work for your startup
Don't make the mistake of putting all the testing and verifying on the developers. Nobody's perfect. Even experienced developers who practice Test Driven Development (TDD) will release software with the most obvious bugs. The fact is, the person writing the code is the worst person to test it. The developer may have assumed that certain algorithms work without actually testing them, due to time constraints or, worse, ego. If you have a good software quality department, they can come up with a thorough test plan and focus on testing the corner cases that would cause instability.
Ok, but note that some C++ Standard Library functions will throw if they fail. So when you call these functions, you have to think about exception safety, but you can also assume that they have succeeded on the lines that immediately follow the call.
(Implementations vary though, so you'll need to check up on this for your compiler. E.g., with some older compilers, operator new returns NULL upon failure, whereas the Standard says it must throw.)
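For reference, the two styles look like this on a conforming compiler (the function names are made up). Plain new throws std::bad_alloc on failure, so the line after it can assume success; new (std::nothrow) is the standard way to ask for the null-returning behaviour explicitly:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Style 1: rely on the standard behaviour. Failure arrives as an
// exception, so code after the allocation may assume it succeeded.
bool try_reserve_throwing(std::size_t bytes) {
    try {
        std::unique_ptr<char[]> block(new char[bytes]);
        return true;
    } catch (const std::bad_alloc&) {
        return false;
    }
}

// Style 2: request the null-returning form and test the pointer.
bool try_reserve_nothrow(std::size_t bytes) {
    std::unique_ptr<char[]> block(new (std::nothrow) char[bytes]);
    return block != nullptr;
}
```

Mixing the two by accident (testing for null after a plain new, or not testing after the nothrow form) is the mistake to watch for.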
Rather than investing too much time and effort in creating a complicated crash-free program, just make sure your application can recover from a crash, and then use a process management application that restarts the program on its node when it is detected to not be running properly.
It's simple to write a 100% correct program that checks the health of your main application and restarts it when it isn't responding.
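A bare-bones sketch of such a restarter, assuming POSIX (fork, waitpid). The names supervise, flaky and hopeless are invented, and the worker is a plain function here where a real one would exec the actual program. It only notices a process that dies or exits with an error; detecting one that hangs would need a heartbeat on top of this.

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical worker signature: receives the 0-based attempt number
// and returns an exit code, where 0 means "finished cleanly".
using Worker = int (*)(int attempt);

// Runs the worker in a child process and relaunches it if it crashes
// or exits non-zero, up to max_restarts times. Returns the number of
// launches needed for a clean exit, or -1 if it never succeeded.
int supervise(Worker worker, int max_restarts) {
    assert(worker != nullptr && max_restarts >= 0);
    for (int attempt = 0; attempt <= max_restarts; ++attempt) {
        const pid_t pid = fork();
        if (pid < 0) return -1;                  // cannot even start a child
        if (pid == 0) _exit(worker(attempt));    // child: do the real work

        int status = 0;
        if (waitpid(pid, &status, 0) != pid) return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return attempt + 1;                  // clean exit
        // Otherwise the child was killed by a signal or reported an
        // error; a real supervisor would log that before relaunching.
    }
    return -1;                                   // gave up; needs a human
}

// Stand-in workers for illustration.
int flaky(int attempt) { if (attempt == 0) std::abort(); return 0; }
int hopeless(int) { return 1; }
```

The restart limit matters: a program that crashes on every launch should end up in front of a person, not in an endless loop.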
http://pcblues.com - Digits and Wood
You mentioned possibly running this in parallel. If you're going to have bugs, it will be there. Especially if you want to have this be cross-platform, abstract out your inter-node communication so that you have a choice for what method (ie MPI, raw sockets, etc) you're using to communicate between computers, and put the network error detection and revival code in there. Design the program as you go with parallel execution in mind to save pain later. Sadly, this all is not a simple problem.
I partially agree.
If your code is unstable in a way that memory leaks and segmentation faults are not just a "remote possibility" but a recurring event, even if only rarely, then any safeguards you implement won't be very successful unless you first fix the code that causes the errors. (Disclaimer: There is no perfect code. Even if there were no bugs in the code, the program still has the "remote possibility" of crashing due to errors in the hardware / OS.)
That said, garbage collection or not is a different discussion. Some say it is bad and breeds lazy programmers, while others argue (I amongst them) that it is a terrific tool for designers, since it almost eliminates the occurrence of memory leaks (unless you do some really bad programming) and it might even speed up your program.
+++ MELON MELON MELON +++ Out of Cheese Error +++ redo from start +++
Of course it's more of a question of most C/C++ compilers lacking an option "force-checking-of-all-return-values" than anything else. Exceptions have their own set of problems, though. Try googling for "strong exception guarantee".
It's good to be explicit about some things, but don't be inane.
For example, "int" is *always* "signed int" -- this is in the language definitions. So it's a waste of time to spell it out. Note the signed-ness of plain 'char' is implementation defined, so you could legitimately be explicit about that. But note again that a lot of Standard Library functions use plain 'char', so you have to be careful that your explicitness doesn't break things.
It does help to be explicit about, say, constructors (which is why the creators of C++ added a keyword just for that purpose). E.g.:
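struct A
{
    explicit A();            // default constructor
    explicit A( const A& );  // note: also rules out "A b = a;" and passing an existing A by value
    explicit A( int );       // blocks the implicit conversion int -> A
};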
In short: be as explicit as you need to be (such that any human reader could draw the same conclusions), but no more. E.g., the parentheses in "a + (b * c)" are superfluous because everyone knows (or can quickly find out) that binary '*' has higher precedence. But then again, with lots of sub-expressions, it can help to add parentheses, line breaks, and indentation at certain points to ease readability. There's no hard rule for it; you just have to feel things out.
- Institute coding guidelines - code must be easy to read. Minor variations in coding flavour shouldn't be covered here, only the large scope.
- Avoid if...else if...else if...else if... - constructs. They are especially hard to follow, and can often be replaced by switch/case-statements.
- Require large modularization of code. - No function/method should be more than about 40 to 100 lines, the fewer the better, but don't be too rigid here - some functions/methods are better off being straight-on than modularized.
- Code for each case-statement in a switch should be a call to a method/function that encapsulates actions and declarations. (not always possible, but if the code exceeds 4-5 lines a function/method should be considered).
- Don't nest a switch/case inside a switch/case - do the nested switch in a called function/method.
- All code must be reviewed.
- Test cases for each module, which requires writing a bunch of test code that can be used for regression testing changes.
- Don't allow compiler warnings. (-Wall shall be used if using gcc, possibly also other options)
- Declare your own types to manage code neatly.
- It's better to write code cleanly than to write it in the most compact manner unless it's a real performance issue.
- Document each module to describe why it is and what it is doing.
- Place each class in its own file - like Java.
- Be sure to keep as much as possible 'private' and only relax to 'protected' or 'public' when needed.
- Variables in classes should have get/set methods and shouldn't be accessed directly unless there is a performance issue. Set methods can then validate input data and reject it or throw an exception on bad data.
- If something can't be resolved without a compiler warning - think again, and if it is still not possible - document the reason before the code review.
- Run the code under leak and memory access analysis tools like Valgrind and/or PurifyPlus, preferably during both unit testing and system testing.
- Be paranoid. Check for 'null' results and do detailed try/catch blocks instead of a try/catch over a large block. Using detailed checks allows you to take appropriate action on detail errors.
- Institute an extensive beta-testing program.
- Inline-declare function/methods that are broken out if they should be inline with the code for performance reasons.
Notice that PurifyPlus is a package that not only detects memory leaks and invalid memory accesses, but can also do performance analysis of the code with Quantify and check that you have actually tested all the code with its PureCoverage component. This lets you focus your performance work on the places where it really matters. For code written in C you can use Splint.
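To illustrate the point in the list above about set methods that validate their input, here is a small sketch. The class Sensor and its 1..1000 Hz limit are invented for the example and are not from any real code base.

```cpp
#include <cassert>
#include <stdexcept>

class Sensor {
public:
    int sampleRateHz() const {
        assert(sampleRateHz_ >= 1 && sampleRateHz_ <= 1000);  // class invariant
        return sampleRateHz_;
    }

    // The setter is the only way to change the value, so bad data is
    // rejected here instead of surfacing later somewhere else.
    void setSampleRateHz(int hz) {
        if (hz < 1 || hz > 1000)
            throw std::out_of_range("sample rate must be in 1..1000 Hz");
        sampleRateHz_ = hz;
    }

private:
    int sampleRateHz_ = 1;   // private: cannot be modified directly
};
```

Because the member is private, a rejected value leaves the object exactly as it was before the call.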
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
Using the good-old Ada programming language (for which a new standard should be issued this year), you can go closer to what you're looking for (though I'm not sure your goal is realistic).
Here's a pointer to the new standard : http://adaic.org/standards/05rm/html/RM-TTL.html
With Ada:
It probably won't solve all of your problems, but it might help.
For a free, quite strong Ada compiler, have a look at https://libre2.adacore.com/ (it's based on GCC).
Oh yes, Ada is a statically, strongly, strictly typed language (e.g. the compiler won't let you assign an integer to a float variable). My opinion is that it's a Good Thing for critical programs. Useless to restart a "type war" on this subject ;-)
Good luck.
Note that because C++ allows automatic destruction, having a single exit point explicitly represented in the program source code is not terribly important, and sticking hard-and-fast to this rule may lead to some unnecessary performance hits on account of extra calls to class assignment operators, extra class object temporaries, etc.
An explicit "single exit" rule makes more sense in a language like C, where there is no language feature that enables simple automatic clean-up.
Tools such as Purify and Valgrind (dynamic analysis) and Coverity (static analysis) should be used along with all the other methods.
I agree with item 10, but it should be stated earlier and more often. Bugs in the analysis of requirements and in the design can literally cost years of lost time -- and I'm not counting man-years. If the design is correct, then so many other decisions just naturally fall into place. If it's flawed, you'll spend a lot more time than expected on development. If it's deeply incorrect, you'll be in a world of hurt.
If you're a reasonably competent programmer, you won't have crashing issues with pointers in C++. They're not that hard to deal with, unless you're doing some weird kung-fu with templates or function pointers. Even then, crash-avoidance is not the difficult part. In your case the hard part is going to be synchronization between different things that are running concurrently, and those problems have nothing to do with the language you use. In fact, since C++ is a system programming language you probably have more options available to help with synchronization than you would in other languages.
I already (successfully) built a project that sounds just like this. I currently have a 500Mb database with 2500 data sets, to which the clients are adding another 0.5Mb/day. In my system, it's not so much about the size of the database but rather the time needed to calculate the data.
:) OSX.
Here's what I did:
1) GUI
Server side web interface (python, apache, mysql, php, svg)
2) Core data structures, a persistent object storage mechanism
MySQL + phpMyAdmin
3) a distributed communication module and several core algorithms.
Python + urllib (communicates with the server through CGI). I also use the Psyco Python JIT compiler for speed.
If your algos are complex enough you could also look into Numerical Python. Or, if Python is too slow, you can write just your algos in C/C++ as a Python extension module.
The remote clients connect through server-side Python CGI scripts over an SSL connection. I "pickle" and compress the datasets in both directions for transport across the network.
If you need a client GUI then look into wxPython.
My system works like a champ! And the simple reason is that I intentionally sought out mature, proven tools that, when combined, left me with only the task of gluing them together and adding the application-specific code. The result -- both the client and the server are naturally portable across Linux, Windows and (natively
...do your own homework!
/ Per
Note that the run-time cost of copying a class object may be such that it makes more sense to pass a copy instead of a const ref (and incur the cost of constantly "dereferencing" that reference).
It sounds like all your modules need to be separate processes, and then you need a central module which handles restarting crashed modules and handles and logs all communications between processes. When a crashed module restarts, it needs to query the central module to retrieve a log of the prior messages to reconstruct its state. This obviously requires careful design of the protocols between the modules - the protocols need to be either non-state-dependent or you need to send a full state dump periodically to allow the central module to dump old state info.
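A minimal in-process sketch of the log-and-replay idea described above. Everything here (Message, Counters, CentralLog) is hypothetical; in a real system the central module and the workers would be separate processes and the log would live on disk.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Message {
    std::string key;
    int delta;
};

// A worker module whose whole state is a running total per key.
class Counters {
public:
    void apply(const Message& m) { totals_[m.key] += m.delta; }
    int total(const std::string& key) const {
        std::map<std::string, int>::const_iterator it = totals_.find(key);
        return it == totals_.end() ? 0 : it->second;
    }
    // Full state dump, expressed as messages so it can replace the log.
    std::vector<Message> dump() const {
        std::vector<Message> out;
        for (const auto& entry : totals_)
            out.push_back(Message{entry.first, entry.second});
        return out;
    }
private:
    std::map<std::string, int> totals_;
};

// The central module: logs every message before forwarding it.
class CentralLog {
public:
    void deliver(Counters& module, const Message& m) {
        log_.push_back(m);
        module.apply(m);
    }
    // A restarted module asks for the history to rebuild its state.
    void replayInto(Counters& fresh) const {
        for (const Message& m : log_) fresh.apply(m);
    }
    // A periodic state dump lets the central module drop old entries.
    void compact(const Counters& module) { log_ = module.dump(); }
    std::size_t size() const { return log_.size(); }
private:
    std::vector<Message> log_;
};
```

Replay only reconstructs the state correctly because apply depends on nothing but the messages themselves, which is the "non-state-dependent protocol" requirement in practice.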
Oh, come on, stop making yourself miserable !
About 10 years ago I had a project at university, where I implemented a "safe pointer" C++ template. The quite complex program (no GUI, portable) was for optimisation; it created and deleted a lot of C++ objects.
Every class in the project was derived from one base class, which gave it the ability to be a "safe object". All pointers were in fact objects, generated from the template and the class the pointer pointed to. New objects were created as usual with "new", but there were no "delete" commands. The objects were deleted automatically when the safe pointers were deleted or overwritten. When more than one safe pointer was pointing to an object, the object was deleted when the last safe pointer was deleted or overwritten.
Based on these safe pointers I wrote some list and tree classes. I never had any problems with memory leaks or uninitialized pointers. Yes, I think it is possible to write big and safe C++ programs, as long as you control the pointer problem.
It's simply impossible to forget to delete with a shared_ptr... No matter how simple you make it to remember, you WILL forget to delete one of your pointers at some point (or at several points). Any time you need to manually insert code to do something, you're just increasing the number of things that can go wrong. Better just to be safe and always use smart pointers.
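The "deleted when the last pointer goes away" behaviour can be seen in a few lines. This sketch uses std::shared_ptr from C++11 rather than the Boost or hand-written versions discussed above; Tracked is a made-up class that just counts its live instances.

```cpp
#include <cassert>
#include <memory>

struct Tracked {
    static int alive;                       // number of live objects
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
};
int Tracked::alive = 0;

// Returns true if the object outlived its first owner and was destroyed
// when the last owner let go. There is no delete anywhere in this code.
inline bool lastOwnerDeletes() {
    std::shared_ptr<Tracked> first = std::make_shared<Tracked>();
    std::shared_ptr<Tracked> second = first;      // two owners now
    assert(first.use_count() == 2);
    first.reset();                                // one owner gone
    const bool survived = (Tracked::alive == 1);
    second.reset();                               // last owner gone: destroyed here
    return survived && Tracked::alive == 0;
}
```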
HAND.
Take a look how Postfix is programmed http://www.postfix.org/OVERVIEW.html
While there's more to this, what strategies should a developer take to insure that the resulting program is as crash-free as possible?
...
Let me code it
Unfortunately using refs can obscure the meaning of your code, specifically when used for "out" parameters -- ignoring the obvious const-ref usage. I've found that always using pointers for "out" parameters helps make it more obvious at the call site that something is being altered by a function/method call.
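A small sketch of the difference at the call site. Both functions are hypothetical and do the same thing; only the way the "out" parameter is passed differs.

```cpp
#include <cassert>

// Out-parameter passed by pointer: the caller has to write &quotient,
// which makes the modification visible where the call is made.
inline bool divide(int numerator, int denominator, int* quotient) {
    assert(quotient != nullptr);
    if (denominator == 0) return false;
    *quotient = numerator / denominator;
    return true;
}

// Same function with a reference: the call divideRef(a, b, q) gives the
// reader no hint that q is about to change.
inline bool divideRef(int numerator, int denominator, int& quotient) {
    if (denominator == 0) return false;
    quotient = numerator / denominator;
    return true;
}
```

The price of the pointer version is that a null pointer becomes possible, so it needs a check that the reference version does not.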
HAND.
Defensive programming might be necessary for the kind of stuff you want to make.
:/)
However, there are also a lot of good practices when it comes to coding in C++ that will help you have better, more trustworthy code. Sorry in advance if these seem obvious.
It is of course merely my opinion, and some of this advice may be wrong or horribly wrong.
1. If you need to do something that the standard C++ library can do, use the standard library. Don't implement your own containers unless you really, really need to.
2. Never use the stuff from the standard library that was inherited from C. Examples:
Use stringstream instead of sprintf.
Use file streams instead of fopen and the like.
Never use any XXXcpy function (memcpy, strcpy, etc.). If you need to do it, think twice, and then think again. It shouldn't be necessary in most cases.
3. Never do pSomething = new Someclass if you need the instance only in the scope where it is defined.
Use Someclass Something, and you won't need to put a delete pSomething in every place where you can leave the function.
4. Pointer arithmetic is bad. You should almost never need to add an integer to a pointer, or to take the difference between two pointers.
5. Casts are dangerous. Always use C++ style casts, like static_cast. C style casts should be avoided. reinterpret_cast or const_cast should be a very, very rare occurrence, so think twice if you think you need one.
6. Respect constness everywhere it is needed. And respect it properly, not by using mutable or const_cast liberally (unlike what some people did in the codebase I'm maintaining at work).
7. If you have a boolean, use bool. Not int or whatever else.
8. Use smart pointers. With reference counting for objects with shared ownership, or just to automatically delete stuff that goes out of scope.
Then you can write SmartPointer = new blabla and don't need to explicitly delete. This simplifies the code in functions because you don't have to delete in every place you leave the function, and in classes because you cannot forget to add the delete in the destructor.
And it's the only proper way to work with exceptions.
If you want to be strict, block the ampersand operator in classes that you want people to access only through smart pointers, and also forbid constructing or setting a smart pointer from a reference (to prevent errors like making a smart pointer point to an instance on the stack)
9. Beware of large classes and large functions. If you want to be able to trust that the code works, you need classes and functions that perform a single, well-defined task. You can then easily verify that they do what they're supposed to do and not more or less.
When you have a class or function with a vague name, a vague purpose and that spans pages of code, you cannot easily audit it to check that it does what it's supposed to do.
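As an illustration of item 2 in the list above, here is a sketch of formatting with a string stream instead of sprintf. The function describe is made up for the example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// With sprintf the caller must guess a buffer size and match each format
// specifier to its argument type by hand. The stream version grows as
// needed and picks the right formatting from the argument types.
inline std::string describe(const std::string& name, int count) {
    assert(count >= 0);
    std::ostringstream out;
    out << name << ": " << count << " item(s)";
    return out.str();
}
```

For instance, describe("queue", 3) yields the string "queue: 3 item(s)", however long the name is.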
To make s/w reliable you have to assume that it isn't. Although NASA lost the plot decades ago, it's worth looking at how IBM engineers built systems that could survive the failures of these chimps. IBMers assumed that everything could go wrong. There was great pride in the team when they found a stupidly obscure and massively unlikely bug. There was no notion of "having to decide whether you were an engineer or a manager". IBM stuff managed to survive things like motherboards being melted in flight, with microgee solder balls shorting out the rest of the system. Even in Challenger the IT was working as it hit the water on the way down.
That culture is far more important than any single technique. Implicit is the notion that no one can really check their own work. You need someone who takes a positive joy in pointing out your errors. That needs to be backed up by management, who must not only avoid treating bug finders as troublemakers, but positively reward them, and yes, that means cash. I think we can already see why so much commercial s/w has stupid levels of bugs...
If you're serious then I'd suggest voting as an architecture: multiple modules take each decision. To do this properly, each needs to have different code. The gold standard is separate code, each in a different language, written by a different programmer. The multiple-language precept is there because two programmers in (say) C++ may actually make the same error from the same specification. A classic example is "between one and ten items". Different people may interpret that as including 1 and 10, or excluding them, or, in the case of some C++ developers, as 0..9. VB programmers "know" that Dim q(10) actually has 11 elements in it. Voting puts a bound on the effect of a failure.
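A toy sketch of majority voting over the "between one and ten" example. All three versions are written in C++ here for brevity, which falls short of the different-language gold standard described above; the names and the deliberately wrong third version are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

typedef std::function<bool(int)> RangeCheck;

// Three independently written answers to "is n between one and ten?".
inline bool checkA(int n) { return n >= 1 && n <= 10; }
inline bool checkB(int n) { return !(n < 1 || n > 10); }
inline bool checkC(int n) { return n > 1 && n < 10; }   // excludes the ends

// Runs every version and stores the majority answer in *winner.
// Returns false when no answer has a strict majority.
inline bool vote(const std::vector<RangeCheck>& versions, int input, bool* winner) {
    assert(winner != nullptr);
    std::map<bool, std::size_t> tally;
    for (const RangeCheck& v : versions) ++tally[v(input)];
    for (const auto& entry : tally) {
        if (entry.second * 2 > versions.size()) {
            *winner = entry.first;
            return true;
        }
    }
    return false;
}
```

With all three versions voting on the input 10, the two inclusive versions outvote the one that excludes the end points, so the disagreement is contained instead of propagating.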
Dominic Connor,Quant Headhunter
It's too bad that none of the other posts I read here contained any information. That said, for my $0.02, exception-handling is one of the best mechanisms you can use to make your system reliable.
Good luck....
I'm probably not on the same level as some of the people here, but here are a few strategies I've learned along the way:
1. Exception handlers - bracket every functional unit with exception handlers to catch any situation that hasn't been provided for explicitly. It's worthwhile thinking hard about what to do when an exception is caught though. For the sake of code maintainability you need to figure these in as a fundamental part of the design and think in terms of a hierarchy of handlers where at each point you make a decision whether to handle locally or pass the problem back up the chain.
2. Safe memory allocation - avoid allocating "big enough" static structures - take the GNU coding guidelines' advice and avoid arbitrary constants. And use a decent third party safe malloc() implementation.
3. Bounds checking. You should do your own bounds checking to preserve the logical integrity of your execution path, but ideally you should link in with a well proven third party bounds checking library as well.
4. Waypoints - there might be places in the code where you can save valid partial results to disk in order to minimize reprocessing when a thread has to be restarted after an exception handler has passed control back to a caller some way up the line.
All I can think of right now...
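The waypoint idea from point 4 above might be sketched like this. The names Checkpoint, save, load and processSome are hypothetical, and generic streams stand in for the disk so the example stays self-contained; a real program would use a file stream.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

struct Checkpoint {
    std::size_t next;   // index of the first unprocessed item
    long sum;           // valid partial result so far
};

inline void save(std::ostream& out, const Checkpoint& cp) {
    out << cp.next << ' ' << cp.sum << '\n';
}

inline bool load(std::istream& in, Checkpoint& cp) {
    Checkpoint tmp = {0, 0};
    if (!(in >> tmp.next >> tmp.sum)) return false;
    cp = tmp;           // only overwrite on a complete, successful read
    return true;
}

// Processes at most `budget` more items, then writes a waypoint.
// Returns true once all of the data has been processed.
inline bool processSome(const std::vector<int>& data, Checkpoint& cp,
                        std::size_t budget, std::ostream& store) {
    assert(cp.next <= data.size());
    for (std::size_t done = 0; done < budget && cp.next < data.size(); ++done) {
        cp.sum += data[cp.next];
        ++cp.next;
    }
    save(store, cp);
    return cp.next == data.size();
}
```

After an interruption, a restarted worker calls load on the stored waypoint and continues from cp.next instead of starting over.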
Never took that serious, but /. is really a "do my homework for me" place now, isn't it?
(*) == I call BS.
Do you all really think putting a (possibly buggy [MS]) virtual machine in the game will make his app more stable? OMFG...
It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048
In my experience decoupling and automatic restarting is a recipe for failure. You set yourself up for all sorts of race conditions. For instance, if a module is unresponsive for a while but not crashing, do you restart it? And if you do, what if the original module finishes its grand execution plan and comes back up after a minute?
No, I'd go for:
* A "monolithic" application with module separation provided by OO design. At least you know that either your whole application is there, or it isn't. No inconsistencies between modules because of individual module re-starts, and if the app breaks, restart the whole thing. Starting the app is the code path you've tested, restarting separate modules usually isn't (and even if it were, there's usually 2^27324 different situations to test, i.e., all possible combinations of modules failing in any sort of way).
* Use smart pointers exclusively, preferably Boost's shared_ptr. Use weak pointers (Boost provides an implementation for that as well) to automatically break reference cycles.
* For error handling, use exception handling exclusively. Incredibly many bugs are caused by ignored return codes.
* Use "auto" objects for all resources that you acquire and that need to be released at the end of a code section. Cleanup that doesn't happen when a code path encounters an exception can cause resource leakage, instability and hangups (locks, anyone?). In my programming practice, when I allocate a resource (memory, refcount, open/close a recordset, etc.), I always wrap it in an auto object immediately, so that I can forget about managing it through all the code paths that follow.
* Use the correctness features that the language provides: write const-correct code from the start.
* Use automated testing right from the start, both unit testing and integration testing. If you don't do this, you will be forever tied to whatever bad design decisions you make in the first months of the project. Automated testing allows you to always make large implementation changes, giving you confidence that it will not break existing behaviour.
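The "auto object" point in the list above is easiest to see with a lock. This sketch uses std::lock_guard from C++11; the queue and the function name are invented for the example.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>
#include <vector>

std::mutex queueMutex;
std::vector<int> queueValues;

// The guard object owns the lock. Its destructor releases the mutex on
// every way out of the function: normal return or exception.
inline void appendPositive(int value) {
    std::lock_guard<std::mutex> guard(queueMutex);
    if (value <= 0)
        throw std::invalid_argument("value must be positive");  // lock is still released
    queueValues.push_back(value);
}
```

With manual lock and unlock calls, the throwing path would need its own unlock, and forgetting it would leave the mutex locked for good.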
what strategies should a developer take to insure that the resulting program is as crash-free as possible?
Simple, easy answer: unit test it as much as possible, then tinker with the design to make more unit tests possible.
Beyond that, do regular code reviews and get into the habit of programming defensively.
This applies to any language, not just C++. There's even a port of JUnit to C++ over here
My Karma: ran over your Dogma
StrawberryFrog
What this guy really needs is the time-tested, tried-and-true Waterfall development process !
Thomas-
Use formal methods, it's what NASA and companies that develop medical software use.
I'd recommend B or Z.
There are even methods/software that will help you convert your specification from B/Z to C/C++ code (or other languages).
(Just a word of warning: This is not going to be easy or fun, but it will make your software robust, and you will catch a lot of errors before you even start coding)
http://www.zuser.org/z/
http://czt.sourceforge.net/
Beyond language, beyond buzzwords, read up on the requirements for SEI CMMI certification. Basically the point is to document everything: all procedures, all requirements, all test results, everything. Only once have I had the pleasure of working on such a project; the resulting application is 100% rock solid, and was delivered ahead of schedule (a bit over budget, however; 2 of 3 is not bad). Going in I would have thought that it would be terrible, but once I realized how nice things were without the usual bickering, I would hesitate to work without the SEI cert. You really don't need level 5, level 3 will suffice, but read all of the requirements and get a good coach to help set things up. It is really worth it. Oh, yes, the app is written in C++ and runs on PPC under pSOS.
You can link against your C++ libraries using the foreign function interface. Functional programming is extremely efficient for distributed computing (see Google). Finally, it's a lot easier to write stable code.
The main difference between C++ and other popular languages these days (other than gross syntax) is C++'s lack of garbage collector. Thus in C++ you can corrupt memory while in other languages you can't.
Yet we haven't experienced _any_ memory corruption in the medium-scale project we are working on? The trick? We avoid all C-style pointer-based constructs and instead use only higher-level, type-safe abstractions like smart pointers, vectors, strings, etc.
In the places those aren't available, we build safe constructs on top of the unsafe ones.
For example, instead of calling pthread_create which passes the parameters to the thread function through a void*, we have 20 createThread template functions that can be used to start a thread function with 0-19 arguments.
This leaves us with a single avenue for memory corruption: modifying an STL container while it's being iterated through. While it's impossible to prevent those at compile-time, a library like STLport will catch them at run-time.
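A sketch of what a type-safe thread start looks like, assuming a C++11 compiler. It uses std::thread rather than the poster's own createThread templates, which are not shown in the post; the function names are made up.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <thread>

// Thread function with ordinary typed parameters. With pthread_create it
// would have to take a single void* and cast it back by hand.
inline void countChars(const std::string& text, char needle, int& result) {
    int n = 0;
    for (char c : text)
        if (c == needle) ++n;
    result = n;
}

inline int countInThread(const std::string& text, char needle) {
    int result = 0;
    // The compiler checks the arguments against countChars' signature.
    // std::ref is required to pass `result` by reference.
    std::thread worker(countChars, text, needle, std::ref(result));
    worker.join();
    return result;
}
```

Passing the wrong number or type of arguments here is a compile error, whereas a wrong cast from void* only shows up at run time.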
Dejan
> I need to create an ultra-stable, crash-free application in C++.
First order of the day: think about what this means. You want an ultra-stable application, so you conclude it shouldn't crash. Right. But it also shouldn't get stuck. Or return the wrong data. Or screw up something.
Once you have a good idea of what you want to avoid, just limit your use of the language features. Afraid of NULL pointers/dangling pointers? Ok, use smart pointers. But check that your smart pointer library is rock solid.
Afraid of deadlock? No more while loops, for loops only with a maximum loop counter, no more recursion etc. See, it is getting rather tricky.
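The "for loops only with a maximum loop counter" restriction could look like this in practice. The helper pollBounded is hypothetical.

```cpp
#include <cassert>
#include <functional>

// Polls `ready` at most `maxAttempts` times instead of looping forever.
// Returns the attempt on which it succeeded, or -1 if the cap was reached.
inline int pollBounded(const std::function<bool()>& ready, int maxAttempts) {
    assert(maxAttempts > 0);
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (ready()) return attempt;
    }
    return -1;   // the caller has to handle the "gave up" case explicitly
}
```

The loop is guaranteed to terminate, but the trickiness mentioned above remains: every caller now needs a sensible reaction to the give-up result.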
In the end, C++ can probably do this if you use the necessary restrictions, but it is not going to be efficient. Using an interpreted language might be the easier option.
Too slow? Fast, safe and useful, pick two out of three.
Classically, you cannot optimize two parameters simultaneously. Thus, if reliability and robustness are the most important things, coding in C++ must logically be less important. Even so, of course, there might be no practical alternative. But I don't believe that is so. A lot of the trouble we have with modern software is that the people who produced it did not assign a high enough priority to reliability or security.
.NET? The libraries you refer to are presumably fairly robust, so if you write your own code in a safer language you should be OK. Clearly, other parameters such as performance and cost may be affected, but as already stated, you can't optimize everything.
I realise this probably isn't a practical suggestion, but it is at least an existence proof. In the VMS Common Language Environment you can mix languages more or less ad lib. Thus, you could call C++ libraries from a simpler and safer language such as Pascal, Ada, or even Java. (Yes, I know Ada is big and complex, but there is no law that says you have to use all of it. Besides, reliable/secure subsets have been defined for mission-critical or safety-critical applications).
Can't you do similar mixed-language programming in
I am sure that there are many other solipsists out there.
I needed to reformat what you said before I could read it easily:
I've been out of C++ programming for several years, but I do remember a couple basic rules I followed that saved me from a lot of memory problems and invalid state problems. This may not be the kind of thing you're looking for, but here it goes...
1) Never allocate memory to a raw pointer. Never. That is, if you allocate memory to something, it better be allocated to a smart pointer like auto_ptr or a reference counting pointer (boost.org at the time had a family of these). The only exception to this rule is in the implementation of the smart pointers themselves. You should be able to find a good number of articles on this.
2) Always follow the "strong exception safety guarantee". Classes that provide this guarantee promise that they will not change their state if they throw an exception. Again, there are many articles. Here's an example of an assignment operator providing the guarantee (please forgive me if my C++ is not quite right - I'm rusty):
The example is a class Whole with two dynamically allocated Parts. The assignment operator, instead of having two lines of code (i.e. cloning the two parts and assigning them directly to the member variables), has four. It first clones the parts to temporary variables and then assigns them to the member variables. Why? Without the temporary variables, if the second "new" operation throws an exception (such as bad_alloc), the object would be left in a state that is different from before the call and inconsistent. It would have one original part, and one part cloned from the other object.
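A minimal sketch of the Whole/Part assignment operator described above, not the poster's original code. It assumes C++11 and uses std::unique_ptr for the temporaries (the post predates it and would have used auto_ptr). The static clonesBeforeFailure member is a demo hook invented here to simulate an allocation failure.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <utility>

struct Part {
    static int clonesBeforeFailure;   // demo hook: negative means never fail
    int id;
    explicit Part(int i) : id(i) {}
    std::unique_ptr<Part> clone() const {
        if (clonesBeforeFailure == 0) throw std::bad_alloc();  // simulated failure
        if (clonesBeforeFailure > 0) --clonesBeforeFailure;
        return std::unique_ptr<Part>(new Part(*this));
    }
};
int Part::clonesBeforeFailure = -1;

class Whole {
public:
    Whole(int a, int b) : first_(new Part(a)), second_(new Part(b)) {}
    Whole(const Whole& other)
        : first_(other.first_->clone()), second_(other.second_->clone()) {}

    Whole& operator=(const Whole& other) {
        // Step 1: everything that can throw touches only temporaries.
        std::unique_ptr<Part> newFirst = other.first_->clone();
        std::unique_ptr<Part> newSecond = other.second_->clone();
        // Step 2: commit using operations that cannot throw.
        first_ = std::move(newFirst);
        second_ = std::move(newSecond);
        return *this;
    }

    int firstId() const { assert(first_); return first_->id; }
    int secondId() const { assert(second_); return second_->id; }

private:
    std::unique_ptr<Part> first_;
    std::unique_ptr<Part> second_;
};
```

If the second clone throws, the first temporary is released automatically and the members have not been touched, so the object still holds both of its original parts.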
There are lots of other simple rules like this that can make code more solid, easier to read, and easier to maintain. If I remember right, the C++ FAQ from the C++ newsgroup contains a lot of them.
Question 1: what strategies should a developer take to insure that the resulting program is as crash-free as possible?
Answer:
a. Use OO techniques and maintain all objects in your system extremely simple; furthermore, maintain all methods in your system extremely short, well-contained, well-defined.
b. Don't use C++ arrays, ever. Especially not for strings. Use and abuse the STL. is just plain beautiful IMH?O.
c. Check extensively the behaviour of your constructors and destructors.
d. Make an object-lifecycle diagram of each class you program. In the diagram, relate it to the neighboring classes (parents, children, siblings, classes involved in design patterns with, classes aggregated, classes value-aggregated, classes where this is aggregated or value-aggregated)
e. Use, carefully, and always when possible, smart pointers. Remember std::auto_ptr is your best friend -- its limitations are a defining part of its strength. Remember boost::shared_ptr is also a good friend, but its cousin boost::intrusive_ptr is even more friendly -- but use one of those (and their other cousins scoped_{ptr,array}, shared_array, weak_ptr) only in the (rare) cases where auto_ptr does not apply.
f. As a corollary to (e) above, use boost. This is really an extension of (b), too.
Question 2: How can I actually implement such a decoupling?
Answer:
I would use a simple, socket-based, take-my-data, gimme-my-results scheme. It would be network-distributable, and it would be easy to detect whether some service is alive via timeouts... If you want something more sophisticated/RMI-like, SOAP (with binary XML or compressed) may be an option. The simpler the better IMHO.
Question 3: are there any software _design patterns_ that specifically tackle the stability issue?
Answer:
All of them? IMHO, DPs can be a huge tool for increasing the stability of a system. Take a look here [WARNING: PDF] (and in the bibliography) for some ideas.
I know many of my posts were self-marketing lately, but if you need someone to work with you, I'll be happy to send you my resume... write me at hmassa (at) gmail.
It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048
I would suggest creating the software around a preferably stateless domain and connecting any system it relies on through a service/adapter. This way your application would still function even when it loses a certain service. You can loosely couple your dependencies... This is often called a domain-centered architecture or, as Alistair Cockburn calls it, hexagonal (ports and adapters) architecture: http://alistair.cockburn.us/crystal/articles/hpaaa/hexagonalportsandadaptersarchitecture.htm We use it for almost all the software we create, instead of the more traditional tiered architecture.
F/OSS & IT Consultant
And the code is linkable with C code and libraries if you must.
I've ported several programs from C and C++ to Free Pascal. In every case the compiler found a handful of previously undetected range and overflow errors.
Oh, and stay away from the C-like "extensions"!
Hi,
:)
Maybe you shouldn't aim to have the program crash free. If you assume that crashes will occur, and put in place mechanisms to deal with that, you will have a far more reliable system than if you use a more optimistic approach.
Smart use of signal handlers and threads should go a long way to ensure the system keeps running. As an example, I'll mention a server I worked on for a couple years, that used the following approach:
- All subsystems would run inside a try/except block, with exceptions being caught on the upper level
- A background thread catching signals and ensuring that the subsystems are running
The server would run the main loop and ensure that the state was recorded at the end of each loop. It could handle even segfaults by resuming operation at the last known state (previous loop). Deterministic errors in an iteration would be dealt with by notifying the admin (only happened in the test versions of course, which had experimental stuff).
I guess it's more a matter of method than language itself or libraries.
"I don't mind God, it's his fan club I can't stand!" E8
And more specifically testing approaches. Use a standard procedure. Choose the appropriate software engineering model (waterfall, RAD, spiral and so on) according to what you have right now, and have release cycles along with testing periods. Black box and white box testing are the ones I can recall right now. Of course beta testing helps a lot also. This is a crash-safe approach to a crash-free application, in my opinion, although there is no such thing really.
You can't use testing to _prove_ the absence of the sort of runtime errors you need to avoid, but you can use tools such as PolySpace (www.polyspace.com) to do this. It won't prove that your program does what it is required to, but it can /prove/ it to be free of those horrible little bugs that can take days/weeks to find and fix.
Alternatively, import the C++ libraries you have to keep into a program written in a language designed for producing high integrity systems, e.g. Ada or SML.
I assume you _really_ have to make everything in C++, otherwise you should seriously consider developing most of your code in another programming language. The way to make a crash free application is to use formal methods to prove it can't crash. That means specifying preconditions and postconditions for all methods and invariants for classes. There are no facilities in C++ for enforcing this kind of programming, but you can build your own based around the assert.h functionality. The other thing you want to do is to restrict yourself severely as to what parts of C++ you use. Use the STL library for everything, use smart_ptr and auto_ptr in place of raw pointers, use pass by reference as much as possible. Don't do anything fancy like multiple inheritance, operator overloading, advanced template programming or writing your own container classes. These things are better left for library writers. Build some sort of memory tracking facility so you will know about memory leaks (there shouldn't be any if you never use raw pointers, but you never know). Above all, make automated tests for all features you add.
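As a rough illustration of building contracts on top of assert, here is a small sketch. `BoundedStack` is a made-up class, and plain `assert` is compiled out when NDEBUG is defined, so this checks debug builds only:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bounded stack, used only to show where the checks go.
class BoundedStack {
public:
    explicit BoundedStack(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);                  // precondition
        assert(invariant());
    }
    void push(int value) {
        assert(!full());                        // precondition: room left
        const std::size_t old_size = items_.size();
        items_.push_back(value);
        assert(items_.size() == old_size + 1);  // postcondition
        assert(invariant());
    }
    int pop() {
        assert(!empty());                       // precondition: something to pop
        const int value = items_.back();
        items_.pop_back();
        assert(invariant());
        return value;
    }
    bool empty() const { return items_.empty(); }
    bool full() const { return items_.size() == capacity_; }
    std::size_t size() const { return items_.size(); }

private:
    // Class invariant: never more elements than the declared capacity.
    bool invariant() const { return items_.size() <= capacity_; }

    std::size_t capacity_;
    std::vector<int> items_;
};
```

This proves nothing formally, of course; it only makes a violated assumption fail loudly at the point where it happens.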
The interactive way to Go -- http://www.playgo.to/iwtg/en/
Sadly, the programming language cannot be changed
Did you not read this part?
"I reject your reality, and substitute my own!"
OK .... here goes. I'm an old salt - been a programmer for 25+ years. What I'm going to give you here, now, are my observations and general philosophy ..... your mileage may vary.
Get the best programmers you can lay your hands on. I don't care what the platform is - bad programmers write bad code. Good programmers write good code. GREAT programmers will make you cream :-) (I know - I have to get out more)
Ignore any comments of the type "Use XXX or you're nuts". "XXX is the only way to go, etc." They're trends - I've been seeing comments like that since I started, and code quality is no better now than it was when I was doing hand assembly.
Simplify, Simplify, Simplify. The smaller and more discrete your modules are, the less likely you are to have errors. Break the problem down into the simplest, most discrete chunks you can, and do each of them as an individual task. The smaller code will be simpler and easier to test & verify.
Use code reviews. Even a guru's eyes will glaze over once he's eyeballed the same source for the nth time .... you need the additional eyeballs.
Decouple & separate. The less each module is dependent on other modules - explicitly OR implicitly - the less likely you are to have unforeseen interactions. This also, btw, will help you develop an API that can be made available.
Use these approaches throughout your project, and you still won't have perfect code - but it will stand a much better chance of being correct, and any errors you DO have will be easier to find.
This is not an exhaustive list, by any means, but good code is more about approach than specific languages or libraries - it's a state of mind.
If your program is accessing an object that has been replaced or deleted, is it still correct? There is an area of lock-free programming that uses GC, but it uses COW (copy on write) to ensure the objects are in a consistent state, and the programmers do have to be aware of the slightly altered semantics. For composite data structures such as collections, you have to be extremely careful that operations linearize correctly (i.e. preserve collection semantics). Naively accessing a data structure without synchronization and assuming that since you didn't crash your program is working correctly is not very wise. If anything, you should be more afraid of programs managed with GC. Be very afraid.
Yeahbut the reasons he gave were not convincing. Freepascal code can be linked with most anything that doesn't have some really bizarre interface. As an example, I've linked it to all kinds of hellish Windows C DLLs and system APIs. No sweat.
The key to robust programming occurs long before coding.
You need a solid design that breaks the system down into objects. Those objects need well defined states, actions and messages. The messages can be implemented as messages or as calls.
If you have that type of rigorous base the scope of error that you can make in any programming language is very small and very easily diagnosed.
> what strategies should a developer take to insure that the resulting program is as crash-free as possible?
Well, if you ask, you're not sure, and if you're not sure, why don't you hire a very experienced developer and pay him damn well?
Let me see, where could you find one? Oh, you're lucky! Here I am :)
Seriously, an architecture that behaves as you describe is possible. Also, each abstract 'computation unit' should be implemented with easy to read, single purpose functions, whose logic can be formally checked.
Some things people didn't consider yet:
First: Your code is as stable as the least stable piece of it. It doesn't matter if you work alone, but if you have a team, make sure that ALL the members of the team are great programmers. Only one bad programmer is enough to destroy all the code.
Second: Your code can't be more stable than the system it is running on. Some people have pointed in this direction, but didn't get to the point. You'll need stable (and probably redundant) hardware, and a very stable OS (I'd advise you to forget about Windows, Linux-2.6 and weird drivers). All processes running on the machine should also be stable (the OS tries to shield you from that, but all sorts of timing problems leak through), so the fewer processes the better, and all of them stable. The network should be reliable: no random errors, only packet drops (write another error detection layer if needed, but you shouldn't need to). And, most important, you need a reliable ENERGY SUPPLY. All effort is lost if you suddenly have no energy.
Rethinking email
It's the process that makes code robust and
manageable, not the language. Here are some tidbits.
Design...Code independent. (XML, UML)
Design...Modular. (Natural/logical decomposition of elements)
Design...By Contract. (Sign the dotted line, read the fine print)
Code...Follow the semantics of the design.
Code on tiers. Upper tier strictly follows the design, same
naming convention, same logic...lower tier
uses generic tools, APIs for portability (ADTs, STL, POSIX).
Test...Traceable...Preconditions...Postconditions...all modules are accompanied
by a test suite and/or a third-party test suite.
Test...black box, third party. Don't preach to the choir.
Test...Metrics, metrics, metrics. Test early, test often, catch the problems
before they catch you.
Happy Coding!
- these are not the droids you are looking for -
If "Basically, it allows users to crunch a god-awful amount of data over several computing nodes" is one of your requirements, I would look at http://www.open-mpi.org/ and http://openmosix.sourceforge.net/. These two projects will provide you with the library to create multi-node applications (MPI) and a load balancer for your cluster (MOSIX).
First, don't listen to the anti-C++ naysayers. C++ can be used to build complex, highly efficient, and bug-free software. It can also be used to make a horrible, inefficient, and bug-ridden mess, if you don't do it right.
Some basic ideas:
First: study some of the C++ "best practices" type of books (Exceptional C++, Effective C++, etc). Don't do any production C++ programming until you have mastered these. There are many invaluable tips.
Second: stay away from fancy object hierarchies. Try to use template mechanisms instead of object mechanisms. Template code is fast, and has superior safety compared to object-style code.
Third: stay away from explicit allocation. Use STL containers (and other containers from trusted libraries). Going along with this concept: avoid using pointers. I find it is only rarely necessary to use pointers or to allocate things (e.g. Object *x = new Object();). Almost always, I can use a container (e.g. vector<Object> x).
Fourth: If you're doing any multithreaded programming (including across different machines) -- and you are, from the description of your problem -- then you should spend time understanding threading issues. Threads (whether in Java or any other programming environment) carry their own very nasty bug risks. Sure, you can avoid many types of memory allocation problems with a "managed code" environment. But what about deadlock? There's nothing for it but to really understand threading. A book I found to be excellent in this regard was Concurrent Programming in ML. Yes, it uses ML. But it's teaching you about threading issues, not C++ issues. Concurrent ML happens to be a very succinct language for its purpose. There may be excellent C++ books which address threading issues, but I haven't seen any.
Fifth: a minor-ish tip: when writing multi-threaded C++ code, don't ever use global objects -- only use pointers to global objects. This point may be covered in one of the above C++ books I mentioned, but I don't recall it. The problem is destruction -- when a program ends, you don't know in which order the threads will die, and when the global objects will get destroyed. So, use pointers to global objects, and destroy them explicitly if you need to (I rarely bother).
Sixth: Look at the Boost libraries (boost.org).
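For the fifth tip above (pointers to global objects), one common variant is a function-local static pointer that is never deleted. This is a sketch under the assumption that leaking one object at exit is acceptable; `Registry` and `registry()` are invented names:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical global registry, used only for illustration.
struct Registry {
    std::map<std::string, int> counters;
};

// Returns the one shared Registry. The object is created on first use and is
// deliberately never deleted, so no destructor runs at program exit while
// other threads might still be touching it.
inline Registry& registry() {
    static Registry* instance = new Registry();  // thread-safe init since C++11
    return *instance;
}
```

Access to the members still needs its own locking; this only sidesteps the destruction-order problem.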
Well, those are some ideas. I've found that C++ does not suffer from bugginess, despite its reputation. There's a perception that C/C++ code makes it easy to make memory-related bugs (accessing invalid memory, failing to free garbage, etc). While the reputation is well deserved for C, and is doubtless deserved for badly-written C++, it is not deserved for well-written C++. Only rarely do I encounter memory-related bugs (or any serious bugs at all). It's all about coding practices, and using STL-type containers instead of doing memory allocation yourself.
Hope that helps.
Use CPPUnit, and follow test-driven development practices:
1) Write a test.
2) Run the test. It should fail.
3) Write just enough production code to make it pass.
4) Run the test. It should pass.
5) Repeat 1-4 until you have a complete system.
Honestly, this really works, and it'll save you a lot of debugging time in the end. Sure, you can't test every single input to the system, but you can test the corner cases and expected inputs, and that counts for a lot. I'll put my defect counts against any other programmer who thinks that unit tests are for wimps any day.
If it's really that critical, you might want to take a look at something like Correctness by construction. They actually do not recommend using C++, but with rigorous developer discipline you can achieve similar results to what they describe.
Also, separating critical and non-critical modules is a good idea. That means you can do the not-so-critical GUI quickly in some convenient environment (Qt, Kylix, whatever) and do the critical stuff the hard way.
(Apologies for the lame sig below)
He who laughs last, thinks slowest.
Ultra-stable, crash-free, and runs on Windows.
Requirements like that make me glad I gave up programming to fish for salmon in Alaska.
The most important thing is: make a model of what you want to do before even thinking which programming language you use. Reduce the semantics needed from external sources to the essential set of features. But do not re-implement semantics w/o need.
Wherever possible make finite automata models; try to use automated techniques for detecting deadlock in the interplay of these. Wherever possible use known protocols.
Try to avoid atomic regions in your code for which you have no semantic model (i.e. try to make atomic regions only inside member methods of objects specifically designed for "sharing data").
Avoid unnecessary system calls; for the sake of performance and because you never know what will happen.
If you are sure that you have no circular references, prefer reference counting over garbage collection (because it is a finite automaton!).
Try to avoid dynamic data allocation - or better: think about the chunks in which memory will be needed.
Depending on your application, write your own memory allocation (only do this if you know what you are doing) if you have a special idea about the sequence in which memory will be needed. Imagine the following memory requests (I write in C, but you should get the point):
i=0;a=malloc(10000);b[i++]=malloc(10);free(a);a=malloc(20000);b[i++]=malloc(10);free(a); ......
a=malloc(30000);b[i++]=malloc(10);
In this simple example.
If finite automata do not suffice at a point use a stack machine.
*Maybe* use yacc to design your parser;
Probably necessary to say, but: your network of computing cores should be connected to the GUI only via IPC; please, no shared memory space.
For the network communication consider existing methods, e.g. CORBA - if applicable.
Ahem; as many things as possible should be as stateless as possible; makes testing easier.
Your software is only going to be as stable as the components that make it up. That means you will have to choose your operating system, APIs and hardware carefully, as well as your choice of programming language. Ensure you have clear requirements from the start (ensure all constraints are defined from the get go). Spend at least 50% of your time on this phase. This will lead to a clear design strategy that is half the battle to stable software.
During implementation you will find that peer review and static code analysis will help reduce the defect rate (being a good and experienced programmer will also help). Try to use a safe language subset that will reduce program complexity. One idea is to prevent the use of dynamic memory allocation at all (although that will be hard if you have already decided on C++). Use code metric limitations. Peer review all the way along.
Finally, a very thorough full system test (to 100% logical coverage) will help you find the stability you are looking for. Ensure that your test setup provides realistic scenarios and do everything you can to break it; design your software to fail safely. Assume that your software will crash and design it so that if it does crash it recovers in a way so that the crash never mattered in the first place. Use concepts such as hot and cold redundancy to help here. Run your software out of specifications and put it into situations you know it will crash and see how it behaves.
Proving that any non trivial software will never crash is impossible, you soon start getting into the realms of timing and scheduling analysis at the OS/assembler level and Linux/Windows/C++ are not hard or soft realtime.
Then, when everyone chops him up, he can point to this discussion as evidence of what experts in the field think of the situation. Which comments are going to be true enough regardless, even if the boss doesn't believe him.
"It is a greater offense to steal men's labor, than their clothes"
I've always found this article to be most insightful with regard to the sort of effort you are describing: http://www.fastcompany.com/online/06/writestuff.html
At the least, it will open your eyes to the amount of effort required, if you really are serious about this.
Good luck to you.
By the perception of illusion, we experience reality
In a desperate rush for some reading material for the toilet, I grabbed what must be a 5 year old C/C++ User's Journal from a storage room. The theme of that month's issue was MULTITHREADING.
I thumbed through it and came across an interesting article ``ALWAYS HANDLED ERROR CODES''. The idea being that a lot of errors can go undetected because programmers are lazy about checking return values. And why not, who bothers checking printf()'s return value, for instance?
Simple enough design. The object constructor sets the result, the destructor will abort() the application if the Checked variable is false. The overridden == and != operators evaluate the result, and also set the Checked variable.
In your functions, instead of return SUCCESS; you write return ErrorCode(SUCCESS);
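Reconstructing from that description only (I don't have the article's code in front of me), such a class might look roughly like this. `Status`, `FAILURE` and `open_channel` are made-up names, and the move constructor is my own addition so the "must be checked" duty follows the value out of the function:

```cpp
#include <cassert>
#include <cstdlib>

enum Status { SUCCESS = 0, FAILURE = 1 };  // hypothetical status values

class ErrorCode {
public:
    explicit ErrorCode(Status s) : status_(s), checked_(false) {}

    // Moving hands the obligation to check over to the new object.
    ErrorCode(ErrorCode&& other) noexcept
        : status_(other.status_), checked_(other.checked_) {
        other.checked_ = true;
    }
    ErrorCode(const ErrorCode&) = delete;
    ErrorCode& operator=(const ErrorCode&) = delete;

    // An error code that nobody looked at takes the application down.
    ~ErrorCode() {
        if (!checked_) std::abort();
    }

    // Comparing evaluates the result and marks it as checked.
    bool operator==(Status s) const { checked_ = true; return status_ == s; }
    bool operator!=(Status s) const { checked_ = true; return status_ != s; }

private:
    Status status_;
    mutable bool checked_;
};

// Hypothetical function showing the `return ErrorCode(SUCCESS);` style.
inline ErrorCode open_channel(bool ok) {
    return ErrorCode(ok ? SUCCESS : FAILURE);
}
```

Calling `open_channel(true);` and ignoring the result aborts at the end of that statement, which is the whole point.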
Wondering if anybody does this. If I needed something ULTRA STABLE I guess I might...
Easy: write good code. Make sure you write a lot of functions and lots of objects; don't write large chunks of code at once. If there is a library or function already made for something, use it; that will avoid bugs and instability. And test, test, test, test and test again. You could go as far as dividing the program into several smaller applications and use IPC (fork + dup2 + pipes) to have them all communicate, but I don't think that will help much, to be honest. Unless you're using shared memory and message queues, semaphores etc... however, keep in mind these things are sometimes very difficult to debug. As far as efficiency goes, just remember one thing: the Big O!!!!! For example, an algorithm that is O(1) is better than one that is O(log2 N), which is better than O(N), which is better than O(N log2 N), which is better than O(N^2). That will give you the biggest difference in performance; SIMD, -O3 and other optimizations are really just candy and will only give you a marginal increase in performance. Spend time learning the debugger; I've also seen a few utilities to find memory leaks, for Linux anyway. Oh! And make a memory map; sometimes it helps figure out where your leaks are.
Divide, and conquer - DON'T do the whole app in C++. Split the back-end, the part that does the heavy lifting and uses all the libraries you have in C++. Define (AND DOCUMENT) the interface between the back-end and the front-end GUI - use something like a socket or a pipe or some other form of interprocess communications to tie them together.
Do the front-end (the GUI) separately. If you want to stay with C++, great, otherwise, use whatever language makes the front-end easier.
GUIs are hard - they have to be asynchronous, and that makes design and debugging a bitch. From what little you have described, the back-end should be fairly straightforward from a code flow perspective - get data, process data, return results - no event handlers, no callbacks.
Given the split, you can unit test the back-end a lot more simply, and test for all the corner cases. Unit testing a GUI is hard - again, testing all the async events is a bitch.
Design the back-end so that if the GUI "goes away", the back-end can continue to operate - preferably allowing the front-end to reconnect to the back-end and pick up where you left off.
Other than that, here are some general rules I follow in C++:
Never instantiate a variable until you can initialize it - if that means you define a variable in the middle of a function then do so.
Use lots of local scopes - you can insert a "{ }" anywhere, and create a scope, with its own variables. This both helps the optimizer know where a variable is used, and it helps the programmer know, too.
Keep functions as small as possible.
Treat pointers like guns - keep them pointed in a safe direction (i.e. at an object or at NULL), never make any assumptions about whether they are loaded or not (check for NULL), keep them away from children (junior programmers) by keeping them locked up safely inside objects.
Where-ever possible, use the "allocation is construction" idiom: If you need to allocate a resource, such as a socket, or mutex, or whatever, create a class object that allocates the resource on construction and releases it on destruction. That way, you can instantiate that object as an auto variable, and when that variable goes out of scope the resource is released, no ifs, ands or buts. It is also released if any exceptions are thrown. One of the reasons I don't like Java is that this idiom won't work due to Java's lazy destructor policy.
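A small sketch of that idiom. `ConnectionPool` is a made-up stand-in for a real resource (socket, mutex, file handle), so the acquire/release calls here are just a counter:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource: a counter standing in for sockets, mutexes, etc.
struct ConnectionPool {
    int in_use = 0;
    void acquire() { ++in_use; }
    void release() { --in_use; }
};

// Acquires on construction, releases on destruction.
class ScopedConnection {
public:
    explicit ScopedConnection(ConnectionPool& pool) : pool_(pool) { pool_.acquire(); }
    ~ScopedConnection() { pool_.release(); }
    ScopedConnection(const ScopedConnection&) = delete;
    ScopedConnection& operator=(const ScopedConnection&) = delete;

private:
    ConnectionPool& pool_;
};

// Throws part-way through; the destructor still gives the connection back.
inline void risky_work(ConnectionPool& pool) {
    ScopedConnection conn(pool);
    throw std::runtime_error("simulated failure");
}
```

After `risky_work` throws, `pool.in_use` is back to what it was before the call, with no cleanup code at the call site.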
Last but not least - use ASSERT and other logical checks LIBERALLY in the code. If you are making an assumption TEST IT in the code. Track where you are - objects that are passed the __FILE__ and __LINE__ parameters to track who did what where will help you immensely in testing your code.
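One way to carry __FILE__ and __LINE__ along with a check, sketched with invented names (`APP_CHECK`, `describe_failure`, `checked_divide`). Unlike plain assert this version throws instead of aborting, which is a design choice rather than anything the advice above requires:

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Builds a message such as "core.cpp:42: check failed: denominator != 0".
inline std::string describe_failure(const char* expr, const char* file, int line) {
    std::ostringstream out;
    out << file << ":" << line << ": check failed: " << expr;
    return out.str();
}

// Hypothetical check macro: records who tested what, and where.
#define APP_CHECK(expr)                                                   \
    do {                                                                  \
        if (!(expr))                                                      \
            throw std::logic_error(                                       \
                describe_failure(#expr, __FILE__, __LINE__));             \
    } while (0)

// Example of testing an assumption in the code instead of trusting it.
inline int checked_divide(int numerator, int denominator) {
    APP_CHECK(denominator != 0);
    return numerator / denominator;
}
```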
www.eFax.com are spammers
Since you plan to decouple your various processes, why not use a single "watchdog" process to oversee the others and restart them if necessary? This is a commonly used technique in high availability environments. You can guarantee the watchdog is running by putting it in a cron job, whereby each newly launched copy keeps running only if it can't see another instance of itself in the process list.
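The restart part of a watchdog can be sketched with plain fork/waitpid on POSIX systems. This assumes a single supervised worker and a restart limit; `supervise` and `flaky_worker` are invented names, and a real watchdog would add logging and back-off:

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `worker` in a child process and restarts it until it exits with
// status 0 or the restart budget is used up. Returns the number of restarts
// that were needed, or -1 if the worker never finished cleanly.
inline int supervise(int (*worker)(int attempt), int max_restarts) {
    for (int attempt = 0; attempt <= max_restarts; ++attempt) {
        const pid_t pid = fork();
        if (pid < 0) return -1;               // could not start the child
        if (pid == 0) _exit(worker(attempt)); // child: do the work, then leave
        int status = 0;
        if (waitpid(pid, &status, 0) < 0) return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) return attempt;
        // Crashed or failed: fall through and start a fresh incarnation.
    }
    return -1;
}

// Hypothetical worker that dies on its first two incarnations.
inline int flaky_worker(int attempt) {
    if (attempt < 2) kill(getpid(), SIGKILL);  // simulate a crash
    return 0;
}
```

With the worker above, `supervise(flaky_worker, 5)` needs two restarts before the third incarnation exits cleanly.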
Hope this helps!
Zen tips: Pay attention. Don't take it personally. Believe nothing.
If lives are at stake, or this is a seriously mission critical system, language selection must be part of the architecture and design - not an arbitrary choice "because my boss said so".
C++ is not an inherently fault tolerant language. There are several that are: Erlang for instance.
you had me at #!
It may be that the author simply has not conveyed the scope and rationale for all the elements of the project that he mentions, but this reads like someone intent on developing a lot of trendy infrastructure that is interesting for the developers to work on, but isn't really necessary for achieving the goals set by the clients.
Over-engineering results in unnecessary complexity, both in the code and the project organization. And those are among the main causes of instability in software. Further, over-engineering also increases delays and cost, which can lead to a "crashed" project, which is more likely to produce crashing software.
Ironically, aiming to make a program more "stable" than it needs to be will probably make it less stable, whatever tools are chosen.
You forgot to ask for a pony.
If I were you I'd start by purchasing all of Scott Meyers' books on the STL (Standard Template Library) and read them twice. You'll have a much better idea where to go from there. You might even want to hire a C++ guru to help you design the application.
10: PRINT "Everything old is new again."
20: GOTO 10
You should read Effective C++ by Scott Meyers. Twice. And then you should read More Effective C++ by Scott Meyers. And then you should read both of those books again. Every day before breakfast.
I can't write multiplatform GUI code with Lisp.
Not unless I want to have my clients pay a runtime fee for every running copy of the software.
I have great hopes in the wxCL project.
We are Turing O-Machines. The Oracle is out there.
"Sadly, the programming language cannot be changed..."
Oh, maybe he did complain. Fuck off.
Physical separation of modules is a very good idea. It helps to contain damage when one part fails, makes the app easier to upgrade piecewise, and forces you to think hard about interfaces. Your first attempts at ultra-reliability will fail. But if you encapsulate well, with clean interfaces, you can make the individual modules ever more reliable over time.
Peers of a failing module should detect the failure without collapsing, of course. But consider centralizing the start/stop/restart of all modules in a process manager. Peers detecting a failure report the failure to the PM, but do not take action themselves.
I think you have an implicit assumption in "Sadly, the programming language cannot be changed due to reasons of efficiency and availability of core libraries". It's the word "the" - why only one language? Ask yourself if only certain parts of your app are subject to the constraints you cite. Maybe some parts are better suited to a scripting language. I don't mean to preach language, but I like Python so I'll use that as an example: It can interact with C++ by network protocols between separate processes, or within the same process through available APIs; it's good at cross-OS, unless you intentionally use OS-specific libraries; and you could code some parts much faster, leaving you more time to think hard about your interfaces.
Use message queues only if you need the asynchronous behavior. If synchronous request/reply is enough, skip the added subsystem.
For any inter-process interfaces where efficiency is not a dominant concern, consider text protocols. Your human intelligence is good at detecting errors in text, so this makes the interactions between modules more transparent. It's also handy to test an interface by typing at it.
If you go with remote calls between modules, consider whether they need to be object-oriented. Old-fashioned Sun RPCs still work fine, and they're simpler.
Object-oriented design is great within a process; but stateless protocols are often best between processes. Treat shooting a module as a primary use case. It's important for isolating failures of course, and also for partial upgrades to a running system. Finally, have a single point of truth for everything the system must know. It's OK to distribute copies of data when you must, but be clear on what module is authoritative for every piece of data.
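To make the text-protocol suggestion concrete, here is a toy request handler. The SET/GET commands and the OK/ERR replies are invented for this sketch, not any existing protocol; the point is only that every request and reply is a line a human can read and type:

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Handles one request line of a hypothetical text protocol:
//   SET <key> <value>  -> "OK"
//   GET <key>          -> "OK <value>" or "ERR unknown key"
//   anything else      -> "ERR bad request"
inline std::string handle_request(std::map<std::string, std::string>& store,
                                  const std::string& line) {
    std::istringstream in(line);
    std::string command, key, value, extra;
    in >> command >> key;
    if (command == "SET" && (in >> value) && !(in >> extra)) {
        store[key] = value;
        return "OK";
    }
    if (command == "GET" && !key.empty() && !(in >> extra)) {
        const std::map<std::string, std::string>::const_iterator it = store.find(key);
        if (it == store.end()) return "ERR unknown key";
        return "OK " + it->second;
    }
    return "ERR bad request";  // malformed input never reaches the store
}
```

Hook the same function up to a socket or a pipe and you can poke at the module with telnet or a shell script.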
Get a junior-level software development job, and in about 5 years, you'll know the answer.
Seriously, people post questions like this about once a month, and I can't believe that they actually think they're going to get an answer in a couple of paragraphs to the question "how do I write good code?". If it were that easy, someone would write a book, everyone would read it, and there would be no buggy software.
You can avoid some of the pitfalls of C++'s need for manual memory management and other problems by simply avoiding them. For instance, never do memory management yourself. How? By using STL containers to do it all for you. Next, avoid fixed arrays. Again, let STL do it for you. And, above all else, never do anything where you don't restrict the length. Since you're using STL for arrays, you're good to go there, and you won't end up running off the end of a character array (because you don't use them!). So what you're left with is doing I/O properly. Always limit the amount of data you read to the buffer size you have allocated.
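A possible shape for "always limit what you read", sketched with standard streams. `read_bounded_line` and `ReadResult` are invented names, and discarding the rest of an over-long line is just one policy among several:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <limits>
#include <sstream>
#include <string>

enum class ReadResult { Ok, TooLong, EndOfInput };

// Reads one line but never stores more than max_len characters. When the
// line is longer, the stored prefix is kept, the remainder is skipped and
// TooLong is returned so the caller can reject the input.
inline ReadResult read_bounded_line(std::istream& in, std::string& out,
                                    std::size_t max_len) {
    out.clear();
    char c;
    while (in.get(c)) {
        if (c == '\n') return ReadResult::Ok;
        if (out.size() == max_len) {
            in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
            return ReadResult::TooLong;
        }
        out.push_back(c);
    }
    // End of stream: a final line without a newline still counts.
    return out.empty() ? ReadResult::EndOfInput : ReadResult::Ok;
}
```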
I'm sure that there's tons I've left out, but this has worked reasonably well for me. The only problem is that STL can be slow. Sure, map may be O(log(n)), but the constants are huge. Unfortunately, for practical reasons, performance and security are often inversely proportional.
The buggiest code that you are dependent on is likely to be the GUI libraries from the systems you are targeting. X/Windows and Win32-GDI are full of sh*t code. Historically, these libraries have memory management errors and certainly have leaks.
1) You need to write a command line version of the code.
2) You need to employ developers with at least 5 years of professional experience. No noobs allowed.
3) You need to use as many compilers as you can, native, 3rd party, and GNU, with all warnings enabled. Correct all warnings (when possible). This will teach you and your team that not all compilers are created equal. Last time I coded, HP's C++ compiler was too strict when compared to most other compilers.
4) Follow bullet proof coding standards, hold real reviews and be especially judicious when employing memory management and threads in your code.
So you know, my background is in Space Shuttle Flight Software development, JSC Mission Control Center software development, and Telecom systems design, development and deployment.
AND YES, I'm an Anonymous Coward!
... comments and complaints to the contrary.
Firstly people, please THINK before dropping your favorite high-level language or uber-run-time in as the way to go: perl, python, ruby, java, .NET all started out in, or continue to be, or continue to rely on code written (usually) in C (some in C++). This stuff, in turn, relies on a C runtime library that gives you all malloc, free, sprintf, etc. It may be ugly, but it is there. Then, of course you (probably) are running on some version of Linux, BSD, Mac OS X, Windows, etc. All written in C (and YES the underlying OSes all have different levels of intrinsic stability, I know, and so do you, please no comments about this, they're not productive).
My point is this: if your favorite high-level language and/or run-time is "ultra stable" then how was this accomplished in poor old C/C++? At the very least C/C++ was involved somewhere in the lifetime of these products.
I'm now going to regurgitate some already existing comments and add a few of my own:
- Yes, as others have said, do consider using a well tested higher-level language/run-time to glue stuff together and optimize where you can. Since you said that you can't do that, I guess C++ it is.
- Yes, use external components that have a long track record of being "functionally" stable.
- Testing is great, and using test driven development is a good idea and I certainly would make this the central "arch" of your development efforts.
- FORMAL development has been demonstrated to give the most consistently good results in terms of "quality" of produced code. What I mean by this is, to the extent possible: PROVE the correctness of your code, formally. There are plenty of books to help you get started. A place to start would be David Gries' book "The Science of Programming", an excellent text that will make you a better programmer no matter what.
- Since you, like me, are human and imperfect, it is in your best interest to assume that failures will happen. Then I would suggest KISS (keep it simple stupid). Build individual parts that are VERY simple, easy to understand, and highly unlikely to fail individually and assemble them into more complicated systems. Again, the idea is to keep each individual assembly step as simple as possible.
- Code defensively in two domains: Exceptions/failures - to the extent possible write your code to be able to identify and recover when failure has occurred. ALSO, and this is equally important: Sanity checks - wherever possible DON'T trust the results from each of your components. Wherever you can, implement sanity checks on incoming values from each one of your modules. This will go a long way to helping identify where a 36 hour computation went wrong.
- Log, log, log. You won't get it right the first time but the more information you have to fix your mistakes the better.
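For the sanity-check point in the list above, a sketch of validating a batch of values handed over by another module before using them. The function name and the "finite and within [lo, hi]" rule are assumptions for the example; what counts as sane depends entirely on your computation:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Returns an empty string when every value is finite and within [lo, hi];
// otherwise returns a message naming the first offending element, which is
// also the thing you would want to write to the log.
inline std::string sanity_check(const std::vector<double>& values,
                                double lo, double hi) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        const double v = values[i];
        if (!std::isfinite(v) || v < lo || v > hi) {
            std::ostringstream msg;
            msg << "value " << i << " out of range";
            return msg.str();
        }
    }
    return std::string();
}
```

Run something like this at every module boundary and a NaN produced in hour 3 gets reported in hour 3, not in hour 36.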
In order to help you assemble a lot of simple parts you may want to consider Linda (C-Linda and C++-Linda also exist). It might make building your system easier.
Hope this helps
Where do people get this idea? I have ported quite a few applications, and usually the porting is done by locating the libraries you need on the new platform, and fixing a few oddities in the current platform (like closing sockets in z/OS or switching to unsafe multitasking (p-threads) on Windows). Porting to Linux is so trivial that I often do it just to get access to the superior tools available there, especially valgrind. GUI is the exception, of course, unless you use a x-platform kit from the beginning.
Which leads me to my recommendations, in no particular order
The above approach works for me. Your mileage may vary.
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
I am by no means a specialist in this field and of course I do not know whether your project actually allows this approach.
But if I were asked to do this, I would take a database (a stable release of MySQL is what I would choose) and use it both as the persistent object storage and communication module. The GUI and the number-crunching module(s) would be set up to primarily communicate through the database, rather than directly with each other. A task state/queue table in the database would inform the modules what tasks have not been assigned yet, are running, are complete, have not returned in the expected time, or have failed. This would make it asynchronous and highly traceable; databases are (supposed to be) good at managing the interactions between multiple user processes and still maintaining data integrity. Admittedly this is not the best approach if you want your results real-time.
The central management of the processes could be kept minimalistic and simple, and "therefore" robust: Some very simple communication with number-crunching processes to test whether they are alive (a TCP/IP socket read-write might do), re-opening a task that has not returned in an expected time period (if its process still returns later, the newly started process will have to detect that its work was already done, and discard its results instead of writing them back), and perhaps signalling critical task completion to users (by GUI message, e-mail, text message, ...). The central management would not have the startup responsibility for distributed number-crunching modules; that would remain with the local servers they are running on. Such a process can then "knock on the door" of the database, register its presence, and take the next available task, or wait until one is available.
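The re-opening logic can be sketched independently of any database. Below, an in-memory `TaskTable` (an invented class standing in for the task state table) hands out pending or overdue tasks and tells a late finisher to discard its results; timestamps are plain numbers supplied by the caller:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class TaskState { Pending, Running, Done };

struct Task {
    TaskState state = TaskState::Pending;
    long deadline = 0;  // time after which a running task may be re-opened
};

// Hypothetical in-memory stand-in for the task state/queue table.
class TaskTable {
public:
    explicit TaskTable(std::size_t count) : tasks_(count) {}

    // Hands out the first task that is pending, or running past its deadline.
    // Returns the task index, or -1 if nothing is available right now.
    int claim(long now, long timeout) {
        for (std::size_t i = 0; i < tasks_.size(); ++i) {
            Task& t = tasks_[i];
            const bool overdue =
                t.state == TaskState::Running && now > t.deadline;
            if (t.state == TaskState::Pending || overdue) {
                t.state = TaskState::Running;
                t.deadline = now + timeout;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Records a finished task. Returns false when the task was already done,
    // in which case the caller should discard its results.
    bool complete(int id) {
        Task& t = tasks_.at(static_cast<std::size_t>(id));
        if (t.state == TaskState::Done) return false;
        t.state = TaskState::Done;
        return true;
    }

private:
    std::vector<Task> tasks_;
};
```

In the real thing each of these methods would be a single transaction against the table, so that two workers can never claim the same row.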
The persistent form of the core data structures would be in database tables, but the modules would of course have their share of the data in memory as class representations of the data structures, defined to be initialized from the database tables and written back to them. These class representations of the data structures then could be in a common library shared by the different modules, but alternatively you might opt for different class representations for e.g. the GUI and the number-crunching modules if that is more efficient (it often is) and even write them in different languages if that is more convenient. I admit that that adds to the amount of code and therefore to the amount of bugs. On the other hand, you could write two "completely" independent implementations of the same task.
If your number-crunching is complex and long, then evaluate whether you can write back intermediate states to the database as a recovery point, or even split the calculations into completely independent modules, each one starting and ending with a given database state. The desirability of this depends, of course, on the balance between I/O and processing costs. If you have modules that are relatively simple and safe but need to work quickly through a large amount of data, you could consider database stored procedures for these; not very distributed, but it reduces the amount of I/O and they can easily be called by client processes.
The database does not care in what language the different modules are written, so you can then write every one in the language that is most appropriate. For example, there may be no reason at all to write (parts of) the GUI in C++ -- and that is something I would try to avoid. If performance allows it, I would use Java for the GUI, both for portability and simply to avoid the mess of writing user interfaces in C++; in my experience that does not tend to be the most stable solution.
For the C++ part I would start by structuring pretty strongly; write a large number of simple classes instead of a smaller number of complex ones, and test every class before you move on to the next level. The "salami approach" works well if you plan it well. It is perfectly possible to write very ro
If you want "ultra-stability", then choose a language designed for that purpose. Standard ML is the obvious choice - why waste time testing when you can prove the correctness of the program from the get-go?
"Availability of core libraries" - what relevance does this have? It's pretty standard practice these days for languages to provide access to external libraries, regardless of implementation language.
Once you have a provably correct and therefore "ultra-stable" implementation, attack the performance issues. Profile to find the bottlenecks. Use better algorithms. Recode performance hogging functions in assembly. Most obviously, upgrade the hardware. It's much cheaper and practically risk-free.
C++ takes a lot of platform-specific work to become portable
Bullshit. C++ written well is portable by default (between windows and linux). There are a few minor issues between linux and sgi.
"Ultra-Stable Software Design in C++"
- Malmesbury & Duke, 'Tech Stuff for Morons' series, 1998
Other titles in the renowned M&D catalogue include: -
- "Make Your Own Canoe from Chicken Wire", 1982
- "The Importance of Pig Iron in Modern Aviation design", 1990
- "1001 Cajun dishes featuring raw sewage", 2005
A great series. I have a bookcase full of 'em.
What you just posted did NOT make any sense. One Cycle? Of What? Please clarify what you just said.
I am not merely a "consumer" or a "taxpayer". I am a Citizen of the State of Texas
Have you flown on a commercial airline in the last 30 years? If so, you trusted your life to software.
There is a standard called DO-178B Level A that applies to aircraft software upon which lives depend. There is a saying in the commercial avionics business: "Nobody has ever died from software failure on an airplane, yet." There have been some accidents where software played a role, but I won't quibble with that now.
The point is that safety critical software is developed routinely. It has been developed in assembly language. It has certainly been developed in Ada, C, and sub-sets of C++. It is expensive. Validation of avionics software and certification in an aircraft can easily cost an order of magnitude more than just writing the software, and writing the software using required processes and producing required artifacts is not cheap either.
You say "IMHO" too often.
I googled a little, and came to this page: which in turn led me to this old PDF DOCUMENT from 2002: But I can't for the life of me tell what this operating system is supposed to be.
Is it Digital VMS? Is it Digital OSF/1-Tru64? [OSF/1 is mentioned on page 9 of the PDF document.] Is it some flavor of HPUX?
Or is it something else entirely?
And, parenthetically, I'd ask: Why do the droids in Sales-N-Marketing insist on publishing this crap that doesn't even begin to answer the most fundamental questions their customers might have?
After reading some of the replies, most people seem to take it for granted that C++ is inherently unstable. The main reason cited is memory leaks. Well, guess what, C++, if written properly, can have garbage collection. The answer is smart pointers. Specifically, check out the boost smart_ptr module. I find that if I wrap all calls to new in a shared_ptr or shared_array, my memory leaks go away. If I need bounds checking on an array, I'll use a std::vector class. And whenever type casting a class, I'll use a dynamic_cast, followed by an assert() of the result. Of course, there's always the possibility of introducing a memory leak when sharing pointers with another library, but at some point, you have to trust some code to be crash proof (like the .NET runtime or the JVM). In short, the key to safe C++ is smart programming techniques.
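A minimal sketch of the three habits mentioned above (owning every allocation through a smart pointer, bounds-checked access, and a checked downcast). It assumes the standard library's `std::shared_ptr` rather than the Boost one the post refers to; `Shape`, `Square` and `Circle` are made-up types for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Square : Shape {
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
    double side;
};

struct Circle : Shape {
    explicit Circle(double r) : radius(r) {}
    double area() const override { return 3.141592653589793 * radius * radius; }
    double radius;
};

// Every allocation is owned by a shared_ptr from the start, so there is no
// bare `new` whose matching `delete` could be forgotten.
std::vector<std::shared_ptr<Shape>> make_shapes() {
    std::vector<std::shared_ptr<Shape>> shapes;
    shapes.push_back(std::make_shared<Square>(2.0));
    shapes.push_back(std::make_shared<Circle>(1.0));
    return shapes;
}

// vector::at() is bounds-checked: a bad index throws std::out_of_range
// instead of reading past the end of the array.
double area_at(const std::vector<std::shared_ptr<Shape>>& shapes, std::size_t i) {
    return shapes.at(i)->area();
}

// Checked downcast: dynamic_pointer_cast yields an empty pointer when the
// object is not really a Square, and the assert makes that failure loud.
double side_of_square(const std::shared_ptr<Shape>& shape) {
    std::shared_ptr<Square> square = std::dynamic_pointer_cast<Square>(shape);
    assert(square && "expected a Square");
    return square->side;
}
```

Note that `assert` disappears in builds compiled with `NDEBUG`, so a release build that must survive a bad cast would need an explicit check instead.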
Also, this is intended to be run primarily under Linux, with the possibility of a Windows port later. Can anyone tell me how good the Mono class libraries are? I would think that, like GCJ, Mono is somewhat incomplete and/or buggy.
If you can read this sig, you're too close.
The parent refers to Erlang, and is modded "Troll."
copy( istream_iterator<int>( cin ), istream_iterator<int>(), back_inserter( v ) );
is just plain beautiful IMH?O.
Well, if this is beautiful, then why is it that I have no idea what it is doing? Beautiful code is unreadable? Congrats! You are a genius.
How about Syntax like:
a[3;12]=b[11;2];
That would be beautiful. Iterators are a misunderstood concept.
Your post convinces me even more that there is no substitute for maturity in writing bullet-proof code. It isn't the language - C and C++ are fine, mature languages for mature programmers. This has nothing to do with age.
If even a single bug is allowed into a daily build, then you will not have a bullet-proof program. Most commercial teams cannot take that high standard due to costs and schedule pressures.
Here is a real answer, unlike the dozen or so I just saw perusing existing posts.
The goal of 'crash free' is going to be impossible... Even if you could write perfect code, you already stated you need to rely on third party libraries. You can only be as stable as they are. You want to change this goal to be 'high availability', which is a system architecture term that you can now google on and find some design principles, including redundancy. You want to be 'resilient to crashes'. Keep in mind that this will affect everything down to the design of the computer it will run on (do you need 2 hot-swappable power supplies so that one can fail without the app going down?)
While this term is so overused as to almost be meaningless, you will want to keep in mind a 'service-oriented architecture'. What I mean is that you will want to build the modules you discuss in your post as 'services' that other components will rely on. The components that rely on the services need to be resilient to failure, with strategies such as queueing a request that isn't fulfilled (due to a crashed service), redirecting the request to another available service, etc. You want to be able to decompose your problems into 'problem chunks', and think of your various modules as 'producers' and 'consumers' of these chunks. The producers and consumers shouldn't talk to each other directly; there should be a blocking queue between each so that producers and consumers can be decoupled - the producer can keep producing while consumers can come and go, and consumers can have work to do if a producer crashes and has to restart.
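The blocking queue between producers and consumers could look roughly like this within a single process. It is a sketch assuming C++11 threads and a bounded capacity; `BlockingQueue` and its `close()` convention are illustrative choices, and a queue between separate processes would need a message broker or database table instead.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

template <typename T>
class BlockingQueue {
public:
    explicit BlockingQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Blocks while the queue is full. Returns false if the queue was closed.
    bool push(T item) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return closed_ || items_.size() < capacity_; });
        if (closed_) return false;
        items_.push_back(std::move(item));
        not_empty_.notify_one();
        return true;
    }

    // Blocks while the queue is empty. Returns false once the queue is
    // closed and fully drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        not_full_.notify_one();
        return true;
    }

    // Wakes up every waiting thread; items already queued can still be popped.
    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        closed_ = true;
        not_empty_.notify_all();
        not_full_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::condition_variable not_full_;
    std::deque<T> items_;
    std::size_t capacity_;
    bool closed_ = false;
};
```

A producer thread calls `push` in a loop and a consumer thread calls `pop` until it returns false. Because neither side holds a reference to the other, either one can be stopped and restarted while the queue keeps the pending chunks.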
That should give you a good start for googling on some architecture.
Lastly, if you really want your code to have any hope of being nearly bulletproof, you have to have some discipline, in the following forms:
1) Use Coding conventions, and stick to them. Automate the ability to check compliance. With coding conventions, maintenance becomes easier, defects stand out, etc.
2) Write unit tests, and have good code coverage. In C++, you are going to have to worry about bounds checking, etc. Unit tests are the best way to prove your code works.
3) Document things well. Don't document what the code does, document WHY it does it.
4) Use some Version Control/CM processes and tools, like CVS or Subversion. You ALWAYS want to be able to recreate a particular state, for instance, if you do have a defect and need to be able to recreate it in a predictable environment. Besides this, CM gives you a project-lifetime 'undo' capability.
5) Use static source code analysers like lint. (There are dozens of these in Java, like PMD and FindBugs). Even if these tools cannot find the really hairy problems for you, they do find defects and do so cheaply and easily, compared to finding the same defect yourself, or worse, after your software is in production.
Hope this helps.
It may be necessary to use C++ for parts of the system, but can they be isolated as a separate adjunct process? That way, your damage is limited and your main, secure process can keep a watchful eye on the C++ process.
Design the C++ side to be stateless, so a halt-and-restart can occur losslessly between RPC transactions.
I see the .NET astroturfing has begun.
How much does each score 5 post pay?
Use Electric Fence, Valgrind or Purify and automatically test your program to death using deterministic and then randomized (in key dimensions) inputs - let the test run itself for days. This technique can uncover a lot of stability defects.
Only the paranoid, diligent programmer stands a chance of writing "crash proof" code. The reason code crashes is that some error checking is missing. If the "worst case" scenario is always handled, then the program may quit functioning as expected but it won't crash.
The other major reason programs quit working, and possibly crash if there isn't sufficient error checking, is memory leaks, which are due to poor architecture. There is no substitute for using a "formal" architecture, like state machines, to keep unexpected code paths from being executed. The formal architecture will help a programmer determine the best points to allocate and free memory so that leaks are avoided. Garbage collection is a great idea to help harried programmers get something out the door but it can't take the place of good architecture.
"Meaningless!, Meaningless!" says the Teacher. "Utterly meaningless!"
Ah, but test driven development flies in the face of the new government backed, SEI approved software development silver bullet called TSP (Team Software Process). And by following TSP you too can consider just how much better it is than test driven development while waiting for your co-workers to inspect your code for a few months.
All of the above +... Use C xor C++. When you must mix (OS or external library calls), wrap. C++, if not used with STL, boost, exceptions, SafeGuard, RAII, the whole charade, sucks beyond repair. P.S. If you use STL and have exceptions disabled, leave the project right now or I'll shoot :-)).
1) Learn to use STL.
Do *all* memory management via STL vector/string.
2) Don't ever type "new[]/delete[]".
Just don't do it. Not. Ever. Use std::vector instead.
"Arrays are evil" - the C++ FAQ.
PS: You can still use malloc()/free() but only as a last resort in low-level classes which are designed for data storage.
3) Get a reference-counted pointer and use it.
Automatic memory management...'nuff said.
4) Attach an alarm bell to your "~" key.
If you're writing destructors for classes which don't control system resources (eg. files) then you're probably doing something wrong - see notes 1, 2 and 3.
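To illustrate points 1 to 4 together: in the sketch below the only hand-written cleanup is the deleter for a file handle (a system resource), and the data class has no destructor at all because its members manage themselves. It assumes C++11's `std::unique_ptr` as the smart pointer; `FileCloser`, `open_file` and `Samples` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// The one place a hand-written cleanup belongs: a system resource.
struct FileCloser {
    void operator()(std::FILE* f) const { std::fclose(f); }
};
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Opens a file or throws. The handle is closed automatically on every
// path out of the owning scope, including exceptions.
FilePtr open_file(const std::string& path, const char* mode) {
    assert(mode != nullptr);
    FilePtr file(std::fopen(path.c_str(), mode));
    if (!file) throw std::runtime_error("cannot open " + path);
    return file;
}

// No destructor, copy constructor or assignment operator is written here.
// std::string and std::vector already do the right thing, so the
// compiler-generated versions are correct.
class Samples {
public:
    explicit Samples(std::string name) : name_(std::move(name)) {}

    void add(double x) { values_.push_back(x); }   // replaces new[]/delete[]
    std::size_t size() const { return values_.size(); }
    const std::string& name() const { return name_; }

private:
    std::string name_;
    std::vector<double> values_;
};
```

Copying a `Samples` object makes an independent deep copy with no extra code, which is exactly the case where typing "~" would be a warning sign.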
No sig today...
I'd suggest reading Software Safety by the Numbers at http://www.embedded.com/showArticle.jhtml?articleID=19201765 on IEC 61508 for safety critical industrial software
or getting a copy of RTCA DO-178B from http://www.rtca.org/ for avionics.
Whatever else you take from the multitude of responses, your system will be better off for having formal design and code reviews among small subsets of your team. And collect all test cases into a framework such that any potential release will have to pass ALL previous tests. These two steps alone will go a long way to making your application stable and correct.
Efficient, bug free code is the Holy Grail of software, and no one has achieved it yet. All I can do is give the standard answers:
1) Nail down the specifications and don't let them change. Once you have a design worked out, any changes to the goal can cause unforeseen complications and bugs.
2) Have the system designed by someone with a lot of experience in system design and a proven track record of low-defect designs. Bugs caught and prevented in the design stage are always the cheapest to cure.
3) Take the estimated time for testing and debugging and double or triple it. Cutting time corners is one of the main ways bugs get added or lost. Remember that the really low defect software houses document and justify EVERY SINGLE CHANGE TO EVERY LINE OF CODE. This may be overkill for you if you aren't remotely programming a $10 million satellite or writing life-critical code, but it's how you make sure your defect level is as low as possible.
4) Hire veteran programmers. It's no guarantee of bug-free code, but among better programmers, the longer they've been programming, the cleaner their code is. Besides, they'll know all of these points already. The obvious corollary is not to hire bad programmers, no matter how much experience they have.
5) Turn on all warnings and errors, and use an additional external verifier. I prefer Gimpel Lint, but there are other lints out there as well that can help you. Generally, any tool that can help in the verification process should be used.
6) Code verification walkthroughs. It's amazing how much cleaner folks write code when they know that they will have to show and explain it to a random co-worker.
7) Have unit tests of each module and/or code section as it's written. Keep the tests and use them for regression later. Have someone OTHER than the writer of the code test it.
8) Pray to the Deity of your choice that the inevitable bugs won't be serious.
especially Boost::Python.
Ok, my ideas may be a little juvenile, but here are specific programming instructions that will work in the testing phase:
1. Find the error message possibilities with every function you call and handle them with try/catch.
2. Implement a timer that will send a message every x seconds after you start it. Start it before every loop, and stop it after the loop completes. If the loop turns into an infinite loop, your timer will throw an error message (which should be handled) in the function body of that loop.
3. To keep all of your modules separate, either use managed code, or just write an if statement checking for a module's existence. If it's not there, let the user know so they can fix the issue.
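Point 2 can be approximated without a separate timer thread. The sketch below is a cooperative variant, which is an assumption on my part and not quite what the post describes: the loop itself checks a deadline on each iteration and throws when it has overrun. `LoopGuard` and `count_steps` are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <stdexcept>
#include <string>
#include <utility>

class LoopGuard {
public:
    using Clock = std::chrono::steady_clock;

    LoopGuard(std::string name, Clock::duration budget)
        : name_(std::move(name)), deadline_(Clock::now() + budget) {
        assert(budget >= Clock::duration::zero());
    }

    // Call once per iteration; throws when the loop has used up its budget.
    void check() const {
        if (Clock::now() >= deadline_)
            throw std::runtime_error("loop '" + name_ + "' exceeded its time budget");
    }

private:
    std::string name_;
    Clock::time_point deadline_;
};

// Hypothetical example loop: counts up to `limit` steps, or throws if
// that takes longer than `budget`.
long count_steps(long limit, LoopGuard::Clock::duration budget) {
    LoopGuard guard("count_steps", budget);
    long steps = 0;
    while (steps < limit) {
        guard.check();
        ++steps;
    }
    return steps;
}
```

This only catches loops that keep iterating. A loop that blocks forever inside a single call never reaches `check()`, so that case needs an external watchdog process or thread.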
If you can program in these three things, it will be up to the user to respond to problems, rather than the programmer. Once you've tested this application, you'll probably have a good understanding of what responses the user still has to make on a regular basis, and program those in as well.
It's all about making sure you know all of your specifications, making designs that will work within the design, as well as work without. If you can test all design modules to a point where the programmer and user will agree on its functionality, you should be in a good place to sandwich test with another couple modules.
Check out IEEE software engineering documents for help with design and testing ideas.
Easy answer: just learn to code, dammit! :P
No, seriously something like a virtual machine might be the way to go. Yes it's horribly contorted, but see each indirection as a choke-point for leaks and glitches. Build in as many idiot-proof checks as you can, and do extensive bounds checking and whatnot. If you can make the outer layer of your app crash-proof, it will form a sort of "condom" for the innards.
-Billco, Fnarg.com
Frankly, if you just follow the guidelines established in Steve McConnell's Code Complete, most if not all of your concerns will be addressed. The guidelines pretty much apply independent of language, operating system, etc.
Don't underestimate the power of The Source
>
> is just plain beautiful IMH?O.
I'm sorry, but I just can't agree. It might appeal to a mathematician who wants to see everything use functional notation and hates every language except lisp, but to a non-abstract-elite-ivory-tower-mathematician this is absurd. cin is not an array of integers and the use of the adapter obfuscates the fact that you are using a conversion from a char array to an int. The back_inserter also makes it harder to see where the data is going by losing "v" in it. Many would also frown at it for taking a non-const reference, although since it is a standard adaptor it is probably ok.
C++ programmers are often unnaturally attached to efficiency and have to be watchful for template bloat. Your copy generates 88 instructions, whereas an equivalent iterative solution is only 33 instructions long, most of them belonging to the inlined push_back. Not only is the generated machine code smaller, but the source code is smaller as well, and is far more readable, making the algorithm obvious at a glance to any procedural programmer, who make up the majority outside the hopelessly out-of-touch with reality academia.
Academics love integer and float arrays because that's what they usually work with. Scientific simulations produce data in that form and require processing programs that take something from a data file, crunch some numbers, and output something to cout. In the real world people work on user interfaces, databases, and other complicated things, where one normally works with arrays of objects rather than numbers. If you ever tried to apply a functional algorithm to a vector of objects, trying to manipulate some member variables or call a member function, you would know that the result is so hideous that it isn't even worth considering. There is a reason people prefer iterative solutions; they are how the real world works. Reality is algorithmic, not functional, and so are user specifications for the things they want done. Trying to cram them into an abstract mathematical functional model is insanity.
> Use, carefully, and always when possible, smart pointers.
> Remember std::auto_ptr is your best friend
Most of the time, no. While I would not deny the utility of auto_ptr in localized situations manipulating the object state during reallocation, its constant use indicates lack of understanding of object lifecycle in the program. It is fashionable in Java to create objects left and right, without consideration of who is supposed to own them. Hey, just let the garbage collector take care of it! Who cares how long the object lives? Obviously, such immature mentality produces plenty of memory leaks for which Java is so infamous. In a good design object ownership is strictly defined. Objects belong to collections that manage their lifecycle. There ought to be no "dangling" objects that just "hang there". If you don't know to which collection the object belongs, you have no business creating it. If you think your objects are "special", you haven't thought beyond their internal functionality or considered where it fits in your overall design.
> Question 2: How can I actually implement such a decoupling?
> I would use a simple, socket-base, take-my-data, gimme-my-results scheme
And thereby slow your program to a crawl? There is a reason people use CORBA and the like: those frameworks optimize distributed object calls to avoid network hits, often being able to reduce the overhead to be equivalent to a virtual function call. Furthermore, networked applications have their own set of complexities and security considerations. You get to keep an open port somewhere, handle authentication (because wherever there's an open port, there will be malicious connections), and do extensive data validation (for the same reason). While these problems are applicable to dis
See also: wikipedia article on reference counting.
There is a book called Taming C++ by Jiri Soukup.
A lot of stuff in the book is likely not interesting or even strange, but he has excellent ideas about data structures.
E.g. if you have an aggregation (in terms of UML), the suggestion in Taming C++ is to implement it as a doubly linked list which is inherited via a template (kind of a mixin).
Try to find it in a library, look over the chapters and look at the code examples regarding that (and skip everything regarding design patterns ;D ).
Writing code the way the book suggests makes it very resistant to memory errors, and the code is ultrafast!!
angel'o'sphere
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
Get yourself a good C++ memory checker like Insure++. Code defensively and test extensively with the memory checker prior to deployment.
By code defensively I mean things like the following:
- If possible, use string routines that check lengths so a buffer cannot be overrun. Insure++, etc. won't find problems unless they occur during your runtime testing.
- Code using data structures and malloc as needed. Don't assume you will have a maximum number that could cause memory to go out-of-bounds.
- Code in alerts to inform you if a structure is getting too big (e.g. memory is not being freed).
- Don't assume data being read from a file or passed in from another program is of the proper type. Always check to verify and alert if there are issues.
For the memory checker, enable warnings & errors and do as much testing as possible to remove any problems. The software can find memory leaks, accesses to bad memory, arrays / strings / etc. that go out of bounds, etc.
Once you get the hang of the software, you can write rock solid code...but remember to code defensively so you learn to write good code and catch things the checker software doesn't. Don't rely just on the checker software.
The idea is simply to define the "space" of legal inputs for each module and the correctness criterion for each input, and then generate random inputs based on the spec. This is far more effective than traditional hand-coded test data at both unit and system test levels, and as an added bonus the test spec doubles as a formal specification of the correct behaviour that coders can actually work from. This is similar to the XP practice of "test-driven development".
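A small sketch of that idea, assuming a made-up module under test (`saturating_add`) whose input space is every pair of ints and whose correctness criterion is written separately as `saturating_add_spec`. Both functions and the fixed seed are illustrative only.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <random>

// Hypothetical module under test: addition that clamps instead of overflowing.
int saturating_add(int a, int b) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    if (b > 0 && a > max - b) return max;
    if (b < 0 && a < min - b) return min;
    return a + b;
}

// The spec, written as independently of the implementation as possible:
// do the sum in a wider type, then clamp. Assumes int is narrower than 64 bits.
int saturating_add_spec(int a, int b) {
    const std::int64_t sum = static_cast<std::int64_t>(a) + b;
    if (sum > std::numeric_limits<int>::max()) return std::numeric_limits<int>::max();
    if (sum < std::numeric_limits<int>::min()) return std::numeric_limits<int>::min();
    return static_cast<int>(sum);
}

// Draws `rounds` random inputs from the legal input space (here: all ints)
// and returns how many times implementation and spec disagree.
int count_mismatches(unsigned seed, int rounds) {
    assert(rounds >= 0);
    std::mt19937 rng(seed);   // a fixed seed makes any failure reproducible
    std::uniform_int_distribution<int> any_int(std::numeric_limits<int>::min(),
                                               std::numeric_limits<int>::max());
    int mismatches = 0;
    for (int i = 0; i < rounds; ++i) {
        const int a = any_int(rng);
        const int b = any_int(rng);
        if (saturating_add(a, b) != saturating_add_spec(a, b)) ++mismatches;
    }
    return mismatches;
}
```

Uniform random ints rarely hit exact boundaries, so in practice the generator would also be biased toward edge values such as the minimum, the maximum, zero and their neighbours.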
Paul.
You are lost in a twisty maze of little standards, all different.
Make a compiler (or translator, converter, or whatever you prefer to call it) in {Haskell,Lisp} translating {Haskell,Lisp} into C++. Then, make your application in Haskell and let your compiler spit out C++ code.
One word, Bubba!
Ada. It's the ONLY way to create truly safe software.
Trust no others.
I've found the biggest problems that cause crashes are memory management, lack of testing and object state issues.
Ensuring that an object allocates its own memory and deallocates it in a well-managed fashion in every possible instance is a good step. Also, people developing object-oriented code in C++ often don't think about the state of their objects and instead assume proper usage; this is a *big* mistake. You should ensure that your objects are always in a valid state no matter what happens, or throw an exception, in every constructor and method, and expose as little member data to unmonitored changes as possible.
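As a sketch of "always in a valid state, or throw": the hypothetical `BoundedBuffer` below has one invariant (size never exceeds capacity), keeps its data private, and rejects every call that would break the invariant before changing anything.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

class BoundedBuffer {
public:
    // A buffer that cannot be constructed correctly is never constructed.
    explicit BoundedBuffer(std::size_t capacity) : capacity_(capacity) {
        if (capacity == 0)
            throw std::invalid_argument("capacity must be positive");
        data_.reserve(capacity);
        check_invariant();
    }

    // Rejects the call before modifying anything, so a failed push
    // leaves the object exactly as it was.
    void push(int value) {
        if (data_.size() == capacity_)
            throw std::length_error("buffer is full");
        data_.push_back(value);
        check_invariant();
    }

    int pop() {
        if (data_.empty())
            throw std::out_of_range("buffer is empty");
        const int value = data_.back();
        data_.pop_back();
        check_invariant();
        return value;
    }

    std::size_t size() const { return data_.size(); }
    std::size_t capacity() const { return capacity_; }

private:
    // Debug-build check that no method has left the object inconsistent.
    void check_invariant() const { assert(data_.size() <= capacity_); }

    std::vector<int> data_;      // never exposed, so nobody can bypass the checks
    std::size_t capacity_;
};
```

The exceptions guard against misuse by callers, while the `assert` guards against mistakes inside the class itself.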
Finally, I've found unit testing to be really useful. Every class, or if needed group of classes, that I develop gets a unit test module that fully exercises its functionality, focusing on boundary conditions. No program is crash proof, but if you write your software well and test each piece and then the pieces together, you can have some reasonable assurance that things will work as expected. A well-defined test suite for after you put it together doesn't hurt either.
I was crazy back when being crazy really meant something. (Charles Manson)
First, there is no magic solution to this. IT IS A LOT OF WORK. Also, my company predominately develops software for non-Linux platforms, so I'm not going to recommend any Linux specific tools.
I recommend the following (by no means a complete list):
1. Fuzz testing
Fuzz testing is throwing directed yet random inputs at a program to see how it fails. Extremely long strings, null terminated strings, invalid files, files that are "almost" valid, etc. It's good for security but it also helps reliability. Even if all your input comes from trusted sources, by protecting against invalid input you also protect against bugs in these sources.
2. Dynamic Analysis Tools
There's a wealth of tools that'll simulate disk-read errors, out of memory errors, and other failures like this. Even if you expect to always have enough memory, OOM conditions may happen even temporarily. Tools like AppVerifier help detect heap buffer overruns, underruns, and bad API usage. Run your test passes under tools like these.
3. Static Analysis
There's a host of tools which can analyze source code and look for problems. Run these as often as possible and fix all the issues which come up. If they are quick to run, make a clean run checkin requirement.
4. Establish a feedback loop
Even with the strictest coding standards, strict testing, and excellent tools, crashes will happen. Eventually, your code will run in an unexpected environment, some external influence on the program will corrupt its environment, or some maintenance coder two years down the road will check in a "fix" that introduces a crashing regression for some customers. Have some way for your customers to send you dump files whenever a crash does happen. If you happen to support Windows, this is really easy. Microsoft has a site for getting access to all the crash data that the customer would send for your product. Establishing an account is free (as in beer), but does require you to provide a VeriSign ID to establish identity, so no one else will try to get at your data. My company uses this, and it allows us to focus on the top N crashes that occur in our products so we get the most bang for our bug.
Even if you do all of the above, there will still be some crashes in the product.
What not to do:
1. Swallow all exceptions
This'll make your code appear more stable on the surface, but by blindly swallowing exceptions you are forcing your code to operate in a state you never designed for. All you really do is turn an easy to diagnose crash into an impossible to diagnose crash, or worse, a bug that just results in silent data corruption.
2. Believe that using library x/Java/.Net/STL/etc. will fix your problems
All of the above are just tools, but it is still possible to have crashes even if you use these tools 100%. An OOM exception in any of the above is more graceful and more recoverable than an access violation, but you're still going to have to do a lot of work to make sure you eradicate the sources of exceptions in your code as well as make sure the exceptions you do expect and can recover from you can actually rollback/retry/etc. to leave your data in a valid state.
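One way to read the two points above in code: catch only the failure you know how to recover from, and make the operation all-or-nothing so that recovery leaves the data valid. The sketch assumes a toy `Ledger` map and a `Transfer` struct invented for the example; the technique is the usual copy-then-swap.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using Ledger = std::map<std::string, long>;

struct Transfer {
    std::string from;
    std::string to;
    long amount;
};

long total(const Ledger& ledger) {
    long sum = 0;
    for (const auto& entry : ledger) sum += entry.second;
    return sum;
}

// Applies every transfer or none. All work happens on a copy; the real
// ledger is only touched by the final swap, which does not throw.
void apply_all(Ledger& ledger, const std::vector<Transfer>& batch) {
    Ledger scratch = ledger;
    for (const Transfer& t : batch) {
        if (t.amount <= 0)
            throw std::invalid_argument("amount must be positive");
        auto from = scratch.find(t.from);
        if (from == scratch.end() || from->second < t.amount)
            throw std::runtime_error("insufficient funds in " + t.from);
        from->second -= t.amount;
        scratch[t.to] += t.amount;
    }
    assert(total(scratch) == total(ledger));   // transfers never create money
    ledger.swap(scratch);
}

// Catches only the failure it knows how to handle. Anything else
// (std::bad_alloc, std::invalid_argument, ...) propagates to the caller.
bool try_apply(Ledger& ledger, const std::vector<Transfer>& batch) {
    try {
        apply_all(ledger, batch);
        return true;
    } catch (const std::runtime_error&) {
        return false;   // ledger is unchanged, so the caller may retry or report
    }
}
```

The copy costs memory and time proportional to the ledger, which is the price of being able to abandon a half-applied batch safely.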
I can all but guarantee you that said libraries will have undocumented memory usage "features" that will conflict with whatever careful programming model you put together in your code. Very, very few C++ libraries are designed with long-term memory usage sanity in mind. If I were you, I would not trust any code that I hadn't read in detail.
To a Lisp hacker, XML is S-expressions in drag.
Not understanding something is one thing, but not understanding something so let's reject it as being "elite-ivory-tower" and "academic" is another. I've seen a lot of buggy C++ code being rewritten employing this style in obvious places - many defects were automatically addressed.
I disagree. Reality is reality. Algorithmic or functional are just ways people look at it. Isn't "algorithmic" also abstract? Isn't "object-oriented" abstract as well?
By the way, using your vocabulary, I view the world as a mixture of "algorithmic" and "functional". No pure anything can describe the world, in my opinion.
Being functional or algorithmic has *NOTHING* to do with one being "more mathematical" and the other "less mathematical". I advise you, that your use of the common peoples' fear for mathematics in your arguments is not going to help.
Templates, being code generators, differ by nature from hand-tuned code. So your code generates only 33 instructions vs the template's 88. Great - now tell me - which architecture? What compiler? What version of that compiler, and whose STL are you using, and which version of THAT?
And before you count the instructions, did you realize that this code is waiting for keyboard inputs, therefore what you're doing is unnecessary (and obviously premature) optimization?
How does the constant use of auto_ptr relate to the understanding (or the lack thereof) of object lifecycle? Sorry, but understanding object lifecycle and the liberal use of smart pointers are not mutually exclusive.
It is fashionable *among incompetent* Java developers to create objects left and right, which makes their programs memory hogs. It is also fashionable for *incompetent* C++ programmers to forget deallocations and leak memory. What's your point? This mentality, immature or not, is not unique to managed languages.
You may want to take a look at Erlang before getting started on the project. Erlang is a functional programming language that already addresses fault-tolerance, modularity, distributed computing, and other problems you may need to solve. Even if you don't end up using Erlang, you could probably pick up some good design ideas from the (excellent) documentation.
Erlang does provide several mechanisms for interfacing with C code. If you wanted you could use Erlang at a system level, and C++ at a lower level where performance and interoperability with existing code is important.
Large clusters are in a constant state of failure. Plan on having at least one node down (due to failed hardware) at any given time.
Some techniques to live with the state of failure are as follows:
-Error detection (know when a node is giving goofy results)
-Failure detection (know when a node has died)
-Checkpointing (periodically saving each node's state in a 'recoverable' manner)
-Hot spare nodes! (So your N-node computation can keep going after losing node(s))
If a node has either failed or is providing inconsistent results, then remove it from the cluster and assign a hot spare. Restore to the last checkpoint & resume your computation.
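Checkpoint and resume can be sketched as follows, assuming a toy per-node computation (a running sum over a work list) and a string standing in for the checkpoint store. `NodeState`, `serialize`, `deserialize` and `run` are hypothetical names, and a real cluster would write checkpoints to storage that survives the node.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Everything a spare node needs in order to carry on.
struct NodeState {
    std::size_t next_item = 0;   // index of the first item not yet processed
    long long partial_sum = 0;
};

std::string serialize(const NodeState& s) {
    std::ostringstream out;
    out << s.next_item << ' ' << s.partial_sum;
    return out.str();
}

NodeState deserialize(const std::string& text) {
    std::istringstream in(text);
    NodeState s;
    if (!(in >> s.next_item >> s.partial_sum))
        throw std::runtime_error("checkpoint is unreadable");
    return s;
}

// Processes items from `state` until the work is done or `stop_at` is
// reached (which simulates the node dying there), and overwrites
// `checkpoint` after every `interval` items.
NodeState run(const std::vector<int>& work, NodeState state, std::size_t stop_at,
              std::size_t interval, std::string& checkpoint) {
    assert(interval > 0);
    while (state.next_item < work.size() && state.next_item < stop_at) {
        state.partial_sum += work[state.next_item];
        ++state.next_item;
        if (state.next_item % interval == 0)
            checkpoint = serialize(state);
    }
    return state;
}
```

If the first node dies after item 7 with a checkpoint interval of 3, the spare restores the checkpoint taken at item 6 and redoes one item. The interval is therefore a trade-off between checkpoint I/O and the amount of work repeated after a failure.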
Failure happens. Plan for it.
Hi there,
:-P
... and, well, you have Google. ;> But those are all good ones.
First of all, I want to ask you to have a more open mind about all of this. You sound like you have already decided that using a particular language will result in poor software. I can reassure you, though: If you are using C++ properly and have written all your code perfectly, your program will not crash. It really is that simple.
Leverage as many tools as you can to help you deliver quality. Valgrind, memcheck, gdb (I like the ddd shell), whatever it takes. Look into commercial tools as well.
Establish good unit testing practices, code standards, peer review, and QA procedures. Find a development process that will fit your organization and use it. Make all of those things work for you.
Finally, learn to use your tools as well as you can. Stop blaming the knife when it cuts you. Instead, keep it razor-sharp and learn how not to cut yourself with it. It does not matter what kind of knife you are using. Make the commitment to craftsmanship that's required to deliver quality software.
Okay, off the soapbox and on to some practical ideas for you to investigate.
First of all, use a high-quality, modern C++ compiler and the best libraries you can find. I recommend you explore Boost, to start with. If you want garbage collection, use a garbage collector. Have a look at the commercial garbage collectors or the Boehm garbage collector. Remember that memory is only one resource that you will have to manage, so be sure to look at the smart pointers in Boost and the standard library, and learn to use RAII in the right way.
You didn't mention the kinds of algorithms you needed, but if they're mathematics-related, check out Blitz++, POOMA, MTL (Matrix Template Library), GNU mp, fftw,
For distributed computing, components, object persistence, and many other very cool things, I suggest that you take a good look at Ice from ZeroC Software ("http://www.zeroc.com"). In my opinion, it's got all of the great things about CORBA only much, much nicer to work with. It supports the platforms that you mentioned and several different mainstream languages, and is very stable and efficient.
GUI: Qt. Or, find another library you like. There are lots of good ones that should meet your criteria. You will get the most versatility from whatever you decide to use by using the GUI library well and building to a good, modular design.
Finally, unless there is some reason not to do so, don't forget that you can easily bind C++ into some other language. You could write parts of the application in Ocaml, Scheme, Ruby, or whatever else you want.
Best of luck to you.
It's very simple - don't use assert as assert will just crash the program. Rather, do something better. Detect the error, give the user a good and useful error message, and then terminate the program. You can't do anything else, but at least you can give the user some idea of what happened - of what went wrong - and you can give yourself a good idea where to look to debug it. And, BTW, nothing else in the system can produce a useful message to the user. Life would be a lot easier if everything in an OS did that. (Sure, a core dump helps the developer too, but it won't help the user know what happened, and it won't be very meaningful.)
Asserts just produce crappy software - and BSOD's.
Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
Bulletproof code isn't cheap, but it can be done.
This is the most insightful comment I've seen so far. Particular tools can fix particular problems, but that's the easy part. The hard part is finding and noticing the problems, so that you know to look for (or make) the tools.
My teams have in-production bug rates well below one per developer-month. Here are the ten things I think are most important:
Crack smoker.
We used C++; no other language could match the portability and performance requirements (Windows, Linux, Mac).
The system we built is a P2P real-time communications system layered over TCP/HTTP and UDP, on top of which we implemented a virtual file system for sharing files as well as data streams (telephony and streaming video/audio).
The performance was needed because we wanted to minimize the number of root servers, and the stability is required because the clients run for hours and hours and crashing is not an option.
We used smart pointers for everything that is dynamically allocated (either reference counted Austria C++ smart pointers or std::auto_ptr).
We wrote our own serializer/deserializer (which took about 3 days to write) which uses standard C++ and does some pretty clever type safety (it is hard to serialize something you're not supposed to be able to; in fact you get compile-time errors). As an aside, all the libraries we wrote follow the "catch errors in the compiler if you can" rule, making it harder to write broken code.
Virtually everything has a unit test (using the Austria C++ unit test environment).
Automated builds and tests. We have 2 machines that continuously build and test for Linux and Win32. Often we find errors in the code that only show up in one of the builds but the error is truly a problem in the other system as well.
There are a whole bunch of other things we do as well but these are the ones that get you 80% of the way.
Oh, I know!! Model it after Microsoft Internet Explorer!!!
<overrated>Insert Sig Here</overrated>
What a stupid question.
It's either a complex problem, whereby no amount of threaded drivel will help, or trivial enough that the author should put his head down and work it out for himself.
(The GNU Compiler for Java, I mean.)
Lets you write your logic stuff in Javur, compile it to native code with all of the benefits of managed code (except dynamic optimization) and interface some C++ in once you start getting spots where near-managed languages aren't enough. It's what we use in our company's core business functions and it hasn't failed us yet. You can even use SWT to get nice, GTK+-ish GUIs. And then there's JUnit.
... I *do* like ocaml. Especially as a replacement for C/C++. So i'm even planning to use it for games (has opengl and sdl support.)
;)
But every language has its purpose. And ocaml is more targeted on a fast next-generation language for everything, while haskell wants to be the perfectionist.
And i guess this job is a perfectionist's job.
Use Ocaml where speed really matters and C is too outdated for you.
Any sufficiently advanced intelligence is indistinguishable from stupidity.
... after reading the opening line, I'm ROTFLMAO, SCOOMN (spitting coffee out of my nose). No kidding. I had a mouthful of coffee, now I have to clean off my monitor and keyboard.
Ultra stable... crash-free... rhetoric of the "zero defects" camp. Ooops, Service Pack 2 is ready to install, gotta run.
I have recently become a fan of test-driven design and recommend the discipline.
If you can use GC/managed code, then go for it. At least in those parts where you can. Memory mis-management is an important source of errors.
Get help with testing and with regression testing.
I18N == Intergalacticization
C++ Coding Standards... by Herb Sutter. This book is a summary of other fine books. See them for details.
Effective C++ by Scott Meyers
More Effective C++ by Scott Meyers
Effective STL by Scott Meyers
Check out http://boost.org/
From the overview:
XML is a known as a key material required to create SMD: Software of Mass Destruction
I managed a project with similar requirements, and ended up using separate modules that communicated with each other to isolate the effect of bugs causing crashes. I had the advantage of using a message passing microkernel architecture (QNX), so the IPC was easy and stable. The two most significant advances I made: 1) adding code to the base library compiled into each project to capture all traps (bad ptr, illegal instruction, etc) and write the call stack and memory dump etc. to a log file before dying, 2) writing a monitoring program that detected a "dead" program, logged it (and paged me), and then restarted the program (along with alerting the operators of the system to expect a glitch). This way, every time a once in a month type bug occurred, everything was logged and I could easily track down what exact line carped and what conditions existed to enable it. In later years, I ported the entire codebase to Windows, by creating an emulation layer of the IPC functions. The result was a near crash-proof package, which could be distributed to different clients, but this took several years to achieve. There were also a lot of simulations done in house before shipping to customer site, to locate problems before they happened in the field. So: yes, modularize, use IPC, add a monitoring program, and build self-diagnosing routines into all the programs, that call home and "bug" you to fix them...
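The "monitoring program" part can be sketched roughly like this on a POSIX system. To be clear, this is not the QNX code described above, just a minimal illustration using fork() and waitpid(); supervise, SuperviseResult and flaky_worker are hypothetical names, and the logging/paging step is only a comment.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Outcome of supervising one worker (hypothetical type for this sketch).
struct SuperviseResult {
    int restarts = 0;        // how many times the worker was restarted
    bool succeeded = false;  // true if some run exited with status 0
};

// Run `worker` in a child process. If the child is killed by a signal or
// exits with a non-zero status, start it again, up to `max_restarts` times.
// `worker` receives the attempt number (0 for the first run).
SuperviseResult supervise(const std::function<int(int)>& worker, int max_restarts) {
    assert(max_restarts >= 0);
    SuperviseResult result;
    for (int attempt = 0; attempt <= max_restarts; ++attempt) {
        pid_t pid = fork();
        if (pid < 0) return result;            // could not create a process
        if (pid == 0) _exit(worker(attempt));  // child: run the module, then leave
        int status = 0;
        if (waitpid(pid, &status, 0) < 0) return result;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            result.succeeded = true;
            return result;
        }
        // A real monitor would log WTERMSIG(status) or the exit code here
        // and alert an operator before restarting.
        if (attempt < max_restarts) ++result.restarts;
    }
    return result;
}

// Hypothetical module that crashes on its first two runs.
int flaky_worker(int attempt) {
    if (attempt < 2) std::abort();  // simulate a crash (SIGABRT)
    return 0;
}
```

Calling supervise(flaky_worker, 5) restarts the worker twice and then reports success. Because the worker runs in its own process, its crash cannot corrupt the monitor's memory, which is the whole point of the separation.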
I have written a lot of code with very stringent uptime requirements, and I have some advice that is going to be backwards compared to the advice of others. But you have to take _all_ this advice as one truth for it to work for you.
1: Use the heap. A lot. The heap is your friend. Virtually every overrun exploit and crash involves the stack. It's not that you "can not" overrun something and bone yourself on the heap, it's just that you tend not to. Really. See number two below.
2: AUTOMATE your heap. I use reference counted handles to target objects. Not "auto_ptr", but real live, honest to goodness, reference counting. I spend/spent the time to come up with a bullet-proof Handle class TEMPLATE and then I _use_ it.
3: Virtual Destructors. I'll say it again, VIRTUAL DESTRUCTORS. If it has the word class, it has "virtual ~" in it somewhere. Every class, every time. Not inline either. The cost is minuscule, most implementations will be helped by this either in the vtable or the heap management or the code generation, and if you are following rule 1 and 2 and you are doing _anything_ right in your design (e.g. encapsulation and inheritance) this will lead to natural solutions to the +/-2% problems of final integration.
4: THREADING. Use it. It is easier to write N small programs that block on their inputs than it is to balance or maintain an N-way poll/select environment. If you don't use threads you will end up with a hell-a-painful switch-case statement somewhere in the core of everything, and that way lies pain and madness.
5: Recursive Mutexes. The standard pthread_mutex et-al is not recursive by default, but a simple class wrapper which checks ownership and increments counts can make it so. With recursive mutexes and an aggressive locking policy, you will find that you will not make mistakes.
6: The "automatic Lock_Object". The Mutex class should only be takeable/lockable by a companion LockObject class, and _these_ should always be automatic (stack based). The (recursive mutex) lock is taken in the constructor and returned in the (virtual) destructor. It's the law. It will make your life better.
7: _Use_ EXCEPTIONS. People spout all sorts of garbage about when "NOT TO" use exceptions. They are full of "it", and then never tell you when _TO_ use them. The simple answer is.... this: Make a directed graph of your code _state_ dependencies [for each thread] in (approximate) call/execution order. Weight the dependencies and restructure the graph so that the most necessary paths form a strict tree. Now the "edge and corner cases" will appear on this tangled-tree as cycles (loops). These loops should be cut open with exceptions. Start with a "class Fault" and then make sub-classes for each kind of fault (locking, logical, semantic, external [q.v. dropped connections]) and then *ALWAYS* use these exceptions for (and _ONLY_ for) backing up the central dependency tree.
The net result is a "process space" (memory image, whatever) of shared state, composed of a set of self-sufficient threads.
[ASIDE: the short version is "don't be afraid of your language, leverage it."]
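Points 5 and 6 might look something like the following. This is a sketch under assumptions, not the poster's actual classes: it leans on std::recursive_mutex from the C++11 standard library instead of a hand-written pthread wrapper, it skips the virtual destructor because nothing derives from LockObject here, and Counter and locked_sum are made-up names for the demonstration.

```cpp
#include <cassert>
#include <mutex>

// Recursive mutex that can only be taken through LockObject.
class Mutex {
public:
    Mutex() = default;
    Mutex(const Mutex&) = delete;
    Mutex& operator=(const Mutex&) = delete;
private:
    friend class LockObject;
    std::recursive_mutex impl_;
};

// Stack-based lock: taken in the constructor, returned in the destructor,
// so every exit path (including exceptions) releases it.
class LockObject {
public:
    explicit LockObject(Mutex& m) : mutex_(m) { mutex_.impl_.lock(); }
    ~LockObject() { mutex_.impl_.unlock(); }
    LockObject(const LockObject&) = delete;
    LockObject& operator=(const LockObject&) = delete;
private:
    Mutex& mutex_;
};

// Hypothetical shared object that locks aggressively in every method.
class Counter {
public:
    void add(int n) {
        LockObject lock(mutex_);
        value_ += n;
    }
    void add_twice(int n) {
        LockObject lock(mutex_);  // outer lock
        add(n);                   // re-enters the same mutex; fine, it is recursive
        add(n);
    }
    int value() const {
        LockObject lock(mutex_);
        return value_;
    }
private:
    mutable Mutex mutex_;
    int value_ = 0;
};

// Small usage check: nested locking must not deadlock.
int locked_sum() {
    Counter c;
    c.add_twice(5);
    assert(c.value() == 10);
    return c.value();
}
```

With a non-recursive mutex, add_twice() would deadlock on its first call to add(); the recursive one is what lets every method lock without caring who called it.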
This is not the advice some others would give you. It does, however, work quite well.
I have used this approach on several mission-critical projects and produced programs that were both "sufficiently fast" _and_ maintained continuous heavy-load runtimes in the multiple-months-without-stopping quality of service range.
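As one possible reading of point 7, here is a small sketch of a Fault hierarchy used to back up to an earlier point in the call tree. The subclasses and the retry policy are my own guesses for illustration (fetch_record and fetch_with_retry are hypothetical), and deriving Fault from std::runtime_error is a convenience the post does not mention.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Base class for every fault, with one subclass per kind.
class Fault : public std::runtime_error {
public:
    explicit Fault(const std::string& what) : std::runtime_error(what) {}
};
class LockingFault : public Fault { public: using Fault::Fault; };
class LogicalFault : public Fault { public: using Fault::Fault; };
class ExternalFault : public Fault { public: using Fault::Fault; };

// Hypothetical step that fails while `failures_left` is positive,
// standing in for a dropped connection.
int fetch_record(int& failures_left) {
    if (failures_left > 0) {
        --failures_left;
        throw ExternalFault("connection dropped");
    }
    return 42;
}

// Back up to this point on external faults and try again; any other
// kind of fault, or running out of attempts, propagates to the caller.
int fetch_with_retry(int& failures_left, int max_attempts) {
    assert(max_attempts > 0);
    for (int attempt = 1;; ++attempt) {
        try {
            return fetch_record(failures_left);
        } catch (const ExternalFault&) {
            if (attempt == max_attempts) throw;  // give up: let the caller decide
        }
    }
}
```

The caller higher up can catch plain Fault to handle everything uniformly, or a specific subclass when it knows how to recover from that kind.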
Innocent people shouldn't be forced to pay for inferior software development.
--"Code Complete" Microsoft Press
remember to bring their flame retardant suits before answering this post???
be very wary of C++ constructs that might result in a whole load of code being fired off behind your back...
Did anyone else read this and think to themselves this guy is using WAY TOO MUCH FIGURATIVE LANGUAGE for c++?
How about hiring a trained professional software engineer? Trained professionals know about the various kinds of security and reliability risks, and have a grab-bag of tools for dealing with them. They also know a lot about how to deal with programming languages and environments.
From your question, you appear to be in over your head. Get professional help.
> Not understanding something is one thing, but not understanding something
> so let's reject it as being "elite-ivory-tower"
I did not say I did not understand it. I said I did not like it. I do not like it because it does not fit with the reality of computer operation, as discussed below.
> Reality is reality. Algorithmic or Functional are just ways people look at it.
On the contrary, you can see reality being algorithmic. Things happen one after another. To type "algorithmic", you depress a, l, g, etc. in order; you don't declare a set of letters, fill it with appropriate values and throw it at the computer. When you receive a specification for your program, it will say something like "get this from the user, then do this, then do that, then print out the result". No specification is ever written in functional notation outside the academic world.
More importantly, the computer itself works algorithmically. It does one thing, then another. No computer has ever worked functionally, and no computer ever will. All of them will decode and execute a sequence of instructions, and if you refuse to write your code likewise, you're only adding translation overhead.
Even in the hallowed halls of science overuse of the functional notation creates serious problems. The entire hodge-podge nonsense we call quantum mechanics stems from the attempt to describe a complicated system as a function. Instead of trying to get a set of time-value maps for the whole system, it would be more appropriate to look at the system's constituent parts and algorithmically simulate them through time. That way you wouldn't get any "spooky action at a distance", stuff being there and not there at the same time, and all other equally ridiculous denials of reality.
> I advise you, that your use of the common peoples' fear for mathematics
> in your arguments is not going to help.
I wasn't using that argument, but, now that you mention it, it is a reasonable one. Most programmers couldn't care less about higher mathematics, and, even if they were forced to study it in college, they likely have forgotten it all by now. Computer algorithms require minimal mathematical background. The most I ever used was a bit of calculus to write scan-conversion routines. So, whether from lack of practice, or from lack of interest, most programmers will prefer you didn't drag them into the world of useless mathematics. (and I use the word literally here)
> Templates, being code generators, differ by nature to hand-tuned codes.
> So your code generates only 33 instructions vs the template's 88. Great
> - now tell me - which architecture? What compiler?
That is quite irrelevant in this case. istream_iterator notation generates extra code for reasons that will not go away no matter how hard you try to optimize it. Yes, I might be able to write an istream_iterator that would have no overhead over my iterative version, but it will not be standard compliant. The istream iterator has to read on construction; it has to store the read value; it has to be constructed, since it must keep a reference to the source stream; it has to handle special cases, like the end-of-file, and the subsequent conversion to the end iterator value. However good you might be at optimization, you will not be able to discard these and still be compliant with the specification.
Also, which compiler or architecture you use will not make all that much difference in the size of the compiled code. I guarantee you that your functional copy will never generate smaller code than my iterative loop, no matter what compiler you use or what architecture you compile for. There is a certain amount of work to be done, and my version does less work. It is as simple as that.
> And before you count the instructions, did you realize that this code is waiting
> for keyboard inputs, therefore what you're doing is unnecessary (and obviously
> premature) optimization?
First, you should note that I
C++ is such a terrible language that there is simply no way to get anywhere close to what you want without extremely high cost (both time and money).
Use a strongly typed language with built in memory management to solve your problem. In my experience OCaml gives you all that. Once you have learned the language and the concepts implemented in it, C++ will appear to you anachronistic and ridiculously messed up. OCaml does provide a nice set of libraries, but you can also use C/C++ libraries via its native language interface (though this introduces risks and of course, the libraries might still crash).
In any case, reconsider the language choice -- it is really quite essential especially with regard to segfaults and similar issues.
I would honestly recommend the ACE and TAO frameworks for all kinds of C++ projects. I'm not a salesman and it's free software; it's just that (from my 15 years with C++) this IS my software of choice. Start here http://deuce.doc.wustl.edu/Download.html/ or Google for ACE+TAO
For a really good look at the history and early design of Tandem computers, see this wikipedia article: http://en.wikipedia.org/wiki/Tandem_Computers.
In brief, the Nonstop Kernel is none of the above. It's a proprietary OS, originally written from the ground up by Jimmy Treybig and two other ex-HP engineers. It has a loosely-coupled, message-based architecture, designed with reliability as its primary focus. It runs on proprietary hardware, also designed with reliability as its goal.
When HP expressed no interest in this new high-reliability, high-availability computer that three of their engineers had designed, those engineers quit HP and founded their own company, Tandem Computers, around 1974. The name came from the design of the platform; the minimum system you can buy has two processors and two (mirrored) disc drives (the maximum has 4000+ processors); processes run in pairs, with data checkpointed from the primary to the backup, so that if a process or cpu fails, the backup can take over immediately and continue processing. Failover time is typically less than 15 milliseconds.
True story: a data center where I worked had half Tandems and half Amdahls. The building got hit by lightning. The Amdahls all went down, and took three days to get back up. The Tandems lost half their processors and a third of their disc drives, and kept right on going. Processing continued with nary a pause, and no data was lost. It was amazing to watch.
Tandem had revenue of $1.9 billion; the reason you never heard of them is that their target market was never consumers. The cheapest machine was in the $millions circa 1980, and about $250k circa mid-nineties. They built enterprise-level machines for transaction processing that would almost never go down (some units have run in the field 24/7/365 for 5+ years with zero downtime), and they sold to banks and stock-exchanges and telephone companies. Most of the world's financial processing infrastructure runs on Tandems; almost any financial transaction you make, anywhere in the world, e.g. withdraw money from an ATM, buy gas using your debit card, buy or sell stock, etc., is processed on the backend by a Tandem computer.
In 1997, Compaq bought Tandem. In 2001, HP bought Compaq, and with it, the Tandem division. Tandem computers are now called HP Nonstop computers. Full circle.
Cheers, Tim -- Tim Janke Part mad scientist, part lion tamer: sr. software engineer, global team leader, project mana
The responses are flooded already, but just in case you're reading far down, I'll add my two cents. Use the tips mentioned, but the central most important key is to isolate all the dangerous aspects. C++ has a lot of features that can trip you up with subtle errors, like arrays without boundary protection, and pointer access. So take all of these features, and isolate them to a very small set of very simple functions, and use those functions repeatedly. If everything dangerous is isolated into simple functions that are used repeatedly, you will eliminate a large part of your error by shielding yourself from the problems. You'll be adding a bit of extra overhead, but real stability requires a little extra overhead.
Once you've done that, make sure you test each component separately. Every piece of the program should be tested with a test suite of inputs designed to test its range of outputs, and test its resilience against inputs which are out of bounds. Every single component MUST test its inputs, and must have a prescribed action for any type of faulty input, EVEN if that faulty input is "impossible" in your design. A large percentage of crashes are due to high level assumptions about inputs that are made invalid by other sections of code, where the developer, or another developer, is no longer aware of the constraints, or writes a section which accidentally generates an invalid input to a function. So as a result, handling invalid inputs inside every function needs to be one of your central focuses.
If no portion of your program can cause a segfault because all pointer work and array accesses are isolated into protected functions, and if every single function has a procedure for handling invalid input and invalid object data for every piece of object data, then your program will not crash. (Unless you made a mistake, that is. Such is life.)
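To make that concrete, here is a minimal sketch of one such isolated, protected function. The names ReadStatus, checked_read and sum_at are invented for the example, and returning a status code is just one choice of "prescribed action"; throwing a dedicated exception would work as well.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Every way the read can go wrong has its own named outcome.
enum class ReadStatus { Ok, NullBuffer, OutOfRange };

// The only place in this (hypothetical) program that indexes into a buffer.
// Inputs are checked even if the caller "cannot" pass bad ones.
ReadStatus checked_read(const std::vector<int>* buffer, std::size_t index, int& out) {
    if (buffer == nullptr) return ReadStatus::NullBuffer;
    if (index >= buffer->size()) return ReadStatus::OutOfRange;
    out = (*buffer)[index];
    return ReadStatus::Ok;
}

// Example caller: bad indices are counted and skipped instead of crashing.
int sum_at(const std::vector<int>& data, const std::vector<std::size_t>& indices,
           int& rejected) {
    int total = 0;
    rejected = 0;
    for (std::size_t i : indices) {
        int value = 0;
        if (checked_read(&data, i, value) == ReadStatus::Ok) {
            total += value;
        } else {
            ++rejected;
        }
    }
    return total;
}
```

The extra checks cost a couple of comparisons per access, which is the "bit of extra overhead" mentioned above.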
Quoting:
Now I know of several more transformations which I could have applied to my Scheme program before I translated it into C, which would have put mine in first place. A runtime profile of my program revealed that the majority of time was spent in the C routines "malloc" and "free." I could have eliminated that heap usage by transforming my program into a form in which all data allocation and control management would have been completely stack-based, with an explicitly managed stack. I could have pushed onto the stack exactly that control information which I deemed necessary. There were places where I could have modified the existing stack record, rather than popping it off and creating a (similar) new one in its place. These transformations were also presented in my Programming Languages course. Unfortunately, at the time that I did my Algorithms project, I did not yet grasp them well enough to use them well.
well, does it?
You cannot write highly stable code in C++, due to design flaws in the language. For this reason, the FAA doesn't allow C++ for use in aircraft systems. You can improve the situation with the use of a garbage collector though, but if stability and safety is critical, then you should use ANSI C. See this: http://www.hpl.hp.com/personal/Hans_Boehm/gc/issues.html
Oh well, what the hell...
This may sound like a troll, but it is an honest question.
Seeing as all variants of Lisp support and even use not negligibly setf, setq, loop constructs and others, why is Lisp any more functional than Python, Ruby, or Smalltalk?
If I had to classify Common Lisp with Python or with Haskell, it obviously falls into the Python class far more closely, and that's not really functional...
Design a language well suited to implementing your system, then write an interpreter or compiler for it in C++.
If what you want is stability, you must replace the memory manager. This requires no change to your code; there are several plug-in replacements. You can use a conservative garbage collector to get rid of memory leaks. Another possibility -- in the self-plug department -- is to use DieHard, which eliminates a wide range of memory errors and even defends against heap overflows and other heap corruption. Pick whichever works best for your needs, but pick one. You will not get stability and reliability with C++ if you stick with the default memory manager.
I actually used to maintain avionics software in a previous job, which is obviously extremely safety-critical real-time software. As you can imagine, it was designed to be ultra stable.
It was written in ADA83, with a little bit of C thrown in for good measure (and to provide some fancy interfacing to other interconnected systems).
I left that job a few years ago, and I now maintain software in an Air Traffic Control environment.
It's much the same - the core is written in Ada, with C++ interfaces to the O/S (Unix). There's even a separate system which monitors the modules with a nice GUI front-end, and if one of the modules does happen to crash - it will immediately be restarted.
In short - I've never worked on an ultra-stable software system that was NOT based on the ADA83/95 language.
As for porting to Windows... are you 100% sure about that? For an ultra-stable platform? I hope this isn't safety critical software, where people's lives could be at risk...
Good luck!
You must investigate this subject. Z, B-Method, ...
It's pretty remarkable that in this entire discussion not one person has made reference to the MISRA guidelines, which are specifically designed to make it feasible to build highly-reliable systems in C (and now C++, although that is still work in progress), and which are backed by a worldwide community, excellent tools, etc. All the details are at http://www.misra.org.uk/
Look at these benchmarks ...
http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=all
...if null_functor learns D well (and|but) decides to quit (his|her) job, (s)he can make a kick-ass game with it.
You can hold down the "B" button for continuous firing.
It isn't completely clear to me what you're trying to do.
What is the cost/impact of a failure in your system? Are we talking a little bit of money (website), bunch of money (stock trading) or are we talking people dying? Or just your boss saying "I want 24x7 availability for this app darnit!"?
I work on embedded systems using a limited dialect of C++ on safety critical systems. We, for instance, don't get to use dynamic memory allocation because it makes a system's behavior non-deterministic. (And besides, what do you do if you try to allocate memory and it fails?) Multi-tasking is just right out for formal certification of the application.
Also, are you looking at high availability or high reliability? They're similar, and use some of the same techniques, but have different aims. In my environment, we'd rather fail and bring down a system than leave a system up that is producing even slightly incorrect data. (And we have a redundant unit in case of hardware failure, etc.) But in high availability, you want to keep operating, even if degraded. (Think telephone switches, where the ability to carry half the call with some static is better than nothing.)
In general, however, you want as few components (both software and hardware) as necessary to do the job. Assuming a single system can handle the work, splitting it across three boxes just adds two more points of hardware failure, 2 possible communication failures, and a bunch of software to handle inter-machine communication. Every library, every OS call, and every component is a potential source of failure; use with caution.
And in the end, if you really can't have failures, go look at formal development, certification and testing practices specified by the FDA or FAA. They don't guarantee it'll be perfect, but they give you an idea of the kind of work you need to get close. For example, testing that has MCDC code coverage of >90%. The cost will be an order of magnitude higher than for normal software, but it won't fail. And they have advice on use and certification of third party libraries and OSes. (Summary: Libraries verifiable to these levels also cost an order of magnitude more money.)
Those methods will also give you advice on availability assessments of hardware, etc. Is an average server or PC sufficiently reliable for your application? This has to be one of your first assessments. (i.e. how quickly can you get hard drives replaced, and what's the simultaneous failure probability of 2 drives in a RAID/mirror setup?)
I think we're working with slightly different definitions of the word "crash". When I think of a crash, I think of a program that was terminated not of its own volition, but by the operating system in response to, say, a memory segmentation fault.
assert() is (usually) a macro, and you can define it to expand to whatever you want. By default, it expands such that if the tested expression is false, then abort() is called after some debug message (like "failure in foo() at line 1234") is sent to stderr.
I *guess* you could refer to the call to abort() as a "crash", but to me it seems like deliberate behavior. In other words, the abort() call itself is not a sudden failure. The failure happened -- silently -- long before we got to the assert(). The call to abort() is done to limit the damage of the failure and to aid the programmer in debugging.
Now, an assertion failure doesn't have to result in a call to abort(). It could result in a cute GUI pop-up window that says, "sorry it didn't work out; please fill out this form that we'll send back to the developers. Try to give as much relevant info as possible." And that could be sent back to the developers along with a core dump. A lot of developers have chosen this route, and it seems like a valid one. In a debug build, an assertion failure could instead result in a call to a function that pauses execution and launches an interactive debugger; this too is valid.
But I think it's unhelpful to try to convince people that "assert will just crash the program" for two reasons: 1) assert() is only a messenger, and the programmer is always in control of how that message is to be delivered, because she always has control over the definition of the macro. 2) you're going against the writings of the experts, and if you don't qualify your meaning very carefully, you'll just end up confusing people.
I think we pretty much agree in principle, but the definitions we're using are disjoint. (That's a problem, because clear communication is very important in this business.) I think what you *meant* to say is, "don't use the default definition of assert because the default assertion failure action is to halt the program without giving much useful information to the end-user." That's a statement I can much more readily agree with.
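For illustration, a project-specific assertion macro along those lines might look like this. It is only a sketch: APP_ASSERT, report_assertion and AssertionFailure are made-up names, and throwing an exception is just one possible delivery mechanism (a top-level handler could then show the pop-up or write the report).

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical failure type thrown instead of aborting the process.
class AssertionFailure : public std::logic_error {
public:
    explicit AssertionFailure(const std::string& msg) : std::logic_error(msg) {}
};

// Builds a message naming the failed expression and its location, then throws.
[[noreturn]] inline void report_assertion(const char* expr, const char* file, int line) {
    std::ostringstream msg;
    msg << "Internal error: '" << expr << "' failed at " << file << ':' << line;
    throw AssertionFailure(msg.str());
}

// Unlike the standard assert, this made-up macro is not disabled by NDEBUG.
#define APP_ASSERT(expr) \
    ((expr) ? static_cast<void>(0) : report_assertion(#expr, __FILE__, __LINE__))

// Example use: the check travels with the function that relies on it.
int safe_divide(int a, int b) {
    APP_ASSERT(b != 0);
    return a / b;
}
```

A catch block for AssertionFailure near main() is then the single place that decides what the user sees and what gets sent back to the developers.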
Don't use new/delete or pointer arithmetic, and you're 90% of the way to crash-proof. Use STL and boost where you can. Pass references instead of pointers and let the stack manage your memory.
Create processing units by forking, and communicate between them using unix-domain sockets. You can even handle network I/O in several processes this way, passing the socket file descriptor between processes through unix-domain sockets, if you need. For Windows the equivalent is named pipes, but you can't pass descriptors.
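A bare-bones version of the fork-plus-unix-domain-socket idea, as a sketch for POSIX systems only. The function name ask_worker and the upper-casing "work" are invented for the example, and it assumes short messages that arrive in a single read; descriptor passing via sendmsg() is left out.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Sends `request` to a forked worker over a unix-domain socket pair and
// returns the worker's reply (the request in upper case).
// Returns an empty string on any failure or unsupported input.
std::string ask_worker(const std::string& request) {
    if (request.empty() || request.size() > 256) return "";

    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return "";

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return "";
    }

    if (pid == 0) {  // child: the processing unit
        close(fds[0]);
        char buf[256];
        ssize_t n = read(fds[1], buf, sizeof buf);
        for (ssize_t i = 0; i < n; ++i) {
            buf[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(buf[i])));
        }
        if (n > 0 && write(fds[1], buf, static_cast<std::size_t>(n)) < 0) _exit(1);
        _exit(0);
    }

    close(fds[1]);  // parent keeps only its own end
    std::string reply;
    ssize_t sent = write(fds[0], request.data(), request.size());
    if (sent == static_cast<ssize_t>(request.size())) {
        char buf[256];
        ssize_t n = read(fds[0], buf, sizeof buf);  // returns 0 if the child died
        if (n > 0) reply.assign(buf, static_cast<std::size_t>(n));
    }
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return reply;
}
```

If the child segfaults halfway through, the parent just sees an empty reply and can start a new worker, which is the isolation the original question was after.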
-I like my women like I like my tea: green-
That is unless this is an application for helping maintain a barnful of horses.
I don't want to join the language wars, so here are some helpful design considerations for any language.
Functions should do 1 thing and 1 thing only. This should be something that is testable outside of the context of the running system, if possible.
Classes should control 1 complete and stand alone idea in the program and only one. This should be something that is testable outside of the context of the system, if possible.
Define the interfaces for the classes early. Changes to an interface will probably be necessary and should be communicated to all developers.
Not everything will fit into this, but forcing yourself to think and implement in these terms does help to produce code that is more testable, straightforward, and simpler than what you get by just banging away on the keyboard.
The basic idea is to keep things simple and testable. Done right, you can automate a lot of testing.
Most importantly, test early and test often.
*Sigh* Every year or two for the past ten years, I get into an argument with some European professor about the qualifications of American graduates, usually about their knowledge of the engineering of computer programs.
Thanks to this discussion, this year I am going to have a much harder time defending them, I think.
Safety-critical systems is a branch of computer science routinely taught in the better European universities. The book I first learned it from was Safety Critical Computer Systems http://www.eng.warwick.ac.uk/~neil/safebook.htm
The field really got impetus after the Therac-25 failure http://courses.cs.vt.edu/~cs3604/lib/Therac_25/Therac_1.html
where a group of people got radiated to death because of a minor (in a computer sense) error in some code, thus adding new meaning to the term "execution error".
It's a vast field (google "safety critical") with large numbers of interesting published papers and few good books. I really can't recommend a current book; I haven't seen one I liked in years. I don't think you can learn it fast enough to be useful.
You also need to know a fair amount of higher math to be really competent, or at least to understand what's going on... most of the true experts in the field apparently regard English as a required second language (math being the first language; not that I can blame them, but often it is overused).
The field seems to have suffered and fragmented since the recession, but there is a good starting point at http://vl.fmnet.info/safety/
It also seems to be rapidly migrating to India, because of the resistance of American "cowboy" programmers. This time, it is possible the game of "cowboys and Indians" may end up with the Indians winning, inasmuch as the techniques are going to be essential to the new multicore programming models. (I heard a rumor Herb Sutter is investigating that, but that's just a rumor. If so, however, he would be the person to talk to about safety-critical C/C++.)
One of the techniques used in the field is formal verification. Magee and Kramer are coming out with a second edition of their incredible book, Concurrency, http://www.doc.ic.ac.uk/~jnm/ early this year (last time I emailed them). The book is a gentle introduction to the field of formal verification and model checking, among other things. There are other books (a new one came out on the Spin model checker, for example) but this is by far the most penetrable.
Much more interesting is the use of model checkers behind UML or BPEL/SOA tools.
Most of the really interesting stuff is still behind university walls, but tools should be appearing soon. I am trying to develop an open source grid-based one, but it's been slow going due to commitments and resources.
I give you the rest - and I quite agree with your "train yourself to write smaller code" mentality.
However..."Pointers must be owned by a specific object; created by that object, used by or through that object, and deleted by that object. That is the only way of ensuring that you always know when and where it is created and when and where it is destroyed, which is what I mean when I speak of object lifecycle."
This is not necessarily true in all cases. For example, in a message-passing multithreaded environment, messages are often created by one object, passed through a message queue, and destroyed when a pool of threads, which is often part of another object, has finished processing them.
However, you can view it as an exceptional case. After all it's the exception that makes a rule.
Use valgrind. It will tell you about memory access violations and can tell you what memory blocks have not been freed at program exit. This will get you 90% of what a good garbage collecting language will get you (the other 10% is, IMHO, not having to manually free memory blocks).
For IPC: Keep it simple. Analyze your problem to determine if you can use something simple like message queues, or just file operations.
Defensive programming: Assume system/c-lib calls will fail, and code appropriately.
Multi-threading: This is a good way to make your program unstable. Avoid it if at all possible.
GUI: I agree with the other poster: Qt is good. And it's portable across Windows and Linux.
Windows: Get it running with no bugs on Linux first, then port it to Windows.
I hope this helps you.
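To illustrate the defensive-programming point above, here is a small sketch that treats every C library call as something that can fail. The helper name read_file and its error strings are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: reads a whole file, reporting failure instead of
// assuming that fopen, fread and fclose always succeed.
bool read_file(const std::string& path, std::string& contents, std::string& error) {
    std::FILE* file = std::fopen(path.c_str(), "rb");
    if (file == nullptr) {
        error = "open failed: " + std::string(std::strerror(errno));
        return false;
    }

    std::string data;
    char buffer[4096];
    std::size_t count = 0;
    while ((count = std::fread(buffer, 1, sizeof buffer, file)) > 0) {
        data.append(buffer, count);
    }

    // fread returning 0 means either end of file or an error; ask which.
    bool read_failed = std::ferror(file) != 0;
    bool close_failed = std::fclose(file) != 0;
    if (read_failed || close_failed) {
        error = "read or close failed";
        return false;
    }
    contents.swap(data);  // only publish the data once everything worked
    return true;
}
```

The caller gets either complete contents or a reason, never a half-filled buffer, and can decide whether to retry, tell the user, or give up.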
> I think we pretty much agree in principle, but the definitions we're using are disjoint. (That's a problem, because clear communication is very important in this business.) I think what you *meant* to say is, "don't use the default definition of assert, because the default assertion failure action is to halt the program without giving much useful information to the end user." That's a statement I can much more readily agree with.
While I can agree to that at least in principle, I would still not advocate using assert in any form or manner - it's just wrong so far as I'm concerned, and will for all intents and purposes not correct the behaviour that needs to be corrected - that is, telling the user what is wrong and handling it appropriately. The program should terminate as a last resort, after working with the user and the operating system to try to solve the problem.
In referring to the term "crash", I mean any reason why the program terminated without completing its action - either because of a seg fault or because the program hit a point that was either (a) unexpected by the programmer (e.g. resource X did not actually get created) or (b) something didn't happen quite right, and the programmer decided to just quit without trying to figure out what happened - e.g. a lazy programmer.
As for what to do about it, the program should do something like the GUI dialog you talked about, but tell the user what led to the condition as well - for example, "Program X could not complete its operation because the operating system denied access to the Buffer Pool object while streaming file Y from server Z. Please fill out the information below to help the developers solve this problem.", and then send the information along with the core dump to the developers. Telling the user why the program thinks it has to quit, in as much detail as possible (that the average user could understand), will go a long way not only in debugging the program and helping the help desk, but also in creating a good perception of the program among your users, and they'll be pleasantly surprised that you think they're not as stupid as they really are, which will reflect well on you and your program.
Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
It might be appropriate to wrap up the C++ code in a CORBA/SOAP interface and use some other language to write the GUI.
IME user interfaces tend to leak more than backend code, because GUIs are harder to unit test and to profile effectively, given the seemingly random ways some customers use the application.
Beware of round-tripping in CORBA and SOAP; that is what really kills perceived performance. We transfer a lot of data over CORBA between a C# client and Java and C++ servers, and CORBA is a very effective and efficient solution.
You seem confused about the intended use of assert.
It is true that assert should not be used to deal with user and system errors. That's not what it's intended for. Those kinds of errors should be handled by exceptions or error codes, and recovered from gracefully, if possible. An application should not die if the user types an invalid input into a form.
Asserts are intended to handle programmer errors. Think of them as a programmer-defined extension to the compiler's and runtime system's checks. Typically they're used to validate programmer assumptions, such as those about function argument values, internal data structures, and object states. For example: if part of the contract of a function is that it accepts only values between 0 and 100 for a particular parameter, it's the programmer's fault if it gets passed 102. It makes no more sense to "work with the user" on that error than it does to "work with the user" on the programmer's misspelling the function name, passing it too many or too few arguments, or passing it an array instead of an int.
Another good use of assertions is blocking out "free" behavior, such as when a function or module is capable of more general behavior than you designed it for. Sorry to say: no feature is free. If you offer the functionality, you gotta test it. If you don't wanna test it, assert it out of your code.
Of course, even assertions should attempt to exit gracefully: tell the user a problem occurred that was the programmer's fault, save their work if it hasn't been corrupted, and let them send an error report to the developer.
I gotta say: once I started using assertions, I never looked back. It's rare I trip one of my own assertions, but when it happens, that's an hour or a day of my life rescued from having to track down some ridiculous bug. Same story with unit tests. Both assertions and unit tests are layers of redundancy to help combat human error. Use 'em.
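For what it's worth, here is one way the 0-to-100 contract above could be checked with an assertion whose failure action is under the programmer's control. This is a sketch: APP_ASSERT, the handler names and scale_percent are all invented for the example, not taken from any library.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Signature of a hypothetical, replaceable failure handler.
typedef void (*AssertHandler)(const char* expr, const char* file, int line);

// Default action: report where the contract was broken, then stop.
inline void abort_handler(const char* expr, const char* file, int line) {
    std::fprintf(stderr, "%s:%d: assertion failed: %s\n", file, line, expr);
    std::abort();
}

// Alternative action: throw, so an outer layer can save work and report.
inline void throwing_handler(const char* expr, const char*, int) {
    throw std::logic_error(std::string("assertion failed: ") + expr);
}

// Holds the handler currently in force; assign to it to change the action.
inline AssertHandler& current_handler() {
    static AssertHandler handler = abort_handler;
    return handler;
}

#define APP_ASSERT(cond) \
    ((cond) ? (void)0 : current_handler()(#cond, __FILE__, __LINE__))

// Contract from the post: the parameter must lie between 0 and 100.
int scale_percent(int percent, int total) {
    APP_ASSERT(percent >= 0 && percent <= 100);
    return total * percent / 100;
}
```

With current_handler() = throwing_handler; installed at startup, passing 102 raises std::logic_error at the call that broke the contract, and a top-level catch can save the user's work and offer to send a report.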
I would suggest looking at this:
http://www.csr.ncl.ac.uk/vdm/
We've used this with a good deal of success. And it meets all your criteria!
( Mainly C++ based, portable, distributed, multi-computer, multi-threaded, similar design methodology, etc... )
http://www.connectivelogic.co.uk/
"Because you have no idea of what C++ is" would be my answer to that question. No offense meant, mind you, but IMO knowledge of C++ implies knowledge of the STL. Oh, I agree that your example is interesting (maybe would be better, but... there is no accout for taste, don't you think?
It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048
> You say "IMHO" too often.
Not really. In my post, I said IMHO every time I was stating an opinion rather than a fact (ie, twice) and IMH?O (once) when I thought my opinion about the beauty (and maintainability) of iterators can be construed as an expert opinion (programming in C++ since 1989...)
ANSI C for safety-critical software? Excuse me!
First, there is almost no true ANSI C available today. Most compilers are missing features like variable-length arrays or restrict pointers.
And C is about the unsafest language available. Wasn't the old saying: there is no space between C and assembler for an even unsafer programming language?
But on one point you are right: *not* in C++. I pity the guy: he needs to write ultra-stable software in a programming language unsuitable for ultra-stable software.
Martin
std::copy vs.
int n;
while (cin >> n)
    v.push_back (n);
Once you put in some braces to limit the lifetime and scope of n (your temporary), and when you factor in the implicit boolean conversion in the while clause, and the fact that the alternative simply calls 'copy' to perform a copying operation, I don't think you have much of a readability case. You have 5 lines (including enclosing braces) where std::copy needs 1.
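For readers who haven't seen the two forms side by side, here is a sketch of both, reading from any std::istream rather than cin so they can be tried on a string. The function names are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Loop form: n is scoped to the function and the loop stops on the first
// failed extraction.
std::vector<int> read_with_loop(std::istream& in) {
    std::vector<int> v;
    int n;
    while (in >> n) {
        v.push_back(n);
    }
    return v;
}

// std::copy form: the stream iterators perform the same extractions and the
// default-constructed iterator marks end of input.
std::vector<int> read_with_copy(std::istream& in) {
    std::vector<int> v;
    std::copy(std::istream_iterator<int>(in), std::istream_iterator<int>(),
              std::back_inserter(v));
    return v;
}
```

Both stop at the first token that is not an int, so which one reads better really is the only question.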
On auto_ptr, perhaps you're forgetting that the intended use is to convert dynamic lifetime into scoped lifetime, like the built-ins, e.g. having member auto_ptrs instead of member pointers (so the dtor doesn't have to call delete). Yes, I agree it is stupid to write Java-style C++, new-ing every object instance where a stack instance would work and give you immediate, simple lifetime control (by tying it to scope).
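A sketch of the member-smart-pointer idiom described above. Note the swap: std::auto_ptr was deprecated in C++11 and removed in C++17, so this uses std::unique_ptr, which fills the same role on current compilers. Buffer and Session are hypothetical classes; Buffer counts live instances only so the effect is observable.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource that counts live instances so leaks are visible.
struct Buffer {
    static int& live() { static int count = 0; return count; }
    Buffer() { ++live(); }
    ~Buffer() { --live(); }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
};

// The member smart pointer ties the Buffer's lifetime to the Session's, so
// Session needs no hand-written destructor and cannot forget the delete.
class Session {
public:
    Session() : buffer_(new Buffer) {}
    bool has_buffer() const { return buffer_.get() != nullptr; }
private:
    std::unique_ptr<Buffer> buffer_;
};

// Returns the number of live Buffers observed while a Session exists.
int live_during_session() {
    Session session;  // stack instance: destroyed at the closing brace
    assert(session.has_buffer());
    return Buffer::live();
}
```

Buffer::live() is 1 inside live_during_session() and back to 0 afterwards, without a single explicit delete.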
Posters recognized by their sig,
Yes - you are right. "Ultra-Stable" is only possible when the libraries are also "Ultra-Stable". But C/C++ libraries are never ultra-stable, since they are written in C/C++ and those languages are no good for stable code.
Martin
I'm sorry, but I just can't agree. It might appeal to a mathematician who wants to see everything use functional notation and hates every language except lisp, but to a non-abstract-elite-ivory-tower-mathematician this is absurd.
I am a non-abstract-elite-ivory-tower-mathematician. I am an application programmer. And I guarantee you that I rather encounter a in some code I need to maintain then to encounter the first form permits, for instance, components to be a linked list or even a hash. The second is implementation-dependent and if you change the underlying data structure, you'll have extra work to refactor. And so on...
> Remember std::auto_ptr is your best friend
Most of the time, no.
I disagree wholeheartedly. Anecdotal evidence (no evidence at all): in a 40000+ LOC GIS application I once worked on, changing all instances of SomeObject* to auto_ptr<SomeObject> eliminated altogether 35 bugs we had lurking in the BTS for a long, long time, with less than one day of work (strange, delayed errors were suddenly transformed into EARLY null-pointer dereferences -- BAM! bug nailed, let's see the next one). My extensive (15+ years) C++ experience (*) gave me the following feeling: pointers and arrays are to be used only when strictly needed; auto_ptr and vector are the champions; Boost smart pointers when auto_ptr does not apply.
> Question 2: How can I actually implement such a decoupling?
> I would use a simple, socket-based, take-my-data, gimme-my-results scheme
And thereby slowing your program to a crawl? There is a reason people use CORBA and the like: those frameworks optimize distributed object calls to avoid network hits, often being able to reduce the overhead to be equivalent to a virtual function call.
While I agree with the "CORBA can be more optimized by the vendor" part, I really don't see why a well-thought-out "let's see what I need processed, organize a queue, and let the queue have its own thread" would slow things to a crawl... Maybe I have a more "hands-on, not a lot of third-party libraries" approach, so I'd rather give up on that point than debate it further (i.e., I'm conceding that maybe CORBA would be better).
(*) yes, I use C++ since Stroustrup first edition, when there were no templates, no multiple inheritance, etc.
my my you do go on. trouble is some of what you say might be right, but then you go and reveal your complete misunderstanding of the basics of QM (as if the notation matters at all!).
and actually, reveal a misunderstanding of the relationship between physics and mathematics. pretty good going for a post about iterators.
Hello,
I am sorry to tell you: if you need to write ultra-stable software then you need a proper "High Integrity Toolchain". My suggestions:
http://www.praxis-his.com/
and
https://www.adacore.com/gnatpro_high_integrity.php
There are no two ways around it. Speed is no issue - Ada is just as fast as C++. The libraries are more tricky. But then, have you asked yourself: "Are those libraries ultra-stable?"
If they are not, then your efforts are futile. And most likely they are not. Most C/C++ library vendors don't have ultra-stable in mind when they create their libraries. That's unlike, for example, Ada library vendors, who always take stability into account.
Others said it before, suggesting other languages, and I repeat it: C++ is a no-go for ultra-stable.
Martin
Yes, C++ has faults which make it EASIER to cause a crash, but it is the coder who has to ensure that it doesn't. Make sure your app has as simple a design as possible, always check for ANY AND ALL invalid data (create objects with their own validation functions), and ensure both your design and code are well documented. Tricky bits of code involving memory management should be avoided, as this is the cause of most errors in C++. If possible, limit all memory management code to constructors/destructors and VERY specific resize code. Document inheritances, and avoid tricky multiple inheritance situations. Furthermore, you should ALWAYS have a development cycle that contains a long testing phase, with the ability to test your extremes, as well as testing as close to real-world situations as you can - real users will crash your software much faster than any programmer with knowledge of the internals. Always use strong error handling, and code using try/catch blocks for exception handling - if you're going to have an error, handle it gracefully. Once you have a solid program base from this process, you can start worrying about crash resolution strategies that may increase the complexity of your code - overseer processes and the like. When in doubt, go back to basics and build simpler code. If you don't confuse the coder, it's harder to confuse a computer into a crash.
It is very straightforward to write a program in Windows C++ that is crash-free. You wrap the outer main in a structured exception handler for all crash events. This will make the program uncrashable at the expense of orphaning, that is leaking, heap memory. This is a bad idea, as it will protect bad code from its own deserved demise. You want crash-proof code? You should spend more time in QA than in development, and start with a good design. Iterate until the code behaves as desired. This is not a technology issue, it is a cultural issue. History is replete with inferior cultures that fail.
Yes, you can do dangerous things with C - but that is true of any language. Even Ada has pointers and Goto statements. However, with C and other flat languages, you can prove how much memory the program will use, for example. The biggest problem with C++ is that there is no explicit way to know whether an object can be safely deallocated. So C is deterministic, while C++ is not.
Oh well, what the hell...
It does one thing, then another. No computer has ever worked functionally, and no computer ever will.
:-(
That's a bit short-sighted, speaking as someone converting imperative ISO-C code into FPGA logic.
I was working with Confluence (write [functional!] Scheme-ish code, out pops massive RTL netlists for your FPGA to re-wire its logic with), but it seems the project has taken a sharp turn towards OCaml...
FPGAs are fascinating things... and yes, if you express a function that evaluates (at "compile" time) as recursing 15-deep, it can replicate that function 15 times so that it all happens in one clock cycle (or to save chip space you can rig it up to take 15 cycles on just one instance of the logic, with an accumulator/FIFO idea).
I wish I could get a permanent job working with them.
Well, back to hacking x86 assembly on a 15-year-old 16-bit MS-DOS app.
Well, if you want to do that, it is easier to use for:
> You have 5 lines (including enclosing braces) where std::copy needs 1.
As much as I like one-liners, I would still prefer three short lines to one really really long one.
> when you factor in the implicit boolean conversion in the while clause
You might think of it as a return value instead, reading the while as "while you can do cin >> n, push it back".
From your post I gather the following:
1. Ultra-stable application (managed code seems to me the only solution for minimum effort/time)
2. Portable application
3. Need to use existing C++ libraries
I recommend doing all the coding in Java and calling the existing C++ libraries you are given through the Java Native Interface (JNI). Let's suppose you are given a library that does something ultra-super-ific that you can't do without. Declare
public static native int callUltraSuperificCppMethod(String param1, long param2);
then go and write the wrapper code that translates parameters and return types between Java and C++.
Hi,
I would recommend you to read Joe Armstrong's PhD thesis (http://www.lib.kth.se/Fulltext/armstrong031205.pdf), which is titled "Making reliable distributed systems in the presence of software errors".
> For example, in a message-passing multithreaded environment, messages are often
> created by one object, passed through a message queue, and destroyed when a pool of
> threads, which is often part of another object, has finished processing them.
Yes, you might do that, but my pattern would still work better even here. Create a message queue object which will be the owning container. Queue messages with msgq.push_back (new Message). Read messages with msgq.front()->Method(), then pop_front (which will delete it).
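A sketch of such an owning queue, with two stated departures from the description above: it stores std::unique_ptr instead of raw new'd pointers, and pop() hands the message to the consumer rather than deleting it. Message and MessageQueue are hypothetical names, and the locking a real multithreaded queue needs is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <utility>

// Hypothetical message type.
struct Message {
    explicit Message(const std::string& t) : text(t) {}
    std::string text;
};

// The queue is the single owner of every message in flight. Producers hand
// ownership in, consumers take it out, and anything still queued is
// destroyed together with the queue.
class MessageQueue {
public:
    void push(std::unique_ptr<Message> message) {
        assert(message != nullptr);  // queuing a null message is a programmer error
        messages_.push_back(std::move(message));
    }

    // Returns the oldest message, or an empty pointer if the queue is empty.
    std::unique_ptr<Message> pop() {
        if (messages_.empty()) {
            return std::unique_ptr<Message>();
        }
        std::unique_ptr<Message> message = std::move(messages_.front());
        messages_.pop_front();
        return message;
    }

    std::size_t size() const { return messages_.size(); }

private:
    std::deque<std::unique_ptr<Message>> messages_;
};
```

At every moment exactly one object owns each message: the producer, the queue, or the consumer. That is the lifecycle rule being argued for, just enforced by the type system.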
However, I would take it one step further, as I did in the code I'm presently working on: marshal the messages into a bytestream and queue messages as memory blocks. That way you can not only spread the work among threads, but also among different processes or even over many computers over a network. This is basically how CORBA and DCOM work.
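The marshaling step could look something like this. The wire format (a 4-byte big-endian length followed by the payload) and the function names are assumptions made up for the sketch; CORBA and DCOM use their own, far richer encodings. Payloads are assumed to be smaller than 4 GB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical wire format: a 4-byte big-endian length, then the payload.
std::vector<unsigned char> marshal(const std::string& payload) {
    std::uint32_t length = static_cast<std::uint32_t>(payload.size());
    std::vector<unsigned char> bytes;
    bytes.reserve(4 + payload.size());
    for (int shift = 24; shift >= 0; shift -= 8) {
        bytes.push_back(static_cast<unsigned char>((length >> shift) & 0xFF));
    }
    bytes.insert(bytes.end(), payload.begin(), payload.end());
    return bytes;
}

// Returns false if the block is truncated or its length field is wrong.
bool unmarshal(const std::vector<unsigned char>& bytes, std::string& payload) {
    if (bytes.size() < 4) {
        return false;
    }
    std::uint32_t length = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        length = (length << 8) | bytes[i];
    }
    if (bytes.size() - 4 != length) {
        return false;
    }
    payload.assign(bytes.begin() + 4, bytes.end());
    return true;
}
```

Because the block is just bytes with an explicit byte order, the same message can sit in an in-process queue, go down a pipe to another process, or cross the network unchanged.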
> I was working with Confluence (write [functional!] Scheme-ish code
That's not the computer. That's your LISP programming language that you use to program the computer. Programming languages can be functional, at the cost of some overhead. Your FPGAs are still algorithmic, they do things sequentially, one thing per clock cycle.
> if you express a function that evaluates (at "compile" time) as recursing 15-deep,
> it can replicate that function 15 times so that it all happens in one clock cycle
Sure, you can write multiprocessing directives with a functional language too. In an algorithmic approach you would iterate over the available processors and spread the work among them, which is what the hardware would actually do.
You need to look into ACE. ACE is a set of middleware libraries that were designed for precisely the purpose you describe - high-performance, mission-critical systems. Doug Schmidt, the guy behind ACE, has written numerous papers on the issues surrounding such systems, which are also available on that site. He was the first to document a number of important patterns in high-performance networked systems, like Reactor, Proactor, and Futures. He's also written books, including books of patterns for distributed reliable systems.
From that site you can also find TAO, a free CORBA framework based on ACE. TAO is the test bed and driving force behind the CORBA realtime spec, which is a version of CORBA for realtime systems which demand high and deterministic performance. I believe TAO includes services supporting failover and other reliability strategies.
ACE and TAO aren't just research software. They are used in mission-critical systems by major defense and aerospace contractors. I work at one such.
Sorry to sound like a shill. I'm not paid by them or anything; I'm just a programmer who uses ACE for his job and has been favorably impressed with it.
--
CPAN rules. - Guido van Rossum
If robustness is an issue, then you need to look into two-stage construction or a similar method. This is required for two reasons - firstly, C++ class construction copes badly with malloc operations failing or exceptions being thrown in the constructor (as the object was never fully constructed, its destructor is not run, so any memory malloc'd in the constructor body up to that point is leaked).
Two stage construction places all code that requires memory allocation into a 2nd stage constructor function so that no memory is lost if the constructor fails.
(In addition two stage construction allows constructors to be inherited which provides an advantage in some situations).
There is plenty of information on google concerning this technique.
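A minimal sketch of two-stage construction, assuming a class that holds raw malloc-style memory. SampleBuffer and init() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical class using two-stage construction with raw C allocation.
class SampleBuffer {
public:
    // Stage one: cannot fail and acquires nothing.
    SampleBuffer() : data_(nullptr), size_(0) {}

    // Stage two: acquires memory and reports failure through the return
    // value. The object is fully constructed by now, so the destructor
    // will run and release whatever was acquired.
    bool init(std::size_t count) {
        assert(data_ == nullptr);  // init() must be called at most once
        if (count == 0) {
            return false;
        }
        data_ = static_cast<int*>(std::calloc(count, sizeof(int)));
        if (data_ == nullptr) {
            return false;
        }
        size_ = count;
        return true;
    }

    ~SampleBuffer() { std::free(data_); }

    std::size_t size() const { return size_; }
    bool ready() const { return data_ != nullptr; }

private:
    SampleBuffer(const SampleBuffer&);             // non-copyable
    SampleBuffer& operator=(const SampleBuffer&);  // non-copyable
    int* data_;
    std::size_t size_;
};
```

The cost is that every user has to remember to call init() and check ready(). The other common answer is to hold resources in members that clean up after themselves (std::vector, smart pointers): members that were fully constructed are destroyed even when the enclosing constructor throws.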
SURELY NOT!!!!!
Hello,
but we are all only human and we err from time to time. And if I use a programming language that finds 10 times as many of my mistakes as C does (e.g. by automatically "check[ing] for ANY AND ALL invalid data"), then this is a huge difference. Ada would be such a language.
But the OP speaks of "Ultra-Stable". If he truly means what he says, then he needs to get out the big guns, which find 100 times or more of those all-too-human mistakes than C does. Static analysis tools like SPARK would be such a tool.
See my other post: http://ask.slashdot.org/comments.pl?sid=176280&threshold=1&commentsort=1&mode=thread&cid=14649951 and follow the links.
Of course there are static analysis tools for C. While they are better than nothing, they are very tedious to use because C/C++ have so little to offer security-wise.
Blaming the coder only shows that you do not know what it entails to write true high-integrity software.
Martin
I think your first and biggest problem is that you haven't defined what you mean by "as crash free as possible". What kind of uptime requirements do you have, what kind of data loss and retention requirements? What's the cost of a failure? What's the cost of preventing a failure? What are the security requirements? Does the system have to resist malicious external interference (and if so, that's a whole new can of worms^H^H^H^H^Hquestions)?
If you don't answer these kinds of questions first, no advice that anyone can give you will be anything more than a guess.
That said, probably the biggest 2 suggestions I would have are: 1) Don't overengineer. Make your solution as simple as possible, but no simpler. 2) Code review every damn line of code written or changed on the project with at least 4 people in the room.
The problem with C/C++ is not that you can do dangerous things. Ada has very interesting low-level constructs too. The problem is that doing dangerous things is the normal way to do things. Example:
signed int x = -1;
// 200 lines later...
unsigned int y = x;
Remember: programs are often written by teams, and monitors usually hold fewer than 200 lines. And it is such a simple thing for a compiler to just tell me that the value does not fit.
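Absent language support, the check has to be written by hand. Here is a sketch of a checked conversion that refuses values that don't fit; the function names are made up.

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>

// Hypothetical checked conversion: refuses a negative value instead of
// letting it wrap around to a huge unsigned one.
unsigned int to_unsigned_checked(int value) {
    if (value < 0) {
        throw std::range_error("negative value does not fit in unsigned int");
    }
    return static_cast<unsigned int>(value);
}

// The opposite direction: refuses values above the largest int.
int to_signed_checked(unsigned int value) {
    if (value > static_cast<unsigned int>(std::numeric_limits<int>::max())) {
        throw std::range_error("value does not fit in int");
    }
    return static_cast<int>(value);
}
```

With this, unsigned int y = to_unsigned_checked(x); throws for x == -1 instead of silently storing a value around four billion. GCC and Clang can also flag the implicit conversion at compile time with -Wsign-conversion, and Boost's numeric_cast offers a similar run-time check.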
Martin
If only dynamic_cast could be used for:
signed int x = -1;
// 200 lines of code
unsigned int y = dynamic_cast<unsigned int>(x);
But you can't. Some idiot decided that dynamic_cast is for class pointers only. But in real life you are far more likely to cast signed to unsigned or short to long.
And boost has no:
typedef range day_of_month;
either. It's possible - I wrote such a template myself. But it's not part of the almighty Boost.
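Such a template might look roughly like this. It is a sketch, not the poster's actual code: ranged_int is a made-up name, and the bounds 1 and 31 for day_of_month are an assumption, since the post does not state them.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical range-checked integer: every construction and assignment
// verifies Low <= value <= High and throws otherwise.
template <int Low, int High>
class ranged_int {
    static_assert(Low <= High, "empty range");
public:
    explicit ranged_int(int value) : value_(checked(value)) {}
    ranged_int& operator=(int value) {
        value_ = checked(value);
        return *this;
    }
    int get() const { return value_; }
private:
    static int checked(int value) {
        if (value < Low || value > High) {
            throw std::out_of_range("value outside permitted range");
        }
        return value;
    }
    int value_;
};

// The bounds 1 and 31 are an assumption; the post does not state them.
typedef ranged_int<1, 31> day_of_month;
```

A failed assignment throws before the stored value changes, so a day_of_month can never hold 32, whatever arithmetic produced it.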
It's always arrays and pointers that get the blame, but you know: that JPEG virus uses an integer overrun to worm itself into your computer. Windows, Linux, MacOS - they all had to replace their JPEG library because integer ranges are not checked in C or C++.
Martin
I prefer strongly typed languages. Prof Wirth did a few things right with Pascal, notably strong types and strings with a length value. Gawd knows how many problems are caused by C null terminated strings exceeding the boundaries.
> That's not the computer. That's your LISP programming language that you use to program the computer.
:-)
Ah, no my friend - firstly there is no computer involved at all, just an FPGA chip. I could wire up the contents of the FPGA chip to mimic a CPU and a full System on a Chip, complete with ethernet and audio and VGA sub-systems if I like..
Confluence was but one HDL (Hardware Description Language) I experimented with to write EXPRESSIVE code. There was only sequentiality because I wired the clocks up that way.
It's truly WYSIWYG: "That's your LISP programming language", yes it is - it also translates directly, every statement, into dedicated hardware. There is no assembly language, no smaller instructions it gets broken down into, no OP codes, no nothing - it just gets hard-wired into dedicated logic that way.
> Your FPGAs are still algorithmic, they do things sequentially, one thing per clock cycle.
FPGAs are a bit of a specialty field, but you do not understand them.
There is no "central" anything on an FPGA. None at all. They certainly don't do just one thing per clock cycle: for starters, they can have thousands of clock circuits all running at once. Secondly, if it's only doing one thing per clock cycle, then that's only because you've wired it up that way - and you're probably not using the FPGA efficiently.
The key to getting the maximum work out of an FPGA is parallelism... creating logic that will process as much as possible per clock cycle, but with keeping the complexity down so that the clock speed itself doesn't slow down so much so as to work against you...
You are saying nothing "ever" happens all at once when talking about throwing sets of numbers at something: but I say, with FPGA and HDLs like Confluence, you can actually write functional code that translates - line by line - into physical hardware that will evaluate everything in parallel, all at once.
Handel-C, for instance (although not functional) will translate every line of code into physical hardware. Doing i++;? It will create adder logic, wire up registers for input from the previous statement and output onto the next (it's a bad example that implies FPGAs have a concept of "execution" like a CPU that you're clinging to, I just wanted to talk about the dedicated-hardware-for-every-statement thing).
On an FPGA, it's all happening at once. There is no such thing as a CPU unless you create one. There are no bottle-necks unless you create one.
That's why, in one of my image-processing projects, my FPGA, whose master clock ran at 65MHz (other speeds were derived from this clock to drive complex logic down to speeds of about 4MHz), out-performed a 3.0GHz Pentium 4 by a factor of 10. Whereas the P4 was too general, my FPGA design could do in just one clock cycle something that took the P4 1000s of cycles, because my FPGA logic was dedicated to the task.
DSP and Control Systems theory also requires a good understanding of higher mathematics, although now we're entering the realm of EE which is my background... functional programming certainly is extremely useful in FPGA design if you have a problem that can be expressed functionally, which is often the case in electronics.
And think about it: designing FPGA logic is much like building larger "macro" circuits with off-the-shelf chips. Each small part serves a specific "black-box" function. Unlike specs in software, specs in electronic engineering often are mathematical, where signals require some sort of transformation - sometimes in the time domain, sometimes in the frequency domain, sometimes complex-S...
I'm not disagreeing with your overall argument, just saying that you need to open your mind a bit - "never" is a very strong word
The process should not be new. Use well-understood engineering principles and processes. (Specify/Design/Code/Verify, and iterate as needed) New processes invite shortcuts and misunderstanding.
For the verification and testing, also use well-understood engineering principles and processes. (Specify/Design/Code/Verify the tests!) The tests should be automated, so you can easily repeat them.
Verify every verifiable thing, and design them to be verified. Test every testable thing, and design them for testability. Review, Verify, Review, Verify, absolutely everything. Plan a huge fraction of your total budget for review and verification.
If you design in try..catch logic, or N-version programming, voting algorithms, or similar stuff, make sure you apply sound process and verification to each of the alternative codes... If you don't, it is very likely they will all be wrong. Be aware that added complexity only increases opportunities for errors. One correct algorithm is better than several incorrect algorithms.
Measure reliability, using test-case failure rates and frequencies, and review/verification findings, to know when you reach the required reliability.
Do not schedule any deliveries until the reliability measurement tells you it is almost ready to deliver. There is an old saying that goes something like "Feature, Schedule, Quality. Pick any two. You can't have all three." If you want very high quality, you cannot stick to a schedule.
There really are no shortcuts.
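As an illustration of the voting idea mentioned above for N-version programming, here is a sketch of a majority voter over the results of several independent implementations. The function name and the use of plain int results are assumptions; the candidate search is the Boyer-Moore majority vote algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical voter for N-version programming: returns true and sets
// `winner` only when a strict majority of the results agree.
bool majority_vote(const std::vector<int>& results, int& winner) {
    // Boyer-Moore pass: find the only value that could hold a majority.
    int candidate = 0;
    std::size_t lead = 0;
    for (std::size_t i = 0; i < results.size(); ++i) {
        if (lead == 0) {
            candidate = results[i];
            lead = 1;
        } else if (results[i] == candidate) {
            ++lead;
        } else {
            --lead;
        }
    }
    // Verification pass: confirm the candidate really has a majority.
    std::size_t votes = 0;
    for (std::size_t i = 0; i < results.size(); ++i) {
        if (results[i] == candidate) {
            ++votes;
        }
    }
    if (votes * 2 <= results.size()) {
        return false;  // no agreement: the caller must treat this as a failure
    }
    winner = candidate;
    return true;
}
```

The voter only helps if the versions fail independently. As the post says, three unverified implementations that share the same misunderstanding will happily outvote the truth.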
The reasons for this are as large and as complex as the language itself. However, just to touch on a couple of the most obvious: incompatibility between compilers (and even within a compiler as new features are added), to some really poor encapsulation techniques (some of this can be addressed using opaque data types and discipline, provided your management is prepared for complaints from coworkers about how hard it is to understand your code). And then...well, you get the picture. I wouldn't wish a C++ project on my worst enemy (and I could have, and I didn't).
If your management is insisting on C++ then they are already embroiled in micromanaging details they should not be - start polishing your resume, you are going to need it soon, no matter how the project goes.
There's no excuse for buffer overflows and memory leaks in C++, not with TR1's smart pointers and not with the standard library's containers.
I've been using smart pointers for nearly as long as I have been using C++, and I can assure you: they don't guarantee correct memory management. There simply is no way to do predictable, safe, well-defined memory management in C++. All smart pointers can do is reduce the probability that you screw up, but at a steep cost in terms of performance, and without any guarantee that third party code is as careful as you.
That's not even considering garbage collectors which have been available in C++ for years.
The only garbage collectors that work with normal C++ implementations are conservative, and those are both inefficient and not accurate.
You can't fix C++ without massive backwards incompatibilities.
Just use the C stuff within C++. Spawning stuff at run time (in the hope that it actually gets killed when you destruct) is dangerous to stability. Don't use virtual file systems (operating system). Go back to the good old days of static coding. Load once and everything you need is in memory; data is initialized at load time as well as at first usage. Keep it simple and it should be predictable (stable).
cursethedarkness
One thing at a time? Do you even know how FPGAs work?
"I call a baby goat a 'goatse.'" -- my non-Internet-savvy 6-year-old stepdaughter
You have no idea what you're talking about in this case.
First of all if you think that quantum mechanics is "hodge-podge nonsense" then what are your theories on how the laser in your computer works?
Secondly, "stuff being there and not there at the same time" has been SEEN (i.e. with the eyes) in the lab (Bose-Einstein). When you want to rant in the future, talk about stuff you have studied in order to not make yourself look like a moron.
I'm a little stunned at the fact that you think C++ is inherently less stable and more crash-prone than other languages. True, in C++ there's generally more you need to do than in other languages, but frankly, my first C# application (with garbage collection, etc) had a major memory leak. I only picked it up because I happened to be looking at the task manager and noticed the memory usage going up, not because things were unstable. I was calling .Dispose on an object and just assigning null didn't help for some reason.
I also get errors popping up all the time and need to wrap code in try catch blocks.
C++ also has this construct so my advice to you is make as many mistakes as possible and get used to spotting where code is likely to be vulnerable and take action to avoid crashing.
Employing patterns and decoupling can prevent individual modules from stopping entirely, but it's not going to stop the individual modules from being unstable.
You still need to do explicit error checking whenever you write code, regardless of the language or patterns you decide to use.
If you want to write platform-independent code, then use an API which has the same goal.
For GUI apps use Qt or GTK, and for 3d games use Ogre3d.
But using these won't eliminate the need to follow sound error handling practices.
A programming language will always give YOU the option of how things should be handled in the event of an exception.
That's my 0.2 EU's.
And how is this different to C ?
Of course this is contrived, but since you don't necessarily know what 'func' is going to do, how do you know it's safe to free the memory? C++ can trigger an exception in the middle of a constructor, for example. So when you then get to the exception, it is not clear what to do.
Oh well, what the hell...
- Pepper your code with assert() calls to make sure that if anything goes wrong, it crashes quickly.
- Make sure that it can restart seamlessly.
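A minimal sketch of the crash-and-restart idea, assuming a POSIX system (fork/waitpid) and one process per module. The names supervise and flaky_module are made up for illustration, and the restart cap is my own assumption so that a module which keeps crashing cannot loop forever:

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdlib>

// Runs module_main in a child process and restarts it if it crashes or
// exits with a non-zero status. Returns the attempt number (starting at 0)
// on which the module exited cleanly, or -1 if max_restarts was exhausted
// or fork/waitpid failed.
int supervise(int (*module_main)(int attempt), int max_restarts) {
    for (int attempt = 0; attempt <= max_restarts; ++attempt) {
        const pid_t pid = fork();
        if (pid < 0) return -1;                 // could not create the child
        if (pid == 0) {
            _exit(module_main(attempt));        // child: run the module
        }
        int status = 0;
        if (waitpid(pid, &status, 0) < 0) return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            return attempt;                     // clean exit, stop restarting
        }
        // Killed by a signal (segfault, abort) or non-zero exit: try again.
    }
    return -1;                                  // gave up: restart loop
}

// Hypothetical module for illustration: crashes on its first two attempts.
int flaky_module(int attempt) {
    if (attempt < 2) std::abort();              // simulated crash (SIGABRT)
    return 0;
}
```

The crash in the child never touches the supervisor's address space, which is the whole point of decoupling at the process level. A real module would reload its saved state at startup instead of taking an attempt number.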
What's important is that the crash/resume cycle is faster than the necessary response time. If you have hard real-time constraints, life gets a whole lot nastier, but if an occasional half-second glitch is okay, things can go well. You have to think really hard about state, and do everything as a two-phase commit. Network connections to applications that don't resume cleanly are particularly tricky; you have to save and restore the sequence numbers across the crash/reboot. This requires NOT using the OS TCP implementation, or hacking it heavily to not send the ack until you've committed the state produced by the packet reception.
I have crashing bugs happily running in production, because it gets back up and keeps going with no problems. The bad problems are reboot loops, when the "resume from crash" code crashes. You have to be very paranoid there.
But it really does work remarkably well. Oh, one more tip: add a version number to your state files. Any time you don't change it, you can perform a software upgrade in-place by crashing the old version and letting the new one resume. Otherwise, you have to have scheduled downtime for every upgrade.
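Here is a rough sketch of a versioned state file, under my own assumptions: a text format, a made-up State struct, and a write-to-temp-then-rename step so a crash in the middle of a save leaves the old file alone. Note that std::rename only replaces an existing file atomically on POSIX systems, and a production version would also fsync before renaming:

```cpp
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical layout version; bump it whenever the fields of State change.
constexpr std::uint32_t kStateVersion = 3;

struct State {
    std::uint64_t last_sequence = 0;
    std::uint64_t items_done = 0;
};

// Writes the state to "<path>.tmp" and then renames it over path.
bool save_state(const std::string& path, const State& s) {
    const std::string tmp = path + ".tmp";
    {
        std::ofstream out(tmp, std::ios::trunc);
        out << kStateVersion << ' ' << s.last_sequence << ' '
            << s.items_done << '\n';
        out.flush();
        if (!out) return false;                 // open or write failed
    }                                           // file closed here
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}

enum class LoadResult { Ok, Missing, VersionMismatch, Corrupt };

// Loads the state only if the file was written with the same version.
// On any failure, s is left untouched.
LoadResult load_state(const std::string& path, State& s) {
    std::ifstream in(path);
    if (!in) return LoadResult::Missing;
    std::uint32_t version = 0;
    if (!(in >> version)) return LoadResult::Corrupt;
    if (version != kStateVersion) return LoadResult::VersionMismatch;
    State loaded;
    if (!(in >> loaded.last_sequence >> loaded.items_done)) {
        return LoadResult::Corrupt;
    }
    s = loaded;
    return LoadResult::Ok;
}
```

A VersionMismatch result is the signal that an in-place upgrade is not possible and the new binary needs a migration step or scheduled downtime.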
I know the Haskell 98 report, but by standard I meant a normative document, preferably issued by an international or national body like ISO (or ANSI, ECMA), which is usually a necessary step to give the language credibility (not by virtue of its design, as Haskell is rather elegant, but rather by showing that there is enough interest from industry backing it).
How come no one has mentioned Project COSA when it comes to writing, as you termed it, 'bulletproof code', a 'silver bullet'?
The founder of Project COSA intends to create just that, code that is guaranteed to be bug free. http://www.rebelscience.org/Cosas/COSA.htm
Well, in C++ at least, the best way is to employ a few simple techniques with extreme diligence:
1. With each line of code, handle all possible failure cases, however unexpected or unlikely, gracefully in retail builds.
2. Do not use Asserts or empty catch blocks as your only form of "error handling".
3. Standardize on using either "function return values" or "exceptions" as your form of error passing up the stack in your codebase, and stick to it.
4. Never ignore exceptions or function return values unless it's intentional and for good reason. To indicate that choice, cast the function call explicitly to (void) to make it clear that ignoring the result is intentional, and add a comment documenting the reasoning.
You should never just have an Assert, ever, anywhere, because it will do exactly zilch in retail builds. You should *always* use asserts in combination with code that takes appropriate action in the event the assert happens to be violated for whatever reason. I'm actually much more of a fan of utilizing a carefully-crafted "Ensure" macro, which always evaluates the condition in retail builds, asserts if the condition evaluates to false, and then has either a "goto" label to jump to the relevant point (such as function clean-up and exit) or a function call pointer (which is the handler to execute upon failure).
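To make that concrete, here is one way such a macro could look. This is only a sketch of the goto-label variant: ENSURE, ensure_failed and duplicate_string are names I made up, and the failure hook just logs and counts where a debug build might also assert or break into the debugger:

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical failure hook. A debug build could also call assert(false)
// here; this sketch only logs and counts so the failure path stays usable.
static int g_ensure_failures = 0;

static void ensure_failed(const char* expr, const char* file, int line) {
    ++g_ensure_failures;
    std::fprintf(stderr, "ENSURE failed: %s (%s:%d)\n", expr, file, line);
}

// Always evaluates cond, in every build. On failure it reports and jumps
// to the cleanup label supplied by the caller.
#define ENSURE(cond, label)                              \
    do {                                                 \
        if (!(cond)) {                                   \
            ensure_failed(#cond, __FILE__, __LINE__);    \
            goto label;                                  \
        }                                                \
    } while (0)

// Example: copy a string into a freshly allocated buffer.
// Returns true on success; *out then owns memory the caller must free().
bool duplicate_string(const char* src, char** out) {
    // Declared before the first ENSURE: C++ does not allow a goto to jump
    // past an initialized declaration.
    bool ok = false;
    char* buf = nullptr;
    std::size_t len = 0;

    ENSURE(src != nullptr, cleanup);
    ENSURE(out != nullptr, cleanup);

    len = std::strlen(src);
    buf = static_cast<char*>(std::malloc(len + 1));
    ENSURE(buf != nullptr, cleanup);

    std::memcpy(buf, src, len + 1);
    *out = buf;
    buf = nullptr;      // ownership handed to the caller
    ok = true;

cleanup:
    std::free(buf);     // free(nullptr) is a no-op
    return ok;
}
```

Every failure path goes through the same cleanup label, so there is exactly one place where resources are released.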
Similarly, you should not just eat exceptions in most cases. Only do so when you're clearly being intentional about it, and add a comment to point out that the empty catch handler is intentional and explain why.
Standardizing the way you let functions pass errors back to their callers (either by function result, or by exception throwing) is important because when you get a mix of approaches you have to perform acrobatic coding to properly handle failure cases. Stick with one or the other. Obviously anything can throw an exception at any time, so you should always trap exceptions... but ideally each function traps its own exceptions and then returns an appropriate result code to its caller, or else all the functions in your app are "void" return type and throw exceptions containing error codes to communicate failure back to the caller.
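A small sketch of the first convention (each function traps its own exceptions and hands back a result code). The Status enum and parse_port are hypothetical names for illustration only:

```cpp
#include <new>
#include <stdexcept>
#include <string>

// Hypothetical result codes for this sketch.
enum class Status { Ok, InvalidInput, OutOfRange, OutOfMemory, Unknown };

// Parses a decimal port number. Every exception is trapped inside the
// function and translated into a Status; nothing escapes to the caller.
// port is only written on success.
Status parse_port(const std::string& text, int& port) noexcept {
    try {
        std::size_t used = 0;
        const int value = std::stoi(text, &used);   // may throw
        if (used != text.size()) return Status::InvalidInput;  // trailing junk
        if (value < 1 || value > 65535) return Status::OutOfRange;
        port = value;
        return Status::Ok;
    } catch (const std::invalid_argument&) {
        return Status::InvalidInput;    // no digits at all
    } catch (const std::out_of_range&) {
        return Status::OutOfRange;      // does not fit in an int
    } catch (const std::bad_alloc&) {
        return Status::OutOfMemory;
    } catch (...) {
        return Status::Unknown;         // deliberate catch-all at the boundary
    }
}
```

The caller then has exactly one thing to check, the returned Status, and never needs a try block around this call.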
Moderator hint: a comment is neither "Flamebait" nor "Troll" if it is true.
There are some simple design patterns that can really help you. The Null Object and the RAII (Resource Acquisition Is Initialization) patterns are essential to stable C++ code. The STL or an equivalent library is the last piece to the puzzle.
The Null Object pattern removes the defect-prone NULL value checks throughout your code by creating an object that represents null behavior for any situation where the existence of a real object is illogical.
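For instance, a rough sketch with a made-up Logger interface. Worker never has to test a pointer for NULL, because "no logger" is itself an object that does nothing:

```cpp
#include <string>
#include <vector>

// Hypothetical interface for illustration.
class Logger {
public:
    virtual ~Logger() = default;
    virtual void log(const std::string& message) = 0;
};

// A real implementation that keeps messages in memory.
class MemoryLogger : public Logger {
public:
    void log(const std::string& message) override { lines.push_back(message); }
    std::vector<std::string> lines;
};

// The null object: same interface, deliberately does nothing.
class NullLogger : public Logger {
public:
    void log(const std::string&) override {}
    static NullLogger& instance() {
        static NullLogger shared;   // one shared do-nothing instance
        return shared;
    }
};

class Worker {
public:
    // Defaults to the null object, so logger_ always refers to something.
    explicit Worker(Logger& logger = NullLogger::instance())
        : logger_(logger) {}

    int add(int a, int b) {
        logger_.log("adding");      // no "if (logger_ != NULL)" needed
        return a + b;
    }

private:
    Logger& logger_;
};
```

A default-constructed Worker runs silently, while one given a MemoryLogger records one line per call to add.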
The RAII pattern prevents resource leaks by tying resources to object lifetimes. By using stack allocation through the STL container and smart pointer (if needed) classes you can pretty much never use the "new" keyword in your application. You will have to create a copy constructor and an "operator =" deep copy method for STL to be able to handle your classes as value classes, otherwise you will need to use smart pointers.
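A short sketch of both halves, assuming a C++11 compiler (std::unique_ptr rather than the TR1/Boost pointers). FileCloser, Record and save_records are illustrative names. Note that when every member is already a value type such as std::string or std::vector, the compiler-generated copy constructor and operator= do the deep copy for you:

```cpp
#include <cstdio>
#include <memory>
#include <string>
#include <vector>

// Custom deleter so std::unique_ptr closes the FILE* automatically.
struct FileCloser {
    void operator()(std::FILE* f) const {
        if (f) std::fclose(f);
    }
};
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Value class: both members manage their own memory, so copying a Record
// copies the string and every sample. No "new" or "delete" anywhere.
struct Record {
    std::string name;
    std::vector<double> samples;
};

// Writes one line per record. The file is closed on every exit path,
// including the early returns, without an explicit fclose call.
bool save_records(const std::string& path,
                  const std::vector<Record>& records) {
    FilePtr file(std::fopen(path.c_str(), "w"));
    if (!file) return false;                    // nothing to clean up
    for (const Record& r : records) {
        if (std::fprintf(file.get(), "%s %zu\n",
                         r.name.c_str(), r.samples.size()) < 0) {
            return false;                       // FilePtr still closes it
        }
    }
    return true;
}
```

Hand-written copy operations only become necessary once a class holds a raw owning pointer or handle directly, which is exactly the situation the wrapper types are meant to avoid.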
I've heard people criticize STL and especially its smart pointer feature, but after extensive use I have no complaints other than the god-awful error messages compilers give you when things don't go right and the poor debugging support for visualizing such containers.
personal attacks hurt, especially when deserved