Any "Pretty" Code Out There?
andhow writes "Practically any time I hear a large software system discussed I hear "X is a #%@!in mess," or "Y is unmanageable and really should be rewritten." Some of this I know is just fresh programmers seeing their first big hunk o' code and having the natural reaction. In other cases I've heard it from main developers, so I'll take their word for it. Over time, it paints a bleak picture, and I'd be really like to know of a counterexample. Getting to know a piece of software well enough to ascertain its quality takes a long time, so I submit to the experience of the readership: what projects have you worked on which you felt had admirable code, both high-level architecture and in-the-trenches implementation? In particular I am interested in large user applications using modern C++ libraries and techniques like exception handling and RAII."
Just kidding :))
BoD
Hello World!!!
A Good Troll is better than a Bad Human.
The cruftiness of source code is directly proportional to the amount of time spent working on it times the number of people working on it.
Has someone created such a law before?
GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
"Practically any time I hear a large software system discussed I hear "X is a #%@!in mess,"
I get that with reading the next line you get the context, but was I the only one taken aback at this seemingly blatant flame of our beloved X?
*''I can't believe it's not a hyperlink.''
public interface MessageStrategy {
public void sendMessage();
}
public abstract class AbstractStrategyFactory {
public abstract MessageStrategy createStrategy(MessageBody mb);
}
public class MessageBody {
Object payload;
public Object getPayload() { return payload; }
public void configure(Object obj) { payload = obj; }
public void send(MessageStrategy ms) {
ms.sendMessage();
}
}
public class DefaultFactory extends AbstractStrategyFactory {
private DefaultFactory() {}
static DefaultFactory instance;
public static AbstractStrategyFactory getInstance() {
if (null==instance) instance = new DefaultFactory();
return instance;
}
public MessageStrategy createStrategy(final MessageBody mb) {
return new MessageStrategy() {
MessageBody body = mb;
public void sendMessage() {
Object obj = body.getPayload();
System.out.println(obj.toString());
}
};
}
}
public class HelloWorld {
public static void main(String[] args) {
MessageBody mb = new MessageBody();
mb.configure("Hello World!");
AbstractStrategyFactory asf = DefaultFactory.getInstance();
MessageStrategy strategy = asf.createStrategy(mb);
mb.send(strategy);
}
}
I find the thing that really makes code unreadible is inconsistency. This is particularly true of languages like C++ where there is no well defined one true coding convention. If all your code is in house, this is not such a problem, because you can define your own coding convention and stick to it. If however you are relying on other libraries, chances are your going to end up with one library that names its function like_this, and one likeThis, and another fnct_LikeThis ...
Worse is when you don't even define a coding convention for the code you throw into the mix. Now you have libraries with inconsistent naming, and multiple developers all using their own favorite notation.
Additionally, their is inconsistency in the functioning of libraries. Some use function pointers, some work by inheritance, some (like glade) read the export list..
I'm not a huge Java fan, but I think they have maintainability down pat. Very consistent language, well defined coding convention, and a mature set of defacto tools (JUnit, javadoc, log4j, struts, spring, hibernate, etc..) make it a lot easier to jump into older code because everything feels familiar. In most other languages you have to spend quite a bit of time just decrypting the existing code, and then more time learning the particular API's they've chosen.
The more GOTOs the better!
Creationists are a lot like zombies. Slow, but powerful and numerous. And they all want to eat our brains.
The source for Tcl is widely considered by those who have worked with it to be unusually clean and clear.
Ooops, I almost forgot:
/*
Hello World
Copyright 2002 MillionthMonkey
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
You're welcome, "World"!
Now, I *have* rewritten a lot of the code on the project, but not because it was "ugly". We had quite a lot of "prototype" code still in the project. Since it was prototype code, it didn't check for or handle error conditions very well (not to mention the endless bugs that have been found due to the prototype code). We've had to rewrite a lot of the code because it was easier to do that than fix the bugs in the code. This usually allows for easier debugging in the future AND gets rid of any of the bugs that were found (the bugs were usually caused by a bad or even completely wrong approach to the implementation).
The difference is knowing when to clean up the code and when to rewrite it. If a developer just can't understand the code (because it needs cleanup or it's just very complicated), then it should be cleaned up and commented properly. Sure it's tedious, but everyone on the project loves you afterwards because they can suddenly understand the code! If there are bugs and it's obvious the implementation should have been done a different way (for speed, usability, modularity, whatever), then a rewrite might be in order.
(and of course, as you mention, as time goes on the code starts looking "bad" or "old" again - time for hopefully another cleanup rather than a rewrite)
"X is a #%@!in mess," or "Y is unmanageable and really should be rewritten."
I see those all the time as comments in my own code.
Boost is what I call "template madness". It uses template metaprogramming to the max, which (in the real world) means three things:
(1) It's impossible to debug. You can't read the code. The debugger can't unravel the templated variables and stuff in any meaningful way for you. You can't even step through code, that's doing a supposedly simple operation like memory allocation!
(2) Some compilers will choke on the code, or compile it wrong in subtle ways due to differing interpretations of some obscure section of the enormous C++ language spec.
(3) The error messages from the compiler are useless. You have to run them through a filter to even figure out what they mean.
(3) Bugs in the library are very difficult to fix. Template metaprograms are essentially programs written in a functional language, except one that has horrible syntax. This is not the stuff that normal C++ programs are made of.
Your mileage may vary. My day job is working on a game engine for an upcoming Xbox360 game. Engines are hard enough without impractical crap like template metaprogramming in them. Give me straight-line C/C++ code any day.
As large and old as it is, OpenSolaris has fairly readable code. Plus, most of it has comments explaining why it's done the way it is.
Maybe not
At least you admit to being uninformed.
I haven't looked either, but I happen to know that BOOT::Python often does NOT work. It has thread-related problems.
At for the rest of BOOST, I've looked at a good chunk. BOOST makes decent programmers cry. The other follow-up post by the Anonymous Coward Xbox developer has it all correct.
I'll add:
BOOST is full of butt-ugly hacks. Check out the, uh, template things, named _0 through _9 being used as stand-in dummy arguments. Eeeeeew!!!
BOOST looks easy to dumb-ass programmers, but these programmers leave bugs that are difficult for expert programmers to find.
BOOST makes compilers run very very slow, and often breaks the optimizer anyway.
"only a smell in my opinion if they are hard to read"
A friend of mine used to say: "Source code is like shit, it stinks when it's not yours."
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
I wrote a Perl filter that took C code as input, and applied all kinds of "unprettifications" to it (removing comments, collapsing variable declarations, introducing random curly-brace and indentation styles, removing whitespace or adding strange whitespace). The output looked like it had been written at 3am by a hung-over ex-FORTRAN engineer who had just discovered FORTH.
Then I demonstrated that a bunch of code checked into our system looked like it had *already* been run through this tool. After the public shaming, a couple of the offenders cleaned up their acts for a while, but they're back to their old tricks.
These days I'm working on a project where all the devs are really, really serious about the formatting and naming conventions. Some of the rules suck, in my opinion, but there's a lot to be said for consistency.
[In the 80s, HyperCard team at Apple used to regularly run their sources through a Pascal formatter. The code, in a friend's words, "looked ironed." Unfortunately I haven't run across any good C++ formatters.]
Any sufficiently advanced technology is insufficiently documented.
Here is a little teaser though
If you are about to mod me down, keep in mind that this post was most likely sarcastic.
Phoenix Technologies used to make both BIOS and printer software. The printer software department split off and became a different company, and then I lost track of them...
They made printer software that went into virtually every printer not made by HP at the time. Canon, Ricoh, Lexmark, or whoever would come out with new hardware and license the software from Phoenix. Yep, some of my code is in every Lexmark printer right now.
They had a couple hundred thousand lines of code that did PCL, GL, and Postscript for the consumer market, and it was the most readable and well developed code I have seen. Comments were explanatory, variables were well named, and execution paths were well defined and easy to follow.
They really had their act together for testing as well, with an elaborate and comprehensive regression suite that checked *every* aspect of all of the [printer] languages, and a team of QA people who would go over the results nightly. I'm not making this up - you would come in to work in the morning and there would be maybe 5 E-mails from QA outlining bugs which were either in your code or assigned to you for reasonable reasons.
We did the software for the first Lexmark printer. The first internal release gathered 900 bug reports from QA. When we went to market there were 7 remaining, all of which were deemed inconsequential.
When you are in the commercial market making fixed-program computers (dishwashers, printers, cell phones, VCRs) you don't have security updates and new versions, and a recall is usually out of the question. It's much cheaper to do all of your QA up front and ship a quality product.
In my opinion we've grown sloppy in the programming business. I've been a contractor for the past 30 years and I haven't seen anyone else who comes close to true quality procedures. Even FAA safety certified stuff is usually hokey and obscure. Thank god we've still got human pilots.
Having seen the procedures firsthand I have an appreciation of how easy and valuable it all is. No one else seems to understand that, and so everyone keeps running around putting out fires and slipping deadlines.
The Bourne Shell must get some kind of mention here. What do you do if you prefer ALGOL to C? Why, #define your own syntax, and thus turn boring old C code into a thing of beauty.
Repton.
They say that only an experienced wizard can do the tengu shuffle.
I couldn't agree more. As I grow older, I've learned there really is a time when something is "good enough" to satisfy known requirements. I find that many applications are over engineered to some pie in the sky version of what's right and good - usually at the expense of simplicity and stability. I've often heard Java folks talk about re-factoring code and that's fine if no one is using your app, but in the event that folks and money are dependent on it, then re-factoring really just increases risk to all involved. The best possible outcome is that no one will notice the changes.
It's definitely hard for more passionate developers to realize when time to value ratio has diminished to the point that your time is better spent on other projects. There's always one more thing to spruce up or optimize. Having been both a musician and developer, I like to think of my work as a reflection of me. Playing an instrument is similar in that there is always more to learn and practice on any given song, but sometimes you need to put it down and move along to other pieces. Even the best musicians play a variety of songs. I'm sure Eddie Van Halen could have perfected Eruption for the next 20 years after he recorded it, but he decided to spend his time on other works.
My take on C++ is that the best programs only use a fraction of the features. The language is so big it is dangerous. Just because a feature exists in the language, does not mean it is good for every application. I am very wary of operator overloading and templates too. You need to make your code sufficiently clear that you can be sure it works. if you cannot quickly understand your code, then chances are you made a mistake.
that he thought Bill Atkinson's MacPaint was the most beautiful program ever written. Hearing this, Andy Hertzfeld made it a priority to recover the source code from an old Macintosh diskette. He contacted me because he was a bit worried about Apple's reaction if he just released it on the net (since it was Apple property), and I advised him to get the Computer History Museum involved if he didn't want to take the risk. I believe that he donated the code, but I'm not sure what the Museum did to have it made available.
Tim O'Reilly @ O'Reilly Media, Inc. 1005 Gravenstein Highway North, Sebastopol, CA 95472 http://www.oreilly.com
So this post is perfectly timed. It's a collection of essays by leading software engineers about code they find especially beautiful.
h tml
Andy Oram, the editor, thought it would be poor form to make a post himself, but heck, I thought: this is very relevant. The table of contents for the book can be found at http://www.oreilly.com/catalog/9780596510046/toc.
It includes essays by Brian Kernighan, Jon Bentley, Tim Bray, Yukohiro Matsumoto, Simon Peyton-Jones, and many others. The code is intended not only to be beautiful but also instructive and in many cases re-usable.
We're hoping to build an ongoing site around the book so additional examples would be very welcome.
Tim O'Reilly @ O'Reilly Media, Inc. 1005 Gravenstein Highway North, Sebastopol, CA 95472 http://www.oreilly.com
I'll comment (inline) as someone who has come to appreciate certain of qmail's strengths even while tolerating (to varying degrees) its weaknesses:
qmail code is pretty ugly when looked at closely enough, and can seem unnecessarily "different" from a more-distant perspective.
However, pull back far enough and look at it, and you might be able to appreciate that it is, in its own way, a work of art: a reliable, secure, powerful email system — just as pretty much any sufficiently large and beautiful work of art can look pretty flawed when scrutinized closely, especially without an awareness of the "big picture".
So if I wanted to play around with an email server and make it do all sorts of slick stuff, I wouldn't pick qmail.
But if I wanted to improve a mail server in some fashion while still being reasonably assured the resulting (modified) system wouldn't have remotely exploitable bugs in it, based on what I know right now, I wouldn't pick anything but qmail.
Practice random senselessness and act kind of beautiful.
As my page (to which you link) notes, these bugs are likely exploitable only in theory.
And I've been hired (and paid well) to modify qmail code, including patching it to fix bugs as well as extending it, for years now, but nobody has even inquired as to what it'd take to fix the "Guninski" bugs that might theoretically be exploitable — at least, not so far.
I think that's a pretty sure indication that the qmail user base does not consider those bugs to be sufficiently worrisome to fix. (I did publish a simple fix to one of the first bugs Guninski found; that fix was incorporated into netqmail. But I did that gratis.)
I don't know offhand whether DJB has ever acknowledged any bugs in qmail. But, just as code doesn't lie while comments can, code that is reasonably well-specified, as qmail's components' interfaces are, cannot pretend bugs don't exist in it, even if authors or fanboys do, just as it can't pretend it has bugs even when claimed otherwise[*]. So I don't particularly miss djb's opinions and pronouncements on such issues, since I can read the code and decide for myself.
[*] There's a web page out there that claims "qmail-smtpd does not detect CR LF properly on packet boundaries", which strikes me as complete and utter — as well as easily demonstrable, by simply looking at the code — nonsense. Not that it can't happen, but it'd almost certainly be due to an OS, networking, or (non-qmail) library bug. Tellingly, despite the high likelihood such a bug would result in huge numbers of legitimate emails being rejected by many qmail servers worldwide, there's no information on this alleged bug beyond somebody supposedly reporting it. That's only marginally more persuasive than saying "qmail-smtpd dropped every third email on every server running it on March 17, 2001, between 11:45 and 12:15 UTC, according to a guy I overheard in a bar the other day." Color me unimpressed.
Practice random senselessness and act kind of beautiful.
It's just occurred to me you are Tim O'Reilly. Wow, there are still some important folks that still post on
I remember when that cute little home computer came out, and all the programs were just so. . , plinky.
Memory was a huge barrier, because you only had a small quantity of the stuff, and nobody understood the architecture of the system well enough to produce efficient programs.
But back then, there were no video card upgrades. No faster processors and mother boards being produces every three months. If you wanted higher speed and cooler graphics, you had to write your code in more ingenious ways.
And so that's what happened.
By the twilight years of the Color Computer, the games people were writing on that thing were unreal. I remember looking at a few and thinking to myself, "This is the same computer? Wow! Humans rock!"
When you reach the raw power limitations of your muscles but you still want to improve yourself in your combat skills, you take up Kung Fu. That's how it was in the old home computer days. Nowadays, though, (dang kids; I hadda walk fifty miles to school!) it seems that the bulk of improvement comes with the purchasing of increasingly large muscles.
This is not to say that there is no software innovation. Heck, id Software did some pretty amazing things with software ingenuity. But I do remember thinking during the first few years of the big PC revolution, after the 486 was reaching its twilight, "You know, all this hardware innovation is great and all. . , (big muscles are cool), but part of me wishes it would stop cold for six solid years just what would happen when the programmers were really pushed. --You know, to see what one of these machines is actually capable of doing.
-FL