Doom 3 Source Code: Beautiful
jones_supa writes "Shawn McGrath, the creator of the PS3 psychedelic puzzle-racing game Dyad, takes another look at the Doom 3 source code. Unlike the technical reviews by Fabien Sanglard, Shawn zooms in with emphasis purely on coding style. He gives his insights into lexical analysis, const and rigid parameters, amount of comments, spacing, templates and method names. There are also some thoughts about coming to C++ with a C background and without it. Even John Carmack himself popped in to give a comment."
I'll be here all week.
It's just a personal point of view about coding style... some things, like the vertical spacing around braces (in the section "Doom does not waste vertical space"), are just the opposite of readable source code (just my opinion, of course, as someone who does a lot of reviews of other people's source code).
A developer needs to make an application for the hardware people have, not the hardware he wishes people had. Otherwise, he's likely to end up limiting his market to a subset that's not big enough to turn a profit.
In some ways, I still think the Quake 3 code is cleaner, as a final evolution of my C style, rather than the first iteration of my C++ style, but it may be more of a factor of the smaller total line count, or the fact that I haven’t really looked at it in a decade. I do think "good C++" is better than "good C" from a readability standpoint, all other things being equal.
I sort of meandered into C++ with Doom 3 – I was an experienced C programmer with OOP background from NeXT’s Objective-C, so I just started writing C++ without any proper study of usage and idiom. In retrospect, I very much wish I had read Effective C++ and some other material. A couple of the other programmers had prior C++ experience, but they mostly followed the stylistic choices I set.
I mistrusted templates for many years, and still use them with restraint, but I eventually decided I liked strong typing more than I disliked weird code in headers. The debate on STL is still ongoing here at Id, and gets a little spirited. Back when Doom 3 was started, using STL was almost certainly not a good call, but reasonable arguments can be made for it today, even in games.
I am a full const nazi nowadays, and I chide any programmer that doesn’t const every variable and parameter that can be.
The major evolution that is still going on for me is towards a more functional programming style, which involves unlearning a lot of old habits, and backing away from some OOP directions.
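As a rough illustration of the two habits described in this comment (const on everything that can take it, and a more functional style), here is a minimal sketch. `Vec2`, `Scaled`, and `SumX` are hypothetical names invented for this example; none of them come from the Doom 3 source.

```cpp
#include <vector>

// Hypothetical 2D vector type, used only for this illustration.
struct Vec2 {
    float x;
    float y;
};

// Pure function: the input is taken by const reference and a new value is
// returned, so the caller's data is never modified.
Vec2 Scaled(const Vec2& v, const float factor) {
    return Vec2{v.x * factor, v.y * factor};
}

// Sums the x components. The parameter and the loop variable are const;
// only the accumulator is allowed to change.
float SumX(const std::vector<Vec2>& points) {
    float total = 0.0f;
    for (const Vec2& p : points) {
        total += p.x;
    }
    return total;
}
```

Because `Scaled` returns a new value rather than mutating its argument, a caller can reason about it without knowing what else holds a reference to the same object.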
One might suggest that every good programmer, if they spend enough time improving, eventually moves toward a more functional programming style.
"First they came for the slanderers and i said nothing."
I've developed for large game and non-game projects, and each needs a different approach. Console games especially have serious problems with dynamic memory allocation (they don't typically have swap files and can die due to heap fragmentation) so you have to avoid a lot of convenience libraries like STL.
STL, however - especially in newer compilers that support C++0x - is actually quite good and is very, very robust. It's a good way to avoid a lot of the memory management bugaboos that happen when you *are* doing lots of dynamic/heap allocation. So I would very much endorse a sane amount of STL use in desktop code.
The other thing that rubbed me the wrong way here was public member variables. Since inlining and move semantics make getters and setters essentially free, there is no good reason to expose bare, public variables on anything but the simplest, most struct-like objects. The biggest source of weird, hard-to-trace bugs in our code at the game studio was people modifying public members of other objects in unexpected ways or at unexpected times.
Having public, non-const member variables actually hurts a principle the author supports, which is "Code should be locally coherent and single-functioned". This means that an operation should do one thing and put you in one of several known and easily discoverable states, even on failure. That is, if I say, make this guy do X, then either he does X or he fails and ends up in a known state. If that state is available in the form of modifiable public data, then his state can get messed with at any point along that path by some other code, and the final state (in cases of success and failure) is not fully known. At the very least, making data private means that only certain code paths can modify the data, and it's much easier to keep state coherent.
Anyway, that's just my $0.02.
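The point about private state and known outcomes can be sketched roughly as follows. `Door` and its members are hypothetical names chosen for illustration; this is not code from Doom 3 or from the poster's studio.

```cpp
// Hypothetical game object: all state is private, so only these member
// functions can change it.
class Door {
public:
    // Trivial accessors; an optimizing compiler will normally inline them.
    bool IsOpen() const { return open_; }
    bool IsLocked() const { return locked_; }

    // Locks the door. Fails, leaving the state unchanged, if the door is open.
    bool Lock() {
        if (open_) {
            return false;
        }
        locked_ = true;
        return true;
    }

    void Unlock() { locked_ = false; }

    // Opens the door. Fails, leaving the state unchanged, if it is locked.
    bool TryOpen() {
        if (locked_) {
            return false;
        }
        open_ = true;
        return true;
    }

    void Close() { open_ = false; }

private:
    bool open_ = false;
    bool locked_ = false;
};
```

With public members, any code could set both flags at once and produce an open, locked door. Here that combination can only arise through a bug inside the class itself, which narrows where to look.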
Are you an idiot? Case statements don't do the same thing as if else. The example in the article does some floating-point compares. How do you represent that as a case statement in C++? Come on, I'm waiting. Oh, that's right. You can't.
Case statements take an integer value and switch based on it. You cannot have case (dot < -epsilon) or case (dot > epsilon). Got that? Good.
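For reference, the kind of if/else chain being argued about looks roughly like the sketch below. The identifiers `LIGHT_CLIP_EPSILON`, `SIDE_BACK`, `SIDE_FRONT`, and `SIDE_ON` appear elsewhere in this thread; the epsilon value, the enum definition, and the `ClassifySide` wrapper are assumptions made so the fragment compiles on its own.

```cpp
// Assumed value: the thread shows only the name, not the definition.
constexpr float LIGHT_CLIP_EPSILON = 0.1f;

// Assumed enum: only the enumerator names appear in the thread.
enum Side { SIDE_FRONT, SIDE_BACK, SIDE_ON };

// Classifies `dot` into one of three ranges. The conditions are floating-point
// range tests, which cannot be written directly as case labels.
Side ClassifySide(const float dot) {
    if (dot < -LIGHT_CLIP_EPSILON) {
        return SIDE_BACK;
    } else if (dot > LIGHT_CLIP_EPSILON) {
        return SIDE_FRONT;
    } else {
        return SIDE_ON;
    }
}
```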
wonder what else ID missed
If you think Carmack "missed" something, take a deep breath, count to ten, and figure out what you missed.
No, he's not perfect - I found a bug in DOOM 2 that he never tracked down - but until you prove yourself STFU about how Carmack may have "missed" something you only learned on Stack Overflow anyway. Carmack is a Level 99 Wizard while all you can do is read the descriptions of the kinds of spells he can cast.
God people like you are annoying. Shut up and think, and you might learn something.
I really liked this bit, because it's something I've been really focusing on for the last year or so, and I think it has significantly improved my code:
Comments should be avoided whenever possible. Comments duplicate work when both writing and reading code. If you need to comment something to make it understandable it should probably be rewritten.
Comments can be useful, IMO, but primarily only for generating documentation (think Javadoc or doxygen, etc.). Other exceptions include bits of code that perform highly-optimized mathematical calculations, in which case I think the best solution is to write a proper document and then add a comment linking to the document, and bits of code that do something which apparently could be done differently but for some other reason must not -- assuming that explanation doesn't belong in the doc-generating comments.
Other than that, I find it makes my code a lot better if every time I find myself wanting to write a comment to explain some bit of code's purpose or operation, I instead refactor until the comment is no longer necessary. Often it's as simple as taking a chunk of code from one method/function and pulling it out into another with a well-chosen name, or else introducing a variable to hold an intermediate value in a calculation, with a well-chosen name. Sometimes the fact that a bit of code is hard to explain is a strong indicator that the design is wrong, that stuff is mashed together that shouldn't be.
The bottom line is that I've found eliminating comments does more for improving the readability of my code than anything else, and I've gotten similar feedback from colleagues whose code I critique by pointing out that they can eliminate their comments if they refactor a bit.
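A small sketch of the "refactor until the comment is unnecessary" idea, using the named-intermediate-variable technique mentioned above. The function names, parameters, and thresholds are all hypothetical.

```cpp
// Before: the condition needs a comment to explain what it means.
bool CanCheckOutBefore(int age, bool hasCard, double unpaidFees) {
    // member is an adult with a valid card and only small unpaid fees
    return age >= 18 && hasCard && unpaidFees < 10.0;
}

// After: well-chosen names for the intermediate values carry the same
// information, so the comment is no longer needed.
bool CanCheckOut(const int age, const bool hasCard, const double unpaidFees) {
    const bool isAdult = age >= 18;
    const bool feesAreSmall = unpaidFees < 10.0;
    return isAdult && hasCard && feesAreSmall;
}
```

Both versions compute the same result; the second one cannot drift out of sync with a comment because there is no comment to go stale.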
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
And a note on the relative evil of comments; bad or not, well placed comments have saved me an awful lot of time when taking on maintenance of code bases in the past. Most of the time they can't present a design document to you, or if they do it covers the design at the start of the project, a decade and a half earlier. Code is a method of communication between two programmers, but if the code doesn't suffice to illuminate the design the original programmer had in mind, I'd really appreciate a comment explaining his thoughts. Especially if the particular section of code is complex, and especially if I'm the guy writing it and end up being the guy maintaining it a couple years later.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
I know that in some situations else-if statements are necessary, but my understanding is that case statements are far faster.
Very often rules about efficiency like this one are incorrect. Sometimes the compiler will even change things completely when you compile it. In one example, I once carefully wrote a function to only have a return statement at the end, because I (somehow) thought it would be more efficient. Then I looked at the assembly output from the compiler, only to find that the compiler had added in all the extra return statements I had so carefully avoided. After that, I just went with what was most readable.
If you really care about efficiency, there is one way to do it: you MUST time your code. Try the case statement, and time it. Then try the if statements, and time it. If you don't time it, you are just guessing and you WILL be wrong.
The case of the if statements in the article is a tricky example, because it is a range, and writing it as a switch statement would likely be a large table. Doing this could actually slow things down because it fills up the memory caches with mostly needless information. Note that this can also be a problem with traditional optimizations like pre-calculated tables or loop unrolling: they can actually slow things down.
TLDR: If you want to make your code efficient, you need to time it.
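A minimal sketch of what "time it" could look like with the standard `std::chrono::steady_clock`. The helper name `AverageNanoseconds` is hypothetical, and a serious benchmark would also need warm-up runs, repeated trials, and care that the optimizer does not delete the work being measured.

```cpp
#include <chrono>

// Runs `work` `iterations` times and returns the average wall-clock time
// per call in nanoseconds. A rough measurement, not a rigorous benchmark.
template <typename Work>
double AverageNanoseconds(Work work, const int iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

To compare two variants, pass each as a lambda and make sure its result is actually used (for example, accumulated into a `volatile` variable), otherwise the compiler may remove the code under test entirely.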
"First they came for the slanderers and i said nothing."
"Things seem wiser when you become older and senile"
Write a review of Solaris code, and it'll probably get posted on Slashdot, too. I for one would be interested in reading that.
after the first three conclusions, and i stopped reading so i can't speak for the rest. should be: 1.) const as appropriate, not "const everything possible". const can fuck you hard in OOP if you use it wrong, 2.) you can never have too many comments, and 3.) tight vertical spacing is archaic and stupid, unless absolutely necessary for some display reason
if this guy was interviewing here and mentioned all the things in his article, i probably wouldn't hire him. too much "religion", as it were, which is a huge red flag for me because it's usually masking something...
Often as in you've measured it, or often as in "I'm making shit up"?
A good compiler will never implement a case statement as a load of if-else's, unless the case values are sparse, or you're not optimizing.
Meanwhile, transforming a set of if-else statements into a lookup table is seldom possible unless the if-elses all compare the same integer variable to a constant. In that case, it can in theory, but almost certainly won't in practice.
Other things being equal, a switch statement with contiguous constant cases will almost always compile to faster code than the equivalent set of if-elses. And it will be far faster. Every if/else induces a branch, and mis-prediction will be severe on most of those branches, causing 10-20+ cycles of stall on modern processors. The jump table mispredicts almost always, but only once. If one arm is taken 99% of the time you can speed things up by using an if/else and then a switch, but that's a rare case.
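For concreteness, here is a sketch of the two forms being compared: a switch over contiguous constant cases and the equivalent if/else chain. The `Op` enum and both functions are hypothetical. Whether a given compiler emits a jump table for the switch depends on the compiler, its flags, and the target, so the only way to know is to inspect the generated assembly and measure.

```cpp
// Hypothetical opcode set with dense, contiguous values 0..3.
enum Op { OP_ADD = 0, OP_SUB = 1, OP_MUL = 2, OP_AND = 3 };

// Switch form: contiguous constant cases are the shape a compiler can
// lower to a jump table if it decides that is profitable.
int ApplySwitch(const Op op, const int a, const int b) {
    switch (op) {
        case OP_ADD: return a + b;
        case OP_SUB: return a - b;
        case OP_MUL: return a * b;
        case OP_AND: return a & b;
    }
    return 0;  // only reached for values outside the enum
}

// If/else form: semantically the same, written as sequential comparisons.
int ApplyIfElse(const Op op, const int a, const int b) {
    if (op == OP_ADD) {
        return a + b;
    } else if (op == OP_SUB) {
        return a - b;
    } else if (op == OP_MUL) {
        return a * b;
    } else if (op == OP_AND) {
        return a & b;
    }
    return 0;  // only reached for values outside the enum
}
```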
I appreciate the fact you're responding to the idiocy of the above post, but your points are as wrong as his.
Heh... if they're dictating tab width, they're doing it wrong. If you must have a certain tab width, you should be using spaces for everything or you lose the whole benefit of tabs - letting people choose their preferred indentation size.
Use tabs for indentation, spaces for alignment. That way you'll never go wrong. Looks like this was one of the less "beautiful" things about the Doom 3 code.
== Jez ==
Do you miss Firefox? Try Pale Moon.
Except that will throw off diff.
Case statements can be optimized using jump tables.
Any semantically-equivalent code (that is, two instances of code that "does the same thing") can be optimized to the same set of instructions. It's just a matter of whether or not the optimizer can figure that out.
> I don't see the point of not using STL.
Code bloat, hard to debug, memory fragmentation, and no way to serialize/deserialize in a fast way.
I highly recommend that ALL C++ programmers read this doc on why EA designed and implemented their own STL version. It provides insight into the type of problems that console game developers face and that PC game developers routinely ignore or are ignorant of.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html
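One common way to sidestep heap fragmentation is a container whose storage is fixed at compile time. The sketch below is a deliberately simplified, hypothetical `FixedVector`; it is not EASTL's actual API and it assumes `T` is default-constructible and copy-assignable.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity container for illustration only.
template <typename T, std::size_t Capacity>
class FixedVector {
public:
    // Appends a value. Returns false instead of allocating when full.
    bool PushBack(const T& value) {
        if (size_ == Capacity) {
            return false;
        }
        items_[size_++] = value;
        return true;
    }

    std::size_t Size() const { return size_; }

    // Read-only element access; the index must be below Size().
    const T& operator[](const std::size_t index) const {
        assert(index < size_);
        return items_[index];
    }

private:
    T items_[Capacity] = {};  // storage lives inside the object itself
    std::size_t size_ = 0;
};
```

The trade-off is that the capacity must be chosen up front, and running out becomes an explicit failure the caller has to handle rather than a hidden allocation.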
> How do you represent that as a case statement in C++? Come on, I'm waiting. Oh, that's right. You can't.
switch (dot < -LIGHT_CLIP_EPSILON ? 1 : dot > LIGHT_CLIP_EPSILON ? 2 : 0) {
    case 1:
        sides[i] = SIDE_BACK;
        break;
    case 2:
        sides[i] = SIDE_FRONT;
        break;
    default:
        sides[i] = SIDE_ON;
}
Yeah, yeah, I know, that's totally ridiculous (although I did see things as bad and worse as a CS instructor's assistant whose job it was to grade Pascal students' programming assignments back in the day - that was very interesting to say the least).
On a side note, why can't > and < characters be used in a code element? Um, that's lame, especially for a site that discusses programming so much.
Better known as 318230.
Oh my god, this is the worst programming advice I've ever heard. Is this a joke? Maybe some clever attempt at creating job security?
There is a terrible dearth of commented code in the world -- especially in the lower-level languages like C and C++ -- and this guy is telling people we need fewer comments in our code?
Modern copyright is theft of culture from everyone and it retards the progress of the useful arts and sciences.
He loves the lack of white space, I hate it. Cramped code is irritating to read. If you want to take up less vertical space, reduce your font and increase the whitespace. You have a better sense of the separation of statements, stronger scoping and less room for error.
He also loves the lack of comments. I remain firmly in the camp that if you eschew comments as common practice, you're an idiot and you should stay away from programming on big teams.
It's not a clarity of code issue. I expect your code to be clear, too. But even after 20 years of programming, I read English faster than I read code. A description of an algorithm in English is going to be more terse than the code that implements it. Your code has to account for edge cases, but I probably just want to know what the code does and how the code does it at a high level so I can get a sense of the system and architecture. A descriptive method name only tells me WHAT the method does, not the manner in which it's done.
English (any natural language, really) is a powerful language with extraordinary expressive power. I don't understand why programmers are constantly trying to sweep it under the rug. Don't fill your code with useless comments like // increments the counter by 1, but if you're doing a non-trivial mathematical calculation that takes a whole method to encapsulate it, let me know what I'm getting into.
Code comments--especially system level comments--should include the name of the author or current maintainer, as well. I tag my methods with my name and the date that the code was put in so people know where to go if there's trouble. They don't have to hunt through perforce time-lapses to see that I checked it in, they just email me.
And have some consideration for the new guy on the team, or the team that has to use your code 5 years in the future. They can't ask you questions, the context of the situation is lost, the code-base might be in the middle of being re-purposed (common in the game industry--which is where I am); comments are essential to maintainability. Man, I do code reviews and people often manage to forget exactly what they were trying to do, and it's only been a few hours. We always work it out, but if there were a comment, we wouldn't even have to spend THAT time.
Use comments. Use them wisely. It makes you a better programmer because you're wasting less of OTHER people's time.
case statements are not faster than if-else statements
This is one of the worst comments I've ever seen with an Informative mod on Slashdot.
Most of the time, switch / case statements are optimized by the compiler to use jump tables that are much more efficient at runtime than evaluating expression after expression.
I went to eat some animal crackers and the box said, "Do not eat if seal is broken." I opened the box and sure enough..
I would agree here. I only saw some bits of it but comparing code between Linux, BSD, and Solaris that did the same thing, the Solaris stuff was definitely the easiest to understand. Linux I found to be the most obtuse in comparison. Though to be fair the code bases are so large with so many authors that some code may look great while others are awful. Solaris I think wins just from having coding styles and standards.