Abandoning Header Files?


Posted by Cliff on Friday January 14, 2005 @09:56AM from the strange-compilation-practices dept.

garethw asks: "I'm working on a project where the lead developer, following a suggestion by our tool vendor, wants to get rid of the header files and directly #include source code. The language is a somewhat specialized language, but for all intents and purposes, you can assume it's Java or C. The conventional argument I recall for using header files, and incremental compilation, is that it's faster to use a makefile and conditionally build only those files that have changed. However, it turns out that the brute force of invoking the compiler once on the top-level file does actually compile much faster. I feel that #include'ing source files directly, and compiling only the top-level file, just doesn't 'feel' right, and I'm at a loss to really give a solid argument as to why. Has anyone actually used this approach? Does anyone have any thoughts on any advantages or drawbacks?"

Need more info... by sfjoe · 2005-01-14 10:00 · Score: 4, Insightful

...following a suggestion by our tool vendor,...

How much money will your tool vendor make if you implement this suggestion and what, if any, product does she sell that neatly solves any problems this might bring up?

--
It's simple: I demand prosecution for torture.
1. Re:Need more info... by Tim+Browse · 2005-01-14 15:04 · Score: 4, Insightful
  
  If they're anything like some tool vendors I've come across, it's because they either don't have decent compilation performance, or don't support the features that would help, such as pre-compiled headers, etc.
  
  So rather than fixing the problem by investing in their product, they're telling their customers to use ugly hacks to get around the product's shortcomings, and hope they won't switch to another system (I suspect).
  
  I've certainly been on the receiving end of such tactics.
  
  The dead giveaway is when they start saying things like "pre-compiled headers wouldn't help you anyway" :-)
Not useful for C by david.given · 2005-01-14 10:00 · Score: 4, Informative

...or, to a lesser extent, C++, because of the way C scoping works:
Static global variables have scope within the module (source file) they're defined in. Which means that two static globals with the same name in different source files don't collide, because they're in different modules.
Including everything into one big source file will mean that they're both in the same module, and so will collide. Not good.
Can't say about other languages, though.
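
A minimal sketch of that collision, assuming two hypothetical source files, logger.c and parser.c, pasted into one translation unit the way a top-level #include would do it. All names are invented for illustration. In C, two uninitialized file-scope `static int count;` declarations in the same translation unit are tentative definitions of a single object, so this particular merge compiles and the two "modules" silently share state. With initializers on both, or in C++, it would be a redefinition error instead.

```c
/* ---- pasted from hypothetical logger.c ---- */
static int count;                  /* meant to be private to the logger */
static void log_event(void) { count++; }

/* ---- pasted from hypothetical parser.c ---- */
static int count;                  /* same name: in C this is the SAME object */
static void parse_token(void) { count++; }

/* Returns the logger's idea of how many events it has logged. */
int logged_events_after_one_log_and_one_parse(void)
{
    count = 0;
    log_event();      /* logger counts one event      */
    parse_token();    /* parser bumps "its" counter   */
    return count;     /* 2, not 1: the counters were merged */
}
```

Compiled as two separate files, each `count` would be its own object and the logger would report 1.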
Keep the header files by SunFan · 2005-01-14 10:06 · Score: 4, Insightful

They are just about the only way to centrally organize declarations for data structures and function signatures. Doing so will save your ass eventually, because having function prototypes available allows the compiler and lint tools to catch stupid programmer errors. You do use lint-like tools, right? They _will_ catch bugs that testers and visual scanning won't.

The only drawback to headers in C is that if you forget to 'make clean' after changing a header, you can end up with object files using old definitions. Just make a habit of doing a full build after changing the headers. If you designed your software properly, changing header files won't be all that common (adding functions, new data structures, etc.).
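
A small sketch of what a prototype buys you, assuming a hypothetical geometry.h / geometry.c pair (the names are made up for illustration). The three parts are shown in one file here so it compiles on its own.

```c
/* ---- what a hypothetical geometry.h would declare ---- */
double scale(double value, double factor);

/* ---- caller, e.g. a hypothetical main.c that includes geometry.h ---- */
double scaled_three(void)
{
    /* With the prototype in scope, the int arguments 3 and 2 are converted
       to double before the call. A call with the wrong number of arguments,
       such as scale(3), would be rejected at compile time. */
    return scale(3, 2);
}

/* ---- definition, e.g. in a hypothetical geometry.c ---- */
double scale(double value, double factor)
{
    return value * factor;
}
```

Without the declaration, a pre-C99 compiler would assume an implicitly declared function returning int and pass the arguments as ints, which is undefined behavior here; C99 and later require a diagnostic for the missing declaration.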

--
-- Microsoft is the most expensive commodity operating system and office suite vendor in the marketplace.
Why? by Pacifix · 2005-01-14 10:23 · Score: 4, Insightful

It seems that the onus should be on the vendor to explain very, very convincingly why you should abandon decades of standard practice and good coding practice. This better be one hell of a good product you're developing to justify such a radical change. You shouldn't need to defend standard practice; they must campaign for a change to that practice. Imagine trying to explain this to all the coders who will work on the product for the next decade - will they think you're crazy or is there really a reason to do this?
1. Re:Why? by CamMac · 2005-01-14 10:33 · Score: 5, Funny
  
  Remember, if you remove all the comments from the code, it will compile faster and the executable will be smaller.
  
  --Cam
  
  --
  All jocks think about is sports. All nerds think about is sex.
The immortal advice of Rocket J Squirrel by crmartin · 2005-01-14 11:35 · Score: 5, Insightful
"Oh, Bullwinkle, that trick never works."
One of the really depressing things about having been in the business for nigh on to 40 years now is that, along with the occasional new dumb idea, all the old dumb ideas keep coming back. Among those dumb ideas that keep coming back are "visual programming" --- using graphics instead of programming languages; complicated schematic graphics for software --- UML in its utterly complex form; and, sure enough, using the preprocessor to mess with C-like languages.
Every time this is tried --- and God knows it's been tried a lot --- you run into some severe problems:
1. The scoping rules of C-like languages give semantics to file inclusion. If you #include chunks of code, you are defeating the language's (limited) ability to protect you from name space clashes, mis-named variables, and so on.
2. While it might be that you gain something from only needing to start the compiler once, parsing and compilation are inherently a bit harder than O(n) where n is the number of source characters or tokens. A normal environment with make(1) will generally need to process fewer tokens than compiling everything all the time; the time required for a big file will inevitably dominate the startup time eventually.
  If you've got control of the compiler for this peculiar language, why not explore making the startup time shorter, say, eg., by using shared libraries, DLLs, or by setting the sticky bit?
3. From sad experience, I can tell you that using the #include scheme will introduce weird-ass order dependencies into the code (ie., what order do you include files in?) that are very very difficult to debug.
4. Most tools for C-based languages expect you to do the sources in a normal fashion. You confound the tools' expectations at your peril.
5. Similarly, most debuggers exploit, or attempt to exploit, scoping rules that you will break through this approach.
6. When you write lots of smaller modules, each one can create a single, small TEXT and DATA section, or a collection of small code sections. This makes the job of memory mapping in virtual memory systems much easier. Do it all as one big thing, and you're liable to get one big TEXT section.
7. Optimization is combinatorially fairly hard, quadratic or worse, and global optimizations tend to be managed within section boundaries. One-big-module designs may either make the optimization phase very lengthy, or defeat optimization entirely when table space etc. runs out.
8. You piss off every experienced C programmer who ever has to deal with the code in the future, especially old farts like me who've seen this trick 20 years ago.
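
Point 3 above can be sketched as follows, assuming two hypothetical files, board_config.c and util.c, that the top-level file pastes in this order. The names and sizes are invented for illustration.

```c
#include <stddef.h>

/* ---- pasted first: hypothetical board_config.c ---- */
#define SMALL_BUFFERS 1

/* ---- pasted second: hypothetical util.c ---- */
#ifdef SMALL_BUFFERS
#define UTIL_BUFFER_SIZE 16
#else
#define UTIL_BUFFER_SIZE 64
#endif

static char util_buffer[UTIL_BUFFER_SIZE];

/* Reports how large util.c's buffer ended up being. */
size_t util_buffer_size(void)
{
    util_buffer[0] = '\0';          /* touch the buffer so it is really used */
    return sizeof util_buffer;
}
```

Swap the two pasted sections and util.c is compiled before SMALL_BUFFERS exists, so the buffer quietly becomes 64 bytes with no diagnostic. With separate compilation, util.c would see only what it includes itself.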
Re:Several advantages and disadvantages by stonecypher · 2005-01-14 21:22 · Score: 4, Informative

1. Faster compile of the full product.

Well, back in the real world, in a properly decoupled project incremental linking is a massive speed win, even when building from the top, as there's far less cross-lexing and as the build tables may be handled a small piece at a time, which is important because their parsing in the compiler itself is generally of O(n^2 log n) time or better. Once you've worked on a large project which fails to make proper decouplings, you will become painfully aware of this trend.

Whereas in this particular project the complete build is apparently faster, that is almost certainly the result of a very naive code tree and/or build scheme; the importance of incremental linking towards speed of compile cannot be overestimated, even in the case of compiling from clean.

2. Much better optimization. Compilers can only optimize within a compilation unit.

This simply isn't true. Only some compilers make cross-TU optimizations, but that is not the same as compilers being able to optimize only within a translation unit (why do people keep saying compilation unit? There's no such thing!). Besides, you're dramatically underestimating the commonality of link-time cross-TU counterspecialization, which now exists in ICC, BCC, MSCC, ARM ADS, EDG/Comeau, GHOC, and is in experimental development within GCC.

You're not doing it the way everyone expects you to do it. Certain components (the compiler, the linker, and pre-existing code) might have been designed under the assumption that individual files would be compiled separately.

They most certainly have not been. The C and C++ standards do not allow for such ridiculously inappropriate behavior. Where did you get this idea? Compiler writers may not impose arbitrary restrictions on the codebase in any relation to the local filesystem. This is just untrue.

The pre-existing code might have declared static (per-file) variables or functions in a way that could collide with other code (namespaces might help here).

This is a well known gigantic red flag indicating an amateur programmer. File-scoped variables are antiquated even within the pure C community; the only time they're acceptable in most professional programmers' eyes is within a library which is built alone. In fact, you might want to read the things Kernighan himself said about when file-scoped variables are appropriate in K&R 2; the primary author of the language himself says that this is a fundamentally bad technique and should not be done.
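
One common way to do without file-scoped state, sketched here with a hypothetical counter module (the post itself names no specific alternative, so this is an assumption about what is meant): keep the state in a struct the caller owns and pass it in.

```c
/* Hypothetical module state, owned by the caller rather than hidden in a
   file-scoped static variable. */
struct counter {
    int value;
};

void counter_init(struct counter *c) { c->value = 0; }
void counter_bump(struct counter *c) { c->value++; }
int counter_get(const struct counter *c) { return c->value; }

/* Two independent counters: there is no shared name to collide, whichever
   file or translation unit the code ends up in. */
int bump_first_twice_second_once(void)
{
    struct counter a, b;
    counter_init(&a);
    counter_init(&b);
    counter_bump(&a);
    counter_bump(&a);
    counter_bump(&b);
    return counter_get(&a) * 10 + counter_get(&b);   /* 2 and 1, packed as 21 */
}
```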

Of course, that you're causing problems by misusing the toolchain and allowing bad code to collide when build trees written separately are blindly merged without the help of a linker is just not surprising.

The compiler and linker might have limits.

Not if they're standards compliant, they mightn't. Did you know that there's a document out there floating around telling compiler authors in concrete detail what they may and may not do? You should read that before commenting on what a compiler may or may not do; you are simply out in left field, here.

As with every issue you'll ever run into, there are two (or three) sides to it.

Not when you know what you're talking about. Whereas many things are issues of pro/con, many simply aren't; you'll be hard pressed to find pros in the distribution of heavy ordnance to delusional sociopaths, you'll be hard pressed to find pros in setting up a "bring a molester to school day," and you'll be hard pressed to find pros in non-decoupled code, once you've actually read the standard and are aware of the real limitations of compiler authors, instead of your guesses about what might maybe happen if someone wasn't paying attention.

--
StoneCypher is Full of BS