Abandoning Header Files?


Posted by Cliff on Friday January 14, 2005 @09:56AM from the strange-compilation-practices dept.

garethw asks: "I'm working on a project where the lead developer, following a suggestion by our tool vendor, wants to get rid of the header files and directly #include source code. The language is a somewhat specialized language, but for all intents and purposes, you can assume it's Java or C. The conventional argument I recall for using header files, and incremental compilation, is that it's faster to use a makefile and conditionally build only those files that have changed. However, it turns out that the brute force of invoking the compiler once on the top-level file does actually compile much faster. I feel that #include'ing source files directly and compiling only the top-level file just doesn't 'feel' right, and I'm at a loss to give a solid argument as to why. Has anyone actually used this approach? Does anyone have any thoughts on any advantages or drawbacks?"


Need more info... by sfjoe · 2005-01-14 10:00 · Score: 4, Insightful

...following a suggestion by our tool vendor,...

How much money will your tool vendor make if you implement this suggestion and what, if any, product does she sell that neatly solves any problems this might bring up?

--
It's simple: I demand prosecution for torture.
1. Re:Need more info... by Tim+Browse · 2005-01-14 15:04 · Score: 4, Insightful
  
  If they're anything like some tool vendors I've come across, it's because they either don't have decent compilation performance, or don't support the features that would help, such as pre-compiled headers, etc.
  
  So rather than fixing the problem by investing in their product, they're telling their customers to use ugly hacks to get around the product's shortcomings, and hope they won't switch to another system (I suspect).
  
  I've certainly been on the receiving end of such tactics.
  
  The dead giveaway is when they start saying things like "pre-compiled headers wouldn't help you anyway" :-)
Interface vs implementation, shared libraries, etc by Dimwit · 2005-01-14 10:00 · Score: 3, Insightful

Well, there's the obvious separation of interface from definition. And the problem of duplicate definitions - there's a reason why "extern" is a keyword. :)

Plus, header files define an interface, which is useful if you don't actually have the code (i.e. binary shared library). Moot point in your case, I think, but...

Plus it's just good programming style to have separate definitions and implementations. Easier to track down bugs.
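
To make the `extern` point concrete, here is a minimal single-file sketch in C. The names `request_count` and `record_request` are hypothetical, and the comments mark which part would normally sit in a header and which part would sit in exactly one source file.

```c
/* --- what would live in a header, e.g. counter.h (hypothetical name) --- */
extern int request_count;      /* declaration only: no storage is allocated */
void record_request(void);     /* prototype: the interface other files see */

/* --- what would live in exactly one .c file --- */
int request_count = 0;         /* the single definition */

void record_request(void)
{
    request_count++;
}

/* Declarations may be repeated freely; this is what each includer sees.
 * Repeating the definition (`int request_count = 0;`) would be an error. */
extern int request_count;
```

Calling `record_request()` twice leaves `request_count` at 2. If the definition rather than the declaration were pulled in by more than one #include path, the compiler would see the same variable defined twice.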

--
...but it's being eaten...by some...Linux or something...
Keep the header files by SunFan · 2005-01-14 10:06 · Score: 4, Insightful

They are just about the only way to centrally organize declarations for data structures and function signatures. Doing so will save your ass eventually, because having function prototypes available lets the compiler and lint tools catch stupid programmer errors. You do use lint-like tools, right? They _will_ catch bugs that testers and visual scanning won't.

The only drawback to headers in C is that if you forget to 'make clean' after changing a header, you can end up with object files using old definitions. Just make a habit of doing a full build after changing the headers. If you designed your software properly, changing header files (adding functions, new data structures, etc.) won't be all that common.
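
As a rough illustration of what a visible prototype buys you, consider the sketch below. The function `count_char` is a made-up example, not anything from the poster's project; the point is only that the compiler can compare each call against the declared signature.

```c
#include <stddef.h>

/* Prototype, as it would appear in a shared header (hypothetical API). */
size_t count_char(const char *text, char needle);

/* Definition, as it would appear in one source file. */
size_t count_char(const char *text, char needle)
{
    size_t n = 0;
    for (; *text != '\0'; text++) {
        if (*text == needle) {
            n++;
        }
    }
    return n;
}

/* With the prototype in scope, a conforming compiler diagnoses calls like:
 *     count_char('a', "banana");   // arguments swapped
 *     count_char("banana");        // too few arguments
 * Exactly how loudly (warning or hard error) depends on the compiler. */
```

A correct call such as `count_char("banana", 'a')` returns 3.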

--
-- Microsoft is the most expensive commodity operating system and office suite vendor in the marketplace.
Speed by jbrandon · 2005-01-14 10:14 · Score: 3, Insightful

Have you tested the speed difference when you change only one non-header file? I bet incremental compilation will make that quite a bit faster. In addition, if you want to compile that changed source file to check for syntax or type errors, you don't have to check for collision between it and the whole rest of the project, only collisions between it and the header defining it.
1. Re:Speed by stonecypher · 2005-01-14 21:58 · Score: 2, Insightful
  
  I bet incremental compilation will make that quite a bit faster.
  
  Chances are he's got massive coupling problems, which can totally throw away any benefit of incremental linking. And by the way, incremental compilation is something totally different; whereas I realize that the error is that of the original speaker, not yours, it should nonetheless be pointed out.
  
  C++ does not support incremental compilation, though ICC and MSVC both have extensions to support it. MSVC refers to it as runtime code generation: you change the source, MSVC swaps out a vtbl, and the new compiled piece of source is literally injected into a running program.
  
  TU separation is incremental linking, not incremental compilation.
  
  --
  StoneCypher is Full of BS
Why? by Pacifix · 2005-01-14 10:23 · Score: 4, Insightful

It seems that the onus should be on the vendor to explain very, very convincingly why you should abandon decades of standard practice and good coding practice. This better be one hell of a good product you're developing to justify such a radical change. You shouldn't need to defend standard practice; they must campaign for a change to that practice. Imagine trying to explain this to all the coders who will work on the product for the next decade - will they think you're crazy or is there really a reason to do this?
Several advantages and disadvantages by cookd · 2005-01-14 10:26 · Score: 3, Insightful
Advantages:
1. Faster compile of the full product. You only invoke the compiler process once, and there is much less work for the linker to do.
2. Much better optimization. Compilers can only optimize within a compilation unit. Intel and Microsoft have "Link-time code generation" compilers which perform a final optimization pass during link, but if you aren't using those compilers, there might be a significant amount of additional optimization enabled by putting everything in the same compilation unit.
Disadvantages:
1. You're not doing it the way everyone expects you to do it. Certain components (the compiler, the linker, and pre-existing code) might have been designed under the assumption that individual files would be compiled separately. The pre-existing code might have declared static (per-file) variables or functions in a way that could collide with other code (namespaces might help here). The compiler and linker might have limits. And you might not hit those limits until late in the project.
2. For building the whole product, yeah, it will be faster. But for making a small change and rebuilding the results of that change, it might be much slower.
As with every issue you'll ever run into, there are two (or three) sides to it.
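
The static-collision disadvantage can be sketched in a few lines of C. The file names `lexer.c` and `parser.c` and all identifiers here are hypothetical; the block shows what the single combined file looks like after the clash has been fixed by hand.

```c
/* --- contents of lexer.c (hypothetical), #included first --- */
static int next_id = 100;       /* private to lexer.c when compiled alone */

int lexer_new_id(void)
{
    return next_id++;
}

/* --- contents of parser.c (hypothetical), #included second --- */
/* parser.c originally had its own `static int next_id = 1;`. Compiled
 * separately that was fine. Pulled into the same compilation unit as
 * lexer.c it becomes a redefinition, so it had to be renamed: */
static int parser_next_id = 1;

int parser_new_id(void)
{
    return parser_next_id++;
}
```

Note that the loud failure is the lucky case. In C, two file-scope `static int next_id;` declarations without initializers are both tentative definitions of one object, so the two files would quietly end up sharing a single variable.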
--
Time flies like an arrow. Fruit flies like a banana.
Time you gain, you lose in debugging by StarWynd · 2005-01-14 10:27 · Score: 2, Insightful

While including code directly may speed up the compilation time, you will lose all the time you gain and then some when you get into debugging.
If you have a complicated #include chain, you can wind up with a lot of duplication. Some compilers will complain, some won't. However, if you have typedefs, structs or the like, most compilers will complain and not compile your code until the duplications are removed. I don't know what compiler you're using or if you are planning on including more than functions or global variables, so I don't know if this is an issue or not.
The more general issue is that it's much easier to track down bugs and other problems if there is a clean separation between definitions and implementations. I can't characterize that difference in a few sentences, so I'll just say that it has been my experience that projects which are developed in a true modular nature are much easier to debug than projects designed in a monolithic nature. The time saved in debugging more than makes up for a little time lost in compilation.
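
For readers who have not run into the duplication problem, this is a minimal sketch of the usual defence, the include guard, written out as one file so the effect of a second inclusion is visible. The header name `point.h` and the types are assumptions made up for the example.

```c
/* First inclusion of a hypothetical point.h */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
typedef struct point point_t;
#endif

/* Second inclusion, reached through a different #include chain.
 * POINT_H is already defined, so the preprocessor skips the body
 * and the struct is not defined twice. */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
typedef struct point point_t;
#endif

int point_sum(point_t p)
{
    return p.x + p.y;
}
```

Remove the guards and the second `struct point` definition is rejected as a redefinition. The same guard on a directly included source file would also stop a second copy of its function bodies, but only within that one compilation.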
1. Re:Time you gain, you lose in debugging by stonecypher · 2005-01-14 21:32 · Score: 2, Insightful
  
  If you have a complicated #include chain, you can wind up with a lot of duplication. Some compilers will complain, some won't.
  
  So sorry: the ODR rule prevents duplicated code from functioning in any compliant C or C++ compiler, all the way back to day one (and in fact into the parent languages B and BCPL.) This is simply false.
  
  However, if you have typedefs, structs or the like, most compilers will complain and not compile your code until the duplications are removed.
  
  If by most you mean all...
  
  so I'll just say that it has been my experience that projects which are developed in a true modular nature are much easier to debug than projects designed in a monolithic nature.
  
  Uh. Modularity and monolithism are not related. Modularity is the design technique of separated interface definition such that one may swap in alternate implementations of code, such as through TUs, polymorphism, SFINAE or various metaprogramming techniques. One may contrast structured programming, functional programming, lambda calculus or contract substitution.
  
  Monolithism is a development model: designing all at once and then implementing top-down from start to finish. Contrast the waterfall model, iterative development, chain development, and so on.
  
  The time saved in debugging more than makes up for a little time lost in compilation.
  
  Right on: preach it, brother. This is one of the least understood principles of modern design: machine time is significantly inferior to programmer time. Fred Brooks would be proud.
  
  (No, I'm not being sarcastic. Yes, I did just compliment you heavily after criticism.)
  
  --
  StoneCypher is Full of BS
The immortal advice of Rocket J Squirrel by crmartin · 2005-01-14 11:35 · Score: 5, Insightful
"Oh, Bullwinkle, that trick never works."
One of the really depressing things about having been in the business for nigh on to 40 years now is that, along with the occasional new dumb idea, all the old dumb ideas keep coming back. Among those dumb ideas that keep coming back are "visual programming" --- using graphics instead of programming languages; complicated schematic graphics for software --- UML in its utterly complex form; and, sure enough, using the preprocessor to mess with C-like languages.
Every time this is tried --- and God knows it's been tried a lot --- you run into some severe problems:
1. The scoping rules of C-like languages give semantics to file inclusion. If you #include chunks of code, you are defeating the language's (limited) ability to protect you from name space clashes, mis-named variables, and so on.
2. While it might be that you gain something from only needing to start the compiler once, parsing and compilation are inherently a bit harder than O(n) where n is the number of source characters or tokens. A normal environment with make(1) will generally need to process fewer tokens than compiling everything all the time; the time required for a big file will inevitably dominate the startup time eventually.
  If you've got control of the compiler for this peculiar language, why not explore making the startup time shorter, e.g., by using shared libraries, DLLs, or by setting the sticky bit?
3. From sad experience, I can tell you that using the #include scheme will introduce weird-ass order dependencies into the code (i.e., what order do you include files in?) that are very very difficult to debug.
4. Most tools for C-based languages expect you to do the sources in a normal fashion. You confound the tools' expectations at your peril.
5. Similarly, most debuggers exploit, or attempt to exploit, scoping rules that you will break through this approach.
6. When you write lots of smaller modules, each one can create a single, small TEXT and DATA section, or a collection of small code sections. This makes the job of memory mapping in virtual memory systems much easier. Do it all as one big thing, and you're liable to get one big TEXT section.
7. Optimization is combinatorially fairly hard, quadratic or worse, and global optimizations tend to be managed within section boundaries. One-big-module designs may either make the optimization phase very lengthy, or defeat optimization entirely when table space etc. runs out.
8. You piss off every experienced C programmer who ever has to deal with the code in the future, especially old farts like me who've seen this trick 20 years ago.
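
The order-dependency problem in item 3 is easy to reproduce. The following is a small sketch in C, assuming two hypothetical source files, `net.c` and `log.c`, that have been #included into one top-level file in that order; the macro and function names are invented for the example.

```c
/* --- contents of net.c (hypothetical), #included first --- */
#define BUFFER_SIZE 16
static char net_buffer[BUFFER_SIZE];

/* --- contents of log.c (hypothetical), #included second --- */
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 256         /* the default log.c expects to get */
#endif
static char log_buffer[BUFFER_SIZE];

/* Helpers that report what each file actually ended up with. */
unsigned long net_buffer_size(void)
{
    return (unsigned long)sizeof net_buffer;
}

unsigned long log_buffer_size(void)
{
    return (unsigned long)sizeof log_buffer;
}
```

Compiled as its own file, `log.c` would get its 256-byte buffer. Included after `net.c`, it silently inherits the 16-byte value, and swapping the two #include lines changes the program's behavior without touching either file.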