Abandoning Header Files?

Posted by Cliff on Friday January 14, 2005 @09:56AM from the strange-compilation-practices dept.

garethw asks: "I'm working on a project where the lead developer, following a suggestion by our tool vendor, wants to get rid of the header files and directly #include source code. The language is somewhat specialized, but for all intents and purposes you can assume it's Java or C. The conventional argument I recall for using header files and incremental compilation is that it's faster to use a makefile and conditionally build only those files that have changed. However, it turns out that the brute-force approach of invoking the compiler once on the top-level file does actually compile much faster. Still, #include'ing source files directly and compiling only the top-level file just doesn't 'feel' right, and I'm at a loss to give a solid argument as to why. Has anyone actually used this approach? Does anyone have any thoughts on any advantages or drawbacks?"

GCC by lexarius · 2005-01-14 10:17 · Score: 2, Interesting

My OS prof was demonstrating the differences in what errors the C compiler and linker would pick up. However, we found that we could make two source files with no include lines in either that both defined a global variable (sans extern). The main function set the global variable and then called a function that is defined in the other source file, which would then print the gv. Then we compiled and linked them with gcc. No warnings, no errors. The program ran exactly the way we wanted it to, which was unexpected. So yes, you can do away with includes and header files without even performing the includes manually. Depending on the language, your compiler might be smart enough to figure it out.
But that doesn't make it a good idea. Besides, do you want to be the one who has to go update the library functions that would normally have been included any time you change the code in one file?
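
For reference, what the comment above describes is what C calls a tentative definition. The following is a minimal sketch, assuming the contents of the two source files are pasted into one translation unit (as a single-file build would do); the names `gv`, `set_gv`, `read_gv`, and `round_trip` are made up for illustration. Within one translation unit, C merges repeated tentative definitions of the same object. Across separately compiled files the outcome depends on the toolchain: older GCC releases defaulted to `-fcommon` and let the linker merge them, while GCC 10 and later default to `-fno-common` and report a multiple-definition error.

```c
#include <assert.h>

/* --- contents of the first source file --- */
int gv;                      /* tentative definition: no extern, no initializer */

void set_gv(int value) { gv = value; }

/* --- contents of the second source file --- */
int gv;                      /* second tentative definition of the same object */

int read_gv(void) { return gv; }

/* Hypothetical helper: write through one "file", read through the other. */
int round_trip(int value)
{
    set_gv(value);
    int seen = read_gv();
    assert(seen == value);   /* both declarations name a single object */
    return seen;
}
```

Adding an initializer to both declarations (for example `int gv = 0;` twice) turns them into full definitions, and the compiler then rejects the file.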

unclear on what you mean by Dink Paisy · 2005-01-14 10:27 · Score: 2, Interesting

You need to clarify exactly what is going on... My best effort at interpretation is that currently you have something like:
gcc -c f1.c
gcc -c f2.c
gcc -c f3.c
gcc -o f f1.o f2.o f3.o
Your vendor instead thinks it would be better to do:
gcc -o f f.c
Where f.c looks like:
#include "f1.c"
#include "f2.c"
#include "f3.c"

Am I right, or am I completely off track?
If I'm right, you'd probably still want to include header files because you want everything to remain modular. According to software engineering type people, that makes maintenance easier. Another problem is symbol scoping. C keeps file-scope static symbols local to the source file they appear in, and that protection disappears once everything is #included into one file, so you want to make sure you have naming conventions, namespaces, or some other protection against naming clashes. I'm dubious about the benefits, but I work on projects that take significant amounts of time to compile. Not hours, but enough time that if you wait for all the objects to compile you are wasting a lot of time. In general, I'd claim that the larger the project, the worse an idea it is.
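
The scoping concern can be sketched as follows. This is a hypothetical single-file build in which the bodies of `f1.c` and `f2.c` share one translation unit; all function names are invented for illustration. Compiled separately, both files could have called their private helper `scale`, because `static` keeps each one local to its file. Once they are #included into one file, two functions named `scale` would be a redefinition error, so the sketch assumes a per-file prefix convention.

```c
#include <assert.h>

/* --- would live in f1.c --- */
static int f1_scale(int x) { return x * 2; }    /* private helper, prefixed */

int f1_compute(int x) { return f1_scale(x) + 1; }

/* --- would live in f2.c --- */
static int f2_scale(int x) { return x * 10; }   /* same role, different prefix */

int f2_compute(int x) { return f2_scale(x) - 1; }

/* Hypothetical caller that uses both modules. */
int combined(int x)
{
    assert(x >= 0);          /* sanity check for this sketch only */
    return f1_compute(x) + f2_compute(x);
}
```

Renaming both helpers back to `scale` makes the compiler stop with a redefinition error, which is the clash a naming convention is meant to prevent.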

--

Whoever corrects a mocker invites insult;
whoever rebukes a wicked man incurs abuse.
--Proverbs 9:7

Never thought we'd need to explain the obvious by Lord Kano · 2005-01-14 10:46 · Score: 2, Interesting

You have issues with scope.

The easiest one for me is that with #includes, it's so much easier to fix bugs. If you find a bug or an inefficient way to solve a problem, you only have to fix it once. Everything that #includes the suspect file will be fixed on the next compile.

If customfunctions.h has been changed or optimized, you don't have to edit the 30 projects that you're using those functions in. Just the one file is fixed, and each project gets the benefits during the next compile.

LK

--
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano

Off the top of my head... by Dormann · 2005-01-14 10:50 · Score: 2, Interesting

Playing devil's advocate to most of the replies I've read so far, there are two benefits you could get from this approach.
If your language supports "static" in the file-scope sense, you could declare every global object as static and reap the compiler optimizations that come with that declaration.
If your language supports smart inlining, you could end up with code that has been inlined more effectively, since any code could be an inlining candidate regardless of location.
I can think of plenty of reasons to back away from the idea, but they'll flood in here without my help.
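
Both points can be illustrated with a small sketch, assuming plain C; the function names are hypothetical. A `static` function has internal linkage, so the compiler knows every call site is in the current translation unit and may inline the function everywhere and drop the standalone copy. Whether it actually does so depends on the optimizer and the flags in use.

```c
#include <assert.h>
#include <stddef.h>

/* Internal linkage: nothing outside this translation unit can call it,
   so the optimizer is free to inline it at each call site. */
static int square(int x) { return x * x; }

/* Public entry point; the loop body is a candidate for inlining square(). */
int sum_of_squares(const int *values, size_t count)
{
    assert(values != NULL || count == 0);   /* sanity check for this sketch */
    int total = 0;
    for (size_t i = 0; i < count; ++i)
        total += square(values[i]);
    return total;
}
```

In a conventional multi-file build, the same effect across files typically needs link-time optimization or a definition placed in a header.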

Use #includes sparingly? by mattgreen · 2005-01-14 11:06 · Score: 2, Interesting

When I write libraries, I try to make them header-only. Generally users don't want to have to modify their makefiles if they don't have to, and I'll resort to compiler specific pragmas if I have to.

It depends on the size of the system. If you are using a component-based system then only the pieces of the system that are actually being modified should be compiling anyway, which cuts out a lot of compilation. However this implies there is fairly loose coupling involved. In a more conventional application, there has to be a breaking point where the amount of time to parse files is longer than the time to link them normally. Using precompiled headers on any system header will also drastically decrease the time it takes to compile, since the compiler essentially just dumps the parse tree out to disk. So much time is spent inside some system headers! (Especially Windows.h. Ugh!)

There are some tools that keep the header files in sync with source files automatically, but I don't know of any off-hand. I have seen some for C, but I'm not sure there is one for C++ that supports all the wild and crazy stuff like namespaces and templates. :)

It can work by Foolhardy · 2005-01-14 11:23 · Score: 2, Interesting

Including all the source code into one main file compiled to one object can work, if the source files cooperate. C can have problems with the namespace, but C++ allows multiple namespaces and you can even put the namespace blocks in the main file around the #includes. The source code has to support this, though. It's best if all the source files to be included are under your control. For libraries that expect to use a declarative header, use it like it was intended.

I've done this on lots of projects and it works great. Most of the arguments here are either about performance or an appeal to tradition (that's the way we've always done it... must be the only true way). Modern compilers will create pre-compiled headers that can include code, usually used for template and inline definitions; modern compilers don't get the same benefits from the traditional model anymore. Actually, even larger projects seem to take longer to link with iostream and windows.h than the source does to compile.
The compiler's ability to optimize code may be increased greatly, especially its ability to inline functions. Too much inlining will cause code bloat, but the compiler's options should give you control over the balance.
Modern compilers also allow you to change the compilation options mid-file.
Any debugger or source analyzer shouldn't have problems handling inline or same-file implementations, or you're using bad tools.
It can also be easier to create test code; create a series of test files t01.cpp, t02.cpp (each with a main) but include only one. The others are there for reference but don't interfere. This is also useful for testing a prototype replacement for a component; include the new one and comment out the old include. Going back is trivial.

It's more a question of coding style than anything. Personally, I hate maintaining redundant information of any kind, and this very much includes the prototypes in the header with the actual functions. Source code redundancy is bad for all the same reasons that database redundancy is bad. Making my C++ member functions inline and including their files frees me from this.

I don't think this will work too well in Java. A Java source filename = the .class filename = the ONE public class exported by the file. Unless you want a total of 1 public class, it won't work. Java doesn't use header files anyways. Class binaries export everything public automatically.

Learn from others' mistakes by GoofyBoy · 2005-01-14 11:28 · Score: 2, Interesting

It's an interesting approach and you have no idea why you shouldn't do it.

So do it.

In the end, regardless of whether it works or not, you will have learned something new.

--
The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.

Faster!?! Have you tried it? by multriha · 2005-01-14 12:22 · Score: 2, Interesting

Well, coding style and software engineering aside, you need to do some testing if you think this will increase your speed.

Quick test to illustrate. 1,000,000 lines of C code, using gcc 3.3.4, default options.

Time to compile spread across 1000 files (with 8 lines of include and function body per file): ~2 minutes

Time to compile all in a single file: unknown

Why is the second time unknown? My computer doesn't have the memory to do it. Now I could pump up the memory of my machine (assuming I've got a 64-bit machine) to let me do the second compilation, but I doubt it'll be faster.

Re:The immortal advice of Rocket J Squirrel by crmartin · 2005-01-15 02:27 · Score: 2, Interesting

Uh, which loader and which linker, and for that matter which operating system etc? I suspect you're right about linux, and it'd be lovely to believe that there aren't any systems (other than in hobbyists' basements) that don't do it that way any more, but with the information we've got, we can't exclude the possibility.

removal of duplication is usually a good thing by mqx · 2005-01-15 04:31 · Score: 2, Interesting

Just remember that a header file defines the interface to the body, which actually duplicates some of the material in the body. Because of this duplication, you can have problems, e.g. faulty build dependencies, mismatch between header/body, etc. Removing this sort of duplication is usually a good thing, so if the technology (i.e. the compiler) is smart enough and fast enough to get it right, then the change could be a good thing.
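
As a small illustration of the duplication in C terms, here is a sketch with hypothetical file and function names (`area.h`, `area.c`, `rect_area`), shown as one translation unit the way the preprocessor would present it. The prototype and the definition repeat the same signature, and the compiler can only check that they agree when both are visible together.

```c
#include <assert.h>

/* --- what would be area.h: the interface --- */
#ifndef AREA_H
#define AREA_H
long rect_area(int width, int height);   /* prototype: the duplicated part */
#endif

/* --- what would be area.c, which starts with #include "area.h" --- */
long rect_area(int width, int height)
{
    assert(width >= 0 && height >= 0);    /* sanity check for this sketch */
    return (long)width * height;
}
```

If the definition's return type were changed to `int` without touching the prototype, the compiler would reject the file with a conflicting-types error. If `area.c` did not include its own header, the same mismatch would compile and link without complaint, which is the header/body problem described above.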

I'd like to point out that many other respondents have argued their case with reference to 'C'; however, the poster clearly said it was not 'C' -- without further information, it's difficult to know whether these 'C' type issues will translate. I'd point out that some languages, e.g. python, java, perl, do not have ideas of separate header/body -- suggesting that the "current trend" in languages is to do away with the duplication.

The compiler could be intelligent enough to construct a parse tree quickly, and only resolve parts of the parse tree when necessary: so for example, if there was previously a 5K header, and a 30K body, but now only a 30K body, the compiler may read the entire 30K, and only "roughly" parse it (e.g., say for a function, it parses the outer scope of the function, but resolves nothing inside the function until some other code actually uses the function).

I don't think there's an answer for this guy: there are too many issues that haven't been stated, as we know nothing about the particular toolchain, the build environment, the language, etc. All we have is an abstract concept of splitting files into header/body. That concept by itself isn't good or bad; it depends upon a lot of other issues that change the perspective.

My answer would be that surely in the guy's company he has a couple of clueful senior engineers that can sit around a whiteboard and discuss (using their computer science training) what actual impact the change will have on the project, and whether to go with the impact.

Re:This isn't C++ or Java by garethw · 2005-01-15 14:28 · Score: 2, Interesting

> The answer has to be "we don't know" because you haven't told us what language this is. I reckon it's some sort of verilog (revolting) or system-C (yuk yuk) abomination,

Not a bad guess. It's Vera. But what's the point of telling people that? How many people on slashdot even know what Vera is?

> in which case I would say *** do what the vendor tells you *** because if you don't and it all falls apart they will say "we told you so"

But what's messed up is that the vendor just decided to start advocating this one day. The Vera compiler even supports header generation - it's been done the way we expect for years. And bang, one day they suddenly start encouraging this weird approach. "What the vendor tells me" is not what the vendor told me five years ago, or last year.

I know that there was some discussion sparked internally at the vendor because I raised this question with their FAEs. I already know what other verification people think; I wanted to get an impression of what people thought of the same methodology if it were applied to similar languages.

--
garethw