Abandoning Header Files?
garethw asks: "I'm working on a project where the lead developer, following a suggestion by our tool vendor, wants to get rid of the header files and directly #include source code. The language is a somewhat specialized language, but for all intents and purposes, you can assume it's Java or C. The conventional argument I recall for using header files and incremental compilation is that it's faster to use a makefile and conditionally build only those files that have changed. However, it turns out that the brute-force approach of invoking the compiler once on the top-level file actually does compile much faster. Still, #include'ing source files directly and compiling only the top-level file just doesn't 'feel' right, and I'm at a loss to give a solid argument as to why. Has anyone actually used this approach? Does anyone have any thoughts on any advantages or drawbacks?"
...following a suggestion by our tool vendor,...
How much money will your tool vendor make if you implement this suggestion and what, if any, product does she sell that neatly solves any problems this might bring up?
It's simple: I demand prosecution for torture.
static global variables have scope within the module they're defined in. Which means that two static globals in different source files don't collide, because they're in different modules.
Including everything into one big source file will mean that they're both in the same module, and so will collide. Not good.
Can't say about other languages, though.
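A minimal sketch of that collision, assuming two hypothetical files a.c and b.c that each keep a private counter. Compiled separately, both could call it "count". Once both are #included into one file, "static int count = ..." appears twice in the same translation unit and the compiler rejects the redefinition. The merged version below compiles only because the names were made unique by hand (the a_ and b_ prefixes are an invented convention, not something from the original project):

```c
#include <assert.h>

/* ---- contents that would have come from a.c ---- */
static int a_count = 0;            /* originally: static int count = 0;   */

int a_next(void)
{
    return ++a_count;
}

/* ---- contents that would have come from b.c ---- */
static int b_count = 100;          /* originally: static int count = 100; */

int b_next(void)
{
    return ++b_count;
}

/* Small self-check: the two counters advance independently. */
void counters_demo(void)
{
    assert(a_next() == 1);
    assert(b_next() == 101);
    assert(a_next() == 2);
}
```

With separate compilation none of the renaming is needed, because each file-scope static is invisible outside its own object file.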
Well, there's the obvious separation of interface from definition. And the problem of duplicate definitions - there's a reason why "extern" is a keyword. :)
Plus, header files define an interface, which is useful if you don't actually have the code (i.e. binary shared library). Moot point in your case, I think, but...
Plus it's just good programming style to have separate definitions and implementations. Easier to track down bugs.
...but it's being eaten...by some...Linux or something...
> it's just good programming style to have separate definitions and implementations. Easier to track down bugs.
You can separate definitions and implementations within 1 source file by using the following complex formula:
1. Put the definitions at the top of the source file.
2. Put the implementations at the bottom of the source file (i.e. after the definitions)
This may be difficult to get used to at first, but once you learn how to use the Page Up and Page Down keys it's not so bad...
They are just about the only way to centrally organize declarations for data structures and function signatures. Doing so will save your ass eventually, because having function prototypes available allows the compiler and lint tools to catch stupid programmer errors. You do use lint-like tools, right? They _will_ catch bugs that testers and visual scanning won't.
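A small sketch of what a prototype buys you, using a hypothetical scale() function split across an imaginary scale.h, main.c and scale.c (all names invented for illustration). With the prototype in scope, the integer arguments in the call are converted to double, and a call such as scale("3", 2) would be rejected at compile time. Under the old implicit-declaration rules the same call would pass raw ints to a function expecting doubles, which is undefined behavior:

```c
#include <assert.h>

/* ---- what would live in a hypothetical scale.h ---- */
double scale(double value, double factor);    /* the prototype */

/* ---- a caller, e.g. in main.c ---- */
double doubled_three(void)
{
    /* Integer arguments: because the prototype is visible, the compiler
     * converts both to double before making the call. */
    double r = scale(3, 2);
    assert(r == 6.0);
    return r;
}

/* ---- the definition, e.g. in scale.c ---- */
double scale(double value, double factor)
{
    return value * factor;
}
```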
The only drawback to headers in C is that if you forget to 'make clean' after changing a header, you can end up with object files using old definitions. Just make a habit of doing a full build after changing the headers. If you designed your software properly, changing header files won't be all that common (adding functions, new data structures, etc.).
-- Microsoft is the most expensive commodity operating system and office suite vendor in the marketplace.
Have you tested the speed difference when you change only one non-header file? I bet incremental compilation will make that quite a bit faster. In addition, if you want to compile that changed source file to check for syntax or type errors, you don't have to check for collision between it and the whole rest of the project, only collisions between it and the header defining it.
My OS prof was demonstrating the differences in what errors the C compiler and linker would pick up. However, we found that we could make two source files with no include lines in either that both defined a global variable (sans extern). The main function set the global variable and then called a function that is defined in the other source file, which would then print the gv. Then we compiled and linked them with gcc. No warnings, no errors. The program ran exactly the way we wanted it to, which was unexpected. So yes, you can do away with includes and header files without even performing the includes manually. Depending on the language, your compiler might be smart enough to figure it out.
But that doesn't make it a good idea. Besides, do you want to be the one who has to go update the library functions that would normally have been included any time you change the code in one file?
I think the separate header is simply code duplication, forced by the memory limitations of old C compilers.
Larger programs (compilation units) could be compiled if the preprocessor and compiler were separate passes run in batch, because unused parts of the headers were never seen by the compiler.
The main problem with headers is that preprocessor state is global to the entire operation, not per header or C file.
This makes conditional state flow from one to the other, which makes separate precompiled headers hard (since the conditionals might not match).
Also the header system requires writing makefiles by hand (or using quite complex scanners and tools), while a compiler could do this fairly easily. See most Wirthian languages (e.g. Turbo Pascal vs Turbo C++) for examples.
It seems that the onus should be on the vendor to explain very, very convincingly why you should abandon decades of standard practice and good coding practice. This had better be one hell of a good product you're developing to justify such a radical change. You shouldn't need to defend standard practice; they must campaign for a change to that practice. Imagine trying to explain this to all the coders who will work on the product for the next decade - will they think you're crazy or is there really a reason to do this?
- Disadvantages:
- You're not doing it the way everyone expects you to do it. Certain components (the compiler, the linker, and pre-existing code) might have been designed under the assumption that individual files would be compiled separately. The pre-existing code might have declared static (per-file) variables or functions in a way that could collide with other code (namespaces might help here). The compiler and linker might have limits. And you might not hit those limits until late in the project.
- For building the whole product, yeah, it will be faster. But for making a small change and rebuilding the results of that change, it might be much slower.
As with every issue you'll ever run into, there are two (or three) sides to it.
Time flies like an arrow. Fruit flies like a banana.
gcc -c f1.c
gcc -c f2.c
gcc -c f3.c
gcc -o f f1.o f2.o f3.o
Your vendor instead thinks it would be better to do:
gcc -o f f.c
Where f.c looks something like:
#include "f1.c"
#include "f2.c"
#include "f3.c"
Am I right, or am I completely off track?
If I'm right, you'd probably still want to include header files because you want everything to remain modular. According to software engineering type people, that makes maintenance easier. Another problem is symbol scoping. C keeps symbols local to the module they appear in, so you want to make sure you have naming conventions, namespaces, or some other protection against naming clashes. I'm dubious about the benefits, but I work on projects that take significant amounts of time to compile. Not hours, but enough time that if you wait for all the objects to compile you are wasting a lot of time. In general, I'd claim that the larger the project, the worse an idea it is.
Whoever corrects a mocker invites insult;
whoever rebukes a wicked man incurs abuse.
--Proverbs 9:7
While including code directly may speed up the compilation time, you will lose all the time you gain and then some when you get into debugging.
If you have a complicated #include chain, you can wind up with a lot of duplication. Some compilers will complain, some won't. However, if you have typedefs, structs or the like, most compilers will complain and not compile your code until the duplications are removed. I don't know what compiler you're using or if you are planning on including more than functions or global variables, so I don't know if this is an issue or not.
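For what it's worth, the usual C remedy for that kind of duplication is an include guard. The sketch below simulates a hypothetical point.h being pulled in twice through a tangled #include chain by pasting its text twice. The names point.h, POINT_H and point_sum are invented for illustration:

```c
#include <assert.h>

/* First inclusion of the hypothetical point.h: POINT_H is not yet
 * defined, so the struct is declared and the guard macro is set. */
#ifndef POINT_H
#define POINT_H
struct point {
    int x;
    int y;
};
#endif /* POINT_H */

/* Second inclusion of the same text, as happens when two other headers
 * both #include "point.h": the guard is already defined, so the
 * preprocessor skips the body and struct point is not redefined. */
#ifndef POINT_H
#define POINT_H
struct point {
    int x;
    int y;
};
#endif /* POINT_H */

/* Uses the single surviving definition. */
int point_sum(struct point p)
{
    return p.x + p.y;
}

/* Small self-check. */
void point_demo(void)
{
    struct point p = { 3, 4 };
    assert(point_sum(p) == 7);
}
```

Without the #ifndef/#define pair, the second struct definition is exactly the kind of duplicate the compiler refuses.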
The more general issue is that it's much easier to track down bugs and other problems if there is a clean separation between definitions and implementations. I can't characterize that difference in a few sentences, so I'll just say that it has been my experience that projects which are developed in a true modular nature are much easier to debug than projects designed in a monolithic nature. The time saved in debugging more than makes up for a little time lost in compilation.
If you put all your source code into the compiler at once, what about memory usage? Sometimes, I've seen g++, for example, go crazy on an otherwise normal source file. I really don't know enough to know why.
-- Microsoft is the most expensive commodity operating system and office suite vendor in the marketplace.
Depending on the size of your project, you will get varying returns from each of these:
1. Separate source files mean that units of code can hide data and functions.
2. Separate headers, combined with something like GCC's -Wmissing-prototypes, enforce the good coding practice of well-defined functional interfaces.
3. Separate headers and source files mean that when you look at a function in a file, you will have some idea of what it touches, because you can see that the file included header X but not Y.
4. You can tell the compiler to explicitly forbid global data symbols, which is pointless in one single file.
5. You can use different compiler switches for different files.
6. Your code will have some hope of portability.
If your project is small, it doesn't matter anyway. If your project is large, you can get your compiler to enforce some good design rules on you, which doesn't mean you can't still have a good design anyway, but it will make it more likely. I worked on a project that used a compiler that let you get away with everything. Try and port that code to anything UNIX-like, and it was ridiculous.
Outside of a dog, a book is a man's best friend. Inside a dog, its too dark to read.
For a few years I worked in Modula-2. One of the interesting things about it is that the language has include built in as a first-class concept.
Normally, one file contains one module. A module is pretty similar to a class in that it's an encapsulation structure. Each module has an interface and body/implementation part, each of which are coded. This is very similar to a prototype in a header.
Whenever code in one module references code in another, the module starts with a series of imports. An import reads the interface.
There are a few nice things about this scheme; linking doesn't require a make file because all the information for linking is contained in the code; and the compiler can be smart - when an interface hasn't changed but the implementation has changed, only a recompile of that module is required, and a relink; when an interface changes, the compiler knows what the impact is and need only recompile those implementations that are affected.
Before I used the system, I thought it would be slow. However, it was FAST. I think there were several reasons:
- Modula-2 is a well designed language so compiling is fast.
- The compiler can be aware of what must be compiled.
- The linking process is smart.
Between then and now, C++ has gained all these capabilities, so there's no reason to think that C++ hasn't gained in the same way.
I think it's a better and smarter system than separate header and body files, which were a hack to gain this advantage. And modern compiler technology supports it.
I can understand why you got modded down, but honestly, this is the question that occurred to me as well. There's a great many polite and informative responses here, but when you get right down to it, I think you nailed it.
Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
You have issues with scope.
The easiest one for me is that with #includes, it's so much easier to fix bugs. If you find a bug or an inefficient way to solve a problem, you only have to fix it once. Everything that #includes the suspect file will be fixed on the next compile.
If customfunctions.h has been changed or optimized, you don't have to edit the 30 projects that you're using those functions in. Just the one file is fixed, each project gets the benefits during the next compile.
LK
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
If your language supports "static" in the file-scope sense, you could declare every global object as static and reap the compiler optimizations that come with that declaration.
If your language supports smart inlining, you could end up with code that has been inlined more effectively, since any code could be an inlining candidate regardless of location.
I can think of plenty of reasons to back away from the idea, but they'll flood in here without my help.
Your solution seems interesting, if somewhat unconventional. But what is the problem you are trying to solve with it?
It is hard to tell from your statements, but this may stop tools like ccache from working. I use ccache in my projects and it radically cuts down the amount of recompilation required when I do a complete rebuild. Now, an obvious question is why I don't simply rely on makefiles to ensure only changed files ever get rebuilt. This often happens because compilation involves generating new cpp files that are then compiled and I don't want to be grepping through these all the time. I suppose I could move them all to a different directory, but ccache works very well.
The other problem, of course, is that separating your classes into header and implementation means that if you change the implementation, you only need to recompile that one file and relink, rather than recompiling EVERYTHING. This can be a matter of a few seconds vs. several minutes. And implementation does change, a lot... fix a bug, you fix the implementation. The headers change too, but much much less frequently.
Oceania has always been at war with Eastasia.
Sadly, that doesn't work so well for template classes. I understand why, but it's still unfortunate that you can't separate template declarations from their definitions -- as it is, the code winds up being somewhat messy and hard to read, and if you care about such things (I don't for our in-house software) you can't "hide" the implementation of your code.
I know for our newer programmers, it's hard to read through a messy .h file to see what's "available" -- and that's always the problem when learning a new language or environment, isn't it? You need to know what's there for you to use, so you don't reinvent the wheel -- good header files do that for you.
It takes me over an hour to build our project, when I have to. We have to build more often than I'd like because several years ago we screwed up some interfaces such that our originally-separate libraries and our main app are somewhat locked together (cross-referencing #include) -- it can be fixed, it'd just take time (and we don't have a good tool to hunt down the dependencies we want to rid ourselves of.) Just changing one of a few files affects everything, even if it shouldn't in theory (that is, if I had been in my right mind.) In our case at least, there's a clear advantage when we don't build the project, but only a subset (over an hour vs. a couple minutes, including linking) Maybe if you only have a *few* source files ... but when you've got hundreds?
Besides, you don't always have the source. Eventually, you start having to rely on interfaces to get what you want anyway. Might as well stick with good habits and separate implementation from interface. Eh.
When I write libraries, I try to make them header-only. Generally users don't want to have to modify their makefiles if they don't have to, and I'll resort to compiler specific pragmas if I have to.
:)
It depends on the size of the system. If you are using a component-based system then only the pieces of the system that are actually being modified should be compiling anyway, which cuts out a lot of compilation. However this implies there is fairly loose coupling involved. In a more conventional application, there has to be a breaking point where the amount of time to parse files is longer than the time to link them normally. Using precompiled headers on any system header will also drastically decrease the time it takes to compile, since the compiler essentially just dumps the parse tree out to disk. So much time is spent inside some system headers! (Especially Windows.h. Ugh!)
There are some tools that keep the header files in sync with source files automatically, but I don't know of any off-hand. I have seen some for C, but I'm not sure there is one for C++ that supports all the wild and crazy stuff like namespaces and templates.
Including all the source code into one main file compiled to one object can work, if the source files cooperate. C can have problems with the namespace, but C++ allows multiple namespaces and you can even put the namespace blocks in the main file around the #includes. The source code has to support this, though. It's best if all the source files to be included are under your control. For libraries that expect to use a declarative header, use it like it was intended.
I've done this on lots of projects and it works great. Most of the arguments here are either about performance or an appeal to tradition (that's the way we've always done it... must be the only true way). Modern compilers will create pre-compiled headers that can include code, usually used for template and inline definitions; modern compilers don't get the same benefits from the traditional model anymore. Actually, even larger projects seem to take longer to link with iostream and windows.h than the source does to compile.
The compiler's ability to optimize code may be increased greatly, especially its ability to inline functions. Too much inlining will cause code bloat, but the compiler's options should give you control over the balance.
Modern compilers also allow you to change the compilation options mid-file.
Any debugger or source analyzer shouldn't have problems handling inline or same-file implementations, or you're using bad tools.
It can also be easier to create test code; create a series of test files t01.cpp, t02.cpp (each with a main) but include only one. The others are there for reference but don't interfere. This is also useful for testing a prototype replacement for a component; include the new one and comment out the old include. Going back is trivial.
It's more a question of coding style than anything. Personally, I hate maintaining redundant information of any kind, and this very much includes the prototypes in the header with the actual functions. Source code redundancy is bad for all the same reasons that database redundancy is bad. Making my C++ member functions inline and including their files frees me from this.
I don't think this will work too well in Java. A Java source filename = the .class filename = the ONE public class exported by the file. Unless you want a total of 1 public class, it won't work. Java doesn't use header files anyways. Class binaries export everything public automatically.
It's an interesting approach and you have no idea why you shouldn't do it.
So do it.
In the end, regardless if it works or not, you will have learned something new.
The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
One of the really depressing things about having been in the business for nigh on to 40 years now is that, along with the occasional new dumb idea, all the old dumb ideas keep coming back. Among those dumb ideas that keep coming back are "visual programming" --- using graphics instead of programming languages; complicated schematic graphics for software --- UML in its utter complex form; and, sure enough, using the preprocessor to mess with C-like languages.
Every time this is tried --- and God knows it's been tried a lot --- you run into some severe problems:
If you've got control of the compiler for this peculiar language, why not explore making the startup time shorter, say, eg., by using shared libraries, DLLs, or by setting the sticky bit?
Here's a post of mine to comp.compilers from 10 years ago....
http://compilers.iecc.com/comparch/article/95-02-074
I believe that this strategy still makes sense in some environments.
AG
Isn't the article poster in favor of include files? He's looking for arguments to go to battle with the vendor. Is there something about using #include which seems stupid to you?
My UID is the product of 2 primes.
Well, coding style and software engineering aside, you need to do some testing if you think this will increase your speed.
Quick test to illustrate. 1,000,000 lines of C code, using gcc 3.3.4, default options.
Time to compile spread of 1000 files (with 8 lines of include and function body per file): ~2 minutes
Time to compile all in a single file: unknown
Why is the second time unknown? My computer doesn't have the memory to do it. Now I could pump up the memory of my machine (assuming I've got a 64-bit machine) to let me do the second compilation, but I doubt it'll be faster.
If the amount of code changed or added in your program per unit time is roughly constant, then it will take O(N^2) time to compile, where N is the size of the program, over the life of the program.
Think about that for a moment. Think back to computer science 101. Now ask yourself if this is a good idea or not.
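The O(N^2) claim can be spelled out under a few assumptions of my own (they are not stated in the comment): the program grows by a constant c lines per unit of time, one full rebuild happens per unit of time, and a full build costs time proportional to the current size with some constant k. Then:

```latex
\begin{align*}
N(t) &= c\,t && \text{size after } t \text{ time units} \\
\mathrm{cost}(t) &= k\,N(t) = k\,c\,t && \text{one full build at time } t \\
\mathrm{total}(T) &= \sum_{t=1}^{T} k\,c\,t = k\,c\,\frac{T(T+1)}{2} \\
&= \frac{k}{2c}\,N(T)^2 + \frac{k}{2}\,N(T) = O\!\left(N^2\right)
\end{align*}
```

With incremental builds, and ignoring link time, each build costs roughly k·c because only the changed code is recompiled, so the same sum comes to about k·c·T = k·N(T), which is linear in the final size.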
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
Speeding up a full build should not be important. The only people who care about it are in your test lab doing daily builds and regression tests, who can start the build overnight and have it ready by morning. Of course, this is the situation in a well-designed application. If you find yourself needing a full rebuild all the time, it means one of two things: 1. you are hacking a core component, or 2. all your components are written with spaghetti code and any change in one forces rebuilds in all the others.
In the first case, try just testing one or two components during development, and then verify all the others when the API is stabilized. This is, incidentally, the advantage you gain from using header files: once the API is stable, you never need to rebuild that component again except to fix bugs (which require rebuilding only that component).
In the second case, you need some serious refactoring. Look at the code and break it up. Encapsulate everything you possibly can. Make stuff private and static. Make everything you don't modify const. Keep it up until each component is accessed only through its API and that API is clean. Trust me, this is possible in any project. The enormous decrease in maintenance costs will more than pay for any time you spend on it.
Header files aren't a good way to separate interface from definition. For one, the programmer can still put the entire program code in the header file. And in C++ the private parts of classes end up in there too.
Java and Delphi use just one file but the interface could be easily obtained from the binary object. That way prevents duplicate definition and I like it better.
"I think this line is mostly filler"
It seems there are two separate issues, with different pluses and minuses.
The first is eschewing .h files and the whole separate declaration/definition thing. The second is compiling it all together in one compiler invocation.
Regarding the first, a master "project.c" file #including all the other .c files is, IMHO, a bad idea.
Having a declaration in scope (i.e. included in the source prior to) at the time of the function call is a good thing. Otherwise ANSI C defaults to K&R rules, and you're not allowed to call variadic functions at all.
If you don't have separate declarations preceding definitions, how do you resolve the ordering problem for mutually recursive functions? For example, suppose foo() calls bar(). And bar() calls foo(). What order do you define them in?
Even without this, you could have icky ordering requirement between source files, where foo.c:foo2() calls bar.c:bar() calls foo.c:foo1(). Do I include foo.c or bar.c first in the master source file?
Far better, IMHO, would be to mark exported functions, structures, etc. in the source somehow, and write a tool that automatically pulls the info out of the .c files into a .h file. You could make a single "project.h" file which everything includes. The only downside is that you have to recompile all source if any header file changes.
As for a single compiler invocation being faster, the only thing you lose is file-scope static variables and helper functions. Personally, I find this useful (and wish C had features to share a scope among a few related files), but I also try to keep such names globally unique for debugging purposes anyway. If you can live with that, a single compile is far less of a silly thing to do. But you can just #include all the header files, then all of the .c files, or something like that. It doesn't necessarily go with eliminating .h files entirely. Personally, I find incremental recompilation a useful speedup, but whatever works for you.
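To make the ordering point concrete, here is a minimal sketch with invented names (is_even, is_odd, sum_ints). The declarations at the top play the role a header would normally play. Once they are in scope, the mutually recursive pair can be defined in either order, and the variadic function has the prototype its callers need:

```c
#include <assert.h>
#include <stdarg.h>

/* Declarations first (what a header would normally provide). */
int is_even(unsigned n);
int is_odd(unsigned n);
int sum_ints(int count, ...);    /* variadic: callers need this prototype */

/* Definitions can now appear in any order. */
int is_even(unsigned n)
{
    return n == 0 ? 1 : is_odd(n - 1);
}

int is_odd(unsigned n)
{
    return n == 0 ? 0 : is_even(n - 1);
}

/* Adds up `count` int arguments. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    assert(count >= 0);
    va_start(args, count);
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);
    va_end(args);
    return total;
}
```

The same idea scales to the foo.c/bar.c case: if every .c file sees all the declarations before any definitions, the order in which the definitions arrive stops mattering.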
You can separate them, somewhat. You can separate them into separate headers. Put the declaration in one, then the implementation in another, and put a #include for the implementation in the declaration file, after the actual declaration.
I know this isn't technically the same. The implementation is still in a header, but there's really nothing to be done about that because of the way C++ implements templates under the sheets. But for the purposes of code clarity, it's nearly identical.
First, your language is not anything like Java, because Java does not have header files at all.
Second, your language is not anything like C, because C was carefully designed from the ground up to use header files and compilation units. Running this way will annoy your compiler, your linker, your debugger, and every other link on your tool chain, and muck up many standard C coding practices.
So, yeah. If you're using a language that's not like C or Java, and your tool vendor is telling you to do this, and your lead developer is telling you to do this, and your managers trust the lead developer's decisions, then do it. Your project isn't going to succeed or fail based on this decision.
In your IDE, change the code to be a smaller font - this will use less disk space and ultimately make things run faster...
creation science book
The "pure" solution of actual separate compilation of template implementation is supported by some compilers, but the complication it adds to the linker is probably more trouble than it is worth.
[Set Cain on fire and steal his lute.]
You don't want to abandon headers; they are the basis of separation into translation units. If you're compiling from the top and it's faster than compiling small pieces, then that means your source needs a hell of a lot more decoupling; by definition the only way that top-down build could possibly be faster than incremental build is if incremental build is still making everything anyway.
Consider reading Modern C++ Design, specifically the section about generic functors, which should help you with your coupling problem until you have the time to learn to separate TUs aggressively on your own. In my opinion, the Boost functors are better than the Loki functors, but the book is a hell of a lot better than the Boost documentation, so read it anyway.
Ask a large project supervisor in a company with aggressive compilation tactics about the large compile-time wins of precompiled headers. While it's not the same issue, the same critical foundations are there, and it's generally easier to squeeze out an answer about PCH than properly separated TUs.
StoneCypher is Full of BS
Nothing prevents you from doing so in C++ --- try man nm.
However, the interface you get this way is not very well documented --- in Java or C++. For this reason, most people doing C++ (and Java) use comments in the source files, which are then transformed into documentation by doxygen and similar programs (like javadoc).
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
One of the really depressing things about having been in the business for nigh on to 40 years now is that, along with the occasional new dumb idea, all the old dumb ideas keep coming back.
This is just a matter of being standards compliant, see #11 in http://www.faqs.org/rfcs/rfc1925.html.
It isn't exactly the same. nm works just with some compilers (or is it a gcc-only utility?), and can't really extract all the type information, because that isn't really there. Except maybe in C++ decorated object code, but I don't remember it now.
Also, most C/C++ code doesn't make clear in an nm listing which artifacts are public; a programmer has to take extra steps to prevent this. nm is at the least confusing when you want to get the public API of an object, compared to other language tools that just do it.
I'm not an expert in nm, feel free to correct me if I'm wrong.
"I think this line is mostly filler"
First of all, "speed", either compilation-wise or runtime-wise, has nothing to do with why you should use header files.
I too disliked header files, long ago, in my early days of programming C. It seemed pointless, to have two files (or rarely, as many as four), when one would do just as well.
For small projects, I'll still use one large monolithic source file. In that aspect, it makes sense to skip breaking out your data and function definitions.
But when you get to the "real" world... Imagine even a "small" serious project, with perhaps 10k lines of code. Try to find a single function in that file - I hope you feel on good terms with your IDE's search capabilities!
So, break that out into a dozen files - You have your network code in one file, your UI code in another, your file I/O in another, perhaps some database interaction in another, and so on. Okay, that works well... But wait, your network code, your file I/O, and your database code, all make use of the same checksum algorithm! So, you have the same exact code duplicated three times.
That would work, because each file will compile to a module with its own namespace (in most languages). But it wastes space, both in the source and in the compiled code. It also wastes time and can very easily introduce bugs - For example, if you decide you need to switch from MD5 to SHA1 as your checksumming algorithm, you now need to change three places instead of one. If you miss one of those, but use them to compare results between the three different uses, you have a very serious bug that may drive you batty trying to track it down.
So, the obvious solution, break out all your common functions into a toolkit-like source file. Now, you could just #include that in every other file that needs it, but WOW would that cause some serious bloat in the compiled code - In my experience, shared code files frequently end up as the single largest source file in the entire project.
So, use a header file. That way, you don't end up with massive duplication of code, you have the advantage of a logical breakout of your code into similar-purpose files, and you can still make changes to only one file to modify one function.
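A rough sketch of that arrangement, squeezed into one listing for brevity. The file names (checksum.h, checksum.c, network.c, fileio.c) and every function name are made up, and the checksum itself is a toy stand-in, not MD5 or SHA1:

```c
#include <assert.h>
#include <stddef.h>

/* ---- what would live in a hypothetical checksum.h ---- */
unsigned checksum(const unsigned char *data, size_t len);

/* ---- checksum.c: the one and only definition ---- */
unsigned checksum(const unsigned char *data, size_t len)
{
    unsigned sum = 0;

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        sum = (sum * 31u + data[i]) & 0xFFFFu;    /* toy algorithm */
    return sum;
}

/* ---- network.c: sees only the declaration ---- */
int net_packet_ok(const unsigned char *payload, size_t len, unsigned expected)
{
    return checksum(payload, len) == expected;
}

/* ---- fileio.c: same declaration, same single implementation ---- */
int file_block_ok(const unsigned char *block, size_t len, unsigned expected)
{
    return checksum(block, len) == expected;
}
```

Swapping the algorithm then means editing the body of checksum() once; the network and file code only need relinking.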
Incidentally, the above chain of thinking more-or-less describes the evolution of standard libraries... Would your professor actually suggest that you shouldn't "#include<stdio.h>", but instead should manually pull the code for each function you use into your source file? Because, in the degenerate case, he has told you exactly that.
Well, nm is from binutils. It's certainly not gcc only --- it can read various object file formats. Anyway, surely you are only interested in those that your linker, ld, can actually read.
It is true that nm is faithful to the object code, in that "private" artifacts and type info in C (C++ has them, as you suspect, encoded into the name, much like Java) are listed. This reflects the fact that nothing prevents you from modifying a header file to use a private artifact --- it's a compiler thing, not a linker thing. I'm not familiar enough with Java to know if accessibility is enforced down at the runtime level. It seems a counter-constructive thing to do, but maybe there is something about Java that forces this to be so.
However, my point was that even with private/public adornment, the API you get is rather useless. It's all very well to know that a constructor for a rectangle takes four doubles as input, but would that be lower left corner, height and width? Does the class require the height and width parameters to be positive? How about 0, is that ok? If I copy a rectangle, would that be a deep or shallow copy? Is the constructor thread-safe?
All this is usually documented in comments in the code --- doxygen or javadoc. And there you get the full API for free anyway, with clickable links and everything. What's the point in jerking around with object/bytecode when you have this?
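To make the rectangle example concrete, here is a hedged sketch of what such doxygen-style comments might look like. The class, its parameter order and its positivity rule are hypothetical choices for illustration; the point is only that the comments answer questions the bare signature (four doubles) cannot.

```cpp
#include <cassert>

/// Axis-aligned rectangle (hypothetical example class).
class Rectangle {
public:
    /// @param x      x coordinate of the lower-left corner
    /// @param y      y coordinate of the lower-left corner
    /// @param width  extent along x; must be strictly positive
    /// @param height extent along y; must be strictly positive
    Rectangle(double x, double y, double width, double height)
        : x_(x), y_(y), width_(width), height_(height) {
        // Enforce the contract stated in the comment above.
        assert(width > 0.0 && height > 0.0);
    }

    /// @return the enclosed area, width * height
    double area() const { return width_ * height_; }

private:
    double x_, y_, width_, height_;
};
```

None of the information in the @param lines could be recovered from a symbol table or from bytecode.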
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
One of the better ways to discourage client programmers from subverting your interface is to use as the only private member a pointer to an implementation struct which is forward declared in the header and defined within the implementation file. This can seriously cut down dependencies in large projects and remove one of the "practical" reasons programmers may have for subverting your interface.
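A minimal sketch of that pointer-to-implementation arrangement, with both "files" shown in one listing. Widget, Impl and the counter are hypothetical names, and std::unique_ptr is a modern substitute for the raw pointer the description implies.

```cpp
#include <memory>

// ---- widget.h (hypothetical): all that client code ever sees ----
class Widget {
public:
    Widget();
    ~Widget();                 // defined where Impl is a complete type
    void increment();
    int value() const;

private:
    struct Impl;               // forward declaration only, no members visible
    std::unique_ptr<Impl> impl_;
};

// ---- widget.cpp (hypothetical): the hidden implementation ----
struct Widget::Impl {
    int counter = 0;
};

Widget::Widget() : impl_(new Impl) {}
Widget::~Widget() = default;
void Widget::increment() { ++impl_->counter; }
int Widget::value() const { return impl_->counter; }
```

Adding or removing members of Impl changes only the implementation file, so code that includes the header does not need to be recompiled.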
[Set Cain on fire and steal his lute.]
I agree with you; my point is that having separate definition/declaration files is useless and redundant. You can have the docs generated or, at least in Java, have the exact prototypes generated from the binaries. There's no technical need for includes, nor are they needed for documentation. That's why I like the Java/Delphi way better than C/C++ #includes.
"I think this line is mostly filler"
Wrong. If the compiler sees that a variable at "global" (i.e. outermost) scope is declared static, then it knows that the variable cannot be named by code outside the current translation unit, so unless its address is passed out, nothing else can change its value. That might enable some optimizations which would not have been possible before (in particular, it may be possible to put the value into a register and keep it there for longer).
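A small sketch of the situation being described. The names are hypothetical, and whether a particular compiler actually exploits the extra knowledge is not guaranteed.

```cpp
// File-scope variable with internal linkage: no other translation unit
// can name it, so every access to it is visible to the compiler right here.
static int call_count = 0;

// Without `static`, call_count would be an external symbol that code in
// another file could declare `extern` and modify behind the compiler's back.
int next_id() {
    return ++call_count;
}
```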
The author is considering NOT using include files, rather than laughing at the vendor and reminding him where his money supply originates.
Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
This is my favorite gripe with C++, so I agree with you that it seems silly and redundant.
However, you could get close to the Java way by just writing the class out like in Java in a C++ header file. We have to ask ourselves the question: Why don't C++ programmers do this? Force of habit? Compile times? Group pressure? Or is it, after all, better this way?
Another thought: Nothing would prevent header files from being automatically generated from source files, though a few extra comments would be necessary, like /*+ public: */ and /*+ virtual*/. Why has this not been done? Don't tell me nobody thought of it :)
Just remember that a header file defines the interface to the body, which actually duplicates some of the material in the body. Because of this duplication, you can have problems, e.g. faulty build dependencies, mismatch between header and body, etc. Removing this sort of duplication is usually a good thing, so if the technology (i.e. the compiler) is smart and fast enough to get it right, then the change could be a good thing.
I'd like to point out that many other respondents have argued their case with reference to 'C', however the poster clearly said it was not 'C' -- without further information, it's difficult to know whether these 'C' type issues will translate. I'd point out that some languages, e.g. python, java, perl, do not have ideas of separate header/body -- suggesting that the "current trend" in languages is to do away with the duplication.
The compiler could be intelligent enough to construct a parse tree quickly, and only resolve parts of the parse tree when necessary: so for example, if there was previously a 5K header, and a 30K body, but now only a 30K body, the compiler may read the entire 30K, and only "roughly" parse it (e.g., say for a function, it parses the outer scope of the function, but resolves nothing inside the function until some other code actually uses the function).
I don't think there's an answer for this guy: there are too many issues that haven't been stated, as we know nothing about the particular toolchain, the build environment, the language, etc. All we have is an abstract concept of splitting files into header/body. That concept by itself isn't good or bad; it depends upon a lot of other issues that change the perspective.
My answer would be that surely in the guy's company he has a couple of clueful senior engineers that can sit around a whiteboard and discuss (using their computer science training) what actual impact the change will have on the project, and whether that impact is acceptable.
I wish I'd included a few more details, which might have avoided questions like, "Are you stupid?" and "Have you taken a basic Computer Science course?" (the answers are "On occasion" and "Waterloo, Comp Eng '98" respectively).
A few details which might put the question into perspective might be:
- The project is a chip verification project. There is no final "product" at the end of my work. The name of the game is endlessly re-compiling and running new tests. So compile time is actually quite significant.
- There is no linker. :) The nature of the language is such that it is linked at run time.
- The compiler actually doesn't allow you to list multiple source files on the command line and produce one object. So I guess my C/Java analogy was misleading. But that's partly why I'm at a loss to rationalize the question - there is little direct reference point.
- A lot of people missed my point - I think abandoning header files is abhorrent. But when it came down to it, I couldn't actually produce any inarguable reasons why (namespace is one, but I don't think it's a show-stopper).
Thanks again for your insights.
garethw
There is what you do for release, and what you do for development. For release (which can include the daily build!) this is a good idea, if there is any gain in run speed.
For development I consider this a bad idea. When I change foo.c I don't want to recompile every other file in the project. It will slow things down in any project with size.
It will be more work to implement both. Since computer time is cheaper than human time most people don't bother with adding the ability to do both. However I wouldn't call it a bad idea for releases.
...or has forgotten everything he ever knew about it.
The answer has to be "we don't know" because you haven't told us what language this is. I reckon it's some sort of Verilog (revolting) or SystemC (yuk yuk) abomination, in which case I would say *** do what the vendor tells you *** because if you don't and it all falls apart they will say "we told you so".
But just imagining that it was C after all, here's an anecdote. I wrote an instruction set simulator once. It looked like this:
switch(opcode) {
case 0: foo(...); break;
case 1: blah(....); break;
case 2: blob(....); break;
}
There were many many lines in the case statement; the functions were defined in other files and were mostly short, e.g.
void add(int a, int b, int& c) { c=a+b; }
In this case, replacing '#include "add.h"' with '#include "add.c"' for each of the files gave a huge increase in execution speed since the functions all got inlined, but of course the header version compiles faster in the typical case where only a few files have changed. Solution:
#if compile_fast
#include "add.h"
#else
#include "add.c"
#endif
Plus some magic in the Makefile. Useful when you have a problem like this, but *most problems aren't like this*.
You can also make comparisons with the way that the C++ STL is defined purely in headers. Slow to compile, larger code size, but fast.
What about
static void init(); static void cleanup();
It is very common to have utility functions like these in a .c file and to not want them cluttering the namespace. This kind of perfectly valid construct would obviously not work when including .c files if there is any symbol collision. And since in C symbols are looked up only by name and not by type or signature, differing args could not save you...
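A sketch of the collision and one possible workaround, written in C++ so that namespaces are available. The file names and return strings are hypothetical. In plain C the usual equivalent would be a per-file name prefix, and note that C++ overloading would let differing signatures coexist, though identical ones still collide.

```cpp
#include <string>

// If net.c and db.c each defined a bare `static void init()` and both were
// #include'd into one top-level file, the second definition would be a
// redefinition error: both now live in the same translation unit.

// ---- net.c (hypothetical), wrapped to keep its helper private ----
namespace net {
static std::string init() { return "net ready"; }
}

// ---- db.c (hypothetical), same helper name, no clash ----
namespace db {
static std::string init() { return "db ready"; }
}
```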
Why have the .h files? Several reasons. First off- if you #include the code directly, you end up with multiple copies of functions in your executable. In big projects this can be significant space.
Secondly, it allows you to hide implementation details. You have some things declared in the .h and some in the .c. The .c things are similar to private class members- the outside files can't access them directly. There is no way to do this with a directly-include-the-source method.
Third, it's cleaner and easier to read. If I want to know what functionality a module implements, I read a .h or two. Parsing that info from the .c file would be difficult and time consuming.
As for your other thought- why would you want to do that? Writing the class declaration takes maybe .1% of the time writing the functionality takes. And writing the declaration usually gives you time to think things through and make sure you have things the way you want them, as well as serving as documentation of the interface. A script that auto generated it would be very little to no gain, and in many ways a loss.
I still have more fans than freaks. WTF is wrong with you people?
Remember that Bjarne has said from the first that a lot of C++ was designed to get away from the damn preprocessor.
What is the difference between including source and one big file (except for more complex recursion and scope problems)? Not much, so why not get rid of those pesky functions while you are at it and just have file:line references.
You can use me as an example of point #8. As a fellow "old fart" I have come across this type of code on many occasions; it is always a hack job, usually with no statement of what it is supposed to do. They may save some time on the compile but that is truly insignificant compared to the time they will spend unraveling it when something goes wrong.
The only problem with "visual programming" is that people think it is useful for more than just skeletal code. It's great for whacking together "normal" looking GUIs and their associated handler shells, but that's as far as it goes. I have used many compilers on *nix and Windows, and (when you spend the time to learn how to use it) the MSVC IDE is hard to beat.
UML: I have never known anyone to fight "Rational Rose" and actually come out a winner, but some pretty flow charts WITH AN ACCOMPANYING LEGEND can help express an idea.
Java: Networked P-Code?
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
Everyone here thinks you are talking about C. If you aren't, this thread is pointless.
are we talking about C?
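in header file:
class iFoo
{
public:
    // Constructors cannot be pure virtual, so the interface declares none.
    virtual ~iFoo() {}
    virtual void Bar() = 0;   // return type assumed; the original omitted it
};
in cpp file:
class fooImpl : public iFoo
{
public:
    fooImpl();
    ~fooImpl();
    void Bar();
    // etc...
private:
    // stuff...
};
fooImpl::fooImpl()
{
    // stuff...
}
// etc..
// stuff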
First, let me make it crystal clear that I have always used the separate .cpp and .h files for C++ (and C) since the beginning, and still do.
Good argument, but completely wrong :) There would be only one compile unit, and the header guards would remove any duplicates. This is why it works with templates, BTW.
If the methods are declared private, then that's it. If not, nothing prevents you from calling them, though you may have to steal the prototype from the implementation file. The only exception is global functions that are declared static --- those can't be called outside the compilation unit. Not exactly a huge impact.
That is very true, though the difference between two separate files and having the implementation at the bottom of the header file is rather minimal. Also, interestingly, the Java people have always inlined the entire thing in the "class" file, and I haven't heard them complaining. So maybe it is just habit (?).
Well, I write the class definition, press a key and have XEmacs generate the implementation file stub. So I don't get time to think, and I get annoyed at having to sync the two files whenever I change something. I believe it is the latter that annoys people, not the initial work, which most IDEs can do automatically anyway.
Interestingly, you didn't touch on another point: With multiple compile units, you don't have to recompile the entire thing every time you touch anything. As a KDE developer, I can certainly appreciate that!
Also, having two classes depend on each other would certainly be harder (though not impossible).
If you do that you can't share the header part freely in order to provide others with an easy interface to your code (while withholding the actual implementation). That can be extremely useful if you work on projects large enough that not everyone can work on the same part. (I.e. "in the real world" code.)
Putting defs at the top is good for defining "local" functions and variables though. Used with h-files it creates a good separation of private and public information.
Personally I'd be very sceptical of jumping on a wagon that goes contrary to software design practices while not really having any arguments.
> has nothing to do with Fowlerian refactoring
I have never heard of anyone named Fowler, and chances are that most people who speak of "refactoring" have not either. This is not "equivocation" or "fallacy of composition", but rather the difference between the meaning of the word given to it by its creator and the one actually in general use.
> Encapsulation does not decrease coupling in any way;
> it simply hides non-coupled parts from one another.
When I tell you to encapsulate your class, I imply that you would also reconsider the design of the other code that uses it to reduce coupling. Encapsulate, as in "put it in a capsule", which other code can swallow regardless of what's inside. To do this, you must define an interface and rewrite all the other code to make proper use of it. I can't see how you can do this without decoupling. I take it that you are one of those people who want everything specified to the last detail, while I am a big-picture person, who assumes that others can figure out the details for themselves. This is obviously the root of our communication problem.
> This is why I suggested Alexandrescu-style functors
Perhaps you could explain what they are? A google search turns up nothing except references to his book. It sounds like a good one, so I'll probably buy it some time, but you really shouldn't assume that everyone in the world has read it.
> You should realize that statics are static variables, and that
> static functions, despite having static in their name, are not statics at all.
Well, there we are splitting linguistic hairs again. Human language is not that precise; words are simply references to shared concepts and if your word-concept map is different from mine (and, quite likely from many many other people), there is no need to take offence. As long as you can see what I mean, there should be no problem.
> This is myth: static local arrays are not in any
> way faster than nonstatic local arrays.
Big nonstatic local arrays are most definitely slower because they are built at runtime:
Try compiling this to assembly. My gcc 3.4.3 generates c_Strings at runtime on the stack, even with optimization turned on. If you make it static, this does not happen. However, speed is not really the issue here; it mainly helps reduce the size of the code.
>> As for threaded code, I never write any
> Obviously. You also clearly don't write libraries,
> where you cannot guess about the nature of outside code.
On the contrary, I write little but libraries. I just write libraries for non-threaded code, and I don't "guess" the nature of the outside code, I define it by telling the user exactly what the library is for. And it is usually not for threaded applications.
> You do realize that the event mechanism in essentially every
> major OS is driven by an underlying threaded model, right?
Absolutely. It doesn't have to be this way though. In fact, to implement a good event mechanism is currently my main personal project, and I am determined to make threads entirely unnecessary in it. Threads create far more problems than they solve, IMO, and I haven't seen any good use for them that I couldn't rework into an asynchronous design. Threads and single-thread event systems really do the same thing: share the CPU between several tasks, with the former doing it at the instruction level and the latter doing it at a logical packet level.
> As far as event driven code being faster, well, horseshit.
> Show me non-naive test code which supports this in any way.
I can point you to the
It sounds like the vendor wants to make sure you can't use other vendors someday by locking you in a code marriage of some type. Keep it separate; there is no good reason to mash the code together.
Have you considered the possibility that both your tool vendor and lead developer don't know what they are doing? Any questions regarding compilation time are simply irrelevant. The only reason someone would include a source file is because they can't get the project to link with either a library or object code. Pretty much the only way that would happen is if either the tool vendor or the lead developer (or both) don't know what they are doing. The logical suggestion is to just link a sample app yourself and show them that it can work after all. However, that would probably offend the lead developer and get you fired or shunned.
Is this a stupid question or what?
The conventional argument I recall for using header files, and incremental compilation, is that it's faster to use a makefile and conditionally build only those files that have changed
No it isn't. Header files don't help with incremental compilation. At least not in C.
The reasons for header files are:
1) Function prototyping. Otherwise, you may have to prototype functions in all source files.
2) Structure definitions. Otherwise, you may have to do it in all source files.
3) #defines. Same reason.
And so on and so forth.
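The three items in that list might look like this in a small header, shown together with its implementation in a single listing. The file names, Point, MAX_POINTS and manhattan are hypothetical, and the include guard is the usual idiom for making repeated inclusion harmless.

```cpp
#include <cstdlib>

// ---- point.h (hypothetical) ----
#ifndef POINT_H
#define POINT_H

#define MAX_POINTS 16                            // 3) shared #define

struct Point {                                   // 2) shared structure definition
    int x;
    int y;
};

int manhattan(struct Point a, struct Point b);   // 1) function prototype

#endif // POINT_H

// ---- point.c (hypothetical): the single definition ----
int manhattan(struct Point a, struct Point b) {
    return std::abs(a.x - b.x) + std::abs(a.y - b.y);
}
```

Every source file that includes the header gets the same prototype, struct layout and constant without retyping them.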
Is there something I am missing here?
Using the suggested new system, how would parallel make work? Parallel make on a uniprocessor or multiprocessor machine allows multiple compilation units to be compiled simultaneously. While one process is waiting on file I/O, the other process can be chugging away on another compilation unit. In addition, only compilation units depending on a changed header will need to be recompiled.
Personally I'd be very sceptical of jumping on a wagon that goes contrary to software design practices while not really having any arguments.
Precisely, sir.
A curiously large proportion of responses seem to think I'm advocating this approach. I'm not; it is being thrust upon me.
garethw
But it's not that different.
The language itself is syntactically very similar to Java, but has traditionally used header files as one would in C. Then one day, the vendor starts telling people not to do it that way.
What is different is the tool chain. But that's not really the language per se. Not all C tools are used in the same way either. So I think the comparison is still valid.
But what they're advocating would still work in C, which is why I posed the question that way. Many replies have latched on to the idea that you'd have to have function prototypes in every source file, but that's not true. The entire app ends up being one translation unit, so there ARE no externals; it's all right there in the output of one pass of the pre-processor.
I'm not saying it's a good idea. I really, really don't like it. But it works better than most of the replies here have assumed.
Interestingly, this language actually uses a modified cpp for pre-processing.
garethw
But you don't need duplicate definitions.
You end up with the entire code in one compilation unit (if that's the correct term). There is no need for function or class prototypes, or for externs - the implementation is right there in the #include'd source hierarchy.
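A rough sketch of what the top-level file amounts to once the preprocessor has pasted everything together. The file and function names are hypothetical. It relies on each definition appearing before its first use, so include order matters and mutually dependent code would still need a forward declaration.

```cpp
// ---- top.c (hypothetical), as seen after preprocessing ----

// contents pulled in by: #include "util.c"
int clamp_int(int v, int lo, int hi) {
    // Defined before first use, so no separate prototype is required.
    return v < lo ? lo : (v > hi ? hi : v);
}

// contents pulled in by: #include "driver.c"
int scaled_level(int raw) {
    // clamp_int is already visible: same translation unit, no extern needed.
    return clamp_int(raw * 2, 0, 100);
}
```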
Bizarre? Yeah, I agree. But what would you do if your tool vendor one day tells you to do something you believe you will later regret?
garethw
Sounds like you're using Vera because 'e' does not allow separate compilation - but then I've never used SystemVerilog.
/Ed
You probably should have asked at http://www.verificationguild.com.
We've never had a problem with the "single compilation" when using Vera - although we tend to compile "the environment" as one compile and the testcase as another. There can be issues (you have to generate a master header file, forward referencing certain classes, etc.).
I've been doing this for over a year. I have a few projects that are 5000+ lines of code.
I use a 10 MB memory file system for /work and custom editors that save automatically. Then I have a sync_work.sh script that looks like this:
$ cat sync_work.sh
#!/bin/sh
# Snapshot /work into a per-weekday backup directory every five minutes.
while true
do
day=`date +%A`
p=/home/gps/src/automated_backups/$day
mkdir -p "$p"
cd "$p" || exit 1
d=`date +%s`
tar -cPf "work_$d.tar" /work
echo "done with $day $d"
# 60 * 5 = 300 seconds
sleep 300
done
I compile the file(s) with the many #include's using gcc -pipe.
Overall it saves me time when compared to the traditional techniques. The only things to watch out for would be global variables or static functions. I haven't found this to be a problem in practice.
-GPS
These are static functions, not variables, and the original posters didn't make any comment about whether or not they were good ideas.
The problem with your formula is that it prevents you from taking your implementation and creating a shared library out of it.
Seriously. Let's just say you're tasked with creating a graphical toolkit. You go off and in 3 months you've created a masterpiece. You've got everything! Sliders, icons, buttons, rulers, progress bars, menus, text windows, labels, etc.
Now, let's say you take your graphical toolkit and create your own "copy" program. So, you'll need a couple of text boxes and/or some open/save dialogs, some labels, maybe a button or two and a progress bar.
What makes more sense: using header files and having the compiler link against a graphical toolkit shared library, OR having the compiler compile thousands of lines of your code that you'll never use and statically link it into your executable?
Yes Francis, the world has gone crazy.