Slashdot Mirror


Abandoning Header Files?

garethw asks: "I'm working on a project where the lead developer, following a suggestion by our tool vendor, wants to get rid of the header files and directly #include source code. The language is a somewhat specialized language, but for all intents and purposes, you can assume it's Java or C. The conventional argument I recall for using header files, and incremental compilation, is that it's faster to use a makefile and conditionally build only those files that have changed. However, it turns out that the brute force of invoking the compiler once on the top-level does actually compile much faster. I feel that there is something about #include'ing source files directly, compiling only the top-level file, just doesn't 'feel' right and I'm at a loss to really give a solid argument as to why. Has anyone actually used this approach? Does anyone have any thoughts on any advantages or drawbacks?"

36 of 207 comments

  1. Need more info... by sfjoe · · Score: 4, Insightful

    ...following a suggestion by our tool vendor,...

    How much money will your tool vendor make if you implement this suggestion and what, if any, product does she sell that neatly solves any problems this might bring up?

    --
    It's simple: I demand prosecution for torture.
    1. Re:Need more info... by Jeremiah+Cornelius · · Score: 2, Funny

      i.e.: Will they pay for Dennis Ritchie's cardiac medication?

      --
      "Flyin' in just a sweet place,
      Never been known to fail..."
    2. Re:Need more info... by Tim+Browse · · Score: 4, Insightful

      If they're anything like some tool vendors I've come across, it's because they either don't have decent compilation performance, or don't support the features that would help, such as pre-compiled headers, etc.

      So rather than fixing the problem by investing in their product, they're telling their customers to use ugly hacks to get around the product's shortcomings, and hope they won't switch to another system (I suspect).

      I've certainly been on the receiving end of such tactics.

      The dead giveaway is when they start saying things like "pre-compiled headers wouldn't help you anyway" :-)

  2. Not useful for C by david.given · · Score: 4, Informative
    ...or, to a lesser extent C++, because of the way C scoping works:

    static global variables have scope within the module they're defined in. Which means that two static globals in different source files don't collide, because they're in different modules.

    Including everything into one big source file will mean that they're both in the same module, and so will collide. Not good.

    Can't say about other languages, though.
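A minimal sketch of the collision described above, assuming two hypothetical files a.c and b.c that end up in one translation unit through #include (their contents are shown inline so the block stands alone; all names are illustrative). If both files give their static an initializer, the merged file fails with a redefinition error. The quieter case is when neither does: C treats the two uninitialized declarations as tentative definitions of a single object, so the two modules silently share state.

```c
/* --- contents of hypothetical a.c --- */
static int counter;                 /* tentative definition, internal linkage */
void a_bump(void)  { counter++; }
int  a_count(void) { return counter; }

/* --- contents of hypothetical b.c --- */
static int counter;                 /* same name: inside one translation unit
                                       this refers to the SAME object as above */
void b_bump(void)  { counter += 10; }
int  b_count(void) { return counter; }
```

Compiled as two separate files, each module would keep its own counter. Merged into one file, a_bump() followed by b_bump() leaves both a_count() and b_count() reporting the same combined value.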

    1. Re:Not useful for C by angel'o'sphere · · Score: 2, Informative

      Lol,

      reread your parent!!

      Exactly what you show is what he says. But he was talking about *.c files, not *.h files. So while the *.c files would scope the foo variables, leading to two distinct ones, the *.h file pulls them both into the same .c file.

      So what worked in the beginning, while it was scoped, no longer works if everything is pulled into one single source file via #include.

      So your example exactly shows the conflict your parent wanted to point out.

      angel'o'sphere

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    2. Re:Not useful for C by Bloater · · Score: 2, Informative

      The term used in C is the "Translation Unit". When you compile a .c file you are compiling a translation unit. If the C source file #includes the contents of another file, then those contents replace the #include line in what the compiler considers to be the code to be translated.

      It doesn't matter what the file name is from the point of view of C, but a given compiler may use the last dot of the filename and the characters after it to determine which language it is, and whether it is a source file to be compiled or an object file to be linked.

  3. Interface vs implementation, shared libraries, etc by Dimwit · · Score: 3, Insightful

    Well, there's the obvious separation of interface from definition. And the problem of duplicate definitions - there's a reason why "extern" is a keyword. :)

    Plus, header files define an interface, which is useful if you don't actually have the code (i.e. binary shared library). Moot point in your case, I think, but...

    Plus it's just good programming style to have separate definitions and implementations. Easier to track down bugs.

    --
    ...but it's being eaten...by some...Linux or something...
  4. Keep the header files by SunFan · · Score: 4, Insightful


    They are just about the only way to centrally organize declarations for data structures and function signatures. Doing so will save your ass eventually, because having function prototypes available allows the compiler and lint tools to catch stupid programmer errors. You do use lint-like tools, right? They _will_ catch bugs that testers and visual scanning won't.
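To illustrate the point about prototypes, here is a small sketch. The names half and half_of_three are hypothetical, and the header and source file are shown inline in one block for brevity. With the prototype in scope, the compiler knows the parameter is a double and converts the integer argument, and a call with the wrong number of arguments would be rejected at compile time.

```c
/* --- what would live in a hypothetical mathutil.h --- */
double half(double x);

/* --- hypothetical mathutil.c --- */
double half(double x)
{
    return x / 2.0;
}

/* --- a caller that includes mathutil.h --- */
double half_of_three(void)
{
    /* The prototype tells the compiler to convert the int 3 to 3.0.
       Writing half(3, 4) or half() here would be a compile-time error. */
    return half(3);
}
```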

    The only drawback to headers in C is that if you forget to 'make clean' after changing a header, you can end up with object files using old definitions. Just make a habit of doing a full build after changing the headers. If you designed your software properly, changing header files won't be all that common (adding functions, new data structures, etc.).

    --
    -- Microsoft is the most expensive commodity operating system and office suite vendor in the marketplace.
  5. Speed by jbrandon · · Score: 3, Insightful

    Have you tested the speed difference when you change only one non-header file? I bet incremental compilation will make that quite a bit faster. In addition, if you want to compile that changed source file to check for syntax or type errors, you don't have to check for collision between it and the whole rest of the project, only collisions between it and the header defining it.

    1. Re:Speed by stonecypher · · Score: 2, Insightful

      I bet incremental compilation will make that quite a bit faster.

      Chances are he's got massive coupling problems, which can totally throw away any benefit of incremental linking. And by the way, incremental compilation is something totally different; whereas I realize that the error is that of the original speaker, not yours, it should nonetheless be pointed out.

      C++ does not support incremental compilation, though ICC and MSVC both have extensions to support it. MSVC refers to it as runtime code generation: you change the source, MSVC swaps out a vtbl, and the new compiled piece of source is literally injected into a running program.

      TU separation is incremental linking, not incremental compilation.

      --
      StoneCypher is Full of BS
  6. GCC by lexarius · · Score: 2, Interesting

    My OS prof was demonstrating the differences in what errors the C compiler and linker would pick up. However, we found that we could make two source files with no include lines in either that both defined a global variable (sans extern). The main function set the global variable and then called a function that is defined in the other source file, which would then print the gv. Then we compiled and linked them with gcc. No warnings, no errors. The program ran exactly the way we wanted it to, which was unexpected. So yes, you can do away with includes and header files without even performing the includes manually. Depending on the language, your compiler might be smart enough to figure it out.
    But that doesn't make it a good idea. Besides, do you want to be the one who has to go update the library functions that would normally have been included any time you change the code in one file?

    1. Re:GCC by Jamie+Lokier · · Score: 2, Informative

      That's not an error, and if your OS prof said it was an error that was not picked up, he/she is mistaken.

      The C language definition is clear that you can write the program you did, with a variable defined in "common" form (no initialiser and no extern) in both files, and a function called without a prototype, and it is a valid C program with well defined behaviour.

      -- Jamie
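For comparison, here is a sketch of the form a shared header would normally give the same program: one extern declaration that every file can see, and exactly one definition. The file names and identifiers are hypothetical, and everything is shown in one block so it compiles on its own. Whether a linker merges two uninitialized definitions of the same global depends on how the toolchain treats "common" symbols (GCC exposes the choice as -fcommon and -fno-common); the form below does not rely on either setting.

```c
/* --- hypothetical shared.h: declarations only, safe to include anywhere --- */
extern int shared_value;        /* declares the variable, allocates nothing */
void store_value(int v);
int  load_value(void);

/* --- hypothetical main.c: the single definition --- */
int shared_value;

/* --- hypothetical other.c: uses the variable through the declaration --- */
void store_value(int v) { shared_value = v; }
int  load_value(void)   { return shared_value; }
```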

  7. Why? by Pacifix · · Score: 4, Insightful

    It seems that the onus should be on the vendor to explain very, very convincingly why you should abandon decades of standard practice and good coding practice. This better be one hell of a good product you're developing to justify such a radical change. You shouldn't need to defend standard practice, they must campaign for a change to that practice. Imagine trying to explain this to all the coders who will work on the product for the next decade - will they think you're crazy or is there really a reason to do this?

    1. Re:Why? by CamMac · · Score: 5, Funny

      Remember, if you remove all the comments from the code, it will compile faster and the executable will be smaller.

      --Cam

      --
      All jocks think about is sports. All nerds think about is sex.
  8. Several advantages and disadvantages by cookd · · Score: 3, Insightful
    Advantages:
    1. Faster compile of the full product. You only invoke the compiler process once, and there is much less work for the linker to do.
    2. Much better optimization. Compilers can only optimize within a compilation unit. Intel and Microsoft have "Link-time code generation" compilers which perform a final optimization pass during link, but if you aren't using those compilers, there might be a significant amount of additional optimization enabled by putting everything in the same compilation unit.
    Disadvantages:
    1. You're not doing it the way everyone expects you to do it. Certain components (the compiler, the linker, and pre-existing code) might have been designed under the assumption that individual files would be compiled separately. The pre-existing code might have declared static (per-file) variables or functions in a way that could collide with other code (namespaces might help here). The compiler and linker might have limits. And you might not hit those limits until late in the project.
    2. For building the whole product, yeah, it will be faster. But for making a small change and rebuilding the results of that change, it might be much slower.
    As with every issue you'll ever run into, there are two (or three) sides to it.
    --
    Time flies like an arrow. Fruit flies like a banana.
    1. Re:Several advantages and disadvantages by stonecypher · · Score: 4, Informative

      1. Faster compile of the full product.

      Well, back in the real world, in a properly decoupled project incremental linking is a massive speed win, even when building from the top, as there's far less cross-lexing and as the build tables may be handled a small piece at a time, which is important because their parsing in the compiler itself is generally of O(n^2 log n) time or better. Once you've worked on a large project which fails to make proper decouplings, you will become painfully aware of this trend.

      Whereas in this particular project the complete build is apparently faster, that is almost certainly the result of a very naive code tree and/or build scheme; the importance of incremental linking towards speed of compile cannot be overestimated, even in the case of compiling from clean.

      2. Much better optimization. Compilers can only optimize within a compilation unit.

      This simply isn't true. Whereas only some compilers make cross-TU optimizations, that is not the same as compilers being able to optimize only within a translation unit (why do people keep saying compilation unit? There's no such thing!) Besides, you're dramatically underestimating the commonality of link-time cross-TU counterspecialization, which now exists in ICC, BCC, MSCC, ARM ADS, EDG/Comeau, GHOC, and is in experimental development within GCC.

      You're not doing it the way everyone expects you to do it. Certain components (the compiler, the linker, and pre-existing code) might have been designed under the assumption that individual files would be compiled separately.

      They most certainly have not been. The C and C++ standards do not allow for such ridiculously inappropriate behavior. Where did you get this idea? Compiler writers may not impose arbitrary restrictions on the codebase in any relation to the local filesystem. This is just untrue.

      The pre-existing code might have declared static (per-file) variables or functions in a way that could collide with other code (namespaces might help here).

      This is a well known gigantic red flag indicating an amateur programmer. File-scoped variables are antiquated even within the pure C community; the only time they're acceptable in most professional programmer's eyes are within a library which is built alone. In fact, you might want to read the things Kernighan himself said about when file-scoped variables are appropriate in K&R 2; the primary author of the language himself says that this is a fundamentally bad technique and should not be done.

      Of course, that you're causing problems by misusing the toolchain and allowing bad code to collide when build trees written separately are blindly merged without the help of a linker is just not surprising.

      The compiler and linker might have limits.

      Not if they're standards compliant, they mightn't. Did you know that there's a document out there floating around telling compiler authors in concrete detail what they may and may not do? You should read that before commenting on what a compiler may or may not do; you are simply out in left field, here.

      As with every issue you'll ever run into, there are two (or three) sides to it.

      Not when you know what you're talking about. Whereas many things are issues of pro/con, many simply aren't; you'll be hard pressed to find pros in the distribution of heavy ordnance to delusional sociopaths, you'll be hard pressed to find pros in setting up a "bring a molester to school day," and you'll be hard pressed to find pros in non-decoupled code, once you've actually read the standard and are aware of the real limitations of compiler authors, instead of your guesses about what might maybe happen if someone wasn't paying attention.

      --
      StoneCypher is Full of BS
  9. unclear on what you mean by Dink+Paisy · · Score: 2, Interesting
    You need to clarify exactly what is going on... My best effort at interpretation is that currently you have something like:

    gcc -c f1.c
    gcc -c f2.c
    gcc -c f3.c
    gcc -o f f1.o f2.o f3.o

    Your vendor instead thinks it would be better to do:

    gcc -o f f.c

    Where f.c looks like:

    #include "f1.c"
    #include "f2.c"
    #include "f3.c"

    Am I right, or am I completely off track?

    If I'm right, you'd probably still want to include header files because you want everything to remain modular. According to software engineering type people, that makes maintenance easier. Another problem is symbol scoping. C keeps symbols local to the module they appear in, so you want to make sure you have naming conventions, namespaces, or some other protection against naming clashes. I'm dubious about the benefits, but I work on projects that take significant amounts of time to compile. Not hours, but enough time that if you wait for all the objects to compile you are wasting a lot of time. In general, I'd claim that the larger the project, the worse an idea it is.

    --

    Whoever corrects a mocker invites insult;
    whoever rebukes a wicked man incurs abuse.
    --Proverbs 9:7
  10. Time you gain, you lose in debugging by StarWynd · · Score: 2, Insightful

    While including code directly may speed up the compilation time, you will lose all the time you gain and then some when you get into debugging.

    If you have a complicated #include chain, you can wind up with a lot of duplication. Some compilers will complain, some won't. However, if you have typedefs, structs or the like, most compilers will complain and not compile your code until the duplications are removed. I don't know what compiler you're using or if you are planning on including more than functions or global variables, so I don't know if this is an issue or not.

    The more general issue is that it's much easier to track down bugs and other problems if there is a clean separation between definitions and implementations. I can't characterize that difference in a few sentences, so I'll just say that it has been my experience that projects which are developed in a true modular nature are much easier to debug than projects designed in a monolithic nature. The time saved in debugging more than makes up for a little time lost in compilation.
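One common way to keep a tangled #include chain from pulling the same definitions in twice is an include guard. The sketch below simulates a hypothetical point.h being included twice in the same file; POINT_H, struct point and the helper functions are illustrative names only. Without the guard, the second copy of the struct definition would be a redefinition error.

```c
/* --- first inclusion of a hypothetical point.h --- */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
#endif

/* --- second inclusion, reached through some other file in the chain --- */
#ifndef POINT_H                     /* already defined, so this copy is skipped */
#define POINT_H
struct point { int x; int y; };
#endif

/* Code that uses the type sees exactly one definition. */
int point_sum(struct point p)
{
    return p.x + p.y;
}

int point_demo(void)
{
    struct point p = {2, 3};
    return point_sum(p);
}
```

Guards only help with declarations and type definitions. Function bodies and initialized globals that are #included into more than one separately compiled file would still be defined more than once at link time.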

    1. Re:Time you gain, you lose in debugging by stonecypher · · Score: 2, Insightful

      If you have a complicated #include chain, you can wind up with a lot of duplication. Some compilers will complain, some won't.

      So sorry: the ODR rule prevents duplicated code from functioning in any compliant C or C++ compiler, all the way back to day one (and in fact into the parent languages B and BCPL.) This is simply false.

      However, if you have typedefs, structs or the like, most compilers will complain and not compile your code until the duplications are removed.

      If by most you mean all...

      so I'll just say that it has been my experience that projects which are developed in a true modular nature are much easier to debug than projects designed in a monolithic nature.

      Uh. Modularity and monolithism are not related. Modularity is the design technique of separated interface definition such that one may swap in alternate implementations of code, such as through TUs, polymorphism, SFINAE or various metaprogramming techniques. One may contrast structured programming, functional programming, lambda calculus or contract substitution.

      Monolithism is a development model: designing all at once and then implementing top-down from start to finish. Contrast the waterfall model, iterative development, chain development, and so on.

      The time saved in debugging more than makes up for a little time lost in compilation.

      Right on: preach it, brother. This is one of the least understood principles of modern design: machine time is significantly inferior to programmer time. Fred Brooks would be proud.

      (No, I'm not being sarcastic. Yes, I did just compliment you heavily after criticism.)

      --
      StoneCypher is Full of BS
  11. Depends on the size of the project by nadador · · Score: 2, Informative

    Depending on the size of your project, you will get varying returns from each of these:

    1. Separate source files mean that units of code can hide data and functions.
    2. Separate headers, combined with something like GCC's -Wmissing-prototypes, enforce the good coding practice of well defined functional interfaces.
    3. Separate headers and source files mean that when you look at a function in a file, you will have some idea of what it touches because you can go and look that it included header X but not Y.
    4. You can tell the compiler to explicitly forbid global data symbols, which is pointless in one single file.
    5. You can use different compiler switches for different files.
    6. Your code will have some hope of portability.

    If your project is small, it doesn't matter anyway. If your project is large, you can get your compiler to enforce some good design rules on you, which doesn't mean you can't still have a good design anyway, but it will make it more likely. I worked on a project that used a compiler that let you get away with everything. Try and port that code to anything UNIX-like, and it was ridiculous.

    --

    Outside of a dog, a book is a man's best friend. Inside a dog, it's too dark to read.
  12. Never thought we'd need to explain the obvious by Lord+Kano · · Score: 2, Interesting

    You have issues with scope.

    The easiest one for me is that with #includes, it's so much easier to fix bugs. If you find a bug or an inefficient way to solve a problem, you only have to fix it once. Everything that #includes the suspect file will be fixed on the next compile.

    If customfunctions.h has been changed or optimized, you don't have to edit the 30 projects that you're using those functions in. Just the one file is fixed, each project gets the benefits during the next compile.

    LK

    --
    "Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
  13. Off the top of my head... by Dormann · · Score: 2, Interesting
    Playing devil's advocate to most of the replies I've read so far, there are two benefits you could get from this approach.

    If your language supports "static" in the file-scope sense, you could declare every global object as static and reap the compiler optimizations that come with that declaration.

    If your language supports smart inlining, you could end up with code that has been inlined more effectively, since any code could be an inlining candidate regardless of location.

    I can think of plenty of reasons to back away from the idea, but they'll flood in here without my help.

  14. ccache by yamla · · Score: 3, Informative

    It is hard to tell from your statements, but this may stop tools like ccache from working. I use ccache in my projects and it radically cuts down the amount of recompilation required when I do a complete rebuild. Now, an obvious question is why I don't simply rely on makefiles to ensure only changed files ever get rebuilt. This often happens because compilation involves generating new cpp files that are then compiled and I don't want to be grepping through these all the time. I suppose I could move them all to a different directory, but ccache works very well.

    The other problem, of course, is that separating your classes into header and implementation means that if you change the implementation, you only need to recompile that one file and relink, rather than recompiling EVERYTHING. This can be a matter of a few seconds vs. several minutes. And implementation does change, a lot... fix a bug, you fix the implementation. The headers change too, but much much less frequently.

    --

    Oceania has always been at war with Eastasia.
  15. Use #includes sparingly? by mattgreen · · Score: 2, Interesting

    When I write libraries, I try to make them header-only. Generally users don't want to have to modify their makefiles if they don't have to, and I'll resort to compiler specific pragmas if I have to.

    It depends on the size of the system. If you are using a component-based system then only the pieces of the system that are actually being modified should be compiling anyway, which cuts out a lot of compilation. However this implies there is fairly loose coupling involved. In a more conventional application, there has to be a breaking point where the amount of time to parse files is longer than the time to link them normally. Using precompiled headers on any system header will also drastically decrease the time it takes to compile, since the compiler essentially just dumps the parse tree out to disk. So much time is spent inside some system headers! (Especially Windows.h. Ugh!)

    There are some tools that keep the header files in sync with source files automatically, but I don't know of any off-hand. I have seen some for C, but I'm not sure there is one for C++ that supports all the wild and crazy stuff like namespaces and templates. :)

  16. It can work by Foolhardy · · Score: 2, Interesting

    Including all the source code into one main file compiled to one object can work, if the source files cooperate. C can have problems with the namespace, but C++ allows multiple namespaces and you can even put the namespace blocks in the main file around the #includes. The source code has to support this, though. It's best if all the source files to be included are under your control. For libraries that expect to use a declarative header, use it like it was intended.

    I've done this on lots of projects and it works great. Most of the arguments here are either about performance or an appeal to tradition (that's the way we've always done it... must be the only true way). Modern compilers will create pre-compiled headers that can include code, usually used for template and inline definitions; modern compilers don't get the same benefits from the traditional model anymore. Actually, even larger projects seem to take longer to link with iostream and windows.h than the source does to compile.
    The compiler's ability to optimize code may be increased greatly, especially its ability to inline functions. Too much inlining will cause code bloat, but the compiler's options should give you control over the balance.
    Modern compilers also allow you to change the compilation options mid-file.
    Any debugger or source analyzer shouldn't have problems handling inline or same-file implementations, or you're using bad tools.
    It can also be easier to create test code; create a series of test files t01.cpp, t02.cpp (each with a main) but include only one. The others are there for reference but don't interfere. This is also useful for testing a prototype replacement for a component; include the new one and comment out the old include. Going back is trivial.

    It's more a question of coding style than anything. Personally, I hate maintaining redundant information of any kind, and this very much includes the prototypes in the header with the actual functions. Source code redundancy is bad for all the same reasons that database redundancy is bad. Making my C++ member functions inline and including their files frees me from this.

    I don't think this will work too well in Java. A Java source filename = the .class filename = the ONE public class exported by the file. Unless you want a total of 1 public class, it won't work. Java doesn't use header files anyways. Class binaries export everything public automatically.
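The "swap in a prototype replacement by changing one include" idea above might look roughly like the sketch below. The file names sum_new.c and sum_old.c, the USE_NEW_SUM switch and sum_to are all hypothetical, and the two candidate implementations are shown inline where the #include lines would normally sit, so the block compiles on its own.

```c
/* Flip this to 0 to go back to the old implementation. */
#define USE_NEW_SUM 1

#if USE_NEW_SUM
/* --- would be: #include "sum_new.c" --- */
static int sum_to(int n)
{
    return n * (n + 1) / 2;         /* closed form */
}
#else
/* --- would be: #include "sum_old.c" --- */
static int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)    /* straightforward loop */
        total += i;
    return total;
}
#endif

/* The rest of the program only ever sees one sum_to. */
int sum_demo(void)
{
    return sum_to(10);
}
```

Because only one branch is ever compiled, both versions can use the same function name without colliding, which is what makes going back trivial.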

  17. Learn from others' mistakes by GoofyBoy · · Score: 2, Interesting

    It's an interesting approach and you have no idea why you shouldn't do it.

    So do it.

    In the end, regardless if it works or not, you will have learned something new.

    --
    The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
  18. The immortal advice of Rocket J Squirrel by crmartin · · Score: 5, Insightful
    "Oh, Bullwinkle, that trick never works."

    One of the really depressing things about having been in the business for nigh on to 40 years now is that, along with the occasional new dumb idea, all the old dumb ideas keep coming back. Among those dumb ideas that keep coming back are "visual programming" --- using graphics instead of programming languages; complicated schematic graphics for software --- UML in its utter complex form; and, sure enough, using the preprocessor to mess with C-like languages.

    Every time this is tried --- and God knows it's been tried a lot --- you run into some severe problems:

    1. The scoping rules of C-like languages give semantics to file inclusion. If you #include chunks of code, you are defeating the language's (limited) ability to protect you from name space clashes, mis-named variables, and so on.
    2. While it might be that you gain something from only needing to start the compiler once, parsing and compilation are inherently a bit harder than O(n) where n is the number of source characters or tokens. A normal environment with make(1) will generally need to process fewer tokens than compiling everything all the time; the time required for a big file will inevitably dominate the startup time eventually.

      If you've got control of the compiler for this peculiar language, why not explore making the startup time shorter, say, eg., by using shared libraries, DLLs, or by setting the sticky bit?

    3. From sad experience, I can tell you that using the #include scheme will introduce weird-ass order dependencies into the code (ie., what order do you include files in?) that are very very difficult to debug.
    4. Most tools for C-based languages expect you to do the sources in a normal fashion. You confound the tools' expectations at your peril.
    5. Similarly, most debuggers exploit, or attempt to exploit, scoping rules that you will break through this approach.
    6. When you write lots of smaller modules, each one can create a single, small TEXT and DATA section, or a collection of small code sections. This makes the job of memory mapping in virtual memory systems much easier. Do it all as one big thing, and you're liable to get one big TEXT section.
    7. Optimization is combinatorially fairly hard, quadratic or worse, and global optimizations tend to be managed within section boundaries. One-big-module designs may either make the optimization phase very lengthy, or defeat optimization entirely when table space etc. runs out.
    8. You piss off every experienced C programmer who ever has to deal with the code in the future, especially old farts like me who've seen this trick 20 years ago.
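As one illustration of how include order can change behaviour (not necessarily the kind of dependency the poster ran into), consider two hypothetical files a.c and b.c merged into one translation unit. The names and values are made up, and the file contents are shown inline so the block stands alone.

```c
/* --- contents of hypothetical a.c --- */
#define BUFFER_SIZE 16
int a_buffer_size(void) { return BUFFER_SIZE; }

/* --- contents of hypothetical b.c, included after a.c --- */
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 64              /* b.c's intended default, never reached here */
#endif
int b_buffer_size(void) { return BUFFER_SIZE; }
```

Compiled on its own, b.c would use 64. Included after a.c it quietly picks up 16, and swapping the two includes would instead make a.c's #define a macro redefinition. Nothing in b.c itself hints at either outcome.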
    1. Re:The immortal advice of Rocket J Squirrel by Hard_Code · · Score: 2, Funny

      Yeah, seriously, you should be designing modules to interact through clean interfaces with known, published contracts. How you partition your compilation units should fall naturally from this. You shouldn't try to do the reverse and somehow "accrete" a design based on low-level preprocessor hacks.

      --

      It's 10 PM. Do you know if you're un-American?
    2. Re:The immortal advice of Rocket J Squirrel by crmartin · · Score: 2, Interesting

      Uh, which loader and which linker, and for that matter which operating system etc? I suspect you're right about linux, and it'd be lovely to believe that there aren't any systems (other than in hobbyists' basements) that don't do it that way any more, but with the information we've got, we can't exclude the possibility.

  19. Faster!?! Have you tried it? by multriha · · Score: 2, Interesting

    Well, coding style and software engineering aside, you need to do some testing if you think this will increase your speed.

    Quick test to illustrate. 1,000,000 lines of C code, using gcc 3.3.4, default options.

    Time to compile spread of 1000 files (with 8 lines of include and function body per file): ~2 minutes

    Time to compile all in a single file: unknown

    Why is the second time unknown? My computer doesn't have the memory to do it. Now I could pump up the memory of my machine (assuming I've got a 64-bit machine) to let me do the second compilation, but I doubt it'll be faster.

  20. You are solving the wrong problem by Chemisor · · Score: 3, Informative

    Speeding up a full build should not be important. The only people who care about it are in your test lab doing daily builds and regression tests, who can start the build overnight and have it ready by morning. Of course, this is the situation in a well-designed application. If you find yourself needing a full rebuild all the time, it means one of two things: 1. you are hacking a core component, or 2. all your components are written with spaghetti code and any change in one forces rebuilds in all the others.

    In the first case, try just testing one or two components during development, and then verify all the others when the API is stabilized. This is, incidentally, the advantage you gain from using header files: once the API is stable, you never need to rebuild that component again except to fix bugs (which require rebuilding only that component).

    In the second case, you need some serious refactoring. Look at the code and break it up. Encapsulate everything you possibly can. Make stuff private and static. Make everything you don't modify const. Keep it up until each component is accessed only through its API and that API is clean. Trust me, this is possible in any project. The enormous decrease in maintenance costs will more than pay for any time you spend on it.
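A rough sketch of what "private, static and const" can look like for one small component. The component, its table and the function names are hypothetical; the only thing other files would be able to call is score3.

```c
#include <stddef.h>

/* Private data: static keeps it out of other files, const keeps it read-only. */
static const int weights[3] = {1, 2, 3};

/* Private helper: not visible outside this file. */
static int weighted_sum(const int *values, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n && i < 3; i++)
        total += weights[i] * values[i];
    return total;
}

/* Public API: the one function a header for this component would declare. */
int score3(const int values[3])
{
    return weighted_sum(values, 3);
}
```

With this shape, changing the weights or the helper forces a rebuild of this one file only, as long as the declaration of score3 stays the same.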

  21. Total red herring... by pla · · Score: 3, Informative

    First of all, "speed", either compilation-wise or runtime-wise, has nothing to do with why you should use header files.

    I too disliked header files, long ago, in my early days of programming C. It seemed pointless, to have two files (or rarely, as many as four), when one would do just as well.

    For small projects, I'll still use one large monolithic source file. In that aspect, it makes sense to skip breaking out your data and function definitions.

    But when you get to the "real" world... Imagine even a "small" serious project, with perhaps 10k lines of code. Try to find a single function in that file - I hope you feel on good terms with your IDE's search capabilities!

    So, break that out into a dozen files - You have your network code in one file, your UI code in another, your file I/O in another, perhaps some database interaction in another, and so on. Okay, that works well... But wait, your network code, your file I/O, and your database code, all make use of the same checksum algorithm! So, you have the same exact code duplicated three times.

    That would work, because each file will compile to a module with its own namespace (in most languages). But it wastes space, both in the source and in the compiled code. It also wastes time and can very easily introduce bugs - For example, if you decide you need to switch from MD5 to SHA1 as your checksumming algorithm, you now need to change three places instead of one. If you miss one of those, but use them to compare results between the three different uses, you have a very serious bug that may drive you batty trying to track it down.

    So, the obvious solution, break out all your common functions into a toolkit-like source file. Now, you could just #include that in every other file that needs it, but WOW would that cause some serious bloat in the compiled code - In my experience, shared code files frequently end up as the single largest source file in the entire project.

    So, use a header file. That way, you don't end up with massive duplication of code, you have the advantage of a logical breakout of your code into similar-purpose files, and you can still make changes to only one file to modify one function.
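    A minimal sketch of that end state in C, assuming hypothetical file names and a toy additive checksum standing in for MD5/SHA1. The separate files are shown as sections of one listing so it compiles on its own; in a real project each section would be its own file and the users would #include the header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ---- checksum.h (hypothetical): declaration only, included by every user ---- */
unsigned checksum(const unsigned char *data, size_t len);

/* ---- checksum.c (hypothetical): the one and only definition ---- */
/* Toy additive checksum standing in for MD5/SHA1; swapping the algorithm
 * means editing this function and nothing else. */
unsigned checksum(const unsigned char *data, size_t len)
{
    unsigned sum = 0;
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        sum = (sum + data[i]) & 0xFFFFu;
    }
    return sum;
}

/* ---- network.c (hypothetical): sees only the declaration ---- */
unsigned network_packet_checksum(const char *payload)
{
    return checksum((const unsigned char *)payload, strlen(payload));
}

/* ---- fileio.c (hypothetical): same declaration, same single definition ---- */
unsigned file_record_checksum(const char *record)
{
    return checksum((const unsigned char *)record, strlen(record));
}
```

    Because both callers resolve to the same definition, their results cannot silently disagree the way three pasted copies can.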

    Incidentally, the above chain of thinking more-or-less describes the evolution of standard libraries... Would your professor actually suggest that you shouldn't "#include<stdio.h>", but instead should manually pull the code for each function you use into your source file? Because, in the degenerate case, he has told you exactly that.

  22. removal of duplication is usually a good thing by mqx · · Score: 2, Interesting


    Just remember that a header file defines the interface to the body: which actually duplicates some of the material in the body. Because of this duplication, you can have problems, e.g. faulty build dependencies, mismatch between header/body, etc. Removing this sort of duplication is usually a good thing: so if the technology (i.e. compiler) is smart enough, fast enough, etc. to get it right, then the change could be a good thing.
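    For the C case specifically, the header/body mismatch can be sketched like this. The file names and the function are hypothetical, and both "files" are shown in one listing; the point is only that the duplication is checked when the body can see the declaration, and unchecked when it cannot.

```c
#include <assert.h>

/* ---- shape.h (hypothetical): the interface ---- */
long rect_area(int width, int height);

/* ---- shape.c (hypothetical): the body, which would #include "shape.h" ---- */
/* Because the declaration above is visible here, a C compiler must diagnose a
 * definition whose type conflicts with it (for example, changing the return
 * type to int). If the body never included its own header, the two copies
 * could drift apart without any diagnostic. */
long rect_area(int width, int height)
{
    assert(width >= 0 && height >= 0);
    return (long)width * height;
}
```

    A language without separate headers avoids the problem by having only one copy in the first place, which is the trade-off being weighed here.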

    I'd like to point out that many other respondents have argued their case with reference to 'C', however the poster clearly said it was not 'C' -- without further information, it's difficult to know whether these 'C' type issues will translate. I'd point out that some languages, e.g. python, java, perl, do not have ideas of separate header/body -- suggesting that the "current trend" in languages is to do away with the duplication.

    The compiler could be intelligent enough to construct a parse tree quickly, and only resolve parts of the parse tree when necessary: so for example, if there was previously a 5K header, and a 30K body, but now only a 30K body, the compiler may read the entire 30K, and only "roughly" parse it (e.g., say for a function, it parses the outer scope of the function, but resolves nothing inside the function until some other code actually uses the function).

    I don't think there's an answer for this guy: there are too many issues that haven't been stated, as we know nothing about the particular toolchain, the build environment, the language, etc. All we have is an abstract concept of splitting files into header/body. That concept by itself isn't good or bad, it depends upon a lot of other issues that change the perspective.

    My answer would be that surely in the guy's company he has a couple of clueful senior engineers that can sit around a whiteboard and discuss (using their computer science training) what actual impact the change will have on the project, and whether to go with the impact.

  23. More from the author by garethw · · Score: 2, Informative
    Thanks for all the interesting replies. It's always nice to start a flame war.

    I wish I'd included a few more details, which might have avoided questions like, "Are you stupid?" and "Have you taken a basic Computer Science course?" (the answers are "On occasion" and "Waterloo, Comp Eng '98" respectively :) )

    A few details which might put the question into perspective might be:
    • The project is a chip verification project. There is no final "product" at the end of my work. The name of the game is endlessly re-compiling and running new tests. So compile time is actually quite significant.
    • There is no linker. :) The nature of the language is such that it is linked at run time.
    • The compiler actually doesn't allow you to list multiple source files on the command line and produce one object. So I guess my C/Java analogy was misleading. But that's partly why I'm at a loss to rationalize the question - there is little direct reference point.
    • A lot of people missed my point - I think abandoning header files is abhorrent. But when it came down to it, I couldn't actually produce any inarguable reasons why (namespace is one, but I don't think it's a show-stopper).
    Thanks again for your insights.
    --
    garethw
  24. Re:Incremental compilation by stonecypher · · Score: 2, Informative

    I had a whole reply ready, but IMHO it is not worth the trouble replying to.

    "Oh, I wrote a reply, but I don't want to paste it because you're not a good person and I don't want to." My eight-year-old son knows that nobody falls for this sort of passive-aggressive dismissal; it's disappointing that you do not.

    Namedropping doesn't make you seem correct, y'know.

    Neither does getting personal about perceived faults, when it's pretty much your own assumptions that are the problem.


    Observing that something you've done isn't effective is hardly my getting personal. Believe me, there's no shortage of material; what I said above about my son is, for example, personal, as is the following: turn down the whine knob until you've got something worth saying to say.

    So the whole template rant is bogus.

    Given that it was not you but the original poster which set the domain of important languages, and given that I also touch on pure-C and pure-Java issues, this protest is as bogus as it pretends what it's attacking to be.

    - When I am talking about state, I'm talking about state in the compiler. (which in most _performing_ tools has the preprocessor built in btw)

    Yes, I heard you the first time. The reason I referred you to modern c++ design is that you're wrong, and I have neither the patience nor the kindness to explain it to you. Start with section 3.5, or with any page explaining how C++ template metaprogramming is a functional language rather than an imperative language. Before you fly off the handle talking about how you weren't referring to templates *again*, please realize that the observations regarding template MP as a functional language in fact apply to everything in the C and C++ preprocessors.

    Do not reply until you have read; repeating ignorance is no more argument than repeating falsehoods.

    -- by {$i} ({$include} in delphi), like #include in C

    Actually, you're shooting yourself in the foot here. You're attempting to make the hasty generalization that there are two approaches to bringing outside code into a local place, that C/C++ advocate an "inline header system" (whatever the hell that is) and that Delphi does something different.

    What you seem to fail to understand is that uses is a call to the Delphi linker; it is literally the same thing as the linker in C++, and in fact if you take the time to look at borland's BPIL, you'll find that they generate the exact same intermediate language binary. Furthermore, the very same examples you give, {$i} and {$include}, are the same as #include. Furthermore, both languages offer still other mechanisms to bring code in or to generate code.

    That said, talking about Pascal's differences with C and then discussing what Delphi does is roughly equivalent to describing what Objective C does. Delphi is not pascal any more than Objective C is C. They are distinct languages. Delphi is Borland's third pascal variant, Object Pascal, which follows both Borland Pascal and Token Pascal (the last of which is so old that you pretty much can't find references to it online.)

    Please don't lecture to me about Pascal; my use of Pascal predates Borland's very existence.

    The main reason, and the fundament of a unit system, why the second way is more optimal than headers, is that the compiler reinitialises before reading a unit interface.

    Uh. The pascal unit system is simply an in-code linking mechanism. It's no different than rolling your source together with your makefiles; if you'd bothered to read Wirth's papers on the design of the language you'd find out that Wirth himself suggests that "unit" is nothing important, and pretty much just syntactic sugar.

    Now, how the compiler "reinitializes" before reading a unit interface is a little bit beyond me: the unit interface is just what a C++ programmer would call a collection of vtbls. Would you be willing to point me to any point

    --
    StoneCypher is Full of BS
  25. Re:This isn't C++ or Java by garethw · · Score: 2, Interesting

    The answer has to be "we don't know" because you haven't told us what language this is. I reckon it's some sort of verilog (revolting) or system-C (yukyuk) abomination,

    Not a bad guess. It's Vera. But what's the point of telling people that? How many people on slashdot even know what Vera is?

    in which case I would say *** do what the vendor tells you *** because if you don't and it all falls apart they will say "we told you so"

    But what's messed up is that the vendor just decided to start advocating this one day. The Vera compiler even supports header generation - it's been done the way we expect for years. And bang, one day they suddenly start encouraging this weird approach. "What the vendor tells me" is not what the vendor told me five years ago, or last year.

    I know that there was some discussion sparked internally at the vendor because I raised this question with their FAEs. I already know what other verification people think; I wanted to get an impression of what people thought of the same methodology if it were applied to similar languages.

    --
    garethw