GCC 4.0 Preview
Reducer2001 writes "News.com is running a story previewing GCC 4.0. A quote from the article says, '(included will be) technology to compile programs written in Fortran 95, an updated version of a decades-old programming language still popular for scientific and technical tasks, Henderson said. And software written in the C++ programming language should run faster--"shockingly better" in a few cases.'"
A lot of it is down to templates. As soon as you use them or STL you're upping the compile time by an order of magnitude and gobbling up a hundred megabytes of RAM... With great power comes a cost.
It's pretty cool. You write a loop like this:
and the compiler will handle the creation and synchronization of all the threads for you. Here's an OpenMP for GCC project on the FSF site. Looks like it's still in the "planning" state, though, so I'm guessing it's not in GCC 4.X.
Problems are like gifts, it's better to give than to receive
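The loop itself did not survive in the comment above. As a rough sketch of the kind of loop being described, assuming a compiler with OpenMP support enabled (if OpenMP is not enabled, the pragma is ignored and the loop simply runs serially), it might look something like this. The function name `sum_of_squares` is made up for illustration.

```cpp
#include <vector>

// Hypothetical example function: sums the squares of the inputs.
// With OpenMP enabled, the loop iterations are divided among threads and the
// per-thread partial sums are combined by the reduction clause.
// Without OpenMP, the pragma is ignored and the loop runs on one thread.
double sum_of_squares(const std::vector<double>& values) {
    double total = 0.0;
    const long n = static_cast<long>(values.size());
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; ++i) {
        total += values[i] * values[i];
    }
    return total;
}
```

Nothing in the function creates, joins, or locks a thread by hand; that is the part the compiler and its runtime library take care of.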
but the reason it takes forever to compile KDE lies in the fact that it uses templates extensively. While templates (a.k.a. generics) are a very useful language feature, they increase compile times. Including support for the export template feature could help, but only if anybody actually used it in their code.
You can run an experiment: try compiling KDE with the Intel C++ or Comeau C++ compilers, and you'll see that not much is gained compared to GCC.
You can defy gravity... for a short time
I'd love for boost to be in the standard library, but I'm not sure that complaining to the gcc folks is the way to get this done. Surely if we want this in the standard library, it should be included as part of the next version of the ISO C++ standard?
Oceania has always been at war with Eastasia.
GCC is just the compiler. If you want an IDE that works with it, look around... there are a few. I don't really have any recommendations as I'm mostly working with other tools right now.
Read the site you just linked:
Ten Boost libraries will be included in the C++ Standards Committee's upcoming C++ Standard Library Technical Report as a step toward becoming part of a future C++ Standard.
So yes. Eventually, anyway.
Likewise, there are several IDEs that can nicely handle a C++ project which uses GCC. Eclipse is maybe the best example of these.
Besides, do you really want "Must have GUI to cope with compiler" on your resume? ;-)
What does GCC have to do with this?
If you want something added to the standard, talk to the C++ standard committee. (Either the Library or the Evolution groups, in this case.) You'll find you're about the 10,000th person to ask for this. You'll find there's an extensive FAQ on this exact subject. You'll find that the committee is very keen on adapting large parts of Boost, as experience in the real world smooths the rough edges of Boost.
If you look a bit more, you'll find that some extensions have already been adopted (called "TR1") and are being shipped with GCC 4.0.
You'll also find that GCC does not get to determine what's in the standard. And -- speaking as one of the libstdc++ maintainers, although I'm largely too busy to do much myself these days -- GCC will not ship Boost. Or glibc. Or libAPR. Or OpenSSL. Or any of the other million very useful open source libraries out there, because that's not our job.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
If you're interested, here's a (long) discussion which makes reference to many of the things coming in the new GCC.
Game! - Where the stick is mightier than the sword!
Don't all compilers convert a program's source code into binary instructions?
We're working on the necessary infrastructure to associate the pragmas with the syntactic constructs they apply to. Actually parsing the OpenMP directives was already implemented - twice - but GCC does not support pragmas with a lexical context yet. This is needed for a bunch of C extensions, so we're working on that. This is probably GCC 4.1 material. After that, actually generating concurrent code from OpenMP pragmas is next.
Have they found some new-fangled magical technique for compilation?
Actually, SSA trees probably count, which is new in GCC 4 (invented in the early 90's). Look here, scroll down to "Power Through Builds" for a list of improvements from SSA trees.
Of course, this claim may be due to no longer doing something shockingly inefficient.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
$ gcc --version
gcc (GCC) 4.0.0 20050310 (Red Hat 4.0.0-0.33)
Copyright (C) 2005 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
I've noticed it compiles a bit faster and the binaries are a bit bigger as well.
No, propolice is something entirely different. That's a stack smashing protector. Mudflap is a bounded pointers implementation.
Yes, from here: "
Even the most cursory search of the GCC mailing list archives would disprove this.
What do they do that makes their C compiler so much faster?
Guess again. Gfortran is an entirely new compiler frontend and runtime library. Andy Vaught undertook the enormous task of writing a Fortran 95 compiler from scratch. An early version of Andy's work was integrated into GCC and we have been working on bug fixes, completing a few missing pieces, and backwards compatibility with g77. Use google with g95 and gfortran keywords. BTW, gfortran beats g77 on some of my private benchmarks.
Damn!
That's: g++ -o myapp file1.cpp file2.cpp file3.cpp
General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
Have you tried maintaining a compiler used in as many situations as GCC? (If not, you should try, before making complaints like this. It's an educational experience.)
We added a "select ABI version" to the C++ front-end in the 3.x series. If you need bug-for-bug compatibility, you can have it.
Wanna know when this is gonna happen? Sooner, if you help.
So this is a nice debugging feature that might catch a few hard system bugs but I'm a bit concerned that the GCC folks are touting it as a security feature if it's rarely going to be used in production code.
check for a class of vulnerabilities called buffer overruns
Eerily reminiscent of VAX/VMS's "/ARRAY_BOUNDS_CHECKS=ON" option, around 1985 this was. Admittedly, this was for Pascal or somesuch.
Cool thing for gcc nonetheless. Don't forget to check Boehm's Garbage Collector for C and/or Bruce Perens' Electric Fence
Maybe someone's already said this, but look into three projects to speed up your compile:
1) make (or some equiv). Yes, I said make.
GNU make accepts a -j parameter, to run build jobs in parallel. Only really useful on hyperthreading or multiprocessor boxes, however. That said, if you use:
2) http://distcc.samba.org/: distcc. You can distribute the compilation of your apps across other machines with a similar setup. Only really helpful if you have more than one box.
3) http://ccache.samba.org/: ccache. This is a C/C++ compiler cache. Only really useful for iterative development, and if you're doing a lot of make clean/make, as it'll cache things that don't need to be rebuilt.
Just some suggestions. Also, check out prelink, to prelink anything using shared libraries (trading space savings for performance) and make startup code run faster in some cases.
Hope that helps!
++Informative? Pwetty pwease?
- - - -
KickingDragon
Apparently. Found via google:
http://people.redhat.com/bkoz/benchmarks/
Doesn't look public though.
I don't know when that last time you're referring to was, but all the benchmarks I've seen in the last couple of years clearly show the superiority of Intel's compiler. The generated code was up to 3 times faster.
I just hope that GCC closes the gap with 4.0.
1's and 0's should be free.
GCC is an incredibly versatile compiler, with frontends for C, C++, Java, Ada and Fortran provided with the basic install. 3rd party extensions include (but are probably not limited to) Pascal, D, PL/I(!!) and I'm pretty sure there are Cobol frontends, too.
They did drop CHILL (a telecoms language) which might have been useful, now that telecoms are taking Linux and Open Source very seriously. As nobody seems to have picked it up, dusted it off, and forward-ported it to modern GCCs, I think it's a safe bet that even those interested in computer arcana aren't terribly interested in CHILL.
OpenMP has been discussed on and off for ages, but another poster here has implied that design and development is underway. OpenMP is a hybrid parallel architecture, mixing compiler optimizations and libraries, but I'm not completely convinced by the approach. There are just too many ways to build parallel systems and therefore too many unknowns for a static compile to work well in the general case.
Finally, the sheer size and complexity of GCC makes bugs almost inevitable. It provides some bounds checking (via mudflap), and there are other validation and testing suites. It might be worth doing a thorough audit of GCC at this point, so that the 4.x series can concentrate on improvements and refinements.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
C binary compatibility is broken constantly, with every version of glibc. Anything compiled statically will crash when it uses NSS on a system with a slightly different glibc version. If you compile dynamically, then anyone who doesn't have this week's version of glibc can't run your binaries.
At present, there are numerous features of Fortran 95 that gfortran (the gcc Fortran 95 compiler) does not handle correctly. G95 http://www.g95.org/, from which gfortran forked, is closer to being a full Fortran compiler. One can search the Usenet group comp.lang.fortran to confirm these statements.
If you just want a free Fortran 95 compiler use g95. Bugs reported to Andy Vaught are usually fixed quickly, and fresh Linux compiler binaries are posted almost daily. If you want to participate in the development of a Fortran 95 compiler, gfortran is more democratic.
Execution speed.
The gcc/g++ driver's purpose in life is to rip through the command line, figure out what other programs need to be run (compiler, assembler, linker, etc), fork them all off -- possibly in a loop, if you've passed more than one file on the command line -- and clean up afterwards.
"gcc -> real-work-programs" or "g++ -> real-work-programs" is a much faster execution path than "sh parser -> gcc -> real-work-programs", especially when your makefile is repeatedly invoking g++.
Maintenance is not especially difficult; g++ isn't really a separate program. The difference between gcc and g++ is one or two extra .o files that get linked into the final executable. (Same for other language drivers that can't get by with plain "gcc", like the Java one.)
using namespace name;
That's all you need to do. What's so hard? I use "using namespace std;" in the common include files of all of my home-built programs.
C++ is a different language. Not only is its syntax different, but the style of doing things is different. If you're expecting to not feel like it's an alien environment, you'll be sorely mistaken.
That doesn't mean it's bad; after a long time of resisting it for taste reasons, I started learning exactly *why* C++ does certain things, and how to put them to good use. And the differences can be staggering at times - templates are invaluable, destructors are invaluable, classed arrays (things like vectors instead of pointers) are invaluable, maps are invaluable, etc. These sort of things can knock out bugs you didn't even know were there, improve performance, drastically shorten, and clarify your code all at the same time. That's a rare combination of benefits in the programming world. You pay for it in compile time costs, but it's well worth it - especially when it comes to maintenance. You just have to accept that it's going to feel rather alien for a while, and during that time, you'll be asking yourself, "Why?".
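To make the point about vectors, maps, and destructors concrete, here is a small sketch of a word counter. The helper names `split_words` and `count_words` are invented for this example. Note that there is no manual allocation, no buffer size to get wrong, and no cleanup code: the containers release their memory in their destructors.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split text on whitespace into a vector of words.
std::vector<std::string> split_words(const std::string& text) {
    std::vector<std::string> words;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        words.push_back(word);  // the vector grows as needed
    }
    return words;
}

// Hypothetical helper: count how often each word occurs.
std::map<std::string, int> count_words(const std::string& text) {
    std::map<std::string, int> counts;
    const std::vector<std::string> words = split_words(text);
    for (std::size_t i = 0; i < words.size(); ++i) {
        ++counts[words[i]];  // operator[] creates a missing entry with value 0
    }
    return counts;  // 'words' is destroyed here; nothing to free by hand
}
```

The equivalent in plain C would need a hand-rolled growable array, a hash table or tree, and matching free calls on every exit path.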
BTW, Java 1.5 is becoming more and more like C++ every day. So, if you don't like the features of C++, you won't like them in modern Java.
"Here's a fun fact: the moon has turned to blood!" -- Newscaster, "Jesus Christ Supercop"
Point-by-point response:
Yeah, heavy on the "might".
Politics is what's preventing us from considering LLVM, let alone the long and torturous process of making the code work. The brutally short story is that GCC is operating under a certain restriction imposed by RMS since its inception, and LLVM -- or really, any good whole-program optimization technique -- would require us to violate that restriction.
Now, there are some of us (*waves hand*) who feel that RMS is a reactionary zealot in this respect, and would be more than happy to use the LLVM techniques, but we won't get into that.
Anyone who mistook g95 for F95 would indeed be right in concluding fortran was a dated useless language.
Dude, g95 isn't yet completed. Why the hell would one expect it to be fully-functional? Got an axe to grind about the g95/gfortran fork?
My favorite parts of fortrans are that one cannot overflow a buffer
Rubbish, you can do just the same stupid things that you can with C. The difference is that Fortran can implement arrays without the need for pointers, and most Fortran compilers support decent (but very slow) bounds checking on these arrays.
RIP fortran95, killed by g95.
Again, this statement is misleading and meaningless. There are plenty of other mature Fortran 95 compilers on the market, and -- upon their completion -- g95 and gfortran will add to the selection. How has g95 'killed' Fortran 95?
Tubal-Cain smokes the white owl.
Amen. I used to think that C++ was slower because you would have to deal with the overhead of objects. Until I actually started using it, and found out that once the code is compiled, you don't really experience much if any overhead. On the other hand, you tend to benefit greatly from the highly optimized operations in the STL.
:)
Const, operator overloading... all of it is great. Inheritance, too. There are so many things in C++ to help you keep your code small, easy to read, and clean. It feels a bit alien at first if you've been programming in C for a long time, but it's well worth it.
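As a small illustration of how const and operator overloading keep code short and readable, here is a sketch using a made-up `Vec2` type (not from any particular library). The const references promise the caller that the arguments will not be modified, and the overloaded operators let `midpoint` read like the math it implements.

```cpp
// Hypothetical 2D vector type, for illustration only.
struct Vec2 {
    double x;
    double y;
};

// Component-wise addition; neither operand can be modified.
inline Vec2 operator+(const Vec2& a, const Vec2& b) {
    Vec2 result = {a.x + b.x, a.y + b.y};
    return result;
}

// Scaling by a scalar on the left.
inline Vec2 operator*(double s, const Vec2& v) {
    Vec2 result = {s * v.x, s * v.y};
    return result;
}

// Reads like the formula instead of a pile of field accesses.
inline Vec2 midpoint(const Vec2& a, const Vec2& b) {
    return 0.5 * (a + b);
}
```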
I have my faults with it, of course. I think streams were done rather poorly, for example. But overall, I'm glad I switched.
Yes, anyone who badmouths RMS is "ranting incoherently" about our Dear Leader.
Using statements in a header propagate down into every .cpp that includes your library's headers, whether or not the calling programmer wanted to import the entire std namespace.
Some programmers may have their own classes called map, or string, or list, or a dozen other things, and a single using statement buried in a nested .h can cause unanticipated namespace collisions.
In general, it's safest and most polite to refer to classes canonically in header files (std::string, etc), and keep the using statements in your implementation files.
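A sketch of that convention, squeezed into one listing for brevity. The file names in the comments and the `mylib::join` function are hypothetical; in a real project the two halves would live in separate files.

```cpp
#include <string>
#include <vector>

// ---- what would go in the header (say, "join.h") ----
// No using statements: every std name is fully qualified, so including
// this header does not change name lookup in the caller's code.
namespace mylib {
std::string join(const std::vector<std::string>& parts,
                 const std::string& separator);
}

// ---- what would go in the implementation file (say, "join.cpp") ----
// A using statement here only affects this one translation unit.
using namespace std;

namespace mylib {
string join(const vector<string>& parts, const string& separator) {
    string result;
    for (vector<string>::size_type i = 0; i < parts.size(); ++i) {
        if (i != 0) {
            result += separator;  // separator goes between items only
        }
        result += parts[i];
    }
    return result;
}
}
```

A caller who has their own class named string or list can include the header half safely, because nothing from std has been pulled into their scope.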
Sources: "Accelerated C++" (Koenig, Moo); comp.lang.c++ (sample)
LLVM is sort of a mostly-compiled form of a program.
(like preprocessed, but more work having been done)
If gcc can convert C to LLVM, and LLVM to native,
then you could replace either half with something
proprietary. You could add a proprietary middle
step that optimized LLVM code.
Just thought I would point out that Phil is one of the people who have been in charge of the libstdc++ (GNU Standard C++ Library) for a long time. I realise that no one is used to actual subject authorities on slashdot.
[RIAA] says its concern is artists. That's true, in just the sense that a cattle rancher is concerned about its cattle.
It's a shame, since I think the compile server has major potential - and not only in terms of improving compile speed. However, there is still a significant amount of work without a guaranteed payoff, and I guess Apple decided to spend its resources elsewhere. Also, various issues make it difficult for me to work much on the project.
It took me all of 60 seconds to Google this link substantiating the factor of 3-5 slowdown with Mudflap: http://gcc.fyxm.net/summit/2003/mudflap.pdf The performance data is tabulated on page 7: the average slowdown out of six test cases (three build cases, three run cases) appears to be a factor of 4 or so, with the best case being 1.25 (in one run case), and the worst case being 5 (in one build case and in one run case).
"It takes 9 months to bear a child, no matter how many women you assign to the job."
Change the type face and half your problems are solved. I'll be happy to accept my +informative moderation now.
The Farewell Tour II
I think even when you are on a single processor box, you can get a bit of speedup with -j2.
Hopefully, while one g++ process is writing out an object file to disk (waiting for disk io to complete) another can be using the CPU to parse or optimize.
I've found that for C++ projects on the low side of medium (say, around 50,000 lines), arranging the compilation so that all of the source code is lumped into a single "unit" accelerates GCC's from-scratch compilation times considerably. Of course, it also uses large amounts of RAM.
For example, direct GCC to compile one top-level .cpp file containing hundreds of
#include "other-source-file-0.cpp"
#include "other-source-file-1.cpp"
#include "other-source-file-....cpp"
#include "other-source-file-673.cpp"
statements, one for each .cpp file in the entire project.
This forces GCC to compile the whole program as a single unit, so it preprocesses the headers only once, and generates only a single object file.
This achieves approximately the same reductions in compile time as precompiled headers.
Come link time (which can be quite long even for a medium-sized project), instead of needing to combine hundreds of object files, the linker need only deal with one. On x86-Linux with GCC 3.4.x, at least, this accelerates the linking process, plus generates a much smaller and slightly faster binary. I assume that the slight acceleration in the generated code is due to the fact that compiling the entire project as one compilation unit has essentially the same cross-unit optimization benefits as link-time code generation.
Although this technique reduces from-scratch compile times considerably, it's of dubious value in most situations because:
a) Users compiling the software as part of an installation process might find it uncomfortable--or even impossible--to allow the compiler several hundred MB of RAM.
b) The developer's changes in a typical edit-compile-test cycle are localized. This technique destroys the possibility of using previously compiled object files for those compilation units whose source code hasn't changed, since there's only one compilation unit for the entire project.
Erlang.org: wow
I only suffer this problem when the line endings in an include file are not "native" to my platform. One include file with LF instead of CR endings can sometimes be enough.
I don't know precisely about STL, since I avoid relying on it, but you might check all include files. If you're on a unixy platform, try to get the list of all include files (via a "depend" like command) and check 'em all for line endings (the "file" command can be helpful).
That's not true. Building C is much quicker with Visual C++ than building C++. I know, I do it every day.
However, it is generally speaking true that gcc takes more time to compile than Visual C++ does.
I didn't go into details because this has been covered elsewhere, and I'm tired of discussing it myself. But I didn't realize I would be accused of "uninformed slander". So. A bit of background info first.
Inside the guts of the compiler, after the parser is done working over the syntax (for whatever language), what's left over is an internal representation, or IR. This is what all the optimizers look at, rearrange, throw out, add to, spin, fold, and mutilate.
(Up to 4.0, there was really only one thing in GCC that could be properly called an IR. Now, like most other nontrivial compilers, there's more than one. It doesn't change the political situation; any of them could play the part of "the IR" here.)
Once the optimizers are done transforming your impeccable code into something unrecognizable, the chip-specific backends change the IR into assembly code. (Or whatever they've been designed to produce.)
Each of these transformations throws away information. What started out as a smart array class with bounds checking becomes a simple user-defined aggregate, which becomes a series of sequential memory references, which eventually all get turned into PEEK and POKE operations. (Rename for your processor as appropriate, or look up that old joke about syntactic sugar.)
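To picture the "smart array class with bounds checking" end of that chain, here is a minimal sketch. The `CheckedArray` class and the `fill_and_sum` function are invented for illustration and are not taken from GCC or any library. At the source level the compiler can see that every index is checked; by the time the code has been lowered to loads and stores, that knowledge is gone.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical bounds-checked, fixed-size array.
template <typename T, std::size_t N>
class CheckedArray {
public:
    CheckedArray() : data_() {}  // value-initialize all elements

    T& at(std::size_t i) {
        if (i >= N) {
            throw std::out_of_range("CheckedArray index out of range");
        }
        return data_[i];
    }

    std::size_t size() const { return N; }

private:
    T data_[N];
};

// Every access below goes through the check. With the class definition in
// view, an optimizer may be able to prove that i < 8 always holds and drop
// the checks, leaving plain memory references.
int fill_and_sum() {
    CheckedArray<int, 8> a;
    for (std::size_t i = 0; i < a.size(); ++i) {
        a.at(i) = static_cast<int>(i);
    }
    int sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a.at(i);
    }
    return sum;
}
```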
Now -- leaving out all the details -- it would be Really Really Useful if we could look at the PEEKs and POKEs of more than one .o at a time. Since the compiler only sees one .c/.cpp/.whatever at a time, it can only optimize one .o at a time. Unfortunately, typically the only program that sees The Big Picture is the linker, when it pulls together all the .o's. Some linkers can do some basic optimization, most of them are pretty stupid, but all of them are limited by the amount of information present in the .o files... which is nothing more than PEEK and POKE.
As you can imagine, trying to examine a pattern of PEEK and POKE and working out "oh, this started off as a smart array class with bounds checking, let's see how it's used across the entire program" is essentially impossible.
Okay, end of backstory.
The solution to all this is to not throw out all that useful abstract information. Instead of, or in addition to, writing out assembly code or machine code, we write out the IR instead. (Either to specialized ".ir" files, or maybe some kind of accumulating database, etc, etc; the SGI compiler actually writes out .o files containing its IR instead of machine code, so that the whole process is transparent to the user.) Later on, when the linker runs, it can see the IR of the entire program and do the same optimizations that the compiler did / would have done, but on a larger scale.
This is more or less what all whole-program optimizers do, including LLVM. (I think LLVM has the linker actually calling back into the compiler.)
The "problem" is that between the compiler running and the linker running, the IR is just sitting on the disk. Other tools could do whatever they want with it. RMS's fear is that a company would write a proprietary non-GPL tool to do all kinds of neat stuff to the IR before the linker sees it again. Since no GPL'ed compiler/linker pieces are involved, the proprietary tool never has to be given to the community. Company wins, community loses.
End of problem description. Begin personal opinionating.
It's a legitimate concern, but many of us feel that a) it's going to happen eventually, and b) we do all GCC users a disservice by crippling the tools merely to postpone an inevitable scenario. As usual, there's a wide range of opinions among the maintainers, but the general consensus is that keeping things the way they are is an untenable position.
And if you are using templates, you are actually using a compile time language, which obviously will slow compilation down.
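The classic demonstration of templates as a compile-time language is a factorial that the compiler evaluates while instantiating templates. This is only a sketch; the names `Factorial`, `table`, and `factorial_of_five` are made up for the example.

```cpp
// Recursive case: instantiating Factorial<N> forces the compiler to
// instantiate Factorial<N - 1>, and so on down to the base case.
template <unsigned N>
struct Factorial {
    static const unsigned long value = N * Factorial<N - 1>::value;
};

// Base case: an explicit specialization stops the recursion.
template <>
struct Factorial<0> {
    static const unsigned long value = 1;
};

// An array bound must be a compile-time constant, so this declaration only
// compiles because the compiler itself has worked out that 4! is 24.
int table[Factorial<4>::value];

// At run time this just returns a constant the compiler already computed.
unsigned long factorial_of_five() {
    return Factorial<5>::value;
}
```

All of the "execution" happens inside the compiler, one template instantiation per step, which is where the extra compile time goes.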
But they do claim C++ compilation is much faster with 4.0:
Personally, I don't care much. Development seems to be restricted by link time, for a well organized project. And I always compile with full optimization.