Reduce C/C++ Compile Time With distcc
An anonymous reader writes "Some people prefer the convenience of pre-compiled binaries in the form of RPMs or other such installer methods. But this can be a false economy, especially with programs that are used frequently: precompiled binaries will never run as quickly as those compiled with the right optimizations for your own machine. If you use a distributed compiler, you get the best of both worlds: fast compile and faster apps. This article shows you the benefits of using distcc, a distributed C compiler based on gcc, that gives you significant productivity gains."
While distCC is a great tool, there are a couple things to mention. First, the article blurb states that distCC is "a distributed compiler based on GCC." It is actually a method of passing files to GCC on a remote computer in such a way that the build scripts think it was done locally.
The article also says that other than distCC, the computers need not have anything in common; this is not strictly true. Different major versions of GCC can cause problems if you are trying to compile with optimization flags that are only on the newer version. I have run into this on my gentoo box, trying to use an outdated version of GCC on a redhat box.
Another thing is that some very large packages have trouble with distributed building of any sort (either multiple threads on the same machine, or over a network like with distCC). As far as I know, at least parts of xfree86, KDE and the kernel turn off distributed compiling during the build. Some of this might just be in the gentoo ebuilds, but I think some of it is in the actual Makefiles. If a program has trouble compiling, it's always worth a shot to turn off distCC.
A good resource for setting up distCC on a gentoo system (particularly important there, since compiling is such a large part of gentoo) is gentoo.org's own distCC guide.
It's also been discussed here on Slashdot (two years ago!) in "A Distributed Front-end for GCC" and earlier this year in "Optimizing distcc."
Distcc is great for installing Gentoo on an older computer because you can have other (faster) computers help with the compile, and if you like distcc, you may also like ccache.
That's why I use Gentoo!
Now one can install Gentoo in _only_ 5 days!
I think nc can be used like distcc by redefining CC="nc gcc". However, more commonly it is done by putting $(NC) at the beginning of the build rules. Then you can use nc for any build rules, not just C compiles.
In addition to use with make, nc works well with SCons.
Compare the speed cost of loading a "generic" binary to an "optimised" one, multiply by the number of times you load that binary.
Then look at the time required to compile the optimised copy.
How often, in the lifetime of a particular version of a binary, do you really need to reload it?
The promise of distcc is closely related to source distributions like Gentoo. The benefit is overstated. Don't waste your time.
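To put rough numbers on that comparison, here is a minimal Python sketch. Every figure in it is a made-up placeholder rather than a measurement, and `break_even_loads` is just a name chosen for illustration.

```python
import math

def break_even_loads(compile_seconds, generic_load_seconds, optimised_load_seconds):
    """Return how many loads it takes for the optimised binary to repay its compile time.

    Returns None if the optimised binary is not actually faster to load.
    """
    saving_per_load = generic_load_seconds - optimised_load_seconds
    if saving_per_load <= 0:
        return None
    return math.ceil(compile_seconds / saving_per_load)

# Hypothetical figures: a 30-minute compile, and a load that drops from 2.0 s to 1.75 s.
print(break_even_loads(30 * 60, 2.0, 1.75))  # 7200
```

With those assumed numbers the binary has to be loaded 7200 times before the compile pays for itself; plug in your own timings to see where you land.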
Quick wafting zephyrs vex bold Jim
We use the distcc that Apple distributes with XCode even though we don't use XCode itself. It really helps to get a few dual-CPU G5s working!
The cool thing about Apple's version is that by default it uses Rendezvous to determine which machines are available to distribute work to.
Precompiled binaries will never run as quickly as those compiled with the right optimizations for your own machine
And there are maybe about ten to fifteen people on all of slashdot who actually know how to go about setting the right optimizations for their own machine.
There are still enterprise uses where coders need to compile huge projects from scratch that take too long on a single workstation. Instead of that build taking 15 minutes on a single workstation, they can tap the power of all the workstations and build it in a few minutes or perhaps even seconds.
thisnukes4u.net
"(...) a distributed C compiler based on gcc, that gives you significant productivity gains."
Assuming
a) That compiling will give you any significant performance increase (which I kinda doubt, it's not like the defaults are braindead either)
b) You don't spend more time mucking about with distCC / compiling than you'll actually use the software
c) Your software is actually code bound (and not "What do I type/click now?" human bound, or bandwidth bound or whatever)
I can't think of a single thing I do that's code bound. And I actually do a bit of compiling, but I spend those seconds thinking about what to code next. Either that, or it is bandwidth bound or non-time critical (i.e. does it take 6.5 hours or 7 hours? Who cares. The difference is half an hour's work for my computer, 0 for me). So the time I'd spend to improve it is - gasp - 0.
Kjella
Live today, because you never know what tomorrow brings
It's news to people that don't read slashdot every day.
I don't mind revisiting older topics once in a while - it's only annoying when it's two days in a row. And even then, it's not that big of a deal, I simply pass over it.
Posts like this are more of a waste of space than a duplicate article post, and we get a lot more posts like yours than we do dupes. It's especially annoying when people say "We talked about this TWO YEARS AGO!!!" Well here's some news for you: I don't memorize every slashdot story since the beginning, and there have been a lot of new members since then.
- It's not the Macs I hate. It's Digg users. -
Other than distributed compiler tools like distcc and nc, are there any other ways of speeding up a linux compile with gcc?
I was blown away when my project group compiled a Qt app that we developed on the Linux platform with the MS VC++ compiler. The compilation took 1/10th the time! We were using Makefiles generated by QMake in both cases.
Should I just switch compilers? If so does anyone have any suggestions?
Just last weekend I set up distcc via cygwin on 3 PCs to help my Gentoo box compile. Unfortunately, I wasn't able to successfully compile the cross compiler under cygwin, so I used a pre-built version, available under the Gentoo forums thread linked below. It seems to work well so far, although the Windows boxes are definitely slower than equivalent Linux boxes. But as they are not my computers to begin with, I won't be complaining anytime soon ;)
Gentoo has a HOWTO entitled:
"HOWTO: Use a Windows box as a distcc server for linux."
http://forums.gentoo.org/viewtopic.php?t=66930
I've spent the last week setting up a Gentoo cluster with distcc and I've noticed a few things:
1. when *recompiling*, the advantage due to ccache far outweighs the performance of distcc on the first compile. If you're testing distcc you need to be aware of this and disable ccache.
2. most large packages either disable distcc (e.g. xfree by limiting make -jX) or compile small sets of files in bursts and spend the majority of time performing non-compilation and linking. Distcc helps with the compilation but because it's only a small part of the total build time, the overall improvement isn't as great as you might have hoped.
3. distccmon-gnome is very cool.
4. using distcc with Gentoo transparently involves modifying your path and this can make non-root compilations troublesome (permissions on distcc lock files). I haven't figured this one out yet other than to specify the full path to the compiler: make CC=/usr/bin/gcc rather than CC=gcc.
5. the returns from adding an extra distcc server to the pool drop considerably after the first few machines. Even on a 1 gigabit LAN the costs of distcc catch up with the benefits after a while. This is more of a concern when compiling lots of small files.
6. it can handle cross-compilation with a bit of configuration.
So although distcc can often reduce build time, it's not quite as effective as you might assume or hope at first.
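Points 2 and 5 can be illustrated with a toy model. This is only a sketch under stated assumptions: the build is split into a part that distributes perfectly and a part that does not, and each extra host adds a fixed cost. The function name and all the numbers are hypothetical, not measurements from the cluster.

```python
def estimated_build_seconds(compile_seconds, serial_seconds, hosts, overhead_per_host=0.0):
    """Toy model: only the compile portion is spread over hosts.

    serial_seconds covers configure, linking and other non-distributable work.
    overhead_per_host is an assumed fixed coordination/network cost per extra host.
    """
    if hosts < 1:
        raise ValueError("need at least one host")
    return serial_seconds + compile_seconds / hosts + overhead_per_host * (hosts - 1)

# Hypothetical build: 300 s of compilation, 200 s of everything else,
# and 5 s of overhead for each host beyond the first.
for hosts in (1, 2, 4, 8, 16):
    total = estimated_build_seconds(300, 200, hosts, overhead_per_host=5)
    print(f"{hosts:2d} hosts: {total:6.1f} s")
```

In this model the serial 200 s never shrinks, so the total can't drop below that, and with the assumed overhead the 16-host build comes out slower than the 8-host one.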
If you feel sad and alone, if you think you are wasting your youth in many lonely nights, then compiling your Gentoo distribution with dedicated and optimizing flags may suit you.
precompiled binaries will never run as quickly as those compiled with the right optimizations for your own machine
A straw man. Precompiled binaries may have been compiled with the optimal settings for your machine, and binaries which you compile may not have the optimal settings. Identifying the optimal settings can actually be non-trivial. Source-based distributions are not necessarily the best fix to the 'one-size-fits-all' approach used by some distros.
I came across distcc by chance about 4 months ago, and I must say, it has utterly improved things around here.
We regularly develop/compile/debug a moderate-small sized software package, typically taking about 1 minute per compile. Now, while 1 minute doesn't sound like a long time, it starts adding up when you find yourself recompiling 100+ times a day.
With the inclusion of distcc into the whole situation, we're able to reduce that 1 minute compile down to a little less than 20 seconds; highly appreciated (although now we have fewer excuses to go get a coffee :-( ).
Distcc is a great package which can be extremely useful.
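As a quick back-of-the-envelope check on those figures, here is a small Python sketch. It takes the round numbers from the post (100 builds a day, 60 seconds before, about 20 seconds after) at face value; the function name is made up for illustration.

```python
def daily_saving_minutes(builds_per_day, seconds_before, seconds_after):
    """Minutes of waiting saved per day when each build gets faster."""
    return builds_per_day * (seconds_before - seconds_after) / 60

# Round figures from the post: 100 builds a day, 60 s each before, roughly 20 s after.
print(round(daily_saving_minutes(100, 60, 20), 1))  # 66.7
```

So on those numbers the saving is a bit over an hour of waiting per developer per day.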
PLD.
No point building mozilla with GTK2 support if you don't need it, is there? Or Samba with any of the following with question-marks:
It's the fact that you get packages on your system that match the settings you set before, not the fact you can compile every package with -fomit-frame-pointer that gives Gentoo its strength.
Get your own free personal location tracker
Well, that won't help anyone, since almost nobody runs Plan 9.
Also, advising people to use faster compilers is bad advice. The point is to make the application faster, and the slower the compile, the faster the application is likely to run, e.g. GCC2 vs GCC3.
Yes, you are reducing the time it takes to complete the process. It doesn't reduce CPU-time, it reduces real-time. You know, the real world, in which we live... The only thing that really matters.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
But this can be a false economy...
Every time something that is distributed in binary is rebuilt from source for local use, by definition it's to change some assumption that was inherent in the testing of the original binary (or else the binary distribution would suffice). And with that, some non-0 confidence that was built into the binary release by that testing is wiped out and must be recovered by local analysis and testing (i.e., time and effort) or reduced expectations. Otherwise, it's running on blind faith. This is particularly true with programs that are used frequently, i.e., one expects to depend on them repeatedly. So in my mind, "the best of both worlds" is more meaningful if it refers to fast and reliable apps. I don't care how fast the compiler is if I can't trust the results anymore. That is a different economy equation, and completely justifies the "convenience" of pre-compiled binaries in many applications.
As your CPU gets faster, installing the binary will be quicker too.
But most of all, you will see programs getting a Mozilla complex... Lots and lots of bloat, with no effort going into optimizing anything. KDE and GNOME have that problem. Even formerly lightweight programs like XFce are now heavy programs (thanks in no small part to the bloat of GTK2).
If processing power continues to rise, pretty soon you'll see programming becoming far sloppier, and wasting a lot more time. Sure, you can compile mozilla in under 25 minutes now, but you could do the same with other browsers before Mozilla, when slower CPUs were king. When Mozilla 2 comes along, it'll be massive, and we'll be back where we started. Telling people to waste tons of money on new hardware, rather than paying a bit larger salary for a better programmer who can make a full-featured browser that will run on a 100MHz processor. Think about it, is there anything fundamental that Mozilla can do that Netscape 3 couldn't?
I wouldn't. I compile my programs once, then use them for MONTHS on end. Even a tiny speed improvement in something like Mozilla will save a HUGE amount of time overall, and by far make up for the compile time.
When I am programming, I'll disable optimizations so I can test my changes quicker, but that's not what we are talking about here...
At the moment there's a bug in Linux kernel 2.4.26 that causes the remote compiling systems to encounter a kernel panic (and crash).
It's a known bug and has been discussed on the lkml. The bug is also discussed on the gentoo bugzilla. A patch is also available, though the patch program didn't work for me so I had to apply it manually.
The patch seems to be holding up, too. If you're using distcc on systems with vanilla 2.4.26 kernels, I'd suggest patching them.
-kidlinux.
It's so PAINFULLY SLOW to build anything for the Mac, with this inefficient Objective-C compiler and large linking requirements for Carbon, that without these distributed tools and some G5 servers, it would be hard for us to develop.
Interestingly, our Windows version of this product, built in C#, compiles extremely fast with no distributed trickery needed.
For those using Visual Studio on Windows, I highly recommend a tool called Incredibuild to do the same job. It is not free like distcc, but is very effective and integrates nicely with Visual Studio. It cut my build time for a project at work from 15 minutes to 1 minute 20 seconds. Nice!
kingos
... was just released.
Only available on mirrors, currently.
Belief is the currency of delusion.
Absolutely, I use it at home all the time. It's great for sofa computing: sit on the sofa with a modest laptop, and send your compile jobs across a wireless network to a faster machine in the study.
If you use the LZO compression option then it's quite useful even on 5Mbps wireless.
You can also tell distcc to run as many jobs remotely as possible to keep the laptop from scorching your lap.
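For anyone wanting to try this setup, here is a minimal Python sketch that builds a host list for distcc's `DISTCC_HOSTS` environment variable, with the `,lzo` option appended to each host. The helper name and the hostname "study-pc" are hypothetical, and leaving localhost out of the list is just one assumed way of pushing the compile jobs onto the remote machine; check the distcc man page for your version before relying on it.

```python
def distcc_hosts(remote_hosts, use_lzo=True, include_localhost=False):
    """Build a value for the DISTCC_HOSTS environment variable.

    Each remote host gets the ",lzo" option when use_lzo is set. Leaving
    localhost out of the list is one way to keep compile jobs off the laptop.
    """
    suffix = ",lzo" if use_lzo else ""
    entries = [f"{host}{suffix}" for host in remote_hosts]
    if include_localhost:
        entries.append("localhost")
    return " ".join(entries)

# "study-pc" is a hypothetical hostname for the faster machine in the study.
print(distcc_hosts(["study-pc"]))  # study-pc,lzo
```

The resulting string can be exported as `DISTCC_HOSTS` before running the build.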
It's really nice to be able to build the kernel from source in a reasonable time on an 800MHz machine.
> I'm confused. Which is it supposed to be? Are
> Gentoo users full of crap, or are they correct?
for almost all programs, it's not going to be *noticeably* any faster. on average, you can get maybe 1% to 5% performance improvement from CPU-optimised binaries. this generally isn't worth the time and effort it takes to do the custom compile.
for heavy graphics processing or number crunching, you'll probably notice that. you almost certainly wouldn't notice it on anything else.
so, yes, the gentoo users are full of crap....the same kind of crap that obsessive overclockers are full of when they get so self-impressed by the 1% extra performance that they get out of their combination CPU/egg-frier.