Reduce C/C++ Compile Time With distcc
An anonymous reader writes "Some people prefer the convenience of pre-compiled binaries in the form of RPMs or other such installer methods. But this can be a false economy, especially with programs that are used frequently: precompiled binaries will never run as quickly as those compiled with the right optimizations for your own machine. If you use a distributed compiler, you get the best of both worlds: fast compiles and faster apps. This article shows you the benefits of using distcc, a distributed C compiler based on gcc, that gives you significant productivity gains."
I think nc can be used like distcc by redefining CC="nc gcc". However, more commonly it is done by putting $(NC) at the beginning of the build rules. Then you can use nc for any build rules, not just C compiles.
In addition to use with make, nc works well with SCons.
My family can't afford more than one computer, you insensitive clod!
But seriously, is there a way to make use of the concepts embodied in distcc in a home computing environment? Or is distcc designed for use by businesses and schools?
Personally, I think that distcc will become more and more useless as computers get faster. My new machine (P4 2.8, 1 GB RAM, SATA drives) can compile a complete Gentoo desktop system in just about two hours. That's pretty damn cool considering that it used to take like 24 on my old laptop when I first started using Gentoo several years ago. It would probably only take about an hour to set up a Gentoo server system on the same machine, since the biggest component, X, would not need to be compiled.
Computing power is outstripping the size of source code that needs to be compiled. Soon there will be little difference in install time between the source and binary distros, and all the jokes about Gentoo's compile time will be pretty much obsolete. Already, once you have your system installed, the time required to keep current/install new apps is minimal. My system can compile any new program (except maybe OpenOffice) in under 25 minutes. Even Mozilla can be compiled in that time.
Well, as someone who recompiles FreeBSD/DragonFly quite frequently, I've got to say that the best way to reduce the time it takes is to build everything in a ramdisk. I've cut 100 minute compile times down to about half an hour by mounting /usr/obj in a ramdisk instead of on my hard drive.
http://bsdvault.net/sections.php?op=viewarticle&artid=53
We use the distcc that Apple distributes with XCode even though we don't use XCode itself. It really helps to get a few dual-CPU G5s working!
The cool thing about Apple's version is that by default it uses Rendezvous to determine which machines are available to distribute work to.
Reducing compile time by distributing the load isn't reducing it at all, it's just distributing it. Try using a compiler that compiles fast -- such as Plan 9's compilers.
Other than distributed compiler tools like distcc and nc, are there any other ways of speeding up a Linux compile with gcc?
I was blown away when my project group compiled a Qt app that we developed on the Linux platform with the MS VC++ compiler. The compilation took 1/10th the time! We were using Makefiles generated by QMake in both cases.
Should I just switch compilers? If so does anyone have any suggestions?
I've got two Gentoo systems that run distccd.
I did some non-scientific testing with distcc.
System one:
Athlon 1400 XP
512 Meg RAM
System two:
Pentium III 450
512 Meg RAM
I compiled GAIM on system one with distccd running on systems one and two, and also with distccd running on just system one.
I found that with both systems running distccd I got about a two-minute faster compile than with distccd running on just system one.
With distccd running on just system one I found that it would process many of the individual compiles in parallel, à la SMP, even though it's a single-processor system. I've not compared the time on system one with distccd vs. without it, but I imagine that with the parallel compiles it works faster. It all depends on what you set your -jN to, where N is the number of "systems" x 2 + 1. I found that with two systems I could run well with a -j of 7, a bit higher than suggested.
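The -jN rule of thumb can be worked out as a quick sketch. The function name `suggested_jobs` is made up for illustration, and the formula is only the one quoted in this comment (number of systems times two, plus one), not an official distcc recommendation.

```python
def suggested_jobs(num_systems: int) -> int:
    """Rule of thumb from the comment: number of build systems x 2, plus 1."""
    if num_systems < 1:
        raise ValueError("need at least one build system")
    return num_systems * 2 + 1


if __name__ == "__main__":
    # Print the value you would pass to make as -jN for a few farm sizes.
    for n in (1, 2, 3):
        print(f"{n} system(s): make -j{suggested_jobs(n)}")
```

For two systems this gives -j5, so the -j7 that worked well here is indeed a little above the suggested value.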
It is true that many programs with sensitive builds, XFree and Opera for example, turn off the -j option. Not a big deal, it just means a longer coffee break.
Distccd came in very handy when I was installing Gentoo on an old Gateway 2100 Solo laptop. The laptop only has a Pentium 120 and 40 Megs of memory.
I'd suggest distcc for anyone who does quite a few source builds; it's a must for a Gentoo install!
--All programmers are playwrights and all computers are lousy actors.
So, do you really think there is a significant difference between the optimum gcc-compiled output on a P2 and an AthlonXP? Why don't you just save yourself a LOT of time and electricity by transferring the binaries from the fast machine to the slower one?
No, there are parts of KDE, Gnome, X and others, Net-SNMP for instance, that have problems with anything other than make -j1. The make process fails to find a library that it hasn't built yet; passing make -j1 fixes everything. Well, these problems anyway.
"I use a Mac because I'm just better than you are."
If you have a pretty sticker on the front of your computer that says "Intel inside," "AMD Athlon XP," or whatever, and you can click a button corresponding to that, then you know enough to optimize your binaries ;)
I came across distcc by chance about 4 months ago, and I must say, it has hugely improved things around here.
We regularly develop/compile/debug a small-to-moderate sized software package, typically taking about 1 minute per compile. Now, while 1 minute doesn't sound like a long time, it starts adding up when you find yourself recompiling 100+ times a day.
With the inclusion of distcc into the whole situation, we're able to reduce that 1 minute compile down to a little less than 20 seconds; highly appreciated (although now we have fewer excuses to go get a coffee :-( ).
Distcc is a great package which can be extremely useful.
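To put the numbers in this comment together, here is a rough back-of-the-envelope sketch. It rounds the figures to 60 seconds before, 20 seconds after, and 100 compiles a day; the helper name `daily_compile_minutes` is hypothetical.

```python
def daily_compile_minutes(seconds_per_compile: float, compiles_per_day: int) -> float:
    """Total minutes per day spent waiting on compiles."""
    return seconds_per_compile * compiles_per_day / 60.0


# Rounded figures from the comment: ~60 s per compile without distcc,
# ~20 s with it, at roughly 100 compiles per day.
before = daily_compile_minutes(60, 100)
after = daily_compile_minutes(20, 100)
saved = before - after

print(f"before: {before:.0f} min/day, after: {after:.0f} min/day, saved: {saved:.0f} min/day")
```

Under those assumptions, that is a bit over an hour a day of waiting that goes away.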
PLD.
Yes, you're trolling but I feel the urge to bite. Gentoo really isn't about CFLAGS. It's about USE flags. I can build programs as I want with whatever options I want. Portage takes care of the dependencies and my systems are _exactly_ as I want them to be.
If you like binaries, fine, there's no shortage. You use what you want, I'll use what I want. Keep your insults to yourself.
The obvious and most popular answer is encoding video. I think a great many people do a lot of this. Since no processor is fast enough to encode DVD-res video at 16X, it isn't bound by IO speeds either. I can start videos encoding in far less time than it takes to complete the process as well. Pure CPU number-crunching.
Other applications are any form of crypto. Reduce the time you have to wait for PGP to encrypt. Reduce the delay on your SSH sessions.
Then there are databases. Sure, they're often IO bound, but it is commonly a CPU limitation.
Also any heavy-load service. If Apache is serving lots of threads, especially PHP/Perl compiled pages, you are going to be maxing out your CPU.
Then there are the programs that are just bloated. Mozilla/Firefox is still quite slow, and I can open pages far, far faster than they can be rendered. Anything that makes it even 1% faster is very welcome, as those savings eventually add up to large amounts of time.
If you really never use any of those, hooray for you, but most people certainly do.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
No point building Mozilla with GTK2 support if you don't need it, is there? Or Samba with any of the following with question-marks:
It's the fact that you get packages on your system that match the settings you set before, not the fact that you can compile every package with -fomit-frame-pointer, that gives Gentoo its strength.
But this can be a false economy...
Every time something that is distributed in binary is rebuilt from source for local use, by definition it's to change some assumption that was inherent in the testing of the original binary (or else the binary distribution would suffice). And with that, some non-0 confidence that was built into the binary release by that testing is wiped out and must be recovered by local analysis and testing (i.e., time and effort) or reduced expectations. Otherwise, it's running on blind faith. This is particularly true with programs that are used frequently, i.e., one expects to depend on them repeatedly. So in my mind, "the best of both worlds" is more meaningful if it refers to fast and reliable apps. I don't care how fast the compiler is if I can't trust the results anymore. That is a different economy equation, and completely justifies the "convenience" of pre-compiled binaries in many applications.
I wanted to point out: if you are going to try distcc, run it with unsermake, for example to compile KDE. Try searching Google for unsermake. It makes distributed compiling much faster.
Best is: ccache, distcc and unsermake.
How much productivity is lost in setting up the distributed compilation network and environment when compared to just doing a few "./configure; make install" commands?
My main use for distcc currently is building software for my PowerBook.
I do a lot of work with Qt on both Linux and Mac, and let's just say Qt compiles very slowly on my PowerBook (which is an older 800 MHz G4).
Also, I've had to build all of Qt on this machine because the Fink packages are old and don't even use the Mac version (they use the X11 version which really sucks and makes apps on Macs look like crap).
So at work we have a couple of dual G5s I use, and also a few Linux machines which I've built Darwin cross-compilers for (yes, it's a pain in the ass).
I remember compiling 2.0.x kernels on my own 100MHz Pentium. It took -forever-.
Later on, I built a 350MHz K6-2 machine for a customer, and it was a screamer compiling its 2.0.x kernel, taking just a few minutes.
Fast forward: I've got a very similar K6-2 350 as a miscellaneous server and firewall here. Compiling its 2.6 kernel takes -forever-.
But the new 2.4GHz HT 800MHz FSB P4 box I built recently for work is again a screamer, compiling its 2.6 kernel in a few minutes. This box is in roughly the same relative performance league as that K6-2 was Back In The Day.
Moral of this story: The more things change, the more they stay the same. Program and compiler complexity has kept pace with increases in processor speed, leaving the time to compile x more or less constant over at least the past few years.
'Sides, even if you can build a proper desktop system in two hours, distcc serves well to decimate the amount of time required. Whether it is used to cut that two-hour-long run down to 30 minutes, or a 5-minute compile down to 60 seconds, distcc will always have its place[1].
[1]: Yes, yes, I know. People everywhere are saying "But who cares if it takes 5 minutes instead of 60 seconds? It's not like I can't continue using the machine while it's compiling." These people are ignoring the human aspect of the whole thing, which can be summarized as follows: Wife, house, kids, cars, jobs = 4 minutes worth of life that has been rescued from the computer by distcc.
It's so PAINFULLY SLOW to build anything for the Mac, with this inefficient Objective-C compiler and large linking requirements for Carbon, that without these distributed tools and some G5 servers, it would be hard for us to develop.
Interestingly, our Windows version of this product, built in C#, compiles extremely fast with no distributed trickery needed.
The claim also contains the assumption that applications are CPU-bound. All the recompiling in the world won't make something go faster if it's waiting on a disk or a UART or a NIC. Many applications are fast enough anyway -- who cares if /bin/cat gets a 2% improvement of its CPU use? I bet I could add a 20 microsecond gratuitous delay in the main loop of cat, and not noticeably affect its performance!
That said, the kinds of things I would like to have extra-optimized for speed are generally big, huge, complicated things that take forever to compile. Like an Xserver. And that's definitely where distcc could come in handy.
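The CPU-bound point above can be put in rough numbers with Amdahl's law. This is only a sketch: the 2% CPU gain is the figure used in the comment, while the 10% CPU share is a made-up value for an I/O-heavy program, and `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(cpu_fraction: float, cpu_speedup: float) -> float:
    """Amdahl's law: only the CPU-bound fraction of the run time gets faster."""
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    if cpu_speedup <= 0.0:
        raise ValueError("cpu_speedup must be positive")
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)


# Assumed: the program spends 10% of its wall-clock time on the CPU and
# the rest waiting on I/O; optimized flags make the CPU part 2% faster.
s = overall_speedup(cpu_fraction=0.10, cpu_speedup=1.02)
print(f"overall gain: {(s - 1.0) * 100:.2f}%")
```

With those assumed figures the whole program gets only about 0.2% faster, which is the sense in which recompiling something like cat is not worth it, while a big CPU-heavy program is a different story.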
Worryingly the article does not mention *at all* the obvious security questions. If you run a distcc service on a host then who is authorized to connect to it and compile programs? How do they authenticate? What about protection against man-in-the-middle attacks (you may not be paranoid enough to worry about people fiddling with the object code before it is sent back, but at least you ought to know if it's possible). I hope it's not another case of 'ignore security in the service, but it's okay, we'll just put it behind a firewall'.
FWIW, distributed compilation programs like distcc are a good reason to check for buffer overruns and other memory trampling in the compiler. If you've ever managed to segfault gcc by feeding it a bad piece of code, there is a potential exploit via distcc if you can craft a C program that makes the compiler misbehave in the way you want.
-- Ed Avis ed@membled.com
Xcode build system: Distributing Builds Among Multiple Computers
Yes, Apple has come standard with distcc for quite some time.
“Common sense is not so common.” — Voltaire