Slashdot Mirror


Optimizing distcc

IceFox writes "Having fallen in love with distcc and its ability to speed up compiling (insert anyone who compiles, like Gentoo users or Linux developers), I recently got the chance to dive deeper into distcc. By itself distcc will decrease your build times, but did you know that if you tweak a few things you can get much better compile times? Through a lot of trial and error, tips from others, profiling, testing and just playing around with distcc, I have put together a nice big article. It shows how developers can get a bigger bang for their buck out of their old computers and distcc with just a few changes."

7 of 201 comments (clear)

  1. ccache by Lord+of+Ironhand · · Score: 4, Interesting

    ccache is also nice for optimizing compiling. He probably mentioned it in the article, but since it seems /.-ed I wouldn't know... and by the time you've got both distcc and ccache running the article might be available again so you can read if you did it the right way :-)

  2. Re:Article Text (Slashdotted Server) by Lord+of+Ironhand · · Score: 2, Interesting
    If it was posted _seconds earlier_, then how could he know he was being redundant?

    He couldn't. It's simply a risk you take when posting the article. The moderation system is intended to improve things for the reader, not to judge his (undoubtedly good) intentions. You have a point though, maybe Redundant moderations shouldn't decrease karma, just like Funny doesn't increase it.

    btw, posting the article as non-AC is viewed by many as karma whoring, so it's not recommended anyway.

  3. jobs/cpu? by swebster · · Score: 2, Interesting
    Try putting your localhost machine first in the list, in the middle and at the end. Normally you want to run twice the number of jobs as processors that you have. But if you have enough machines to feed, running 2 jobs on the localhost can actually increase your build times.
    About the "Normally you want to run twice the number of jobs as processors" part... is that really true? I thought it was best to just run 1 job/cpu by a long shot. Am I confused or is he?
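
    The jobs-per-CPU question can be explored with a small sketch. It is only a rough illustration: it parses a DISTCC_HOSTS-style string (the "host/N" job-limit form) and sums the limits as a starting value for `make -j`. The host names, the fallback limits of 4 (remote) and 2 (localhost), and the helper names are assumptions for illustration; whether 1 or 2 jobs per CPU wins still has to be measured on the actual farm.

```python
# Hypothetical helper: estimate a `make -j` value from a DISTCC_HOSTS-style string.
# Assumption: entries without an explicit "/N" limit fall back to the defaults below.
DEFAULT_LIMIT = 4          # assumed fallback for remote hosts
DEFAULT_LOCAL_LIMIT = 2    # assumed fallback for localhost


def parse_hosts(hosts):
    """Return (hostname, job_limit) pairs in the order they were listed."""
    parsed = []
    for entry in hosts.split():
        entry = entry.split(",")[0]          # ignore trailing options such as ",lzo"
        name, _, limit = entry.partition("/")
        if limit:
            jobs = int(limit)
        elif name == "localhost":
            jobs = DEFAULT_LOCAL_LIMIT
        else:
            jobs = DEFAULT_LIMIT
        parsed.append((name, jobs))
    return parsed


def total_jobs(hosts):
    """Sum the per-host limits: a starting point for `make -jN`, not a rule."""
    return sum(jobs for _, jobs in parse_hosts(hosts))


if __name__ == "__main__":
    hosts = "localhost/1 buildbox1/4 buildbox2"   # made-up host list
    print(parse_hosts(hosts))
    print("make -j%d" % total_jobs(hosts))
```

    Moving the localhost entry to the front, middle or end of the string and timing each build is then a matter of permuting the list, as the quoted article suggests.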
  4. Re:behind the XCode curtain by jcr · · Score: 4, Interesting

    Yes, it is. This was described in the XCode session at WWDC last year.

    I had a project that took about 15 minutes to build on my Dual G4. I turned on distributed builds in XCode, and it dropped to 2 minutes. Turns out that about a dozen of my colleagues on my subnet are running the same build of our developer tools as I am.

    distcc rocks.. Whoever thought it up should get the appropriate "special award for extreme cleverness."

    -jcr

    --
    The only title of honor that a tyrant can grant is "Enemy of the State."
  5. distributed codebase by Doc+Ruby · · Score: 2, Interesting

    It would be cool to use a distcc client which took my local code diffs, distributed them around the Internet, patched the distributed "standard" version, cc'd the code, and sent back binaries to my client. Crypto hashes against the revised code could ensure that I was really getting binaries from my actual uploaded diffs. But then everyone with "difstcc" would be recompiling so much that we'd each return to our original CPU bandwidth ratios :).
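
    The hash check described above might look roughly like this. It is a sketch only: the function names and the idea that a worker reports back the digest of the source it compiled are assumptions, and nothing here is part of distcc. It also only detects a worker that built from the wrong source; it cannot prove the returned binary was honestly produced from that source.

```python
import hashlib


def source_digest(patched_source: bytes) -> str:
    """SHA-256 hex digest of the patched source the client expects to be built."""
    return hashlib.sha256(patched_source).hexdigest()


def accept_result(expected_digest: str, reported_digest: str) -> bool:
    """Accept a binary only if the worker reports the digest the client computed."""
    return expected_digest == reported_digest


if __name__ == "__main__":
    # Made-up source text standing in for "standard version + my diffs".
    local = b"int main(void) { return 0; }\n"
    expected = source_digest(local)

    # A worker that applied the same diffs reports the same digest.
    print(accept_result(expected, source_digest(local)))

    # A worker that patched something else is rejected.
    print(accept_result(expected, source_digest(b"int main(void) { return 1; }\n")))
```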

    --
    make install -not war

  6. Re:Recursive Make Considered Harmful by ewhac · · Score: 3, Interesting

    Seconded.

    When I was at Be, Inc. (RIP), one of our engineers, motivated largely by the above-referenced article, converted our entire build environment to a non-recursive structure using gmake. The result was a large speedup, as well as more effective use of multiple processors (which BeOS utilized very well). gmake would grovel over the build tree for a minute or two, then launch build commands in very quick succession. 'Twas great.

    Schwab

  7. Re:Why wasn't a factorial experiment used? by DarkMan · · Score: 3, Interesting

    Probably because it wasn't needed. And secondly, factorial DOE isn't as good as you're implying it to be.

    Factorial DOE is useful if you have multiple measurable, continuous or quasi-continuous [0] factors, and want to optimise - particularly when there is some trade-off. In this case, however, most of the variables that were altered were clearly discrete (this version of make, or that version of make, for example), or it was clear that the optimum was at an extreme (more CPU speed is always good, for example).

    So, the only factor I can see that would be suitable for a factorial DOE is the number of machines in the farm. Except, each machine is different, so that's effectively an n-dimensional set, with 2 options on each dimension, for n machines. If you're going to do the stats, you'd want to do them properly, so no handwaving them all together there.

    Plus, this is a deterministic situation. There is no real need for empirical analysis - you can do it all from first principles, which would be much more efficient, I think. And, indeed, that's what the author did - by looking at the theoretical background of it all, to use different makes and so on, to optimise.

    Finally, if you think that a factorial DOE will get you a global optimum solution, then you're sadly mistaken. It's a good procedure for optimising, and it can avoid some local minima - but it's not guaranteed to find a global minimum. The only guaranteed method I'm aware of is simulated annealing - and if you've got a faster method, I, and a large number of people doing numerical calculations, would love to hear it.

    Oh, and the aim here was _not_ to find a global minimum. It was to get something that was good enough. Trying for better than that is wasted effort.

    [0] For example, the set of integers from 0 to 1000 is quasi-continuous. It's not really continuous, but it's close enough for real purposes.
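
    To make the "2 options on each dimension, for n machines" point concrete, here is a minimal sketch of a full two-level factorial over a farm, where each machine is either in or out. The machine names, speeds, overhead and the cost model are all invented for illustration and are not measurements from the article; a real run would time an actual build for each combination.

```python
from itertools import product

# Hypothetical per-machine speeds in arbitrary units; not measured data.
SPEEDS = {"fast": 4.0, "medium": 2.0, "slow": 0.2}
OVERHEAD_PER_HOST = 1.0   # assumed fixed cost of feeding each extra host
WORK = 100.0              # assumed total amount of compile work


def build_time(active):
    """Toy model: work divided by combined speed, plus a per-host overhead."""
    if not active:
        return float("inf")   # no machines, no build
    return WORK / sum(SPEEDS[h] for h in active) + OVERHEAD_PER_HOST * len(active)


def full_factorial(hosts):
    """Yield (active_hosts, modelled_time) for all 2**n in/out combinations."""
    for mask in product((False, True), repeat=len(hosts)):
        active = tuple(h for h, on in zip(hosts, mask) if on)
        yield active, build_time(active)


if __name__ == "__main__":
    runs = list(full_factorial(list(SPEEDS)))
    print(len(runs), "combinations")
    print(min(runs, key=lambda r: r[1]))
```

    The number of runs doubles with every machine added, which is why enumerating everything stops being practical for a large farm and "good enough" becomes the sensible target.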