network speed is fine for >40 machines compiling concurrently.
and, imho, a lot of the headers are cached by the systems.
and make is definitely not working correctly for me when i'm using a lot of jobs.
make by itself is not the best tool for compiling with more than, say, 2 or 4 jobs at the same time.
first of all, a make -j4 starts more than 4 jobs at a time under certain conditions, and it can also happen that only one job is running because make is waiting for it to finish before it starts anything else.
this hurts performance if you have some objects that take REALLY long to compile (in my setup i have one object which takes as long to compile as all the other objects split across 40 machines).
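a rough sketch of why the one slow object matters so much. this is a toy simulation of greedy scheduling (each job goes to whichever worker is free first), not what make does internally, and the job counts and durations are made-up numbers chosen so that the big object takes as long as all the others split across 40 workers:

```python
import heapq

def makespan(durations, workers):
    """simulate greedy scheduling: each job goes to the worker that becomes free first."""
    finish = [0.0] * workers
    heapq.heapify(finish)
    for d in durations:
        start = heapq.heappop(finish)       # earliest time any worker is free
        heapq.heappush(finish, start + d)   # that worker is busy until start + d
    return max(finish)

# hypothetical workload: 400 small objects (1 time unit each) and one big object (10 units)
small = [1.0] * 400
big = [10.0]

big_last = makespan(small + big, 40)    # the big object is started after everything else
big_first = makespan(big + small, 40)   # the big object is started right away
```

with these numbers the build takes 20 units if the big object is started last and 11 if it is started first, so the order in which jobs get started can almost double the wall-clock time.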
#1 my makefiles are correct, this is a make issue (really, it took me quite a while to figure out that it's make's fault)
#2 i have more than 40 machines available for compilation (most of them are dual or quad, so it's quite common to compile with more than 40 jobs in parallel, but 40 - 60 is the max i can get due to the slow network)
#3 will have a look at it
#4 first figure out all the objects i have to compile, then compile them on all of the machines, doing everything (including preprocessing) on those machines; they have a shared source / object file system. afterwards do all the linking. using this approach i get an almost linear speed increase, limited only by network io
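to make #4 a bit more concrete, here is a minimal sketch of the "compile everything, then link" idea. it is not my actual tool: the function names are made up, a local thread pool stands in for the remote machines, and the compile / link callables are placeholders for the real commands:

```python
from concurrent.futures import ThreadPoolExecutor

def two_phase_build(objects, libs, compile_one, link_one, jobs):
    """compile every object first, then link; returns the steps in the order they were recorded."""
    steps = []
    # phase 1: compile all objects in parallel, no linking yet
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        for obj in pool.map(compile_one, objects):
            steps.append(("compile", obj))
    # leaving the with-block waits for every compile, so it acts as a barrier
    # phase 2: only now create the libs / binaries
    for lib, members in libs.items():
        link_one(lib, members)
        steps.append(("link", lib))
    return steps

# hypothetical stand-ins for the real compile and link commands
def fake_compile(obj):
    return obj

def fake_link(lib, members):
    return lib

steps = two_phase_build(
    ["a.o", "b.o", "c.o"],
    {"libx.a": ["a.o", "b.o"]},
    fake_compile,
    fake_link,
    jobs=3,
)
```

because no link step can start before the pool has finished, a lib can never be created ahead of its objects, and the compile phase is never stalled by a link.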
i only did the preprocessing on my machine, no compiling, for sure.
i did not try ccache, though. imho this won't help much if i have to do the preprocessing on one machine, or did i misunderstand something?
another thing to note is that make itself has problems with my project: if i compile with >>40 jobs in parallel, make sometimes creates libs before the objects for the libs have been compiled!
and there is another big disadvantage when using make: it sometimes uses far fewer parallel jobs than specified with -j (e.g. when make has to wait for a static lib to be linked).
my approach does not have this problem since i first compile ALL objects and create the libs / binaries afterwards.
i've tried that, but in the project i'm working on the preprocessing takes WAY too long, so my machine can only serve, say, 6 to 8 clients. this severely limits scalability.
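a back-of-the-envelope version of that limit, assuming the preprocessing machine handles one file at a time and ignoring network transfer. the timings below are hypothetical, not measured on my project:

```python
def max_busy_clients(preprocess_seconds, compile_seconds):
    """upper bound on the remote compilers one preprocessing machine can keep busy."""
    # while one client compiles a single file, the preprocessing machine can
    # prepare compile_seconds / preprocess_seconds more files for other clients
    return compile_seconds / preprocess_seconds

# hypothetical timings: 2 s to preprocess a file locally, 14 s to compile it remotely
clients = max_busy_clients(2.0, 14.0)
```

with these numbers the machine can feed at most 7 clients; adding more hosts does not help until the preprocessing itself gets faster or moves to the clients.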
kind regards -ph-
i've recently written a distributed compile tool using a different approach, after using distcc for half an hour...
the big problem with distcc is that it does all the preprocessing on one machine, which is really too much for that machine in some situations (and limits the total speed increase one can gain).
what i've done is to first run a modified version of make and then distribute all the objects which have to be compiled to the machines.
everything is done on these machines (including preprocessing), so the number of compile machines is limited only by the speed of the network (i've been compiling on 60 hosts, with almost linear speed increase).
the only problem with this approach is that the same compiler has to be installed on all of these machines, and that they have to write to some sort of network shared filesystem (for the objects).
having the same compiler everywhere is easy in my case, since i'm using the Intel compiler version 6 and common system includes (i've put them into a shared include dir).
any thoughts about this? (or are some folks willing to help me create a version based on GPL stuff so i could release it?)
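for illustration only, the distribution step could look roughly like this. it is a sketch, not the tool itself: the host names, the /shared paths, the use of ssh and the "cc" placeholder compiler are all assumptions, and the function only builds the command lines without running anything:

```python
import os
from itertools import cycle

def plan_remote_compiles(sources, hosts, compiler="cc"):
    """pair each source file with a host (round-robin) and build the remote command line."""
    plan = []
    for src, host in zip(sources, cycle(hosts)):
        obj = os.path.splitext(src)[0] + ".o"
        # the same path is valid on every host because sources and objects
        # live on the shared filesystem; preprocessing happens on the host too
        plan.append((host, ["ssh", host, compiler, "-c", src, "-o", obj]))
    return plan

# hypothetical hosts and paths
plan = plan_remote_compiles(
    ["/shared/src/a.c", "/shared/src/b.c", "/shared/src/c.c"],
    ["host1", "host2"],
)
```

round-robin is the simplest possible assignment; it takes no account of how long each object takes to compile or how fast each host is.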