Slashdot Mirror


23 Second Kernel Compiles

b-side.org writes "As a fine testament to how quickly Linux is absorbing technology formerly available only to the computing elite, an LKML member posted a 23 second kernel compile time to the list this morning as a result of building a 16-way NUMA cluster. The NUMA technology comes gifted from IBM and SGI. Just one year ago, a Sequent NUMA-Q would have cost you about USD $100,000. These days, you can probably build a 16-way Xeon (4X 4-way SMP) system off of eBay for two grand, and the NUMA comes free of charge!"

5 of 222 comments

  1. Tempting... by JoeLinux · · Score: 4, Insightful

    OK... I'm NOT about to start the proverbial deluge of people wanting to know about a beowulf cluster of these things. But what I will ask is this: if it can do that for a kernel, I wonder how long it will take to do Mozilla, or XFree? It'd be interesting to see those stats.

    JoeLinux

  2. this may be good but... by m0RpHeus · · Score: 3, Insightful

    This may be good news, but what the heck! They should have at least included the .config that they used so that we can know what drivers/modules were compiled with it, or maybe this is just a bare-bones kernel, enough to run the basics. We need to know the complexity of the configuration before we can really say it's fast.

    --
    Take-off every .sig! For Great Justice!
  3. HELLLOOOOOO??? by Anonymous Coward · · Score: 4, Insightful

    You can't build a NUMA cluster worth a crap without a fast, low-latency interconnect.

    Sequent's NUMA Boxen use a flavor of SCI (Scalable Coherent Interface) which is integrated into the memory controller.

    While you can use some sort of PCI-based interconnect, the results are just plain not worth it.

    Infiniband should be better, though I've heard the latency is too high to make this a marketable solution.

    Keep your eyes on IBM's Summit chipset based systems. These are quads tied together with a "scalability port" and go up to 16-way. They should go to 32 or higher by 2003. That's when NUMA will -finally- be inevitable...

  4. Re:Why? by LinuxHam · · Score: 5, Insightful

    but why would you want to compile a kernel in 23 seconds?

    I think this benchmark is used time and time again because it's really the only one that nearly any Linux user would be able to compare their own experiences to. If they said 1.2 GFLOPS, I (and I suspect most others) could only say "Wow, that sounds like a lot. I wonder what that looks like." OTOH, I have seen how long it takes to download 33 Slackware diskettes in parallel on a v.34 modem, and I still run 3 P75's today.

    I've been told that I will soon be deploying Beowulf HPC clusters to many clients, including universities and biomedical firms. If they were to tell me that the clusters will be able to do protein folds (or whatever they call it -- referring back to the nuclear simulation discussion) in "only 4 weeks", I won't have a clue as to how to scale that relative to customary performance of the day.

    Sure, there are many other applications that are run on clusters, but kernel compiles are the ones that all of us do. It can give us an idea of what kind of performance you'd get out of other processor-intensive operations. And many people will tell you there are so many variables with kernel compiles that it's ridiculous to compare the results.

    Check out beowulf.org and see what people are doing with cluster computing. I've always wanted to open a site that compiles kernels for you. Just select the patches you want applied and paste the .config file. I'll compile it, and send back to you by email a clickable link to download your custom tarball. Of course no one here would trust a remotely compiled kernel :)

    --
    Intelligent Life on Earth
  5. Re:hmph by Paul Jakma · · Score: 3, Insightful

    what about the interconnect? the machine in question is /not/ a simple beowulf cluster, it's NUMA. Non Uniform Memory Architecture, which implies there is some form of memory architecture, and that the main difference between that architecture and that of a normal computer is that it is non-uniform.

    Ie, the CPUs in this computer share a common address space and can reference any memory, just that some memory (eg located at another node) has a higher cost of access than other memory. (as opposed to a typical SMP system where all memory has an equal 'cost of access').
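
    To make the "cost of access" idea concrete, here is a minimal toy model in Python. The latency numbers (1.0 local, 3.0 remote) and the function names are made up for illustration; they are not measurements from any real machine.

```python
# Toy model: cost of a memory access on a NUMA machine vs. a uniform (SMP) one.
# All latency numbers are hypothetical illustrations, not measurements.

LOCAL_COST = 1.0   # assumed cost of touching memory on the CPU's own node
REMOTE_COST = 3.0  # assumed cost of touching memory on another node


def access_cost(cpu_node: int, mem_node: int, numa: bool = True) -> float:
    """Return the modeled cost of one access from cpu_node to mem_node."""
    if not numa:
        return LOCAL_COST  # uniform: every access costs the same
    return LOCAL_COST if cpu_node == mem_node else REMOTE_COST


def average_cost(cpu_node: int, nodes: int, numa: bool = True) -> float:
    """Average cost when accesses are spread evenly over all nodes' memory."""
    return sum(access_cost(cpu_node, m, numa) for m in range(nodes)) / nodes
```

    In this model any CPU can still reach any memory; only the price differs. With four nodes, a CPU that spreads its accesses evenly pays more on average than it would on a uniform machine, because three quarters of its accesses are remote.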

    at the moment, under linux, this implies that there is special hardware in between those CPUs to provide the memory coherency - ie lots of bucks - cause there is no software means of providing that coherency (least not in linux today).

    NB: normal linux SMP could run fine on a NUMA machine (from the memory management POV), but it would be slower because it would not take the non-uniform bit into account.
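
    A rough sketch of why ignoring the non-uniform bit costs performance, again as a toy Python model. The placement policies, costs, and names below are hypothetical and do not describe how the Linux memory manager actually works.

```python
# Toy comparison of NUMA-unaware vs. NUMA-aware page placement.
# Costs and policies are hypothetical; this is not the Linux allocator.

LOCAL_COST = 1.0   # assumed cost of accessing a page on the CPU's own node
REMOTE_COST = 3.0  # assumed cost of accessing a page on another node


def place_round_robin(n_pages: int, nodes: int) -> list[int]:
    """NUMA-unaware: spread pages over nodes without regard to the CPU."""
    return [page % nodes for page in range(n_pages)]


def place_local_first(n_pages: int, nodes: int, cpu_node: int,
                      pages_per_node: int) -> list[int]:
    """NUMA-aware: fill the CPU's own node first, then spill to the others."""
    if n_pages > nodes * pages_per_node:
        raise ValueError("not enough memory for the requested pages")
    order = [cpu_node] + [n for n in range(nodes) if n != cpu_node]
    placement: list[int] = []
    for node in order:
        take = min(pages_per_node, n_pages - len(placement))
        placement.extend([node] * take)
    return placement


def total_cost(placement: list[int], cpu_node: int) -> float:
    """Cost of touching every page once from cpu_node."""
    return sum(LOCAL_COST if node == cpu_node else REMOTE_COST
               for node in placement)
```

    Both policies produce a working layout, which matches the point that a plain SMP kernel would run; the unaware one simply ends up paying the remote cost for most pages.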

    anyway... despite what the post says, this machine is /not/ a collection of cheap PCs connected via 100/1G ethernet or other high-speed packet interconnect.

    --
    I use Friend/Foe + mod-point modifiers as a karma/reputation system.