Slashdot Mirror


23 Second Kernel Compiles

b-side.org writes "As a fine testament to how quickly Linux is absorbing technology formerly available only to the computing elite, an LKML member posted a 23 second kernel compile time to the list this morning as a result of building a 16-way NUMA cluster. The NUMA technology comes gifted from IBM and SGI. Just one year ago, a Sequent NUMA-Q would have cost you about USD $100,000. These days, you can probably build a 16-way Xeon (4X 4-way SMP) system off of eBay for two grand, and the NUMA comes free of charge!"

146 of 222 comments

  1. hmph by Jukashi · · Score: 1

    2 grand? i think not :(

    1. Re:hmph by showboat · · Score: 1

      That's what I was thinking...

      But, Here's a 4-P3 system, and it's nearly $1.5g, and a cpu alone isn't more than $50, so...

    2. Re:hmph by showboat · · Score: 1

      ok, i'm not sure what your #2 means, but I see that the equip only takes 8 processors, so you'd need two of these (+the extra proc.s that aren't included)... and the reserve's not met, so it'll probably cost upwards of $4g. damn.

    3. Re:hmph by Paul+Jakma · · Score: 3, Insightful

      what about the interconnect? the machine in question is /not/ a simple beowulf cluster, it's NUMA. Non Uniform Memory Architecture, which implies there is some form of memory architecture, and that the main difference between that architecture and that of a normal computer is that it is non-uniform.

      Ie, the CPUs in this computer share a common address space and can reference any memory, just that some memory (eg located at another node) has a higher cost of access than other memory. (as opposed to a typical SMP system where all memory has an equal 'cost of access').

      at the moment, under linux, this implies that there is special hardware in between those CPUs to provide the memory coherency - ie lots of bucks - cause there is no software means of providing that coherency (least not in linux today).

      NB: normal linux SMP could run fine on a NUMA machine (from the memory management POV), but it would be slower because it would not take the non-uniform bit into account.

      anyway... despite what the post says, this machine is /not/ a collection of cheap PCs connected via 100/1G ethernet or other high-speed packet interconnect.
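The local-versus-remote cost described above can be put into a toy model. This is only a sketch: the latencies, the node count and the 90% locality figure are invented for illustration and are not measurements of the machine in the story.

```python
# Toy model of average memory access cost on a NUMA machine.
# Every number below is a hypothetical placeholder, not a measurement.

def average_access_ns(local_ns, remote_ns, local_fraction):
    """Expected cost of one memory access when `local_fraction` of
    accesses are served from memory on the CPU's own node."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns

LOCAL_NS = 100.0   # hypothetical latency to memory on the same node
REMOTE_NS = 400.0  # hypothetical latency to memory on another node
NODES = 4          # e.g. four quads

# NUMA-unaware kernel: pages end up on any node with equal probability.
unaware = average_access_ns(LOCAL_NS, REMOTE_NS, 1.0 / NODES)

# NUMA-aware kernel: assume 90% of accesses stay on the local node.
aware = average_access_ns(LOCAL_NS, REMOTE_NS, 0.9)

print(f"unaware: {unaware:.0f} ns, aware: {aware:.0f} ns")
```

Under these assumed numbers the unaware case still works, it is just slower on average, which is the point made about running plain SMP Linux on NUMA hardware.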

      --
      I use Friend/Foe + mod-point modifiers as a karma/reputation system.
    4. Re:hmph by Anonymous Coward · · Score: 1, Informative

      brilliant that you're the only person who caught that, and no one has modded you up. ;) just further proof that the /. editors (not to mention the average reader/moderator) don't actually know anything at all about technology, and aren't in any way interested in fact checking submissions.

    5. Re:hmph by spikedvodka · · Score: 1

      or don't have Mod points at this point

      --
      I will not give in to the terrorists. I will not become fearful.
  2. Tempting... by JoeLinux · · Score: 4, Insightful

    ok..I'm NOT about to start the proverbial deluge of people wanting to know about a beowulf cluster of these things. But what I will ask is this: if it can do that for a kernel, I wonder how long it will take to do Mozilla, or XFree? It'd be interesting to see those stats.

    JoeLinux

    1. Re:Tempting... by showboat · · Score: 1

      I concur: let's get us some benchmarks!

    2. Re:Tempting... by kigrwik · · Score: 1

      You might want to add OpenOffice to that list.

      --
      -- don't discount flying pigs until you have good air defense
    3. Re:Tempting... by castlan · · Score: 4, Interesting

      A Beowulf cluster of these? That's so 2 years ago... I'd love to see a NUMA-linked cluster of these! And I wonder how long it would take that cluster running GNOME under XFree86 to have Mozilla render this page nested at -1!

      Seriously, I wonder how long it takes to boot. Every NUMA machine I've ever used took more than its fair share of time to boot... much more than a standard Unix server. It would be pretty funny if compiling the kernel turned out to be trivial compared to booting!

    4. Re:Tempting... by hansendc · · Score: 4, Informative

      Seriously, I wonder how long it takes to boot.

      They do take a good bit of time to boot. In fact, it makes me much more careful when booting new kernels on them because if I screw up, I've got to wait 5 minutes, or so, for it to boot again! But, they do boot a lot faster when you run them as a single quad and turn off the detection of other quads.

  3. $500 for a quad xeon? by Anonymous Coward · · Score: 3, Informative

    No way. Just a no-CPU, no-memory case and
    motherboard costs $500. More like $2000
    to $3000 for an old quad.

    1. Re:$500 for a quad xeon? by Wells2k · · Score: 5, Informative

      No way. Just a no-CPU, no-memory case and
      motherboard costs $500. More like $2000
      to $3000 for an old quad.


      I am actually in the process of building a quad xeon right now with bits and pieces I bought off of eBay, and this is certainly doable. Not sure about the $500, but $2000-$3000 is high. I have the motherboard and memory riser now for $150, I am pretty sure that I can get a used rackmount case for $100 or so, the CPUs are going to cost around $60-70 each (P-III Xeon 500s), and memory is cheap as well.

      I figure I will be in it for around $1000 in the end. Yes, $500 is a low number, but I also know that your estimate of $2000-3000 is high.
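The prices quoted in this post can be tallied in a few lines. The sketch below uses only the figures given above. Memory, drives and anything else the post does not price are treated as the unknown remainder up to the roughly $1000 total.

```python
# Tally of the part prices quoted in the post (US dollars).
board_and_riser = 150        # motherboard plus memory riser
case = 100                   # used rackmount case, "or so"
cpu_low, cpu_high = 60, 70   # per P-III Xeon 500
cpus = 4
target_total = 1000          # the poster's expected final cost

known_low = board_and_riser + case + cpus * cpu_low
known_high = board_and_riser + case + cpus * cpu_high

# Whatever is left would have to cover memory, drives and the rest.
remainder_low = target_total - known_high
remainder_high = target_total - known_low

print(f"priced parts: ${known_low}-${known_high}")
print(f"left for memory, drives, etc.: ${remainder_low}-${remainder_high}")
```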

    2. Re:$500 for a quad xeon? by Lumpy · · Score: 2

      Bzzzt...

      Old compaq ML-530 with 4 processors.. $850.00

      I just got one off of ebay...

      Problem: Hard drives for an ML530 are overpriced modified drives that you HAVE to buy from compaq.
      I puked when I found out the 4 drives I need to get this running will cost me $3500.00 from compaq.

      Damn those jerks and their custom hot-plug sleds.. why can't they use standard hotplug drives and mounts? (Just like the rack mount... a compaq WILL NOT mount in a standard rack without heavy modification.)

      --
      Do not look at laser with remaining good eye.
    3. Re:$500 for a quad xeon? by cmkrnl · · Score: 2, Informative


      Have to buy hard drives from CPQ my arse! What's stopping you putting in a stock U160 controller and hanging drives off the back of that? $3500 buys a LOT of Atlas 10K3s these days.

      The CPQ drive rails mount standard SCSI drives and are freely available if you talk to your nearest CPQ reseller nicely. Compaq use stock quantum, fuji and seagate drives, they just change the vendor mode page entry to make it look like they are 'adding value' somehow.

      Curmudgeon

    4. Re:$500 for a quad xeon? by DivideByZero · · Score: 1

      I puked when I found out the 4 drives I need to get this running will cost me $3500.00 from compaq.
      ...why can't they use standard hotplug drives and mounts?

      Because they can't get $3500 for standard hardware? (See Connector Conspiracy for more examples)

      Maybe there's a reason you only paid $850 for a loaded quad system?
    5. Re:$500 for a quad xeon? by NetJunkie · · Score: 2

      Does it use the older beige drives or the new ultra2/3? I may be able to help you.

    6. Re:$500 for a quad xeon? by Cramer · · Score: 1

      Actually, they used to put their own buggy-ass microcode in there. A quick trip through DOS with the Quantum firmware utility and they are suddenly back to being true Quantum drives.

      Sun did the same thing with Seagate OEM drives. Apple did the same thing with Conner drives.

    7. Re:$500 for a quad xeon? by Cramer · · Score: 1
      • custom hot-plug sleds
      As opposed to whose "standard"? DEC, Compaq, IBM, Kingston, Antec, etc? There's no such thing as a standard drive mounting sled. Even the AT drive mounting rails (an actual standard) were never useful - I have thousands of those rails and in 20 years, I've yet to see a case that uses them.
    8. Re:$500 for a quad xeon? by cmkrnl · · Score: 1


      I remember, unexplained lockups, data losses on arrays, absolutely crap write performance.

      These days I am reliably informed that the drives ship from the OEM directly with the vendor string changed and that's it. Seemingly it cost too much to maintain their own microcode for yet A.N Other drive.

      Curmudgeon

  4. 42 seconds by decep · · Score: 4, Informative

    23 seconds is impressive. I, personally, have seen a 42 second compile time of a 2.2 series kernel on an Intel 8-way system (8GB RAM, 8 550MHz PIII Xeons w/ 1MB L2). It was in the 1 minute range with a 2.4 kernel.

    Definitely the most impressive x86 system I have ever seen.

    1. Re:42 seconds by kigrwik · · Score: 5, Funny

      Arthur Dent: Ford, I've got it ! "What's the kernel compile time in seconds on an Intel 8-way Xeon ?"
      Ford: 42 ! We're made !

      --
      -- don't discount flying pigs until you have good air defense
    2. Re:42 seconds by Anonymous Coward · · Score: 1, Funny

      That's nothing. On my 40MHz 386DX, compiling the 1.2 series kernel used to take just 90 minutes.

    3. Re:42 seconds by VAXman · · Score: 2

      Bah. When I first started with Linux (around the 0.99pl12 days), the kernel compile took 4.5 hours on my 386SX-16 with 4MB of RAM.

      A year or two ago I compiled the NetBSD kernel on a MicroVAX II with 8MB of RAM and it took about 24 hours.

    4. Re:42 seconds by hansendc · · Score: 2

      Linux has booted on the Sequent NUMA-Q stuff for a while. Martin was the one who first added the support. 2.5 is adding things like Pat Gaughen's discontig-mem and the ability to bind tasks to a single quad for scheduling, and memory allocation. Basically, the kernel support is very good, and very stable; I use these machines every day.

    5. Re:42 seconds by Bert64 · · Score: 1

      Someone recently managed 20 seconds for a 2.4.0 kernel compile on Sparc64 (E10k with 20 cpus), the url was somewhere on linuxcare.com.au but i forget the exact address, maybe someone will post it up.. i liked the dmesg log from that box too.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
  5. Is it worth it? by erik+umenhofer · · Score: 1

    Ok, it compiles a kernel hella fast. But can it be applied to other stuff? Obviously these fast compiling machines are best for places where you have tons of users compiling all over the place, but will other compilers be as fast?

    1. Re:Is it worth it? by Arrian · · Score: 1

      You mean, will it run Counter-Strike server?

    2. Re:Is it worth it? by jejones · · Score: 2

      Yes, it can be applied to other stuff...and there are always CPU-bound problems that can use the speed. (I hope someone knowledgeable about current computer graphics technology can comment on what could be done with the machine under discussion.)

    3. Re:Is it worth it? by castlan · · Score: 2, Insightful

      By computer graphics technology, do you mean a render-farm? That would be much better suited to a standard Beowulf cluster, because the interprocess communication is minimal. That is an example of an "embarrassingly parallel" computing problem. As for live graphics, an Onyx workstation doesn't benefit from CPU power so much as its Reality Engine/Infinite Reality graphics pipeline. When you need better graphics performance, you can utilize multiple graphics pipelines. Some of the Onyx 3000s can use (I think) as many as 16 different IR3s for improved graphics output, like in RealityCenters.

      The point of this article isn't that kernel compilation is fast because it is usually CPU bound, and 16 CPUs alleviate that problem. In fact kernel compilation isn't strictly CPU bound... there are other performance limits too, especially disk performance. The significance of this article is that multithreaded kernel compiles benefit from the increased interprocess communication potential in NUMA architectures... performance would be much worse trying to spread that across a Beowulf cluster.

      While rendering (not displaying) graphics or running basic number crunching does not benefit much from a NUMA setup as compared to a Beowulf style setup, some complex equations do benefit... computing the first million digits of Pi would use interprocess communication, as would large scale data mining applications. It's been a few years since I was there, but I saw a huge cluster of Origin 2000s CC-NUMAed together with one Onyx 2, which handled displaying the results of the data mining. (An Onyx2 is basically an Origin 2000 with a graphics pipeline. An Onyx 3000 without any graphics bricks is an Origin 3000.)

    4. Re:Is it worth it? by oxfletch · · Score: 2, Informative

      These machines were designed to run huge databases. The IO scalability isn't there in Linux yet as it was in Dynix/PTX, and there hasn't been so much work on the scalability of Linux as there has on PTX, but it'll get there pretty soon ;-)

      So yes, it will apply to other stuff, though maybe not as well as it could, quite yet.

  6. Was I the only one... by leviramsey · · Score: 4, Funny

    ...who wondered, "I didn't know that Clive Cussler had gotten into cluster design?"

  7. in 1996 (Re:42 seconds) by Chexum · · Score: 2, Interesting
    Dave S. Miller (the Sparc guy) boasted in a post about a 42 second kernel compile. Although the exact article is not available in web archives, at least two quotes survive, on a 68k list and a Hungarian Linux list.

    Remember, this was in 1996. Now, how much did we progress in the last five-six years? :)

    --
    "Ten years from now, they could do it in a few seconds." -- The Racketeer of the Hellfire Club, 1993, Phrack 42
    1. Re:in 1996 (Re:42 seconds) by BJH · · Score: 2

      Well, considering a 36-processor SGI Challenge with 5GB of RAM would have cost you several multiples of six figures in 1996, I don't really think the comparison's valid...

    2. Re:in 1996 (Re:42 seconds) by Cramer · · Score: 1

      Sparc hardware hasn't advanced as much. It certainly hasn't advanced at the same speed of the linux kernel complexity. (you can read that as "bloat" if you wish.)

      In 1996, the fastest processor was 200MHz? Now processors are around 400MHz or 700MHz if you have really new (and expensive) hardware. So, the hardware is 2x to 4x faster and the kernel is ~10x more complex. For comparison, in 1996, my 486dx50 with 16M of 60ns FPM DRAM and a 1.2MB/s IDE drive compiled the kernel in about 5 minutes. My modern dual Athlon MP 1500 with 256M of PC2100 DDR SDRAM and 19MB/s IDE drive compiles the kernel in just under 2 minutes. So, that's 52x more cpu power, 16x more memory that's 20x faster, and a drive that's 16x faster, yet it compiles the kernel only 4 times faster.
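The ratios in the comparison above can be checked with a little arithmetic. This is a rough sketch with two assumptions that are not in the post: the Athlon MP 1500+ is taken to run at 1333 MHz per CPU, and "just under 2 minutes" is taken as 120 seconds. The memory speed ratio is left out because the post does not give enough to derive it.

```python
# Figures from the post, plus two labeled assumptions.
old = {"cpu_mhz": 50, "cpus": 1, "ram_mb": 16,
       "disk_mb_s": 1.2, "compile_s": 5 * 60}
new = {"cpu_mhz": 1333,          # assumption: Athlon MP 1500+ clock rate
       "cpus": 2, "ram_mb": 256, "disk_mb_s": 19,
       "compile_s": 2 * 60}      # assumption: "just under 2 minutes" ~ 120 s

# Raw clock-times-CPU-count ratio; ignores per-clock improvements.
cpu_ratio = (new["cpu_mhz"] * new["cpus"]) / (old["cpu_mhz"] * old["cpus"])
ram_ratio = new["ram_mb"] / old["ram_mb"]
disk_ratio = new["disk_mb_s"] / old["disk_mb_s"]
compile_speedup = old["compile_s"] / new["compile_s"]

print(f"cpu x{cpu_ratio:.1f}, ram x{ram_ratio:.0f}, "
      f"disk x{disk_ratio:.1f}, compile x{compile_speedup:.1f}")
```

With these inputs the CPU, memory size and disk ratios land close to the 52x, 16x and 16x quoted, while the compile-time ratio comes out nearer 2.5x than 4x, which only strengthens the poster's point about hardware gains outpacing compile-time gains.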

    3. Re:in 1996 (Re:42 seconds) by Cramer · · Score: 2, Informative

      Stop hiding behind the AC and people might pay you attention.

      You appear to be equating clocking and processor speed like apples and oranges. They aren't. If we consider all of the technological advances in the modern ia32 processors vs. its earlier brethren, then the comparison is even less favorable... Modern processors should be exceptionally faster. But they aren't. There are two primary reasons for this: increasing inefficiency, and increasing complexity. Present day programmers are far less motivated to write "good code" because they live in the fallacy that the processor is fast enough to run anything. ("No one will notice the difference.") In fact, they are generally incapable of generating efficient code as they've never been taught to think that way. These people will surely spend an eternity in computing hell writing programs in BASIC on 1MHz machines that have 32x16 character console displayed on a 12" BW TV. (Any resemblance to the movie Brazil is unintentional.)

      Complexity breeds more inefficiency. As the saying goes, "Make it work. Then make it fast."

      As for my comments about Sparc... Unless Sun is deploying reverse engineered alien technologies, the core of the processor (ie. how it adds and subtracts) hasn't changed much. It's the clock speed (how fast it runs through the "add" procedure) that makes it faster. The efficient adaptation of code to the native 64bit environment also helps a lot. (Better code + better compiler yields faster execution.)

  8. Why? by James_G · · Score: 2

    Maybe this is a silly question.. but why would you want to compile a kernel in 23 seconds? I mean, ok, it's cool and everything, but is there some hidden application of this that I'm not seeing? Or are people really devoting hardcore time to this just because they can?

    6 years ago, a kernel compile for me took about 3 hours. These days, it takes less than 3 minutes, which is more than fast enough for my needs. So, you can push it down to 23 seconds with a few thousand $ - what's the point? Someone help me out here!

    1. Re:Why? by quigonn · · Score: 2

      The lower the compile time, the better. That's especially useful for companies that have lots of software to compile, e.g. Linux distributors.

      --
      A monkey is doing the real work for me.
    2. Re:Why? by showboat · · Score: 1

      Unless I am completely daft (please note if so), then we're talking about a general progression in processing power, a certain configuration of which (e.g., compiling a piece of software) can serve as a preliminary benchmark against other systems (considering it's a task many are familiar with and can carry such a "wow" factor). It's not some strange optimization that only affects kernel compiles; it's about memory optimization (read the definition of the NUMA tech.).

    3. Re:Why? by Anonymous Coward · · Score: 1, Interesting


      Think "outside the box" (sorry, horrible pun) of just kernel compiles, and I suspect you'll understand the potential value here.

      Let's say you run a decent-sized development house, employing a healthy number of coders. Now, as these folks churn through their days (nights?), they're gonna be ripping through a lot of compiles (if they're using C/C++/whatever). From my personal experience as a developer in the industry, a large portion of a developer's time is spent just compiling code.

      If you can implement cost-effective tech like this to reduce time spent on routine tasks like code burns, you increase productivity.

      Holy shit, I may have actually come up with a halfway decent justification for "hippie tech" to throw at the suit-wearing types... ;).

      temporary email, because MS deserves a good Linux box.

    4. Re:Why? by Emil+Brink · · Score: 2

      Well, for my own little pet project, a full rebuild takes ~5 minutes. On my nearly vintage K6-233, that is. One main reason I'm looking forward so much to a new computer system (besides the gaming, that is) is the chance to shrink that time by a significant amount. If I was a kernel developer, the ability to do a full rebuild in 23 seconds wouldn't hurt a bit, I'm fairly sure.

      --
      main(O){10<putchar(4^--O?77-(15&5128 >>4*O):10)&&main(2+O);}
    5. Re:Why? by quintessent · · Score: 4, Informative

      is there some hidden application of this that I'm not seeing?

      How about doing other stuff really fast?

      3D modeling. 3D simulations. Even extensive photoshop editing with complex filters can benefit from this kind of raw speed.

      It wouldn't be a catchy headline, though, if it said "render a scene of a house in 40 seconds--oh, and here are the details of the scene so you can be impressed if you understand 3D rendering..."

      There are hundreds of applications for this, many of which we don't do every day on our desktop simply because they take too much juice to be useful. With ever-faster computers, we will continue to envision and benefit from these new possibilities.

    6. Re:Why? by kubla2000 · · Score: 2

      Because 6 years ago, you would have been asking, "Maybe this is a silly question.. but why would you want to compile a kernel in 3 minutes?"

    7. Re:Why? by JabberWokky · · Score: 3, Informative
      Maybe this is a silly question.. but why would you want to compile a kernel in 23 seconds?

      That's not the point - kernel compilation (or the compilation of any large project like KDE or XFree[1]) is a fairly common benchmark for general performance. It chews up disk access and memory and works the CPU quite nicely.

      [1] Large is, of course, a relative thing. Also, some compilers (notably Borland) are incredibly efficient at compiling (sometimes through manipulating the language specs so the programmer lines things up so the compiler can just go through the source once and compile as it goes).

      Still, benchmarks are suspect to begin with, and kernel compile time is a decent loose benchmark. What was that quote from Linus about the latest release being so good he benchmarked an infinite loop at just under 6 seconds? :)

      --
      Evan

      --
      "$30 for the One True Ring. $10 each additional ring!" -- JRR "Bob" Tolkien
    8. Re:Why? by Adnans · · Score: 1, Insightful

      Maybe this is a silly question..

      Yes it is... :-)

      -adnans

      --
      "In short: just say NO TO DRUGS, and maybe you won't end up like the Hurd people." --Linus Torvalds
    9. Re:Why? by cheezehead · · Score: 2, Interesting

      Also, some compilers (notably Borland) are incredibly efficient at compiling

      You can say that again. Back in '95 or '96, Borland was claiming that their Delphi Object Pascal compiler compiled 350,000 lines per minute on a Pentium 90. I never checked this, but I do know that it was incredibly fast.

      What I do know from my own experience however:
      Back in those days we built a system on Solaris, implemented in C++, that took about 1 hour to compile for about 100,000 lines of source code (hardware was kind of modest compared to today's stuff).
      For a bizarre reason that I won't go into, we had to build part of the system on a PC platform. This was done using Borland C++ 3.0 for DOS. Some fool had configured something in the wrong way, resulting in the fact that all the 3rd party libraries were recompiled from source every time. This was more than 1 million lines of C++. It took about 10 minutes on a 486/33!
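To put the figures above on one scale, the short sketch below converts each of them to lines per minute. The numbers are the approximate ones given in the post (and Borland's own claim for Delphi), so the results are rough comparisons and not benchmarks. The hardware and the source code differ between the cases.

```python
def lines_per_minute(lines, minutes):
    """Compile throughput in source lines per minute."""
    return lines / minutes

# Figures as quoted in the post; all approximate.
delphi_claimed = 350_000                          # Borland's claim, Pentium 90
solaris_cpp = lines_per_minute(100_000, 60)       # about 1 hour on Solaris
borland_dos = lines_per_minute(1_000_000, 10)     # about 10 minutes, 486/33

ratio = borland_dos / solaris_cpp

print(f"Solaris C++ build: {solaris_cpp:,.0f} lines/min")
print(f"Borland C++ 3.0:   {borland_dos:,.0f} lines/min")
print(f"Borland vs Solaris build: about {ratio:.0f}x")
```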

      --

      MSN 8: Now Microsoft even has bugs in their ad campaigns.

    10. Re:Why? by Snard · · Score: 1

      Well, for people who really want to be up to date, they could recompile the kernel every time they log in or boot up.

      --
      - Mike
    11. Re:Why? by LinuxHam · · Score: 5, Insightful

      but why would you want to compile a kernel in 23 seconds?

      I think this benchmark is used time and time again because it's really the only one that nearly any Linux user would be able to compare their own experiences to. If they said 1.2 GFLOPS, I (and I suspect most others) could only say "Wow, that sounds like a lot. I wonder what that looks like." OTOH, I have seen how long it takes to download 33 Slackware diskettes in parallel on a v.34 modem, and I still run 3 P75's today.

      I've been told that I will soon be deploying Beowulf HPC clusters to many clients, including universities and biomedical firms. If they were to tell me that the clusters will be able to do protein folds (or whatever they call it -- referring back to the nuclear simulation discussion) in "only 4 weeks", I won't have a clue as to how to scale that relative to customary performance of the day.

      Sure, there are many other applications that are run on clusters, but kernel compiles are the ones that all of us do. It can give us an idea of what kind of performance you'd get out of other processor-intensive operations. And many people will tell you there are so many variables with kernel compiles that it's ridiculous to compare the results.

      Check out beowulf.org and see what people are doing with cluster computing. I've always wanted to open a site that compiles kernels for you. Just select the patches you want applied and paste the .config file. I'll compile it, and send back to you by email a clickable link to download your custom tarball. Of course no one here would trust a remotely compiled kernel :)

      --
      Intelligent Life on Earth
    12. Re:Why? by sohp · · Score: 5, Funny

      Never ask a geek, "why?". Just nod your head and back away slowly.

    13. Re:Why? by guile*fr · · Score: 1

      unfortunately not all sources accept the make -j flag :-(

    14. Re:Why? by Ripat · · Score: 1

      Ahh... I like your filemanager. Looking good...

      Thanks for posting that link.

      Too bad it's not all that I'm looking for in a filemanager, so my hunt for a perfect unix NC-clone goes on... :-)

    15. Re:Why? by daveman_1 · · Score: 1

      The original thread came from the kernel mailing list. I think you can see why a kernel developer might want to compile a kernel in 23 seconds...

      --
      Russian Russian Russian RussianDollSig DollSig DollSig DollSig
    16. Re:Why? by pslam · · Score: 2, Informative
      Maybe this is a silly question.. but why would you want to compile a kernel in 23 seconds? I mean, ok, it's cool and everything, but is there some hidden application of this that I'm not seeing?

      I'll give you the benefit of the doubt and assume you're not just a trolling karma whore here. The answer is as obvious as always: faster is always better. If there's nothing which needed that speed, it's because it wasn't previously viable and nothing got written with it in mind. If every computer were this fast, it makes compiling huge projects viable on small workstations.

      And here's a great example - where I work there are three things that reduce productivity because of technical bottlenecks:

      • Internet speed (both ways).
      • Waiting for CVS (we've got this down to less than a minute for the whole tree).
      • Compiling.

      Of these, the major bottleneck is compiling. If it takes 30 seconds just to recompile a single source file and link everything, you end up writing and debugging code in "batch" fashion rather than in tiny increments. And it's 30 seconds where you're not doing anything except waiting for the results.

      If I had a machine like this on my desk, I'd probably get twice as much work done.

    17. Re:Why? by satterth · · Score: 2, Funny
      Why you ask...

      Cause 23 seconds is braggin rights for a bitchin fast machine...

      /satterth

      --
      Being called a dork on Slashdot must be like being called the retard in special ed.
    18. Re:Why? by JesseL · · Score: 2

      But in that case, you can just compile more than one program at a time.

      --
      "Prefiero morir de pie que vivir siempre arrodillado!"
    19. Re:Why? by valkr1e · · Score: 1

      yes, but how quickly can it format the hdd? oh wait...that's a hdd speed issue, not a memory/processor. ummm....quake, that's it, QUAKE! yeah, how fast can it run quake? oh wait...it's a server, i'm guessing no geforce...hmm.... so if i bought one of these, will it keep track of all my contacts and sync with my palm? that's what is really important to me (man i hope i get at least a 1 with (funny))

    20. Re:Why? by Emil+Brink · · Score: 1

      Glad you like it! And, a more productive thing to do might be to actually (gasp!) tell your friendly free software coder (that'd be me) what it is you're missing. Preferably through private e-mail, with the magic word ("gentoo", although "please" is always welcome too) in the subject. ;^) No guarantees of course, but it can't hurt to ask, either.

      --
      main(O){10<putchar(4^--O?77-(15&5128 >>4*O):10)&&main(2+O);}
    21. Re:Why? by mabinogi · · Score: 1

      I find it rather amazing that almost everyone that replied to this post actually took it seriously......

      --
      Advanced users are users too!
    22. Re:Why? by tongue · · Score: 1

      Then you should check out Gentoo Linux.

    23. Re:Why? by LinuxHam · · Score: 2

      It's a shame you were too nervous to risk burning karma. I can only hope you bookmarked your comment in your journal in order to look for replies.

      I too started in 96, and I think it's rare to find anyone who builds their own distro here. I may get six or seven, "I roll my own" replies but there are 600,000 people here. You should try compiling your own kernel sometime. You'll probably learn so much more about Linux's capabilities just by looking at all of the features that are available but deactivated by default. I'd bet you'd be shocked at what features you could activate by doing a simple recompile.

      Of course, if you don't recompile because you're a desktop user and don't need to tweak the system's performance or support odd hardware, then we would love to know your name even more so we could make you our poster boy for "Linux is not too hard for the average user."

      --
      Intelligent Life on Earth
    24. Re:Why? by akihabara · · Score: 1

      Also, some compilers (notably Borland) are incredibly efficient at compiling (sometimes through manipulating the language specs so the programmer lines things up so the compiler can just go through the source once and compile as it goes).

      Huh?? GCC only goes through the source once.

      One reason Borland is fast for C / C++ is that they don't implement the full spec (e.g. trigraphs and escaped newlines are not implemented in the compiler proper), and they do little optimization.

    25. Re:Why? by amlutias · · Score: 1

      or if not that, at least a stub installer for the kernel source, for people on low bandwidth connections. the source is getting frickin' huge.

    26. Re:Why? by rew · · Score: 2

      The kernel compile is a "benchmark" as a bunch of reasonably separate CPU intensive jobs. If you do good on a kernel compile, you'll do very good on well-parallelizable hardcore computations as well.

      Here, we do kernel development. For us it's REAL time that we spend in compiling kernels. I really would like to have a machine that does 23 second kernel compiles...

      Roger.
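One common way to reason about how far "reasonably separate" jobs can scale is Amdahl's law, which is not mentioned in the thread and is brought in here purely as an illustration. The parallel fractions below are hypothetical. Nobody in the discussion measured how much of a kernel build is serial.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Amdahl's law: speedup over one worker when only
    `parallel_fraction` of the work can be spread across `workers`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

# Hypothetical parallel fractions for a build on a 16-way machine.
for fraction in (0.50, 0.90, 0.95, 1.00):
    print(f"{fraction:.0%} parallel on 16 CPUs: "
          f"{amdahl_speedup(fraction, 16):.1f}x")
```

The takeaway is only qualitative: the serial parts (final link, disk I/O, dependency chains) cap the gain, so a build has to be almost entirely parallel before 16 CPUs get anywhere near a 16x speedup.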

    27. Re:Why? by Ripat · · Score: 1

      Well... Here's the long answer... :-)

      I have been looking for a good Norton Commander clone for linux since I started using linux.

      I noticed that gentoo wasn't really an NC clone, but instead a clone of some amiga file manager, but there seems to be some inspiration from NC anyway. But I can understand if my wishes for an NC clone don't fit in with the paradigm for gentoo...

      Anyway, gentoo seems nicer than most filemanagers I have looked at.

      It seems like everyone is trying to make NC clones at the same time, but almost no one seems to know what they are doing, or they are doing something that I'm not interested in.

      What I would like to see is a NC clone that

      * Works well under X.

      * Has good support for keyboard navigation.

      * Has an edit field at the bottom where you can access the shell.

      * And I *really* want tabcompletion to work! If that means that the tab key can't be used to switch panel, then so be it.

      * Then some nice ftp features, and ability to open zip files, bookmarks to different locations etc would be nice, but I can live without that.

      The main thing is that the original NC just added great stuff to the DOS system. Most modern linux versions remove functionality from the system! Like tabcompletion, commandline support, keyboard navigation...

      Windows Commander does a good job on windows...

      The best one I have found on linux is old MC. It works all right under xterm, but the main problem has always been that tab completion is mapped to alt-tab instead of tab.

      Actually I got so tired of it, that I just did a dirty (very ugly, just added two if statements somewhere...) hack that moved alt-tab to tab, and tab to the key just above Tab ("1/2" - the one key on my keyboard that I never have used for anything good). And now the tab completion works just like I want it to!

    28. Re:Why? by ethereal · · Score: 1

      Why not? It's easy, fun, and a good feeling of accomplishment the first time you do it successfully :)

      --

      Your right to not believe: Americans United for Separation of Church and

    29. Re:Why? by JabberWokky · · Score: 2
      Huh?? GCC only goes through the source once.

      No it doesn't - to compile, it (and associated tools like cpp) goes through at *least* three times: once for the preprocessor, once for parsing function headers, and then it starts the actual compile process. As I said, Borland uses its language choice to force authors to write source that can be parsed straight through. Turbo Pascal and the Modula-3ish Delphi are languages constructed in such a way that the compiler can attack them with lots of compile-speed optimization tricks.

      Of course, Borland's C/C++ compilers (which take multiple passes) are also speed demons, but their pascal-based compilers are sickeningly fast. Quite amazing.

      One reason Borland is fast for C / C++ is that they don't implement the full spec (e.g. trigraphs and escaped newlines are not implemented in the compiler proper), and they do little optimization.

      Again, they tweak the language for the compiler, although I *think* they support trigraphs (I haven't used Borland in years, and may very well be wrong). They do optimize the code quite nicely though... at least for the era that I used their compiler, it was usually ranked third in a large list of common compilers (Watcom sat at the top of that list forever). I used Borland, Turbo and Mix's Power C mostly (plus all the other things like Clipper, bizarre P-code Cobol compilers, etc). Nowadays compiler optimization has fallen way down on my list of "important selling points".

      Come to think of it, the language tool arena has really really shrunk, just like the OS arena. And if Borland~=Apple, gcc~=Linux/BSD and Visual Studio/.NET~=Windows, it kinda has similar dynamics (lots of other specialized stuff, just like there are lots of other specialized OSes, but nothing with a high profile or large market share). Interesting.

      --
      Evan

      --
      "$30 for the One True Ring. $10 each additional ring!" -- JRR "Bob" Tolkien
  9. ok this is NOT a troll by autopr0n · · Score: 4, Interesting

    But, does anyone know how NUMA compares with, say, a beowulf cluster? Does NUMA allow you to 'bind' multiple systems into one, so that I wouldn't need to rewrite my software? Did these guys use a stock GCC or something special? I know you would need to use MPI or similar for beowulf. Is NUMA as scalable as Beowulf in terms of building huge-ass machines (of course if I was going to expend the effort to do that, I might as well want to write custom software).

    If this type of system would allow 'supercomputer' performance on regular programs... well... that would be really nice. How much work is it to setup?

    --
    autopr0n is like, down and stuff.
    1. Re:ok this is NOT a troll by macinslak · · Score: 5, Informative

      NUMA is rather different than Beowulf.

      NUMA is just a strategy used for making computers that are too large for normal SMP techniques. I read a few good papers on sgi.com a couple of years ago that explained it in detail, and the NUMA link in the article had a quick definition. NUMA systems run one incarnation of one OS throughout the whole cluster, and usually imply some kind of crazy-ass bandwidth running between different machines. I don't think you could actually create a NUMA cluster of separate quad Xeon boxes, and it would probably be ungodly slow if you tried.

      There probably isn't any difference for kernel compiles between the two, but NUMA clusters don't require any reworking of normal multithreaded programs to utilize the cluster and can be commanded as one coherent entity (make -j 32, wheee).

    2. Re:ok this is NOT a troll by jelson · · Score: 4, Informative

      NUMA is somewhere in between clustering (e.g. Beowulf) and SMP.

      On a normal desktop machine, you typically have one CPU and one set of main memory. The CPU is basically the only user of the memory (other than DMA from peripherals, etc.) so there's no problem.

      SMP machines have multiple CPUs, but each process running on each CPU can still see every byte of the same main memory. This can be a bottleneck as you scale up, since larger and larger numbers of processors that can theoretically run in parallel are being serviced by the same, serial memory.

      NUMA means that there are multiple sets of main memory -- typically one chunk of main memory for every processor. Despite the fact that memory is physically distributed, it still looks the same as one big set of centralized main memory -- that is, every processor sees the same (large) address space. Every processor can access every byte of memory. Of course, there is a performance penalty for accessing nonlocal memory, but NUMA machines typically have extremely fast interconnects to minimize this cost.

      Multi-computers, or clustering, etc. such as Beowulf completely disconnects memory spaces from each other. That is, each processor has its own independent view of its own independent memory. The only way to share data across processors is by explicit message-passing.

      I think the advantage of NUMA over beowulf from the point of view of compiling a kernel is just that you can launch 32 parallel copies of gcc, and the cost of migrating those processes to processors is nearly 0. With beowulf, you'd have to write a special version of 'make' that understood MPI or some other way of manually distributing processes to processors. Even with something like MOSIX, an OS that automatically migrates processes to remote nodes in a multicomputer for you, the cost of process migration is very high compared to the typically short lifetime of a typical instantiation of 'gcc', so it's not a big win. (MOSIX is basically control software on top of a beowulf style cluster, and the kernel mods needed to do transparent process migration)

      I hope this clarified the situation rather than further confusing you. :-)

    3. Re:ok this is NOT a troll by oxfletch · · Score: 2, Informative

      NUMA provides you with a single system image, so there's no need to rewrite your software. At the moment, we're working on default behaviours so that normal software works reasonably well. For something like a large database, we're providing APIs that will allow you to specify things about how processes interact with their memory and each other, allowing you to increase performance further.

      The hardware looks a little like 4 x 4way SMP boxes, with a huge fat interconnect pipe slung down the back (10 - 20 Gbit/s IIRC). But there's all sorts of smart cache coherency / mem transparency hardware in there too, to make the whole machine look like a single normal machine.

      Yes, I used stock GCC (redhat 6.2).

      re Scalability, the largest machine you can build out of this stuff would be a 64 proc P3/900 with 64Gb of RAM. SGI can build larger machines, but I think they're ia64 based, which has its own problems.

      It's not that hard to set up, but not something you would build in your bedroom ;-)

  10. Alternatively... by Ed+Avis · · Score: 4, Funny

    You can also get 23-second kernel compiles in software using Compilercache :-).

    --
    -- Ed Avis ed@membled.com
  11. Re:Who would have guessed... by BitwizeGHC · · Score: 4, Interesting

    No, this is a case of free software and cheap hardware making technologies available now to many people for whom it wasn't available (i.e., outside the realm of affordability because it was only sold by expensive proprietary vendors) just a short time ago. That is a more significant change than the endless treadmill of Moore's Law to which we had become accustomed.

    --
    N4st0r, trixx0r h0bb1tz0rz! Th3y st0l3 0ur pr3c10uzz!
  12. this may be good but... by m0RpHeus · · Score: 3, Insightful

    This may be good news, but what the heck! They should have at least included the .config that they used so that we can know what drivers/modules were compiled with it, or maybe this is just a bare-bones kernel, enough to run the basics. We need to know the complexity of the configuration before we can really say it's fast.

    --
    Take-off every .sig! For Great Justice!
    1. Re:this may be good but... by oxfletch · · Score: 1

      OK, I posted it at http://lse.sourceforge.net/numa/config.mem - this is just the standard config I use every day on this machine. If I wanted to be a real benchmark weenie, I could make this go much faster ;-)

      What I'm really interested in is what makes it go slow on a "normal" workload, so we can fix the scalability problems.

  13. Yeah, that's great, and all... by Wakko+Warner · · Score: 4, Funny

    But where can I get a NUMA cluster for $80? Should I Ask Slashdot?

    - A.P.

    --
    "Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
  14. That's on "old" hardware too by Ami+Ganguli · · Score: 2

    That's on a "16 way NUMA-Q, 700MHz P3's, 4Gb RAM".

    I've been following that thread wondering if anybody would post better results with a dual Athlon or similar. Any lucky souls with really cool hardware who want to post benchmarks? In fact, it would be interesting to know how quickly the kernel compiles on a single P3/700, just to get an idea of how it scales.

    --
    It is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail. - Abraham Maslow
    1. Re:That's on "old" hardware too by oxfletch · · Score: 2, Informative

      It doesn't scale too well yet. A single quad (fairly standard SMP 4 way) will do the dirty deed in about 47s.
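
      To put a number on "doesn't scale too well", here is a minimal sketch using only the two timings quoted in this thread (47s on one quad, 23s on the 16-way). The function and variable names are made up for illustration, and "efficiency" here simply means achieved speedup divided by the ideal linear speedup.

```python
# Wall-clock build times quoted in this thread, in seconds.
quad_seconds = 47.0   # single 4-way SMP quad
numa_seconds = 23.0   # 16-way NUMA-Q (four quads)


def relative_speedup(base_seconds, scaled_seconds):
    """How many times faster the larger machine finished the same job."""
    return base_seconds / scaled_seconds


def scaling_efficiency(base_cpus, base_seconds, scaled_cpus, scaled_seconds):
    """Achieved speedup divided by the ideal (linear) speedup."""
    ideal = scaled_cpus / base_cpus
    return relative_speedup(base_seconds, scaled_seconds) / ideal


speedup = relative_speedup(quad_seconds, numa_seconds)
efficiency = scaling_efficiency(4, quad_seconds, 16, numa_seconds)

print(f"speedup going from 4 to 16 CPUs: {speedup:.2f}x")   # 2.04x
print(f"efficiency vs. linear scaling:   {efficiency:.0%}")  # 51%
```

      In other words, four times the processors buys roughly twice the speed on this workload.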

  15. It takes me 23 seconds to boot by ptbrown · · Score: 1

    So if I could compile a new kernel in less time than it takes to boot-up, then a new kernel would be ready before the boot process was finished. So I'd have to restart with the new kernel, and if I start a new kernel compile too then that boot wouldn't be able to finish before there was another new kernel, so I restart with the new new kernel and begin another compile...

    Maybe naming this box 'Zeno' wasn't such a good idea after all.

    (PS. You can now compile a kernel faster than Nautilus opens a folder. Go fig.)

    --
    Any sufficiently advanced civilization is indistinguishable from Gods.
  16. Who cares about the compile time ... by beanerspace · · Score: 2

    ... just so long as I never have to program on Sequent iron, and its insidious operating system, ever again. Of course, that was 10 years ago when Dynix, trying to be the best of both worlds, was really neither ATT nor Berkeley!

    ... of course the other problem was indeed the expense, leaving us in situations where we had to program at odd hours and off-days because the client couldn't afford a "development" machine.

    ... two issues which I would hope are solved by a 16-way Xeon for $2K ... hence, making it a REAL-world bargain.

  17. HELLLOOOOOO??? by Anonymous Coward · · Score: 4, Insightful

    You can't build a NUMA cluster worth a crap without a fast, low-latency interconnect.

    Sequent's NUMA Boxen use a flavor of SCI (Scalable Coherent Interface) which is integrated into the memory controller.

    While you can use some sort of PCI-based interconnect, the results are just plain not worth it.

    Infiniband should be better, though I've heard the latency is too high to make this a marketable solution.

    Keep your eyes on IBM's Summit chipset based systems. These are quads tied together with a "scalability port" and go up to 16-way. They should go to 32 or higher by 2003. That's when NUMA will -finally- be inevitable...

    1. Re:HELLLOOOOOO??? by SpinyNorman · · Score: 2

      AMD's Hammer family with the HyperTransport bus controller built in will provide NUMA for the masses! Bring it on!

    2. Re:HELLLOOOOOO??? by GigsVT · · Score: 1

      This is probably the most insightful comment in this thread so far.

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
    3. Re:HELLLOOOOOO??? by oxfletch · · Score: 1

      No it isn't. It's a totally different generation of hardware.

  18. 4x4 cluster for $2k? Show me. by Crag · · Score: 1

    Unless it's two years old I don't believe the price of that cluster is $2k. The cheapest quad-xeon motherboard on pricewatch is $500. If you cut that in half for being used, that's still $1k for just the motherboards. Add 16 processors, ram, cases, NICs, drives, power supplies, and other parts, and there's no WAY you're coming in under $5k, and $10k would be more realistic.

    On the other hand, if someone IS selling such a beast and I can win the bidding with a $2k bid, I might be tempted...

  19. Great news for mozilla and nautilus... by boris_the_hacker · · Score: 5, Funny

    ... with the advent of this new technology and raw speed, you should actually be able to use them!

    [this is actually a joke]

    --
    chris at darkrock dot co dot uk
    http colon slash slash www dot darkrock dot co dot uk
    1. Re:Great news for mozilla and nautilus... by DrXym · · Score: 1

      Mozilla runs just fine on my 450Mhz laptop

    2. Re:Great news for mozilla and nautilus... by boris_the_hacker · · Score: 1

      Yeah, I know, personally I am a Konqueror nut but use mozilla for the stuff that it seems to have issues with - I am very impressed with the current nightlies [very snappy]. I am pleased that there is some very good progress being made [I love the tabs].

      But WRT my original post, pinch of salt, I'm british :-)

      --
      chris at darkrock dot co dot uk
      http colon slash slash www dot darkrock dot co dot uk
    3. Re:Great news for mozilla and nautilus... by tempest303 · · Score: 2

      hehe... I know you were just kidding, but I thought it was worth mentioning that the Gnome 2 version of Nautilus is MUCH faster! I'm really excited to see what the community-at-large has to say about Nautilus when they see all the headway it's made in terms of speed!

      As for Mozilla? GREAT project, the Web *needs* Mozilla, but for my desktop? I'll stick with Galeon, thanks :)

  20. IBM and Sequent being good citizens by swirlyhead · · Score: 5, Informative

    I went and looked at the email and noticed that the very first patch he mentions was from the woman who came and gave a talk to EUGLUG last spring. For one of our Demo Days we emailed IBM and asked them if they would send down someone to talk about IBM's Linux effort. We were kind of worried that they would send a marketing type in a suit who would tell us all about how much money they were going to spend, etc., etc. But we were very pleasantly surprised when they sent down a hardcore engineer who had been with Sequent until they were swallowed by IBM.

    She did a pretty broadranging overview of the linux projects currently in place at IBM, and then dived into the NUMA/Q stuff that she had been working on. The main gist of which is that Sequent had these 16-way fault-tolerant redundant servers that needed linux because the number of applications that ran on the native OS was small and getting smaller. Turned out that even the SMP code that was in the current tree at the time did not quite do it. She had some fairly hairy debugging stories, apparently sprinkling print statements through the code doesn't work too well when you're dealing with boot time on a multiprocessor system because it causes the kernel to serialize when in normal circumstances it wouldn't...

    I think the end result of all this progress with multiprocessor systems is that we'll be able to go down to the hardware store, buy more nodes, plug 'em into the bus, and compute away.

    1. Re:IBM and Sequent being good citizens by JoshuaDFranklin · · Score: 2

      I don't know if she told you this, too, but IBM saved Sequent from going out of business. I lived a couple of blocks from their headquarters in the Silicon Forest (Portland, OR) and in the last couple of years two or three of their buildings had the Sequent sign taken down. There were articles in the paper about how much trouble they were in, guys at Intel (Sequent was an Intel spin-off) saying they're sorry to see such a good idea go, etc. Sure, they had a few years left, but, without IBM, NUMA-Q probably would have gone the way of the Alpha.

    2. Re:IBM and Sequent being good citizens by jamesc · · Score: 1
      [ Sequent was an Intel spin-off ]

      Really? Then why did their first generation of HW, the Balance series, use National Semi's CPUs?

      I don't doubt that there was (is?) lots of crossover between Sequent and Intel Hillsboro, but that's probably due to joint projects and hiring from the same population of engineers....

      --
      "You've crossed my Line of Death!" "What? No! Where is it?" "Here in the fine print...."
  21. Re:Who would have guessed... by in10d · · Score: 1

    Intel 486 running at 100 Mhz. Just think how long it would take to compile the kernel on that system.
    Recently I "experienced" compiling 2.2.19 kernel on Intel 486 DX 100 with 16MB RAM. It takes about 4.5 hours - that's about 700 times slower.
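
    The "about 700 times" figure checks out as plain arithmetic. A quick sketch follows; the variable names are made up, and note that the two builds are of different kernel versions (2.2.19 here, 2.4.18 on the NUMA-Q), so the ratio is only a rough comparison.

```python
slow_seconds = 4.5 * 60 * 60   # "about 4.5 hours" on the 486 DX 100
fast_seconds = 23              # headline compile time on the 16-way NUMA-Q

ratio = slow_seconds / fast_seconds
print(f"{slow_seconds:.0f} s / {fast_seconds} s = about {ratio:.0f}x")
# prints: 16200 s / 23 s = about 704x
```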

  22. Re:Who would have guessed... by sydb · · Score: 3, Funny

    Woah, for a moment I could have sworn that was a Jon Katz article...

    --
    Yours Sincerely, Michael.
  23. Re:Who would have guessed... by nelf · · Score: 2, Interesting


    I'm afraid I have to disagree entirely, mate. I'm no neo-luddite by
    any stretch of the imagination... I too spend a good proportion
    [English is hilarious] of my time on the internet. I could, indeed, be
    said to be leading a 'double life' by the unobservant. Notwithstanding
    Mr Postman, Still and Talbot whom I cannot speak for; your assessment
    of the intrinsically 'good' or 'evil' nature of technology is far from
    clearly correct. Your allusion to the internet as a block of marble,
    awaiting us to sculpt meaning into its form by using it is desperately
    far from the truth. For example, books are not tabula rasa objects,
    waiting for readers to impress upon them meaning and effect. When you
    read the bible, the koran, Hermann Hesse or whoever, is it not the
    author that steers your experience of reading?

    There are many forms of media in our lives, and the internet is just
    one of them .. the fact that the internet is common in our culture
    and accessible to many folks does not detract from its power to
    affect, to sometimes enormous proportions, our culture, purpose and
    ultimately 'mystical' existence.

    Some instances of a particular media may merely 'incline' us to
    consider something... bland books, poor television programs or vacuous
    theatre productions, but there are some instances that inspire us and
    drive us to better our existence, or in some cases to cause blight,
    cruelty and ruinous events.... Language, for example, has allowed our
    brains to extend far beyond the confines of our bony skulls and
    enables us to communicate and share ideas. If you've ever been in the
    presence of a great speaker, you will know instantly that words are
    not merely empty sounds awaiting our interpretations, but are weighted
    vehicles for the influential dissemination of ideas, and are very
    seldom 'neutral'.

    So my point is this: the internet is not a neutral object awaiting our
    interpretation, but is a rich and varied media that can influence you
    .. it can shock, scare, amuse, frighten.... and more things than you
    can find in a thesaurus or dictionary, to boot; and it is NOT guided
    or limited by your own mind...

  24. Nice try... by castlan · · Score: 4, Informative

    But the reserve for this machine is $3850. The article says 16 way, which would be four of these four-way SMP systems. That also doesn't take into account the need for a high-bandwidth, low latency interconnect (like SGI's NumaLink.) If you aren't expecting more than 16-way SMP, then you can probably get away with switched Gigabit Ethernet, as long as it is kept distinct from the normal network connectivity. If the Gigabit upgrade is still dual port, then you are set. If not, you'll need another NIC - though you will only really need one for the whole cluster.

    Maybe instead of two grand, the poster meant twenty-grand. Either way, $20 grand is better than $100K!

  25. Off by one... by daveman_1 · · Score: 1

    You have 26 rows.

    --
    Russian Russian Russian RussianDollSig DollSig DollSig DollSig
  26. What if we had by OeLeWaPpErKe · · Score: 1

    ... a beowulf cluster of these ?

    sorry I couldn't resist

  27. NUMA stands for by Bloody+Bastard · · Score: 2, Informative

    NUMA means Non-Uniform Memory Access. It is a kind of computer where you have shared memory but you don't have the same access time for every processor to every memory position. Therefore, every processor will have access to all memory but sometimes it will take longer or shorter (if the memory belongs to another processor).

    In a Beowulf cluster, you don't have shared memory (unless inside a node, if you have an SMP machine) and you must use message passing to communicate (unless you are using DSM--Distributed Shared Memory--, maybe with SCI).

  28. Why is this alternative funny? by Pooh22 · · Score: 1

    Having read the site for compilercache, I fail to understand why the parent is (by some) moderated as funny.

    It may be my lack of understanding, but it seems rather wasteful to recompile everything when only a few files are changed. Same goes for changed comments.

    Ok, a NUMA architecture is nice to have for compiling, but it's probably a lot more useful for things that cannot be cached at all (rendering, simulations, etc. they've been mentioned already).

    I would moderate the parent as interesting or informative....

    1. Re:Why is this alternative funny? by Webmonger · · Score: 3, Informative

      Probably because compilercache is a way to AVOID compiling. . .

    2. Re:Why is this alternative funny? by morcheeba · · Score: 2

      Actually, it still takes quite some time. From the FAQ:

      kernel--- compilercache time
      default-- no-------------- 5m28.860s
      default-- yes, but empty 6m56.490s
      default-- yes, filled------ 2m51.900s
      modified yes, filled------ 3m58.730s

      (ugly formatting to avoid lameness filter)

      It looks pretty safe, especially if you've been burned by a badly written Makefile. The FAQ explains the difference between compilercache and makefiles pretty well.

      As a bonus, compilercache ignores changes made to the comments (since it uses the preprocessed source (with the comments stripped) to calculate an MD5 checksum), so you can fix/add comments without worrying about an extra long compile.
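
      The comment-insensitivity can be illustrated with a toy version of the idea. This is only a sketch under loud assumptions: the regex-based `strip_comments` is a crude stand-in for running the real C preprocessor (it ignores string literals, macros and includes, and it also collapses whitespace), and `cache_key` is a made-up name, not compilercache's actual interface.

```python
import hashlib
import re

# Crude stand-in for the C preprocessor: drops /* ... */ and // comments.
# Assumption: no comment-like text appears inside string literals.
_COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def strip_comments(source: str) -> str:
    """Remove comments, then collapse whitespace left behind by the removal."""
    without_comments = _COMMENT.sub(" ", source)
    return " ".join(without_comments.split())


def cache_key(source: str) -> str:
    """MD5 of the comment-free source, used as a lookup key for cached output."""
    return hashlib.md5(strip_comments(source).encode("utf-8")).hexdigest()


original = "int add(int a, int b) { return a + b; }\n"
commented = "/* adds two ints */\nint add(int a, int b) { return a + b; } // trivial\n"
changed = "int add(int a, int b) { return a - b; }\n"

print(cache_key(original) == cache_key(commented))  # True: comment-only edit
print(cache_key(original) == cache_key(changed))    # False: the code changed
```

      A comment-only edit maps to the same key, so the cached object file can be reused; any change to the code itself produces a new key and a real compile.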

      I probably won't use it, though -- my projects tend to require only one file to be recompiled per build.

  29. The real question is... by gorre · · Score: 1

    OK so it can compile the kernel in 23 seconds but the real question is how long does it take to launch an app in kde?

    --
    "Madness is something rare in individuals - but in groups, parties, peoples, ages it is the rule." -- Nietzsche
  30. make bzImage is not a very good benchmark by wowbagger · · Score: 3, Informative
    I would assert that a simple "time make -j32 bzImage" (which is what is being quoted) is not a very good benchmark as it is.

    Reason? Not enough information as to the options.
    • What version kernel was he building (actually, the LKML post did give this, but as a general statement this objection still stands)
    • What were his compile options? Building a kernel with everything possible built as modules will take a great deal less time to build bzImage (the non-module part of the kernel) than would a kernel with everything built in.
    • Then there's the issue of buffercache - to be consistent you would have to do a "make -j32 bzImage && make -j32 clean && time make -j32 bzImage" in order to have a consistent set of files in the VFS buffercache.
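
    The warm-cache procedure in the last bullet could be wrapped in a small harness along these lines. This is a sketch, not anything from the LKML post: `timed_run` and `warm_then_time` are hypothetical helper names, and the no-op command below is only a placeholder so the script runs anywhere.

```python
import subprocess
import sys
import time


def timed_run(command):
    """Run a command to completion and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


def warm_then_time(build, clean, repeats=3):
    """Build once untimed to fill the cache, then time `repeats` clean rebuilds."""
    subprocess.run(build, check=True)       # warm-up pass, timing discarded
    timings = []
    for _ in range(repeats):
        subprocess.run(clean, check=True)   # drop build products between runs
        timings.append(timed_run(build))
    return timings


# Placeholder command (a Python no-op) standing in for the real build and clean.
noop = [sys.executable, "-c", "pass"]
print(warm_then_time(build=noop, clean=noop))
```

    For the real measurement, run it from the kernel source tree with build=["make", "-j32", "bzImage"] and clean=["make", "-j32", "clean"], and report all the timings rather than just the best one.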

    Nevertheless:

    I WANT ONE
    1. Re:make bzImage is not a very good benchmark by oxfletch · · Score: 1

      1. 2.4.18, and I also told you what patches I was using (though some of them won't be published until next week).

      2. OK, I just posted the config file. http://lse.sourceforge.net/numa/config.mem

      3. I did five kernel compiles in a row (though I omitted to mention that).

    2. Re:make bzImage is not a very good benchmark by salmo · · Score: 1

      I think the interesting thing (and probably the purpose of the email) was not the 23 sec. compile time but the relative increases due to the patches. 23 sec does undeniably kick some serious tail, though.

      And as for your subject, I don't believe this was ever intended to be a benchmark to show off the machine, it was just an experiment. When you have a 16 way NUMA, you don't need to show off. :-)

  31. Sorry, Anton Blanchard Wins by nbvb · · Score: 3, Informative

    http://samba.org/~anton/e10000/maketime_24

    Wheeeeeee!

    And seriously, I saw some comments about needing a really fast interconnect... check out Sun's Wildcat.

    --NBVB

  32. Not just a kernel compile... by Snowfox · · Score: 2
    It's not just a kernel compile. It's also bzipping, which takes a few seconds alone on most machines, and which can't effectively be done in parallel.

    Very nice. :)

  33. and I hold the other record by Anonymous+Admin · · Score: 2

    my 386-dx40 with weitek coprocessor and 8M ram,
    at 1.36 bogomips, will compile a 2.2 kernel in only 27 hours 13 minutes.

    1. Re:and I hold the other record by Anonymous+Admin · · Score: 1

      probably the extra 8 meg ram

  34. Other options? by TheSHAD0W · · Score: 2

    How well would Firewire, Fibre Channel, or SCA work as NUMA interconnects? How would these guys compare, pricewise and in effectiveness, to 1000baseT?

    1. Re:Other options? by Noehre · · Score: 1

      Firewire is too slow and too processor demanding.

      And Fibre Channel is for hard drives..

      Soo umm, not well.

    2. Re:Other options? by TheSHAD0W · · Score: 2

      Fibre Channel is an advanced form of SCSI, and can actually be used for communications. I know it can be done, I just don't know whether it's been done before, with drivers available, and I don't know how expensive it'd get.

      Thanks for the info on Firewire, though.

    3. Re:Other options? by aminorex · · Score: 2

      No, FC is not an advanced form of SCSI. You can
      run SCSI over FC-AL, however. You can also run
      SCSI over OC384 or a 300 baud modem now (iSCSI).

      --
      -I like my women like I like my tea: green-
    4. Re:Other options? by victwenty · · Score: 1

      I've run tcp/ip over FC (with JNI hba's) under solaris. It wasn't faster than gigabit ethernet in testing.

  35. Re:Who would have guessed... by justinstreufert · · Score: 1

    How can this be? 2.0.x compiled on my 486dx2/66 with 32mb in just under half an hour. Has the code ballooned that much?

    Justin

    --
    "Why would God give us a waist if we wasn't supposed to rest our pants on it?" - Rev. Roy McDaniels
  36. Compilation is highly parallelizable by Tom7 · · Score: 2

    That's good, but compilation is awfully parallelizable: You could (almost) just assign a computer to compile each individual source file; the total time would be the time to compile the slowest file plus link time. You could accomplish this with a shell script and a network file system -- what's the benefit of doing it with a shared-memory system like NUMA?

  37. Moderators, Parent is Completely OFFTOPIC by castlan · · Score: 1

    Who would have guessed that technology would progress? Most people alive this century. There was at least one guy at the beginning of this century who thought that no more technical advances were possible, and the patent office would have to be closed. He was the exception. Even "Moore's (so called) Law" postulates that the transistors on a CPU would double in number every 18 months, which mostly explains your progression along the Intel product line. That is insightful?

    You don't get the point at all. This is about NUMA technology escaping the proprietary systems of the past and becoming feasible outside of Government funded Nuclear Detonation Simulations and corporate data-mining. This is about Free Software enabling this once proprietary technology due to the generous donations by SGI and IBM, who want Linux to bring its buzz to fruition on their hardware and services.

    Now who would have guessed that somebody could use this as an opportunity to talk about email, the web, and not just "jihads and holocausts, but also rebirths and renaissances", not to mention neo-luddites. This is the poorest mismoderation I've seen for a while, and I wish that AC who claimed to have mod points had actually had the balls to use them. This obvious display of crap-flooding doesn't even end with a proper sentence.
    And despite the fact we differ on many points, on this point we agree. He writes:
    God, this is disgusting. What a fucking rediculous troll. If anybody w/mod points actually possesses an ounce of intelligence, maybe I wouldn't have to puke right now.

    -castlan

    1. Re:Moderators, Parent is Completely OFFTOPIC by b-side.org · · Score: 1

      word, yo.

      --
      Indie rock lives! b-side!
  38. Why use NUMA? by kiltedtaco · · Score: 1

    It would be quite easy to configure the kernel build process for several machines to each make a .o file, and then send them to a master machine for a final link.

    There are about 540 object files for (my) kernel build. Given say 20 Pentium Pro's, each would have to build 27 object files. That's not too bad. I don't have a pentium to test the speed on, or a cluster of 20 of them to do this, but it seems a lot easier than NUMA.
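
    The division of labour described above can be sketched as a simple round-robin split. Only the counts (540 object files, 20 machines) come from this comment; the file names and the `split_round_robin` helper are invented for illustration, and actually shipping sources out and objects back is left out entirely.

```python
def split_round_robin(items, workers):
    """Deal items out to workers like cards, so share sizes differ by at most one."""
    return [items[i::workers] for i in range(workers)]


# Hypothetical file names; only the counts come from the comment above.
object_files = [f"obj_{n:03d}.o" for n in range(540)]
shares = split_round_robin(object_files, 20)

print(len(shares), "machines,", len(shares[0]), "object files each")
# prints: 20 machines, 27 object files each
```

    An even split by count is not an even split by time: the build still finishes only when the slowest machine is done, and then the final link has to run on the master.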

    1. Re:Why use NUMA? by hansendc · · Score: 2

      It would be quite easy to configure the kernel build process for several machines to each make a .o file, and then send them to a master machine for a final link.

      If the end goal of this was to just compile kernels fast, you would be right. These numbers were posted because everybody knows how fast their kernel compiles. If someone posts TPC-H or SpecWeb99 numbers, no one notices. Normal people can say, "Wow, that is fast!"

    2. Re:Why use NUMA? by kiltedtaco · · Score: 1

      Then we need a better benchmark. Despite it being difficult to do a good benchmark, we need a better one.

  39. Re:Who would have guessed... by CoolVibe · · Score: 2
    Well, I have to wait around 3+ hours for a kernel compile on my SparcStation 5.

    When it's upgrade time, I can start a compile, go to the pub, have a few beers, go back, see that the compile failed (because of , sparc32 and linux 2.4 don't seem to mix very well without some heavy tweaking), fix mistake, start again, and go back to the pub :)

    Thanks to my slow sparcstations, I have a life! :)

  40. NUMA vs Beowulf by castlan · · Score: 2, Informative

    Beowolf clusters are considered horizontal scaling, while NUMA clusters are considered vertical scaling. From my experience (SGI CC-NUMA) a NUMA cluster looks like a single computer system, with a single shared memory. (SGI systems are even Cache-Coherent, so that there is minimal performance loss if your data is in the RAM of the most distant CPU.. a significant issue with 256 nodes). This means that you don't have to deal with MPI or other systems to deal with disparate memory of seperate machines, so you can mostly code as if it were a single supercomputer. In fact, that is how SGI actually makes their supercomputers.

    NUMA clusters tend to have scalability problems related to the cache coherence issue, so for a vertically scalable CC-NUMA box, you have to pay SGI the big bux. I haven't looked at IBM's NUMA technology, but if they own Sequent, then they probably have similar capability.

    As for the work to set one up, SGI's 3000 line is fairly trivial: the hardware is designed to handle it, and I think you only need NUMAlink cables to scale beyond what fits inside a deskside case, if not a full height case. Now if you have a wall of these systems, you will need the NUMAlink (nee CrayLink) lovin'. As for an Intel based system, I suspect it wouldn't be nearly as easy... unless your vendor provides the setup for you. On your own, you would need to futz with cabling the systems together, just like in a Beowulf. Except that your performance depends on finding a reasonably priced, high bandwidth, low-latency interconnect. Gigabit Ethernet won't scale very far, so going past 16 CPUs would be very unpleasant. If you expend the effort, you will end up with a cluster of machines that behave very much like a "supercomputer" though. Good luck!
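
    To make the local-versus-remote memory cost concrete, here is a toy Python model. It does not describe any SGI or Sequent hardware: it just treats average access time as a weighted mix of a local and a remote latency. The 100 ns and 1000 ns figures are illustrative assumptions, not measurements.

```python
def average_access_ns(local_fraction, local_ns, remote_ns):
    """Average memory access time when local_fraction of accesses stay on-node."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Assumed latencies for illustration only.
LOCAL_NS = 100.0
REMOTE_NS = 1000.0

for fraction in (1.0, 0.9, 0.5):
    cost = average_access_ns(fraction, LOCAL_NS, REMOTE_NS)
    print(f"{fraction:.0%} local -> {cost:.0f} ns average")
```

    The point of the model is only that the average degrades quickly as more accesses go remote, which is why a slow interconnect hurts so much.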

  41. Object Pascal by Latent+Heat · · Score: 1

    Yes, and if the Kernel were written in Borland Object Pascal, you could get a 23 second compile on a Celeron 500.

  42. First Kernel Compile by Leme · · Score: 1

    I still remember my first kernel compile, it was way back in the early nineties, don't exactly remember the year. I had installed linux on my pride and joy 486slc/40 with 4 megabytes of ram. After reading how to do it I started at around 6pm, around 3am I grew bored of staring at the console and went to sleep.

    When I awoke, it was finally done but I wasn't aware of how lilo worked at the time, so I just erased the old kernel and copied the new one into my / directory and rebooted. But the system never came back up. Frustrated I installed DOS back onto the system.

  43. Re:4x4 cluster for $2k? Show me. by oxfletch · · Score: 2, Informative

    Don't forget to add about $10,000 per quad for the custom interconnect, which is what really makes this machine work.

  44. Re:4x4 cluster for $2k? Show me. by oxfletch · · Score: 1

    Oh, and BTW, yes this hardware is about 2 years old.

  45. Re: Compaq hot-plug drives by Raetsel · · Score: 2

    Let me get this right... you buy a used computer, and then go straight to the manufacturer for replacement parts??? (Surely you know 'accessories' are one of their higher-margin profit centers!)

    Still... if you're in the Seattle, WA area, stop in the Boeing Surplus Retail Store. I was there last week, and they had a bucket of what looked like 80-pin 2.1GB Compaq hot-plug drives. They were just sitting there next to a cash register like candy would be at a supermarket. I don't remember the price, but I want to say they wanted ~$5 each for 'em.

    They were also selling an Indigo ($50), lots of PCs (mostly old Dell OptiPlex models, $20 - $300, Pentium MMXs to P-IIs), and even a Barco data-grade projector ($2500). Fun place to go and blow half a day poking around.

    --

    "...America's great minds of today, teaching America's great minds of tomorrow. Poor bastards." -- A Beautiful Min
  46. Re:is Ingo's O(1) scheduler patch in 2.5 yet? by hansendc · · Score: 1

    it is in 2.5.

  47. MOSIX remarks by fidros · · Score: 1

    1. There exists a MOSIX implemented as a Linux patch.
    2. With MOSIX, the migration is largely a function of the network. 100BaseT is slow, but Ethernet over PCI, InfiniBand and friends can make this work well.
    3. The MOSIX advantage over NUMA is that it is linearly scalable in the number of machines; NUMA can't go beyond a certain limit.

    --
    Gilad.
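
    A rough feel for point 2 above: if migration cost is dominated by copying the process image over the network, the time is about image size divided by link bandwidth. This back-of-the-envelope Python sketch assumes a hypothetical 64 MB process and ignores latency, protocol overhead, and anything MOSIX itself does to reduce the copy.

```python
def migration_seconds(process_mb, link_mbit_per_s):
    """Time to copy process_mb megabytes over a link of link_mbit_per_s megabits/s."""
    if link_mbit_per_s <= 0:
        raise ValueError("link speed must be positive")
    return process_mb * 8 / link_mbit_per_s


# Hypothetical 64 MB process image; link speeds are nominal line rates.
PROCESS_MB = 64
for name, mbit in [("100BaseT", 100), ("Gigabit Ethernet", 1000)]:
    print(f"{name}: {migration_seconds(PROCESS_MB, mbit):.2f} s")
```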
  48. kernel compiles on a dual ath... by drscriptt · · Score: 1

    I have a dual Athlon 1500+ (1.3 GHz), 512 MB RAM, and 2 UW SCSI drives soft raid(0)ed. I issue the following command "time for c in dep clean dep clean dep clean bzImage; do make -j 4 $c; done" after I do a make distclean on a 2.4.18 and a make menuconfig for the defaults. Compile time, around 41 - 43 seconds.

    -G.T.

  49. Re:Who would have guessed... by Mr+Z · · Score: 1

    It could be the difference between 16MB and 32MB of RAM. GCC processes aren't tiny. And if you're running X, watch out.

    FWIW, my 486-class machine (an AMD 5x86-133) has 64 MB of RAM, and it's much happier. However, there's a memory leak (2.2.19 leaks memory whenever dhcpcd renews my lease -- I suspect it's related to IP Masquerading, which I also run through that box). When memory starts getting low (all that leaked memory is unpageable), the machine performance falls off a cliff.

    Thankfully, it leaks slowly enough that I only reboot that machine once every 3 months or so.

    The last time I tried to build a kernel on there, it took probably close to an hour or more. Typically, though, this guy sig-11's out of the compile. He's getting old and tired.

    --Joe
  50. Can buy PCI-SCI cards for 1k with 3us latency by j_dot_bomb · · Score: 1

    http://www.wulfkit.com/scibenchmarks/latency.html

  51. You can buy PCI-SCI cards for 1k with 3us latency by j_dot_bomb · · Score: 1

    http://www.wulfkit.com/scibenchmarks/latency.html These are used in the Beowulf community already. Almost any other solution has much higher latency. I've never seen anything less than 7us and I think that was Myrinet (similar price). Ethernet has a MUCH higher latency.
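
    To put the quoted latencies in perspective: if each small message has to wait for the previous one, the message rate is bounded by one over the latency. A minimal Python sketch, assuming a single message in flight and that latency is the only cost (no bandwidth or software overhead):

```python
def max_sequential_messages_per_second(latency_us):
    """Upper bound on back-to-back messages when each waits for the last one."""
    if latency_us <= 0:
        raise ValueError("latency must be positive")
    return 1_000_000 / latency_us


# Latencies quoted above: about 3 us for SCI, 7 us as the next best seen.
for name, latency_us in [("SCI", 3), ("next best", 7)]:
    rate = max_sequential_messages_per_second(latency_us)
    print(f"{name}: {rate:,.0f} msgs/s")
```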

  52. The Real Question by ewhac · · Score: 2

    Well, a 23-second kernel compile is impressive and all, but the most important question I would have of such a machine is: How fast can it run Quake-3?

    If it can do 1280 * 1024 * 32bpp at 300 frames/second, then I'm getting one.

    :-),
    Schwab

  53. You can actually - Dolphin SCI by Moderation+abuser · · Score: 2

    You can buy the bits needed to build your own NUMA hardware system out of separate boxes relatively[1] cheaply. The speed depends on how you manage the memory and I/O. You'd need Linux to support it as a coherent whole though and I'm not sure that it does.

    [1] For large values of relatively.

    --
    Government of the people, by corporate executives, for corporate profits.
  54. Re:4x4 cluster for $2k? Show me. by Crag · · Score: 1

    But that's still just the mb and cpus. As the other guy who replied to me pointed out, you also need networking hardware to connect the machines, and the result is still 2 years old. $2k is not realistic.

  55. Data General had been doing 32 way NUMA boxes. by Moderation+abuser · · Score: 2

    Ooh, for 5 years or so.

    They've since been bought by EMC and closed down but they had it working *and* scaling to 32 CPUs and on the market. 64 CPU systems were well on the way but I don't recall if they finished them.

    --
    Government of the people, by corporate executives, for corporate profits.
  56. Thank You!!!! by greenrd · · Score: 2
    Ed Avis, I kiss you!!!

  57. Long compile tricks by LinuxHam · · Score: 1, Offtopic

    While I may not recall my first kernel compile (late 96 or early 97), I wanted to share a small kernel compile trick. When compiling on my P75s, I usually run:

    nice -n -20 screen
    make dep clean bzImage modules modules_install ; echo "Kernel done" | Mail myname@skytel.com

    I started doing that in '98 when compiling on a 486 I commandeered in the server room. Saves you from doing "are we there yet?" with your box. Also lets me know when I need to head home from running my errands or eat a little faster at the dinner table. :)

    --
    Intelligent Life on Earth
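
    The same "tell me when it's finished" idea can be sketched in Python. This is a variation, not the poster's method: unlike the shell line above, which mails unconditionally, it reports whether the command succeeded. The `notify` callable is a placeholder (here just `print`); wiring it to mail or a pager is left out, and the command run below is a harmless stand-in for the real build.

```python
import subprocess
import sys


def run_then_notify(command, notify):
    """Run command to completion, then pass a short status message to notify."""
    result = subprocess.run(command)
    if result.returncode == 0:
        status = "done"
    else:
        status = f"failed (exit {result.returncode})"
    notify(f"Kernel {status}")
    return result.returncode


# Stand-in for the real build: a Python one-liner that exits successfully.
# A real invocation might pass something like ["make", "bzImage"] instead.
run_then_notify([sys.executable, "-c", "pass"], print)
```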
  58. Mod parent up please by SurfsUp · · Score: 2


    1. 2.4.18, and I also told you what patches I was using (though some of them won't be published until next week).

    2. OK, I just posted the config file. http://lse.sourceforge.net/numa/config.mem

    3. I did five kernel compiles in a row (though I omitted to mention that).


    Hi Martin!

    --
    Daniel

    --
    Life's a bitch but somebody's gotta do it.
  59. Mod this up too by SurfsUp · · Score: 2

    In fact, somebody please go and mod up all oxfletch's posts on this article, he's Martin Bligh, the guy who did this.

    --
    Life's a bitch but somebody's gotta do it.
  60. Just curious ... In-memory-build by qta · · Score: 1

    This is definitely out of line with the intended benchmark, and perhaps I don't know better, but I am just curious... With that amount of RAM (4Gig), maybe they can set up an "in memory" build, and cut the build time by half ?!

  61. Read my lips: N-U-M-A- -Q by juliao · · Score: 1
    What part of it didn't you understand?
    The machine he's running on is a Sequent/IBM NUMA-Q, not just a bunch of PC servers...

    I don't know what was meant by the submitter or the editor or whoever it was that was ranting about $500 machines, but this is not what it's about.

    First of all, it's a porting project, making linux run on the platform. Only then, it's a feature project, making linux make good use of the NUMA capabilities.
    So there. It's a work in progress, and no, you can't afford one of these for home. Maybe, in time, the lessons learned from here will help us build generic Fast-NOW clusters and have software-based NUMA on linux. Until then, keep dreaming, or start working on a project you can contribute to. Remember, the difference with open-source is that you can always do more than just complain.

  62. Re:Who would have guessed... by peter · · Score: 1

    No, NUMA in general. It's useful for all kinds of things. Beowulf is limited by its interconnects relative to a good NUMA implementation, so this is useful to people who need lots of CPUs and one big virtual address space.

    --
    #define X(x,y) x##y
    Peter Cordes ; e-mail: X(peter@cordes , .ca)
  63. 7.5 second kernel compile by millette · · Score: 1

    yep, I've submitted, but I don't think it's gonna be accepted. See here for more details:
    http://www.uwsg.indiana.edu/hypermail/linux/kernel/0203.2/0009.html