Slashdot Mirror


IBM Demos Cray-Matching Linux Cluster

An anonymous reader sent us a link to an InfoWorld story where you can read about IBM slapping together an Open Source Supercomputer capable of matching a Cray on PovRay benchmarks. It's basically just a cluster of Xeon-based Netfinitys. Smooth.

129 comments

  1. Heh by Anonymous Coward · · Score: 0

    Gee, I wonder if this will force SGI to start selling Crays at Barnes and Noble :P.

  2. Wow. by Anonymous Coward · · Score: 0

    They're even *emphasizing* the fact that they could just use Linux off-the-shelf to put together a world-class supercomputer at a fraction of the cost.

    Proprietary operating systems are just looking worse and worse... could you do that with NT? :-) Yeah, right, for about 5 million and 37 MCSE's on-call to reboot the machines periodically...

  3. Linux 2.2.2 by Anonymous Coward · · Score: 0

    Linux 2.2.2 is in books already? The POV bench
    says they were using 2.2.2..... Something fishy
    going on here.

  4. No Subject Given by Anonymous Coward · · Score: 0

    Should IBM be doing this? How much have they spent on Deep Blue? Could a Linux cluster equal the Deep Blue for half what a Deep Blue costs?

    and while we are on the subject, a link from a post made yesterday.

    http://www.brightstareng.com/

    laptop cluster?

  5. How about PowerPC's? by Anonymous Coward · · Score: 0

    IBM's really into the PowerPC, I wonder how this set-up would turn out using one of them...

  6. Entry level supercomputing by Anonymous Coward · · Score: 0

    One of the things missed by most of the Linux supercomputers is inter-nodal bandwidth. If you can break the data apart with minimal communications, and you're bound by CPU, not data, then the "cluster" is a great solution. "True Supercomputers" (which are more and more rare) are capable of moving 10s of gigabytes a second thru the CPU using memory that has no latency, etc. Also, things like the SP2 (IBM's BIG toy), the Cray T3E and the Origin 2000 have very high bandwidth I/O interconnections, none of which is less than 1000x faster than "fast ethernet". What the Linux world needs is a SERIOUS I/O solution. Myrinet is as close as it gets, but it doesn't scale past 64 nodes without some pain.

  7. Not really a fair test... by Anonymous Coward · · Score: 0

    Clusters of machines like this are great for tasks that are highly distributed, like rendering. You just hand off a frame at a time to each machine.

    Other tasks that aren't so parallelizable (e.g. scientific simulations, where you need to know what happens in the timeslice you're on before you can calculate the next one) won't perform as well.

    My Login's Not working...
    --PoochieReds
    Still, you might be able to get close if you had FDDI or GB ethernet between the nodes. I'd like to see how it performs on a scientific (or engineering) simulation test. Maybe something with fluid dynamics.
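
    A minimal sketch of the distinction drawn in this comment, assuming toy stand-ins for the real work: `render_frame` and `step` are hypothetical placeholders, and local threads stand in for cluster nodes. Independent frames can be handed out in any order, while each timeslice of a simulation has to wait for the one before it.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(frame_number: int) -> int:
    """Stand-in for rendering one frame; depends only on its own frame number."""
    return sum((frame_number * 31 + pixel) % 256 for pixel in range(1000))


def render_movie(frame_count: int, workers: int = 4) -> list[int]:
    """Frames are independent, so each one can go to a different worker."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves frame order even if workers finish out of order.
        return list(pool.map(render_frame, range(frame_count)))


def step(state: float) -> float:
    """Stand-in for advancing a simulation by one timeslice."""
    return 0.5 * state + 1.0


def simulate(initial_state: float, step_count: int) -> float:
    """Each timeslice needs the previous result, so this loop is a serial chain."""
    state = initial_state
    for _ in range(step_count):
        state = step(state)
    return state


if __name__ == "__main__":
    print(render_movie(8))
    print(simulate(0.0, 10))
```

    Real simulations usually parallelize within a timeslice rather than across timeslices, which is where the interconnect between nodes starts to matter.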

  8. IBM move Caldera, Debian, SuSE out of the picture by Anonymous Coward · · Score: 0


    Red Hat rules once again. There is no future for these other also-ran distros. If any of them were any good, IBM (and others) would invest in them.

    Red Hat is here to stay.

    Way To Go Linux!!

  9. Entry level supercomputing by Anonymous Coward · · Score: 0

    An IBM SP/2 is a cluster. It doesn't have the high-performance architecture of a Cray. I don't know what they're using these days, but it's probably no better than fast ethernet or FDDI.

  10. ServeRAID by Anonymous Coward · · Score: 0

    I don't think you would stand much of a chance getting that to work, unless int13 can handle gigabytes of diskspace, since no ServeRAID driver exists for win95/98 (assuming that they used the IBM ServeRAID controller for this test).

    Not to mention that win95/98 does not support SMP.

  11. Real-time rendering? by Anonymous Coward · · Score: 0

    POVBENCH measures how long it takes to parse and render a specific POV-Ray scene (skyvase.pov) with specific settings and write the resulting image to the disk in chunks of 1000KB. The method in use is ray tracing using floating point math. This method is not the same method used to render Toy Story. The resulting image contains 640 * 480 pixels. For each pixel 1 to 9 rays are traced. In addition to those, reflected rays are traced, too. 3 seconds is awesome. You can download POV and try for yourself.
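
    As a rough sketch of the workload size described above, the image size and the 1-to-9 rays per pixel give bounds on the number of primary rays. Reflected rays are left out because the comment gives no count for them, and the 3-second figure also covers parsing and disk writes, so the per-second numbers are only a floor.

```python
WIDTH, HEIGHT = 640, 480      # image size stated for POVBENCH
MIN_RAYS, MAX_RAYS = 1, 9     # primary rays traced per pixel
RENDER_SECONDS = 3            # time reported for the IBM cluster


def primary_ray_bounds(width: int, height: int) -> tuple[int, int]:
    """Return (fewest, most) primary rays for an image of this size."""
    pixels = width * height
    return pixels * MIN_RAYS, pixels * MAX_RAYS


if __name__ == "__main__":
    low, high = primary_ray_bounds(WIDTH, HEIGHT)
    print(f"primary rays: {low:,} to {high:,}")
    print(f"rays per second: {low // RENDER_SECONDS:,} to {high // RENDER_SECONDS:,}")
```

    That works out to between 307,200 and 2,764,800 primary rays per image, or at least 102,400 to 921,600 per second at 3 seconds.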

  12. Avalon is used for simulations by Anonymous Coward · · Score: 0

    Check it out at www.top500.org
    It's # 113

  13. trivial problem by Anonymous Coward · · Score: 0

    This is hardly a fair comparison. Let's take a problem that is clearly trivially parallelizable and has marginal internodal communication and then claim that it's a true measure of a cluster.

    What about the large memories and IO rates needed to feed a supercomputer?

    Anyway, I could spend some more time on this, but in this forum it's just a religious pissing match anyway.

    It's amazing what passes for technical merit and critical thinking in the linux community.

  14. Is Avalon faster then Cray? by Anonymous Coward · · Score: 0

    They used 36 PIIs in the cluster. Avalon uses 140 Alphas. How fast is that Cray thing anyway?
    Oh, and BTW, wouldn't using Alphas be more cost effective than this Intel crap?

  15. flamebait by Anonymous Coward · · Score: 0

    This one is just asking for it, so let's not start.

  16. Yet another x86 Abomination by Anonymous Coward · · Score: 0

    First, don't use IBM's prices! One can get much cheaper dual systems than from IBM. Hardly a reasonable comparison.

    And x86 isn't so bad compared to the Alpha when running Linux, because there aren't any good compilers for the Alpha on Linux. Yeah, one can violate the license agreement and use the Compaq compilers, but that's hardly proper.

    x86 does very well for the money.

  17. Entry level supercomputing by Anonymous Coward · · Score: 0

    More info...

    According to the books sitting on my desk for the SP2 we own, the "maximum internodal bandwidth" is 120MB/s (which is 1.2Gb/sec) and raw data across this will reach this bandwidth. Once you layer IP or PVM or anything else on top of it you're losing data. IP across 100Mb Ethernet won't do 10MB/sec; you're limited by non-deterministic protocols and the overhead.

    POVRAY is a useful benchmark for certain applications (those which go totally parallel with minimal internodal communications and which work with small data sets). One of the things we use our SP2 for is mining thru terabytes of data which is all related and requires as much bandwidth as possible between nodes.

    I'm not arguing Linux clusters aren't valuable, they are, just not for everything, and not for a lot of the classic problems. In almost all problems 10 CPUs of 1GFLOP are more usable than 1000 at 10MFLOP. Communications between CPUs will always be too slow.
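
    One way to put numbers on the "10 CPUs of 1GFLOP versus 1000 at 10MFLOP" point is Amdahl's law. This is a swapped-in textbook model, not something the comment itself uses, and the 99% parallel fraction is an assumed value chosen only for illustration. The model also ignores communication cost entirely, which would penalize the 1000-CPU case further.

```python
def amdahl_speedup(cpu_count: int, parallel_fraction: float) -> float:
    """Amdahl's law: speedup over one CPU when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cpu_count)


def effective_mflops(per_cpu_mflops: float, cpu_count: int,
                     parallel_fraction: float) -> float:
    """Delivered rate: one CPU's rate multiplied by the achievable speedup."""
    return per_cpu_mflops * amdahl_speedup(cpu_count, parallel_fraction)


if __name__ == "__main__":
    # Both configurations have the same 10 GFLOP aggregate peak.
    for fraction in (1.0, 0.99, 0.9):  # assumed parallel fractions
        few_fast = effective_mflops(1000.0, 10, fraction)
        many_slow = effective_mflops(10.0, 1000, fraction)
        print(f"parallel={fraction:.2f}  10x1GFLOP={few_fast:8.1f} MFLOPS  "
              f"1000x10MFLOP={many_slow:8.1f} MFLOPS")
```

    With a perfectly parallel job the two machines tie. At an assumed 99% parallel fraction, the ten fast CPUs deliver roughly 9,174 MFLOPS against roughly 910 MFLOPS for the thousand slow ones.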

  18. NT5 is coming...the Linux Killer by Anonymous Coward · · Score: 0

    NT 5.0 will kill all of you lusers off! Just ask any of the branches of the military. I'm going to tell my biatches in Washington to send a cluster of SmartShips(TM) your way!

    Ed "Got the pentagon in my pocket" Muth

    "SmartShips, NT5.0 are trademarks of the MicroShaft Corporation"

  19. Big Blue / Power PC's by Anonymous Coward · · Score: 0

    That's interesting. Maybe I should run a search on Big Blue, learn more about it. Thanks!

  20. IBM move Caldera, Debian, SuSE out of the picture by Anonymous Coward · · Score: 0

    You get points for enthusiastic Red Hat/Linux support. However, the blatant bashing of other distros, esp. Debian, is bound to lose some attention. And that part about people would invest in the others if they were any good is rather cliche. Perhaps something along the lines of the other distros being too difficult to configure to trust on a supercomputer. Overall, a C-- post. Keep working, though. With enough practice, anybody can be a first class troll!!!

  21. Wow. by Anonymous Coward · · Score: 0

    Proprietary operating systems are just looking worse and worse... could you do that with NT? :-) Yeah, right, for about 5 million and 37 MCSE's on-call to reboot the machines periodically...

    This reminds me of the story about one of the first vacuum tube computers in America. I guess the tubes burned out often enough that they had a platoon from the army swapping out bad tubes. The whole room was hot enough that they were dressed in their skivvies.

    "Jackson... #37 BSOD'ed again... Reboot!"
    "Cramer! hit the switches on 67 and 68!"
    "Hey Wilson... you're goin' to the brig! You haven't rebooted a machine all day, you lazy puke... What's that you got there, a cd? Some sort of penguin band or something?..."

  22. The question is... by Anonymous Coward · · Score: 0

    As interesting as this technology is, and as highly as I think of Linux and Beowulf, IBM's demo was extremely misleading. POVray falls under the category of "EMBARRASSINGLY PARALLEL TASK", for which the node-interconnects are only used to distribute the metadata and collect the end result afterwards. Supercomputers need much lower-latency interconnects than Gb ethernet for solving the kinds of problems they are purchased to solve, like FFT and normalization of very large, very sparse 4D matrices. IBM's cluster would completely tank on an actual supercomputer-class problem because the nodes would need to communicate very quickly during computation.

    That being said, I think there is a bright future for clusters. Commodity network hardware is getting faster (and latency is decreasing) at an exponential rate, FPGAs are getting faster, denser, and cheaper, making customizable parts cheap and easy to build (i.e., for short interconnects), and the consolidation of ALU, memory, and glue-logic onto the same die will also make custom hardware easier to build. Moreover, clusters have the quality of having memory bandwidth scale linearly with node count, a quality shared to a lesser extent by ccNUMAs and excluded by SMPs/UMAs. Software technology being developed for intelligent use of memory hierarchies (i.e., splitting up a task into cache-sized data sets) will be directly applicable to cluster architectures as well, and will take advantage of this linear-scale aggregate throughput. I think traditional supercomputers will remain top dog for certain problems for which these benefits will not apply, but I also think that the set of such problems will be growing smaller with time, as methods are discovered for effectively applying cluster technology to problems traditionally solved by supercomputers.

    --- Guges ---

  23. Amusing that IBM Didn't Bench Against RS/6000 by Anonymous Coward · · Score: 0
    Irrelevant. The test used Pov-Ray. That's called "Embarrassingly Parallel".

    See the "Trolling" thread currently running in comp.sys.super.

    (Can't be arsed to login, too much trouble.)

  24. Yet another x86 Abomination by Anonymous Coward · · Score: 0

    Note that while gcc's optimization for the Alpha is lackluster compared to that of Compaq's compiler, gcc's optimization on x86 is worse. Cygnus is trying to fix this now, but it remains to be seen if they can. The compiler was really built for architectures with 20+ spare GPR's, and gcc's register allocation et al curl up and die with only 6 GPR's, total, to work with (eax, ebx, ecx, edx, esi, edi). We really need to rewrite huge chunks of gcc some day (and egcs isn't enough of a rewrite).

  25. hmmm / Check single benchmarks by Anonymous Coward · · Score: 0

    00:00:06 2466.67 acer
    intel celeron 366 MHz
    windows 98

    An Acer? At 6 seconds?

    same test? not sure.

    but it is interesting

  26. IBM SP2 by Anonymous Coward · · Score: 0

    I work with the SP systems as well and I suggest
    anyone seriously interested in SP switch performance take a look at the following URL:
    http://www.rs6000.ibm.com/resource/technology/sp swperf.html#applperf

    I don't think you can just give a blanket "the SP switch inter-node communication is such and such". There's a lot more to it than that.

  27. Interesting.... by Anonymous Coward · · Score: 0


    jkhjkhkjh

  28. So much for the FUD that Linux doesn't scale well. by Anonymous Coward · · Score: 0

    By the way, has anyone considered the impact that low-cost supercomputers will have on the security of encrypted communications? It's looking like it's going to be a lot easier to crack codes.

  29. Deep Blue and Linux by Anonymous Coward · · Score: 0

    Deep Blue and Beowulf are analogous. Deep Blue is a special cluster running AIX 4.2, with a very fast switch for passing messages. The shared disk subsystem is also highly optimized.

    It is incorrect to compare SMP to parallel clustering. The difference boils down to whether processors share memory or use communication. SMP never scales well for large numbers of processors, because of cost, and that's why Beowulf etc. use clusters. Linux has limited SMP scalability, but SMP isn't nearly as interesting as clustering anyway.

  30. IBM SP2 by Anonymous Coward · · Score: 0

    ah, well.. I work with the latest silver nodes and I see much better performance than you report. such is life.

  31. A proud Debian user by Anonymous Coward · · Score: 0

    Why the fsck should I care about IBM? I use Debian, it works. My friends use Debian, they like it. There is a whole user and developer community who use it and like it. In which way will IBM hurt Debian by using RedHat? SuSE or Caldera are the ones who might die (worst scenario) since they actually have to make some profit. In fact, RedHat does not hurt the other distributions but benefits them by drawing attention to Linux. More attention to Linux means more applications (which will run on all distributions of course) and it also means there will be more Linux-ready systems available (again, it is your choice what to install on them).

    Debian (and others) benefit from RedHat the same way the FreeBSD community benefits from Linux, e.g. all those software projects inspired by Linux actually produce software that runs on BSD. KDE, GNOME, GIMP, etc. and all your beloved Linux applications are all available to the FreeBSD community at no cost.

    I think in OSS community we all benefit each other.

  32. Poor SGI and others by Anonymous Coward · · Score: 0

    I think Linux clusters can really erode SGI's supercomputers' market share.

  33. anyone can post results by Anonymous Coward · · Score: 0

    It probably makes sense to take some of the numbers at haveland.com with a grain of salt. So I could say that a cluster of 5 286s running Windoze 3.0 was able to do the benchmark in 2 seconds. It would probably be removed, but some of you would believe it. Use some common sense, people!

  34. No Subject Given by Anonymous Coward · · Score: 0

    PovRay benchmarks..
    Mohaha.. what a joke.

    give me LINPACK numbers baby!

  35. Second Place Results - NOT 1 machine by Anonymous Coward · · Score: 0

    The 9 second time was not a single machine - it was a cluster of 10 machines (remember these are PARALLEL results, single results is a different spreadsheet). Their list of hardware can be found at http://WWW.CE.UniPR.IT/research/parma2/

    The Povbench spreadsheet is not clear on the number of nodes and CPU's used, only the type of CPU used.

    Parma 2 had

    8 450 MHz PII processors
    2 PPro 200 MHz
    4 Pentium 100 machines

    I would count the 2 PPro 200s as one 400 MHz and the 4 100 MHz Pentiums as one 400 MHz, for a total of 10 400 MHz Pentium II processors.

    Also, we did achieve 3 seconds with several nodes down - using 28 processors. To get from 9 seconds to 3 seconds scaling at 100%, they would need to use 3 X the number of processors. Hmmm ....

    10 X 3 = 30

    Sounds about right.

    Jay Urbanski
    Netfinity Systems Engineer
    IBM Advanced Technical Support
    MCSE, PSE, Certified Solaris Systems Administrator
    (817)962-3597 TL 522-3597
    (817)962-7307 fax
    (800)413-9093 pager
    urbanski@us.ibm.com
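
    The back-of-envelope count in this comment can be written out as below. This is only a sketch of the comment's own simplification: clock rate is the sole measure of a processor, the 450 MHz PIIs are counted as 400 MHz class (as the comment does), and scaling is assumed to be perfectly linear. The function names are illustrative.

```python
REFERENCE_MHZ = 400  # the comment normalizes everything to 400 MHz Pentium IIs

# (count, MHz counted per processor), following the comment's own rounding:
# the 450 MHz PIIs are treated as 400 MHz class processors.
PARMA2 = [(8, 400), (2, 200), (4, 100)]


def equivalent_processors(machines: list[tuple[int, int]]) -> float:
    """Sum of clock rates expressed in reference-processor units."""
    return sum(count * mhz for count, mhz in machines) / REFERENCE_MHZ


def processors_for_target(current: float, current_seconds: float,
                          target_seconds: float) -> float:
    """Processors needed to hit target_seconds, assuming 100% linear scaling."""
    return current * current_seconds / target_seconds


if __name__ == "__main__":
    parma2 = equivalent_processors(PARMA2)
    needed = processors_for_target(parma2, current_seconds=9, target_seconds=3)
    print(f"Parma 2 equivalent processors: {parma2:g}")
    print(f"Needed for 3 seconds at linear scaling: {needed:g}")
```

    Under those assumptions the estimate is 30 processors, which sits close to the 28 the comment reports for the 3-second run.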

  36. Real-time rendering? by Anonymous Coward · · Score: 0

    uh, if the resolution is very small maybe.

    Pixar used over 3000 nodes to render their "A Bug's Life" movie, but recall they render very high resolution frames. (4096x4096?)

    I guess 800x600 realtime rendering can be done with an ordinary cluster, the frames are only 50-100 KB each if JPEG/GIF

  37. o/c'ed celerons! by Anonymous Coward · · Score: 0

    If using overclocked Celeron 300A you will get a lot of nodes! 2500 node celery cluster for $150k

  38. why not use suns? by Anonymous Coward · · Score: 0

    I understand the interest of extremely low cost Intel hardware, but wouldn't it be good to have fewer nodes and have more processors in each node? I'd think that would make the problem of internode communication start to head downward.

    And if we want a multiprocessor node, wouldn't it be good to use Suns? After all, SPARC is Scalable Processor ARChitecture.

    And if we can have IP over scsi, why not IP over ultra2 scsi, or differential ultra2scsi, or fibre channel?

  39. Here you go: by Anonymous Coward · · Score: 0

    Compaq CPlant Cluster (150 Alpha/Linux nodes)

    Rmax: 54240
    Rpeak: 150000
    Rank: #97 in Top 500 world fastest supercomputers.
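
    For readers comparing Rmax and Rpeak, a common derived figure is their ratio, the fraction of theoretical peak that the Linpack run actually sustained. The sketch below uses only the two numbers quoted above; since the ratio is unitless, it does not depend on which unit the list used.

```python
def linpack_efficiency(rmax: float, rpeak: float) -> float:
    """Fraction of theoretical peak (Rpeak) achieved on Linpack (Rmax)."""
    if rpeak <= 0:
        raise ValueError("rpeak must be positive")
    return rmax / rpeak


if __name__ == "__main__":
    efficiency = linpack_efficiency(rmax=54240, rpeak=150000)
    print(f"Linpack efficiency: {efficiency:.1%}")
```

    For this entry the ratio comes to about 36%.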

  40. W0w! D00d! by Anonymous Coward · · Score: 0

    Could you imagine a K-rad b30wulf cluster made from these thi... oh... um...

    nevermind.

  41. The question is... by Anonymous Coward · · Score: 0

    If it were necessary, they could have used IBM's SP Cluster Switch. It runs at 2.5 Gigabits, I think. They are bringing this technology from the RS/6000 SPs to Netfinity clusters.

  42. trivial problem - IBM engineer comments by Anonymous Coward · · Score: 0

    Your comments are 100% correct. As one of the IBMers who helped set up the cluster demo, I can attest that the story has been somewhat mis-reported. We never made the claim that this was anything other than a CPU-intensive benchmark, and a neat thing to show off. I don't suspect that the DOD or EPA will be doing A-bomb particle drift sims anytime soon on a similar system ~ but it WAS fun to see what could be done with an off-the-shelf copy of LINUX.

  43. IBM SP2 by Anonymous Coward · · Score: 0

    One thing to keep in mind is TCP is a lousy way to communicate across ultra high performance topologies... HIPPI+TCP sucks in general. There are lower level protocols necessary to eke out the bandwidth. You can check out some of the research done by Myricom (http://www.myri.com) on their 1.2Gb fabric, and IP sucks for this application. From talking to people at IBM, I get the feeling that they never intended people to use sockets across the fabric! Unfortunately, for many people it's IP or the highway, which can be a very limiting ideology to wrap oneself in.

  44. Yet another x86 Abomination by Anonymous Coward · · Score: 0

    Yes, but check the specs. Each of the Netfinity machines can be pulled off and run as a stand-alone, with its own SCSI, Ethernet, graphics card, etc.

    Each of the Alpha cluster nodes can just function as a node. No SCSI, no graphics card. It's built to perform purely as a node.

    Check the pricing for Alpha vs. PII. Crap or not, the Intel chips are cheaper. It's just a matter of how the cards are configured.

  45. 36 PII-400 Xeons match 48 Alpha-450's... by Anonymous Coward · · Score: 0

    Do consider that the Cray T3E-900 has 48x450MHz Alphas -- and that it gets matched by the Xeons.

    The Microway cluster won't even have the memory
    bandwidth of the Cray either. I think what this benchmark shows is the importance of having a large L2 cache running at the core speed of the processor -- since that's where the main difference in architectures lies.

  46. Slow down, people! by Anonymous Coward · · Score: 0

    And also how QUICKLY the solution can be put together -- Cray couldn't even install such a system in the time it took these guys to set this thing up.

  47. Why not compare to O2000? by Anonymous Coward · · Score: 0

    The current best foot forward that SGI/Cray has to
    offer is the O2000. That's what the ASCI Blue Mountain site has several of, not a T3E. So why didn't IBM compare with that? Because they wanted to be able to say "We beat Cray", having lost the performance war to the ASCI Blue Mountain site.

  48. IBM move Caldera, Debian, SuSE out of the picture by Anonymous Coward · · Score: 0

    the notes floating around internally from the few guys that set it up are that RedHat was the one on the shelf.....they'd have grabbed whatever they found.

    this literally was 'slapped together' just to see what the heck would happen.

    IBM as a whole didn't run out and decide to pick Red Hat as their 'approved' version of linux, the two main guys that set it up found Linux at Barnes and Noble and used it the day before the benchmark was run...


    opinions are not IBMs, just my own...

    An Anonymous IBM Employee

  49. Any Beowulf interconnect technology breakthroughs? by Anonymous Coward · · Score: 0

    A lot of people have already mentioned that using network technology (ethernet or myrinet) to interconnect beowulf nodes results in a system that isn't terribly useful for tasks requiring greater bandwidth.

    My question is, has there been any work done on alleviating this situation? Would using interconnected PCI slots help measurably? Or will we quickly run into the brick wall that is due to the x86 architecture's poor memory bandwidth?

  50. Real-time rendering? by Anonymous Coward · · Score: 0

    It might be possible with something like gigabit ethernet, but with 100baseT you would run out of bandwidth after adding just a few more nodes to the cluster IBM used to demonstrate.

  51. RS/6000 would probably win by Anonymous Coward · · Score: 0

    I know the PowerPC chips don't seem very spiffy when you look at SPEC benchmarks, but you must remember that most Unix boxen have considerably greater memory I/O bandwidth than their x86 brethren. Add this to the fact that IBM's SP clusters are connected with a very high bandwidth pipe. An RS/6000 SP cluster may not look very cost effective compared to a Beowulf, but it would probably still be faster. For absolute performance, Sun and SGI will probably hang on top for quite a while.

  52. Where's NUMA by Anonymous Coward · · Score: 0

    As far as I know, Beowulf is passe.

    The only companies making x86 NUMA machines are Sequent and Data General. Unfortunately, they are both pandering to Microsoft so we'll probably never see Linux running on any of their boxes.

    Any other manufacturers making exotic hardware like this for Linux?

  53. Interesting.... by Anonymous Coward · · Score: 0

    asdfqwerty

  54. Linux 2.2.2 by Anonymous Coward · · Score: 0

    They would have had to compile a new kernel to get Beowulf capabilities, since those aren't in the vanilla kernel.

  55. Any Beowulf interconnect technology breakthroughs? by Anonymous Coward · · Score: 0


    What about 1000Base-T?????

    Max bandwidth is in the same ballpark as the PCI bus

    Cards are around $350 now but forget about finding a switch for a decent price

  56. What about AGP by Anonymous Coward · · Score: 0


    Stupid question but 4XAGP will be the fastest I/O bus on the P.C.

    As Beowulf nodes do not need video, would it be possible to make an AGP I/O card to interconnect nodes????

  57. Linux 2.2.2 by Anonymous Coward · · Score: 0

    Couldn't they have installed the system from the CD, then simply upgraded to the 2.2.2 kernel (as I'm doing this very moment)?

    And yes, penguins do smell a little like fish.

  58. Wow. by Anonymous Coward · · Score: 0

    which service are you? you almost sound like a cto

  59. What about AGP by Anonymous Coward · · Score: 0

    It may be possible. AGP 1X was basically 66MHz PCI. Usually bandwidth is not the problem but latency. The gigabit backplane in a Sun Ultra Enterprise class machine is quite different from gigabit ethernet. Gigabit ethernet still has a lot of latency while the Sun backplane is probably 100 to 1000 times less.

    You should look into things like HIPPI if you really need the bandwidth.
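
    The latency-versus-bandwidth point can be illustrated with the usual first-order model, time = latency + size / bandwidth. The specific figures below are assumptions chosen for illustration, not measurements: a 100 microsecond latency for the network link, a 1 microsecond latency for the backplane (100 times less, the low end of the range suggested above), and the same 1 Gbit/s raw bandwidth for both.

```python
GIGABIT_BYTES_PER_S = 125_000_000  # 1 Gbit/s expressed in bytes per second

# Assumed, illustrative latencies (not measured values).
ETHERNET_LATENCY_S = 100e-6
BACKPLANE_LATENCY_S = 1e-6


def transfer_time_s(message_bytes: int, latency_s: float,
                    bandwidth_bytes_per_s: float) -> float:
    """First-order cost of one message: fixed latency plus time on the wire."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def slowdown(message_bytes: int) -> float:
    """How many times slower the high-latency link is for one message."""
    slow = transfer_time_s(message_bytes, ETHERNET_LATENCY_S, GIGABIT_BYTES_PER_S)
    fast = transfer_time_s(message_bytes, BACKPLANE_LATENCY_S, GIGABIT_BYTES_PER_S)
    return slow / fast


if __name__ == "__main__":
    for size in (64, 4096, 1_000_000, 100_000_000):
        print(f"{size:>11,} bytes: {slowdown(size):6.2f}x slower")
```

    With equal bandwidth, the gap is large for small messages and almost vanishes for bulk transfers, which is why tightly coupled codes that exchange many small messages care more about latency.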

  60. Interesting.... by Anonymous Coward · · Score: 0

    aoeu',.pyf

    Get a real keyboard layout!!!

  61. not really... by Anonymous Coward · · Score: 0

    Ya'know, SGI may not be worried but I bet the state department is. Iran, Iraq, China, Pakistan, India have a very nice and cheap way to make a super computer to further their nuclear programs.

    Actually, those countries may find many civilian uses for these boxes.

  62. Is Avalon faster then Cray? by Anonymous Coward · · Score: 0


    http://www.cray.com
    A Cray T3E, which is the fastest massively parallel Cray, runs at up to 1.2 TeraFLOPS, with 2048 DEC Alphas. I believe that there is an upgrade to get it to run 600 MHz Alphas instead of 450's, which would increase the max speed. I remember reading that they put together a machine that did over 1 TeraFLOPS on real code. So basically, the answer to "how fast is a Cray" is "Very". :)

    Forest Godfrey

  63. IBM SP2 by Anonymous Coward · · Score: 0

    The bandwidth numbers you're talking about (25 MB/s) used to be true with the network as of 1994; perhaps you have access to an old model of SP?
    With the latest model I found more than 100 MB/s.

  64. HIPPI/Fibre Channel Wasted on PCI Bus by Anonymous Coward · · Score: 0


    HIPPI's 100/200 Mbytes/sec would not be achieved on the PCI bus. Hence the question regarding AGP I/O adapters.

    Probably never will happen as the market is way too small and unfortunately PCI rules.

    I was all hot and bothered by the 1.6GByte/Sec XIO bus speed on our SGI O2000. Started digging into the details on the four channel XIO SCSI card. You guessed it: card internally uses a PCI bus.....

    The PC architecture is really hurting for high bandwidth I/O. Every time we purchase a new system for throwing around our multi-GB files we review the NT options, but unfortunately the proprietary expensive unix solutions always win......

  65. Yet another x86 Abomination by Anonymous Coward · · Score: 0

    Uh? Since when are MS compilers considered good at x86 code generation? My pgcc is far better, and I think Intel's C is too.

  66. Shut up zealot by Anonymous Coward · · Score: 0

    Have you actually seen the code that VC++ puts out? It's far better than any GCC derived compiler I've ever seen, including various versions of pgcc and egcs. I think that this stems from a few different reasons. First, GCC is designed to compile to multiple processors with the same front-end. There will always be processor specific optimizations that you really have to put into the front end to make them work. It's a trade-off -- performance for portability. Second, as good as the open-source model is, I'm not convinced that it produces optimal results for something like a compiler. Compilers require a lot of cutting edge computer science to make fast, and (donning asbestos suit) people who have such skills generally don't work for free. They get jobs at Microsoft or at SGI/Cray or any of the other compiler vendors where they can pull down serious money for their efforts. There are other major problems with open source compilers, in the area of language features and proper implementation. Egcs has some pretty major C++ bugs in it, and I know all of the OSS wankers love C, but for people who prefer not to live in the 80s, OOP is a real software engineering win, and OOPish C like the GTK people have just doesn't cut it.

  67. RS/6000 would probably win by Anonymous Coward · · Score: 0

    I don't know... I just have this odd feeling that a Cray 2048-processor T3E/1200 setup would beat out an RS/6000 to the Nth degree. =)

    It just can't be classified in the same category... I mean, it doesn't even make your neighbor's lights dim.... =)

  68. IBM WAS by John+Campbell · · Score: 1

    Certainly isn't any worse than whatever genius at Microsoft decided to name their embedded OS "WinCE". I mean, yeah, I wince whenever I think of the thing, but...

    And I thought marketing was supposed to be Microsoft's -strong- point...

  69. IBM move Caldera, Debian, SuSE out of the picture by six11 · · Score: 1
    If any of them were any good, IBM (and others) would invest in them.

    reconsider that. what you say isn't logical. You're saying that the fact that IBM and others only invest in RedHat proves that the other distros don't have merit, but I could line up thousands of slashdotters who'd argue that debian or suse or whatever kicks redhat in the arse. I'm a redhat user, mostly because that's what I have used in the past, and it fits how I need the OS to install/function. I'd imagine that IBM and others are choosing RedHat to be their Linux prodigy child because it's a smart marketing move. From where I stand, redhat is the frontrunner in the corporate world, and companies will just run with that because of redhat's established name.

    The other distros have qualities to them that are better for some people than redhat's distro... IBM picking redhat is purely a marketing move and says very little about the quality of other distros compared to rh.

  70. What if? by gavinhall · · Score: 1

    Posted by Olaimi:


    What if IBM had this package in a box scheme along with
    - UDB (AKA DB2)
    - VisualAge suite
    - Lotus Notes (Domino)
    - e.commerce
    - well the list is very long i guess !

    I bet Microsoft have no future in corporate IT Departments!

    Cheers ..

  71. Real-time rendering? by Ami+Ganguli · · Score: 1

    Ok, so how fast is this? Does the benchmark measure how long it takes to render a "typical" frame? If so, does that mean it would take (3 seconds/frame * 24 frames/second * 17 nodes) 1224 nodes to render a movie in real-time?


    Something like that could make a really cool video game. Of course, in ten years your Playstation will be able to do it.


    --
    It is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail. - Abraham Maslow
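
    The estimate in this comment, written out as a sketch. It assumes what the comment assumes: the benchmark frame is "typical", the 17-node cluster needs 3 seconds per frame, and adding nodes speeds things up perfectly linearly with no communication overhead.

```python
def nodes_for_realtime(seconds_per_frame: float, frames_per_second: float,
                       nodes_in_cluster: int) -> float:
    """Nodes needed to render in real time, assuming perfectly linear scaling."""
    # The cluster must get seconds_per_frame * frames_per_second times faster.
    required_speedup = seconds_per_frame * frames_per_second
    return required_speedup * nodes_in_cluster


if __name__ == "__main__":
    nodes = nodes_for_realtime(seconds_per_frame=3, frames_per_second=24,
                               nodes_in_cluster=17)
    print(f"Nodes needed for real-time rendering: {nodes:g}")
```

    This reproduces the 1224-node figure. In practice every frame would also have to be spread across all nodes at once, which makes the interconnect part of the problem.
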
  72. IBM WAS by Smack · · Score: 1

    IBM has been around long enough that I think they've run out of acronyms! For example, WAS already stands for "warehouse administration system" and "work activity system". And that means it's one of the least used set of initials out there!

  73. Interesting.... by mackga · · Score: 1

    I especially liked the part when they said they got Linux from a bookstore the day before. Heh.

    "Hey, Bob! We got this nifty IBM cluster here. What'd you want to do with it?"

    "Wait a minute. I'll run out and grab a Linux cd from the bookstore down the street."

    Hahahahhahaha. That's fucking great!

    --

    "shop smart:shop s-mart" ash

  74. hmmm by ptomblin · · Score: 1

    But look carefully at that entry for the Dual Pentium IIs, and you'll see that the total cost of the system is listed at $12,000. Either that is one KICK ASS system, or more likely it's a Beowulf cluster of dual P-IIs, and they forgot to mention how many CPUs were involved.

    --
    The next Cmdr Taco duplicate will be ready soon, but subscribers can beat the rush and see it early!
  75. Not a very big T3E in that test... by Troy+Baer · · Score: 1

    That was only a 64-processor air-cooled T3E, the smallest one SGI makes. SGI makes Origin 2000s twice as big (128 CPU SMP) and T3Es *32* times as big (2048 CPU MPP). Also, I'd be more impressed if IBM had pointed to a more widely used (and less temperamental) benchmark than parallel POV-Ray, which ends up being mainly I/O bound. NAS Parallel Benchmark numbers would be nice.

    Beowulf clusters are nice if you've got a parallel problem that only scales well out to a moderate number of processors (32-64 at most), or if price/performance matters more than raw performance. They get clobbered if the problem is bounded by I/O, communication latency, or per-CPU memory bandwidth.

    Frankly, IBM has a lot more to worry about from Beowulf clusters than SGI does. Their supercomputer class machine, the SP, is just a cluster of rackmount RS/6000s with a very high speed internal network, and it has all the same problems as a Beowulf cluster relative to a more tightly coupled parallel system like a T3E or an Origin. Plus, AIX is eeeeeeevil; IRIX is much nicer IMHO.

    And before anybody asks, yes, I work with both traditional supercomputers (Cray T94, Cray T3E/600-136LC, SGI Origin 2000/24xR10-250, IBM SP-2/8) *and* a Beowulf cluster. We've been doing benchmarks to compare our Beowulf to our big machines; in some cases, the Beowulf wins, and in others, the big machines win. It really depends on the problem. We (i.e. my group at OSC) may be announcing some benchmark pages here in a few weeks.

    --Troy

    --
    "My life's work has been to prompt others... and be forgotten." --Cyrano de Bergerac
  76. Slow down, people! by red_dragon · · Score: 1

    Before you keep posting messages about how unfair the comparison was, if you read the article again, you'll notice that the point they want to convey is not how fast the Linux cluster was compared to the Cray, but instead how easy and inexpensive it was to set it up and get running, using just a few x86 boxes (I admit that Netfinities aren't exactly what I think about when I hear the word "cheap") and software that can be acquired for free. Damn, they got the software from Barnes & Noble! How much easier than that can it get?

    --
    In Soviet Russia, Jesus asks: "What Would You Do?"
  77. why not use SCSI? by red_dragon · · Score: 1

    Hell, no, not SCSI for internode communication. Not even Ultra2. Yes, it's considerably faster than Ethernet in data transfer rates, but (a) you can't put a switch on a SCSI bus to avoid having a node wait for another one to finish sending data to start transmitting; (b) a device in a SCSI bus can't arbitrarily send data to another device in the chain; (c) you're limited to up to 16 devices in the bus (I'm not sure about whether recent developments change that limitation, though). It would take a fugly hack to make it suitable for the job. Fibre Channel and FireWire could be better choices, but I don't know jack about them, so no further comments.

    --
    In Soviet Russia, Jesus asks: "What Would You Do?"
  78. Where's NUMA by red_dragon · · Score: 1

    Not yet for Linux, but wait for SGI to come up with some ccNUMA stuff in the not-so-distant future... or so it seems to be.

    --
    In Soviet Russia, Jesus asks: "What Would You Do?"
  79. The question is... by LightLiner · · Score: 1

    Is the comparison to a Cray fair when you consider
    the inter-node communication/bandwidth needs?

    In clustering and parallel computation, bandwidth
    counts. My guess is that a different application
    that requires much more communication between
    nodes, the T3E would step on the Netfinitys. 100
    megabit ethernet does it for low-communication
    jobs, but what about those that require much more
    intensive inter-node communication.

  80. Yet another x86 Abomination by Jeff+DeMaagd · · Score: 1

    ---Check the pricing for Alpha vs. PII. Crap or not, the Intel chips are cheaper. It's just a matter of how the cards are configured.---

    Oooh graphics cards are expensive aren't they?

    You have not covered the performance aspect. Alpha systems have twice the FPU power of any Intel system at the same price, new (per MHz is a little different, but cost-performance is more important than CPI). For clustering, that is very good. Remember that all new Alphas currently have 64-bit PCI slots, such as those used for gigabit or four-port duplex 100bTX cards, reducing the memory system bottleneck and increasing raw comm throughput for parallel cluster/node computing like this. Communication is the key for parallel.

    JRDM

  81. 36 PII-400 Xeons match 48 Alpha-450's... --- OLD by Jeff+DeMaagd · · Score: 1

    The comparison you make is bad... The Cray test was done about 15 months ago AND used older software (POVRAY 2.2 vs 3.02), older compilers, older generation CPU, and no one uses 450MHz Alpha CPUs anymore. Cost wise, a new DS20 dual Alpha computer is less expensive than a new quad Xeon-450 AND outperforms it.

    If Microway makes a new cluster, its memory performance/bandwidth will probably multiply by 10 given the new chipset.

  82. No Subject Given by GrenDel+Fuego · · Score: 1

    Deep Blue runs on PowerPC processors. There's a port of Linux for PowerPC processors.

    The cluster may be able to outrun big blue... but if you install Linux on Big Blue, that should speed it up quite a bit i believe.

    Then again, I seem to recall something about Linux being not quite THAT scalable, although I could be wrong..

    Can Linux handle thousands of processors?

  83. Big Blue / Power PC's by GrenDel+Fuego · · Score: 1

    Try http://www.starbridgesystems.com/Pages/technology.html

    It's actually information on another supercomputer, but it compares the system to the IBM Blue Pacific.

    Under the processors section, it says that the IBM system uses "5,856 Power PC 604 processors"

  84. hmmm by tgd · · Score: 1

    I think what's interesting about this is what it didn't say -- that the old record of 9 seconds was on a Dual Pentium II. So having *18 times* the number of processors only got it three times the performance...

    Any comments on this? Obviously a dual Pentium II is pretty damn good at this too, being only 1/3 the speed of a $5.5 million supercomputer. Anyone have any idea why adding so many processors to the Linux cluster would improve results so little?
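    To put a number on that complaint, here is a minimal sketch of the usual "parallel efficiency" calculation. It takes the figures in this post at face value (18 times the processors, roughly 3 times the speed); the function name is made up for illustration.

```python
def parallel_efficiency(speedup: float, processor_ratio: float) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / processor_ratio


# Figures quoted in the post: 18x the processors for about 3x the speed.
efficiency = parallel_efficiency(speedup=3.0, processor_ratio=18.0)
print(f"efficiency: {efficiency:.1%}")  # prints "efficiency: 16.7%"
```

    Anything close to 100% would mean the extra processors were fully used; a figure this low suggests either the job is not processor bound or the baseline was not really a single dual-CPU box.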

  85. hmmm... I'll hazard a guess on this one... by tgd · · Score: 1

    That's a point, but I doubt that's the cause in this case. I suspect it's something else. If I was rendering images (as opposed to large amounts of calculations that deal with the results of other sets of calculations), then I'd just split the image up into 36 chunks and have one processor blast through each chunk, and stick them all back together at the end. My assumption is that's how this test works because they mentioned how a few scanlines were dropped when one node went off line. So it's not true PVM-style clustering like Beowulf.

    In this case, 18 times the processors should give 18 times the speed -- unless the test really isn't processor bound. I'm not sure what it would be bound by, however... lousy implementation? I/O? Network?
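    The split-render-stitch idea above can be sketched roughly as follows. This is only an illustration of the approach as described in this post, not how the actual benchmark software works: the 640x480 image size, the function names, and the dummy "renderer" are all assumptions, and the bands are rendered one after another here rather than on separate machines.

```python
from typing import List


def split_scanlines(height: int, workers: int) -> List[range]:
    """Divide rows 0..height-1 into contiguous bands, one per worker."""
    base, extra = divmod(height, workers)
    bands = []
    start = 0
    for i in range(workers):
        # The first `extra` bands take one additional row each.
        size = base + (1 if i < extra else 0)
        bands.append(range(start, start + size))
        start += size
    return bands


def render_band(rows: range, width: int) -> List[List[int]]:
    """Stand-in for the ray tracer: one dummy pixel row per scanline."""
    return [[(x + y) % 256 for x in range(width)] for y in rows]


def render_image(width: int, height: int, workers: int) -> List[List[int]]:
    """Render every band and stitch the results back together in order."""
    image: List[List[int]] = []
    for band in split_scanlines(height, workers):
        image.extend(render_band(band, width))
    return image


image = render_image(width=640, height=480, workers=36)
print(len(image), "scanlines rendered")
```

    With a fixed split like this, the whole job waits on the slowest band, which is one reason a scene with uneven complexity would not scale linearly.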

    If the test isn't really processor bound then the comparison to the cray is meaningless, because there's something wrong with the way the software is coded to work on parallel machines, I'd think.

    I disagree that the $12k cost means it was a cluster of Pentium II's. I've bought a couple Pentium II systems in that range, it's easy to get up there when you add a lot of RAM, lot of harddrive space, etc. Using name parts jacks the price up a lot. (ie, VA's selling systems with Intel boards rather than supermicro or some other lower-cost company...)

    On a side note, I remember reading a year or two ago that someone was working on a networking layer that allowed IP and other protocols to be routed between cluster machines over a 40MB/sec SCSI bus. Anyone know if that ever got to completion? A four-fold jump in network speed would make quite a difference to I/O bound applications. (And SCSI cards are a lot cheaper than Gigabit ethernet or other real high-speed networking technologies...)
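    A quick sanity check on that "four-fold" figure, as a rough sketch. It assumes the comparison is against 100 Mbit/s Fast Ethernet (the post does not say which network) and looks only at raw line rates, ignoring protocol overhead on both sides.

```python
def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbit_per_s / 8


fast_ethernet_mb_s = mbit_to_mbyte(100)  # assumed baseline: 100 Mbit/s Ethernet
scsi_mb_s = 40.0                         # SCSI bus rate quoted in the post
ratio = scsi_mb_s / fast_ethernet_mb_s

print(f"Fast Ethernet: {fast_ethernet_mb_s} MB/s, SCSI: {scsi_mb_s} MB/s")
print(f"raw ratio: {ratio:.1f}x")
```

    On raw numbers the gain comes out a little over three-fold, so "four-fold" is in the right ballpark but slightly generous.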

  86. Poor SGI and others by tgd · · Score: 1

    I doubt it...

    SGI's NUMA architecture means data can be pumped *much* more quickly between nodes, 100-1000 times as fast. Network-based Linux clustering is useful only for calculations that are fairly self-contained and don't need a lot of data to process.

    What I think would be more interesting, given SGI's leanings towards supporting Linux on their MIPS and Intel platforms, is if they eventually tweak the multiprocessing in the kernel to support NUMA style multiprocessing and I can throw Linux on an Origin server. Or maybe better yet a NUMA-architecture Intel machine (I'm not really up on floating point speed comparisons between newer MIPS and newer Intel chips). Since they've dropped real PC-compatibility on their new Intel machines, that sort of a shift is a lot less painful than the initial dropping of support for DOS/16 bit apps.

    So SGI doesn't get hurt by Linux. Linux *can't* really compete with a Cray at any real-world tasks (not yet...). And SGI is in a *real* good spot to be the ones selling the Linux-compatible hardware that actually could. In which case, why would they care? Their profit may be lower on a $500k Cray-comparable NUMA Linux system than on the Cray, but I'd bet they'd sell enough more of them to make up the difference.

    Time will tell.

  87. Second Place Results - NOT 1 machine by tgd · · Score: 1

    Ah, that clears it up. It's always good to hear the real deal from the source. :)

  88. No, but 2.2 is an update to RH5.2 by slothbait · · Score: 1

    ...which is equivalent to loading a service pack in the Windows world. How many people run stock NT 4.0? Even clueless users know better: they make sure to keep up with service packs. Using RH 5.2 plus updates is as close to "off the shelf" as a standard production line Windows system.

    --Lenny

  89. not really... by slothbait · · Score: 1

    Clusters are definitely a better idea for something as blatantly parallel as non-real-time rendering. Commodity parts are vastly more cost effective. However, this is a very special case. Most applications require far more bandwidth for heavy interprocess communication. Try such an application on a Beowulf-type system and you can watch it fall flat on its face. Suddenly computation is I/O bound, and the Cray really earns its keep.

    Things like Crays are expensive mainly because they have very special, very fast hardware for this purpose. It may be extraneous hardware for something straight ahead like a render farm, but there are many cases where such massive bandwidth is very necessary. Thus, for most applications, replacing a Cray with a Beowulf cluster just isn't an acceptable solution.

    Beowulf clustering has been proven to be a cost-effective non-real-time rendering system, however.
    --Lenny

  90. Could it be off the shelf? Close enough by Paul+Carver · · Score: 1

    I'm still running 2.0.36 but I believe a 2.2 kernel rpm is available. If their copy of Linux was Red Hat based couldn't they have bought it off the shelf, installed it, and then downloaded and installed the kernel rpm? I'd say that if they have to get a single file via free download and type 'rpm -i' that still counts as off the shelf.

  91. Impressive failover by mikemcc · · Score: 1

    I was most impressed with the graceful failover. Unplugging one machine and having nothing more dramatic than a slight delay in one portion of the result is the kind of presentation that really makes an impression with "results" type people - you know, the ones who say, "I don't care how it works. Just show me that it does work."

    I'm also pleased with IBM's recent decision to release their Websphere Application Server on Linux - although the person in marketing who thought up that name should be demoted. The acronym is "IBM WAS." Both passive and past-tense. Sheesh!

  92. hmmm / Check single benchmarks by Akira1 · · Score: 1

    I noticed some of those results too. I am thinking that a governing body isn't involved with this process. Several other VERY strange results permeated the benchmarks. I downloaded the results and will keep em around just for kicks.

    --
    Food: It's whats for dinner
  93. Yet another x86 Abomination by Jeffrey+Baker · · Score: 1

    Wouldn't it be possible to export all of your Linux header files over to, say, a windows box, compile the code with VC++, then link it against the appropriate libraries?

    People did this with BeOS when CodeWarrior/x86 was spitting out terrible machine code. Say what you must about M$, but their x86 codegen kicks serious butt.

    For alphas, how about building the code on DEC Unix with the cool compilers?

  94. IBM move Caldera, Debian, SuSE out of the picture by GypC · · Score: 1

    Shouldn't your 13-year-old ass be in school right now?
    Get a clue.
    .

  95. Could it be off the shelf? Close enough by jnazario · · Score: 1

    we did it off the shelf. we're a redhat and kernel mirror, so we built our own 2.2.x kernels on top of a 5.2 install. we tweaked, of course, but hey, if you are building a supercomputer or a cluster, you better get set to tweak.

    100bT and a switched hub, DEC Tulips bought on sale, donated hardware, etc... we paid approx $400 for our 8 node cluster.

    i mean, who the heck wants to drop in a 2.2.x kernel rpm? c'mon! sources have been out for a while....

    --
    jose nazario jose@biocserver.cwru.edu
  96. Where to find software? by wald0+from+j00nix · · Score: 1

    Where does one find software capable of clustering? I've looked at MOSIX, which seems like it might work. But I've heard a lot about Beowulf (excuse the ignorance of this next part). Is Beowulf a software package I can download somewhere? What other options are there for Linux, OpenBSD, or FreeBSD? Thanks..

  97. Deep Blue by trb · · Score: 1

    It's misleading to say that Deep Blue runs on RS/6000 processors. The chess engine is all in custom VLSI chess processors, the RS/6000 just acts as a control processor, which isn't particularly interesting as a supercomputer application.

  98. Impressive failover by Guy+Smiley · · Score: 1

    The failover is part of PVMPOV, and has nothing to do with the configuration of the systems. It was coded this way because it used to run on a room full of unreliable machines that ran at wildly different speeds that other people were using, and even if the machine didn't die, it was possible that a system was busy with other things at the time, so I didn't want to wait for renderings to finish when other CPUs were idle... Check out the PVMPOV Home Page for more info. Yes, 32s was impressive at one time (it used to be at the top of the list).
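    For anyone wondering what that kind of application-level failover looks like in outline, here is a toy sketch. It is not PVMPOV's actual code and does not use PVM at all: the `WorkerDied` exception, the node names, and the round-robin scheduling are all invented for illustration. The only idea taken from the description above is that a block lost to a dead node is handed to another node instead of stalling the render.

```python
from collections import deque


class WorkerDied(Exception):
    """Raised by the hypothetical render call when a node stops responding."""


def render_all(blocks, workers, render_block):
    """Hand out blocks round-robin; requeue any block whose worker dies."""
    pending = deque(blocks)
    alive = list(workers)
    done = {}
    turn = 0
    while pending:
        if not alive:
            raise RuntimeError("every worker has failed")
        worker = alive[turn % len(alive)]
        turn += 1
        block = pending.popleft()
        try:
            done[block] = render_block(worker, block)
        except WorkerDied:
            alive.remove(worker)   # stop scheduling onto the dead node
            pending.append(block)  # the lost block is rendered later by another node
    return done


def flaky_render(worker, block):
    """Pretend node3 gets unplugged partway through the job."""
    if worker == "node3" and block >= 5:
        raise WorkerDied(worker)
    return f"block {block} rendered by {worker}"


result = render_all(range(10), ["node1", "node2", "node3"], flaky_render)
print(len(result), "blocks finished")
```

    Every block still gets rendered; the unplugged node just costs a short delay while its block goes back on the queue.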

  99. A few problems here... by MuyJuan · · Score: 1

    First, a few posters forgot their reading glasses and failed to notice that there were 17 machines, each of which had 36 PIIs in them. If you could have done that with 17 machines having only ONE PII, THAT would be news.
    Second, the article says they used Xeons. I don't know about what prices IBM gets, but the cheapest I could find a Xeon was about $700. At this price, just the Xeons would cost half a million. The $150k price tag on this setup is just unbelievable, unless either (a) I really misunderstood how many processors they have, and/or (b) $150k was just what they had to buy in addition to what they already had lying around.

  100. A few problems here... by MuyJuan · · Score: 1

    So that would be what? 16 2-headed Xeons and a 4-headed Xeon? 6 4-headed Xeons, 1 2-headed Xeons and 10 1-headed Xeons? I figured since it was IBM they could come up with practically anything on short notice. Wait...what am I SAYING? Anyhoo, nowhere does it say 36 total Xeons. Also nowhere does it say 36 for each server. I just did the math and balked at 2.11765 processors per server.

  101. A few problems here... by Axe · · Score: 1

    ts.ts. 36 Xeons overall. read carefully next time.
    aint no 36 headed xeons arounds.

    --
    <^>_<(ô ô)>_<^>
  102. Partially correct by Jason+Abate · · Score: 1

    Overall it depends on the application. GCC is not that bad on Alpha integer. The performance loss is mostly in floating point and math libraries.

    This shouldn't be true for much longer. Compaq released their math library for the Alpha last week (see here for details), and, according to posts to comp.lang.fortran, they will be releasing their Fortran compiler as well (as a commercial product, not for free). This should make Alphas much more appealing for cluster use.

    -jason

  103. Entry level supercomputing by Jason+Abate · · Score: 1

    In vectorcomputing, each node computes a very small part of the big picture, making communication time a very big (or small in this
    case) bottleneck. Thus these computers need super fast, specialized networking connections.


    Umm, I think you're confused. Vector computers, such as the older Cray machines, use special vector processors that can operate very efficiently on long vectors of data, applying the same operations (hence the name). Things get a little more confusing with later machines which are actually parallel-vector computers (i.e. they had multiple vector processors that worked in parallel).

    It is generally accepted that parallel computers, whether they are "big iron" type machines, such as the T3E, Origin 2000 or SP2, or clusters of workstations and PCs, are the way to go for high-performance computing. Of course, some people would point to the latest vector machines from Japan to contradict this...

    You are right that the true measure of performance is based on applications, and there are applications suited to each of these architectures. We have found that for our problems (large-scale reservoir modelling) clusters of commodity PCs perform quite well in comparison to an SP or T3E, even with 100 Mbps networking, but there certainly are other applications with more fine-grained communication requirements for which even a T3E or O2k is barely sufficient.

  104. Yet another x86 Abomination by arivanov · · Score: 1

    And a very expensive one:

    Check www.microway.com for an Alpha cluster priced at $2,500 per node and $4,500 for the master console. This means that for the $150,000 used by IBM one could assemble a 50+ node Alpha cluster instead of 17 PCs...
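    The arithmetic behind the "50+ node" claim, as a small sketch. The prices and budget are the ones quoted above; the function name is made up, and network gear, racks and so on are deliberately left out, so treat the result as an upper bound.

```python
def nodes_for_budget(budget: int, node_price: int, console_price: int) -> int:
    """Whole compute nodes affordable after paying for one master console."""
    return (budget - console_price) // node_price


# Figures quoted in the post: $150,000 budget, $2,500 per node, $4,500 console.
print(nodes_for_budget(150_000, 2_500, 4_500))  # prints 58
```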

    God, when will people ever learn that x86 is just not worth it...

    --
    Baker's Law: Misery no longer loves company. Nowadays it insists on it
    http://www.sigsegv.cx/
  105. Partially correct by arivanov · · Score: 1

    Microway has NDP compilers and libraries that are comparable to Compaq's. Unfortunately not all of them are available for Linux. Unfortunately they also cost money.

    Still, I would bet that you can actually get a very decent special deal if you purchase all the stuff together.

    Overall it depends on the application. GCC is not that bad on Alpha integer. The performance loss is mostly in floating point and math libraries.

    Anyway, I will bet for the Alpha for most of the cases ;-)

    --
    Baker's Law: Misery no longer loves company. Nowadays it insists on it
    http://www.sigsegv.cx/
  106. And what about the 1G network to support this by arivanov · · Score: 1

    Sorry dude. This does not scale. You are going to start running into some real network equipment problems and heavy expenses after 16-24 nodes.

    So getting a bunch of sloppy boxen is not a good idea. There has to be a compromise between box speed, box quantity and price of network equipment.

    --
    Baker's Law: Misery no longer loves company. Nowadays it insists on it
    http://www.sigsegv.cx/
  107. IBM SP2 by TA · · Score: 1

    Er, the IBM SP2 inter node bandwidth is nothing to write home about.. I'm working with these beasts, and the SP switch can do around 30 megabytes/second at maximum. If you reach 25 MB/s in real life then you're lucky. I haven't heard about any big improvements on that speed on the newest models either, in any event they must increase that speed by several orders of magnitude if you want to compare with e.g. Origin 2000. And considering the price of an SP2 rack the price/performance is, eh, interesting..
    TA

  108. IBM SP2 by TA · · Score: 1

    Oh I have read that page, and I have used all the tuning tricks in the (IBM) books, and the SP switch bandwidth *still* sucks. Other companies we work with on a big project have done a lot of testing as well, and the TCP bandwidth is just bad. As I said, if you get 30MB/sec in real life then you're good (and don't even think about UDP, that's really terrible). Now, we're not using the latest and greatest hardware, the nodes we use are 133 MHz. But we don't see much improvement from the 66 MHz nodes.
    No, I'm not impressed by the SP switch. And besides, it's a terrible beast to work with.
    TA

  109. IBM SP2 by TA · · Score: 1

    Yeah, we're using slightly old models, as I mentioned in another posting. That's the deal with IBM, however I didn't think they're pushing 1994 models on us! The first rack had 66MHz nodes and the HP switch, the next (which came a couple of months later) had changed to the SP switch (less reliable from our experience btw), the newest nodes are now 133MHz which are still a bit behind the specs you can find on the latest and greatest.
    It's interesting that you have measured 100MB on the latest equipment, the application should in theory be running on new hardware when it gets operational. It's very useful to have an idea of how the switch will perform, so thanks a lot for that info.
    TA

  110. Amusing that IBM Didn't Bench Against RS/6000 by InitZero · · Score: 1

    It's amusing to note that IBM didn't compare the Linux cluster to its own hardware. I'd be curious to know how a 12-way RS/6000 S70/S7A running AIX under HACMP would stand up to the Netfinity/Linux assault.

    InitZero

  111. hmmm by David+F. · · Score: 1

    Well, (this is just speculation) I don't think they just used one Dual PII. I'm mainly making this guess based on the cost ($12,000). Perhaps the Dual PII is what one node is, and they have several of these nodes?

    --
    ---- Dave
  112. A proud Debian user by Tooky · · Score: 1

    I use Red Hat myself (well, Mandrake, but it amounts to the same), but what Big Blue have done is exactly what you have just said: they have highlighted the open source community in a way that no one else has the profile to do... the reason they used RedHat is because it's the distribution that was in the back of the book they bought, not because it's any better or worse than any other distribution, or UNIX derivative. I wouldn't have been surprised to see the same article around FreeBSD, but it was RedHat and Linux... and mighty top stuff it was too!!

  113. Could it be off the shelf? by Daiv · · Score: 1

    Notice that 5 of the top 10 systems were Linux based - from
    http://www.haveland.com/cgi-bin/getpovb.pl?search=Parallel%3A&submit=List+all+Parallel+Results.

    D

  114. Amusing that IBM Didn't Bench Against RS/6000 by SoftwareJanitor · · Score: 1

    It's amusing to note that IBM didn't compare the Linux cluster to its own hardware. I'd be curious to know how a 12-way RS/6000 S70/S7A running AIX under HACMP would stand up to the Netfinity/Linux assault.

    This demo was done at a Linux show. Linux on RS/6000 is still a work in progress, so it is not surprising that they aren't ready to show that. Doing a demo with AIX at that show would have been a political faux pas.

  115. Could it be off the shelf? by bnf · · Score: 1
    Perhaps Redhat (and Barnes and Noble) has an amazing distribution model, but this chart says they were running Linux 2.2.2. I don't think a 2.2.2 kernel could have been pulled off of the shelf so soon after becoming available.

    bnf

    --

    this space intentionally left blank (oops)

  116. Impressive failover by Gumber · · Score: 1

    The graceful failover is a red herring to me. It was ascribed to IBM's "X architecture" but it sounded more like an application level adaptation.

  117. Yawn by Gumber · · Score: 1

    Bogus:

    Neat but not really. A cluster of PCs connected over fast ethernet is not as flexible as a Cray. On the other hand, a Cray is a waste of money for rendering.

  118. hmmm... I'll hazard a guess on this one... by CodeShark · · Score: 1
    but experts out there in /. land, feel free to tell us if & how I'm wrong.

    There has for some time been a rule of thumb that adding a second (or third, or 17th) processor to a problem doesn't get you double the performance, because there is overhead deciding "which processor is going to do what."

    What this means is that at some point, adding parallel processors to a problem ceases to be a cost-effective answer.
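    One common way to put numbers on this rule of thumb is Amdahl's law. To be clear, that is a swapped-in model rather than what this post states: it assumes a fixed fraction of the job cannot be parallelized at all, instead of modelling scheduling overhead directly. The 95% parallel fraction below is an arbitrary illustrative value, not a measurement of POV-Ray.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Ideal speedup when only `parallel_fraction` of the work can be split up."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# Assumed for illustration: 95% of the job parallelizes perfectly.
for n in (1, 2, 4, 17, 36):
    print(f"{n:2d} processors -> {amdahl_speedup(0.95, n):.2f}x")
```

    Under that assumption the speedup can never exceed 1 / 0.05 = 20x no matter how many processors are added, which is the diminishing return the rule of thumb is describing.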

    See my other note on the main thread (this one got submitted first, so be a little patient!!) as to what I think is of greater long-term significance to the Linux world.

    --
    ...Open Source isn't the only answer -- but it's almost always a better value than the alternatives...
  119. Leading us to the future. by CodeShark · · Score: 1
    (BTW, this post is entirely IMHO: in my humble opinion, for new /.'ees.)

    This demonstration is all about mind share -- something that Microsoft doesn't want Linux to achieve. Let me explain.

    For many years, Cray built absolutely the highest performing mainframe number crunching computer in the world -- and every computer scientist knew it. We used to joke about having our own desktop Crays -- if someone would just lend us (in this case) $5.5 million per workstation. So here's the point that IBM was really trying to get across to the corporate IS people out there -- that Linux is competitive with anything Microsoft can produce. Follow the steps:

    1. Take off-the-shelf hardware (in this case, IBM Netfinitys).
    2. Take a common Linux distribution (RedHat, from the back of a book purchased at Barnes and Noble).
    3. Give a set of assumedly talented engineers an EXTREMELY limited period of time (they bought the book ONE day before Linux World) to:
      • install and configure a set of 17 parallel machines,
      • set up the network,
      • install the software,
      • Tune the installation...
    [Note: in my book, just setting up the machines to run in parallel in the time they did is awesome enough!] But IBM specifically chose this to demonstrate to the world that this relatively inexpensive Linux cluster could match the performance of a Cray.

    Whether or not it could be done with NT-based machines misses the point.

    Although I'm not always a fan of Big Blue, in this case we should all thank them for a great job in once again proving the power of Linux to the rest of the computing world.

    Take that, Microsoft!!

    --
    ...Open Source isn't the only answer -- but it's almost always a better value than the alternatives...
  120. Xenons vs Celerons by BogoNick · · Score: 1

    I have not actually built any Xeon cluster (or Celeron cluster for that matter), but when you run an application like povray, why the heck would you need such a large L2 cache anyway? The bottleneck is still with the FPU and I/O activity such as the network. FPU-wise, a 450MHz Xeon does not outperform an o/c'ed 450MHz Celeron.

  121. hmmm by Soko · · Score: 1

    It's math - you require double the ponies to get the time in half - and it gets worse as computational times approach zero. For an analogy, in 1980 a TF Dragster could do the 1/4 mile in just under 6 seconds, with ~2000HP. Today, it takes 6000 HP to do the 1/4 in 4.5. It would take 9000 to get under 4 they say - and so it goes.

    --
    "Depression is merely anger without enthusiasm." - Anonymous
  122. Interesting.... by HR+Pufnstuf · · Score: 1

    My brilliant coworkers woulda loaded Win98 on it with no hesitation.


  123. The question is... by Grit · · Score: 1

    Last year Jim Gray (now at Microsoft Research, bleh) was out here at Stanford giving a talk on what he thought the future of computing was. He seemed to think that clusters were the way to go--- when I asked him about the latency issue, he seemed to be of the opinion that all the interesting computational tasks of the future _were_ "embarrassingly parallel", and that anything that wasn't was pretty much good already.

    I'm not sure I agree with that assessment... But I'm just a dumb systems researcher. What do I know about applications? :)

  124. Fair where it matters... by GroundBounce · · Score: 1

    Everyone knows that a cluster won't perform well for computations that can't be easily parallelized without massive internodal communication - No one would use a cluster for these types of problems. The point is that it _is_ a fair test for the types of computations that you _would_ use a cluster for. For these types of applications, you're better off spending $150,000 for a PC cluster than millions for a Cray.

  125. Is Avalon faster than Cray? by Kiaser+Zohsay · · Score: 1
    The whole point of Avalon was cost effectiveness. That's why they passed on rack mount cases.

    Just in case anyone was in a coma all last year and doesn't know what Avalon is, here's the link.

    I wonder if the Avalon folks ever tried anything as trivial as Ray Tracing.

    --
    I am not your blowing wind, I am the lightning.
  126. Entry level supercomputing by fletch_f_fletch · · Score: 1

    Ok, someone who actually knows what (s)he's talking about.

    There's nothing special about this news other than the fact that the individual nodes are running linux. Which basically makes this an SP2 minus the superfast network (and the dent in the wallet).

    The measuring stick for all computer hardware issues is application. There is supercomputing (vectorcomputing) (like Cray, traditionally) and there is parallel computing (like any old cluster of workstations). The distinction is the type of operation. In vectorcomputing, each node computes a very small part of the big picture, making communication time a very big (or small in this case) bottleneck. Thus these computers need super fast, specialized networking connections. There are many problems/programs which may be parallelized and yet, still have a significant sequential segment, causing the bulk of the processor cycles to be spent on processing, as opposed to waiting for data communication.

    The problems described by the latter are becoming more and more popular. Vector computing, however, is primarily core scientific applications (physics, math, weather prediction, etc.) which have not seen dramatic computational advances in the last decade.

    An SP2 is sort of in the middle of the spectrum since on top of having high powered nodes, it has a fast network. A COW running Linux catches the bottom end of this spectrum: its nodes are high powered but its network is slow. With ethernet, fast ethernet, or ATM it could never match the performance of Crays or Connection Machines. But then what do you expect for $2000 a node.

    Also

  127. trivial problem by fletch_f_fletch · · Score: 1

    I concur.

    (by the way, I bet I could probably piss further than you)

  128. 36 PII-400 Xeons match 48 Alpha-450's... by Corbett+J.+Klempay · · Score: 1

    Whatever. These are old school Alphas...who would build a new Alpha 450 now?? Besides...not to knock Linux clusters (our ACM chapter just brought one online), but this is kind of a bad comparison...as this kind of stuff doesn't show the HUGE difference in internodal bandwidth between these two systems. If you get something that needs a lot of talking between nodes going, the Cray would pretty much rape the cluster like no tomorrow...the latency on switched fast Ethernet (even Gbit Ethernet) just can't compare to these whack (and horrendously expensive) supercomputer interconnection systems.

    CJK

  129. Where's NUMA by Corbett+J.+Klempay · · Score: 1

    I don't know if that's a fair statement...
    Sure, the architecture of the system may be old school, but that's not why people set up Beowulf clusters...they buy them for the untouchable price/performance for coarsely-grained problems. End of story.
    Don't worry, though...Linux development won't stand still...we'll see changes in the future to allow for more flexible architectures.

    CJK