Slashdot Mirror


Cray CTO: Linux clusters don't play in HPC

jagger writes "Linux clustering was touted as the next big thing by many vendors last week at ClusterWorld Conference & Expo 2004. But supercomputer vendor Cray Inc. scoffed at the notion of putting Linux clusters in the high-performance computing (HPC) category. "Despite assertions made by Linux vendors, a Linux cluster is not a high performance computer," said Dr. Paul Terry, CTO of Cray Canada."

7 of 435 comments

  1. Are too by Anonymous Coward · · Score: 5, Interesting
    "Most cluster [experts] know now that users are fortunate to get more than 8% of the peak performance in sustained performance."
    Tell that to PIXAR. I don't believe it either.

    I guess that the simple problem is just that the algorithm applied is usually not suitable for massively parallel computing.

    1. Re:Are too by bugnuts · · Score: 3, Interesting

      All tests for the top 500 supercomputers are done solving a problem using Linpack, not some trivially parallel code such as raytracing 100,000 frames of a movie.

      Message passing is the biggest issue with such solvers, and in a way, Cray was absolutely right about Linux, although misleading. There are some tests going on now with a modified Linux kernel for doing true HPC, and it's been done in the past (I know, I've used it). Things like disk swapping pretty much immediately disqualify you for high performance computing. It has its place, of course; trivially parallelizable codes (Pixar) are one example.

      Myrinet was out before Gbit ethernet was really available, and also has some nifty routing capabilities. And since the bottleneck for HPC is usually message passing, high performance computing will better realize its theoretical performance as the communication speed catches up to the processor speed.
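To make the message-passing point concrete, here is a toy model (an editorial sketch, not anything from the comment or the article) of how much of one solver step goes to computing rather than waiting on messages. The function name, parameters, and all numbers are assumptions chosen for illustration; they are not measurements of Myrinet, Ethernet, or any real machine.

```python
def compute_fraction(compute_s, n_messages, latency_s, msg_bytes, bytes_per_s):
    """Toy model: share of one solver step spent computing, not communicating."""
    # Each message costs a fixed latency plus its transfer time.
    comm_s = n_messages * (latency_s + msg_bytes / bytes_per_s)
    return compute_s / (compute_s + comm_s)


# Hypothetical workload: 1 ms of arithmetic and 20 messages of 8 kB per step.
step = dict(compute_s=1e-3, n_messages=20, msg_bytes=8_000)

# Hypothetical interconnects: the second has 10x lower latency and 10x more bandwidth.
slow = compute_fraction(latency_s=100e-6, bytes_per_s=100e6, **step)
fast = compute_fraction(latency_s=10e-6, bytes_per_s=1e9, **step)

print(f"slow interconnect: {slow:.0%} computing")
print(f"fast interconnect: {fast:.0%} computing")
```

In this model the processors are identical in both cases; only the interconnect changes, which is the sense in which sustained performance approaches peak as communication catches up with the CPU.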

      But, to Cray's discredit, making a blanket statement that Linux can't do HPC is like saying Macintoshes can't do HPC.

  2. Efficiency and cost argument by yppiz · · Score: 3, Interesting

    The Cray CTO makes the point that Linux clusters get, at best, just under 10% peak as sustained performance and uses this as a justification that Linux clusters are not HPCs. This is a reasonable criticism. Let's take the percentage he cites as real for a moment. Now what is the cost difference between a Linux cluster and a Cray (not some future offering, but today) and how much more of a Linux cluster could you afford? Would that offset the quoted inefficiency? Would the flexibility of being able to use commodity components further offset any advantage Cray might have? What about 24hr or same-day parts replacement without a hyper-expensive service contract? At the end of the day, I suspect the Linux cluster wins out even given the sub-10% efficiency figure Cray cites. --Pat / zippy@cs.brandeis.edu
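The arithmetic behind this argument can be sketched as sustained performance per dollar. This is an editorial illustration only: the 8% efficiency is the figure quoted earlier in the thread, while the peak ratings, the big-iron efficiency, and both prices are placeholder assumptions, not real quotes for a Cray or for any cluster.

```python
def sustained_per_dollar(peak_gflops, efficiency, cost_dollars):
    """Sustained GFLOPS bought per dollar: peak rating scaled by achieved efficiency."""
    return peak_gflops * efficiency / cost_dollars


def cluster_wins(cluster, big_iron):
    """True if the cluster delivers more sustained performance per dollar."""
    return sustained_per_dollar(**cluster) > sustained_per_dollar(**big_iron)


# Placeholder numbers (assumptions): same peak rating, very different efficiency and price.
cluster = dict(peak_gflops=1000, efficiency=0.08, cost_dollars=1_000_000)
big_iron = dict(peak_gflops=1000, efficiency=0.50, cost_dollars=10_000_000)

print("cluster wins on sustained GFLOPS per dollar:", cluster_wins(cluster, big_iron))
```

With these made-up inputs the cluster comes out ahead, but the result flips as soon as the price gap shrinks or the efficiency gap widens, so the conclusion depends entirely on the real numbers for a given workload.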

  3. Problem by rawgod0122 · · Score: 5, Interesting

    It all depends on the problem you are trying to solve. I have been doing some work of late that would not complete in my lifetime on the 108-node cluster that we have. But when programmed for and run on two Cray X1s it should complete inside of a week.

    Granted, there are many codes (and more every day) that will run on clusters, but big iron will never die.

  4. Well... he is sort of correct... by nacks1 · · Score: 5, Interesting

    I happen to work in a facility that has had both large supercomputers (Cray T3E, J90, SGI) and Linux and *nix-based clusters (Beowulf/Linux, Compaq/Tru64). The Cray CTO is correct that you can't just call every Linux cluster out there HPC. Just about anyone with networking and Linux knowledge can build a Linux cluster.

    What really makes a difference between an HPC cluster and your normal everyday cluster is the hardware interconnects used. There is a comment in the article that refers to not using I/O for memory and message passing. I am not quite sure what he means by that, but I am guessing that he is saying that the network is not used for shared memory/message passing (MPI/OpenMP/SHMEM).

    If a cluster can limit the impact of latency between nodes, either through smarter software or faster interconnects, then I can't see any reason not to consider a Linux cluster as HPC.

    Clusters without smarter software tend to be really difficult coding platforms. Some developments with things like globally shared memory might make the difference, but there will still be the problem of latency between nodes.

  5. Re:Help me here... by krlynch · · Score: 5, Interesting

    So depending on the task at hand, the cluster might perform very well, or perhaps a little less well.

    Surely what you meant to say is that, depending on the task at hand, a cluster might perform very well, or perhaps perform atrociously. :-)

    Clusters tend to work well when the various nodes don't need to communicate very often but you need lots of cycles for the subtasks, while dedicated supercomputers tend to perform very well in tasks requiring vast amounts of internode communications bandwidth along with large numbers of cycles. If you need vast bandwidth and relatively low numbers of cycles, your pricepoint is likely a mainframe. And if you don't need either, you get a cheap desktop machine.
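The rule of thumb in the paragraph above can be written as a small lookup. This is an editorial sketch; the function name and the reduction of each requirement to a yes/no flag are simplifying assumptions, and real sizing decisions are not this binary.

```python
def pick_machine(heavy_communication: bool, many_cycles: bool) -> str:
    """Map the two needs named in the comment onto the machine class it suggests."""
    if heavy_communication and many_cycles:
        return "dedicated supercomputer"
    if many_cycles:
        return "cluster"
    if heavy_communication:
        return "mainframe"
    return "cheap desktop"


# Walk through all four combinations of the two requirements.
for comm in (False, True):
    for cycles in (False, True):
        print(f"heavy communication={comm}, many cycles={cycles}: {pick_machine(comm, cycles)}")
```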

    Certain problems parallelize well on a cluster ... others don't. Some don't parallelize at all, and a cluster won't do you a darn bit of good. The different machines are designed for different uses ... and one should be careful not to push a "one size fits all" solution. The Cray guy clearly got it wrong on that point, and likely knows it, but he was marketing, not teaching a course in choosing hardware for the task at hand.

  6. FUD and Thunder-Mongering by pragma_x · · Score: 3, Interesting

    "Despite assertions made by Linux vendors, a Linux cluster is not a high performance computer," said Dr. Paul Terry, CTO of Cray Canada. "At best, clusters are a loose collection of unmanaged, individual, microprocessor-based computers."

    Although this statement reeks of FUD, he's right about one thing: a cluster is not an HPC... that's why it's called a cluster. But to say that a cluster is 'unmanaged' is one hell of a stretch IMO. All in all, he's just arguing semantics: nothing to see here, put down your flamethrowers, move along folks.

    Since this is slashdot, I'll add that the rest of the article is full of choice quotes all of which point squarely at basic FUD + marketing spin for their new cluster-cost-like product.

    It seems to me that Cray is just plain bitter that Linux (through all the cluster solution providers) has managed to steal Cray's thunder at a mere fraction of the cost. Cray's probably even more bitter that folks are willing to sacrifice performance (at least from Cray's perspective) just to save a buck.

    Okay, this is Cray we're talking about here: people are saving millions of bucks all over the place by using clusters instead of big expensive machines.

    And guess who wants 'their' slice of the pie back.