Slashdot Mirror


Cray CTO: Linux clusters don't play in HPC

jagger writes "Linux clustering was touted as the next big thing by many vendors last week at ClusterWorld Conference & Expo 2004. But supercomputer vendor Cray Inc. scoffed at the notion of putting Linux clusters in the high-performance computing (HPC) category. "Despite assertions made by Linux vendors, a Linux cluster is not a high performance computer," said Dr. Paul Terry, CTO of Cray Canada."

40 of 435 comments

  1. Marketing by Allen+Zadr · · Score: 5, Insightful

    Paul Terry makes some good points in his statements, including the one partially quoted in the post: "Despite assertions made by Linux vendors, a Linux cluster is not a high performance computer," said Dr. Paul Terry, CTO of Cray Canada. "At best, clusters are a loose collection of unmanaged, individual, microprocessor-based computers."

    Remember to take this with a grain of salt. The inflammatory nature of the comment is nothing more than a marketing ploy to increase visibility of, and sell, the new Cray XD1.

    --
    Kinetic stupidity has a new brand leader: Allen Zadr.
    1. Re:Marketing by Total_Wimp · · Score: 5, Funny

      "At best, clusters are a loose collection of unmanaged, individual, microprocessor-based computers."

      I'm sure Paul Terry is nothing more than a loose collection of unmanaged, individual human cells too. But I'm sure, with hard work and love, he can become a _real_ boy! Let's all have a hug.

    2. Re:Marketing by Hoser+McMoose · · Score: 4, Insightful

      The Top500 list uses Linpack exclusively for its test. Linpack can be split to run on clusters VERY easily; it could even fall under the category of "embarrassingly parallel" problems. These sorts of tasks do exist in reality, but they definitely aren't the only kinds of problems you'll encounter.

      If you need to access remote memory in a super cluster, such as the ones mentioned above, you take a BIG hit in terms of performance. Think about running from swap space vs. running an application out of memory and you'll be on the right track. In these sorts of situations a system like that Cray down in slot 19 could easily beat out nearly anything above it on that list (almost all of which are superclusters except for Earth Simulator at #1).

      As others have mentioned, the guy was clearly talking from a marketing standpoint rather than a "choose the best solution for the job" standpoint; however, what he said isn't entirely without value. There are a lot of tasks out there where that Big Mac supercluster that people keep touting would suck-ass. Even with their high-bandwidth, low-latency InfiniBand interconnect you're still looking at a good 3 orders of magnitude lower performance for remote memory vs. local memory.
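
      To put rough numbers on the remote-vs-local point, here is a toy model (a sketch, not a measurement). The 100 ns local latency is an assumed placeholder; the 1000x remote penalty is just the "3 orders of magnitude" figure from above.

```python
# Toy model of average memory access cost when some accesses go to a remote node.
# All numbers are illustrative assumptions, not measurements of any real machine.

LOCAL_NS = 100.0        # assumed local memory latency, in nanoseconds
REMOTE_FACTOR = 1000.0  # "3 orders of magnitude" slower, per the comment above


def average_access_ns(remote_fraction: float) -> float:
    """Average latency when `remote_fraction` of accesses hit remote memory."""
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    remote_ns = LOCAL_NS * REMOTE_FACTOR
    return (1.0 - remote_fraction) * LOCAL_NS + remote_fraction * remote_ns


def slowdown(remote_fraction: float) -> float:
    """How many times slower than an all-local workload."""
    return average_access_ns(remote_fraction) / LOCAL_NS


if __name__ == "__main__":
    for frac in (0.0, 0.001, 0.01, 0.1):
        print(f"{frac:>6.1%} remote -> {slowdown(frac):7.2f}x slower")
```

      Under these assumptions, even 1% remote accesses makes the average access about 11x slower, which is why the access pattern matters more than the Linpack score.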

    3. Re:Marketing by Shinobi · · Score: 4, Informative

      Actually, that crossbar memory bus is just the local bus for each cabinet, and they do have low-latency interconnects that allow globally shared memory and single system imaging. Otherwise they wouldn't be working on a 1024 CPU installation. A clue for you: The technology used in the Origin machines was originally developed by Cray, and it runs 1024 CPU installations as global shared memory and single system image.

      As for research, it's more a case of researchers doing the old "Damn, I'll have to make do with this". And Origin and Altix systems are still selling well in the research market.

      And don't forget, Cray is backed by US government departments such as the NSA. The X1 received a lot of such support, which Cray even admits themselves: http://www.cray.com/products/systems/x1/

  2. Seymour Cray by JargonScott · · Score: 5, Funny

    A quote I've seen before:

    "If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?"

    Maybe he meant penguins?

    --
    Nuke Gay Whales for Jesus.
    1. Re:Seymour Cray by theatre_freak · · Score: 4, Funny

      Would that be a kilochicken?

    2. Re:Seymour Cray by wowbagger · · Score: 4, Insightful
      The analogy USED to be valid, however the times have changed as microprocessors are now much more powerful.

      The analogy now would be more like:

      Which would you rather use to plow a field - one big tractor or 1024 little tractors?


    3. Re:Seymour Cray by lukewarmfusion · · Score: 4, Funny

      Chickens, for the same reasons that you would use 1024 Linux boxen instead of his Cray.

      And when you're done plowing, you can fry 'em up all tasty.

    4. Re:Seymour Cray by turgid · · Score: 4, Funny
      "If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?"

      Personally, I'd prefer a John Deere 6003 Series.

    5. Re:Seymour Cray by kitzilla · · Score: 4, Funny

      "If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?" How big are the chickens?

      --
      This is my post. There are many others like it. If you don't like what you read here, go try one of the others.
    6. Re:Seymour Cray by Skjellifetti · · Score: 4, Funny

      Depends. Does HPC stand for High Performance Cow or High Performance Chicken?

    7. Re:Seymour Cray by Waffle+Iron · · Score: 5, Funny
      "If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?"

      When Seymour Cray made that statement, he was probably pointing out the difference between his he-man vector processors vs. clusters of the wimpy microprocessors of old.

      After reading the article, it seems that this new Cray is powered by a bunch of the exact same AMD microprocessors that a cluster of Linux boxes would use. So what they have now is more like an ox-shaped sack stuffed with chickens.

    8. Re:Seymour Cray by epiphani · · Score: 5, Funny

      wow. I've never seen someone fail so miserably when trying to start a flamewar over what kind of tractor is better. Man, I thought they woulda been all over that here on slashdot.

      --
      .
  3. Today's headlines... by stevens · · Score: 4, Funny

    Company officer claims competitor isn't as good as his product. Film at 11.

  4. And in other news... by heironymouscoward · · Score: 5, Insightful

    Oracle dismisses MySQL and PostgreSQL as "toy databases", Microsoft claims that "Apache cannot be used for real web serving", and Sun announces that "Intel and Linux simply cannot be used for enterprise computing".

    So all those supercomputing labs that use Linux clustering (that invented Linux clustering, even) have been wasting their time?

    --
    Ceci n'est pas une signature
    1. Re:And in other news... by dasmegabyte · · Score: 4, Insightful

      All of those statements are true. And a cluster is not a mainframe, and the products sold by Oracle, Microsoft and Sun *DO* go far beyond their Open Source competitors in terms of functionality.

      The problem for these guys is that, in terms of real world enterprise usage, not everybody needs the features they offer. My business doesn't need the easy management and clustering features in IIS; heck, the website hasn't been updated in months and this time last year nobody even knew which machine it ran on. We don't need the task scheduling, file striping, data transformation, replication or XML features of Oracle. In fact, we only need a tiny sliver of the possible functionality of these great products...but we're unable to pay a sliver of the price. With OSS ramping up its feature set daily, for a lot of companies with our needs it makes more sense to train a guy on Linux than to drop five digits on Windows Server 2003 and SQL Server.

      As for supercomputing...well, a cluster is NOT a mainframe. They're two similar, but different things, with the main difference being the databus. If your task is to perform a lot of calculations on a trivial dataset, clustering is the way to go. If your task is to perform a few calculations on a massive dataset, you want a mainframe. The mainframe is simply more efficient at processing massive inputs and providing massive outputs because it was designed to efficiently pass data between processors -- give the same dataset to a cluster and most of your time is wasted negotiating the network.

      Of course, these days networking is so fast that a cluster will probably do for most of the things people used to do on mainframes...but a cluster is still best for tasks which are easy to split apart and process in pieces.

      --
      Hey freaks: now you're ju
  5. Are too by Anonymous Coward · · Score: 5, Interesting
    "Most cluster [experts] know now that users are fortunate to get more than 8% of the peak performance in sustained performance."
    Tell that to PIXAR. I don't believe it either.

    I guess that the simple problem is just that the algorithm applied is usually not suitable for massively parallel computing.

    1. Re:Are too by dead+sun · · Score: 5, Insightful
      Pixar doesn't need telling, their problem breaks up so miraculously well that they'll see the best performance you could possibly expect from a cluster. The big problem, rendering a movie, decomposes into thousands of small problems, rendering a frame. Each machine in their cluster can handle a group of frames at a time with zero need to communicate or worse, share computation, with other machines in the cluster. It's the best case scenario.

      Many other computing problems don't decompose nearly so nicely. So there are certainly problems that probably won't see more than 8% of peak performance. If you were particularly inclined you could probably invent a problem that had to be done serially, leaving percent of peak performance equal to what percent of your cluster one box was. Cray is right to that extent, and if you're solving a problem that falls into the category of not easily parallelized then perhaps one of their machines is the better tool for the job. But, like you mention, there are instances where the cluster is a great tool and cost effective to boot.
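
      One common way to reason about this is Amdahl's law. That model is brought in here for illustration, it is not something the article uses. The sketch below assumes a job splits cleanly into a serial part and a perfectly parallel part and ignores communication cost entirely, so real clusters will do worse.

```python
def amdahl_speedup(parallel_fraction: float, nodes: int) -> float:
    """Speedup over one node under Amdahl's law (communication cost ignored)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


def fraction_of_peak(parallel_fraction: float, nodes: int) -> float:
    """Share of the cluster's aggregate peak that the job actually uses."""
    return amdahl_speedup(parallel_fraction, nodes) / nodes


if __name__ == "__main__":
    for p in (1.0, 0.99, 0.0):
        print(f"parallel={p:.2f}: {fraction_of_peak(p, 1000):.1%} of peak on 1000 nodes")
```

      In this model a fully serial job on N boxes gets exactly 1/N of peak, the frame-per-box rendering case gets all of it, and a job that is 99% parallel reaches only about 9% of peak on 1000 nodes.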

      Heck, ever check out some of the faster interconnects like Myrinet? They're insane and exist because fast ethernet just doesn't cut it in some places. Just using a slow interconnect is enough to bring real performance down below theoretical peak. Luckily for Pixar off the shelf fast or gigabit ethernet is likely enough.

      Anyway, use the best tool available. If your problem falls into the category of trivially parallelizable like rendering a movie is then don't bother wasting your money on a Cray. If your problem isn't suited to a cluster, however, then maybe a cluster isn't the right answer. If you have a big problem that needs serious computation take the time to figure out what you need before taking a marketing drone's spiel for gospel in your situation.

      --
      If not now, when?
  6. VA Cluster yet to be used by Anonymous Coward · · Score: 4, Insightful

    Regardless of whether I agree with the article or not I feel compelled to point out that:

    The 1100 node Apple G5 cluster in Virginia has yet to run any real scientific code. So far it has only run benchmarks.

    1. Re:VA Cluster yet to be used by Slowtreme · · Score: 4, Informative

      It was announced that VA Tech actually purchased the G5 X-serves before production was in place, but was instead delivered the G5 towers as loaners to have the cluster built in time for ranking.

      The cluster remains; they have not shut it down and were swapping out individual racks for the upgrade (something like one rack of X-serves is three racks of towers).

      I don't think it's been published that they have or haven't run any data besides benchmarks.

      --
      Post: Sigged, for your pleasure.
  7. Sure... by avalys · · Score: 4, Insightful

    In other news...

    "Despite assertions made by Toyota salesmen, a Lexus sedan is not a luxury car," said Bill Taylor, CEO of Mercedes-Benz.

    --
    This space intentionally left blank.
  8. He's got a point by PissingInTheWind · · Score: 4, Insightful

    Clusters can get high performance on some types of tasks. But sometimes, you need fine-grained parallelism that just isn't available on a cluster.

    On the other hand, high performance usually comes through special hardware. And on that hardware, I think Linux could be the right thing (modulo some patches).

    --

    A message from the system administrator: 'I've upped my priority. Now up yours.'
  9. Problem by rawgod0122 · · Score: 5, Interesting

    It all depends on the problem you are trying to solve. I have been doing some work of late that would not complete in my lifetime on the 108 node cluster that we have. But when programmed for and run on two Cray X1s it should complete inside of a week.

    Granted there are many codes (and more every day) that will run on clusters, the big iron will never die.

  10. Just because we love Linux.... by foooo · · Score: 5, Insightful

    Just because we love Linux doesn't mean that clusters are HPCs.

    There are real issues that differentiate mainframe/supercomputers from large, powerful, clusters.

    Of course this all depends on your definition of an HPC. But I believe that it's reasonable to say that if parts of your computer are connected with low bandwidth connections (10/100, gigabit), it just can't handle the same kinds of transactions as a computer whose parts are connected by 10 gigabit or 1000 gigabit connections or whatever it is nowadays.

    As far as I know if you're deploying a large database it's still advisable to have a big huge IBM mainframe or a Unisys box or a Sun 10k instead of 4, 8 or 16 clustered 8 proc machines.

    My point is there are valid arguments for not including clusters of commodity hardware in the HPC category.

    In my mind they aren't High Performance Computers... they are High Performance Clusters of Commodity Computers.

    ~foooo

  11. Partly right, partly wrong.... by ERJ · · Score: 4, Insightful

    How well a cluster will do depends on the application that it is performing. Some problems can be divided into several small problems with little reliance on other parts of the problem (SETI / Encryption breaking). These things can be easily distributed to hundreds or thousands of "small" boxes for processing and are what a beowulf cluster would be good at.

    Other applications require the breakneck interconnect speeds that large Cray / Sun / etc.. build on. When the data being calculated on one CPU requires data from CPU2 to continue its calculations you don't want to have it wait for 100mbit or even 1gbit ethernet speeds. Even quicker interconnects such as SCALI are going to be slowed by PC bus speeds.

    Cray fills an important niche for those who can afford it.

  12. Different tools by BoneFlower · · Score: 4, Insightful

    The comment was stupid, yes, but not all jobs that you'd use supercomputers for can be broken down into many threads as others can. A linux cluster will do well for some jobs, a cray box will do well for others. There *will* be times when a Cray system is so far superior to anything you could do with Linux that it becomes the only real option.

    However, dismissing linux cluster technology automatically is dumb. In many cases, it provides more than enough cpu power and I/O bandwidth to support your reason for getting a supercomputer, and probably at less cost than the other options.

    It's all a matter of determining what you need the computer to do, determining your budget, and getting the best system in your budget for the uses you have for it. Sometimes that will be a Cray, sometimes a Linux cluster.

  13. Says who? by dagnabit · · Score: 4, Funny

    Who is this guy and what does a company like Cray know about... oh... never mind.

  14. Can you multithread your application? by LostCluster · · Score: 5, Insightful

    Clusters can rival a supercomputer when they are assigned a task that's suitable for distributed computing. That is, work units can be divided up and worked on in any sequence... the result of segment 45 doesn't depend on knowing the result of 44 and such. Effectively, you can have the sum of all of the processors minus just a little overhead for the clustering.

    What Cray's rightfully pointing out is that for most business applications, however, distributed computing is not a viable option. When processing on a transaction basis, the transactions often need to be posted in the exact order they were received, which means they must be taken serially. In those situations, the programs can't multithread work out to the other processors so well, and the cluster will end up running at roughly the speed of just one processor while the others waste clock cycles waiting for something to do.
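
    A minimal sketch of the two cases. Everything here is hypothetical: `process_unit` stands in for any independent work unit, a thread pool stands in for cluster nodes, and the overdraft rule is just one made-up example of why posting order can matter.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def process_unit(n: int) -> int:
    """Hypothetical independent work unit: needs nothing from other units."""
    return n * n


def run_independent(units, workers: int = 4):
    """Hand units to workers in a scrambled order; the answer is unchanged."""
    shuffled = list(units)
    random.shuffle(shuffled)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = dict(zip(shuffled, pool.map(process_unit, shuffled)))
    # Reassemble in the caller's order (assumes the units are unique).
    return [results[u] for u in units]


def post_transactions(amounts, opening_balance: int = 0) -> int:
    """Order-dependent case: each step needs the balance left by the previous one."""
    balance = opening_balance
    for amount in amounts:
        if balance + amount >= 0:  # a withdrawal that would overdraw is rejected
            balance += amount
    return balance


if __name__ == "__main__":
    print(run_independent(range(10)))
    print(post_transactions([50, -30]), post_transactions([-30, 50]))
```

    The first function gives the same result however the units are scheduled. The second gives 20 or 50 for the same two transactions depending on order, so they cannot simply be farmed out and merged afterwards.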

    The cluster isn't the solution to everything. Nor is the supercomputer. You've gotta think about the job, then figure out which tool is right for the task.

  15. Cray has some points. by Saeed+al-Sahaf · · Score: 4, Insightful
    Dr. Paul Terry's comments are obviously self-serving, especially since the Cray XD1 is itself based on multiple AMD processors rather than proprietary Cray processors. Still, he does have a point about the overhead of running the OS on each machine in a cluster, and with the statement "The Cray XD1 is not a traditional cluster; it does not use I/O interfaces for memory and message passing semantics."

    In truth, such a machine will always have a certain performance advantage over traditional clusters. The question is, will the price point be low enough to invalidate the idea of just adding more boxes to the traditional cluster?

    --
    "Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
  16. He's wrong, but he's also right. by Richard+Mills · · Score: 4, Informative

    While I certainly disagree that you can't build a very high performance computer out of a cluster of computers (Linux or otherwise), there is a lot of merit to the fact that clusters just don't scale well for certain classes of applications. Hence the renaissance of the vector supercomputer (a la the Earth Simulator).

    Obviously, this guy is plugging the new Cray X1 architecture, which really is quite promising. For instance, check out this paper by some folks at Oak Ridge National Lab that appeared in Supercomputing 2003.

    Of course, since this is Slashdot, I expect that there will be a deluge of posts decrying everything about the new Cray machine because it commits the cardinal sin of NOT USING LINUX. Oh, the horror!

  17. Re:Help me here... by maan · · Score: 4, Insightful

    You're right in saying that the Virginia Tech cluster is the 3rd fastest supercomputer (LINPACK tests). I think that for some other tasks however, it would be slower. Sure, they use infiniband as an interconnect (very fast & low latency), but that doesn't change the fact that it's many separate nodes, each with its own memory. So if one processor were to access some memory on a different node, it would slow down things a little.

    So depending on the task at hand, the cluster might perform very well, or perhaps a little less well. Cray supercomputers are a big number of processors all in the same machine, and more importantly all sharing the same memory. Each processor has the same delay to access any memory content.

    The argument in favor of clusters, however, is that it's still cheaper to throw more computers in than to buy a Cray that would perform the same task in less time.

    In the end, there's a lot of marketing involved in all of this...

    Hope this helps (and that I'm not completely wrong!),

    Maan

  18. Well... he is sort of correct... by nacks1 · · Score: 5, Interesting

    I happen to work in a facility that has had both large supercomputers (cray t3e, j90, sgi) and linux and *nix based clusters (beowulf/linux, compaq/Tru64). The Cray CTO is correct that you can't just call every linux cluster out there HPC. Just about anyone with networking and linux knowledge can build a linux cluster.

    What really makes a difference between an HPC cluster and your normal every day cluster is the hardware interconnects used. There is a comment in the article that refers to not using I/O for memory and message passing. I am not quite sure what he means by that, but I am guessing that he is saying that the network is not used for shared memory/message passing (MPI/OpenMP/SHMEM).

    If a cluster can limit the impact of latency between nodes either through smarter software or faster interconnects then I can't see any reason not to consider a linux cluster as HPC.

    Clusters without smarter software tend to be really difficult coding platforms. Some developments with things like globally shared memory might make the difference, but there will still be the problem of latency between nodes.

  19. Re:Well.. by s00p41337h4x0r · · Score: 5, Insightful
    How could Cray be wrong. I mean just because Linux is running some of the top 500 computers there is no reason to consider it HPC, right? What a self serving statement Cray makes....they still don't get it .... their way is a dead-end...
    That's right. Dataflow vector processing has been shown to be a dead end. The fact that the fastest computer in the world is a dataflow machine is a statistical anomaly, right?

    Oh, here's the TOP500 list, btw.

  20. Re:Help me here... by krlynch · · Score: 5, Interesting

    So depending on the task at hand, the cluster might perform very well, or perhaps a little less well.

    Surely what you meant to say is that, depending on the task at hand, a cluster might perform very well, or perhaps perform atrociously. :-)

    Clusters tend to work well when the various nodes don't need to communicate very often but you need lots of cycles for the subtasks, while dedicated supercomputers tend to perform very well in tasks requiring vast amounts of internode communications bandwidth along with large numbers of cycles. If you need vast bandwidth and relatively low numbers of cycles, your pricepoint is likely a mainframe. And if you don't need either, you get a cheap desktop machine.
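
    The rule of thumb in that paragraph boils down to a two-question lookup. A sketch, with a hypothetical function name and labels; real purchasing decisions obviously involve more than two booleans.

```python
def suggest_machine(needs_bandwidth: bool, needs_cycles: bool) -> str:
    """Map the two questions from the paragraph above to a class of machine."""
    if needs_bandwidth and needs_cycles:
        return "dedicated supercomputer"
    if needs_cycles:
        return "cluster"
    if needs_bandwidth:
        return "mainframe"
    return "cheap desktop"


if __name__ == "__main__":
    for bandwidth in (False, True):
        for cycles in (False, True):
            print(f"bandwidth={bandwidth!s:5} cycles={cycles!s:5} -> "
                  f"{suggest_machine(bandwidth, cycles)}")
```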

    Certain problems parallelize well on a cluster ... others don't. Some don't parallelize at all, and a cluster won't do you a darn bit of good. The different machines are designed for different uses ... and one should be careful not to push a "one size fits all" solution. The Cray guy clearly got it wrong on that point, and likely knows it, but he was marketing, not teaching a course in choosing hardware for the task at hand.

  21. The Cray will scale up by Richard+Mills · · Score: 4, Informative

    The reason that Cray only holds 19th right now is because they have only deployed X1 systems using up to 256 nodes. When the number of nodes is increased, you will certainly see the Cray moving up the top 500 list -- the architecture is VERY scalable.

  22. Re:Help me here... by SquadBoy · · Score: 4, Informative

    The how to from way back in the day.

    http://www.ibiblio.org/pub/Linux/docs/HOWTO/othe r- formats/html_single/Beowulf-HOWTO.html

    has a great explanation using a grocery store analogy that makes it really easy to understand what kind of tasks will work well and what kind will suck. And unlike the cheerleaders that have been showing up since clusters became a big business, it is very balanced about it.

    Still worth reading.

    --

    Cypherpunks: Civil Liberty Through Complex Mathematics. Those who live by the sword die by the arrow.
  23. Re:If it walks like a duck, and talks like a duck. by flaming-opus · · Score: 5, Informative

    Cray could easily be at or close to the top of the top500 list; their X1 architecture will extend that far. However, for a lot of really important supercomputing codes, it's no contest: The cray will trounce the clusters (linux or otherwise). Those #19 crays are only 256 processors. To get similar performance a stack of xeons requires thousands of processors. Some tasks just can't be split apart that easily.

    A cray processor has eight floating-point units running at 800MHz. The big Mac cluster (for example) uses G5 processors which have 2 FPUs at 2000MHz. Thus the cray has a modest edge in raw FPU throughput (8 x 800 vs. 2 x 2000; the G5 trails by ~40%). However, the G5 processor has ~4GB/s memory bandwidth. The Cray has ~50GB/s memory bandwidth. If you have a problem that needs to do a HUGE amount of math on a tiny amount of data, the G5 will rock. If you have a problem that needs to do a HUGE amount of math on a GINORMOUS amount of data, buy the cray. (for a GINORMOUS amount of money too)

    Similarly, infiniband (a la the big mac) is really hot in the cluster interconnect space because it gives 2.5GB/s per node. The Cray gives you 51GB/s.
    You need to move a little data, buy a cluster. You need to move a lot of data, buy the Cray.
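
    The compute-vs-bandwidth tradeoff can be sketched with a simple roofline-style bound, using only the figures quoted above. Assumptions: each FPU retires one floating-point operation per cycle, and memory bandwidth is the only other limit. The roofline framing is borrowed for illustration and is not from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    name: str
    fpus: int
    clock_ghz: float
    mem_bandwidth_gbs: float  # GB/s, as quoted in the comment above

    @property
    def peak_gflops(self) -> float:
        # Assumption: one floating-point operation per FPU per cycle.
        return self.fpus * self.clock_ghz

    def attainable_gflops(self, flops_per_byte: float) -> float:
        """Upper bound: limited by either the FPUs or the memory system."""
        return min(self.peak_gflops, flops_per_byte * self.mem_bandwidth_gbs)


CRAY = Processor("Cray", fpus=8, clock_ghz=0.8, mem_bandwidth_gbs=50.0)
G5 = Processor("G5", fpus=2, clock_ghz=2.0, mem_bandwidth_gbs=4.0)

if __name__ == "__main__":
    for intensity in (0.1, 1.0, 10.0):  # floating-point ops per byte of data moved
        cray = CRAY.attainable_gflops(intensity)
        g5 = G5.attainable_gflops(intensity)
        print(f"{intensity:5.1f} flop/byte: Cray {cray:.1f} vs G5 {g5:.1f} Gflop/s")
```

    With lots of math per byte both chips run at their FPU limit and the gap is small. At 0.1 operations per byte the bound is 5.0 vs 0.4 Gflop/s, which is the "GINORMOUS amount of data" case.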

    There's no one solution for all problems.

  24. Obligatory by Rhesus+Piece · · Score: 4, Funny

    "No, a Kilochicken is a 1000 chickens. You're thinking of a kibichicken. Check it out at http://www.nist.gov" Somebody had to, right? Right?

  25. doesn't this CTO of cray remind u of someone? by MoFoQ · · Score: 4, Funny

    doesn't this CTO of cray remind you of someone?
    "There IS no Linux in high-performance clusters."

    "There IS no Americans in Iraq."

    OMG! It's the former Iraqi mis-Informed-ation minister!

    Especially when 2004 has been dubbed the year of the penguin, it's reckless to claim that Linux can't be used in HPC.
    Hell, just look at the current top500 list. There's no Cray in the top 10 but there are two Linux based clusters there (and one based on OSX [FreeBSD based]).

    Here's a few:
    NCSA's IA32 Linux cluster
    NCSA's IA32 Linux cluster
    Space Simulator Cluster at Los Alamos (SS51G based; makes me proud as I have a SS51G too)
    Beowulf - used in many Linux clustering projects
    Linux clusters at Los Alamos (they seem to have more than one)
    Virginia Tech's Supercomputer X

  26. Re:Help me here... by eric2hill · · Score: 4, Informative

    For the love of Christ people, it's a simple thing.

    Format links like this: <a href="http://somelink">link text</a>

    It takes virtually no extra time and we don't have to trim the fucking slashcode spaces.

    Oh, and here's the link.

    --
    LOAD "SIG",8,1
    LOADING...
    READY.
    RUN