Cray CTO Says Cray Computers Are Great
Jan Stafford writes "Linux clusters can not offer the same price-performance as supercomputers, according to Paul Terry, chief technology officer of Burnaby, British Columbia-based Cray Canada. In this interview, Terry explains that assertion and describes Cray's new Linux-based XD1 system, which will be priced competitively with other types of high-end Linux clusters."
Given the difference in rate-of-evolution in the two camps, it can't be long before PC clusters, probably running Linux / with PVM or BSP (that's bulk-synchronous parallel rather than 3D graphics
It's all very well to mock the I/O of PCI, but that's why we're all imminently moving to PCI Express, at a rather more respectable (current) maximum of 8+ GB/s rather than 133 MB/s... Run a few gigabit ethernets in a hypercube formation and you have some rapid data transfer...
I notice he hasn't quoted the data-transfer rate on these new super-duper chips. The whole article does rather look like a piece of advertising on the cheap, speaking of which, the cluster solution is (relatively) CHEAP. Did I mention that IT'S CHEAP...
Simon.
Physicists get Hadrons!
No, no, you misunderstand.
He's saying that linux-based *supercomputers* are faster than linux-based *clusters*.
(although, you can probably cluster those supercomputers...)
No, he's saying you should buy their Linux-based supercomputer instead of a Linux cluster. If you don't RTFA, at least skim the summary.
<jedi> There is something funny here. You laugh. </jedi>
Uhh, no, he's not dissing Linux at all. He's saying that one big supercomputer (running Linux, perhaps) will get you more price-performance (bang per buck, I guess) than a Linux cluster.
If it weren't for fog, the world would run at a really crappy framerate.
Yeah, no wonder this post looked familiar. Yup, it's a dupe, folks.
"Backups are for wimps. Real men upload their data to an FTP site and have everyone else mirror it." -- Linus Torvalds
The latency on Ethernet is too high for many tightly coupled applications (lattice QCD for example). This is why people who need better networking use something like Myrinet. I would assume that these Cray machines have very high-bandwidth, low-latency communications. This is where supercomputers distinguish themselves from clusters.
Not to nitpick but a Viola is a string instrument in the violin family, the word you want is voilà.
It's not just the speed of the data transfer, it's also the latency of the interconnect. A lot of scientific codes will pass around a lot of little messages, and GigE is fast for bulk transfer, but it's not so good for that. That's why there are companies like Quadrics, Myricom, etc... Infiniband should fix this, but you'll want a big infiniband switch.
His point is that building fast machines is hard, and building the fastest machines is really hard. Too many folks think all you have to do is throw enough PCs and GigE NICs at the problem. You can build a machine that way, but the codes don't scale well. Some scientific code will quickly show negative scaling, in fact (where the more processes you add, the *slower* your code will run). MPI codes do that all the time, which is one of the reasons you'll see people running their code at sizes smaller than the whole machine, and at different sizes on different machines.
Yeah, you can build a Linux based world-class supercomputer as a cluster, but you better be willing to sweat the details is all. Or buy a Cray, I guess. ;-)
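To make the negative scaling point concrete, here is a rough toy model in Python. Everything in it is an assumption for illustration: the function names, the 100 seconds of total compute, the 10,000 time steps, the 50 microsecond message latency, and the idea that every process sends one small message to every other process on each step. It is not a model of any particular MPI code or machine.

```python
def run_time(procs, compute_s=100.0, steps=10_000, latency_s=50e-6):
    """Toy run-time model (all defaults are made-up illustrative values).

    Compute divides perfectly across processes, but on every step each
    process exchanges one small message with every other process, so the
    communication term grows with the process count.
    """
    compute = compute_s / procs
    communication = steps * (procs - 1) * latency_s
    return compute + communication


def best_process_count(max_procs=1024):
    """Return the process count with the lowest modelled run time."""
    return min(range(1, max_procs + 1), key=run_time)


if __name__ == "__main__":
    for p in (1, 4, 16, 64, 256):
        print(f"{p:4d} processes: {run_time(p):8.2f} s")
    print("fastest at", best_process_count(), "processes")
```

Under these assumptions the run time falls at first, bottoms out at a fairly small process count, and then climbs again as the per-step messaging cost swamps the shrinking compute share. Lowering `latency_s` pushes that turning point out, which is the case for a faster interconnect.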
There are entire classes of computational problems which are classed as Embarrassingly Parallel.
It means it is so trivial to parallelize the problem and get gains from it (think SETI@Home) that it's a no-brainer.
Other computational problems don't just simply fan out to the bazillions of nodes with tiny independent pieces of data.
Your assertion that the Cray CTO is talking FUD when he uses the actual term is just plain wrong and unfair to him. He actually knows what he's talking about.
Lost at C:>. Found at C.
Scaling or upgrading these systems requires much more than simply ordering more parts; it opens up the whole integration exercise. From an application perspective, clusters limit application scaling. Bandwidth and latency restrictions significantly constrain performance as more processors are applied to a problem.
Has this guy ever heard of Google? I can see his point to an extent; in fact his whole Q&A session/blatant advert really boiled down to a single point: if you need to move a lot of data between processors, then a cluster will fare worse than one of Cray's supercomputers, which have (obviously) more bandwidth between the CPUs and shared memory. It really does depend on the application, but for him to suggest an HPC system is always a more economic, or even better, option than a cluster of cheap x86 boxes is demonstrably false...
Code, Hardware, stuff like that.
Are you being funny or serious?
There's an entire branch of parallel applications which are labeled "embarrassingly parallel". This description simply means that such programs are trivially parallelized and achieve as close to linear speedup as possible when scaled across many nodes. This is because of the low inter-node communication.
For "embarrassingly parallel" applications, a cluster is a really good tool. For programs that don't parallelize as nicely, a nice big vector or SMP machine will do nicely. Some code will run better on a small 20-CPU SMP machine than on a 1000-node cluster.
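For anyone who hasn't seen one, here is a minimal sketch of what an embarrassingly parallel job looks like, in Python. The prime-counting task, the function names and the worker count are all made up for illustration. Threads stand in for cluster nodes here, and in CPython they will not actually speed up CPU-bound work; the point is only the shape of the job, where each chunk is processed with no communication until the final sum.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(bounds):
    """Count primes in the half-open range [lo, hi) by trial division."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split_range(lo, hi, parts):
    """Cut [lo, hi) into roughly equal, non-overlapping chunks."""
    step = (hi - lo + parts - 1) // parts  # ceiling division
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]


def parallel_count(lo, hi, workers=4):
    """Hand each chunk to a worker; workers never talk to each other."""
    chunks = split_range(lo, hi, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_primes, chunks))


if __name__ == "__main__":
    print(parallel_count(0, 100_000))
```

The only data that crosses between workers is one integer per chunk at the end, which is why this kind of job is happy on cheap interconnects. A code that needed neighbouring chunks' results on every step would not split this cleanly.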
Being the CTO of Cray, can you expect him to say anything less? Now while his points are often valid, I think his conclusion, that supercomputers outshine linux clusters, is a little inaccurate. Rather, I think the real conclusion is that linux clusters and supercomputers are both good, but at slightly different things. Which one you need to solve your problem depends, ultimately, on the specific details of your problem. Again, though, him being the CTO of the company, can you really expect him to give a balanced opinion like that, rather than the skewed opinion that his company is always on top?
Cray is a great company, but I really hate that they have to come out with things like this every now and then. Most people in need of a lot of computing power already know the difference between your products and linux clusters and really, they're going to choose whichever's most appropriate for their problem regardless of what your CTO says.
Hm, I haven't played with infiniband, but I have access to a small Myrinet cluster, and it takes a hell of a lot of effort to write your application in such a way as to overcome the big disparity between CPU power and network throughput and to get some normal speed-up.
Paul Terry is right - if they remove the PCI bottleneck it will be much easier to write scalable high-performance applications and then the costs will decrease.
I don't think the Cray assertion is that crazy.
For a 12 CPU opteron unit the academic pricing (admittedly lower than commercial, but that's where most of their sales will go) is about 45K. That's not too shabby. Before you bounce up and down and say I can build four times the cluster for that price, it should be noted that the XD1 gives you a single system image, which simplifies programming and supports shared memory applications (increasingly important for areas such as bioinformatics).
We have a cluster with dolphinics wulfkit; using distributed shared memory slows us down. It's not an end-of-the-world type of slowdown, but it's a factor. Our cluster is a sixteen node, dual xeon 2.2GHz with wulfkit 3d torus interconnects. It cost us, at academic prices, $50K. Admittedly that's more CPU power than the 12 Opterons, but we find ourselves using distributed shared memory a lot (wulfkit is great here), and that would probably be much better on the XD1. Had the XD1 been available a year ago we may have bought one instead.
It really depends on your application. Are Crays cheaper than clusters in terms of harnessable compute power per dollar? Maybe. Depends on your application. Surely that's the correct answer.
Also, buying Cray is about getting access to their software technology too.
R-S
Hi, clueless Slashbot. This is a quick rundown of why your post was stupid, and why Cray supercomputers do, in fact, do some things better than a PC cluster regardless of price.
If you have a supercomputer, you have a very, very, very fast internal bus handling all necessary data transfer. Even with the advent of PCI Express, a cluster of PCs must run in a network model. Therefore, any data crunching that occurs must pass through a network layer, the bus, the physical medium, and back through those limiters once more on the next system. Therefore, if you are doing number crunching that truly cannot afford the delays caused by the data transfer limitation of a PC cluster, a self-contained supercomputer is far and away the best option, even if it's more expensive.
Therefore, contrary to the idiotic drivel you just spouted, Cray does, in fact, have something to offer that no PC cluster currently can.
We now return you to your regular informed diatribe in the name of the self-gratifying masturbatory stupidity that is Slashdot.
Alito: A vote for Alito is a punch in the eye to put that bitch back in her place!
Good clusters don't use IP; they use Infiniband, Myrinet, or Quadrics, which all have OS bypass and transport offload features so that the app can talk directly to the NIC. In fact, Cray's XD1 "supercomputer" uses the same Infiniband interconnect as some "clusters"; Cray just has better NICs.
Have you ever worked with supercomputers?
However, if your supercomputer goes down... well, you're screwed
Cray supercomputers have built-in redundancies. All the subsystems are separate from the processors and memory, which are actually "clustered" (depends on model). Even the OS has built-in means to survive the harshest hardware catastrophe by checkpointing the running jobs regularly to off-site disks.
1000 machines are more reliable than 1 big machine
Wrong again. With 1000 lousy cheap machines, you need an on-site team of technicians to keep them all up. Supercomputers (with built-in redundancy etc.) have equal or lower maintenance requirements.
- Heritage and resultant architecture: Linux clusters are typically processors connected through I/O links, whereas supercomputers are machines where processors exchange data and instructions through shared memory.
- PCI bottlenecks: This is the key argument made: the bottlenecks introduced by PCI communication. He goes on to say that performance problems in any given such cluster tend to remain with any other such cluster. I agree with that.
- High Availability: He then goes on to talk about the reliability, availability and manageability of the supercomputers against typical clusters. I think this is where the FUD creeps in, along with marketing BS.
In all fairness, he does raise a critical point. Overall, however, considering the relative ease and popularity of building, administering and growing a cluster these days, I think the cost-effectiveness of a single monolithic machine is a moot point.
http://efil.blogspot.com/
It's all very well to mock the I/O of PCI, but that's why we're all imminently moving to PCI Express, at a rather more respectable (current) maximum of 8+ GB/s rather than 133 MB/s... Run a few gigabit ethernets in a hypercube formation and you have some rapid data transfer...
The main reason for supercomputers to exist is not the high bandwidth, it's the latency of the switch. The network hardware that is used in clusters as the interconnect medium (switch) can provide very high bandwidth, but the latency is high simply because you cannot have low latency over large distances, and the network hardware is designed to connect over large distances. Even if you put your nodes in the same rack, the 1000000 gigabit ethernet or whatnot stock solution you use to interconnect them will still take milliseconds of ping time.
The supercomputers run on a custom, specially designed switch instead. This design includes a lot of cost and complexity just to get the latency down. This may not make any difference for your typical web-server application, but that's not what the supercomputers are designed for.
Some scientific computations have very low dependency between parts of the dataset. For example, pretty much any simulation or search application does fine on a cluster. Anything that allows you to split the work into a large number of independent tasks runs fine on a cluster. Some scientific applications do not allow the work to be split into independent pieces. Sometimes you just need random access all over your distributed data space, and for such applications the speed of computation is determined mostly by network latency. This is where you need a supercomputer, and no cheap cluster would help.
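The latency versus bandwidth argument can be put into numbers with the usual back-of-the-envelope model, where the time to move one message is a fixed latency plus size divided by bandwidth. The Python sketch below is only that model. The function names are mine, and the two interconnect figures (50 microseconds versus 2 microseconds of latency, both at gigabit-class bandwidth) are assumed round numbers picked to show the effect, not measurements of any real product.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to deliver one message: fixed latency plus serialization time."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def latency_share(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Fraction of the transfer time that is pure latency."""
    total = transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s)
    return latency_s / total


# Assumed, illustrative interconnect figures (not vendor numbers).
COMMODITY = {"latency_s": 50e-6, "bandwidth_bytes_per_s": 125e6}
CUSTOM = {"latency_s": 2e-6, "bandwidth_bytes_per_s": 125e6}

if __name__ == "__main__":
    for size in (64, 64 * 1024, 100 * 1024 * 1024):
        slow = transfer_time(size, **COMMODITY)
        fast = transfer_time(size, **CUSTOM)
        share = latency_share(size, **COMMODITY)
        print(f"{size:>11d} bytes: commodity {slow:.6f} s, "
              f"custom {fast:.6f} s, latency share {share:.1%}")
```

With these assumed figures, a tiny message is almost entirely latency, so the low-latency link wins by a wide margin, while a bulk transfer takes nearly the same time on both. That matches the claim above: codes that do random access across distributed data live in the small-message regime.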
Only in routed Infiniband networks, which no one uses. The normal Infiniband protocol is very lean and totally different from TCP/IP.
While your comment is largely informative you are still confusing PCI-Express with PCI-X. They are different things. I know that it's inherently confusing, but still...
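Since the three buses keep getting mixed up, here is a quick Python sketch of where their theoretical peak figures come from. The function names are made up, and it assumes first-generation PCI Express (2.5 GT/s per lane with 8b/10b encoding). These are peak numbers on paper, not what any real card sustains.

```python
def parallel_bus_mb_per_s(width_bits, clock_mhz):
    """Peak rate of a parallel bus (PCI, PCI-X): bytes per clock times clock."""
    return width_bits / 8 * clock_mhz


def pcie_gen1_mb_per_s(lanes):
    """Peak rate per direction for first-generation PCI Express.

    Each lane signals at 2500 Mbit/s; 8b/10b encoding leaves 8 of every
    10 bits for data, which works out to 250 MB/s per lane per direction.
    """
    return lanes * 2500 * 8 / 10 / 8


if __name__ == "__main__":
    print("PCI 32-bit/33 MHz   :", round(parallel_bus_mb_per_s(32, 33.33)), "MB/s")
    print("PCI-X 64-bit/133 MHz:", round(parallel_bus_mb_per_s(64, 133.33)), "MB/s")
    print("PCI Express x16     :", round(pcie_gen1_mb_per_s(16)), "MB/s per direction")
```

PCI and PCI-X are shared parallel buses, so every device on the bus splits that figure. PCI Express is a set of point-to-point serial lanes, and the x16 number doubles if you count both directions at once.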
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
In short, if clustering provides a better/cheaper solution, go with it.
Um, yes. The grandparent and ggp were (I think) implying, though, that for that particular application you actually won't be able to be both better and cheaper with a clustering solution.
i.e. if you throw enough Linux boxes into the cluster to be able to achieve the "better (faster)" solution, you will no longer be cheaper.
But I don't think anyone was arguing that even if a cluster is cheaper and faster you should still go with a supercomputer instead.
Unless I'm now out of date, the last figures I saw said the CrayLink Interconnect can do 102 GB/sec. That's just a tad bit more, don't you think? No messing with masses of gig ethernet to crossconnect them. It's just done.