Slashdot Mirror


Cray XT-3 Ships

anzha writes "Cray's XT-3 has shipped. Using AMD's Opteron processor, it scales to a total of 30,580 CPUs. The starting price is $2 million for a 200-processor system. One of its strongest advantages over the standard Linux cluster is that it has an excellent interconnect built by Cray. Sandia National Labs and Oak Ridge National Labs are among the very first customers."

67 of 260 comments

  1. imagine a... by Anonymous Coward · · Score: 5, Funny

    single node of those.

    1. Re:imagine a... by Anonymous Coward · · Score: 5, Insightful
      *rolls eyes*

      When you have a single CPU, designing the system to be pretty fast is easy. There's no major contention to deal with.

      Two CPUs? Slightly harder, but reasonably straightforward. You don't see a 2x improvement in speed over one CPU, but it's around 1.95x, give or take a bit.

      Four CPUs? Now you're starting to see less improvement ... probably around 3.2x, because of all the contention issues.

      Sixty-four CPUs? You'll be lucky to get a 50x speed up over a single CPU.

      When you get to 200 CPUs, the issue of access to shared memory and other shared resources becomes critically important. It's also an issue that most computer buyers don't need to worry about, because they don't have 200 CPUs in their system. This means that you have a lot of highly specialised research going on, and relatively few buyers to spread the cost of that research over.

      Two million for a 200 CPU box which has low latency, low contention, and solid reliability is not a lot at all. You might not buy it. That doesn't mean nobody will.

    2. Re:imagine a... by crimsun · · Score: 4, Informative

      It's not just hardware: the amount of non-parallelizable code in parallel applications impacts scalability most tremendously.

      The upper bound on speedup is generally Amdahl's law. Plainly, the efficiency approaches zero as the number of processes is increased. Generally we consider the major sources of overhead to be communication, idle time, and extra computation. Interprocess communication is considered negligible for serial programs in this context (we consider message passing). Idle time ends up contributing to overhead, because processes idle awaiting information from others. Extra computation is virtually unavoidable at some point; for instance in MPI's Single Program Multiple Data model, each process in tree-structured communication other than the root is eventually idled prior to the completion of computation, and each process determines IPC at some point based on rank.

      There are notable exceptions to Amdahl's law, however; Gustafson, Montry and Benner wrote about such in Development of parallel methods for a 1024-processor hypercube, SIAM Journal on Scientific and Statistical Computing 9(4):609-638, 1988.
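
      As a rough illustration of the bound described above: Amdahl's law gives a fixed-size speedup of 1 / (s + (1 - s)/n) for serial fraction s on n processes, while the scaled-speedup view associated with the Gustafson paper lets the problem grow with n. This is only a sketch; the 5% serial fraction is an assumed value picked for illustration, not a measurement from any real application.

```python
def amdahl_speedup(serial_fraction: float, n: int) -> float:
    """Fixed-size speedup on n processes (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)


def gustafson_speedup(serial_fraction: float, n: int) -> float:
    """Scaled speedup when the parallel part grows with n (Gustafson's law)."""
    return n - serial_fraction * (n - 1)


def efficiency(speedup: float, n: int) -> float:
    """Speedup per process; 1.0 would be perfect linear scaling."""
    return speedup / n


if __name__ == "__main__":
    s = 0.05  # assumed serial fraction, for illustration only
    for n in (1, 2, 4, 64, 1024):
        a = amdahl_speedup(s, n)
        g = gustafson_speedup(s, n)
        print(f"{n:5d} procs: Amdahl {a:7.2f}x (eff {efficiency(a, n):.3f}), "
              f"Gustafson {g:8.2f}x")
```

      Under the fixed-size model the speedup can never exceed 1/s, which is why the efficiency heads toward zero as processes are added.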

    3. Re:imagine a... by ant_slayer · · Score: 2, Informative

      My apologies, but I couldn't help but think that you'd be *really* lucky to get 50x out of 64 CPUs. Examine the following:

      1 CPU @ 1.00x -> 1.00 / 1 = 1.000
      2 CPUs @ 1.95x -> 1.95 / 2 = 0.975
      4 CPUs @ 3.20x -> 3.20 / 4 = 0.800
      64 CPUs @ 50.0x -> 50.0 / 64 = 0.781

      Pop that into an OpenOffice.org spreadsheet and look at the graph.

      That is not linear, in fact, it's non-linear in the direction that *helps* more and more processors. If the decline from 4 CPUs to 64 CPUs is a mere 1.9% efficiency compared to the 17.5% drop from 2 to 4, then, by golly, I'm going to cram hundreds of CPUs in there and see it tail off. Hello amazing performance.

      Instead, reality is that the dynamics change. You can't evaluate "equivalent performance" to a single processor system. There is no reasonable metric with which to do so.
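
      For anyone without a spreadsheet handy, the per-CPU efficiency figures above are just speedup divided by CPU count. A small Python sketch using only the speedups quoted in this thread (the variable and function names are made up for the example):

```python
# Speedup figures quoted in the thread: CPU count -> claimed speedup.
claimed_speedups = {1: 1.00, 2: 1.95, 4: 3.20, 64: 50.0}


def parallel_efficiency(speedup: float, cpus: int) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / cpus


efficiencies = {n: parallel_efficiency(s, n) for n, s in claimed_speedups.items()}

if __name__ == "__main__":
    previous = None
    for n, eff in efficiencies.items():
        drop = "" if previous is None else f" (down {100 * (previous - eff):.1f} points)"
        print(f"{n:3d} CPUs: efficiency {eff:.3f}{drop}")
        previous = eff
```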

      -Ant Slayer-

  2. Comment removed by account_deleted · · Score: 2, Funny

    Comment removed based on user account deletion

  3. How big is it? by rooijan · · Score: 3, Interesting

    I read the article (okay, so I kinda read it :-) ) and it has the speed and specs to be a geek's improvement on sliced bread. But how big is it, physically?

    The article doesn't appear to mention its dimensions, and I'm curious to know what kind of space you need to install this baby. Anyone got any idea?

    --
    Daar is nie 'n lepel nie
    1. Re:How big is it? by Anonymous Coward · · Score: 4, Informative

      Dimensions (cabinet): H 80.50 in. (2045 mm) x W 22.50 in. (572 mm) x D 56.75 in. (1441 mm)

      Weight (maximum): 1529 lbs per cabinet (694 kg)

      http://www.cray.com/products/xt3/specifications.html

    2. Re:How big is it? by Anonymous Coward · · Score: 2, Funny

      Dimensions (cabinet): H 80.50 in. (2045 mm)

      Wow... for the first time in my life, I couldn't picture 80 inches, but I could 2 meters. I think there may be hope in the metric system after all.

  4. I'll pass for now. by mrjb · · Score: 3, Funny

    This is only the XT-3. I'll wait for the Pentium-3-4.

    --
    Visit http://ringbreak.dnd.utwente.nl/~mrjb/growingbettersoftware to download your free copy of the book
    1. Re:I'll pass for now. by Pleione · · Score: 2, Funny

      Don't you mean "AT/ATX"?

  5. we're getting closer... by nilbog · · Score: 5, Funny

    A few more years of advances like this and we might have a machine capable of running Longhorn!

    --
    or else!
    1. Re:we're getting closer... by metlin · · Score: 3, Funny


      Ahh, now that's what I call an optimist.

    2. Re:we're getting closer... by NanoGator · · Score: 2, Funny

      "A few more years of advances like this and we might have a machine capable of running Longhorn!"

      A few more years of computer advances and this joke will still be modded funny!

      --
      "Derp de derp."
    3. Re:we're getting closer... by NanoGator · · Score: 2, Insightful

      "Oh, quiet you...it was funny...sort of...well not really but you know what I mean?"

      It was funny like a year ago. Now it's as overused as an SNL skit.

      --
      "Derp de derp."
    4. Re:we're getting closer... by provolt · · Score: 4, Insightful

      Ah the joys of youth.

      Back in my day we spelled "enuff" without the 'f' character and it was good enough for us.

  6. $2 million for a computer? by commodoresloat · · Score: 3, Funny
    It better have a lot of good games. How many mouse buttons does it have?

    I can't believe people complain about the price of iMacs....

    1. Re:$2 million for a computer? by Klar · · Score: 2, Funny

      How many mouse buttons does it have?
      Please.. it doesn't have any.. it just *knows* what you want to do before you *know* what you want to do..

  7. real FPU operations by Barbarian · · Score: 4, Interesting

    How are the Opterons at standard FPU operations in double precision? SSE2 and friends are nice, unless you have to make compromises in your simulations.

    I ask, because I remember that the Athlons beat the pants off the Pentium 4's in FPU operations, so all the benchmarks were rewritten to use SSE2.

    1. Re:real FPU operations by jmv · · Score: 4, Informative

      Opterons beat the pants off the Pentium 4s in x87 (i.e. old) FPU operations. If you want to get good performance, you need SSE/SSE2. Both for AMD and Intel. For pure SSE, the Pentium 4s beat the Opterons mainly because of the clock speed, but for multi-processor systems, the hyper-transport and all more than makes up for that.

    2. Re:real FPU operations by jmv · · Score: 3, Insightful

      Both SSE and 3DNow! get you (in theory, at best) two adds and two multiplies per clock cycle, even on an Opteron. So yes, just because of the clock, the P4 beats the Opteron in the case of pure (no memory/cache access, no dependency, nothing else) float operation. Now, in real life, you sometimes spend longer waiting for the data than computing with it and that's how the Opteron quite often comes out on top, especially for multi-processors.

    3. Re:real FPU operations by jmv · · Score: 5, Interesting

      Couple facts about SSE:
      1) You can use it in scalar mode, in which case it's almost like x87, only a bit faster because:
      a) It doesn't use a braindead register model (stack)
      b) On P4, you can do a mul and an add in parallel with SSE, but not with x87
      2) You can use SSE intrinsics. It's not as easy as "normal" programming, but easier than assembly and almost the same speed.
      3) Unaligned access is possible. It's slower than aligned access, but overall better than non-vectorized code.
      4) Trig is so slow that SSE/x87 doesn't matter (unless you write approximations, in which case SSE will also be faster).

  8. Just the name brings back memories by Dancin_Santa · · Score: 3, Informative

    In this day and age of very fast computers and clusters built in our basements, there sometimes comes along a story that whispers of the computing age of days long past. Cray is one of those names that can drop a jaw by its mere utterance.

    The name is synonymous with speed and power and the unwillingness to cut corners in order to shave a few dollars off the final product. When you buy a Cray, you know you are getting top of the line hardware.

    It looks like Sandia wants to build the fastest supercomputer in the world by clustering a few of these monsters, and I have no doubt that they will. Looks like more fun articles about this in the future. :-D

    There are two prominent applications for these machines. The first is nuclear weapons simulation. Personally, I don't see the point to that. The other application is in weather prediction. By feeding current weather variables into a well-written model, a supercomputer is able to predict the future weather to a large degree of accuracy. Such an application will always be welcome.

    I think I'm going to have to fire up the old ][e, the nostalgia is killing me!

    1. Re:Just the name brings back memories by joib · · Score: 4, Informative


      There are two prominent applications for these machines. The first is nuclear weapons simulation. Personally, I don't see the point to that. The other application is in weather prediction.


      Oh, please. Buy a clue, will ya? There's lots and lots and lots of applications that use supercomputers, or could use if they were more affordable. A few examples from the top of my head:

      Materials science, that is ab initio simulations, moldyn, you name it. This alone probably uses > 50 % of all supercomputer cpu time in the world. By comparison, weather prediction and nuke simulations are small potatoes (or shall we say, the simulations as such are big, but the number of people engaged in weather prediction or nuke simulation is really small compared to all the supercomputing materials scientists).

      CFD, the automobile and aerospace sectors are big users.

      Electronic design.

      Seismic surveys, the oil industry uses lots and lots of supercomputers to find oil deposits.

      Biology. Gene sequencing, moldyn simulations of lipid layers and whatever.

      Climate prediction, somewhat related to weather prediction. Official purpose of the Earth Simulator.

      All of the examples above could easily use almost any amount of cpu power you can throw at them. The only thing that stands between a lot of scientists and improved understanding of the world is computing power.

    2. Re:Just the name brings back memories by LiquidCoooled · · Score: 2, Funny

      There are two prominent applications for these machines.

      Wrong! There is a third, more used application: Solitaire.

      Even super computer coders have to wait for results.

      I also asked this recently, but didn't get a reasonable answer: do these beasts have screen savers? If so, are they just blackout type, or busy 3d rendered whizbang super cool ones "Just because we can"?

      (I realise you may not be able to answer that, but someone might)

      --
      liqbase :: faster than paper
    3. Re:Just the name brings back memories by capmilk · · Score: 2, Funny
      During the times of no activity what does it do?

      It creates random noise that is then fed into the Seti project so our computers have something to do in times without activity.

    4. Re:Just the name brings back memories by droleary · · Score: 2, Funny

      There are two prominent applications for these machines. The first is nuclear weapons simulation. Personally, I don't see the point to that.

      Well, when you nuke the site from orbit, you do want to be sure don't you?

    5. Re:Just the name brings back memories by flaming-opus · · Score: 4, Insightful

      Actually, there is no reason to cluster a few of these. If you have a 2000 node xt3 (or t3e, paragon, blue-gene, cm5, insert mesh-structured mpp here) and a 4000 node xt3, you stick them together and make a 6000 node xt3. But that's just picking nits.

      Curiously the xt3 IS about shaving dollars off the price. If you go read the original whitepapers on the system, they go through EXTENSIVE cost-return analysis. They studied their (then-) current generation of cluster systems, as well as future linux/solaris/aix clusters, and rejected them as (interestingly) FAR TOO EXPENSIVE, once the administrative costs are factored in. They then looked at, and rejected, cray's vector solution, the X1. They then decided that the (amazingly) most cost effective solution was to underwrite cray's product development cycle on a wholly new product. Basically they asked for an update to the system they already had (asci-red, i.e. intel paragon++). Nobody was building such a thing. Since cray had a really strong similar product in the 90s (T3D, T3E), the department of energy asked them to create an update. Some designs never die.

      What I'm most interested in is the reliability. One of the biggest difficulties in the T3D engineering cycle was dealing with memory failure. red-storm is going to have 10,000 processors. Let's assume each has 2 banks times 3 dimms (chip-kill) of memory. That means there are 10,000 x 6 x 18 = 1 million+ memory chips in the system. If 1/100th of a percent of these fail, that's still a lot of memory failures. How well are faults isolated? That's the big question for systems this big.
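
      To put a number on that back-of-the-envelope estimate, here is the same arithmetic as a short Python sketch. Every input is one of the comment's own assumptions (processor count, DIMMs per processor, chips per DIMM, hypothetical failure fraction), not a published Red Storm figure.

```python
processors = 10_000        # processor count assumed in the comment
dimms_per_processor = 6    # 2 banks x 3 DIMMs, as assumed in the comment
chips_per_dimm = 18        # chips per DIMM used in the comment's estimate
failure_fraction = 0.0001  # 1/100th of a percent, a purely hypothetical rate

total_chips = processors * dimms_per_processor * chips_per_dimm
expected_failures = total_chips * failure_fraction

print(f"memory chips in the system: {total_chips:,}")
print(f"chips failing at that rate: about {expected_failures:.0f}")
```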

      I'm also a little wary of cray's use of lustre. I've used lustre before, as well as other cluster-FSes. While I'm not aware of other filesystems that will scale to 700+ i/o nodes, I'm not confident in lustre. It's an immature product at best. (I don't mean to disparage the people working on it, it's a neat architecture, but it's a hard problem, and I'm not sure it's ready for prime-time.)

  9. How big it is by commodoresloat · · Score: 2, Informative

    from TFA -

    Dimensions (cabinet):

    H 80.50 in. (2045 mm) x W 22.50 in. (572 mm) x D 56.75 in. (1441 mm)

    Sorry to reply twice but I forgot this detail.

  10. You don't have to begin to imagine by commodoresloat · · Score: 3, Informative

    You could just read on the spec page: Power: 14.8 kVA (14.5 kW) per cabinet. Circuit Requirement: 80 AMP at 200/208 VAC (3 Phase & Ground), 63 AMP at 400 VAC (3 Phase, Neutral & Ground) Cooling Requirement: Air Cooled, Air Flow: 3000 cfm (1.41 m3/s) Intake: bottom, Exhaust: top.

    1. Re:You don't have to begin to imagine by fbform · · Score: 5, Interesting


      More interesting is this spec:

      Acoustical Noise Level: 75 dBa at 3.3 ft (1.0 m)

      For comparison, that's roughly the same as an average vacuum cleaner when you're operating it, or maybe a good-sized pickup truck passing you in the next lane.

      And remember, this value is *per cabinet*. You have to do a weighted sum over all the cabinets in an installation to get a true dB level. I wonder whether the maintenance people will have to use noise-level exposure limits for this baby.

      And here I was, complaining about the quiet whine of my PC's fan.
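
      For the curious, decibel levels from independent sources combine logarithmically rather than by a plain sum. A rough Python sketch, under the simplifying (and unrealistic) assumption that every cabinet is an incoherent source heard from the same 1 m distance; a real machine room would need distances and room acoustics taken into account:

```python
import math


def combined_level_db(levels_db):
    """Combine incoherent sound sources given their individual levels in dB."""
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power)


if __name__ == "__main__":
    per_cabinet = 75.0  # dBA at 1 m, from the spec quoted above
    for cabinets in (1, 2, 10, 100):
        level = combined_level_db([per_cabinet] * cabinets)
        print(f"{cabinets:3d} cabinets: about {level:.1f} dB")
```

      Under that assumption, each tenfold increase in the number of identical cabinets adds 10 dB.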

      --
      Time flies like an arrow. Fruit flies like a banana.
    2. Re:You don't have to begin to imagine by pchan- · · Score: 4, Interesting

      Power: 14.8 kVA (14.5 kW) per cabinet.

      that's amazing. how did the cray guys get a kilovolt-ampere that is not equal to a kilowatt? just goes to show you the power of fast interconnects.

    3. Re:You don't have to begin to imagine by wronskyMan · · Score: 5, Informative

      Disclaimer: IANACEBIATAPEC (I Am Not A Cray Engineer But I Am Taking A Power Engineering Course)
      It's fairly common to get kVA != kW.
      Overall power used by a load is expressed as S=P+jQ, where P is the "real" power and Q is the reactive power (capacitive/inductive from motors, fluorescent lamp ballasts, etc).

      While the "units" of S, P, and Q are power=voltage*current, S is generally expressed in VA, P in W, and Q in VAR (volt-ampere reactive) to differentiate the variables. Because the magnitude of S=sqrt(P^2+Q^2), S will always be greater than or equal to P (in this case, 14.8kVA=sqrt((14.5kW)^2+(+-2.965kVAR)^2)).
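
      The reactive power figure can be checked by rearranging the magnitude relation to Q = sqrt(S^2 - P^2). A quick Python sketch using the two spec-sheet numbers; the power factor P/S is an extra quantity computed here for illustration and is not quoted on the spec page:

```python
import math

apparent_kva = 14.8  # |S|, apparent power per cabinet from the spec sheet
real_kw = 14.5       # P, real power per cabinet from the spec sheet

# |S|^2 = P^2 + Q^2, so the magnitude of Q follows directly.
reactive_kvar = math.sqrt(apparent_kva**2 - real_kw**2)
power_factor = real_kw / apparent_kva

print(f"|Q| = {reactive_kvar:.3f} kVAR")
print(f"power factor = {power_factor:.4f}")
```

      The sign of Q (inductive or capacitive) cannot be recovered from the magnitudes alone, hence the +- above.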

      --
      --- You shall know the truth, and the truth shall make you mad- Neal (not Cowboy) Boortz
  11. Opterons and PowerPC together by Henriok · · Score: 5, Interesting

    It seems that the XT-3 not only uses Opteron processors but also uses PowerPC 440 co-processors from IBM to offload inter-processor communication from the main computing CPUs. Quite an interesting setup.

    The XT-3's biggest competitor in this segment must be the Blue Gene/L type supercomputer made by IBM. The processor in Blue Gene/L is a custom-built dual-core version of the PowerPC 440 with built-in high-speed interconnects.

    Just like IBM have a finger in all the future game consoles, they seem to have a finger in several of the next generation super computers also. Nice going IBM.

    --

    - Henrik

    - when the Shadows descend -
    1. Re:Opterons and PowerPC together by Shinobi · · Score: 2, Informative

      No. The biggest competitor to the XT3 will be machines like the NEC SX-8, their own X1 family or the IBM p690's. They are all shared memory systems, while the Blue Gene family is not. And therein lies a whole world of difference.

    2. Re:Opterons and PowerPC together by evilviper · · Score: 2, Insightful
      Just like IBM have a finger in all the future game consoles, they seem to have a finger in several of the next generation super computers also. Nice going IBM.

      It's not that they're the best thing since sliced bread, it's mainly that all their competition went down the chute for one reason or another.

      HP/Compaq/DEC was the king of supercomputers. Now they're only supporting their formerly glorious products, with practically nothing new coming to replace them.

      Sun seems to really be sitting on their ass.

      Intel was trying, but screwed the pooch with Itanium/Itanic.

      SGI was a competitor, but they've just faded out.

      Motorola could compete if they put some effort into it, but they've been out of it for some time.

      etc.
      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    3. Re:Opterons and PowerPC together by pchan- · · Score: 2, Interesting

      let's see what you're missing:

      * first, sgi still makes and sells supercomputers, they are far from faded. they also own cray (or did).
      * tandem, bought by compaq, we all know what happened there.
      * hp sells a superdome once in a while. but nobody seems excited about their itanic systems.
      * sun, rotting with their out of date cpus.
      * fujitsu is doing well in the supercomputer market.
      * nec is also successful.
      * ibm, of course.

      and you mentioned motorola? you're joking, i hope.

      the largest purchasers of supercomputers in the world - national labs and the nsa, like to buy american hardware. they've always had a hand in keeping the industry afloat. notice that the big labs tend to round-robin their supercomputer vendors so that they buy a machine from each vendor.

  12. Re:So......the cost compared to? by the_2nd_coming · · Score: 2, Informative

    Xserve clusters would be cheaper, but I think that Cray has the edge in the interconnect tech. So, if you need massive bandwidth in the system, get the Cray. If you need the next best bandwidth at a low price, get the Xserve cluster.

    --



    I am the Alpha and the Omega-3
  13. Sic transit gloria mundis by oakad · · Score: 2, Insightful

    It seems like Cray is not capable of sustaining its heritage. Buying cheap AMD processors and connecting them with a customized HT interconnect is not enough to build a machine capable of the record-breaking single-task performance that old Crays exhibited. Whereas with the Cray XMP one could be sure of having the best machine money can buy (with outstanding scalar and vector abilities), the new Cray is just another loosely-coupled AMD cluster. Thank god it's not a NEC clone (at least).

    1. Re:Sic transit gloria mundis by Anonymous Coward · · Score: 2, Informative

      It's not a customized HT interconnect. There's a dedicated SeaStar router chip that connects via HT to the uniprocessor Opteron + RAM node, but the actual fabric connecting the SeaStars is proprietary (each SeaStar connecting to six others via 7.6 GB/s interconnects, forming a 3D grid fabric expandable to 30K+ nodes).

      That's why they use mere 100-series Opterons: they need only one HT link per CPU. Because the whole is not based on HT interconnects.

      Really, loosely-coupled cluster my ass. This machine *is* capable of record-breaking single-task performance. Read the product pages again.
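
      To picture the fabric described above, where each router links to six others in a 3D grid, here is a toy Python sketch. It assumes the grid wraps around at the edges (a torus) and uses made-up dimensions; neither detail comes from the product pages, and the function name is purely illustrative.

```python
def torus_neighbors(node, dims):
    """Return the six neighbors of node (x, y, z) in a wrapped 3D grid."""
    x, y, z = node
    nx, ny, nz = dims
    return [
        ((x + 1) % nx, y, z), ((x - 1) % nx, y, z),  # +/- one hop along x
        (x, (y + 1) % ny, z), (x, (y - 1) % ny, z),  # +/- one hop along y
        (x, y, (z + 1) % nz), (x, y, (z - 1) % nz),  # +/- one hop along z
    ]


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical grid size, for illustration only
    for neighbor in torus_neighbors((0, 0, 0), dims):
        print(neighbor)
```

      With at least three nodes along every axis, each node gets six distinct neighbors regardless of how large the grid grows, which is what lets this kind of fabric expand without adding links per node.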

  14. The first test of the new Cray by teamhasnoi · · Score: 3, Funny
    they simulated a woman who posts to Slashdot and is waiting for her Centris running PearPC on Debian to boot OS X.

    Strangely, it took roughly a week. The second test was a simulation of the moderation results of this post.

    It received a +5 Funny, which puzzled researchers, as it is currently modded -1 Offtopic.

    Damn you Schroedinger!

  15. Re:software by Coryoth · · Score: 5, Informative

    what kind of operation system runs on this beast?

    UNICOS is usually a safe bet. In this case the specs say UNICOS/lc, which is made up of "SUSE(TM) Linux(TM), Cray Catamount Microkernel, CRMS and SMW software"

    I'm not entirely clear how to interpret that, but I think it runs as follows: It runs the Catamount Microkernel as the kernel, and uses SUSE for everything else (so we have SUSE Linux, without the Linux - all of a sudden that GNU/Linux stuff starts to make sense). The CRMS is their interconnect management and monitoring software, and SMW is the System Management Workstation - which I'm guessing is their administration frontend.

    It's worth noting that that's some pretty serious software there (because Cray has a lot of experience dealing with large systems) - you can bet that the management and monitoring software is some very serious stuff.

    This thing is to a beowulf cluster what a dual G5 PowerMac is to a homebuilt PC system running Linux From Scratch. It's going to work flawlessly "out of the box" with a smooth and polished interface that lets you get everything you want done simply and easily. You can of course make your home built PC with LFS work just as well, it's just going to take you an awful lot of effort.

    Jedidiah.

  16. Re:So......the cost compared to? by Coryoth · · Score: 4, Informative

    So, how does this compare to running Apple's Xserve? Bang per buck? Heat? Space? Etc etc....

    There's not a lot to compare. We're talking apples and oranges. It's like asking to compare a PowerMac G5 with a bunch of PC parts scattered on the floor as desktop machines. Sure, you can put the PC together, load it with Linux, tinker with it to get everything working, etc. but that's a fair amount of work compared to taking the PowerMac out of the box, plugging it in, turning it on, and having everything work perfectly.

    Read the specs, particularly with regard to the interconnect, system administration, and hardware and software reliability features. This thing is seriously engineered to be a massively parallel system with top of the line hardware and software to support and maintain that, as well as extremely impressive reliability features.

    Jedidiah.

  17. AMD gets about... by BrookHarty · · Score: 2

    So 96 processors, AMD gets about 144K per PE node at 1500 per cpu, or does Cray get a discount?

    Also, a 30,000 cpu complex, AMD must be making a tidy sum.

  18. Re:MP performance overhead by Big+Mark · · Score: 3, Informative

    If Crays were built the same way as desktop dual-proc machines, then yes, the multi CPU overhead would cripple it. Fortunately, it's designed completely differently - e.g. they use PowerPC chips to handle almost all of the inter-processor communication.

    You can't really compare something that can hold thousands of CPUs to something powered by Abit that can hold two, anyway. It's like comparing apples and a strange bug thing with tentacles.

  19. Interesting note by floydman · · Score: 2, Interesting

    from their Tech.sheet they are using the Lustre file system

    This is the first time I've seen a shipped Linux with this file system. Now the interesting part is that Lustre is made for Linux clusters, but this monster is not a cluster... can anybody shed some light?

    --
    The lunatic is in my head
  20. Re:My new dream toy by Guppy06 · · Score: 3, Funny

    Maybe if you included promises of free iPods...

  21. Re:cray by Anonymous Coward · · Score: 5, Interesting

    Cray never went "belly up". It was acquired by SGI around 1997 or so, then divested and merged with Tera, who renamed the resultant entity "Cray Inc."

    Although it's true that Cray was not growing strongly before the SGI buy-out, it was not failing either. It could have kept running quite happily for many years, but in the bizarro-world of Wall Street, a company which is not growing is dying. I so love it when economists use biological terminology for corporations. In Wall Street's thinking, the only healthy growth would be a cancerous tumor.

    Anyway....

    The whole SGI-period of Cray is actually quite fascinating, and I suspect the true story will never be fully known. Lots of SGI engineers had their non-Cray technology branded with Cray marketing names, most egregiously LegoNet becoming CrayLink. Lots of Cray folks - aka Crayons - felt that the core of their company was gutted by an SGI operation which didn't care for the extreme high-ends of HPC.

    One rumor I heard, from a well-placed source, is that the Cray merger with SGI was primarily arranged by the USG. The intelligence services have huge investments in both companies' products, so the merger between them made sense. I was told that as a quid pro quo, the USG had an in-principle agreement to continue purchasing Cray gear to provide enough revenue inside SGI to keep both Cray architectures alive. However, certain parts of SGI felt that the US government didn't live up to their agreement, negotiations to rectify that weren't successful, and so SGI management defunded significant aspects of the Cray engineering work.

    Also, FYI, Cray is one of those companies which will never totally go "belly up" anyway. Given the sensitivity of the work which they did, their support databases alone are full of sensitive and/or classified information. Should the company cease trading, it would be acquired by a shelf company whose sole function is to ensure this data would remain private. That's been the fate of almost all of the now-defunct supercomputer and high-end graphics companies who formerly supplied the defence and intelligence market.

  22. Re:Nuclear Simulations by October_30th · · Score: 2, Insightful
    you don't have to do any realworld testing

    I admire your positive outlook on the prospects of simulations, but as an experimentalist, I find this "soon we won't need experiments at all" (see Rev. Mod. Phys. 64, 1045-1097 (1992), for instance) attitude very dangerous. Simulations and models, even at the first principles level, should never be trusted implicitly. The only sure way to tell how nature works is via experimentation.

    I can sort of understand simulating nuclear explosions, but simulating the aging process of a warhead doesn't make that much sense to me - unless the simulations are accompanied by direct observation of the (accelerated) aging of a warhead.

    --
    The owls are not what they seem
  23. Re:newfangled buzz. by adzoox · · Score: 2, Interesting

    Ha - well I'm sure the guy behind CherryOS will have a press release that it runs The Mac OS at 30 Terahertz.

    --
    Yell & scream & rant & rave... it's no use... you need a shaaaave ~ Bugs Bunny
  24. Re:cray by Fred_A · · Score: 2, Funny

    It seems to be really lacking in the blinkenlight department though.

    What good is a supercomputer without blinkenlights ? They just don't make them like they used to...

    --

    May contain traces of nut.
    Made from the freshest electrons.
  25. Re:So......the cost compared to? by adzoox · · Score: 2, Informative

    You say it's comparing Apples to Oranges but it's not really ...

    The VT Supercomputer specs vs the Cray specs page you pointed to:

    CRAY 460 GFLOPS per cabinet (96 processors @ 2.4 GHz)

    Apple - if my math is right - 420 GFLOPS (100 processors @ 2.0 GHz)

    The new specs for the specialized VT Supercluster are pretty impressive.

    Their throughput and interconnect is most likely weaker - but still VERY strong with fiber channel.

    --
    Yell & scream & rant & rave... it's no use... you need a shaaaave ~ Bugs Bunny
  26. Re:The math for a comparable Xserve system by joib · · Score: 4, Insightful


    What a value!!


    That is, until you throw a tightly coupled problem at it and the Cray is 10 times faster because it has much better internode bandwidth and lower latency.

    And, you forgot to count the cost of the InfiniBand interconnect that the VT cluster used? That's a couple grand per node.

    Bottom line, apples and oranges. If your application is easily parallelizable (i.e. doesn't require much communication between the nodes) you'd be stupid to piss away your money on a "real" supercomputer instead of a cluster. And vice versa.

  27. Re:So......the cost compared to? by ozbird · · Score: 2, Funny

    There's not a lot to compare. We're talking apples and oranges.

    No, we're talking Apples and Crays... Didn't you read the post before replying? ;-)

  28. 700kgs, 75dB and 14kW... by Alkonaut · · Score: 4, Funny

    ...Sadly I think that beats my Volkswagen on all three

  29. Finally ... by Zurd3 · · Score: 2, Funny

    We'll be able now to install Gentoo in just a few days !

  30. No, what stands in the way is price by Moraelin · · Score: 3, Interesting

    The real problem that stands between scientists and them having lots of shiny toys is funding.

    E.g., yeah, having a 30,000 CPU super-computer to simulate your gene model on would be nice. Forking over half a billion for it, well, it's suddenly not that nice any more.

    Having one of those to simulate an electronic circuit, now that would probably rock. Again, paying half a billion for it, suddenly isn't that attractive.

    The real question isn't how nice a toy you'd like to have, it's ROI. (Unless you work for the government, and just have a budget you _have_ to blow on stuff, whether you need that stuff or not.)

    And in that context, you'd be surprised what you _can_ do with a lot less expensive toys.

    Having Cray's custom interconnects sure is impressive, but for a lot of problems they're not even needed any more. _That_ is what killed Cray.

    Most RL problems are not really the kind described as "_one_ huge indivisible data set, that you have to process in _one_ huge batch process." They're more like "we have this process with a small data set that we have to run 100,000,000 times." Most design problems or biology problems are really of that kind: run the same thing 100,000,000 times with different parameters.

    And as Seti@Home or Folding@Home proved, a helluva lot of those don't really need _any_ kind of shared memory or fancy interconnects. The real ticket is noting that instead of accelerating the batch run 200 times, you could just split it into 200 smaller batches run on 200 single-CPU machines.

    The super-computer solution costs 2,000,000 just for the machine alone, while the 200 PCs solution costs 200,000 or so. I.e., 10 times cheaper. Better yet, the 200 PCs solution is also far cheaper to program. (Anyone can program a non-threaded batch app.) _And_ for that kind of a problem the 200 PCs solution would actually finish faster, since it has no contention issues whatsoever.
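
    A minimal sketch of that kind of split, assuming the runs really are independent of each other. The function names and the toy simulation are hypothetical, made up for illustration; only the 100,000,000 runs and 200 machines come from the comment above.

```python
# Sketch: split an embarrassingly parallel parameter sweep into independent
# batches, one per machine. Names and the toy workload are illustrative
# assumptions, not taken from any real tool.

def split_batches(n_runs, n_machines):
    """Return a list of (start, stop) index ranges, one per machine."""
    base, extra = divmod(n_runs, n_machines)
    batches = []
    start = 0
    for i in range(n_machines):
        # The first `extra` machines take one additional run each.
        size = base + (1 if i < extra else 0)
        batches.append((start, start + size))
        start += size
    return batches


def run_batch(start, stop, simulate):
    """Run one batch on its own; it needs no data from any other batch."""
    return [simulate(i) for i in range(start, stop)]


def toy_simulation(param):
    # Hypothetical stand-in for one small, independent run.
    return param * param


if __name__ == "__main__":
    batches = split_batches(100_000_000, 200)
    print(len(batches), batches[0], batches[-1])
    # Run only a tiny slice here to keep the demo quick.
    print(run_batch(0, 5, toy_simulation))
```

    Each (start, stop) range can go to a separate PC with no communication until the results are collected, which is why no shared memory or fast interconnect is needed for this class of problem.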

    Again, that's what really killed Cray and the super-computers. They're technologically impressive, they're a geek's wet dream, but... for 99.9% of the problems out there they're just not worth the price any more.

    --
    A polar bear is a cartesian bear after a coordinate transform.
  31. ... Back in my day .... young whippersnapper by ebooher · · Score: 4, Interesting

    So come on, ante up. How many remember being awed at the mere sight of old Crays back in the day? Like the Cray-3? I remember the first time I saw a Cray .... thing was in an anti-static environment. To access it, one had to pass through an airlock and be "decharged" or "depolarized" etc. Basically they somehow charged the air to get rid of static electricity. Then you had this system that was running *in* liquid! Take that "Oh I'm so cool cause I have a l337 haX0r water cooled CPU" overclockers.

    They (Cray) were so proud of this accomplishment that the upper portion of the cabinet was some kind of plexiglass so you could see the fluid as it moved, and moved wiring and what not with it. Very surreal feeling, almost like the thing was breathing.

    And what about the Cray-1? Wasn't that a true testament to 70's *art* and sculpture? The thing looks like some kind of freaky bus station bench with its odd red and white panels and black base. Though, I don't know if they all looked like that, maybe you could get them in other colors?

    Ahh .... those were the days.

    --
    "Genius may shine aloof and alone, like a star, but goodness is social, and it takes two men and God to make a Brother."
    1. Re:... Back in my day .... young whippersnapper by mrdogi · · Score: 2, Informative
      Then you had this system that was running *in* liquid!

      Before that was the Cray-2 (a.k.a. "World's most expensive aquarium")? In case anybody's interested, I believe they used Fluorinert as the liquid, as it wouldn't swell the PC boards, short anything out, or cause anything to corrode.

      A note, the Cray-3 was created by Cray Computer Corporation of Colorado, whereas the Cray-1 was made by Cray Research of Wisconsin. In ~1990, Seymour wanted to start working on computers using gallium arsenide instead of silicon, since they could switch faster. Cray Research didn't want to try anything so revolutionary, so Seymour headed to Colorado with a group of people and started CCC. Unfortunately, they apparently made exactly one Cray-3, then folded.

      Seymour Cray was quite the Übergeek.

  32. Yeah, it's gotta be awful by thegnu · · Score: 2, Funny

    Last time I bought a Cray super-computer, I was kicking myself for weeks about the 2 million dollars I wasted.

    Next time, I'm just gonna build a beowulf cluster out of 200 overclocked AMD Barton 2500s. I shall NOT be suckered again!

    --
    Please stop stalking me, bro.
    1. Re:Yeah, it's gotta be awful by minus9 · · Score: 2, Funny

      I've got a bag full of thinnet coax you can have, complete with T pieces, you might have to find your own terminators though.

  33. hybrid system with multiple kernels by Dink+Paisy · · Score: 3, Informative

    From the documents, it looks like it runs Linux on the management nodes and Catamount on the compute nodes. The idea is you can do what you like with the general purpose nodes, but for the compute nodes, you run a lightweight operating system that has low overhead, minimal services and predictable scheduling. BlueGene/L works the same way; it runs Linux on the management nodes and a custom operating system on the compute nodes. Compute nodes likely provide scheduling for only the number of threads that run on the node, communication through MPI and some proprietary API, and basic debugging facilities. Compute nodes probably lack normal OS services like network, disk, or even a console.

    --

    Whoever corrects a mocker invites insult;
    whoever rebukes a wicked man incurs abuse.
    --Proverbs 9:7
  34. are you sure you remember seeing the Cray 3 ? by bmajik · · Score: 2, Informative

    Because, IIRC, that was the one that they were only building one of, and when the govt cancelled the order, that's when Cray Research went under.

    --
    My opinions are my own, and do not necessarily represent those of my employer.
    1. Re:are you sure you remember seeing the Cray 3 ? by ebooher · · Score: 2, Interesting

      Looked around on the net, as well as a couple other /.'rs here, and someone posted a link here to a 2 and I found a pic of a 2 with the waterfall system that was mentioned by another person, and I must accept defeat within the loosened strands of my unraveling mind.

      It was indeed a Cray-2 that I remember so vividly. Nevertheless, still an extremely exotic machine. Very much the Ferrari F40 or McLaren F1 of super computing. You've seen pics, maybe even seen one at a car show, but you know you'll never be allowed to touch one. It had as much class as an Italian sports car too.

      I find myself wondering how many /. geeks it would take at what $$ amount to colocate a community Cray somewhere ...... be like going in with 100 friends to buy a Ferrari though. Who gets to keep the keys?

      --
      "Genius may shine aloof and alone, like a star, but goodness is social, and it takes two men and God to make a Brother."
  35. Re:The math for a comparable Xserve system by SuperQ · · Score: 2, Informative

    You're leaving out a lot of stuff necessary to make a cluster:

    #1 RAM: $3000 for the G5 cluster node includes 512MB RAM. Most places demand at least 2GB RAM per CPU; we require 3GB RAM per CPU in all new system purchases. This brings the node price (from apple.com) to $6500
    200x $6500 = $1,300,000

    #2 Racks and power: Each rack can hold about 32 machines (without getting way too hot/dense). For 200 nodes, this would be about 7 racks.
    7x $1200 = $8400

    #3 Interconnect: No HPC system is useful without an interconnect. An 80 node myrinet system was $250,000, so at $3125/node you're looking at:
    200x $3000 (estimate) = $600,000

    #4 Networking: you need a network switch and cabling to connect all the nodes... gige is a must these days. Let's say we go cheap with HP ProCurve 2848 Layer2 managed for $3300 each. We need 7 of those, one for each rack cabinet.. with trunking we can get 4gb back to a central switch. Not too bad. Say we add $10/cable for pre-made patch cables (length averaged), that's about $2200 in cables.
    7x $3300 + $2200 = $25,300

    #5 Disk: You quoted a bunch of XserveRaid's without any kind of apple care.. with IDE raid.. I'm not going without some kind of support on it. Oh wait.. 1 file server is NOT enough to handle 200 nodes of HPC.. and apple doesn't have a clustered filesystem. You're going to have to go with Linux/Intel with RedHat GFS for that one (yes, there are other options, but I know GFS)
    Say we do 4 XserveRaid's with applecare:
    4x $16,000 = $64,000
    We also need four dual whatever intel machines: (i'll be nice and include F-C cards in the price)
    4x $3000 = $12,000
    We also need a F-C switch to link all the nodes:
    SanBox 8 port $5200 and 8x SFP modules $750 = $11,200
    I'll pretend like we don't need GFS software support, but most places would want it. (it's another $20,000 or so, but eh.. we want cheap solution)
    Disk total comes to: $87,200

    Price so far: $2,020,900

    And that doesn't even include setup!
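
    For anyone who wants to check the arithmetic, here is a quick tally of the line items quoted above. The dictionary keys are just labels chosen for this sketch, and it uses the $2,200 cable figure from the networking subtotal line.

```python
# Tally of the cluster line items quoted in the comment (US dollars).
# The key names are labels invented for this sketch.
NODES = 200

line_items = {
    "nodes with RAM": NODES * 6500,
    "racks": 7 * 1200,
    "interconnect": NODES * 3000,
    # 7 switches plus the $2,200 cable figure used in the subtotal line.
    "ethernet switches and cables": 7 * 3300 + 2200,
    # 4 RAID units, 4 file servers, F-C switch plus 8 SFP modules.
    "disk": 4 * 16000 + 4 * 3000 + (5200 + 8 * 750),
}


def total_cost(items):
    """Sum all line items."""
    return sum(items.values())


if __name__ == "__main__":
    for name, cost in line_items.items():
        print(f"{name:30s} ${cost:>10,}")
    print(f"{'total':30s} ${total_cost(line_items):>10,}")
```

    The total matches the "price so far" figure, so the cluster lands at roughly the XT-3's $2 million starting price before setup and support.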

  36. Re:software by flaming-opus · · Score: 3, Informative

    This split microkernel architecture has been in use for a long time on big mpp systems like the paragon and the t3e. The software base (catamount/linux) is new, but the design is old.

    catamount is the kernel that runs on the compute nodes. It's a tiny kernel that packages up the OS service requests, and sends them, over the interconnect, to an OS or I/O node, which does the real work of the operating system. catamount is a descendant of PUMA, which came from Cougar. These are heavily derived from work done at caltech. (I believe CMU, and one of the UTexas schools also played a role, but am not sure). The idea is that the microkernel is small and unobtrusive, and it gets the hell out of the way so the application can use the CPU as much as is possible.

    The OS and I/O nodes run linux, and provide services to the compute nodes. This is probably done in the kernel, but it could just as easily be running as a user-space daemon on the OS node. (Though you might have to do some mem-copys that way, which would lower performance)
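
    A toy sketch of that request-forwarding idea, purely for illustration. Nothing here reflects catamount's actual interfaces: the message format, the class, and the function names are all hypothetical, and the "interconnect" is just a function call.

```python
import json

# Hypothetical sketch of request forwarding: the compute-node side only
# packages a request; the service-node side does the actual OS work.
# None of these names come from catamount or any real system.


def package_request(op, **args):
    """Compute-node side: serialize an OS service request into bytes."""
    return json.dumps({"op": op, "args": args}).encode("utf-8")


class ServiceNode:
    """Service-node side: unpack requests and perform them (in memory here)."""

    def __init__(self):
        self.files = {}  # toy stand-in for a real file system

    def handle(self, message):
        request = json.loads(message.decode("utf-8"))
        op, args = request["op"], request["args"]
        if op == "write":
            path, data = args["path"], args["data"]
            self.files[path] = self.files.get(path, "") + data
            return len(data)
        if op == "read":
            return self.files.get(args["path"], "")
        raise ValueError(f"unsupported op: {op}")


def send_over_interconnect(node, message):
    # Stand-in for the real interconnect: here it is just a function call.
    return node.handle(message)


if __name__ == "__main__":
    node = ServiceNode()
    send_over_interconnect(node, package_request("write", path="/out", data="hello"))
    print(send_over_interconnect(node, package_request("read", path="/out")))
```

    The point of the split is that the compute-node side stays tiny: it does no file system work itself, only packaging and forwarding.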

    NOTE: Though these nodes take advantage of some of linux's features (like the lustre file system) they do NOT necessarily implement these features for the system as a whole. They probably provide a minimal set of features necessary for the sorts of problems that the xt3 runs. All the scheduling work that has gone into more recent linux kernels is of little use, as the compute nodes have their own scheduler, probably more closely tied to the batch dispatcher than to the linux kernel. To say that the system runs linux is true, but a little misleading. It's a very different linux than what runs on my desktop, and it's used in a very different way.

  37. Wall Street by Moraelin · · Score: 2, Informative

    You have to understand though that the stock market's expectations have nothing to do with whether the company is doing well or not.

    Surrealistic case in point: at one point 3Com had a lower market value than the Palm daughter-company. Basically if you subtract the value of the Palm shares, the whole rest of 3Com was actually worth a _negative_ value for the stock market.

    And we're talking divisions which were making a tidy profit. Yet they were apparently worth a _negative_ number.

    No, it's not a joke. Roll it around a bit in your head to fully grasp how completely sad and idiotic that is. Real profits, real assets, worth a negative number of dollars. Stupid.

    Or at the other end of the spectrum you have Microsoft whose stock market value is _way_ above the value of its assets. Without paying any dividends or acquiring much in the way of long term assets, people just flocked to drive the price up and make Bill Gates rich. Basically to give their money to Bill Gates and not even get a Windows CD in return.

    The thing is, however, the stock market value has _nothing_ to do with a company's value or profits. The value of a share is only worth as much or as little as people want to believe it is. It is like Monopoly (the board game, not MS;) money: if tomorrow we decide that the blue bills are worth 10% more and the red bills are worth 10% less, who's to argue with that.

    The _only_ reason the stock market on the whole goes up is basically because yearly people dump more money into it. Basically it goes up just because people want to believe it's going up, and put their money where their belief is.

    And the way those values fluctuate, now that just has to do with hype and greed.

    The stocks worth buying are those that'll make you a profit: typically meaning they'll rise in value. The stocks worth selling are those that won't.

    Except with no intrinsic value it becomes a game of guessing what the other lemmings will buy (driving the price up), and what the other lemmings will sell (driving the price down.)

    One thing that makes lemmings buy is the prospect of growth. Hence, hype is good. Hence, yes, shares in a cancerous tumor would sell like hot cakes and rocket sky high in price.

    Hence, conversely, shares in a company which doesn't grow or otherwise cause more lemmings to buy, are not worth holding on to. Because they won't bring a profit. If Microsoft truly plateaued and didn't pay dividends either, regardless of how much profits it made at that point, its shares would plummet. Because between holding onto a share in MS that doesn't bring a profit, and investing in some startup that grows quickly, the second promises more of a ROI.

    Now that's all a bit of an over-simplification.

    Of course, there are other factors. Like just paying dividends to give people a reason to hold onto your shares even without massive hype and growth. (See why MS started doing that when its market explosion slowed down.) Or like fraud: "analysts" just telling lemmings what to buy, and thus driving up the price of the shares owned by the "analyst" and his/her clients. Etc.

    But as a quick intro to the madness of the stock market, it will have to do.

    --
    A polar bear is a cartesian bear after a coordinate transform.
  38. Of interest to Cray-3 info by ebooher · · Score: 2, Interesting

    Cray-3 memories by Steve Gombosi From a comp.unix.cray posting

    Graywolf ("S5") was installed at NCAR. Like all NCAR supercomputers, until fairly recently, it was named after a Colorado locale.

    This was the *only* Cray-3 shipment. Installed in May 1993, the machine was a 4-processor, 128 Megaword system.

    Two problems in the Cray-3 system were uncovered as a result of running NCAR's production climate codes (particularly MM5): a problem with the "D" module causing intermittent problems with parallel codes, and an error in the implementation of the square root approximation algorithm which caused incorrect results for certain data patterns (kinda like the Pentium divide bug ;-) ). These were rectified and replacement CPU modules were installed, although I can't remember the date.

    The machine ran NCAR production until CCC folded in March, 1995. Since NCAR never paid for it, at some point we reduced the CPU count to 2 and let the machine run essentially unattended. I'm not too sure when that happened, although it marked the end of my regular commuting between Colorado Springs and Boulder.

    There were a total of 7 Cray-3 "tanks" constructed. S1-S4 were single "octant" tanks (the smallest that could be constructed) which accommodated up to a 2 processor/128MW configuration. S5 and S6 were two-octant tanks. S7 was a four-octant tank which we used as a software development and benchmarking platform. S6 was chiefly used for system testing.

    S1-S3 were diverted to Cray-4 testing once the Cray-4 project built up steam. S4 was diverted to the quite possibly suicidal Cray-3/SSS project after S7 became available (S4 was previously our software development machine).

    For those of you who have Cray-3 posters lying around (by the way, I took all the photos on that poster as well as the Cray-3 and Cray-4 brochures and all the annual reports except the first two):

    1) The big photo is of S5
    2) Seymour is leaning on S5 (and you have no idea how hard it was to get him to hold still that long while wearing a suit...or to talk him into that particular pose)
    3) The two "cooling system" photos are S6
    4) The hand holding the module is mine ;-)

    Cray-3 modules were 4x4x0.25 inches in size. Each module consisted of a multi-layer "sandwich" of PC boards (69 electrical layers), with 2 layers of 16 1x1 inch stacks. The stacks were the circuit boards containing the actual circuits (GaAs for logic, SRAM for memory modules). There were 16 bare GaAs chips mounted to each side of a logic stack. I think there were 12 bare SRAM chips on each side of a memory stack (the logic chips were square, the memory chips were rectangular).

    --
    "Genius may shine aloof and alone, like a star, but goodness is social, and it takes two men and God to make a Brother."