Slashdot Mirror


SGI & NASA Build World's Fastest Supercomputer

GarethSwan writes "SGI and NASA have just rolled out the new world number one fastest supercomputer. Its performance test (LINPACK) result of 42.7 teraflops easily outclasses the previous mark set by Japan's Earth Simulator of 35.86 teraflops AND that set by IBM's new BlueGene/L experiment of 36.01 teraflops. What's even more awesome is that each of the 20 512-processor systems runs a single Linux image, AND Columbia was installed in only 15 weeks. Imagine having your own 20-machine cluster?"

93 of 417 comments

  1. hmmmm...... by commo1 · · Score: 4, Funny

    Let's see them predict the weather.....

    1. Re:hmmmm...... by Anonymous Coward · · Score: 5, Funny
      Today we predict a high of +3 Funny, with localised Trolling.

      Tomorrow looks like developing a slight rise in Insightful post, but a drop in overall Informative. "First Post" will remain as a constant pattern.

    2. Re:hmmmm...... by OblongPlatypus · · Score: 4, Informative

      You asked for it: "...with Columbia, scientists are discovering they can potentially predict hurricane paths a full five days before the storms reach landfall."

      In other words: RTFA, that's exactly what they're using it for.

      --
      -- If no truths are spoken then no lies can hide --
    3. Re:hmmmm...... by Shag · · Score: 4, Insightful
      "...with Columbia, scientists are discovering they can potentially predict hurricane paths a full five days before the storms reach landfall."

      You don't live somewhere that gets hurricanes, do you? 'Cause scientists can already "potentially predict hurricane paths a full five days before the storms reach landfall." Hell, I can do that. A freakin' Magic 8 Ball can potentially do that.

      Maybe they're trying to say something about doing it with a better degree of accuracy, or being right more of the time, or something like that, but it doesn't sound like it from that quote.

      "Hey, guys, look at this life-sized computer-generated stripper I'm rendering in real-ti... oh, what? Um, tell the reporter we think it'd be good for hurricane prediction."

      --
      Village idiot in some extremely smart villages.
  2. That's nothing... by Anonymous Coward · · Score: 5, Funny

    ...when they hit the "TURBO" button on the front of the boxes they'll really scream.

    1. Re:That's nothing... by jm92956n · · Score: 5, Informative

      when they hit the "TURBO" button on the front of the boxes they'll really scream.

      They did! According to the CNET article, they "quietly submitted another, faster result: 51.9 trillion calculations per second" (equivalent to 51.9 teraflops).

      --
      An effective signature identifies a particular user amongst a base of thousands.
  3. 20 system cluster?!? by Emugamer · · Score: 5, Funny

    I have one of those... in a spare room!

    Who cares about a 20 system cluster, I want a one 512 processor machine!

    or 20, I'm not that picky

  4. Everyone needs one! by Dzimas · · Score: 5, Funny

    Just what I need to model my next H-bom... uhh... umm.... I mean render my next feature film. I call it "Kaboom."

    1. Re:Everyone needs one! by polecat_redux · · Score: 4, Funny

      Just what I need to model my next H-bom... uhh... umm.... I mean render my next feature film. I call it "Kaboom."

      Not to be pedantic, but the correct term is "Freedom Bomb".

    2. Re:Everyone needs one! by fm6 · · Score: 2, Funny

      So the French are claiming they invented H-Bombs now? How like them!

    3. Re:Everyone needs one! by hunterx11 · · Score: 3, Funny

      The French invented the H-Bomb first, but since the H is silent, they thought it was just a regular bomb and forgot about it.

      --
      English is easier said than done.
  5. Wow---- by ZennouRyuu · · Score: 5, Funny

    I bet gentoo wouldn't be such a b**ch to get running with all of that compiling power behind it :)

    1. Re:Wow---- by Anonymous Coward · · Score: 2, Funny

      An asterisk can represent any number of characters, 0..inf (or at least MAX_INT). Therefore, the word is most likely "bch". Or maybe "bean salad crunch".

  6. and thats only 4/5 of the performance! by m00j · · Score: 3, Informative

    According to the article it got 42.7 teraflops using only 16 of the 20 nodes, so the performance is going to be even better.
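
A back-of-the-envelope sketch of that extrapolation, assuming LINPACK throughput scales linearly with node count. That is a simplification: tuning and interconnect effects can move the real figure either way, so treat the result as a rough estimate only. The function name is made up for illustration.

```python
def extrapolate_linear(measured_tflops: float, nodes_used: int, nodes_total: int) -> float:
    """Scale a measured result to the full machine, assuming perfect linear scaling."""
    return measured_tflops * nodes_total / nodes_used

# 42.7 teraflops on 16 of the 20 nodes, as reported in the article
estimate = extrapolate_linear(42.7, 16, 20)
print(f"{estimate:.1f} teraflops")
```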

  7. And after further cooperation with Redmond... by ferrellcat · · Score: 4, Funny

    ...they were *almost* able to get Longhorn to boot.

    1. Re:And after further cooperation with Redmond... by kzinti · · Score: 2, Funny

      Q: What kind of machine does Longhorn run best on?

      A: A slide projector.

      (Old joke. cat nt-joke-1990.txt | sed -e 's/Windows NT/Longhorn/g')

  8. its not the hardware thats important by fender_rock · · Score: 5, Funny

    If the same software is used, it's not going to make weather predictions more accurate. It's just going to give them the wrong answer, faster.

    1. Re:its not the hardware thats important by khayman80 · · Score: 3, Interesting
      Well, maybe what makes the weather models inaccurate is the grid size of the simulations. If you try to model a physical system with a finite-element type of approach and set the grid size so large that it glosses over important dynamical processes, it won't be accurate.

      But if you can decrease the grid size by throwing more teraflops at the problem, maybe we'll find that our models are accurate after all?
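
To get a feel for how quickly a finer grid gets expensive, here is a rough sketch. It assumes a 3D grid and an explicit time-stepping scheme whose time step must shrink in proportion to the grid spacing (a CFL-type stability limit). Both are assumptions for illustration, not details from the article, and real weather codes differ.

```python
def relative_cost(refinement: float, spatial_dims: int = 3) -> float:
    """Work multiplier when grid spacing is divided by `refinement`.

    Cell count grows as refinement**spatial_dims, and the number of time
    steps grows by another factor of `refinement` (assumed CFL-type limit).
    """
    return refinement ** (spatial_dims + 1)

for r in (2, 4, 10):
    print(f"grid spacing / {r}: about {relative_cost(r):,.0f}x the work")
```

Under these assumptions, merely halving the grid spacing costs sixteen times the work, which is why extra teraflops buy resolution more slowly than one might hope.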

    2. Re:its not the hardware thats important by chriguhose · · Score: 3, Interesting

      I'm not an expert on this, but your statement is, in my opinion, not completely true. Weather forecasting is a little bit like playing chess. One has a lot of different paths to take to find the best solution. Increased computing power allows for "deeper" searches and increases accuracy. My guess is that more accuracy requires exponentially more computing power. Comparing Earth Simulator to Columbia makes me wonder how much accuracy has increased in this particular case.

    3. Re:its not the hardware thats important by ozmanjusri · · Score: 4, Funny

      Well, if the weather's too chaotic to predict, even with this much computing horsepower, maybe it would be simpler to go to Brazil and just swat the damn butterfly with the owner's manual.

      --
      "I've got more toys than Teruhisa Kitahara."
  9. In other news... by thedogcow · · Score: 2, Funny

    SGI & NASA now have developed a computer that will be able to run Longhorn.

    --
    Yes! I listen to NYC Speedcore and do math at 3AM. I suggest you try it too.
  10. Photos of System by erick99 · · Score: 5, Informative

    This page contains images of the NASA Altix system. After reading the article I was curious as to how much room 10K or so processors take up.

    --
    http://www.busyweather.com/
    1. Re:Photos of System by RadioheadKid · · Score: 5, Funny

      You'd think with all that super-computing power they'd be able to figure out the zipping JPEGs is retarted.

      --
      "Karma can only be portioned out by the cosmos." -Homer Simpson
    2. Re:Photos of System by Jeffrey+Baker · · Score: 2, Informative
    3. Re:Photos of System by cnkeller · · Score: 5, Interesting
      After reading the article I was curious as to how much room 10K or so processors take up.

      I don't have a square footage number, but it's the overwhelming majority of the server floor. We had to "clear the floor" earlier this summer to make room.

      --

      there are no stupid questions, but there are a lot of inquisitive idiots

    4. Re:Photos of System by raodin · · Score: 2, Funny

      Wow, I'm glad I'm not the one installing 10240 heatsinks..

    5. Re:Photos of System by peterpi · · Score: 2, Interesting

      On this picture you can see what I'm sure is an 'Intel Inside' sticker on the bottom of some of the cabinets.

  11. Interesting Facts by OverlordQ · · Score: 4, Informative

    1) This was fully deployed in only 15 weeks.
    (Link)

    2) This number was using only 16 of the 20 systems, so a full benchmark should be larger too.
    (link)

    3) The storage attached holds 44 LoC's (link)

    --
    Your hair look like poop, Bob! - Wanker.
  12. Imagine a... by Anonymous Coward · · Score: 2, Funny

    ...single node of these...

    oh wait, sorry, Cray deja-vu :-)

  13. Re:Intent of NASA... by SenatorTreason · · Score: 3, Funny

    Seti@Home. They'll be in the Top 10 in no time!

  14. Here's the current list... by daveschroeder · · Score: 4, Funny

    Prof. Jack Dongarra of UTK is the keeper of the official list in the interim between the twice-yearly Top 500 lists:

    http://www.netlib.org/benchmark/performance.pdf See page 54.

    And here's the current top 20 as of 10/26/04...

    1. Re:Here's the current list... by Jeremy+Erwin · · Score: 2, Informative

      It may prove enlightening to check that paper for updates; as the November 8 deadline approaches, particularly competitive teams may submit new scores as they jockey for position.

      Slashdot may have announced the news at 10:45, but as this particularly silly post of mine demonstrates, I had the news six and a half hours early, from Dongarra's paper.

  15. Ways you are wrong by RealProgrammer · · Score: 3, Informative

    Computer superclusters don't even have O-rings.

    They don't carry schoolteachers.

    They don't fly in the air.

    This runs Linux, not Windows. It won't crash.

    --
    sigs, as if you care.
    1. Re:Ways you are wrong by WormholeFiend · · Score: 2, Funny

      Computer superclusters don't even have O-rings

      so it's not water-cooled?

      [didn't RTFA]

  16. NASA.org? by lnoble · · Score: 5, Funny

    Wow, I didn't know the NewAdvancedSearchAgent had such an interest or budget for super computing. I'd think they'd be able to afford their own web server though instead of being parked at domainspa.com and having to fill their entire page with advertisements.

    Try NASA.GOV.

  17. What is the stumbling block? by Dancin_Santa · · Score: 5, Insightful

    Why does it take so long to build a super computer and why do they seem to be redesigned each time a new one is desired?

    It's a little like how Canada's and France's nuclear power plant systems are built around standardized power stations, cookie cutter if you will. The cost to reproduce a power plant is negligible compared to the initial design and implementation, so the reuse of designs makes the whole system really cheap. The drawback is that it stagnates the technology and the newest plants may not get the newest and best technology. Contrast this with the American system of designing each power plant with the latest and greatest technology. You get really great plants each time, of course, but the cost is astronomical and uneconomical.

    So too, it seems, with supercomputers. We never hear about how these things are thrown into mass production, only about how the latest one gets 10 more teraflops than the last and all the slashbots wonder how well Doom 3 runs on it or whether Longhorn will run at all in such an underpowered machine.

    But each design of a supercomputer is a massive success of engineering skill. How much cheaper would it become if, instead of redesigning the machines each time someone wants to feel more manly than the current speed champion, the current design were rebuilt for a generation (in computer years)?

    1. Re:What is the stumbling block? by kst · · Score: 2, Interesting

      Why does it take so long to build a super computer ...
      It doesn't.

    2. Re:What is the stumbling block? by anon+mouse-cow-aard · · Score: 4, Insightful

      Thought experiment: Order 10000 PCs. Time how long it takes to get them installed, with power, network cabling, and cooling, in racks, and installed with the same OS.

      Second thought experiment. Imagine the systems are built out of modular bricks that are identical to deskside servers, so that they can sell exactly the same hardware in anywhere from 2 to 512 processors by just plugging the same standard bricks together, and they all get the same shared memory, and run one OS. Rack after rack after rack. That is SGI's architecture. It is absolutely gorgeous.

      So they install twenty of the biggest boxes they have, and network those together.

      Bang for the buck? I dunno. Is shared memory really a good idea? Probably not. But it is absolutely gorgeous, and no one can touch them in that shared-memory niche that they have.

    3. Re:What is the stumbling block? by Geoff-with-a-G · · Score: 4, Insightful

      Why does it take so long to build a super computer and why do they seem to be redesigned each time a new one is desired?

      Well, are we talking about actual supercomputers, not just clusters? 'Cause if you're just trying to break these Teraflops records, you can just cram a ton of existing computers together into a cluster, and voila! lots of operations per second.

      But it's rare that someone foots the bill for all those machines just to break a record. Los Alamos, IBM, NASA, etc. want the computer to do serious work when it's done, and a real supercomputer will beat the crap out of a commodity cluster at most of that real work. Which is why they spend so much time designing new ones. Because supercomputers aren't just regular computers with more power. With an Intel/AMD/PowerPC CPU, jamming four of them together doesn't do four times as much work, because there's overhead and latency involved in dividing up the work and exchanging the data between the CPUs. That's where the supercomputers shine: in the coordination and communication between the multiple procs.

      So the reason so much time and effort goes into designing new supercomputers is that if you need something twice as powerful as today's supercomputer, you can't just take two and put them together. You have to make a new architecture that is even better at handling vast numbers of procs first.
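
One standard way to put numbers on the "four CPUs don't do four times the work" point is Amdahl's law. It is a swapped-in textbook model, not something the poster cited, and it only accounts for the part of a job that cannot be parallelised. It ignores communication latency, so it is the optimistic case. The 5% serial fraction below is an illustrative assumption.

```python
def amdahl_speedup(n_procs: int, serial_fraction: float) -> float:
    """Amdahl's law: speedup = 1 / (s + (1 - s) / N)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

serial_fraction = 0.05  # assumed for illustration
for n in (4, 512, 10240):
    print(f"{n:>6} procs: {amdahl_speedup(n, serial_fraction):.1f}x")
```

With a fixed serial fraction the speedup can never exceed 1 / serial_fraction, no matter how many processors are added.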

    4. Re:What is the stumbling block? by sloth+jr · · Score: 2, Insightful

      I don't build supercomputers, but I do build systems that look a lot like them in very similar infrastructures. I'm not sure why it took them 120 days (okay, "under" 120 days), but when we build out a datacenter with 70 to 100 machines, it usually takes a bit of time:
      a) obtain space. Usually, raised floors, rack systems, with adequate HVAC for the huge thermal load you're about to throw into a few racks. For collocation, it'll take some time for your provider to wire together a cage for your installation, especially if you need earthquake bracing. Expect two weeks.

      b) obtain power. For our production environment, each redundant power supply needs to be served by a separate circuit. The way most redundant power supplies seem to work is they split the load between the two circuits - so each circuit has to be able to handle the full load. Supercomputers may not have the same production requirements, but probably - lost cycles is lost money. Anyway, this is contracted out in almost all cases - expect two weeks minimum. b is usually dependent on a, some providers may perform buildout concurrently. Not much of an issue if you use Equinix - very cool overhead power systems (imagine a very large power track system, with drops wherever you need them).

      c) obtain equipment. delivery time from a week to 5 weeks.

      d) it takes some time to unbox 100 machines and rack them. Throw people at it, or throw time at it.

      e) network infrastructure. do it yourself, you're using a lot of time to cut cables to length. contract it out, you get very neat work, at expense, and usually only to rack-specific patch panels. Buy lots of different length cables and forego contracting, you save time, but you end up with a cage that looks like hell that's easy to snag.

      f) configure 100 machines. This is probably the easiest part - set up your DHCP server and PXE boot server, roll up some kickstart system, and deploy - 100 machines can be done in a few hours. There's obviously some setup and thought that needs to be put into the installation scripts, but that can be done ahead of time.

      In my experience, buildout of production datacenters is very difficult to do in less than 6 weeks.

      sloth jr

    5. Re:What is the stumbling block? by IncandescentFlame · · Score: 2, Informative

      Well, are we talking about actual supercomputers, not just clusters? 'Cause if you're just trying to break these Teraflops records, you can just cram a ton of existing computers together into a cluster, and voila! lots of operations per second.

      Actually, this method won't work for the benchmark that is used for the top 500 list, LINPACK. The difficulty is that to solve most problems in parallel, the processors need to talk to each other. This introduces an overhead into the program, and the amount of overhead depends on the interconnect. Programs which can be parallelised without a communication overhead are called trivially parallel. LINPACK is not trivially parallel, so if you took a whole lot of computers and banged them together over Ethernet, all you'd end up with is an expensive way to keep your network busy.

      The beauty of the Altix systems is that the NUMA (Non-Uniform Memory Access) interconnect is really fast (speaking as someone who gets to run on them).

  18. will soon be surpassed... by Doppler00 · · Score: 4, Informative
    by a computer currently being set up at Lawrence Livermore National Lab: 360 teraflops

    The amazing thing about it is that it's built at a fraction of the cost/space/size of the Earth Simulator. If I remember correctly, they already have some of the systems in place for 36 teraflops. It's the same Blue Gene/L technology from IBM, just at a larger scale.

  19. One is a parity bit... by NotQuiteReal · · Score: 3, Funny
    ... um never mind.

    RAEM (redundant array of expensive machines) just doesn't ring right - too close to REAM.

    --
    This issue is a bit more complicated than you think.
  20. Re:NEC's seems to be faster by toby · · Score: 3, Informative

    NEC's is announced, this one is installed.

    --
    you had me at #!
  21. Re:Ok, what is the point of this? by dagur · · Score: 4, Funny

    Yes what is the point? We all know the resulting answer is going to be 42.

  22. This time there really is a turbo button! by Dink+Paisy · · Score: 5, Informative
    This result was from the partially completed cluster, at the beginning of October. At that time only 16 of the 20 machines were online. When the result is taken again with all 20 of the machines there will be a sizeable increase in that lead.

    There's also a dark horse in the supercomputer race; a cluster of low-end IBM servers using PPC970 chips that is in between the BlueGene/L prototype and the Earth Simulator. That pushes the last Alpha machine off the top 5 list, and gives Itanium and PowerPC each two spots in the top 5. It's amazing to see the Earth Simulator's dominance broken so thoroughly. After so long on top, in one list it goes from first to fourth, and it will drop at least two more spots in 2005.

    --

    Whoever corrects a mocker invites insult;
    whoever rebukes a wicked man incurs abuse.
    --Proverbs 9:7
  23. Cost by MrMartini · · Score: 5, Interesting

    Does anyone know how much this system cost? It would be interesting to see how good of a teraflop per million dollar ratio they achieved.

    For example, I know the Virginia Tech cluster (1,100 Apple Xserve G5 dual 2.3 GHz boxes) cost just under $6 million and runs at a bit over 12 teraflops, so it gets a bit over 2 teraflops per million dollars.

    Other high-ranking clusters would be interesting to evaluate in terms of teraflops per million dollars, if anyone knows any.

    1. Re:Cost by MrMartini · · Score: 4, Informative

      Since no one else has answered my question, I'll post the results of searching on my own:

      http://news.com.com/Space+agency+taps+SGI,+Intel+for+supercomputer/2100-1010_3-5286156.html

      The cost is quoted in the article at $45 million over a three year period, which indicates that the "Columbia" super cluster gets a bit more than 1 teraflop per million dollars. That seems impressive to me, considering the overall performance.

      It would be interesting to see how well the Xserve-based architecture held its performance per dollar when scaled up to higher teraflop levels...
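
The ratios quoted in this thread can be checked with a few lines. This is only a sketch using the rounded figures posted here (about 12 teraflops for about $6 million at Virginia Tech, and $45 million over three years for Columbia against the 42.7 and 51.9 teraflop runs), so the outputs are approximate. The function and dictionary names are made up for illustration.

```python
def tflops_per_million(tflops: float, cost_millions: float) -> float:
    """Teraflops delivered per million dollars spent."""
    return tflops / cost_millions

# Rounded figures as quoted in the thread, not official pricing.
systems = {
    "Virginia Tech (about 12 TF, about $6M)": (12.0, 6.0),
    "Columbia, 42.7 TF run ($45M)": (42.7, 45.0),
    "Columbia, 51.9 TF run ($45M)": (51.9, 45.0),
}
for name, (tflops, cost) in systems.items():
    print(f"{name}: {tflops_per_million(tflops, cost):.2f} TF per $1M")
```

Note that the "a bit more than 1 teraflop per million dollars" figure only holds for the faster 51.9 teraflop run; the 42.7 teraflop result comes out just under 1.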

    2. Re:Cost by Junta · · Score: 2, Informative

      Actually, a lot of the top500 supercomputers achieve or beat $1 million/TFLOPS. Even if the price points weren't that good on the component parts, marketing departments are inclined to give huge discounts for the press coverage. You can bet SGI and Intel both gave exorbitant discounts here; SGI's market presence has been dwindling, and overall the Itanium line has been a commercial failure. Being #1 on the top 500 for 6 months (the length between list compilations, and BlueGene isn't even close to finished, the NEC supercomputer is likely to make the list after next, etc etc), is very good marketing.

      Of course, if BlueGene, Big Mac, and this supercomputer demonstrate one thing, it is that focusing on the processors exclusively is ridiculous. It is the processing element interconnect that really makes the difference in parallel computing. BlueGene has 16k 'pathetic' processors (700 MHz PPC) with a focus on a really potent interconnect network to be able to scale to 65k processors with very good scaling factors.
      Big Mac leverages InfiniBand, a low-latency, expensive, high-bandwidth network, to get where it is.
      And this, only 20 nodes, each with 512 processors within a box. I don't know what the box's interconnect strategy is, but you can bet the design is much better than Myrinet and InfiniBand, technologies that communicate via the PCI bus, that are not hardset in terms of processing element count, have longer cable lengths, etc.

      Look at the top500, processors are important, but the network technology is what truly makes or breaks the clusters in that realm with such high node counts.

      --
      XML is like violence. If it doesn't solve the problem, use more.
  24. Not fully true by ValiantSoul · · Score: 2, Informative

    They were only using 16 of those 20 servers. With all 20 they were able to peak at 61 teraflops. Check the article at CNET.

  25. I think you meant to say... by spineboy · · Score: 2, Funny

    A Beowulf cluster of Beowulf clusters....

    ARRRGGGHHHH PEOPLE'S HEADS ARE EXPLODING!!!

    Now you know that there's some engineer with access to this thing thinking how he can jump to the front of SETI@HOME.

    --
    ..........FULL STOP.
  26. Ya know... by Al+Al+Cool+J · · Score: 5, Funny
    It's getting to the point where I'm going to have to call shenanigans on the whole freakin' planet. Am I really supposed to believe that an OS started by a Finnish university student a decade ago and designed to run on a 386, is now running the most powerful computer ever built? I mean, come on!

    Seriously, am I on candid camera?

  27. My proposed use of this super computer.... by chicagozer · · Score: 4, Funny

    Emulating a Centris 650 running Mac OS X at 2.5 Ghz.

    --
    ZZ
  28. 70.93 TeraFLOPs by chessnotation · · Score: 5, Interesting

    Seti@home is currently reporting 70.93 TeraFLOPs/sec. It would be Number One if the list were a bit more inclusive.

    1. Re:70.93 TeraFLOPs by Anonymous Coward · · Score: 2, Informative

      ... also, Folding@home, which I just checked, is running at 196.463 TFLOPS, thankfully proving that the general population would rather solve real problems than fucking pretend they're Uhura.

  29. Read on to the next paragraph by jd · · Score: 5, Interesting
    There it talks of a third run, at 61 teraflops, slightly over the 60 teraflops predicted.


    Ok, so we have Linux doing tens of teraflops in processing, FreeBSD doing tens of petabits in networking, ... What other records can Open Source smash wide open?

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    1. Re:Read on to the next paragraph by Troll-a-holic · · Score: 3, Informative

      From the article -

      NASA Secures Approval in 30 Days
      To accelerate NASA's primary science missions in a timely manner, high-end computing experts from NASA centers around the country collaborated to build a business case that Brooks and his team could present to NASA headquarters, the U.S. Congress, the Office of Management and Budget, and the White House. "We completed the process end to end in only 30 days," Brooks said.


      Wow. That's incredibly fast, IMHO.

      As the article mentions, I suppose NASA owes this to the success of their 512-processor Kalpana system, named in honor of the late astronaut Kalpana Chawla.

      And look at this --

      "In some cases, a new Altix system was in production in as little as 48 hours," said Jim Taft, task lead, Terascale Applications Group, NASA. "This is starkly different from implementations of systems not based on the SGI architecture, which can take many months to bring to a reliable state and ready for science."

      w00t! That's like super-fast in terms of development time. Good job, NASA. Way to go.

      And what about the other companies mentioned in the article?

      In addition to Intel Itanium 2 processors, the Columbia installation features storage technology from Brocade Communications and Engenio Information Technologies, Inc., memory technology from Dataram Corporation and Micron Technology, Inc. and interconnect technology from Voltaire.

      I've not heard of any of them other than Voltaire - are they well known in this area, or are they defense/NASA contractors of some kind?

    2. Re:Read on to the next paragraph by luvirini · · Score: 3, Insightful

      "NASA Secures Approval in 30 Days" Knowing how governmental processes normally go, this part really seems incredible. Normally even the "fluffy" pre-study would take that long (or way more) before anyone actually sits down to discuss actual details and such. Especially the way most everything with NASA seems to be over budget and way late. It is indeed good to see that there is still some hope, so let's hope they get the procurement processes in general working better.

    3. Re:Read on to the next paragraph by Mulletproof · · Score: 2, Interesting

      "What other records can Open Source smash wide open?"

      Mmmm, home consumer usage, maybe?? HA! What was I thinking!?

      --
      You need a FREE iPod Nano
    4. Re:Read on to the next paragraph by luvirini · · Score: 2, Insightful
      Indeed it is the hardware, and no you cannot directly claim it is an open source victory except for one small thing...

      Wonder why they run open source instead of a proprietary operating system on this? Maybe the multitude of answers to that question can show you why it can be considered an open source victory.

    5. Re:Read on to the next paragraph by RageEX · · Score: 5, Insightful

      Good job NASA? Yeah I'd agree. But what about good job SGI? Why does SGI always seem to have bad marketing and not get the press/praise they deserve?

      This is an SGI system. SGI has laid out plans for terascale computing (stupid marketing speak for huge ccNUMA systems) a while ago. I'm sure NASA and SGI worked together but this is essentially an 'Extreme' version of an off-the-shelf SGI system.

    6. Re:Read on to the next paragraph by jd · · Score: 5, Informative
      Hardware only takes you so far. Scalability comes largely from the efficiency of the software. Poor software results in large amounts of communication between nodes, slowing down a cluster.


      This is why SMP computers tend to have 2 or 4 processors, and 8 at a pinch, but no more. It's just not practical, using current methods, to directly wire up more than 8 processors in such a tight package.


      Let's say you have N processors, each capable of executing I instructions per second. Your total theoretical throughput would be N x I. However, this would only be the case if the system is 100% parallel, and no processor needed to communicate with any other. Rarely the case.


      In practice, the function of performance to processors follows a distribution that looks a bit like a squished bell curve. As you increase the number of processors, the performance gain decreases, reaches zero, and actually becomes negative. At that point, adding more CPUs will actually SLOW the computer down.
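
A toy model can reproduce the rise-then-fall shape described above. It assumes run time is the compute work divided evenly across N processors plus a coordination cost that grows linearly with N. Both the linear form and the overhead constant are assumptions for illustration, not measurements of any real machine.

```python
def speedup(n_procs: int, overhead: float) -> float:
    """Speedup over one processor for a toy model.

    Run time (normalised so the compute work on one processor takes 1.0)
    is assumed to be 1/N of compute plus overhead*N of coordination cost.
    """
    return 1.0 / (1.0 / n_procs + overhead * n_procs)

overhead = 1e-4  # assumed per-processor coordination cost
best_n = max(range(1, 1001), key=lambda n: speedup(n, overhead))
print(f"peak at {best_n} processors, speedup {speedup(best_n, overhead):.1f}x")
print(f"at 1000 processors: {speedup(1000, overhead):.1f}x")
```

In this model the peak sits at N = 1 / sqrt(overhead); past that point every extra processor costs more in coordination than it contributes in compute.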


      The exact shape and size of the curve is partly a function of the way the components are laid out. A good layout keeps the amount of traffic on any given line to a minimum, minimizes the distances between nodes, and minimizes the management and routing overheads.


      However, layout isn't everything. If your software can't take advantage of the hardware and the topology, then all the layout in the world won't gain you a thing. To take advantage of the topology, though, the software has to comprehend some very complex networking issues. It has to send data by efficient pathways.


      If connections are not all the same speed or latency, then the most efficient pathway may NOT be the shortest. This means that the software must understand the characteristics of each path and how to best utilize those paths, by appropriate load-balancing and traffic control techniques.
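
A small sketch of that point: the same routing code gives different answers depending on whether it counts hops or sums link latencies. Dijkstra's algorithm is used here as a familiar stand-in, not as a claim about how any real interconnect routes traffic, and the four-node topology and latency numbers are hypothetical.

```python
import heapq

def cheapest_path(graph, start, goal):
    """Dijkstra's algorithm; returns (total_cost, path) or (inf, []) if unreachable."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []

# Hypothetical link latencies in microseconds: the direct A-D link is slow.
latency = {
    "A": {"D": 10.0, "B": 1.0},
    "B": {"C": 1.0},
    "C": {"D": 1.0},
    "D": {},
}
# Same topology with every link weighted 1, so cost equals hop count.
hops = {node: {nbr: 1.0 for nbr in links} for node, links in latency.items()}

print(cheapest_path(hops, "A", "D"))     # fewest hops
print(cheapest_path(latency, "A", "D"))  # lowest total latency
```

Counting hops picks the direct but slow A-D link, while weighting by latency routes through B and C.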


      If you look at extreme-end networking hardware, they can be crudely split into two camps - those where the bandwidth is phenomenal, at the expense of latency, and those where the latency is practically zero but so's the bandwidth.


      The "ideal" supercomputer is going to mix these two extremes. Some data you just need to get to point B fast, and sometimes you're less worried about speed, but do need to transfer an awful lot of information. This means you're going to have two physical networks in the computer, to handle the two different cases. And that means you need something capable of telling which case is which fast enough to matter.
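
The "telling which case is which" step can be sketched with the usual latency-plus-size-over-bandwidth cost model for a single message. The two networks and their numbers below are hypothetical, chosen only to show the trade-off, and the function names are made up for illustration.

```python
def transfer_time(message_bytes: float, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Simple latency + size/bandwidth cost model for one message."""
    return latency_s + message_bytes / bandwidth_bytes_per_s

# Hypothetical networks, numbers chosen only to illustrate the trade-off.
NETWORKS = {
    "low-latency": (1e-6, 100e6),      # 1 microsecond, 100 MB/s
    "high-bandwidth": (50e-6, 10e9),   # 50 microseconds, 10 GB/s
}

def pick_network(message_bytes: float) -> str:
    """Choose whichever network delivers this message sooner."""
    return min(NETWORKS, key=lambda name: transfer_time(message_bytes, *NETWORKS[name]))

print(pick_network(64))            # a small control message
print(pick_network(100 * 2**20))   # a 100 MiB bulk transfer
```

Small messages are dominated by the fixed latency term and large ones by the bandwidth term, so a cheap size check is enough to route each message in this simplified picture.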


      Even when only one type of network is used, latency is a real killer. Software, being the slowest component in the machine, is where most of the latency is likely to accumulate. Nobody in their right minds is going to build a multi-billion dollar machine with superbly optimized hardware, if the software adds so much latency to the system they might as well be using a 386SX with Windows 3.1


      And that means Linux has damn good traffic control and very very impressive latencies. And it looks like these are areas the kernel is going to be improving in still further...

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    7. Re:Read on to the next paragraph by AndyChrist · · Score: 3, Informative

      The reason SGI isn't getting the kind of credit it should is probably because of how they resisted linux and clustering for so long (before apparently caving and deciding to go the direction the wind was blowing and put their expertise to doing the fashionable thing BETTER.)

      Slashdot carries grudges.

    8. Re:Read on to the next paragraph by pjbass · · Score: 2

      What about good job Intel? I see nowhere in this entire set of postings a nod to Intel? Sure, Intel has had the suck of suck lately in PR, but the brains behind this whole SGI monster are IA64 Itanium 2 processors. I certainly concede the SGI interconnects for the cluster are absolutely awesome, but as others have pointed out, if your cluster has killer software with crappy hardware, or killer hardware and crappy software, then your cluster sucks. 2+2=4 here.

    9. Re:Read on to the next paragraph by RageEX · · Score: 3, Interesting

      Yes, there's some truth to that. One thing SGI has been guilty of is bad management and wishy-washiness. But it should be pointed out that SGI has been a supporter of OSS for a very, very long time and has been an important contributor not only to the Linux kernel but has also open sourced a lot of their own software. Heck, they gave the world XFS for free!

    10. Re:Read on to the next paragraph by CommieOverlord · · Score: 2, Informative

      how they resisted ... clustering

      Their new machines still aren't clustered. Clusters don't generally run single system images on shared memory computers. SGI's Altix systems use a NUMA link to enable them to efficiently access memory on remote computers, making them a kind of distributed shared memory machine. And SGI's Origin systems are your traditional SMP machine. The Altix or Origin systems are neither cheap, nor off the shelf.

      Regarding your comment about them ignoring Linux, what was fundamentally wrong with that? Irix was a very capable OS, why should they have just dumped it?

    11. Re:Read on to the next paragraph by CommieOverlord · · Score: 2, Informative

      This is why SMP computers tend to have 2 or 4 processors, and 8 at a pinch, but no more

      Umm, not true. Sun can hold up to 106 processors in its Sunfire 15K product, or 72 dual-core processors in the E25K.

      SGI's Origin systems are equally large I believe. And manufacturers like IBM also have large SMP machines.

      Being able to efficiently use that many processors is a completely different matter that depends on the nature of the problem. It is possible to efficiently use more than 8 processors, though. I've heard of programs that scaled almost linearly up to at least 40.

    12. Re:Read on to the next paragraph by lweinmunson · · Score: 2, Insightful

      Umm, not true. Sun can hold up to 106 processors in its Sunfire 15K product, or 72 dual-core processors in the E25K.

      SGI's Origin systems are equally large I believe. And manufacturers like IBM also have large SMP machines.


      There's a difference between SMP and the NUMA used in the big iron. SMP is normally a shared bus or switch topology with the processors connected to each other with little or no arbitration logic. So if you get above 4 you normally max out the busses as the CPUs try to figure out who's doing what and what instruction comes next. NUMA architecture is somewhere between SMP and clustering. The SGI boxes use c-bricks of 4 CPUs and I think 8GB of RAM. Each c-brick is connected to one or more routers via craylink cables. Get enough of these together and you've got your 512 CPU monster. Sun uses the same idea, but is unfortunately a LOT slower with their interconnect technology. I've seen 16x SMP boxes before, but they really didn't scale at all. Anything over the standard 4-8 SMP is a waste of CPUs and money.

    13. Re:Read on to the next paragraph by flaming-opus · · Score: 2, Informative

      Commodity Linux clusters are not the only kind of cluster out there. SGI has been building clusters since the late 80s. Their first super-computer product, the power-challenge clusters, were 16 and 36 way SMP boxes clustered together with hippi. Remember Terminator 2 and Jurassic Park? Those were rendered on clusters of crimsons and indigo workstations. They may have called them NOW (network of workstations) instead of beowulf, but it was the same thing.

      As for linux, they stepped towards linux about the same time IBM, HP, and Oracle did. They've contributed a LOT of code to linux and GPL products. They have transitioned the bulk of their product-line to linux in the last year or so, but they started that process five years ago. They have a LOT of legacy customers and legacy code to transition. Linux is a stable and high performance OS, and it would be that way without SGI, but it got there a lot faster because of SGI's efforts.

      Furthermore, SGI doesn't give a damn (nor does anyone else) if slashdot loves them or not. They care if nasa, boeing, the US navy, BP, and NBC love them. These are the people with the bucks, more interested in a solution to a problem than to any license or technology.

      The real reason that SGI doesn't get the credit they should is much simpler: They put a crappy scsi controller on the mezzanine bus of the challenge-S server in 1994. In the early 90s SGI was the darling of the multi-media world. Their workstations were everywhere, and they made pretty cool servers too. They were poised to ride the same dot-com wave that SUN rode. They introduced a single-CPU server called the challenge-S, which was derived from the indy workstation. It was reasonably speedy and quite affordable (for a unix server of the time). The scsi controller, however, was quite prone to failure. They developed a bad reputation. While the world was busy buying sun servers hand-over-fist, they avoided SGIs except in the technical/defense/media markets. That legacy shaped the company into what it is today: a niche player, struggling against giants like IBM and HP, in the relatively small market for high performance computers.

  30. Re:20? Try 10420,no 2560, make it 20 after all. by anon+mouse-cow-aard · · Score: 3, Interesting

    Uhm... well, 2560 motherboards, 'cause they're quad-CPU: the SGI C-bricks used in Altix were built to house 4 IA64 CPUs per brick. OTOH... no... it really is 20 machines with 512 processors each, because the memory is globally shared (all processors have access to all the memory, albeit at different latency and performance: NUMA, Non-Uniform Memory Access), and a single Linux kernel is running on the whole thing.
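
    For anyone checking the counts, the arithmetic works out as in the short sketch below. It uses only the figures from the story and from this post; the variable names are just labels picked for the example.

```python
# Counts taken from the story summary and the post above.
SYSTEMS = 20              # single-system-image machines
CPUS_PER_SYSTEM = 512     # processors sharing memory within each machine
CPUS_PER_BRICK = 4        # CPUs per C-brick, as stated in the post

total_cpus = SYSTEMS * CPUS_PER_SYSTEM          # processors overall
total_bricks = total_cpus // CPUS_PER_BRICK     # quad-CPU bricks overall
print(total_cpus, total_bricks)
```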

  31. Re:Ok, what is the point of this? by servognome · · Score: 4, Insightful

    Really, given the fact that most popular computers have enough processing power to handle anything, and the fact that clustering technology has evolved and is usable in case they aren't...what is the point in the "super computer"?
    The super computer is a cluster (10k+ processors in 20 nodes).
    Not all applications/computations scale by just adding computers to the cluster.
    An example would be solving for z: x=84+19, y=5*3, z=x+y
    The ultimate solution z is limited by the speed x & y can be solved. You can have an individual computer solve for x and another for y in parallel. But no matter how many more computers you add, none of them can solve z until x&y are solved first, and none of them would speed up the computation of x&y.
    After a certain scale, you do not get the benefits of parallel processing, so the only way to speed things up is to make each individual computer faster.
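
    The x, y, z example can be written out as a minimal sketch using Python's standard `concurrent.futures` module. The function names are invented for the illustration, and real workloads would of course be far heavier than two arithmetic expressions.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_x() -> int:
    return 84 + 19


def solve_y() -> int:
    return 5 * 3


def solve_z(workers: int) -> int:
    # x and y do not depend on each other, so they can run at the same time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        future_x = pool.submit(solve_x)
        future_y = pool.submit(solve_y)
        # z needs both results, so this step waits no matter how many
        # workers are available.
        return future_x.result() + future_y.result()


if __name__ == "__main__":
    print(solve_z(workers=2), solve_z(workers=64))
```

    Going from 2 workers to 64 changes nothing here: there are only two independent pieces of work, and the final addition cannot start until both are done.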

    --
    D6 63 0D 70 89 81 BB 8E 7B 7C 5F 5D 54 EA AB 73
  32. Re:windows by jd · · Score: 3, Funny

    They tried, but they ran out of blue.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  33. But... by PrimeWaveZ · · Score: 2, Funny

    Does it run CherryOS?

  34. Imagine? by macz · · Score: 2, Insightful
    Imagine having your own cluster...
    I seriously doubt that anyone but the very edge of the bell curve could usefully use this much CPU horsepower, even given the upper limits of academia. While we, as a species, have been good at developing bigger, better, stronger, faster computing machines, we have not advanced very far in asking them meaningful questions.

    Inevitably someone will say "we can finally predict the weather..." and in true Futurama Farnsworth fashion I say PSHAW! We don't even know how to properly frame the QUESTION of how to predict the weather, much less get closer to an "Answer" like "The hurricane will hit EXACTLY here, at EXACTLY this time. Only the people on these specific streets are boned."

    Still, I bet I could get like 1 billion FPS on UT2004 at 3600x4800!

    Seriously though, I want to see small improvements. Better, easier to grasp programming languages. More critical thinking skills taught in schools. And a cluster like this dedicated to uber-porn. I'm talking full frame, Hi Def, ggg stuff. (did I type that last part out loud?)

    --
    ...But I digress. TREMBLE PUNY HUMANS!ONE DAY MY SPECIES WILL DESTROY YOU ALL!
    1. Re:Imagine? by arodland · · Score: 2, Funny

      > What will be in the atmosphere 24 hours from now?

      Where?

      > Everywhere.

      And what sort of things were you interested in?

      > Everything.

      Mostly a bunch of airplanes and some water.

  35. I can see a certain person now... by Trogre · · Score: 4, Funny

    ...rubbing his hands whilst sitting in a dark corner amongst an ever-dwindling pile of Microsoft-donated cash, salivating at this.

    "512 processors, 20 machines, $699 per processor. All that intellectual property, yes! No free lunch no, Linux mine, MIIIINE, BWAAAAHAHAHAHA!!!"

    *dials*

    "Hello, NASA? About that $7,157,760 you owe me...
    I'm sorry, where do you want me to jump?"

    --
    "Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
  36. What?! by commodoresloat · · Score: 3, Funny

    All those teraflops and still no aliens? How many freaking teraflops do we need? Come on, folks, I just want one goddamned spaceman!!

  37. Re:Intent of NASA... by a1cypher · · Score: 4, Funny

    Maybe they want to run PearPC at a decent speed.

  38. Linux #1 by Doc+Ruby · · Score: 4, Interesting

    The most amazing part of this development is that the fastest computer in the world runs Linux. All these TFLOPS increases are really evolutionary, incremental. That the OS is the popular, yet largely underground open source kernel is very encouraging for NASA, SGI, Linux, Linux developers and users, OSS, and nerds in general. Congratulations, team!

    --
    make install -not war

  39. Processors aren't relevant anymore? by swordgeek · · Score: 4, Interesting

    Curiously enough, we were talking about the future of computing at lunch today.

    There was a time when different computers ran on different processors, and supported different OSes. Now what's happening? Itanic and Opteron running Linux seem to be the only growth players in the market; and the supercomputer world is completely dominated by throwing more processors together. Is there no room for substantial architectural changes? Have we hit the merging point of different designs?

    Just some questions. Although it's not easy, I'm less excited by a supercomputer with 10k processors than I would be by one containing as few as 64.

    --

    "People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
  40. Imagine a Beowolf cluster... by Chuck+Bucket · · Score: 2, Funny

    ...oh never mind.

    CBV@#$)(*?>M

  41. it's the wetware by Doc+Ruby · · Score: 4, Insightful

    Weather prediction, it turns out, is *not at all* like playing chess. Chess is a deterministic linear process operating on rigid, unchanging rules. There is always a "best move" for every board state, which a sufficiently fast and capacious database could search for. Weather is chaotic, a nonlinear process. It feeds back its state into its rules, in that some processes increase the sensitivity to change of other simultaneous processes. Chaos cannot be merely "solved", like a linear equation; it must be simulated and iterated through its successive states to identify more states.

    Of course, we're just getting started with chaos dynamics. We might find chaotic mathematical shortcuts, just like we found algebra to master counting. And studying weather simulation is a great way to do so. Lorenz first formally specified chaos math by modeling weather. While we're improving our modeling techniques to better cope with the weather on which we depend, we'll be sharpening our math tools. Weather applications are therefore some of the most productive apps for these new machines, now that they're fast enough to model real systems, giving results predicting not only weather, but also the future of mathematics.
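
    The sensitivity that makes chaotic systems hard to predict can be shown with a small sketch. Note the swap: this uses the logistic map, a standard one-line chaotic system, as a stand-in. It is not a weather model and not the Lorenz equations mentioned above. The function names and the starting values are chosen purely for illustration.

```python
def logistic_step(x: float, r: float = 4.0) -> float:
    """One iteration of the logistic map x -> r * x * (1 - x)."""
    return r * x * (1.0 - x)


def divergence(x0: float, eps: float, steps: int) -> list:
    """Gap between two runs whose starting points differ by eps, per step."""
    a, b = x0, x0 + eps
    gaps = [abs(a - b)]
    for _ in range(steps):
        # The only way to learn a later state is to iterate through
        # every state before it.
        a, b = logistic_step(a), logistic_step(b)
        gaps.append(abs(a - b))
    return gaps


if __name__ == "__main__":
    gaps = divergence(x0=0.2, eps=1e-10, steps=100)
    for step in (0, 10, 20, 30, 40, 50):
        print(step, gaps[step])
```

    Two runs that start one part in ten billion apart end up bearing no resemblance to each other, which is why a small measurement error in the starting state puts a horizon on how far ahead a simulation stays useful.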

    --
    make install -not war

  42. I had a shell on the machine for day by Sabalon · · Score: 3, Funny

    It was great. I needed to build the kernel so I typed
    # make -j 10534 bzImag
    and even before I could hit the e and enter, it was done.

    I was gonna build X but on this box the possible outcomes of "build World" scared me!

  43. Units? by Guppy06 · · Score: 2, Funny

    "Its performance test (LINPACK) result of 42.7 teraflops easily outclasses the previous mark set by Japan's Earth Simulator of 35.86 teraflops"

    Yes, but what is that in bogomips?

  44. Re:Of course not. by Chrispy1000000+the+2 · · Score: 2, Funny

    Well, I think I speak for all of us when I say: Thank &ran(diety;1,100) it isn't running windows.

    --
    Sig
  45. Re:Intent of NASA... by harlows_monkeys · · Score: 3, Insightful
    With all of the new private space industry, NASA has been set free to explore the further reaches of space

    What new private space industry? Spaceship One, for example, reached space. That's a long way from being able to do anything useful in space. They were nowhere near orbital velocity, for example. We're still many years, if not decades, away from private industry being able to take over NASA's near-earth space role.

  46. Super Computers == Murder by padukes · · Score: 2

    Super Computers on Slashdot are like murders on the local news: There's another one every day - to the point where they've become just about meaningless. And come to think about it, why are they even news? They seem like they're just filler when there's nothing else to report.

    --

    -P
    Why have ONE conviction when you can have TWO?
  47. More on the Storage by Necroman · · Score: 2, Informative

    Check out http://www.sgi.com/products/storage/ for some more info about the storage they are using. For those that don't want to wander around the site, there is a link under the picture of the storage array that says "Watch a Video" and it gives an overview of the technology that SGI uses in their storage solution.

    They use tape storage from Storage Tek like this one
    And hard drive storage from Engenio (formerly LSI Logic Storage Systems) like this.

    --
    Its not what it is, its something else.
  48. Re:Ok, what is the point of this? by HermesHuang · · Score: 5, Interesting

    The answer here is "complexity". I do some scientific computing (have done chemistry, then materials science, now doing photonic devices) and there's always more you want to be able to consider. Of course, the best I've used is an 8-processor SGI machine (although that one was a bit old - I think the 2-processor Opteron system I'm using now is actually better). But especially with the materials studies, ideally we wanted to do everything with full quantum-mechanical calculations, which turn into gigantic matrices, even for a system of 100 atoms or so. And even then we put strict limits on what orbitals we consider and all that good stuff.

    Slightly more concrete example - right now with my photonics simulations (finite element) on my dual-Opteron rig the max I can handle is about 180,000 elements (which means a (4*180000)x(4*180000) matrix with complex elements needs to be diagonalized, among other things), and it takes about half an hour for a standing-wave calculation. To do any time propagation, repeat the same calculation in picosecond increments. And with the gridding I can do, for a 100 micron disc resonator in 2-D I have to use light at about 40 microns. To go to the 320nm wavelength these resonators are operating at, I'd need roughly 2 orders of magnitude more memory. There's also the time factor to be considered. As with any design process, one must iterate. Tweak a little here, run the program, rinse, repeat. How long are you willing to spend in this process before you feel something is "good enough"? The faster the computer spits the answer out, the more things you can try, and the more you can think things over and hopefully make it better.
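
    To give a feel for the matrix size quoted above, here is a back-of-the-envelope sketch. It assumes 16 bytes per entry (double-precision complex) and assumes, purely for scale, that every entry were stored. A finite element code would normally store only the nonzero entries, so this is an upper bound on paper, not a description of the actual memory use.

```python
ELEMENTS = 180_000            # finite elements, from the post
UNKNOWNS_PER_ELEMENT = 4      # the factor of 4 in the quoted matrix size
BYTES_PER_ENTRY = 16          # assumption: double-precision complex (2 x 8 bytes)

n = UNKNOWNS_PER_ELEMENT * ELEMENTS        # matrix dimension
dense_bytes = n * n * BYTES_PER_ENTRY      # storage if every entry were kept
print(n, dense_bytes / 1e12, "TB if stored densely")
```

    Under those assumptions a fully stored matrix would need several terabytes, which is why sparse storage matters and why memory, not just speed, sets the ceiling on problem size.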

    And this is a single component in what can be a fairly complex integrated-photonics chip. [And might I mention again I've been working in 2-D this entire time instead of doing a full 3-D simulation?] You give me the computational power and I'll use it. And I'm an experimentalist doing fairly basic research who just wants to check some stuff in the computer before sinking a lot of time and effort into fabricating a test device.

    On the other hand, I actually don't want to have one of the T100 supercomputers in our lab. That would mean I'd be spending all day writing code and designing complex simulations instead of in the lab getting my hands dirty.

    And as for the commonality of problems requiring such computational power, I think almost any sort of simulation can easily use it. Consider more terms (everything I've done to date is horribly linearized - let's see some more terms in the Taylor expansion) to account for nonlinear behavior, grid things up finer to get more accurate results, consider more possibilities when dealing with chaotic behavior... I would hope any good scientist would find the possibilities endless.

  49. nothing compared to 500TF per floor at the CIA by cheekyboy · · Score: 4, Funny

    1. I bet the CIA has something on the order of 10-100x more powerful, I mean if you can afford to wire up 5 full office floors of computers, say 20*512 * 5 per floor * 5, that's a hell of a lot more. CIA can afford to spend 200m on it, and have 10 super clusters of 1000 tf each.

    2. I bet the CIA also can change the weather, go read HARP etc... if the russians can do it in the 80s then the CIA can do anything.

    --
    Liberty freedom are no1, not dicks in suits.
  50. What's the point, I ask myself by talaphid · · Score: 2, Interesting

    As I'm RTFA...

    "For instance, on NASA's previous supercomputers, simulations showing five years worth of changes in ocean temperatures and sea levels were taking a year to model. But using a single SGI Altix system, scientists can simulate decades of ocean circulation in just days, while producing simulations in greater detail than ever before. And the time required to assess flight characteristics of an aircraft design, which involves thousands of complex calculations, dropped from years to a single day."

    Being the NASA fanboy I am, I have to wonder if this massive computational step up doesn't share a large number of similarities with the move from the punch card computing age to the modern programming age. Because of a quantum leap or five in time reduction for the bottleneck in computation time, more experiments, more radical theories, more wild stuff can be done because it won't be tying up the supercomputer for the next year... just the week. For all the wild science articles that make us salivate here... is this not the harbinger of a new era?

    /fanboy
  51. Re:The worst thing about this... by general_re · · Score: 3, Insightful
    ...just about any system with the same number of... well, gosh, almost any processor except an Itanium would be even faster...

    Like what? Go out and look up SPEC results next time you're bored. I think you'll find that I2 is quite a bit more capable than you make out. IBM's dual-core POWER5 is just about the only thing out there that's even close to (a single-core) I2 in FP performance, and Opteron isn't even in the game at that level.

    Is it a commercial failure? Probably, but so was Alpha - commercial success is not an indicator of actual performance.

    --
    ABSURDITY, n.: A statement or belief manifestly inconsistent with one's own opinion.
  52. They'd run Windows XP... by gilesjuk · · Score: 2, Funny

    But it would take too long to:

    Run Windows Update for each box.
    Remove Windows Messenger.
    Cancel the window telling you to take a tour of XP.
    Cancel the window telling you to get a passport.
    Run the net connection wizard.
    Reboot after installing updates.
    etc....

    (I'm not being totally serious, I know you can deploy ghost images etc..)