Slashdot Mirror


SGI Demos 64-Proc Linux Box

foobar104 writes "Details are scarce, but SGI announced this morning that their prototype Itanium 2 system has demonstrated more than 120 GB/s to and from main memory on the STREAM TRIAD benchmark, which is the fourth best result in the world. For comparison, the Cray C90 sustains 105 GB/s, while an even larger Sun Fire 15K clocks a measly 55 GB/s. The interesting part? The system wasn't running IRIX, SGI's proprietary version of UNIX. It was running Linux. More information on STREAM TRIAD, including results from other systems, is available here. The system, incidentally, was an Origin 3800 straight out of manufacturing equipped with Itanium 2 processor modules. SGI will start selling the systems early next year."
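For readers who have not met the benchmark named in the summary, the TRIAD kernel computes a[i] = b[i] + q*c[i] over large arrays, and STREAM counts 24 bytes of traffic per element (two 8-byte reads, one 8-byte write). Below is a minimal sketch in Python, assuming 8-byte doubles. The function names `triad`, `triad_bytes` and `measure` are made up for illustration and are not part of the official STREAM source, which is written in C and Fortran. Pure Python mostly measures interpreter overhead, so the GB/s figure it reports says nothing about real memory bandwidth.

```python
import time
from array import array


def triad(b, c, q):
    """STREAM TRIAD kernel: a[i] = b[i] + q * c[i]."""
    return array("d", (bi + q * ci for bi, ci in zip(b, c)))


def triad_bytes(n, word_bytes=8):
    """Bytes STREAM counts for TRIAD: read b, read c, write a."""
    return 3 * word_bytes * n


def measure(n=1_000_000, q=3.0):
    """Time one TRIAD pass and return (result, GB/s using decimal GB)."""
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    start = time.perf_counter()
    a = triad(b, c, q)
    elapsed = time.perf_counter() - start
    return a, triad_bytes(n) / elapsed / 1e9


if __name__ == "__main__":
    _, gb_per_s = measure()
    print(f"TRIAD (pure Python, illustrative only): {gb_per_s:.3f} GB/s")
```

The real benchmark repeats the kernel several times on arrays much larger than cache and reports the best run.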

30 of 253 comments

  1. What is this good for? by Anonymous Coward · · Score: 3, Insightful

    To me, it would seem that being able to push info that fast to and from memory is useful for very few problems these days. I was under the impression that the majority of "super-computing" problems were of the sort that required lots of calculations, not lots of parsing of information in storage.

    Am I wrong about what this benchmark means? Or am I missing something basic?

    1. Re:What is this good for? by Jhan · · Score: 5, Insightful

      Typical super-computing problems are weather prediction, air flow computations and nuclear reaction modelling. Physical models in other words.

      Generally, you attack these kinds of problem by partitioning 3-d space into many small cells, and then running relatively simple calculations on every cell. The better the resolution, the better the model.

      The thing about three dimensions is, storage space increases with resolution^3... For instance, I believe the weather guys are currently pushing 1 km x 1 km x 100 m resolutions. That means about 3.2e11 cells. If each cell has 1 kB of state, the total memory usage would be about 320 TB.

      Super computing problems eat memory like Takeru Kobayashi eats hot dogs. In many (most?) cases the calculations are simple. Hence, bandwidth is King.
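The cell-count estimate in this comment can be checked with a few lines of arithmetic. This is a rough sketch, not the poster's own calculation. The Earth surface area (about 5.1e8 km^2) is a standard approximation, and the 63 km model depth is a hypothetical value picked only because it reproduces roughly 3.2e11 cells. The helper names `grid_cells` and `memory_tb` are illustrative.

```python
def grid_cells(surface_km2, dx_km, dy_km, depth_km, dz_km):
    """Cells in a shell of the given depth cut into dx * dy * dz boxes."""
    columns = surface_km2 / (dx_km * dy_km)
    layers = depth_km / dz_km
    return columns * layers


def memory_tb(cells, bytes_per_cell):
    """Total state in decimal terabytes."""
    return cells * bytes_per_cell / 1e12


EARTH_SURFACE_KM2 = 5.1e8  # approximate surface area of the Earth
ASSUMED_DEPTH_KM = 63.0    # hypothetical depth, chosen to match ~3.2e11 cells

cells = grid_cells(EARTH_SURFACE_KM2, 1.0, 1.0, ASSUMED_DEPTH_KM, 0.1)
print(f"{cells:.2e} cells, about {memory_tb(cells, 1000):.0f} TB at 1 kB per cell")
```

Halving the cell edge in all three dimensions multiplies the cell count, and so the memory, by eight, which is the resolution^3 point made above.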

      --

      I choose to remain celibate, like my father and his father before him.

    2. Re:What is this good for? by foobar104 · · Score: 4, Insightful

      Am I wrong about what this benchmark means? Or am I missing something basic?

      With no disrespect intended, I think you might be missing something basic.

      Any activity that involves moving data into and out of RAM will benefit from the ability to do it faster. That includes such disparate things as database processing (if you're lucky, you can cache your indices in RAM), media encoding, hell, even compiling. Memory bandwidth is one of the few aspects of computer design that touches just about every application, with the exception of those that are small enough-- or sufficiently well optimized-- to fit into cache.

    3. Re:What is this good for? by ericman31 · · Score: 5, Informative

      One area where this is meaningful is data warehousing. There are three major competitors in the very large data warehousing environment and one wannabe competitor:

      • NCR Teradata and Worldmark MPP servers
      • IBM DB2 and IBM pSeries clusters (MPP again)
      • Sun SunFire 15K and Sybase IQ Multiplex (SMP)
      • Oracle is trying to compete in this space and not really succeeding. Their model is sort of MPP, based on Oracle Real Application Clusters

      MPP, or massively parallel processing, is the typical solution for very large (generally anything over 3 or 4 terabytes) data warehouses. Sun and Sybase are trying hard to crack the market with their SMP (symmetric multi-processing) solution, which is actually very promising. The major benefit of SMP is simplicity: one server to maintain, one OS, no cluster, no cluster interconnect. With Linux potentially pushing into the large SMP space, we will have the potential for competition to the MPP data warehouse solutions, which are incredibly expensive to purchase and maintain.

      One of the biggest drawbacks to Linux adoption in the commercial Enterprise space is its lack of SMP scalability. If the SGI platform works out we will start seeing Linux scaling into an arena that will allow for acceptance in the Enterprise.

      --
      In my universe I'm perfectly normal, it's not my fault you don't live in my universe.
    4. Re:What is this good for? by stienman · · Score: 3, Insightful

      Primarily this is good for marketing, company image, press releases, and selling potential customers on smaller systems.

      Chances are good that they will build very few full scale machines. Those that are built go towards data-warehousing, research (atmospheric, oceanic and space science, nuclear modeling, etc) and to the government. Factoring large numbers into primes is one use, for instance, as it's a problem that can be worked on in parallel.

      But they will have the ability to say that x, y, and z companies/gov't agencies have our equipment, it can't be exported (so it must be good), and our lower end machines will suit your job until you need an upgrade - in other words we can be with you for the whole ride and promise application compatibility.

      -Adam

    5. Re:What is this good for? by littleRedFriend · · Score: 3, Informative

      I work for a company that writes software for those kinds of genomic computations (yes, it runs on Linux, MPI & SMP). We recently did a large computation on the 4th largest super computer in the world. The results are freely available.

      Most of these computations are pretty intensive in CPU and memory usage. Network speed and disk speed are less important (although you need lots of storage). I would like to try one of these babies, must be fast.

      --
      IANAL, but imagine a beowulf cluster of in Soviet Russia all your belong are base to us welcoming the new SCO overlords.
    6. Re:What is this good for? by Anonymous Coward · · Score: 3, Informative

      1 km x 1 km x 100 m for Numerical Weather Prediction
      is a bit much for today's (affordable) supers.

      We use a 22 km x 22 km horizontal grid for
      predicting the weather 48 hours ahead over the
      North Atlantic + Europe (406 x 324 cells).

      We use 31 layers in the vertical (from ~30 meters
      thick in the lowest level to ~2 km for the few in
      the stratosphere).

      This is for a so-called "limited area" model. A
      global model such as the model of the European
      Centre uses about half the resolution (40 km)
      over the entire globe.

      Toon Moene.
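For scale, the grid figures quoted in this comment (406 x 324 horizontal cells, 31 vertical layers) multiply out as follows. The 1 kB of state per cell is borrowed from the estimate earlier in the thread and is an assumption, not a number from this poster. The function name `limited_area_cells` is illustrative.

```python
def limited_area_cells(nx, ny, layers):
    """Total cells in a regular nx * ny horizontal grid with the given layers."""
    return nx * ny * layers


cells = limited_area_cells(406, 324, 31)

# Assumption: 1 kB of state per cell, as guessed upthread.
state_gb = cells * 1000 / 1e9

print(f"{cells} cells, roughly {state_gb:.1f} GB of state at 1 kB per cell")
```

That comes to about four million cells, several orders of magnitude fewer than the 3.2e11 of the hypothetical 1 km global grid.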

  2. So what is faster than it in the TRIAD? by Neon+Spiral+Injector · · Score: 5, Interesting

    That was my first thought. So it beats a C90, but what is faster?

    Found the answer here.

    And if you were wondering about a Beowulf cluster of these, the top ten ranking excludes "cluster results".

    1. Re:So what is faster than it in the TRIAD? by Durinia · · Score: 3, Informative

      Interesting... Looks like a T932 has got about a 3x performance on it, and the NECs (understandably, since they are the most modern) get like 5x. Still pretty impressive for an MPP machine, I would think. Were you able to find stats on MPP systems (such as the T3E or SP) anywhere?

    2. Re:So what is faster than it in the TRIAD? by brejc8 · · Score: 3, Informative

      These results are quite old. The SGI MIPS based machines seem to be much faster.
      512 processor Origin 3000 quoted as 716 GB/sec.
      I have no idea why they are using Itanics for this but it's not because they are better processors.

  3. impressive w/Linux by d3xt3r · · Score: 5, Interesting
    What is most impressive about this to me is that they did it using Linux over IRIX. Why? Because this has proven to be Linux's weakest point: scalability. Most of the changes in 2.5 are concentrating on scalability; could this be reaping those benefits?

    Linux running at 120 GB/s with 64 processors is impressive for an OS that has been criticized as inefficient when running on more than 8.

    I would be very interested to know what version of the kernel they are using.

    1. Re:impressive w/Linux by tempest303 · · Score: 5, Interesting

      I'm wondering the same thing - I wouldn't be surprised if this was a very customised 2.4/2.5 hybrid or some such.

      What I'm more curious about is what the licensing of all this will be like... are they just doing standard kernel patching, in which case the changes might get rolled back into the vanilla kernel? I'm a little worried that they might be doing it all via binary-only modules, which means that Linux proper gets none of the changes rolled back in... :-( I'd be somewhat surprised if SGI did this, though - they seem to have been pretty damn OSS friendly. (XFS!)

    2. Re:impressive w/Linux by Angry+White+Guy · · Score: 4, Interesting

      I think that the big question is will this get Big Iron back into the rendering farms, and what will be the effect?
      With the major animation companies going to Linux server farms to save cost and get better performance, maybe moving back away from x86 architecture to these large machines may be beneficial cost/productivity wise.

      --
      You think that I'm crazy, you should see this guy!
    3. Re:impressive w/Linux by CMonk · · Score: 5, Informative

      Given that they list "scalability" as one of the open source projects that they contribute to I would say they are playing nice with the community. (http://oss.sgi.com/projects/).

      They are working hard to get a number of their changes into the official kernel, I imagine this is one of them.

  4. Historical comparison... by Durinia · · Score: 3, Interesting
    ...interesting that SGI chose the Cray C90 - a system released in *1991* - to compare against. It's nice to know that it's only taken them 10+ years to catch up. :)

    They also mention the SV1, which is a "low-end" Cray. I'm curious how the new X1 (nee SV2) does on the STREAM suite.

    It's good to see that their "scalable linux" work seems to be doing pretty well! I'm sure it was much easier for them to use the IA-64 port of Linux than to port IRIX...

    1. Re:Historical comparison... by ivan256 · · Score: 3, Insightful

      SGI didn't choose to compare this to a C90, the slashdot submitter did. SGI primarily compared it to the "IBM® eServer p690 and Sun Microsystems Sun Fire".

      The part that I really find interesting is that the top three in the list all outperform this by twice as much, the #1 spot being held by a machine that can do over 500GB/sec.

      It's still over 12x faster than the quad Itaniums I used to work with, and probably much cheaper than the NEC machines and the Cray...

    2. Re:Historical comparison... by foobar104 · · Score: 3, Informative

      ...interesting that SGI chose the Cray C90 - a system released in *1991* - to compare against. It's nice to know that it's only taken them 10+ years to catch up. :)

      If you read the STREAM TRIAD web site linked above, you'll see that SGI didn't compare itself to the C90 exactly; it just ran a benchmark and published the results. Also in that approximate rank are other machines from NEC and Cray and, further down, Sun.

      But you're right. Cray was way ahead of their time when it came to things like memory bandwidth. I remember a friend (ex-Crayon) telling me once that access to main memory on the T-90 was faster than access to the on-chip cache on the Pentium III. That sounds implausible, though, so he might have been exaggerating.

      I'm curious how the new X1 (nee SV2) does on the STREAM suite.

      The last word I got is that X1 is still in the PCB design phase. It's only running as a simulator right now. So it'll be a while before you see those numbers. ;-)

      (That info is several months old, so I may be wrong.)

  5. Re:Well, that's nice, but what about... by foobar104 · · Score: 3, Insightful

    Anyone have an educated guess of what the actual score would be?

    Zero. Origin servers don't have graphics cards. Which means, unfortunately, the Slashdot community is going to have to try to wrap its collective head around a more meaningful measurement of potential performance.

  6. Released in the nick of time.... by RicochetRita · · Score: 4, Funny
    ... SGI will start selling the systems early next year.

    to meet the system requirements for Doom III.

    -R

    --
    Stuff that matters: circuitbreakers, vacuum-cleaners coffee makers, calculators generators, matching salt+pepper shakers
  7. Re:so he next question.... by Anonymous Coward · · Score: 3, Funny
    Why can't it run Windows XP?

    Well, Windows is notorious for demanding a lot from the hardware. You have to expect it to be a dog on a low-end machine like this one.

    NT once ran on MIPS machines, as I recall. I don't have my NT4 disks handy, but I think that I recall that they included binaries for Alpha and Mips. Wouldn't it be nifty to be able to boot NT on that and see it run one cpu, straight into a bluescreen? After all, a computer without MS Windows is like a person without cancer.

  8. STREAM and SGI past history by dprice · · Score: 4, Informative

    It's not surprising that the SGI machine runs STREAM well. Back in the mid-1990s, John McCalpin, who worked for SGI at that time, was a regular contributor to comp.sys.super, and he would frequently brag about the superiority of SGI running STREAM. McCalpin is one of the primary advocates for STREAM. You can optimize a computer architecture to run a particular benchmark well. The question is whether the SGI machine runs a wider variety of real-world problems well.

  9. Re:stats? by Mignon · · Score: 4, Funny
    I want to see a dmesg from this thing

    The testers tried, but it scrolled by too fast to see anything.

  10. Re:Two things by foobar104 · · Score: 4, Interesting

    The second thought is: can it be partitioned?

    Since this machine is a standard Origin 3000 with McKinley processor modules, I'm going to assume the answer will be yes. You can partition an O3000 down to a single processor brick + base IO brick, so I imagine that SGI will implement the necessary software bits to make that happen on the SN1-IA systems. I know there are both user space bits (mkpart, partmgr) and kernel space bits (the TCP-over-NUMAlink driver).

    I personally have only seen partitioning used on HA systems and lab systems. For a fully fault-tolerant N-processor system, you can buy one 2N-processor Origin and partition it down the middle. The two nodes can run in parallel, passing data back and forth over the NUMAlink via TCP/IP, until one goes down. Also, partitioning is great in a lab environment. It's nice to be able to carve up a big multiprocessor system and give each user a 4-processor (or multiple of 4) node.

    I wonder what linux apps would someone run on a system this big?

    Anything you'd run on an IRIX system of that size, I'd imagine. I believe-- not positive-- that MSC has already released Nastran for Itanium 2 Linux. (Nastran is a computer-aided engineering tool used extensively in the automotive industry, and other manufacturing industries. It's used for things like stress, heat transfer, and vibration analysis.)

    And, as long as the Fortran compilers are worth a damn, you can run just about any other scientific, analytical, or technical software, I'd imagine.

  11. Impressive memory crossbar by Animats · · Score: 5, Insightful
    First of all, the OS doesn't matter for this benchmark. This is a memory-to-memory copying test.

    That said, it's an impressive result. And it's done in an unusual way. SGI has a 1.6GB/s channel running through routers connecting the processors and memory. A computer is made up of multiple rackmount "bricks" connected by cables and routers. The "router" is a 2U rackmount device.

    Processors and memory reside in rackmount boxes with 4 CPUs and 8 GB (max) of local memory. These boxes interconnect through a single 1.6GB/s link per box, which, in a big system, goes through several layers of routers. So a memory access to another box is routed through what is essentially a fast LAN. All this is cached, of course.

    It's not clear to what extent application programs have to be aware of this. Clearly, if you lay things out in memory badly, with lots of CPUs reading and writing the same memory from all over the memory net, the system will bottleneck. (Everybody reading the same stuff is OK; it's cached. But writes have to propagate back to the home location of the data.)

    Since the whole monster crashes all at once, you don't want to build your web server farm this way. It's for applications that really need all that crunch power in one machine.
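The figures quoted in this thread allow a quick back-of-the-envelope check of how much bandwidth each processor and each 4-CPU box would be sustaining. This is only a sketch built on numbers taken at face value from the story (120 GB/s, 64 processors) and from the comment above (4 CPUs and one 1.6 GB/s link per box). None of it comes from SGI documentation, and the variable names are illustrative.

```python
TOTAL_GB_S = 120.0   # "more than 120 GB/s", from the story summary
CPUS = 64            # from the headline
CPUS_PER_BOX = 4     # from the comment above
LINK_GB_S = 1.6      # per-box link speed, from the comment above

boxes = CPUS // CPUS_PER_BOX
per_cpu_gb_s = TOTAL_GB_S / CPUS
per_box_gb_s = TOTAL_GB_S / boxes
ratio_to_link = per_box_gb_s / LINK_GB_S

print(f"{boxes} boxes")
print(f"{per_cpu_gb_s:.3f} GB/s per CPU, {per_box_gb_s:.1f} GB/s per box")
print(f"per-box rate is {ratio_to_link:.1f}x the quoted link speed")
```

If those numbers hold, each box would be moving several times more data than its single external link is said to carry, which would suggest that most TRIAD traffic goes to memory local to the box rather than across the routers.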

    1. Re:Impressive memory crossbar by foobar104 · · Score: 5, Informative

      It's not clear to what extent application programs have to be aware of this. Clearly, if you lay things out in memory badly, with lots of CPUs reading and writing the same memory from all over the memory net, the system will bottleneck.

      Speaking as somebody who's done his share of IRIX programming, I'd say "none at all."

      In some cases, on Origin 2000 hardware with older versions of IRIX, you could see notable performance differences if you went out of your way to place memory in banks adjacent to the running processors. But the Origin 3000 architecture, with its significant reductions in memory latency, and newer versions of IRIX, with their improved page replication algorithms, have made manual memory placement almost obsolete. Almost.

      SGI spent a lot of time and trouble trying to reduce the impact of accessing remote memory. The caching mechanisms and page replication stuff are really well thought-out.

  12. Statics, Benchmarks, and lies... by AtariDatacenter · · Score: 5, Informative

    I think it is pretty interesting that the benchmark that they used measured memory throughput, as opposed to, say, an actual workload. In other words, this is a synthetic benchmark, versus a real-world benchmark. They say, "Look! We can do memory transfers really really fast!"

    Unfortunately, memory transfers are not the world when it comes to large multiprocessor boxes. The overhead comes in when you're trying to synchronize a large number of threads/CPUs to do a large task. For example, an Oracle database.

    Sun has proven that it scales up the tree very well with large numbers of processors. But from my understanding, Linux is more efficient with a low processor count, and less and less efficient with more processors.

    I question its ability to do anything with a real workload. And I'm even more suspicious because they use a benchmark I've never heard of (STREAM TRIAD) to push its superiority on a single-aspect synthetic benchmark.

    Good. The machine looks like it has a decent memory bus, and memory modules with a good configuration and speed rating. Now, what can the machine actually do well that makes it a real winner?

    1. Re:Statics, Benchmarks, and lies... by foobar104 · · Score: 5, Informative

      Good. The machine looks like it has a decent memory bus, and memory modules with a good configuration and speed rating.

      You know, before you piss in SGI's Cheerios, you might want to do a little reading. The Origin 3000 architecture, on which this prototype system was based, has no memory bus at all. It uses a fabric of switched multi-gigabyte-per-second interconnects to attach CPUs to RAM and to other CPU nodes.

      CPU benchmarks (like SPEC) are synthetic and irrelevant, because they fit in cache. Virtually no real application fits in cache, and the sort of applications you run on a machine this big deal with data sets on the order of tens or even hundreds of gigabytes. Memory-to-CPU bandwidth is probably the only real indicator of the ability of the system to handle real-world workloads.

      It's also the only thing-- other than the dimensions and the color of the plastics-- that differentiates SGI's big Itanium 2 server from everybody else's big Itanium 2 servers.

  13. Re:MIPS is to IA64 as Irix is to Linux? by foobar104 · · Score: 4, Informative

    Anybody else see that as the main reason this is running Linux instead of Irix?

    SGI started working on porting IRIX to the IA-64 architecture back in (I think it was) 1995 or 1996. Not long after, they found that it would be easier and cheaper to get Linux to scale more efficiently and to port some key libraries and services from IRIX than it would be to port all of IRIX over to the new architecture.

    It's all about time and money.

  14. there's no point in doing that by halfelven · · Score: 5, Interesting

    The whole point with the SGI supercomputers (there are Origin servers running Irix on 1024 processors) is that there's one single copy of the OS running across all those CPUs, and the entire memory is available to all CPUs on the same piece of hardware. That means, any CPU can access any piece of information at the speed of mem-IO, and you can easily create a large matrix (think many tens or hundreds of GB) to keep all your data in one piece.
    Networked clusters (Mosix, Beowulf) split the CPU bunch across the network, and the memory is split too. That means there's a huge latency when a CPU wants to access data that happens to be on a different node on the network: the network latency is many times larger than memory latency.

    There are problems that simply cannot be solved on networked clusters, precisely because of network latency, while true supercomputers (all CPUs on the same machine) do not have this limitation.
    Well, ok, so you can split the matrix across nodes in a Beowulf, but even if you have the same CPU power as the SGI supercomp, you're going to solve the problem several times slower (if not several orders of magnitude slower). Such is the importance of latency.

    This is why there's no point in clusterising these kinds of computers: you lose their biggest advantage, a single OS copy with all memory on the same machine.

  15. Only Thurston Howell III is eligible by deal · · Score: 3, Funny

    "Through its experience and expertise in high-performance computing, SGI will offer customers of the highest quality 64-bit operating environments."
    Well, Hmph! The rest of us low-life customers wouldn't want it anyways!