Slashdot Mirror


SGI Demos 64-Proc Linux Box

foobar104 writes "Details are scarce, but SGI announced this morning that their prototype Itanium 2 system has demonstrated more than 120 GB/s to and from main memory on the STREAM TRIAD benchmark, which is the fourth best result in the world. For comparison, the Cray C90 sustains 105 GB/s, while an even larger Sun Fire 15K clocks a measly 55 GB/s. The interesting part? The system wasn't running IRIX, SGI's proprietary version of UNIX. It was running Linux. More information on STREAM TRIAD, including results from other systems, is available here. The system, incidentally, was an Origin 3800 straight out of manufacturing equipped with Itanium 2 processor modules. SGI will start selling the systems early next year."

112 of 253 comments

  1. What is this good for? by Anonymous Coward · · Score: 3, Insightful

    To me, it would seem that being able to push info that fast to and from memory is useful for very few problems these days. I was under the impression that the majority of "super-computing" problems were of the sort that required lots of calculations, not lots of parsing of information in storage.

    Am I wrong about what this benchmark means? Or am I missing something basic?

    1. Re:What is this good for? by Neon+Spiral+Injector · · Score: 2

      Usually these number crunching exercises have large datasets. Too big to fit in the CPU registers, or cache, so you need quick access to RAM too.

    2. Re:What is this good for? by Space+cowboy · · Score: 2

      Perhaps the super-computing problems are approached in the way they are, because of the limitations on bandwidth to the CPU(s).

      Most of the super-computing problems are simulations, and I would have thought that being able to simulate more of the environment (therefore, more data to crunch) would be an advantage.

      Simon

      --
      Physicists get Hadrons!
    3. Re:What is this good for? by Cheeko · · Score: 2

      This is very dependent on the type of application that is being done. A big use of supercomputing power these days is genome research, and from what I've seen in this field, these applications are very data intensive, moving around massive amounts of data related to the sequences being processed. I'm also rather sure that applications like nuclear explosion and earth motion modeling require manipulating very large amounts of data that have a need for lots of memory bandwidth.

    4. Re:What is this good for? by Lxy · · Score: 2

      Thin client!!!!

      That's about the only use I can see for it. I could easily replace every workstation and server in our building with one of these.

      I guess colo could be another use, but I'd have to question what you're hosting that needs 64 Itanium processors. More importantly, how well does it handle VM?

      --

      There is no reasonable defense against an idiot with an agenda
      :wq
    5. Re:What is this good for? by Falrick · · Score: 2, Informative

      It's good for, as another poster put it, simulations. Specifically simulations with lots of tightly coupled entities. If you are simulating, say, 100 different entities, and the action of each of those entities has an effect on all of the other 99 entities, you gain greatly from a massively parallel shared memory environment. Sending state changes through a cluster can kill these kinds of applications.
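
A rough sketch of the coupling described above: when every entity affects every other entity, each update has to read the state of all the others, so the number of state reads per step grows roughly with the square of the entity count. The function name and the counting model are assumptions made for illustration, not taken from any particular simulator.

```python
def pairwise_reads(num_entities: int) -> int:
    """Directed interactions per time step under all-to-all coupling:
    each entity reads the state of every other entity once."""
    return num_entities * (num_entities - 1)


# The 100-entity case from the post, plus two larger hypothetical sizes.
for n in (100, 1_000, 10_000):
    print(f"{n:>6} entities -> {pairwise_reads(n):>12,} state reads per step")
```

On a shared-memory machine those reads are ordinary memory accesses; on a cluster, reads that cross a node boundary become network messages, which is presumably the cost the post is warning about.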

      --
      something clever
    6. Re:What is this good for? by Jhan · · Score: 5, Insightful

      Typical super-computing problems are weather prediction, air flow computations and nuclear reaction modelling. Physical models in other words.

      Generally, you attack these kinds of problem by partitioning 3-d space into many small cells, and then running relatively simple calculations on every cell. The better the resolution, the better the model.

      The thing about three dimensions is, storage space increases with resolution^3... For instance, I believe the weather guys are currently pushing 1 km x 1 km x 100 m resolutions. That means about 3.2e11 cells. If each cell has 1 kB of state, the total memory usage would be about 320 TB.

      Super computing problems eat memory like Takeru Kobayashi eats hot dogs. In many (most?) cases the calculations are simple. Hence, bandwidth is King.
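
The arithmetic above can be checked with a short sketch. The cell count and the 1 kB per cell are the figures from the post; the helper names and the "refinement factor" framing are assumptions added for illustration.

```python
def grid_memory_bytes(num_cells: float, bytes_per_cell: float) -> float:
    """Total state held in memory for a grid of identical cells."""
    return num_cells * bytes_per_cell


def refined_cell_count(num_cells: float, refinement: float) -> float:
    """Cell count after shrinking the cell edge by `refinement` in all
    three dimensions (storage grows with resolution cubed)."""
    return num_cells * refinement ** 3


cells = 3.2e11          # figure quoted in the post
state_bytes = 1_000     # 1 kB per cell, decimal units
total = grid_memory_bytes(cells, state_bytes)
print(f"{total / 1e12:.0f} TB for {cells:.1e} cells")

# Halving the cell edge in every dimension multiplies the cell count by 8.
print(f"{refined_cell_count(cells, 2) / cells:.0f}x more cells at 2x resolution")
```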

      --

      I choose to remain celibate, like my father and his father before him.

    7. Re:What is this good for? by foobar104 · · Score: 4, Insightful

      Am I wrong about what this benchmark means? Or am I missing something basic?

      With no disrespect intended, I think you might be missing something basic.

      Any activity that involves moving data into and out of RAM will benefit from the ability to do it faster. That includes such disparate things as database processing (if you're lucky, you can cache your indices in RAM), media encoding, hell, even compiling. Memory bandwidth is one of the few aspects of computer design that touches just about every application, with the exception of those that are small enough-- or sufficiently well optimized-- to fit into cache.

    8. Re:What is this good for? by GigsVT · · Score: 2, Funny

      That's about the only use I can see for it. I could easily replace every workstation and server in our building with one of these.

      Wow, that's going to be expensive, and how will they fit those in their cubicles?

      Imagine an openMosix cluster of these though. :)

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
    9. Re:What is this good for? by Usquebaugh · · Score: 2

      Yep :-)

      Throw out that old Z/390 or AS/400 and replace it with centrally managed GUI terminals. Saves the company $$$. My understanding is the only thing stopping Intel from taking over the mainframe space was a stable OS and memory bandwidth, looks like that's solved.

      But given SGI's history this is probably destined for running simulations and large factoring jobs.

    10. Re:What is this good for? by ericman31 · · Score: 5, Informative

      One of the areas where this is meaningful is data warehousing. There are three major competitors in the very large data warehousing environment and one wannabe competitor:

      • NCR Teradata and Worldmark MPP servers
      • IBM DB2 and IBM pSeries clusters (MPP again)
      • Sun SunFire 15K and Sybase IQ Multiplex (SMP)
      • Oracle is trying to compete in this space and not really succeeding. Their model is sort of MPP, based on Oracle Real Application Clusters

      MPP, or massively parallel processing, is the typical solution for very large (generally anything over 3 or 4 terabytes) data warehouses. Sun and Sybase are trying hard to crack the market with their SMP (symmetric multi-processing) solution, which is actually very promising. The major benefit to SMP processing is simplicity: one server to maintain, one OS, no cluster, no cluster interconnect. With Linux potentially pushing into the large SMP space we will have the potential for competition to the MPP data warehouse solutions, which are incredibly expensive to purchase and maintain.

      One of the biggest drawbacks to Linux adoption in the commercial Enterprise space is its lack of SMP scalability. If the SGI platform works out we will start seeing Linux scaling into an arena that will allow for acceptance in the Enterprise.

      --
      In my universe I'm perfectly normal, it's not my fault you don't live in my universe.
    11. Re:What is this good for? by vidarh · · Score: 2
      There are lots of applications that will benefit from this. But what I would like to see are faster disk storage systems, not faster memory... But then my main work over the last years has been huge mail systems (entirely disk IO bound) and extremely fault tolerant database distribution (.name TLD resolution system, also almost entirely disk IO bound).

      I'd be very happy to find a storage solution that gave us transfer rates that would get us anywhere near utilizing the full CPU capacity even with entry level servers these days for non-computing intensive processes such as mail delivery, serving DNS queries or fault tolerant message queueing... (and preferably one that doesn't cost ten times more than any potential savings from reducing the number of servers...)

    12. Re:What is this good for? by stienman · · Score: 3, Insightful

      Primarily this is good for marketing, company image, press releases, and selling potential customers on smaller systems.

      Chances are good that they will build very few full scale machines. Those that are built go towards data-warehousing, research (atmospheric, oceanic and space science, nuclear modeling, etc) and to the government. Factoring large numbers is a use, for instance, as it's a problem that can be performed in parallel.

      But they will have the ability to say that x, y, and z companies/gov't agencies have our equipment, it can't be exported (so it must be good), and our lower end machines will suit your job until you need an upgrade - in other words we can be with you for the whole ride and promise application compatibility.

      -Adam

    13. Re:What is this good for? by littleRedFriend · · Score: 3, Informative

      I work for a company that writes software for those kinds of genomic computations (yes, it runs on Linux, MPI & SMP). We recently did a large computation on the 4th largest super computer in the world. The results are freely available.

      Most of these computations are pretty intensive in CPU and memory usage. Network speed and disk speed are less important (although you need lots of storage). I would like to try one of these babies, must be fast.

      --
      IANAL, but imagine a beowulf cluster of in Soviet Russia all your belong are base to us welcoming the new SCO overlords.
    14. Re:What is this good for? by Anonymous Coward · · Score: 3, Informative

      1 km x 1 km x 100 m for Numerical Weather Prediction
      is a bit much for today's (affordable) supers.

      We use a 22 km x 22 km horizontal grid for
      predicting the weather 48 hours ahead over the
      North Atlantic + Europe (406 x 324 cells).

      We use 31 layers in the vertical (from ~30 meters
      thick in the lowest level to ~2 km for the few in
      the stratosphere).

      This is for a so-called "limited area" model. A
      global model such as the model of the European
      Centre uses about half the resolution (40 km)
      over the entire globe.

      Toon Moene.
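
For scale, the grid described above works out to roughly four million cells. The sketch below just multiplies the dimensions given in the post; the variable names are assumptions, and the comparison against the 3.2e11-cell figure from the earlier comment is added only for illustration.

```python
# Limited-area model dimensions as given in the post.
cells_x, cells_y, layers = 406, 324, 31

columns = cells_x * cells_y          # horizontal grid columns
total_cells = columns * layers       # full 3-D grid

print(f"{columns:,} columns x {layers} layers = {total_cells:,} cells")

# Ratio to the hypothetical 1 km x 1 km x 100 m global grid mentioned earlier.
fine_grid_cells = 3.2e11
print(f"fine grid is ~{fine_grid_cells / total_cells:,.0f}x larger")
```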

    15. Re:What is this good for? by FyRE666 · · Score: 2

      Apparently John Carmack has stated this should be able to run Doom III at a "reasonable speed"... So long as you don't go nuts with the detail level...

    16. Re:What is this good for? by Paladin128 · · Score: 2
      • Memory bandwidth is one of the few aspects of computer design that touches just about every application, with the exception of those that are small enough-- or sufficiently well optimized-- to fit into cache.
      That's assuming you have enough CPU power to process everything coming in. If you have 100 GB/s of bandwidth, and a single P4 or Athlon, most of that bandwidth will be unused.
      --
      Lex orandi, lex credendi.
  2. stats? by Lxy · · Score: 2

    Like any good press writeup, it lacks any details that are useful to techies. I want to see a dmesg from this thing, as well as pretty pictures of what's under the hood.

    --

    There is no reasonable defense against an idiot with an agenda
    :wq
    1. Re:stats? by Mignon · · Score: 4, Funny
      I want to see a dmesg from this thing

      The testers tried, but it scrolled by too fast to see anything.

    2. Re:stats? by AlgUSF · · Score: 2

      Yeah I would like to see the dmesg of it when it is recognizing the processors.... CPU #1 Genuine Intel Itanium (McKinley)............CPU #64 Genuine Intel Itanium (McKinley)..

      --


      I want my rights back. I was actually using them when our government stole them after 9/11.
    3. Re:stats? by dohcvtec · · Score: 2, Informative

      One nitpick: IIRC it would be CPU #0 - CPU #63

      --
      -- Never hit a man with glasses. Hit him with a baseball bat.
  3. So what is faster than it in the TRIAD? by Neon+Spiral+Injector · · Score: 5, Interesting

    That was my first thought. So it beats a C90, but what is faster?

    Found the answer here.

    And if you were wondering about a Beowulf cluster of these, the top ten ranking excludes "cluster results".

    1. Re:So what is faster than it in the TRIAD? by Durinia · · Score: 3, Informative

      Interesting...Looks like a T932 has got about a 3x performance on it, and the NECs (understandably, since they are the most modern) get like 5x. Still pretty impressive for a MPP machine, I would think. Were you able to find stats on MPP systems (such as the T3E or SP) anywhere?

    2. Re:So what is faster than it in the TRIAD? by Neon+Spiral+Injector · · Score: 2

      This page has results for the T3E, but no STREAM TRIAD for the SP.

    3. Re:So what is faster than it in the TRIAD? by brejc8 · · Score: 3, Informative

      These results are quite old. The SGI MIPS based machines seem to be much faster.
      512 processor Origin 3000 quoted as 716 GB/sec.
      I have no idea why they are using Itanics for this but it's not because they are better processors.

    4. Re:So what is faster than it in the TRIAD? by Neon+Spiral+Injector · · Score: 2

      It is strange about the Itaniums. Originally SGI was looking towards Windows NT to replace IRIX, and thus they started working with the Intel platform. But now they have turned that interest to Linux, which runs just fine on the MIPS.

    5. Re:So what is faster than it in the TRIAD? by Durinia · · Score: 2, Informative
      512 processor Origin 3000 quoted as 716 GB/sec.

      That's a peak speed, not a STREAM speed. Some of these machines (like the NEC SX-6) have peak speeds that are *much* higher. STREAM is an attempt at showing how a system performs on a somewhat more realistic workload.
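
For readers who have not seen it, the TRIAD kernel is a single loop of the form `a[i] = b[i] + q * c[i]`, and the benchmark counts two reads and one write per element. The sketch below shows the kernel and that byte accounting only; it is a pure-Python illustration with assumed names and an assumed array size, so the number it prints reflects interpreter overhead, not memory bandwidth, and is in no way comparable to a real STREAM result.

```python
import time
from array import array


def triad(a, b, c, q):
    """STREAM-style triad kernel: a[i] = b[i] + q * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + q * c[i]


def triad_bytes(n, word_bytes=8):
    """Bytes counted for one triad pass: two reads and one write per element."""
    return 3 * word_bytes * n


n = 100_000                      # illustrative size, far too small for a real run
b = array("d", [1.0]) * n
c = array("d", [2.0]) * n
a = array("d", [0.0]) * n

start = time.perf_counter()
triad(a, b, c, 3.0)
elapsed = time.perf_counter() - start

print(f"{triad_bytes(n) / elapsed / 1e6:.1f} MB/s (interpreter-bound, not a STREAM figure)")
```

A real run uses compiled code and arrays much larger than the caches, so that every element really does travel to and from main memory.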

    6. Re:So what is faster than it in the TRIAD? by jedidiah · · Score: 2

      If they use someone else's parts for a portion of the solution, that's one less chunk of the R&D that they have to bankroll. HP is dropping its own CPU line over such concerns. Besides, on high-end RISC based machines it is the memory busses that are most impressive (not the CPUs). A Sun or SGI bus is what Intel CPUs need to look really respectable.

      --
      A Pirate and a Puritan look the same on a balance sheet.
  4. impressive w/Linux by d3xt3r · · Score: 5, Interesting
    What is most impressive about this to me is that they did it using Linux over IRIX. Why? Because this has proven to be Linux's weakest point: scalability. Most of the changes in 2.5 are concentrating on scalability, could this be reaping those benefits?

    Linux running at 120 GB/s with 64 processors is impressive for an OS that has been criticized as inefficient when running on more than 8.

    I would be very interested to know what version of the kernel they are using.

    1. Re:impressive w/Linux by tempest303 · · Score: 5, Interesting

      I'm wondering the same thing - I wouldn't be surprised if this was a very customised 2.4/2.5 hybrid or some such.

      What I'm more curious about is what the licensing of all this will be like... are they just doing standard kernel patching, in which case the changes might get rolled back into the vanilla kernel? I'm a little worried that they might be doing it all via binary-only modules, which means that Linux proper gets none of the changes rolled back in... :-( I'd be somewhat surprised if SGI did this, though - they seem to have been pretty damn OSS friendly. (XFS!)

    2. Re:impressive w/Linux by foobar104 · · Score: 2

      I would be very interested to know what version of the kernel they are using.

      I tried really hard to find that info this morning before submitting, but to no avail. But the test was demonstrated at the Intel Developer's Conference, according to the press release, so maybe we could find somebody who knows somebody?

    3. Re:impressive w/Linux by Angry+White+Guy · · Score: 4, Interesting

      I think that the big question is will this get Big Iron back into the rendering farms, and what will be the effect?
      With the major animation companies going to Linux server farms to save cost and get better performance, maybe moving back away from x86 architecture to these large machines may be beneficial cost/productivity wise.

      --
      You think that I'm crazy, you should see this guy!
    4. Re:impressive w/Linux by CMonk · · Score: 5, Informative

      Given that they list "scalability" as one of the open source projects that they contribute to I would say they are playing nice with the community. (http://oss.sgi.com/projects/).

      They are working hard to get a number of their changes into the official kernel, I imagine this is one of them.

    5. Re:impressive w/Linux by joib · · Score: 2

      Like everyone else here, I don't know either. But I'd say it's quite a different kernel than the stock 2.4/2.5 kernel. I'd guess something like

      1) A K42-like exokernel with some parts of the linux kernel bolted on.

      2) Something like Larry McVoy's idea of OsLets, i.e. many kernels running on the system collaborating to provide a single system image to the user.

      3) The traditional way, i.e. implementing super-fine-grained locking in the linux kernel. This would of course make linux hard to maintain and slow on "normal" hardware, just like say, solaris.

    6. Re:impressive w/Linux by Bingo+Foo · · Score: 2
      Nope. Rendering motion picture frames is "embarrassingly parallel" as my boss likes to say. For a feature length movie, you have circa 120000 frames that each can be rendered without any communication through memory to other frames being rendered.

      You would be foolish to pay for interprocessor memory bandwidth when clusters are just as fast for that task.
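
A minimal sketch of what "embarrassingly parallel" means here: each frame is a function of its own frame number only, so the work can be handed out in any order with no shared state. `render_frame` is a hypothetical stand-in, not a real renderer, and the thread pool is used only to keep the example self-contained; an actual farm would spread frames across separate processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(frame_number: int) -> int:
    """Hypothetical stand-in for a renderer: the result depends only on
    the frame number, never on any other frame."""
    return (frame_number * 2654435761) % 256


def render_all(frames, workers=4):
    """Render frames independently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_frame, frames))


frames = range(1_000)
parallel = render_all(frames)
serial = [render_frame(f) for f in frames]
print("parallel result matches serial:", parallel == serial)
```

Because no worker ever needs another worker's data, fast shared memory between processors buys nothing for this step, which is the point being made above.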

      --
      taken! (by Davidleeroth) Thanks Bingo Foo!
    7. Re:impressive w/Linux by AJWM · · Score: 2

      Heck, not only do the frames not need to communicate with any other frame to be rendered, the same is also true for most of the pixels. (Absolutely so for classic ray tracing, less so for other rendering techniques.)

      On the other hand, however, many of the modelling techniques used to generate/animate the scenes to be rendered are memory bandwidth intensive as they basically amount to physics simulations in themselves. (Think particle systems, water effects (fluid dynamics), motion of things like hair and fabric, etc.)

      --
      -- Alastair
  5. Historical comparison... by Durinia · · Score: 3, Interesting
    ...interesting that SGI chose the Cray C90 - a system released in *1991* - to compare against. It's nice to know that it's only taken them 10+ years to catch up. :)

    They also mention the SV1, which is a "low-end" Cray. I'm curious how the new X1 (nee SV2) does on the STREAM suite.

    It's good to see that their "scalable linux" work seems to be doing pretty well! I'm sure it was much easier for them to use the IA-64 port of Linux than to port IRIX...

    1. Re:Historical comparison... by ivan256 · · Score: 3, Insightful

      SGI didn't choose to compare this to a C90, the slashdot submitter did. SGI primarily compared it to the "IBM® eServer p690 and Sun Microsystems Sun Fire"

      The part that I really find interesting is that the top three in the list all outperform this by twice as much, the #1 spot being held by a machine that can do over 500GB/sec.

      It's still over 12x faster than the quad Itaniums I used to work with, and probably much cheaper than the NEC machines and the Cray...

    2. Re:Historical comparison... by foobar104 · · Score: 3, Informative

      ...interesting that SGI chose the Cray C90 - a system released in *1991* - to compare against. It's nice to know that it's only taken them 10+ years to catch up. :)

      If you read the STREAM TRIAD web site linked above, you'll see that SGI didn't compare itself to the C90 exactly; it just ran a benchmark and published the results. Also in that approximate rank are other machines from NEC and Cray and, further down, Sun.

      But you're right. Cray was way ahead of their time when it came to things like memory bandwidth. I remember a friend (ex-Crayon) telling me once that access to main memory on the T-90 was faster than access to the on-chip cache on the Pentium III. That sounds implausible, though, so he might have been exaggerating.

      I'm curious how the new X1 (nee SV2) does on the STREAM suite.

      The last word I got is that X1 is still in the PCB design phase. It's only running as a simulator right now. So it'll be a while before you see those numbers. ;-)

      (That info is several months old, so I may be wrong.)

    3. Re:Historical comparison... by foobar104 · · Score: 2

      Add to this of course that Itanium2 is hardly a vector processor, so what they're doing is comparing apples to oranges.

      Ordinarily I'm all for spoiling benchmarks by pointing out ways in which they're not applicable, but in this case, you're wrong. This test measures bandwidth into and out of main memory. That's it. It makes no difference whether the processors have vector registers and instructions or not. Nothing matters except the factors that contribute to moving data between main memory and the CPUs.

    4. Re:Historical comparison... by Boone^ · · Score: 2

      your info is several months old. :)

  6. Whoa. Now let's parallelize! by Jeppe+Salvesen · · Score: 2

    Hmm. I wonder if I can parallelize the app I work on enough to use all those 64 processors? I know my bosses would wet themselves if I did. Of course, I am mainly disk bound. Anyone got a disk system to match?

    --

    Stop the brainwash

    1. Re:Whoa. Now let's parallelize! by haplo21112 · · Score: 2

      Symmetrix! Give EMC a Call

      --
      Power Corrupts,Absolute Power Corrupts Absolutely, leaving one person(group)in charge is absolutely corrupt.
    2. Re:Whoa. Now let's parallelize! by ericman31 · · Score: 2

      I have two words too. Hitachi Lightning! Incredible bandwidth to the disks, very redundant, and better priced than Symmetrix.

      --
      In my universe I'm perfectly normal, it's not my fault you don't live in my universe.
  7. Stock Kernel? by Hornsby · · Score: 2

    Does the current 2.4.x series kernel scale to 64 procs effectively, or are they using some "enterprise patch" to fine tune for this particular hardware? I was under the impression that since most kernel developers don't have access to this kind of ultra-high end hardware that Linux isn't really optimized for it. Correct me if I'm wrong.

    --
    A musician without the RIAA, is like a fish without a bicycle.
    1. Re:Stock Kernel? by Jobe_br · · Score: 2, Informative

      I, too, was wondering if SGI has produced a patch for this or if it's running a Linus kernel. Chances are, though, it isn't 2.4.x which is in maintenance mode, but rather the 2.5.x series, which is concentrating on enhancing scalability. Surprising, however, that the 2.5.x line would have gotten such impressive results so early. 2.5.x has only been in the works for a short time now, right?!?

    2. Re:Stock Kernel? by GigsVT · · Score: 2, Informative

      SGI is actually the driving force behind a lot of work on linux scalability. SGI submits patches to the kernel, everyone benefits, etc.

      Linux isn't really optimized for a lot of processors, but companies like SGI are working to change that, and contributing a lot to the community in the process.

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
  8. SGI Origin 3800 by brejc8 · · Score: 2

    SGI make loads of 64 processor machines. And I believe Linux runs fine on multiprocessor MIPS 14000s.

    1. Re:SGI Origin 3800 by brejc8 · · Score: 2

      I have seen linux on an O2 R5000.

    2. Re:SGI Origin 3800 by brejc8 · · Score: 2

      It was on a network

  9. so the next question.... by Lumpy · · Score: 2, Funny

    Why can't it run Windows XP?

    Ow!... ow, ow, ow, OW! stop throwing rocks at me!

    Ok so it was a bad joke....

    --
    Do not look at laser with remaining good eye.
    1. Re:so the next question.... by Anonymous Coward · · Score: 3, Funny
      Why can't it run Windows XP?

      Well, Windows is notorious for demanding a lot from the hardware. You have to expect it to be a dog on a low-end machine like this one.

      NT once ran on MIPS machines, as I recall. I don't have my NT4 disks handy, but I think that I recall that they included binaries for Alpha and MIPS. Wouldn't it be nifty to be able to boot NT on that and see it run one cpu, straight into a bluescreen? After all, a computer without MS Windows is like a person without cancer.

    2. Re:so the next question.... by Fizzlewhiff · · Score: 2

      I was going to say, "Wow, finally a machine that can handle the resource requirements of GNOME." but I didn't have the gnads.

      --

      'Same speed C but faster'
    3. Re:so the next question.... by cant_get_a_good_nick · · Score: 2

      At one time, NT was on x86, MIPS, and PowerPC. I remember all the "It runs NT" ads for MIPS based comps in the Ziff-Davis rags. I think for NT 3.51 only, then all but Alpha was dropped for NT 4, and then not even Alpha was supported past NT4.

      I may be wrong...

  10. Make the demo Open Source! by Lieutenant_Dan · · Score: 2, Insightful

    If we could work together (plus Mr Perens who is currently looking for a good cause to lead) we could take the demo to greater heights.

    What is to say that the demo's code isn't buggy and shoddy, holding the powerful Itanium processors back?

    If we realize the vast potential that the Open Source developer community provides then we can tackle such complex tasks as this Itanium performance measurement.

    --
    Wearing pants should always be optional.
  11. Two things by _damnit_ · · Score: 2, Interesting

    This sounds very cool, but I would really like more info than this. Plus, it isn't going to be released until next year. Within that time frame there will be the usual delays and then final release to a couple customers. Don't get me wrong, I think this is cool. Especially the linux part. This could go a long way to helping Linux scale better on massive machines.
    The second thought is: can it be partitioned? This is a rather big machine and goes against the trend I have witnessed to use many smaller machines to accomplish your goal. I'll have to ask some of the guys at Oracle if they've looked at Linux installs of this size, but as far as I know they only make x86 ports right now. So, I wonder what linux apps would someone run on a system this big? (I know. Insert obligatory Quake, Beowulf and porn server reference here.)

    Disclaimer: I work for an SGI competitor. But I have personally installed Linux on every piece of hardware I can get my hands on. Just to play usually, but still. They just pay my mortgage.

    --


    _damnit_

    It's my job to freeze you. -- Logan's Run
    1. Re:Two things by foobar104 · · Score: 4, Interesting

      The second thought is: can it be partitioned?

      Since this machine is a standard Origin 3000 with McKinley processor modules, I'm going to assume the answer will be yes. You can partition an O3000 down to a single processor brick + base IO brick, so I imagine that SGI will implement the necessary software bits to make that happen on the SN1-IA systems. I know there are both user space bits (mkpart, partmgr) and kernel space bits (the TCP-over-NUMAlink driver).

      I personally have only seen partitioning used on HA systems and lab systems. For a fully fault-tolerant N-processor system, you can buy one 2N-processor Origin and partition it down the middle. The two nodes can run in parallel, passing data back and forth over the NUMAlink via TCP/IP, until one goes down. Also, partitioning is great in a lab environment. It's nice to be able to carve up a big multiprocessor system and give each user a 4-processor (or multiple of 4) node.

      I wonder what linux apps would someone run on a system this big?

      Anything you'd run on an IRIX system of that size, I'd imagine. I believe-- not positive-- that MSC has already released Nastran for Itanium 2 Linux. (Nastran is a computer-aided engineering tool used extensively in the automotive industry, and other manufacturing industries. It's used for things like stress, heat transfer, and vibration analysis.)

      And, as long as the Fortran compilers are worth a damn, you can run just about any other scientific, analytical, or technical software, I'd imagine.

    2. Re:Two things by fgodfrey · · Score: 2
      Err, as I point out in a post below, this is not an Origin 3000 series machine, but you're right, it will be partitionable.


      Also, a hint on partitioning - ditch mkpart. Use partmgr. mkpart was one of the biggest pains in my rear when I worked on the O3000 partitioning code.... Also, nice to see someone running code I've worked on :)

      --
      Go Badgers! -- #include "std/disclaimer.h"
  12. What would you do with it? by Anonvmous+Coward · · Score: 2

    I saw a few comments along the lines of "wowee, powerful!". I'm just curious what somebody'd want with a machine that powerful.

    Me, personally, I do lotsa 3D stuff and would love to see what it'd take to bring that machine to its knees. However, I get the impression I'm but one of a few 3D dudes here. So what would you non-3D dudes wanna do with it?

    1. Re:What would you do with it? by tvalley000 · · Score: 2, Informative

      At a company I worked for in 1997, we used an SGI box of comparable power (well, not _that_ much power) to do real-time rendering of geological reservoirs of data. Typical data points were about 40MB of data, directly measured from the field of study. The purpose was a "fly through" for geologists to tell where oil could be found.

      Everyone on the team used SGIs (I used an Indigo 2, arguably the slowest box in the office) running IRIX. The Origin system sat two floors below us, with the 3D programmer only having the keyboard, mouse and monitor in his office. It made it difficult when we wanted to run a game of Quake, as everyone could easily sneak up on him.

    2. Re:What would you do with it? by joib · · Score: 2

      I've been doing ab initio calculations, i.e. calculating properties of some atomic system starting from quantum mechanics. Last month I used about 3000 CPU-hours of IBM POWER4 1.1 GHz juice. And my calculations weren't extremely complicated either..

    3. Re:What would you do with it? by geekoid · · Score: 2

      Model weather with smaller cells.

      --
      The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  13. Re:Well, that's nice, but what about... by foobar104 · · Score: 3, Insightful

    Anyone have an educated guess of what the actual score would be?

    Zero. Origin servers don't have graphics cards. Which means, unfortunately, the Slashdot community is going to have to try to wrap its collective head around a more meaningful measurement of potential performance.

  14. heh, typo? or what he meant... by Raleel · · Score: 2

    from Maddog:

    "For those applications that need to scale, SGI has just proven that Linux need not be synonymous with clutter."

    cluster? or clutter? a good cluster is not cluttered :)

    --
    -- Who is the bigger fool? The fool or the fool who follows him? --
  15. Released in the nick of time.... by RicochetRita · · Score: 4, Funny
    ... SGI will start selling the systems early next year.

    to meet the system requirements for Doom III.

    -R

    --
    Stuff that matters: circuitbreakers, vacuum-cleaners coffee makers, calculators generators, matching salt+pepper shakers
  16. great.... by _ph1ux_ · · Score: 2

    ....but seriously what are the applications for boxes like these. I mean - other than uses for lawrence livermore labs etc... big ass iron like this seems to only really be useful for 1. nuclear modelling 2. benchmark testing press releases.

    I know that someone somewhere is going to use a box like this - but tell me for what real world application will you use it. (serious question - curious. I want to know the real apps these are used for)

    1. Re:great.... by ericman31 · · Score: 2

      In the health insurance industry, which I happen to work in, large SMP or MPP machines are used for data warehousing and fraud and abuse detection. Machines ranging from 16 to 64 CPU's (generally UltraSPARC or IBM Power). When you are dealing with claims records for 5 or 10 million beneficiaries over a 5 or 10 year time span you need a lot of processing power and disk space. The data warehouses are used for trend analysis, fraud investigation and the like. Anyone with a background in statistics knows just how much number crunching we are talking about.

      --
      In my universe I'm perfectly normal, it's not my fault you don't live in my universe.
    2. Re:great.... by sql*kitten · · Score: 2

      I know that someone somewhere is going to use a box like this - but tell me for what real world application will you use it. (serious question - curious. I want to know the real apps these are used for)

      There are so many real world applications that demand a lot of CPU that it's hard to know where to begin answering your question. For a start, there's the engineering industry. Simulations are a lot cheaper than fabricating mockups and a lot easier to analyze than wind tunnel tests - so anyone designing any form of machinery or structure benefits from the raw CPU to run simulations that are closer approximations to reality. Jet engines, cargo ships, skyscrapers, prosthetic joints, microprocessors... pretty much any industrial product.

      Next there's the scientific community. There is a whole class of problems for which there are no "pure" theoretical solutions; the only way to solve them is to iterate over the data set with an algorithm until it stabilises. The search for new anti-cancer drugs is largely a matter of simulating interactions between protein molecules, which requires an enormous amount of processing power to get results in any useful time. Physics research is similar.

      Next, there's commerce. Success in business is about getting to market with what the market wants to buy at a price the market is willing to pay. If you can spot trends in billions of transactions (say, you're a mobile phone operator, a credit card provider or an airline) before your competition, you have an edge. When you're analyzing data in 14 or more dimensions, the more memory and CPU you can throw at the problem the better.

      That's just off the top of my head. We are a long, long way from the day we can say that we have "enough" processor power.
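
      For anyone wondering what "iterate until it stabilises" looks like in practice, here is a minimal sketch: a Jacobi-style relaxation on a tiny 1-D grid with fixed end points. The function name `relax`, the tolerance, and the grid itself are made up for illustration; real simulation codes are vastly bigger, which is exactly why they want the memory and CPU.

```python
def relax(values, tolerance=1e-6, max_sweeps=100_000):
    """Jacobi relaxation on a 1-D grid whose two end points stay fixed.

    Each sweep replaces every interior point with the average of its
    neighbours. Returns (grid, sweeps) once the largest change in a
    sweep drops below `tolerance`.
    """
    grid = list(values)
    for sweep in range(1, max_sweeps + 1):
        new = grid[:]
        for i in range(1, len(grid) - 1):
            new[i] = 0.5 * (grid[i - 1] + grid[i + 1])
        # Largest change made to any point during this sweep.
        change = max(abs(a - b) for a, b in zip(new, grid))
        grid = new
        if change < tolerance:
            return grid, sweep
    raise RuntimeError("grid did not stabilise within max_sweeps")


if __name__ == "__main__":
    grid, sweeps = relax([0.0, 0.0, 0.0, 0.0, 1.0])
    print(sweeps, [round(v, 3) for v in grid])
```

      Every sweep touches the whole data set, so once the grid no longer fits in cache the run time is set largely by how fast memory can feed the CPUs.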

  17. STREAM and SGI past history by dprice · · Score: 4, Informative

    It's not surprising that the SGI machine runs STREAM well. Back in the mid-1990's, John McCalpin, who worked for SGI at that time, was a regular contributor to comp.sys.super, and he would frequently brag about the superiority of SGI running STREAM. McCalpin is one of the primary advocates for STREAM. You can optimize a computer architecture to run a particular benchmark well. The question is whether the SGI machine runs a wider variety of real-world problems well.

  18. One word by joib · · Score: 2

    EXPENSIVE

  19. Re:Well, that's nice, but what about... by afidel · · Score: 2

    Which means, unfortunately, the Slashdot community is going to have to try to wrap its collective head around a more meaningful measurement of potential performance.

    So how many LOC's/sec is that?

    --
    There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  20. Well... by jellisky · · Score: 2

    ... weather modeling, for one. Here in the US, NCEP (National Centers for Environmental Prediction) runs all the forecast weather models on an IBM-SP (used to run on a Cray C90, I think). In Europe, the ECMWF model is run on a Fujitsu supercomputer, I think.

    Models for plasma dynamics and astrophysics are also run on these heavy-duty machines. I'm sure others have had some experience running other things, but I know that the NCEP IBM-SP gets a workout at least 2 times a day running at least four different weather models that have average runtimes around an hour each.

    -Jellisky

  21. Re:Well, that's nice, but what about... by Graspee_Leemoor · · Score: 2

    "Which means, unfortunately, the Slashdot community is going to have to try to wrap its collective head around a more meaningful measurement of potential performance."

    Hmm. (rubs chin). Well, there's that ASCII graphics version of Quake for the console- that should work!

    graspee

  22. big national id databases by QuantumRiff · · Score: 2

    This sure would run a select statement on a database of all of our info pretty damn fast. But who would believe we'd ever adopt any kind of national id, you know, like drivers licenses, social security cards, membership cards at grocery stores, etc.

    --

    What are we going to do tonight Brain?
  23. I have a dream... by mike260 · · Score: 2

    ...and it's called 64 CPUs.
    Perhaps they should update the song

  24. Impressive memory crossbar by Animats · · Score: 5, Insightful
    First of all, the OS doesn't matter for this benchmark. This is a memory-to-memory copying test.

    That said, it's an impressive result. And it's done in an unusual way. SGI has a 1.6GB/s channel running through routers connecting the processors and memory. A computer is made up of multiple rackmount "bricks" connected by cables and routers. The "router" is a 2U rackmount device.

    Processors and memory reside in rackmount boxes with 4 CPUs and 8 GB (max) of local memory. These boxes interconnect through a single 1.6GB/s link per box, which, in a big system, goes through several layers of routers. So a memory access to another box is routed through what is essentially a fast LAN. All this is cached, of course.

    It's not clear to what extent application programs have to be aware of this. Clearly, if you lay things out in memory badly, with lots of CPUs reading and writing the same memory from all over the memory net, the system will bottleneck. (Everybody reading the same stuff is OK; it's cached. But writes have to propagate back to the home location of the data.)

    Since the whole monster crashes all at once, you don't want to build your web server farm this way. It's for applications that really need all that crunch power in one machine.
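
    For reference, the TRIAD kernel itself is tiny: a[i] = b[i] + q * c[i] over arrays far too big for cache, with bandwidth reported as bytes moved divided by elapsed time. So it is a scaled vector add rather than a plain copy, but the point stands that it is almost all memory traffic. Below is a rough sketch of the idea, not the official STREAM code (which is C and Fortran). The function names are mine, and in pure Python the number you get measures the interpreter far more than the memory system.

```python
import time


def triad(a, b, c, q):
    """STREAM-style triad kernel: a[i] = b[i] + q * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + q * c[i]


def triad_bytes(n, word_size=8):
    """Bytes counted for one triad pass: two reads and one write per element."""
    return 3 * word_size * n


def measure(n=1_000_000, q=3.0):
    """Time one triad pass and return the apparent rate in GB/s."""
    a = [0.0] * n
    b = [1.0] * n
    c = [2.0] * n
    start = time.perf_counter()
    triad(a, b, c, q)
    elapsed = time.perf_counter() - start
    return triad_bytes(n) / elapsed / 1e9


if __name__ == "__main__":
    print(f"apparent triad rate: {measure():.3f} GB/s")
```

    With 8-byte words that is 24 bytes per element per pass, which is where the GB/s figures in the results tables come from.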

    1. Re:Impressive memory crossbar by foobar104 · · Score: 5, Informative

      It's not clear to what extent application programs have to be aware of this. Clearly, if you lay things out in memory badly, with lots of CPUs reading and writing the same memory from all over the memory net, the system will bottleneck.

      Speaking as somebody who's done his share of IRIX programming, I'd say "none at all."

      In some cases, on Origin 2000 hardware with older versions of IRIX, you could see notable performance differences if you went out of your way to place memory in banks adjacent to the running processors. But the Origin 3000 architecture, with its significant reductions in memory latency, and newer versions of IRIX, with their improved page replication algorithms, have made manual memory placement almost obsolete. Almost.

      SGI spent a lot of time and trouble trying to reduce the impact of accessing remote memory. The caching mechanisms and page replication stuff are really well thought-out.

    2. Re:Impressive memory crossbar by Sabalon · · Score: 2

      Packet sniffing?

      arp who-has cpu53 tell cpu4
      arp who-has ram1G-2G tell cpu3

    3. Re:Impressive memory crossbar by Florian+Weimer · · Score: 2

      First of all, the OS doesn't matter for this benchmark. This is a memory-to-memory copying test.

      Even the relatively simple uniprocessor x86 architecture offers OS implementors numerous ways to kill performance (shameless plug: a benchmark example). I would be surprised if SGI achieved this result without some tweaking.

    4. Re:Impressive memory crossbar by foobar104 · · Score: 2

      The docs only talked about 'Copy on write' mechanisms, but does it also do 'Copy on read'?

      I believe it does. They call it page replication. When a CPU on node 1 fetches data from memory on node 2, the system replicates the data pages from node 2 onto node 1, if free memory is available for them. It's all automatic. In this way, main memory is almost used like a level 4 cache.

      Do all NUMA machines do that?

      By now, they might, but when the O2000 was first released, SGI made a very big deal about this feature.
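
      To make the page replication idea concrete, here is a toy model. It is emphatically not how IRIX implements it: the `Node` class, `read_page`, `write_page`, and the rule that a write simply throws away stale replicas are all assumptions made for illustration.

```python
class Node:
    """A NUMA node with pages it is home to, plus space for local replicas."""

    def __init__(self, name, free_pages):
        self.name = name
        self.free_pages = free_pages
        self.home = {}      # page id -> data homed on this node
        self.replicas = {}  # page id -> local read-only copy of a remote page


def read_page(reader, home_node, page_id):
    """Read a page; return (data, where), where is 'local', 'replica' or 'remote'."""
    if page_id in reader.home:
        return reader.home[page_id], "local"
    if page_id in reader.replicas:
        return reader.replicas[page_id], "replica"
    data = home_node.home[page_id]
    # Replicate onto the reading node only if it has free memory to spare.
    if reader.free_pages > 0:
        reader.replicas[page_id] = data
        reader.free_pages -= 1
    return data, "remote"


def write_page(home_node, nodes, page_id, data):
    """Write goes to the page's home node; stale replicas elsewhere are dropped."""
    home_node.home[page_id] = data
    for node in nodes:
        if page_id in node.replicas:
            del node.replicas[page_id]
            node.free_pages += 1


if __name__ == "__main__":
    node1, node2 = Node("node1", free_pages=1), Node("node2", free_pages=0)
    node2.home[7] = "x"
    print(read_page(node1, node2, 7))  # first read goes across the interconnect
    print(read_page(node1, node2, 7))  # second read is served from the replica
```

      The first read pays the remote cost and later reads are local, which is the sense in which main memory behaves a bit like another cache level.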

  25. Statics, Benchmarks, and lies... by AtariDatacenter · · Score: 5, Informative

    I think it is pretty interesting that the benchmark that they used measured memory throughput, as opposed to, say, an actual workload. In other words, this is a synthetic benchmark, versus a real-world benchmark. They say, "Look! We can do memory transfers really really fast!"

    Unfortunately, memory transfers are not the world when it comes to large multiprocessor boxes. The overhead comes in when you're trying to synchronize a large number of threads/CPUs to do a large task. For example, an Oracle database.

    Sun has proven that it scales up the tree very well with large numbers of processors. But from my understanding, Linux is more efficient with a low processor count, and less and less efficient with more processors.

    I question its ability to do anything with a real workload. And I'm even more suspicious because they use a benchmark I've never heard of (STREAM TRIAD) to push its superiority on a single-aspect synthetic benchmark.

    Good. The machine looks like it has a decent memory bus, and memory modules with a good configuration and speed rating. Now, what can the machine actually do well that makes it a real winner?

    1. Re:Statics, Benchmarks, and lies... by foobar104 · · Score: 5, Informative

      Good. The machine looks like it has a decent memory bus, and memory modules with a good configuration and speed rating.

      You know, before you piss in SGI's Cheerios, you might want to do a little reading. The Origin 3000 architecture, on which this prototype system was based, has no memory bus at all. It uses a fabric of switched multi-gigabyte-per-second interconnects to attach CPUs to RAM and to other CPU nodes.

      CPU benchmarks (like SPEC) are synthetic and irrelevant, because they fit in cache. Virtually no real application fits in cache, and the sort of applications you run on a machine this big deal with data sets on the order of tens or even hundreds of gigabytes. Memory-to-CPU bandwidth is probably the only real indicator of the ability of the system to handle real-world workloads.

      It's also the only thing-- other than the dimensions and the color of the plastics-- that differentiates SGI's big Itanium 2 server from everybody else's big Itanium 2 servers.

    2. Re:Statics, Benchmarks, and lies... by AtariDatacenter · · Score: 2

      CPU benchmarks (like SPEC) are synthetic and irrelevant, because they fit in cache. Virtually no real application fits in cache, and the sort of applications you run on a machine this big deal with data sets on the order of tens or even hundreds of gigabytes. Memory-to-CPU bandwidth is probably the only real indicator of the ability of the system to handle real-world workloads.

      You say that synthetic benchmarks are irrelevant. Then you go on to say that this particular synthetic benchmark is highly relevant. It can't be both. I'd like to see this run a TPC variant, which is closer to real-world than it is synthetic.

      The Origin 3000 architecture, on which this prototype system was based, has no memory bus at all. It uses a fabric of switched multi-gigabyte-per-second interconnects to attach CPUs to RAM and to other CPU nodes.

      What, do I have to explicitly call out the components and subcomponents? It is a memory bus, for the purpose of this discussion.

    3. Re:Statics, Benchmarks, and lies... by AtariDatacenter · · Score: 2

      Obviously you missed the boat here chief. The system SGI is selling is for 3D Rendering.. not to run amazon.com

      I don't think I've missed the boat. Okay. Let's take rendering. On a pure economic level, they're going to be hard pressed to sell this configuration vs 64 single processor (perhaps even blade) servers.

      On a technical level, let's see how well performance ramps up when you go from 32 to 33 processors. (Hint: you won't be getting a full CPU's worth of extra performance.) Actually, it can get even worse with lock contention and kernel issues, to the point where you can LOSE performance by adding a CPU.
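
      One back-of-the-envelope way to see the 32-to-33 point is Amdahl's law. This is a textbook model I am swapping in, not anything measured on this box, and the 99% parallel fraction is an assumption picked purely for illustration.

```python
def amdahl_speedup(cpus, parallel_fraction):
    """Amdahl's law: speedup over one CPU when only part of the work parallelises."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cpus)


def marginal_gain(cpus, parallel_fraction):
    """Extra speedup, in single-CPU units, from adding one more CPU."""
    return amdahl_speedup(cpus + 1, parallel_fraction) - amdahl_speedup(cpus, parallel_fraction)


if __name__ == "__main__":
    # Assumed workload: 99% of the run time parallelises perfectly.
    p = 0.99
    print(f"speedup on 32 CPUs: {amdahl_speedup(32, p):.2f}")
    print(f"speedup on 33 CPUs: {amdahl_speedup(33, p):.2f}")
    print(f"gain from CPU #33:  {marginal_gain(32, p):.2f}")
```

      Even with 99% of the work parallel, the 33rd CPU buys only a bit over half a CPU's worth of speedup. Amdahl's law never goes negative, though, so the case where you actually LOSE performance needs a contention term this model doesn't have.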

      The point I was trying to make is they're touting superiority based on a single benchmark which measures memory bandwidth. Great. The company who produced the box picked a single benchmark which puts the best shine on the hardware/os combination.

      Now, what does the box really crank?

    4. Re:Statics, Benchmarks, and lies... by foobar104 · · Score: 2

      You say that synthetic benchmarks are irrelevant. Then you go on to say that this particular synthetic benchmark is highly relevant.

      No, I don't. I really can't emphasize this enough: read. I said, "SPEC is synthetic and irrelevant." Big difference.

      I'd like to see this run a TPC variant, which is closer to real-world than it is synthetic.

      The TPC benchmarks are measurements of database performance. Since SGI was trying to demonstrate the features and capabilities of their hardware, it would have been completely inappropriate for them to use a database benchmark. STREAM TRIAD is great because it measures only one thing: the rate at which data can be moved from memory to the CPU or vice versa. The TPCs measure aggregate systems, including hardware, storage, OS, database software, and so on. They may be relevant if you're looking for a fast database server system, but they're hardly useful for evaluating one hardware architecture over another.

      What, do I have to explicitly call out the components and subcomponents? It is a memory bus, for the purpose of this discussion.

      The whole point of this discussion is that the SGI system can outperform virtually everything else on STREAM TRIAD because it has no memory bus. Memory busses are bottlenecks, and pumping a lot of data through them is very hard. The SGI system eliminates the bottleneck and thus demonstrates amazing bandwidth. When you miss the whole point of the discussion, I'm going to call you on it.

    5. Re:Statics, Benchmarks, and lies... by AtariDatacenter · · Score: 2

      No, I don't. I really can't emphasize this enough: read. I said, "SPEC is synthetic and irrelevant." Big difference.

      No. You said... "CPU benchmarks (like SPEC) are synthetic and irrelevant, because they fit in cache." You also said "Memory-to-CPU bandwidth is probably the only real indicator of the ability of the system to handle real-world workloads." I'd call that even more synthetic and irrelevant than SPEC.

      The TPCs measure aggregate systems, including hardware, storage, OS, database software, and so on.

      No argument there. But I was saying that it was MORE relevant than SPEC. And extremely more relevant than that STREAM TRIAD test they're pushing.

      The whole point of this discussion is that the SGI system can outperform virtually everything else on STREAM TRIAD because it has no memory bus.

      Really? I don't recall reading that in the story introduction or SGI's Press Release. Only the link to the STREAM TRIAD itself pointed out that it was talking about memory bandwidth. In fact, that is what my original message was trying to point out.

      So they've got a machine that gets great ratings on this synthetic benchmark? Who cares. It doesn't mean much if you've bolted a kernel on top of it which isn't mature in a large CPU environment. (And other hardware issues, as you mentioned which the TPCs would bring into play.)

    6. Re:Statics, Benchmarks, and lies... by Jah-Wren+Ryel · · Score: 2

      Synchronization is all about memory latency. For the most part, STREAM is all about memory bandwidth. Although each has an effect on the other in any given architecture, it usually isn't too strong of an effect.

      --
      When information is power, privacy is freedom.
  26. Re:Well, that's nice, but what about... by geekoid · · Score: 2

    "Zero. Origin servers don't have graphics cards. Which means, unfortunately, the Slashdot community is going to have to try to wrap its collective head around a more meaningful measurement of potential performance."

    NOOOOOOOOOoooooooooo... ;)

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  27. it's the other way 'round by halfelven · · Score: 2, Informative

    Actually, it's precisely because of lack of superfast mem-IO machines that many people tried to work around the problem and create algorithms that are CPU-bound.
    In fact, most of the computationally-intensive problems require LOTS of mem-IO.

    And there's one more thing: there's a huge difference between the 64-CPU SGI machine, and a Mosix cluster of 64 1-CPU nodes: the SGI has one single memory space contiguous on the same machine. That means you can actually use a very large matrix to process your data, instead of shoving bits of it over the network back and forth.
    There are entire classes of problems that will be solved orders of magnitude faster on the SGI server than on a network-distributed Mosix cluster (or any other kind of cluster, Beowulf, etc.). That's the advantage of true SMP systems (all CPUs on the same hardware) as opposed to networked clusters.

  28. SGI and Linux by fm6 · · Score: 2
    What is most impressive about this to me is that they did it using Linux over IRIX. Why? Because this has proven to be Linux's weakest point: scalability.
    Maybe that was true three years ago when SGI announced its Itanium/Linux strategy. But I imagine they've put a little effort into it since then.

    This new system is news, but it's hardly groundbreaking news. Back in '99, SGI spun off MIPS and announced they would do commodity systems -- including supercomputers with commodity processors. At that point they had a choice: port IRIX to the Itanium, or teach Linux to scale so they could use it on their supercomputers. It's been no secret that they chose the latter. Or why: it was less expensive, and catered to an established user community.

    Note that Itanium/Linux systems are not meant to replace MIPS/Irix systems. Unless they've changed their strategy since I worked there, SGI plans to keep developing Irix systems for another 10 years, at least. Of course, that depends on maintaining loyalty to Irix solutions, and the buzz is that they're having trouble with that.

  29. Re:MIPS is to IA64 as Irix is to Linux? by foobar104 · · Score: 4, Informative

    Anybody else see that as the main reason this is running Linux instead of Irix?

    SGI started working on porting IRIX to the IA-64 architecture back in (I think it was) 1995 or 1996. Not long after, they found that it would be easier and cheaper to get Linux to scale more efficiently and to port some key libraries and services from IRIX than it would be to port all of IRIX over to the new architecture.

    It's all about time and money.

  30. there's no point in doing that by halfelven · · Score: 5, Interesting

    The whole point with the SGI supercomputers (there are Origin servers running Irix on 1024 processors) is that there's one single copy of the OS running across all those CPUs, and the entire memory is available to all CPUs on the same piece of hardware. That means, any CPU can access any piece of information at the speed of mem-IO, and you can easily create a large matrix (think many tens or hundreds of GB) to keep all your data in one piece.
    Networked clusters (Mosix, Beowulf) split the CPU bunch across the network, and the memory is split too. That means there's a huge latency when a CPU wants to access data that happens to be on a different node on the network: the network latency is many times larger than memory latency.

    There are problems that simply cannot be solved on networked clusters, precisely because of network latency. While true supercomputers (all CPUs on the same machine) do not have this limitation.
    Well, ok, so you can split the matrix across nodes in a Beowulf, but even if you have the same CPU power as the SGI supercomp, you're going to solve the problem several times slower (if not several orders of magnitude slower). Such is the importance of latency.

    This is why there's no point in clusterising this kind of computers: you lose their biggest advantage: single OS copy, all memory on the same machine.
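
    A crude way to put numbers on the latency argument: average access time is just a weighted mix of local and remote accesses. The figures below (100 ns for local memory, 100 microseconds for a trip over a cluster network, 1% of accesses remote) are illustrative assumptions, not measurements of any real system.

```python
def average_access_ns(remote_fraction, local_ns, remote_ns):
    """Average cost of one access when a fraction of accesses go off-node."""
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


def slowdown(remote_fraction, local_ns, remote_ns):
    """How many times slower the average access is than a purely local one."""
    return average_access_ns(remote_fraction, local_ns, remote_ns) / local_ns


if __name__ == "__main__":
    # Assumed figures, chosen only to show the shape of the problem.
    local_ns = 100.0
    network_ns = 100_000.0
    for remote_fraction in (0.0, 0.001, 0.01, 0.1):
        factor = slowdown(remote_fraction, local_ns, network_ns)
        print(f"{remote_fraction:6.3f} remote -> {factor:8.2f}x slower")
```

    Under those assumptions, sending just 1% of accesses over the network makes the average access about 11 times slower, which is why problems that can't be partitioned cleanly suffer so badly on clusters.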

  31. C90 is 12 years old! by Boone^ · · Score: 2

    The Cray C90 came out like in 1990 or 1991, and this new fangled SGI box just barely beats it? wow!

  32. Re:Imagine. . . by nizo · · Score: 2

    Actually, I would rather imagine a cluster of people with baseball bats chasing down people who post "Imagine a Beowulf cluster of these" to slashdot and beating some sense into them. ;-)

  33. Only Thurston Howell III is eligible by deal · · Score: 3, Funny

    "Through its experience and expertise in high-performance computing, SGI will offer customers of the highest quality 64-bit operating environments."
    Well, Hmph! The rest of us low-life customers wouldn't want it anyways!

  34. Windows? No way! by halfelven · · Score: 2, Interesting

    SGI never thought to replace Irix with Windows! That's ridiculous.
    Irix can scale up to 1024 CPUs and beyond. Solaris can scale up to 100. Here's Linux, now it's scaling close to 100. How much do you think Windows can scale? 10 CPUs? 20? :-)
    SGI's thing was always that it had machines running one single copy of the OS across hundreds (or thousands) of CPUs on the same machine (not in a cluster). You simply cannot do that with Windows, period.
    They had some graphics workstations running Windows, but that was on the lowest end of things, and now those systems are not available anymore.

    1. Re:Windows? No way! by foobar104 · · Score: 2

      Back in the early days of Windows NT, it was not known what its capabilities would be. SGI nearly bought the farm by betting on NT to replace IRIX.

      Then SGI realized NT wasn't going to be for big machines, and let that bad dream fade away.


      Dude, that's simply not true. SGI never even built a prototype Windows system that ran any version of NT before 4.0. By the time their Windows NT workstations made it out the door, Windows 2000 was very nearly a reality. So close that rather than adding support for certain features, SGI just punted. For example, dual monitor support was never offered on the NT systems until Windows 2000 came out, because NT 4.0 didn't support running two graphics pipes with different drivers.

      SGI never, ever, spent any time or effort on Windows NT for anything other than workstations.

      But likewise, finding x86 hardware with more processors is probably the largest reason that x86 Linux, Windows, whatever isn't running on bigger machines.

      So, let me get this straight. They don't make big x86 boxes, and that's "probably the largest reason" why there are no OSs for big x86 boxes? Brilliant!

    2. Re:Windows? No way! by foobar104 · · Score: 2

      During development of Windows NT 3.1, the first version (god bless starting counting at 3) MS made a strong pitch to SGI to get behind it and work with them to replace IRIX. SGI turned it down, and later signed up for limited workstation production.

      If true, this is the first I've heard of this. Can you back this up with some sort of evidence?

    3. Re:Windows? No way! by fgodfrey · · Score: 2
      So it was more like "we got a moron for a CEO who was totally in love with Bill Gates". He got SGI to commit to running Windows, even on big systems (in fact, the internal hardware manuals for a large system I worked with there actually mentioned booting "an OS such as Irix or Windows", even though that plan had long since been dropped). This caused SGI to lose a large number of top-notch developers who didn't really want anything to do with Windows. The same moron CEO, "Rocket Rick" Belluzo (who recently got the axe at Microsoft), and his yes-men made a number of other dumb decisions that nearly killed the company. Finally, he quit (before he could be tarred, feathered, and run out of town on a rail) in August of '99 and was replaced by competent management (the present CEO, Bob Bishop, who brought in a bunch of other people who are a lot better than Rocket Rick's crowd).


      In any case, the large systems capable of running Windows didn't appear until after he left and by the time they did, the decision had been made to unload Windows and go with a useful OS (Linux). This is what triggered SGI to get involved heavily in Linux development. Irix was another OS that was considered for the Itanium platform, but there were a variety of reasons why that wasn't picked.


      So, that's the short story on why SGI is presently making this system with Linux and why some people have mentioned Windows on large SGI's.

      --
      Go Badgers! -- #include "std/disclaimer.h"
    4. Re:Windows? No way! by foobar104 · · Score: 2

      I won't argue with you about it, but I'm not going to email you either. I'll just take your word for it and admit that you learn something new every day.

  35. Comment removed by account_deleted · · Score: 2

    Comment removed based on user account deletion

  36. Re:Well, that's nice, but what about... by GutBomb · · Score: 2

    i think the point of the question was to see how a software opengl implementation would perform on a 64proc machine.

  37. Comment removed by account_deleted · · Score: 2

    Comment removed based on user account deletion

  38. Re:Imagine. . . by jedidiah · · Score: 2

    Make bigger machines and people will just make bigger problems. Don't laugh or groan, someone WILL come up with some application that could exploit a couple hundred of these monsters.

    --
    A Pirate and a Puritan look the same on a balance sheet.
  39. Itanium on the rebound? by bryanbrunton · · Score: 2


    Intel must be pleased. If SGI could manage to sell one of these that would double the number of Itaniums that Intel has managed to flog.

  40. Comment removed by account_deleted · · Score: 2

    Comment removed based on user account deletion

  41. Not an Origin 3800 by fgodfrey · · Score: 2
    The poster on this is wrong. An Origin 3800 has MIPS processors and runs Irix (although there was a "toy" Linux port to Origin 2000 machines that would be fairly easy to adapt to the 3000 series). This is "the upcoming Itanium 2 system from SGI" that the press release mentions (what the marketing department at SGI will ultimately come up with for a name, I have no clue). While they are similar systems (both use ccNUMA and similar in other ways that I can't go into here), they use different memory control ASICs.


    In any case, the poster made it sound like you can just plug Itanium 2's into an Origin 3000 and *bang* you've got a Linux system, which is not correct.

    --
    Go Badgers! -- #include "std/disclaimer.h"
    1. Re:Not an Origin 3800 by foobar104 · · Score: 2

      Based on your resume, you obviously have inside information that I'm not privy to. But I think you're oversimplifying the story a bit.

      The Origin 3000 series (SN-1) was designed very nearly from the beginning to accommodate either MIPS CPUs from the R10000 family or IA-64 CPUs. In 1996-1997, when I worked most closely with the SN-MIPS and SN-IA groups, there was doubt about whether the IA-64 processor would be Merced or McKinley, but there was no question about support for one or both of them.

      Look inside a C-brick some time. (If you don't have one handy-- heh-- there's an illustration here.) See all that empty space at the front? The original design called for the use of either MIPS PIMMs or IA-64 PIMMs. The IA-64 PIMMs would include all the necessary hardware to make the Intel chips talk to the Bedrock memory controller. The MIPS PIMMs are pretty small, about four inches square or so. But the IA-64 PIMMs were projected to be real monsters with giant heat sinks on them. Thus all that empty space in the C-brick.

      For quite a while, of course, SGI has been working on SN-2, or whatever they're calling the successor to SN-MIPS these days. I'm not associated with that group any more, so I'm not in the loop on the new design. I've heard rumors on the order of 128 MIPS processors in a rack, quadrupling the processor density of the 3000 series systems, but I don't have any real information there. It's certainly possible that SGI is preparing to roll out their IA-64 systems in the spring in the new architecture, but that would surprise me. Of course, like I said before, you seem to know more than I do on this one.

    2. Re:Not an Origin 3800 by fgodfrey · · Score: 2
      Well, as you well know since you worked for SGI, SGI's plans change rapidly :) By the time I got to the project in '99, there were no plans for the SN1 architecture to support McKinley - that's SN2. Alas, SN1-IA64 is dead. It was killed in spring of '01 (which is a part of why I am no longer working for SGI...).


      Anyhow, I have definitely looked inside more C-bricks than I can count when I worked on the SN1 (both MIPS and IA64) partitioning software.

      --
      Go Badgers! -- #include "std/disclaimer.h"
    3. Re:Not an Origin 3800 by foobar104 · · Score: 2

      That sounds fairly typical. Thanks for correcting me.

  42. Huh? by rakslice · · Score: 2

    Doesn't SGI own Cray? (At least until recently?)