Slashdot Mirror


SGI Introduces World's Densest Server

Twirlip of the Mists writes "Today SGI announced the Origin 3900 server, the world's densest computer. How dense? How about 16 MIPS R14000A processors and 32 GB of RAM in a 4-rack-unit 'superbrick,' for a grand total of 128 processors and 256 GB of RAM in a single rack. That makes the new machine the densest single-system-image computer in the world; it's even denser than most blade systems. Just for fun, the server also includes a whole bunch of 64-bit, 133 MHz PCI-X slots (from 11 up to hundreds and hundreds, depending on configuration). There's coverage of the announcement on ZDNet, CNET, and InfoWorld, as well as on SGI's own site."

39 of 338 comments

  1. System Requirements by Soporific · · Score: 5, Funny

    Isn't that the system requirement for the up and coming Doom III?

    ~S

    1. Re:System Requirements by akula1 · · Score: 3, Funny

      It'll probably be about right by the time Duke Nukem Forever is released.

  2. Densest server? by digitalsushi · · Score: 5, Funny

    Now where do we find the world's densest admin to run it?

    --
    slashdot: where everyone yells sarcastic metaphors to themselves to understand the issue
  3. Re:Is it such a good new? by pyr0 · · Score: 5, Funny

    "...and more on lessening heat dissipation..."

    Correct me if I'm wrong, but wouldn't you want to *increase* heat dissipation?

  4. Re:Just imagine a beowulf .... by jo42 · · Score: 5, Funny
    Query: What do you call a cluster of slashdot Linux geeks?

    Response: The boys that cried "Beowulf!".

  5. World's Densest Server by bstadil · · Score: 4, Funny
    Not true.

    This record goes to Emmanuel at the little bistro on Rue de Bach just off Blvd. St. Michel in Paris.

    --
    Help fight continental drift.
  6. Re:Heating? by Twirlip+of+the+Mists · · Score: 5, Informative

    I meant to mention this in my submission, but it slipped my mind. The R14000A only consumes 17 watts of power. Four of them, plus the Bedrock memory controller chip, plus up to 8 GB of RAM, fit on a board inside a 1 RU clearance. Four of those boards, plus some nifty backplane hardware, fit into a "superbrick," meaning sixteen processors in 4 RU.

    As far as heat loading goes, the "superbrick" is basically one big wind tunnel, with giant fans on the front and ventilation out the back. It pumps a lot of heat into the room, but the temperature in and around the CPUs is really pretty low. I think it peaks around 35 C.
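
    As a rough sanity check on those figures, here is a minimal sketch of the CPU-only power budget. It uses only numbers quoted in this thread (17 W per R14000A, four CPUs per board, four boards per superbrick, 128 processors per rack); the constant names are made up for illustration. The totals ignore the Bedrock chips, RAM, fans, and I/O, so treat them as a lower bound, not a spec-sheet value.

```python
# Figures quoted in the thread; names are illustrative only.
CPU_WATTS = 17               # per R14000A, as stated above
CPUS_PER_BOARD = 4
BOARDS_PER_SUPERBRICK = 4
CPUS_PER_RACK = 128          # from the story summary

cpus_per_superbrick = CPUS_PER_BOARD * BOARDS_PER_SUPERBRICK
superbricks_per_rack = CPUS_PER_RACK // cpus_per_superbrick

# CPU power only: Bedrock, RAM, fans and I/O are deliberately left out.
cpu_watts_per_superbrick = cpus_per_superbrick * CPU_WATTS
cpu_watts_per_rack = CPUS_PER_RACK * CPU_WATTS

print(f"CPUs per superbrick:      {cpus_per_superbrick}")
print(f"Superbricks per rack:     {superbricks_per_rack}")
print(f"CPU watts per superbrick: {cpu_watts_per_superbrick}")
print(f"CPU watts per rack:       {cpu_watts_per_rack}")
```

    So the processors themselves would account for only a couple of kilowatts per rack under these assumptions.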

    --

    I write in my journal
  7. Blade/Origin Comparison by zmalone · · Score: 5, Insightful

    Commenting on how the new Origin systems are denser than any other single image system, and then comparing them to the current blade fad to make your point is a bit silly. Blades are separate machines (unless they are Sun, in which case they are the current desktop line), this system is a single machine. I'm not entirely certain about this density claim either, doesn't Sun fit 128 processors in a rack with the Fire 15ks?

    1. Re:Blade/Origin Comparison by Twirlip+of+the+Mists · · Score: 4, Informative

      Sun fits 106 processors into a rack. They were previously the record holder. The Origin 3900 is considerably denser than the Sun Fire 15K, both in terms of processor count and PCI-X slot count-- though not at the same time, of course.

      I compared the density of SGI's system to blade systems because those are widely considered to be the densest computers in the world, with something like 90 or 100 individual one-processor computers per rack. This system is not only denser in terms of pure processor count than most-- not all, but most-- blade servers, but it's also got all the advantages of a single system image for HPC applications.

      --

      I write in my journal
    2. Re:Blade/Origin Comparison by Twirlip+of+the+Mists · · Score: 5, Interesting

      Close, but no kewpie doll. A superbrick holds 16 processors (not 64; I think that was a typo on your part), and connects externally via NUMAlink to other superbricks. But, if I remember my numbers right, the maximum memory latency across the longest multi-router NUMAlink hop in a 128-processor Origin 3000-series system is less than the normal processor-to-processor latency in the Sun Fire 15K. NUMAlink is incredibly fast. The ratio of local memory latency to remote memory latency is something like 1:1.5, as opposed to about 1:10 in IBM's and Sun's big systems.

      --

      I write in my journal
  8. Re:SGI's Gettin' Some by Twirlip+of+the+Mists · · Score: 3, Insightful

    What we need is faster, cheaper hardware that makes sense!

    The 128-processor Origin 3900 lists for $2.9 million. There's nothing "cheaper" about this. Faster, yeah; this is one of-- not "the," but one of-- the fastest computers in the world. And it's the densest. But it's nowhere near cheap.

    --

    I write in my journal
  9. Pointless in most datacenters by cjsnell · · Score: 3, Interesting


    These servers are pointless in most datacenters. In order to fill one rack with this much horsepower, you would need at least two empty racks next to it to compensate for the power draw and (much) increased cooling needs. I would argue that the target market for this equipment is government labs, research institutes and universities--not usually starved for floor space.

    1. Re:Pointless in most datacenters by Jobe_br · · Score: 4, Interesting

      Just an FYI - the CNet article (linked above) talks about its possible use on oil rigs - that type of mapping usually takes some horsepower and as usual, anything that is sea-based will be somewhat cramped for space!

    2. Re:Pointless in most datacenters by Twirlip+of+the+Mists · · Score: 4, Insightful

      Right but wrong. The target market for this system is definitely government and university HPC labs, but those labs are definitely short of floor space. Putting more MIPS per floor tile is an important advancement.

      --

      I write in my journal
    3. Re:Pointless in most datacenters by aaarrrgggh · · Score: 4, Informative

      Actually, the spec sheet indicates that it is 8.9kW per rack (2.2kW for Drive arrays). That is on the high side, but liveable. (6kW is the max for "standard" cooling-- you can accommodate up to 10kW with a high delta-T cooling system. Water cooling comes into play after that.)

      The value of shrinking it down is (as you allude to) not a real-estate issue, but more about the computing efficiencies of a denser package.

      The HP blades (6U) are about 35kW nameplate per rack, with a real load of about 10-11kW. The energy savings of SGI might actually give it some value in comparison!

  10. Favorite Quote by ProtoStar · · Score: 5, Funny
    From the ZD link:

    Procter & Gamble, for example, uses an SGI system to study the aerodynamics of Pringle's potato chips

  11. Re:no different... by Twirlip+of+the+Mists · · Score: 5, Insightful

    No, it's not. This is a single-system-image server. The 128-processor rack boots a single kernel. (In fact, you can connect four 128-p racks together to make a 512-p system, and larger systems than that are supported under special contract to SGI. I believe NASA Ames has a 1,024-p.)

    The four-processor, 1-unit server you talked about stops there: at four processors. You can't compare that to a system that scales to be 256 times that size.

    --

    I write in my journal
  12. Density by flops? by LoudMusic · · Score: 5, Insightful

    How about we calculate density by flops or something else useful. I mean, how difficult would it be to cram a butt load of Pentiums in a rack? Yeah well how much calculation can they do?

    Let's cruise on over to the Top 500 and use their handy dandy html list to view 'most powerful chip'. This unfortunately requires a little calc work because they failed to include this number in their table.

    #1 NEC Earth-Simulator 35,860.00 GFlops using 5,120 Processors -- WOW!

    But that's only 7 GFlops per processor ... that thing is mammoth with 5,120 processors.

    Now let's look at a little different design ...

    #14 Hitachi SR8000-F1/168 1,653.00 GFlops using 168 Processors -- Hot DAMN!!

    This is more like it. They're pulling 9.84 GFlops per processor. With their architecture they could pull off the Earth-Simulator's GFlop rate with 3,645 processors - That's 28% less computer doing the same amount of work. Which means if the Earth-Simulator had been constructed with Hitachi's hardware, they could have been pulling 50,380 GFlops in the same cubic footage.

    Now this is all rambling that assumes that the processors are similar in size. Which probably isn't true. But they are also getting more power out of less hardware, and it is rare that THAT isn't a bonus. ... ramble ramble ...
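
    The "little calc work" can be sketched in a few lines. This is only an illustration using the GFlops and processor counts quoted in this comment; the dictionary layout and function name are invented for the example, and it inherits the same assumption that the processors are comparable in size.

```python
import math

# GFlops and processor counts as quoted in the comment above.
systems = {
    "NEC Earth-Simulator": {"gflops": 35860.0, "processors": 5120},
    "Hitachi SR8000-F1/168": {"gflops": 1653.0, "processors": 168},
}


def gflops_per_processor(gflops, processors):
    """Average GFlops delivered by one processor."""
    return gflops / processors


per_proc = {name: gflops_per_processor(**s) for name, s in systems.items()}

nec = systems["NEC Earth-Simulator"]
hitachi_rate = per_proc["Hitachi SR8000-F1/168"]

# Processors needed to match the Earth-Simulator at the Hitachi per-CPU rate.
hitachi_procs_needed = math.ceil(nec["gflops"] / hitachi_rate)
# Fraction of processors saved relative to the Earth-Simulator's 5,120.
savings = 1 - hitachi_procs_needed / nec["processors"]
# GFlops if all 5,120 processors ran at the Hitachi per-CPU rate.
scaled_gflops = nec["processors"] * hitachi_rate

for name, rate in per_proc.items():
    print(f"{name}: {rate:.2f} GFlops per processor")
print(f"Processors needed at Hitachi rate: {hitachi_procs_needed}")
print(f"Processor savings: {savings:.1%}")
print(f"Scaled GFlops: {scaled_gflops:,.0f}")
```

    The scaled figure comes out a few GFlops below 50,380 because the sketch does not round the per-processor rate to 9.84 first.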

    --
    No sig for you. YOU GET NO SIG!
  13. Re:SGI's Gettin' Some by Sean+Johnson · · Score: 3, Funny

    Just wait for the technology to trickle down. You'll be able to get something on par with this for $3000 in about, oh say......30 yrs.

    --
    >>>>>> Chewie, take the professor in the back and plug him into the hyperdrive.
  14. Re:Z.... by Twirlip+of+the+Mists · · Score: 5, Interesting

    Obviously, that should be 64 gigabytes of RAM, not 64 megs.

    Interesting thing about this system will be, rather than the maximum RAM capacity, the minimum RAM required. The original Origin 3000 required some minimal amount of RAM-- 256 or 512 MB or something-- for every four processors. I'm not sure if this new model has the same requirement, but I'd imagine that it does. (It's an architectural thing. Every node board has to have some RAM on it, because that node board may be nominated at boot time to act as the boot master, among other reasons.)

    If that's true, then a 128-processor system (32 node boards of four processors each) would require a minimum of either 8 or 16 GB of RAM, depending on whether you can put 256 MB on a node board.

    --

    I write in my journal
  15. Re:Superbrick's layout? by Twirlip+of+the+Mists · · Score: 5, Informative

    (I'm answering these questions off-the-cuff, so if I mistype any details, sorry.)

    If you know what a first-generation C-brick looks like, imagine squeezing that board into a one-rack-unit form factor and stacking four of them together.

    Each superbrick includes four boards, spaced one unit apart, with four R14Ks, the Bedrock, and some RAM. The boards are connected with an internal eight-port crossbar router, making the superbrick a self-contained 16-processor unit. Externally, the superbrick connects to the base I/O brick via XIO+; the base I/O brick contains stuff like the system disk and the first 11 PCI-X slots.

    I'm not positive how the superbricks are configured. Theoretically, you can partially populate them in one-node increments (meaning 4 CPUs and some RAM), but SGI may or may not sell them that way for manufacturing and QA reasons.

    I believe the CPUs come with 8 MB of s-cache each.

    The CPU-to-CPU and CPU-to-RAM bandwidths vary depending on the topology you're crossing, but I believe the minimum is 1.6 GB/s unidirectional, or 3.2 GB/s bidirectional. Intra-node bandwidths are somewhat higher, I believe.

    No, the CPUs are regular single-core MIPS R14000As. They're tiny chips that don't consume much power, so you can really squeeze 'em in there.

    Keep an eye on techpubs.sgi.com, because SGI will be releasing the developer and owner docs for the new system there shortly. (By "shortly" I mean as soon as a few hours or as long as a few weeks, depending on when the docs get released.) You'll find all the technical data you want when those docs go up.

    --

    I write in my journal
  16. not as dense as mine ! by jacquesm · · Score: 4, Interesting

    www.clustercompute.com

    well, on a per mips basis maybe, but then again I could use faster cpu's today.

  17. Re:SGI's Gettin' Some by Twirlip+of+the+Mists · · Score: 5, Interesting

    There are 128 cpu intel/amd solutions that fit in a single rack. I know of at least 3 companies that produce them and they are cheap.

    There are a few blade systems that can squeeze 128 or more processors into a rack, but those are blade systems, not single-system-image compute servers. You can't use a blade server to do the job of an Origin 3900. (Of course, the converse is also true; you wouldn't buy an Origin 3900 to do something you could do with a blade server instead.)

    SGI tends to produce exactly what the customer wants. It's just that their customer is more often than not the federal government, or a very large corporation. It's not well-known-- in fact, for a time it was classified-- but SGI designed, manufactured, and sold an entire line of what were basically DSP coprocessor units specifically for Lockheed's satellite division. Called the "tensor processing unit," each one was basically an expansion module for the Origin 2000. SGI built it just like a commercial product, complete with documentation and everything, and manufactured them in large quantities. It's just that you couldn't buy them unless you were Lockheed.

    It's only when SGI tries to branch out that they do poorly. I don't know WTF they were thinking when they decided to try selling inexpensive (relative to other SGI products) workstations running NT or Linux. That was just insane. But as SGI strips more and more of that BS away, they get closer and closer to being a sound company again.

    --

    I write in my journal
  18. Delaying the inevitable? by afidel · · Score: 3, Insightful

    Is this just delaying the death of SGI or signaling a new focus and niche for the company? I loved the Indy stations back in college and the O2's were amazing in their time, but most of the work those systems could do can now be done on commodity hardware, so SGI had to find a new reason to exist. Whether this system is enough to keep the grim reaper away is left to be seen.

    --
    There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    1. Re:Delaying the inevitable? by Twirlip+of+the+Mists · · Score: 5, Insightful

      I can't believe this got moderated as "insightful." Crap like Indys and O2s is what put SGI in a bad place to begin with. SGI always had fantastic graphics technology and a kick-ass operating system. When they tried to sell low-end workstations-- Indys and O2s running IRIX, and all the stupid stuff with Intel machines running NT and Linux-- their net revenues went into the toilet. SGI's biggest sources of revenue have always been scientific and technical computing customers, the government, and the petrochemical/geological industries. It's when SGI de-focuses to talk about stuff like PCs with fancy cases or video servers or data mining software that they start to lose their way.

      This isn't SGI finding a new reason to exist. This is SGI going back to what has always been one of its best reasons to exist. Over time, SGI's technical lead in graphics has diminished, fueled primarily by (believe it or not) home computer games. But even now, nobody can touch SGI for high-performance scalable servers like the 3900.

      --

      I write in my journal
    2. Re:Delaying the inevitable? by sql*kitten · · Score: 3, Informative

      I can't believe this got moderated as "insightful." Crap like Indys and O2s is what put SGI in a bad place to begin with. SGI always had fantastic graphics technology and a kick-ass operating system. When they tried to sell low-end workstations-- Indys and O2s running IRIX, and all the stupid stuff with Intel machines running NT and Linux-- their net revenues went into the toilet.

      Not quite true. After all, in 1994 an Indy had better price/performance than a comparable Pentium system... and a Pentium couldn't touch a fully loaded Indy. With better marketing, SGI could have dominated the high end 2D and low end 3D space, driving out Apple and Intergraph, and continued to hold high-end 3D. I agree that NT was a colossal mistake for them, and they haven't recovered from that mistake yet.

      It's when SGI de-focuses to talk about stuff like PCs with fancy cases or video servers or data mining software that they start to lose their way.

      SGI servers are fantastic for large databases; the features that make them great for rendering and number crunching (high memory bandwidth, very fast disk I/O, single system image) can easily be applied to databases. The Origins should be wiping the floor with Sun's Fire range. It's a marketing failure, not a technology failure.

      This isn't SGI finding a new reason to exist. This is SGI going back to what has always been one of its best reasons to exist. Over time, SGI's technical lead in graphics has diminished, fueled primarily by (believe it or not) home computer games. But even now, nobody can touch SGI for high-performance scalable servers like the 3900.

      It has diminished, true, but it still exists. There isn't a PC that can touch the Fuel workstation, for example.

  19. Re:Heating? by Twirlip+of+the+Mists · · Score: 5, Interesting

    I'd worry about the bus chipset heating up more than the processors.

    It does. The Bedrock chip is both considerably larger and considerably hotter than the R14000A is. (Bedrock is the memory controller, node crossbar, and "bus" arbitrator.)

    As to your other comment, SGI got a lot for their money when they bought Cray back in the mid 90's. They took a lot of good Cray technology-- like crossbar-based NUMA system design principles-- and incorporated them into their large server systems. I believe SGI was the first company-- other than Cray itself-- to break the one-hundred CPU barrier on a single system image. (The T3 series was a monster, but I don't recall exactly how many CPUs you could cram into one.)

    I think it was Seymour himself who once said, "A supercomputer is a device for turning compute-bound problems into I/O bound problems."

    --

    I write in my journal
  20. Cooling Fans = Wind Tunnel by Brigadier · · Score: 5, Funny


    Anyone see the large image of this thing? It has like 10 6" wide cooling fans. Walking by this thing will be like walking by a turbine jet engine. I can't wait for the Reader's Digest story "Sucked into the Origin 3000: How I Survived".

    http://www.sgi.com/cgi-bin/download.cgi?/newsroom/press_releases/2002/november/images/origin3900_1_jpg.zip

  21. Re:Is it such a good new? by Twirlip+of+the+Mists · · Score: 5, Funny

    I remember reading an article on Slashdot some time ago on how processors were becoming so hot that at the current trend, they would be hotter than nuclear reactors by 2025.

    When I got up this morning, it was 59 F outside. Now, just after lunch, it's over 65 F. If this trend continues, it will be hot enough to melt lead outside by next spring!

    Beware statistical projections.

    --

    I write in my journal
  22. From the article by greenhide · · Score: 5, Funny

    A beefed-up system with 128 processors and 64MB of memory sells for $2.9 million.

    Imagine how much the version with 128 MB must cost!

    --
    Karma: Chevy Kavalierma.
  23. Woah by teslatug · · Score: 5, Funny

    Nice Rack!

  24. Flops are not everything by halfelven · · Score: 5, Interesting

    Sure, if you buy a ton of second-hand peecees and glue them together in a Beowulf, you have lots and lots of flops (= CPU power).

    But the flops are not everything. The problem with clusters is the network latency when the nodes talk to each other. That latency is small for your average network application, but immense for a supercomputer trying to make all its CPUs talk together. This is why there are entire classes of problems that cannot be solved properly on clusters (non-parallelizable problems).

    As opposed to that, an SGI supercomputer has the inter-CPU latency orders of magnitude lower. Same GFlops per total (same CPU power), but certain problems are solved orders of magnitude faster.

    That's the power of latency. ;-)
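
    To make the argument concrete, here is a toy model, not a benchmark. Every number in it is an assumption picked for illustration (the amount of work, the number of synchronization steps, and both latency values), and `run_time` is a hypothetical helper: it pretends the compute divides perfectly across CPUs and that each synchronization step costs exactly one interconnect latency.

```python
def run_time(total_work_s, cpus, sync_steps, latency_s):
    """Toy model: perfectly divided compute plus one latency per sync step."""
    return total_work_s / cpus + sync_steps * latency_s


# All values below are assumptions chosen for illustration only.
WORK_S = 1000.0             # total CPU-seconds of computation
CPUS = 128                  # same CPU count (same total flops) in both cases
SYNC_STEPS = 10_000_000     # times the CPUs must wait on each other
CLUSTER_LATENCY_S = 100e-6  # assumed network latency per step
NUMA_LATENCY_S = 1e-6       # assumed shared-memory latency per step

cluster_time = run_time(WORK_S, CPUS, SYNC_STEPS, CLUSTER_LATENCY_S)
numa_time = run_time(WORK_S, CPUS, SYNC_STEPS, NUMA_LATENCY_S)

print(f"Cluster:      {cluster_time:.1f} s")
print(f"Single image: {numa_time:.1f} s")
print(f"Ratio:        {cluster_time / numa_time:.1f}x")
```

    With identical CPU power on both sides, the run time in this model is dominated by the latency term as soon as the problem needs frequent synchronization.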

  25. Dedicated Application Computing by yoink! · · Score: 4, Informative

    Check out Nvidia's data centers. Beware... windows media format warning.

    Notice how many times the word linux is used...

  26. Re:SGI's Gettin' Some by bmajik · · Score: 3, Interesting

    They're not out of the woods by any means.

    History speaks pretty clearly about what happens to companies that marginalize their business into making 1-offs for infinite-budget DoD contracts and agencies. Eventually, projects get cancelled, line items in budgets get axed, and whole departments are re-orged into something different.

    Cray, anyone ? Cray-Research basically went under when the Cray-3 contract was axed. They were counting on that single-machine to keep them afloat. They futzed around with GaAs custom process and never got it quite working right, and then the cold war ended and with it the justification for subsidizing a maker of 1-off supercomputers.

    (Incidentally, the purchase of Cray is what really broke SGI's back. 50% more employees, 2% more market cap, and the O2k/O3k technology came from Stanford, not Cray) SGI bought itself into the supercomputing space with the Cray acquisition, but their sales reps didn't know what to sell... T3, vector, or Origin. It bled the company pretty badly.

    Nobody argues that right now, there are some things for which there simply isn't any other rational choice besides SGI. In the early 90s, that was "anything with video, at all". Look how that market has all but vanished for them.
    The problem is that the number of markets for which SGI is the only choice is shrinking and will continue to shrink. Only the institutions that need to be 1-3 years ahead of the curve will pay the huge markup for it. The big advantage of the O3k system is, as you point out, the single-system image. But this is only really advantageous for lazy programmers, and when you're talking 3m for a machine to do scientific or simulation work, i suspect a lot of the code running on these is very custom, and NOT done by lazy programmers. So the brilliant thinking SGI has put into the hardware can sometimes be beaten by domain-specific software. Eg, let's say that MOSIX and 10Gig ethernet advances to the point that you can build a 1024p 512 node cluster, where the backbone (10Gb ethernet) is constructed in the same hypercube fabric as the NUMAlink cables, and MOSIX can with software emulate the memory/process/thread migration that O3k is doing now....

    then will 2.9m for a machine still seem justified ?... a 512 node wintel cluster is cheaper than 2.9m if the node cost is under about 5500. How many x86 boxes do you know of that cost 5500.. even with 2 procs, a few gb of ram, and 4 or 5 10Gb ethernet controllers (so that each node is n-way connected in the same hypercube fabric that O2k and o3k provide)
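
    The break-even arithmetic is easy to check. A minimal sketch, assuming only the $2.9 million list price and the 512-node count from this thread; the function name is invented, and switches, cabling, racks, and admin time are left out, so the real per-node budget would be lower than what this prints.

```python
def break_even_node_cost(system_price, node_count):
    """Highest per-node price at which the cluster is no more expensive."""
    return system_price / node_count


ORIGIN_LIST_PRICE = 2_900_000  # USD, 128-processor Origin 3900, per the thread
CLUSTER_NODES = 512            # hypothetical dual-processor x86 nodes

threshold = break_even_node_cost(ORIGIN_LIST_PRICE, CLUSTER_NODES)
print(f"Break-even cost per node: ${threshold:,.2f}")
```

    That lands a little above the "about 5500" figure.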

    --
    My opinions are my own, and do not necessarily represent those of my employer.
  27. Here's what you're missing by halfelven · · Score: 4, Interesting
    ...the single-system image... is only really advantageous for lazy programmers...

    There are entire classes of problems which cannot be solved fast enough on clusters, but only on single-image systems. Anything that cannot be made into a parallel algorithm falls into that category.

    ...let's say that Mosix and 10Gig ethernet advances to the point...

    With networked clusters you're always going to have latencies, orders of magnitude higher than with single-image supercomputers.
    Sure, perhaps in 10 or 15 years, we're going to have network latencies as small as those of a PCI bus, but I'm not really talking about a future that far off. Until then, clusters will be slow for certain problems. Deal with it. :-P
    1. Re:Here's what you're missing by Fulcrum+of+Evil · · Score: 3, Interesting

      With networked clusters you're always going to have latencies, orders of magnitude higher than with single-image supercomputers.

      While your point about ethernet latency is valid, you should be aware that, for somewhat more money, you can get 2gb throughput and about 7us latency. More info at myri.com.

      The gap between supercomputer and desktop is getting narrower each year. Eventually you will buy your computer by the pound.

      --
      "We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
  28. Re:SGI's Gettin' Some by Twirlip+of+the+Mists · · Score: 5, Insightful

    Cray-Research basically went under when the Cray-3 contract was axed.

    Cray has already taken more than $25 million in orders for the X1, a computer that hasn't even been built yet. Cray has had a rough time, but they're doing just fine.

    lets say that MOSIX and 10Gig ethernet advances

    What if it does? Bandwidth between nodes isn't as big a problem as latency in that case. No matter how fast-- in terms of bits per second-- your network transport is, you're always going to have latencies that are a million times higher than node-to-node latencies inside a NUMA system like the Origin. Seriously, a million times; we're talking milliseconds versus nanoseconds here. Your dismissal of single-system-image designs in favor of cluster designs shows a distinct lack of vision on your part, I'm afraid.

    then will 2.9m for a machine still seem justified ?

    If you set up the hypothetical situation such that the less-expensive system does everything that the more-expensive system can do, then no, of course the more-expensive system isn't justifiable. But that's not reality. SGI can deliver 1,024-processor systems right now. You can call them up and place an order for a 512-processor system right out of their main price list. (Bigger systems are special deals, but the 512-processor configuration has its own part number, just like a workstation or a monitor.)

    Two or three years from now, when everything you just described is possible, let's see what SGI has in its price book and revisit the question. I imagine the answer then will be the same as the answer now, just with the facts ratcheted up a few notches. "Yeah," you'll say, "SGI can deliver 8 kiloprocessors for $3 million, but is it justified? A 2 kilonode wintel cluster is cheaper...."

    --

    I write in my journal
  29. Just how dense is it? by Admiral+Burrito · · Score: 5, Funny

    Client:
    GET / HTTP/1.1
    Host: densestserver.sgi.com

    Server:
    Um... What's that?

    Client:
    Do you not understand HTTP 1.1?

    Server:
    Of course I do...?

    Client:
    Well then,
    GET / HTTP/1.1
    Host: densestserver.sgi.com

    Server:
    Okay... Would you like that biggie-sized?

    Client:
    wtf?

    Server:
    Oh, you want a web page. Okay, I get it now.

    Client:
    Great. Now send it, please.

    Server:
    Send what?

    Client:
    *sigh* Nevermind.

    User:
    Huh? What does "500 Server Error: Server too dense" mean?

  30. Densest server has 336 processors per rack... by raynet · · Score: 3, Interesting

    RLX Technologies has a server based on Transmeta Crusoe chip and it can hold 24 CPUs in 3U space, giving 336 processors per rack (and 336GB of RAM and 27TB of HDD :)

    See promo here..
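
    For anyone checking the per-rack number, here is a small sketch of the arithmetic. The 42U rack height is an assumption (it is not stated here, but it is the height that makes 24 CPUs per 3U come out to 336); the superbrick figures are the 16 processors per 4U and 128 per rack from the story summary. Names are illustrative.

```python
RACK_UNITS = 42  # assumed full-height rack; not stated in the thread


def processors_per_rack(cpus_per_chassis, chassis_units, rack_units=RACK_UNITS):
    """Processors in a rack filled with whole chassis of the given height."""
    return (rack_units // chassis_units) * cpus_per_chassis


# RLX: 24 CPUs per 3U chassis (from this comment).
rlx_per_rack = processors_per_rack(24, 3)
rlx_per_unit = 24 / 3

# SGI: 16 CPUs per 4U superbrick, 128 CPUs per rack (from the story summary).
sgi_per_unit = 16 / 4
sgi_units_used = (128 // 16) * 4

print(f"RLX processors per rack: {rlx_per_rack}")
print(f"RLX processors per U:    {rlx_per_unit:.0f}")
print(f"SGI processors per U:    {sgi_per_unit:.0f}")
print(f"SGI rack units used by superbricks: {sgi_units_used}")
```

    Note this counts sockets only; it says nothing about whether the processors share a single system image.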

    --
    - Raynet --> .