SGI Introduces World's Densest Server
Twirlip of the Mists writes "Today SGI announced the Origin 3900 server, the world's densest computer. How dense? How about 16 MIPS R14000A processors and 32 GB of RAM in a 4-rack-unit 'superbrick,' for a grand total of 128 processors and 256 GB of RAM in a single rack. That makes the new machine the densest single-system-image computer in the world; it's even denser than most blade systems. Just for fun, the server also includes a whole bunch of 64-bit, 133 MHz PCI-X slots (from 11 up to hundreds and hundreds, depending on configuration). There's coverage of the announcement on ZDNet, CNET, and InfoWorld, as well as on SGI's own site."
Isn't that the system requirement for the up and coming Doom III?
~S
Now where do we find the world's densest admin to run it?
slashdot: where everyone yells sarcastic metaphors to themselves to understand the issue
"...and more on lessening heat dissipation..."
Correct me if I'm wrong, but wouldn't you want to *increase* heat dissipation?
Project Steve
Response: The boys that cried "Beowulf!".
This record goes to Emmanuel at the little bistro on Rue de Bach just off Blvd. St. Michel in Paris.
Help fight continental drift.
I meant to mention this in my submission, but it slipped my mind. The R14000A only consumes 17 watts of power. Four of them, plus the Bedrock memory controller chip, plus up to 8 GB of RAM, fit on a board inside a 1 RU clearance. Four of them, plus some nifty backplane hardware, fit into a "superbrick," meaning sixteen processors in 4 RU.
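For anyone who wants to check the packaging numbers, here is a rough sketch of the arithmetic using only figures quoted in this thread (4 CPUs per board, 4 boards per 4 RU superbrick, 128 CPUs per rack, 17 W per CPU). The variable names are made up for illustration, and the power figure covers the CPUs alone, not the Bedrock chips, RAM, fans, or I/O.

```python
# Figures quoted in the thread; names are illustrative only.
CPUS_PER_NODE_BOARD = 4
NODE_BOARDS_PER_SUPERBRICK = 4
SUPERBRICK_HEIGHT_RU = 4
CPUS_PER_RACK = 128
WATTS_PER_CPU = 17  # R14000A only; excludes Bedrock, RAM, fans, I/O

cpus_per_superbrick = CPUS_PER_NODE_BOARD * NODE_BOARDS_PER_SUPERBRICK
superbricks_per_rack = CPUS_PER_RACK // cpus_per_superbrick
rack_units_used = superbricks_per_rack * SUPERBRICK_HEIGHT_RU
cpu_watts_per_rack = CPUS_PER_RACK * WATTS_PER_CPU

print("CPUs per superbrick:", cpus_per_superbrick)
print("Superbricks per rack:", superbricks_per_rack)
print("Rack units used by superbricks:", rack_units_used)
print("CPU-only power per rack (W):", cpu_watts_per_rack)
```

So the CPUs themselves account for only a couple of kilowatts per full rack; everything else in the rack adds to that.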
As far as heat loading goes, the "superbrick" is basically one big wind tunnel, with giant fans on the front and ventilation out the back. It pumps a lot of heat into the room, but the temperature in and around the CPUs is really pretty low. I think it peaks around 35 C.
I write in my journal
Commenting on how the new Origin systems are denser than any other single-image system, and then comparing them to the current blade fad to make your point, is a bit silly. Blades are separate machines (unless they are Sun, in which case they are the current desktop line); this system is a single machine. I'm not entirely certain about this density claim either. Doesn't Sun fit 128 processors in a rack with the Fire 15Ks?
Procter & Gamble, for example, uses an SGI system to study the aerodynamics of Pringle's potato chips
No, it's not. This is a single-system-image server. The 128-processor rack boots a single kernel. (In fact, you can connect four 128-p racks together to make a 512-p system, and larger systems than that are supported under special contract to SGI. I believe NASA Ames has a 1,024-p.)
The four-processor, 1-unit server you talked about stops there: at four processors. You can't compare that to a system that scales to be 256 times that size.
I write in my journal
How about we calculate density by flops or something else useful? I mean, how difficult would it be to cram a butt load of Pentiums in a rack? Yeah, well, how much calculation can they do?
... that thing is mammoth with 5,120 processors.
...
... ramble ramble ...
Let's cruise on over to the Top 500 and use their handy dandy HTML list to view 'most powerful chip'. This unfortunately requires a little calc work because they failed to include this number in their table.
#1 NEC Earth-Simulator 35,860.00 GFlops using 5,120 Processors -- WOW!
But that's only 7 GFlops per processor
Now let's look at a slightly different design
#14 Hitachi SR8000-F1/168 1,653.00 GFlops using 168 Processors -- Hot DAMN!!
This is more like it. They're pulling 9.84 GFlops per processor. With their architecture they could pull off the Earth-Simulator's GFlop rate with 3,645 processors - That's 28% less computer doing the same amount of work. Which means if the Earth-Simulator had been constructed with Hitachi's hardware, they could have been pulling 50,380 GFlops in the same cubic footage.
Now this is all rambling that assumes that the processors are similar in size. Which probably isn't true. But they are also getting more power out of less hardware, and it is rare that THAT isn't a bonus.
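The "little calc work" is easy to reproduce. This is just a sketch using the two Top 500 entries quoted above; the variable names are mine. Note that with the unrounded per-processor figure the last number comes out near 50,377 GFlops rather than 50,380, which is only a rounding difference.

```python
import math

# Top 500 figures as quoted in the comment above.
earth_sim_gflops, earth_sim_cpus = 35860.0, 5120
hitachi_gflops, hitachi_cpus = 1653.0, 168

earth_per_cpu = earth_sim_gflops / earth_sim_cpus
hitachi_per_cpu = hitachi_gflops / hitachi_cpus

# Hitachi-class processors needed to match the Earth Simulator's total.
cpus_needed = earth_sim_gflops / hitachi_per_cpu
saving = 1 - cpus_needed / earth_sim_cpus

# Earth Simulator's processor count at Hitachi's per-processor rate.
scaled_gflops = hitachi_per_cpu * earth_sim_cpus

print(f"Earth Simulator: {earth_per_cpu:.2f} GFlops per processor")
print(f"Hitachi SR8000:  {hitachi_per_cpu:.2f} GFlops per processor")
print(f"Processors needed: {math.ceil(cpus_needed)} ({saving:.1%} fewer)")
print(f"Scaled total: {scaled_gflops:,.0f} GFlops")
```

As the comment says, this treats a "processor" as the same kind of unit on both machines, which is a big assumption.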
No sig for you. YOU GET NO SIG!
Obviously, that should be 64 gigabytes of RAM, not 64 megs.
The interesting thing about this system will be the minimum RAM required, rather than the maximum RAM capacity. The original Origin 3000 required some minimal amount of RAM-- 256 or 512 MB or something-- for every four processors. I'm not sure if this new model has the same requirement, but I'd imagine that it does. (It's an architectural thing. Every node board has to have some RAM on it, because that node board may be nominated at boot time to act as the boot master, among other reasons.)
If that's true, then a 128-processor system has 32 node boards and would require a minimum of either 8 or 16 GB of RAM, depending on whether you can put 256 MB on a node board.
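A quick check of that minimum, assuming (as guessed above, not confirmed) one node board per four processors and a floor of either 256 MB or 512 MB per board. The function name is hypothetical.

```python
CPUS_PER_NODE_BOARD = 4  # from the thread: four R14000As per node board

def min_ram_gb(cpus, mb_per_board):
    """Minimum RAM in GB if every node board must carry mb_per_board."""
    boards = cpus // CPUS_PER_NODE_BOARD
    return boards * mb_per_board / 1024

for mb in (256, 512):  # the two guessed per-board minimums
    print(f"{mb} MB per board -> {min_ram_gb(128, mb):g} GB minimum for 128 CPUs")
```

With 32 node boards in a 128-processor system, those two guesses work out to 8 GB and 16 GB respectively.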
I write in my journal
Just an FYI - the CNet article (linked above) talks about its possible use on oil rigs - that type of mapping usually takes some horsepower and as usual, anything that is sea-based will be somewhat cramped for space!
(I'm answering these questions off-the-cuff, so if I mistype any details, sorry.)
If you know what a first-generation C-brick looks like, imagine squeezing that board into a one-rack-unit form factor and stacking four of them together.
Each superbrick includes four boards, spaced one unit apart, with four R14Ks, the Bedrock, and some RAM. The boards are connected with an internal eight-port crossbar router, making the superbrick a self-contained 16-processor unit. Externally, the superbrick connects to the base I/O brick via XIO+; the base I/O brick contains stuff like the system disk and the first 11 PCI-X slots.
I'm not positive how the superbricks are configured. Theoretically, you can partially populate them in one-node increments (meaning 4 CPUs and some RAM), but SGI may or may not sell them that way for manufacturing and QA reasons.
I believe the CPUs come with 8 MB of s-cache each.
The CPU-to-CPU and CPU-to-RAM bandwidths vary depending on the topology you're crossing, but I believe the minimum is 1.6 GB/s unidirectional, or 3.2 GB/s bidirectional. Intra-node bandwidths are somewhat higher, I believe.
No, the CPUs are regular single-core MIPS R14000As. They're tiny chips that don't consume much power, so you can really squeeze 'em in there.
Keep an eye on techpubs.sgi.com, because SGI will be releasing the developer and owner docs for the new system there shortly. (By "shortly" I mean as soon as a few hours or as long as a few weeks, depending on when the docs get released.) You'll find all the technical data you want when those docs go up.
I write in my journal
Right but wrong. The target market for this system is definitely government and university HPC labs, but those labs are definitely short of floor space. Putting more MIPS per floor tile is an important advancement.
I write in my journal
www.clustercompute.com
Well, on a per-MIPS basis maybe, but then again I could use faster CPUs today.
MP3 Search Engine
There are 128-CPU Intel/AMD solutions that fit in a single rack. I know of at least 3 companies that produce them and they are cheap.
There are a few blade systems that can squeeze 128 or more processors into a rack, but those are blade systems, not single-system-image compute servers. You can't use a blade server to do the job of an Origin 3900. (Of course, the converse is also true; you wouldn't buy an Origin 3900 to do something you could do with a blade server instead.)
SGI tends to produce exactly what the customer wants. It's just that their customer is more often than not the federal government, or a very large corporation. It's not well-known-- in fact, for a time it was classified-- but SGI designed, manufactured, and sold an entire line of what were basically DSP coprocessor units specifically for Lockheed's satellite division. Called the "tensor processing unit," each one was basically an expansion module for the Origin 2000. SGI built it just like a commercial product, complete with documentation and everything, and manufactured them in large quantities. It's just that you couldn't buy them unless you were Lockheed.
It's only when SGI tries to branch out that they do poorly. I don't know WTF they were thinking when they decided to try selling inexpensive (relative to other SGI products) workstations running NT or Linux. That was just insane. But as SGI strips more and more of that BS away, they get closer and closer to being a sound company again.
I write in my journal
I'd worry about the bus chipset heating up more than the processors.
It does. The Bedrock chip is both considerably larger and considerably hotter than the R14000A is. (Bedrock is the memory controller, node crossbar, and "bus" arbitrator.)
As to your other comment, SGI got a lot for their money when they bought Cray back in the mid 90's. They took a lot of good Cray technology-- like crossbar-based NUMA system design principles-- and incorporated them into their large server systems. I believe SGI was the first company-- other than Cray itself-- to break the one-hundred CPU barrier on a single system image. (The T3 series was a monster, but I don't recall exactly how many CPUs you could cram into one.)
I think it was Seymour himself who once said, "A supercomputer is a device for turning compute-bound problems into I/O bound problems."
I write in my journal
Anyone see the large image of this thing? It has like ten 6"-wide cooling fans. Walking by this thing will be like walking by a turbine jet engine. I can't wait for the Reader's Digest story, "Sucked into the Origin 3000: How I Survived."
http://www.sgi.com/cgi-bin/download.cgi?/newsro
I remember reading an article on Slashdot some time ago on how processors were becoming so hot that at the current trend, they would be hotter than nuclear reactors by 2025.
When I got up this morning, it was 59 F outside. Now, just after lunch, it's over 65 F. If this trend continues, it will be hot enough to melt lead outside by next spring!
Beware statistical projections.
I write in my journal
A beefed-up system with 128 processors and 64MB of memory sells for $2.9 million.
Imagine how much the version with 128 MB must cost!
Karma: Chevy Kavalierma.
Nice Rack!
I can't believe this got moderated as "insightful." Crap like Indys and O2s is what put SGI in a bad place to begin with. SGI always had fantastic graphics technology and a kick-ass operating system. When they tried to sell low-end workstations-- Indys and O2s running IRIX, and all the stupid stuff with Intel machines running NT and Linux-- their net revenues went into the toilet. SGI's biggest sources of revenue have always been scientific and technical computing customers, the government, and the petrochemical/geological industries. It's when SGI de-focuses to talk about stuff like PCs with fancy cases or video servers or data mining software that they start to lose their way.
This isn't SGI finding a new reason to exist. This is SGI going back to what has always been one of its best reasons to exist. Over time, SGI's technical lead in graphics has diminished, driven primarily by (believe it or not) home computer games. But even now, nobody can touch SGI for high-performance scalable servers like the 3900.
I write in my journal
Sure, if you buy a ton of second-hand peecees and glue them together in a Beowulf, you have lots and lots of flops (= CPU power).
;-)
But the flops are not everything. The problem with clusters is the network latency when the nodes talk to each other. That latency is small for your average network application, but immense for a supercomputer trying to make all its CPUs talk together. This is why there are entire classes of problems that cannot be solved properly on clusters (non-parallelizable problems).
By contrast, an SGI supercomputer has inter-CPU latency that is orders of magnitude lower. Same total GFlops (same CPU power), but certain problems are solved orders of magnitude faster.
That's the power of latency.
Actually, the spec sheet indicates that it is 8.9kW per rack (2.2kW for Drive arrays). That is on the high side, but liveable. (6kW is the max for "standard" cooling-- you can accommodate up to 10kW with a high delta-T cooling system. Water cooling comes into play after that.)
The value of shrinking it down is (as you allude to) not a real-estate issue, but more about the computing efficiencies of a denser package.
The HP blades (6U) are about 35kW nameplate per rack, with a real load of about 10-11kW. The energy savings of SGI might actually give it some value in comparison!
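The cooling thresholds mentioned above (6 kW for standard cooling, up to 10 kW with a high delta-T system, water beyond that) can be turned into a small lookup. This is only a sketch of that rule of thumb; the function name and labels are made up, and real data-center limits depend on the room.

```python
def cooling_class(rack_kw):
    """Classify a rack's heat load using the rule-of-thumb limits above."""
    if rack_kw <= 6.0:
        return "standard"
    if rack_kw <= 10.0:
        return "high delta-T"
    return "water cooling"

# Loads quoted in the thread: 8.9 kW for the SGI rack,
# roughly 10-11 kW real load for a rack of HP blades.
for label, kw in [("SGI rack", 8.9), ("HP blades, low", 10.0), ("HP blades, high", 11.0)]:
    print(f"{label}: {kw} kW -> {cooling_class(kw)}")
```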
Check out Nvidia's data centers. Beware... windows media format warning.
Notice how many times the word linux is used...
There are entire classes of problems which cannot be solved fast enough on clusters, but only on single-image systems. Anything that cannot be made into a parallel algorithm falls into that category.
With networked clusters you're always going to have latencies, orders of magnitude higher than with single-image supercomputers.
Sure, perhaps in 10 or 15 years, we're going to have network latencies as small as those of a PCI bus, but I'm not really talking about a future that far off. Until then, clusters will be slow for certain problems. Deal with it.
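To make the latency point concrete, here is a toy model: a job does a fixed amount of computation but has to stop and wait for one message at every synchronization step. All the numbers below are assumptions picked for illustration (not measurements of any real machine), and the function name is hypothetical.

```python
def run_time_s(compute_s, sync_steps, latency_s):
    """Toy model: total time = pure compute + one latency wait per sync step."""
    return compute_s + sync_steps * latency_s

COMPUTE_S = 60.0           # assumed: same total CPU work on both systems
SYNC_STEPS = 10_000_000    # assumed: a tightly coupled problem
NUMA_LATENCY_S = 1e-7      # assumed: ~100 ns inside a single-image machine
CLUSTER_LATENCY_S = 1e-4   # assumed: ~100 us over a cluster network

numa_time = run_time_s(COMPUTE_S, SYNC_STEPS, NUMA_LATENCY_S)
cluster_time = run_time_s(COMPUTE_S, SYNC_STEPS, CLUSTER_LATENCY_S)

print(f"single-image: {numa_time:.0f} s")
print(f"cluster:      {cluster_time:.0f} s")
print(f"slowdown:     {cluster_time / numa_time:.1f}x")
```

With few synchronization steps the two times are nearly identical, which is why clusters do fine on loosely coupled work; the gap only opens up when the CPUs have to talk constantly.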
Cray-Research basically went under when the Cray-3 contract was axed.
Cray has already taken more than $25 million in orders for the X1, a computer that hasn't even been built yet. Cray has had a rough time, but they're doing just fine.
lets say that MOSIX and 10Gig ethernet advances
What if it does? Bandwidth between nodes isn't as big a problem as latency in that case. No matter how fast-- in terms of bits per second-- your network transport is, you're always going to have latencies that are a million times higher than node-to-node latencies inside a NUMA system like the Origin. Seriously, a million times; we're talking milliseconds versus nanoseconds here. Your dismissal of single-system-image designs in favor of cluster designs shows a distinct lack of vision on your part, I'm afraid.
then will 2.9m for a machine still seem justified ?
If you set up the hypothetical situation such that the less-expensive system does everything that the more-expensive system can do, then no, of course the more-expensive system isn't justifiable. But that's not reality. SGI can deliver 1,024-processor systems right now. You can call them up and place an order for a 512-processor system right out of their main price list. (Bigger systems are special deals, but the 512-processor configuration has its own part number, just like a workstation or a monitor.)
Two or three years from now, when everything you just described is possible, let's see what SGI has in its price book and revisit the question. I imagine the answer then will be the same as the answer now, just with the facts ratcheted up a few notches. "Yeah," you'll say, "SGI can deliver 8 kiloprocessors for $3 million, but is it justified? A 2 kilonode wintel cluster is cheaper...."
I write in my journal
Client:
GET / HTTP/1.1
Host: densestserver.sgi.com
Server:
Um... What's that?
Client:
Do you not understand HTTP 1.1?
Server:
Of course I do...?
Client:
Well then,
GET / HTTP/1.1
Host: densestserver.sgi.com
Server:
Okay... Would you like that biggie-sized?
Client:
wtf?
Server:
Oh, you want a web page. Okay, I get it now.
Client:
Great. Now send it, please.
Server:
Send what?
Client:
*sigh* Nevermind.
User:
Huh? What does "500 Server Error: Server too dense" mean?