SGI Introduces World's Densest Server
Twirlip of the Mists writes "Today SGI announced the Origin 3900 server, the world's densest computer. How dense? How about 16 MIPS R14000A processors and 32 GB of RAM in a 4-rack-unit 'superbrick,' for a grand total of 128 processors and 256 GB of RAM in a single rack. That makes the new machine the densest single-system-image computer in the world; it's even denser than most blade systems. Just for fun, the server also includes a whole bunch of 64-bit, 133 MHz PCI-X slots (from 11 up to hundreds and hundreds, depending on configuration). There's coverage of the announcement on ZDNet, CNET, and InfoWorld, as well as on SGI's own site."
Isn't that the system requirement for the up and coming Doom III?
~S
Now where do we find the world's densest admin to run it?
slashdot: where everyone yells sarcastic metaphors to themselves to understand the issue
"...and more on lessening heat dissipation..."
Correct me if I'm wrong, but wouldn't you want to *increase* heat dissipation?
Project Steve
Response: The boys that cried "Beowulf!".
This record goes to Emmanuel at the little bistro on Rue de Bach just off Blvd. St. Michel in Paris.
Help fight continental drift.
I meant to mention this in my submission, but it slipped my mind. The R14000A only consumes 17 watts of power. Four of them, plus the Bedrock memory controller chip, plus up to 8 GB of RAM, fit on a board inside a 1 RU clearance. Four of them, plus some nifty backplane hardware, fit into a "superbrick," meaning sixteen processors in 4 RU.
As far as heat loading goes, the "superbrick" is basically one big wind tunnel, with giant fans on the front and ventilation out the back. It pumps a lot of heat into the room, but the temperature in and around the CPUs is really pretty low. I think it peaks around 35 C.
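For anyone who wants to check the density and power numbers, here is a rough back-of-the-envelope sketch. It uses only the figures quoted in this thread; the constant names are made up for illustration, and the wattage covers the CPUs alone (no Bedrock, RAM, or fans).

```python
# Figures quoted in the thread; names are hypothetical, chosen for this sketch.
CPUS_PER_BOARD = 4          # four R14000As per node board
BOARDS_PER_SUPERBRICK = 4   # four 1 RU boards per superbrick
SUPERBRICK_RU = 4           # rack units per superbrick
CPUS_PER_RACK = 128         # from the announcement
WATTS_PER_CPU = 17          # R14000A power draw quoted above

cpus_per_superbrick = CPUS_PER_BOARD * BOARDS_PER_SUPERBRICK
superbricks_per_rack = CPUS_PER_RACK // cpus_per_superbrick
compute_ru = superbricks_per_rack * SUPERBRICK_RU
cpus_per_ru = cpus_per_superbrick / SUPERBRICK_RU

# CPU power only; everything else in the rack is left out on purpose.
cpu_watts_per_board = CPUS_PER_BOARD * WATTS_PER_CPU
cpu_watts_per_rack = CPUS_PER_RACK * WATTS_PER_CPU

print(f"{cpus_per_superbrick} CPUs per superbrick, {cpus_per_ru:.0f} per RU")
print(f"{superbricks_per_rack} superbricks use {compute_ru} RU for {CPUS_PER_RACK} CPUs")
print(f"CPU power: {cpu_watts_per_board} W per board, {cpu_watts_per_rack} W per rack")
```

So the CPUs themselves account for only a couple of kilowatts per rack; the rest of the heat load comes from parts this sketch does not model.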
I write in my journal
Commenting on how the new Origin systems are denser than any other single-image system, and then comparing them to the current blade fad to make your point, is a bit silly. Blades are separate machines (unless they are Sun, in which case they are the current desktop line); this system is a single machine. I'm not entirely certain about this density claim either. Doesn't Sun fit 128 processors in a rack with the Fire 15Ks?
What we need is faster, cheaper hardware that makes sense!
The 128-processor Origin 3900 lists for $2.9 million. There's nothing "cheaper" about this. Faster, yeah; this is one of-- not "the," but one of-- the fastest computers in the world. And it's the densest. But it's nowhere near cheap.
I write in my journal
These servers are pointless in most datacenters. In order to fill one rack with this much horsepower, you would need at least two empty racks next to it to compensate for the power draw and (much) increased cooling needs. I would argue that the target market for this equipment is government labs, research institutes and universities--not usually starved for floor space.
Procter & Gamble, for example, uses an SGI system to study the aerodynamics of Pringle's potato chips
No, it's not. This is a single-system-image server. The 128-processor rack boots a single kernel. (In fact, you can connect four 128-p racks together to make a 512-p system, and larger systems than that are supported under special contract to SGI. I believe NASA Ames has a 1,024-p.)
The four-processor, 1-unit server you talked about stops there: at four processors. You can't compare that to a system that scales to be 256 times that size.
I write in my journal
How about we calculate density by flops or something else useful? I mean, how difficult would it be to cram a butt load of Pentiums in a rack? Yeah, well, how much calculation can they do?
... that thing is mammoth with 5,120 processors.
...
... ramble ramble ...
Let's cruise on over to the Top 500 and use their handy-dandy HTML list to view 'most powerful chip'. This unfortunately requires a little calc work because they failed to include this number in their table.
#1 NEC Earth-Simulator 35,860.00 GFlops using 5,120 Processors -- WOW!
But that's only 7 GFlops per processor
Now let's look at a slightly different design
#14 Hitachi SR8000-F1/168 1,653.00 GFlops using 168 Processors -- Hot DAMN!!
This is more like it. They're pulling 9.84 GFlops per processor. With their architecture they could pull off the Earth-Simulator's GFlop rate with 3,645 processors - That's 28% less computer doing the same amount of work. Which means if the Earth-Simulator had been constructed with Hitachi's hardware, they could have been pulling 50,380 GFlops in the same cubic footage.
Now this is all rambling that assumes that the processors are similar in size. Which probably isn't true. But they are also getting more power out of less hardware, and it is rare that THAT isn't a bonus.
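The "little calc work" is easy to redo. This is just a sketch of the arithmetic using the two Top 500 entries quoted above; the variable names are made up, and it inherits the same assumption that the processors are comparable in size.

```python
import math

# (GFlops, processor count) as quoted from the Top 500 list above.
earth_simulator = (35860.0, 5120)
hitachi_sr8000 = (1653.0, 168)

def gflops_per_processor(gflops, processors):
    """Average GFlops delivered per processor."""
    return gflops / processors

es_rate = gflops_per_processor(*earth_simulator)
hitachi_rate = gflops_per_processor(*hitachi_sr8000)

# Hitachi-class processors needed to match the Earth-Simulator's total.
processors_needed = math.ceil(earth_simulator[0] / hitachi_rate)
fewer_fraction = 1 - processors_needed / earth_simulator[1]

# Total if all 5,120 slots held Hitachi-class processors.
hypothetical_gflops = earth_simulator[1] * hitachi_rate

print(f"Earth-Simulator: {es_rate:.2f} GFlops per processor")
print(f"Hitachi:         {hitachi_rate:.2f} GFlops per processor")
print(f"Processors needed: {processors_needed} ({fewer_fraction:.1%} fewer)")
print(f"Same processor count at Hitachi's rate: {hypothetical_gflops:,.0f} GFlops")
```

The results land close to the figures quoted above; the small differences come from rounding the per-processor rate before multiplying.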
No sig for you. YOU GET NO SIG!
Just wait for the technology to trickle down. You'll be able to get something on par with this for $3000 in about, oh say......30 yrs.
>>>>>> Chewie, take the professor in the back and plug him into the hyperdrive.
Obviously, that should be 64 gigabytes of RAM, not 64 megs.
The interesting thing about this system will be the minimum RAM required, rather than the maximum RAM capacity. The original Origin 3000 required some minimal amount of RAM-- 256 or 512 MB or something-- for every four processors. I'm not sure if this new model has the same requirement, but I'd imagine that it does. (It's an architectural thing. Every node board has to have some RAM on it, because that node board may be nominated at boot time to act as the boot master, among other reasons.)
If that's true, then a 128-processor system has 32 node boards and would require a minimum of either 8 or 16 GB of RAM, depending on whether you can put 256 MB on a node board.
I write in my journal
(I'm answering these questions off-the-cuff, so if I mistype any details, sorry.)
If you know what a first-generation C-brick looks like, imagine squeezing that board into a one-rack-unit form factor and stacking four of them together.
Each superbrick includes four boards, spaced one unit apart, with four R14Ks, the Bedrock, and some RAM. The boards are connected with an internal eight-port crossbar router, making the superbrick a self-contained 16-processor unit. Externally, the superbrick connects to the base I/O brick via XIO+; the base I/O brick contains stuff like the system disk and the first 11 PCI-X slots.
I'm not positive how the superbricks are configured. Theoretically, you can partially populate them in one-node increments (meaning 4 CPUs and some RAM), but SGI may or may not sell them that way for manufacturing and QA reasons.
I believe the CPUs come with 8 MB of s-cache each.
The CPU-to-CPU and CPU-to-RAM bandwidths vary depending on the topology you're crossing, but I believe the minimum is 1.6 GB/s unidirectional, or 3.2 GB/s bidirectional. Intra-node bandwidths are somewhat higher, I believe.
No, the CPUs are regular single-core MIPS R14000As. They're tiny chips that don't consume much power, so you can really squeeze 'em in there.
Keep an eye on techpubs.sgi.com, because SGI will be releasing the developer and owner docs for the new system there shortly. (By "shortly" I mean as soon as a few hours or as long as a few weeks, depending on when the docs get released.) You'll find all the technical data you want when those docs go up.
I write in my journal
www.clustercompute.com
Well, on a per-MIPS basis maybe, but then again I could use faster CPUs today.
MP3 Search Engine
There are 128-CPU Intel/AMD solutions that fit in a single rack. I know of at least 3 companies that produce them and they are cheap.
There are a few blade systems that can squeeze 128 or more processors into a rack, but those are blade systems, not single-system-image compute servers. You can't use a blade server to do the job of an Origin 3900. (Of course, the converse is also true; you wouldn't buy an Origin 3900 to do something you could do with a blade server instead.)
SGI tends to produce exactly what the customer wants. It's just that their customer is more often than not the federal government, or a very large corporation. It's not well-known-- in fact, for a time it was classified-- but SGI designed, manufactured, and sold an entire line of what were basically DSP coprocessor units specifically for Lockheed's satellite division. Called the "tensor processing unit," each one was basically an expansion module for the Origin 2000. SGI built it just like a commercial product, complete with documentation and everything, and manufactured them in large quantities. It's just that you couldn't buy them unless you were Lockheed.
It's only when SGI tries to branch out that they do poorly. I don't know WTF they were thinking when they decided to try selling inexpensive (relative to other SGI products) workstations running NT or Linux. That was just insane. But as SGI strips more and more of that BS away, they get closer and closer to being a sound company again.
I write in my journal
Is this just delaying the death of SGI or signaling a new focus and niche for the company? I loved the Indy stations back in college and the O2s were amazing in their time, but most of the work those systems could do can now be done on commodity hardware, so SGI had to find a new reason to exist. Whether this system is enough to keep the grim reaper away remains to be seen.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
I'd worry about the bus chipset heating up more than the processors.
It does. The Bedrock chip is both considerably larger and considerably hotter than the R14000A is. (Bedrock is the memory controller, node crossbar, and "bus" arbitrator.)
As to your other comment, SGI got a lot for their money when they bought Cray back in the mid 90's. They took a lot of good Cray technology-- like crossbar-based NUMA system design principles-- and incorporated them into their large server systems. I believe SGI was the first company-- other than Cray itself-- to break the one-hundred CPU barrier on a single system image. (The T3 series was a monster, but I don't recall exactly how many CPUs you could cram into one.)
I think it was Seymour himself who once said, "A supercomputer is a device for turning compute-bound problems into I/O bound problems."
I write in my journal
Anyone see the large image of this thing? It has like ten 6"-wide cooling fans. Walking by this thing will be like walking by a turbine jet engine. I can't wait for the Reader's Digest "Sucked into the Origin 3000: How I Survived."
http://www.sgi.com/cgi-bin/download.cgi?/newsro
I remember reading an article on Slashdot some time ago on how processors were becoming so hot that, at the current trend, they would be hotter than nuclear reactors by 2025.
When I got up this morning, it was 59 F outside. Now, just after lunch, it's over 65 F. If this trend continues, it will be hot enough to melt lead outside by next spring!
Beware statistical projections.
I write in my journal
A beefed-up system with 128 processors and 64MB of memory sells for $2.9 million.
Imagine how much the version with 128 MB must cost!
Karma: Chevy Kavalierma.
Nice Rack!
Sure, if you buy a ton of second-hand peecees and glue them together in a Beowulf, you have lots and lots of flops (= CPU power).
;-)
But the flops are not everything. The problem with clusters is the network latency when the nodes talk to each other. That latency is small for your average network application, but immense for a supercomputer trying to make all its CPUs talk together. This is why there are entire classes of problems that cannot be solved properly on clusters (non-parallelizable problems).
As opposed to that, an SGI supercomputer has the inter-CPU latency orders of magnitude lower. Same GFlops per total (same CPU power), but certain problems are solved orders of magnitude faster.
That's the power of latency.
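A toy model makes the point concrete. Everything here is an assumption for illustration: the job size, the number of synchronization steps, and the two latency values are invented, and the model pretends the compute divides perfectly across CPUs.

```python
def run_time(compute_seconds, cpus, sync_steps, latency_seconds):
    """Toy model: perfectly divided compute plus one latency cost per sync step."""
    return compute_seconds / cpus + sync_steps * latency_seconds

# Hypothetical tightly coupled job: same CPU count and flops on both machines.
COMPUTE_SECONDS = 1000.0   # total CPU-seconds of arithmetic (assumed)
CPUS = 128
SYNC_STEPS = 1_000_000     # times the CPUs must wait on each other (assumed)

CLUSTER_LATENCY = 1e-3     # assumed network round trip, in seconds
NUMA_LATENCY = 1e-6        # assumed in-machine latency, in seconds

cluster_time = run_time(COMPUTE_SECONDS, CPUS, SYNC_STEPS, CLUSTER_LATENCY)
numa_time = run_time(COMPUTE_SECONDS, CPUS, SYNC_STEPS, NUMA_LATENCY)

print(f"cluster: {cluster_time:.1f} s")
print(f"NUMA:    {numa_time:.1f} s")
print(f"ratio:   {cluster_time / numa_time:.0f}x")
```

With these made-up numbers the arithmetic takes under eight seconds on either machine; nearly all of the cluster's run time is spent waiting on the network.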
Check out Nvidia's data centers. Beware... Windows Media format warning.
Notice how many times the word Linux is used...
They're not out of the woods by any means.
History speaks pretty clearly about what happens to companies that marginalize their business into making 1-offs for infinite-budget DoD contracts and agencies. Eventually, projects get cancelled, line items in budgets get axed, and whole departments are re-orged into something different.
Cray, anyone? Cray-Research basically went under when the Cray-3 contract was axed. They were counting on that single machine to keep them afloat. They futzed around with a custom GaAs process and never got it quite working right, and then the Cold War ended and with it the justification for subsidizing a maker of 1-off supercomputers.
(Incidentally, the purchase of Cray is what really broke SGI's back: 50% more employees, 2% more market cap, and the O2k/O3k technology came from Stanford, not Cray.) SGI bought itself into the supercomputing space with the Cray acquisition, but their sales reps didn't know what to sell: T3, vector, or Origin. It bled the company pretty badly.
Nobody argues that right now, there are some things for which there simply isn't any other rational choice besides SGI. In the early 90s, that was "anything with video, at all". Look how that market has all but vanished for them.
The problem is that the number of markets for which SGI is the only choice is shrinking and will continue to shrink. Only the institutions that need to be 1-3 years ahead of the curve will pay the huge markup for it. The big advantage of the O3k system is, as you point out, the single-system image. But this is only really advantageous for lazy programmers, and when you're talking 3m for a machine to do scientific or simulation work, I suspect a lot of the code running on these is very custom, and NOT done by lazy programmers. So the brilliant thinking SGI has put into the hardware can sometimes be beaten by domain-specific software. Eg, lets say that MOSIX and 10Gig ethernet advances to the point that you can build a 1024p 512 node cluster, where the backbone (10Gb Ethernet) is constructed in the same hypercube fabric as the NUMAlink cables, and MOSIX can emulate in software the memory/process/thread migration that O3k is doing now....
then will 2.9m for a machine still seem justified ?... A 512-node wintel cluster is cheaper than 2.9m if the node cost is under about 5500. How many x86 boxes do you know of that cost 5500, even with 2 procs, a few GB of RAM, and 4 or 5 10Gb Ethernet controllers (so that each node is n-way connected in the same hypercube fabric that O2k and O3k provide)?
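The break-even point is a one-line division. This sketch uses only the two numbers from the comment and deliberately ignores switches, cabling, racks, power, and admin time, so treat it as a lower bound on the cluster's real cost.

```python
ORIGIN_PRICE_USD = 2_900_000   # list price quoted in this thread
CLUSTER_NODES = 512            # node count from the hypothetical cluster

def cluster_cost(node_cost_usd, nodes=CLUSTER_NODES):
    """Hardware cost of the cluster, counting nodes only (an assumption)."""
    return node_cost_usd * nodes

# Per-node price at which the cluster costs exactly as much as the Origin.
break_even_node_cost = ORIGIN_PRICE_USD / CLUSTER_NODES

print(f"Break-even node cost: ${break_even_node_cost:,.2f}")
print(f"512 nodes at $5,500 each: ${cluster_cost(5500):,}")
```

The exact break-even comes out a little above the "about 5500" figure, so the comment's rounding is on the conservative side.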
My opinions are my own, and do not necessarily represent those of my employer.
There are entire classes of problems which cannot be solved fast enough on clusters, but only on single-image systems. Anything that cannot be made into a parallel algorithm falls into that category.
With networked clusters you're always going to have latencies, orders of magnitude higher than with single-image supercomputers.
Sure, perhaps in 10 or 15 years, we're going to have network latencies as small as those of a PCI bus, but I'm not really talking about a future that far off. Until then, clusters will be slow for certain problems. Deal with it.
Cray-Research basically went under when the Cray-3 contract was axed.
Cray has already taken more than $25 million in orders for the X1, a computer that hasn't even been built yet. Cray has had a rough time, but they're doing just fine.
lets say that MOSIX and 10Gig ethernet advances
What if it does? Bandwidth between nodes isn't as big a problem as latency in that case. No matter how fast-- in terms of bits per second-- your network transport is, you're always going to have latencies that are a million times higher than node-to-node latencies inside a NUMA system like the Origin. Seriously, a million times; we're talking milliseconds versus nanoseconds here. Your dismissal of single-system-image designs in favor of cluster designs shows a distinct lack of vision on your part, I'm afraid.
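To see why a fatter pipe doesn't help much, model one small message as latency plus size over bandwidth. This is a simplified sketch: the 64-byte message size and the specific latency values are assumptions picked for illustration, not measurements of any real system.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bits_per_s):
    """Simple model: fixed latency plus serialization time for the payload."""
    return latency_s + (message_bytes * 8) / bandwidth_bits_per_s

MESSAGE_BYTES = 64          # hypothetical small message between nodes
NETWORK_LATENCY = 1e-3      # assumed millisecond-scale network latency
LOW_LATENCY = 1e-6          # assumed latency a thousand times lower

gigabit = transfer_time(MESSAGE_BYTES, NETWORK_LATENCY, 1e9)
ten_gigabit = transfer_time(MESSAGE_BYTES, NETWORK_LATENCY, 1e10)
low_latency = transfer_time(MESSAGE_BYTES, LOW_LATENCY, 1e9)

bandwidth_speedup = gigabit / ten_gigabit
latency_speedup = gigabit / low_latency

print(f"10x the bandwidth:    {bandwidth_speedup:.4f}x faster")
print(f"1/1000th the latency: {latency_speedup:.0f}x faster")
```

Under these assumptions, ten times the bandwidth buys almost nothing for small messages, while cutting latency changes the picture by hundreds of times.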
then will 2.9m for a machine still seem justified ?
If you set up the hypothetical situation such that the less-expensive system does everything that the more-expensive system can do, then no, of course the more-expensive system isn't justifiable. But that's not reality. SGI can deliver 1,024-processor systems right now. You can call them up and place an order for a 512-processor system right out of their main price list. (Bigger systems are special deals, but the 512-processor configuration has its own part number, just like a workstation or a monitor.)
Two or three years from now, when everything you just described is possible, let's see what SGI has in its price book and revisit the question. I imagine the answer then will be the same as the answer now, just with the facts ratcheted up a few notches. "Yeah," you'll say, "SGI can deliver 8 kiloprocessors for $3 million, but is it justified? A 2 kilonode wintel cluster is cheaper...."
I write in my journal
Client:
GET / HTTP/1.1
Host: densestserver.sgi.com
Server:
Um... What's that?
Client:
Do you not understand HTTP 1.1?
Server:
Of course I do...?
Client:
Well then,
GET / HTTP/1.1
Host: densestserver.sgi.com
Server:
Okay... Would you like that biggie-sized?
Client:
wtf?
Server:
Oh, you want a web page. Okay, I get it now.
Client:
Great. Now send it, please.
Server:
Send what?
Client:
*sigh* Nevermind.
User:
Huh? What does "500 Server Error: Server too dense" mean?
RLX Technologies has a server based on the Transmeta Crusoe chip that can hold 24 CPUs in 3U of space, giving 336 processors per rack (and 336GB of RAM and 27TB of HDD :)
See promo here..
- Raynet --> .