Specs On New SGI Onyx And Origin
An anonymous reader wrote in to tell us that
SGI has announced their latest and greatest MIPS-based computers, the Onyx and Origin 3000 line. Up to 1 TB RAM and 512 processors, all on a single system (not a cluster).
Beyond Boxes has a nice summary, too. This is definitely a great system for anyone who wants to
have their computer be the size of several refrigerators ;)
Yeah, I know. After I posted it, I realised that "noticeably" wasn't the right word. Perhaps "measurably" would have been better. The point I was trying to make (and I guess I didn't succeed very well) is not that you could use a cluster of off-the-shelf machines instead of an O3K, but that the O3K (and other large machines) has some cluster-like properties housed in a single case. Bandwidth and latencies may be orders of magnitude better, but architecturally, they're similar (although not identical).
"The invisible and the non-existent look very much alike." -- Delos B. McKown
Ermmm... no. Data General and Sequent have both been shipping NUMA boxen for many years now.
I, personally, wouldn't have a use for one (other than bragging rights :-), but it's not actually that big. At 48bpp (16 each for RGB), you could get 3900x1792, an aspect ratio and resolution that may well be suitable for motion pictures using digital projection. Alternatively, you could have a triple-headed 1600x1200 display. I've worked at companies where three 1280x1024 displays per machine were commonplace, so it's not that unreasonable.
The nicest thing about the SGI machines is that they have low-latency interconnect. Complete cache coherency is on the order of nanoseconds - not your microsecond latency on SCI or Myrinet, or your millisecond latencies on Ethernet (and those latter latencies are for data transfer only). A lot of supercomputing tasks can be done by a cluster of Linux machines these days; but for exactly the class of applications you're talking about (lots of communication/contention) this is the machine you'd want to run it on. The other class of applications (of course) is detailed simulations with a fine grid size - where else can you get 1TB of shared memory? ;)
As far as the kernel goes, it's been scaled from 1..512 processors. There is almost no kernel overhead in computational code to begin with anyway (sure, that simulation may run for 100 hours, but it makes about 1000 system calls), but Irix does a pretty decent job of staying out of the way (aside from periodic stupidness of the scheduler anyway).
No offense, but comparing Linux/BSD/whatever kernel overhead to commercial high-end UNIX overhead is like comparing apples to oranges. Sure, Linux may scale to 8 processors ok, but that's way different than scaling to 512 (which is very difficult to do).
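To put the latency gap described above in rough numbers, here is a small sketch. The three latency values are illustrative assumptions picked only to match the orders of magnitude mentioned (nanoseconds, microseconds, milliseconds); they are not measurements of any real interconnect, and the category names are just labels.

```python
# Illustrative per-message latencies in seconds (assumed, not measured).
LATENCY_S = {
    "shared-memory interconnect": 1e-7,  # assumed: ~100 nanoseconds
    "SCI/Myrinet-class network": 1e-5,   # assumed: ~10 microseconds
    "Ethernet": 1e-3,                    # assumed: ~1 millisecond
}


def communication_time(messages, latency_s):
    """Time spent waiting on latency alone for a number of small messages."""
    return messages * latency_s


def report(messages=1_000_000):
    """Latency cost of the same message count on each kind of interconnect."""
    return {name: communication_time(messages, lat)
            for name, lat in LATENCY_S.items()}


if __name__ == "__main__":
    for name, seconds in report().items():
        print(f"{name}: {seconds:.3f} s for one million messages")
```

Under these assumed figures, a job that exchanges many small messages spends orders of magnitude longer waiting on a commodity network than on a shared-memory interconnect, which is the point about communication-heavy workloads.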
Yep, mea culpa. I was dividing 320MB by 48, not by 6 (or 8, if you assume 32-bit word aligned accesses).
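For anyone who wants to redo the arithmetic, a quick sketch follows. It assumes the 320MB figure means 320 × 2^20 bytes and ignores any memory the hardware reserves for depth buffers or other uses, so treat the results as upper bounds, not as specs.

```python
# 320 MB framebuffer, assuming 1 MB = 2**20 bytes (an assumption).
FRAMEBUFFER_BYTES = 320 * 1024 * 1024


def max_pixels(bytes_per_pixel, framebuffer_bytes=FRAMEBUFFER_BYTES):
    """Largest whole number of pixels that fit at the given pixel size."""
    return framebuffer_bytes // bytes_per_pixel


packed = max_pixels(6)     # 48 bits per pixel = 6 bytes, tightly packed
aligned = max_pixels(8)    # padded to 8 bytes per pixel for aligned access
mistaken = max_pixels(48)  # the original slip: dividing by bits, not bytes

if __name__ == "__main__":
    print("packed 48bpp pixels:", packed)
    print("8-byte aligned pixels:", aligned)
    print("dividing by 48 instead:", mistaken)
    print("3900x1792 needs:", 3900 * 1792)
```

Dividing by 48 is what yields a figure close to 3900x1792; dividing by 6 or 8 bytes gives a pixel budget roughly six to eight times larger.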
So the rest of the industry is playing "catchup" to SGI ?! I don't really think there's a huge market for large-scale multiprocessor machines when equivalents can be built up easily from cheap hardware and fast network infrastructure.
Actually, they can't be.
This is not a cluster - it's a multiprocessing supercomputer designed as a single unit. The internal busses have far, far greater bandwidth than even the expensive networks in a high-end cluster.
It does have competition - the Sun Starfire. But that's about it.
Clusters are definitely useful, and give you by far the best bang-for-the-buck on problems with relatively light communications load, but problems with a heavy communications load are best run on machines with high communications bandwidth, like this one.
1. The CDROM is on an internal FireWire bus.
2. The system disk is Fibre Channel.
3. SGI hasn't made a big deal about it yet, but the system will accept either MIPS or Intel processors in the same CPU modules. The MIPS processors come on one kind of daughtercard, and the Itaniums (Itania?) on another. You can't mix-and-match MIPS and IA-64 CPUs in the same machine, but you can mix-and-match in the same cluster.
4. The IA-64 based versions of the 3000 series will include the Linux kernel along with some IRIX compatibility layer.
Amusing bits from the page:
Debra Goldfarb, group vice president at analyst firm IDC, agrees: "Modular computing empowers end users to build the kind of environment that they need not only today but over time. SGI, with this product, is really ahead of the curve in the market. We are seeing the [rest of the] industry absolutely trying to catch up" with SGI.
So the rest of the industry is playing "catchup" to SGI ?! I don't really think there's a huge market for large-scale multiprocessor machines when equivalents can be built up easily from cheap hardware and fast network infrastructure. The last time I saw an SGI was the NASA Ames crew using one for their amazing Viz tool, and even they were muttering about porting it to NT and Linux for ease of maintenance and actual use.
In addition, SGI Origin 3000 servers and SGI Onyx 3000 visualization systems reflect a return to SGI's core competencies.
At least that's true. The NT machines were a joke. Anyone tried SGI Linux yet?
Insanity is the last line of defence for the master diplomat. But you have to lay the groundwork early.
Aah, yes. The $64000 question. The answer to this is NUMA and hypercube structured interconnect. Check out the specs. It's not an SMP. It is shared memory like an SMP. Looks and acts like an SMP at all processor counts.
The whole system has one contiguous view of memory. The NU means "non-uniform", as in the memory access time is not the same everywhere: it depends on where the memory physically sits. If a process's memory is located on the processor module it's running on, the memory access is fastest. If it has to jump one module away, the memory access time increases by 100 nanoseconds (roundtrip). Architecturally, it's completely different than anything Sun has to offer. Sun has been promising a NUMA machine for years and still hasn't delivered. The closest company to SGI is Compaq (DEC), and their top-of-the-line offering can almost compete with Origin _2000_. All other companies' high-end servers use symmetric multiprocessing, which becomes limited as more and more processors try to access the shared memory bus, ultimately bringing in negative returns as you add more processors. This NUMA architecture incurs very little (if any) penalty by adding more processors, as long as the hardware and OS do a good job of placing processes and memory (keeping them physically near). Also, the machine is _not_ limited to 512 processors. To give an example of the power of this box, a company has certain calculations that they run day to day. On their top-of-the-line Sun hardware, it takes about seven hours. On O3k, it takes seven seconds! What does being a modular system have to do with being a cluster? Being "modular" simply means that you can plug in more of whatever you want, whenever you want. I believe you can even mix faster CPU modules with existing ones as they become available, protecting your investment. This is not a cluster.
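The placement point can be made concrete with a toy model. The 100 ns per-hop penalty comes from the description above; the 200 ns local access time and the two access mixes are hypothetical values chosen only for illustration, and real hardware will not be this linear.

```python
HOP_PENALTY_NS = 100    # from the comment above: ~100 ns extra per module hop
LOCAL_LATENCY_NS = 200  # hypothetical local access time, for illustration only


def access_latency_ns(hops):
    """Modeled latency of one memory access that crosses `hops` modules."""
    return LOCAL_LATENCY_NS + hops * HOP_PENALTY_NS


def average_latency_ns(hop_fractions):
    """Average latency given a mapping of hop count -> fraction of accesses."""
    total = sum(hop_fractions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(frac * access_latency_ns(hops)
               for hops, frac in hop_fractions.items())


if __name__ == "__main__":
    good_placement = {0: 0.9, 1: 0.1}            # mostly local memory
    poor_placement = {0: 0.25, 1: 0.25, 2: 0.5}  # mostly remote memory
    print("good placement:", average_latency_ns(good_placement), "ns")
    print("poor placement:", average_latency_ns(poor_placement), "ns")
```

In this model the cost of adding processors shows up only through how often accesses go remote, which is why the OS's placement of processes and memory matters so much.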
Intel transfer the difficult from Hadware to software, for get more power, programmer need more technology. -- chinaitn
Don't get me wrong, I love SGI's machines and use one daily. Even passed up on a faster PC (running Windows) because I like it so much. But there is no way I could cost-justify getting a new one. They simply do not provide enough performance to justify the cost anymore. None of the demos of their stuff we've seen indicate that their new machines are a huge leap in performance (meaningfully faster, to be sure, but not nearly enough to justify the cost of a new one). Fortunately for SGI, they make a ton of money on each Onyx and Origin they sell, but if they aren't careful this could easily evaporate out from under them. They make very cool systems but it is not a well-run business IMO. I'll be somewhat surprised if SGI doesn't get bought out by someone in the next year or two.
"This is definitely a great system for anyone who wants to have their computer be the size of several refrigerators."
I foresee a day when computers may be as small as one refrigerator. Probably there will be a world market for no more than 10 of these.
--
Give us our karma back! Punish Karma Whores through meta-mod!
Linux MAPI Server!
http://www.openone.com/software/MailOne/
(Exchange Migration HOWTO coming soon)
And wants several refrigerators to cool the system, too. Can Linux even handle that many processors, let alone make good use of them? UNIX is simply amazing...
I think these machines are simply awesome, but you have to wonder how many of these really get sold. Yesterday ASCI White was announced to be sold to the public and now we see this bad boy. Does anyone have a link or figures on how many of these sell? How long does a company keep a supercomputer after buying one? The specs are impressive and so is the price tag, but do many companies or countries buy these?
System Bandwidth
3200: 11.2 GB/sec
3400: 44.8 GB/sec
3800: 716 GB/sec
...methinks they skipped a decimal point here.
(if not, please explain!)
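One possible reading, sketched below, is that these are aggregate figures that grow with the number of processors. The sketch assumes the 3200 number applies to its 8-CPU maximum, that bandwidth scales linearly per CPU, and that the 3800 number is for the full 512-CPU configuration; none of this is confirmed by the spec sheet quoted here.

```python
# Bandwidth figures as quoted on the spec page, in GB/sec.
QUOTED_GB_PER_SEC = {"3200": 11.2, "3400": 44.8, "3800": 716.0}

CPUS_3200 = 8   # assumption: the 3200 figure is for its 8-CPU maximum
MAX_CPUS = 512  # the largest configuration mentioned in the announcement

# Assumed linear scaling: bandwidth contributed per CPU.
PER_CPU_GB_PER_SEC = QUOTED_GB_PER_SEC["3200"] / CPUS_3200


def implied_cpus(model):
    """CPU count that would produce the quoted figure under linear scaling."""
    return QUOTED_GB_PER_SEC[model] / PER_CPU_GB_PER_SEC


def predicted_bandwidth(cpus):
    """Aggregate bandwidth predicted for a CPU count under linear scaling."""
    return cpus * PER_CPU_GB_PER_SEC


if __name__ == "__main__":
    for model in QUOTED_GB_PER_SEC:
        print(model, "implies about", round(implied_cpus(model)), "CPUs")
    print("512 CPUs would give", round(predicted_bandwidth(MAX_CPUS), 1), "GB/sec")
```

Under those assumptions the 716 GB/sec figure is what 512 CPUs would add up to, so the decimal point may well be where it belongs.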
---
pb Reply or e-mail; don't vaguely moderate.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Well.... before we can support 512 processor MIPS boxen we need to support single processor and dual processor.... IMHO low end multiprocessor SGI box support is where Linux needs to go on the SGI architecture
I don't know a whole lot about SMP.
That said, what's to stop each running thread from using one or four or whatever processors? I mean, unless the software is specifically written to use 512 processors, wouldn't it kind of work as a really great multitasker?
Like I said, I don't know much about SMP.
One process with 512 _threads_ will do _really_ well. No one else ships a single memory image computer anywhere near this big. They can't. SGI took the R&D hit early with the O2k, adapted from the Stanford DASH machine. IRIX took the complexity hit early with 6.4 to improve concurrency/lock granularity.
Now they ship this monster. For large problems, no OS can touch IRIX, and no hardware can touch this. For people wanting to make the "clusters are better" argument, well, if you happen to have the small variety of problem that's "clusterizable", this thing will run those too, and quite well. Furthermore, you can always cluster a bunch of these guys together for _thousands_ of cpus and _terabytes_ of ram... and it will all be using interconnects a shit of a lot faster than what you can get elsewhere.
Finally, if you pay attention, you'll see that the whole thing is totally modular. It doesn't have to run MIPS cpus. You can yank the C-bricks and throw in an IA-64 c-brick (sometime in the future). It's _NOT_ a MIPS-based architecture. It's a modular supercomputing platform.
SGI has done their homework adopting the lessons learned in DASH (and later FLASH). As a result they've got the most scalable real-world-useful computer there is.
My opinions are my own, and do not necessarily represent those of my employer.
To fix a few misconceptions:
1) The bricks are (mostly) 3U [5.25"], or 4U [7"] high, and the same bricks are used to construct a wild range of systems, with huge variations in CPU-I/O-storage ratios.
2) In some cases, the bricks will be sold separately and embedded into airplanes, vans, etc, by defense contractors. I'm told the submarine folks really love the idea.
3) In a half-rack (SGI Origin 3200), you can have 2-8 CPUs [1-2 C-bricks], a required I/O brick [I-brick], and either another I/O brick (I, P, or X) or a disk brick (D-brick).
4) People always announce a wide range of systems: realistically, most of these machines will be 1-2 rack systems, just like they are for everybody else. People who buy lots of computers use racks anyway - the last thing in the world they want to do is waste precious floorspace.
5) IRIX already scales to 512P fairly well, and NASA Ames runs individual shared-memory jobs on their older Origin2000. It already saved you a lot of tax money.
6) SGI is not shipping Linux on the MIPS-based machines. This is a "Caterpillar" announcement, with a lot of shoes left to drop, like IA-64-Linux versions coming later. A major point of the brick thing is that you can change bricks while re-using most of what you already had; you can, for example, introduce a PCI-X, or later, Infiniband brick without obsoleting older I/O bricks. Also, you can build C-bricks with Intel IA-64s, and those will run Linux, not IRIX. All of the rest of the hardware infrastructure & bricks are the same.
7) SGI is working hard with the Linux community on scalability, i.e., to let it handle more CPUs well without damaging the basic Linux. Personally, I doubt that it will make sense to try to scale Linux to where Irix is, but it will certainly scale big enough to be interesting [say 32-64P in single system image]. Using partitioned hardware, one can get NUMAlink speeds between partitions, and that satisfies many customers.
8) The customer should be able to pick the size of machine, and then cluster that size together. For some customers, 1P + 64MB is just dandy, and they buy clusters of IA-32 boxes. I know customers where the right size happens to be 32P, 16GB of memory, 2 disks, and 3 Ethernets [one full rack], and then they cluster a lot of those. I know customers that cluster 128Ps, and there's one who would cluster 512Ps if they had the money. If the NASA Ames folks had the money, what they really want is a single machine with Petabytes of memory and Petaflops. I was sorry to tell them, Not Likely Soon.
9) Don't get too crazy with the fact these systems can go really big. I've lost track, but I think there are 30,000 of the Origin2000s & 200s out there, and most systems are small to medium. Of course, the big systems account for many CPUs.
10) The NUMAflex brick approach has many subtle benefits, but is hard work. In some thread, people mentioned backplanes ... but there aren't backplanes in the normal sense. Each C-brick has 4 MIPS CPUs, memory, and an ASIC Crossbar, with 2 ports out the back for cables that run (peak) rates of 3.2 GB/sec (2 * 1.6GB) and 2.4 GB/sec for I/O to separate I/O bricks. Each brick has internal circuit boards, but there is nothing that looks like a normal CPU backplane. To do this, you have to be able to run 3meter/5meter cables at these rates, and do tricky circuit engineering. Later versions will independently improve the interconnects as well, not just upgrade the bricks.
But does it really do the same graphics processing? Can a Voodoo5 or GeForce2 handle 48-bit colour for example (as used by the motion picture industry)? How about a 320MB framebuffer with 256MB texture RAM?
We got one of the earlier Onyx machines (creatively named onyx.astro.wisc.edu) back in 1993. It was pretty novel with its dual processors and fast OpenGL hardware. When some SGI programmers ported Doom (and later Quake) to the MIPS chip, some of us grad students used to play on the dept SGI boxes, including that dorm-fridge-sized machine. But for all its lofty framerate scores, our Onyx had no sound, so the poor sucker sitting at that terminal often got fragged with no warning.
But alas, the proprietary $15,000 memory module fried itself after the warranty expired and the machine was sold (for parts, I guess). No heated footstools in our computer room any more...
With boxen this size, the boundary between a single machine and a cluster tends to get a little blurred anyway. Even SGI are stressing the fact that it's a modular system. Basically, each module has its own CPUs and memory, and has connectivity to the other modules in the system. What's the difference between that and a conventional cluster? Mostly the phenomenal inter-module bandwidth, but that's just a matter of numbers. Architecturally, is there much difference? OK, so you have a single OS image running across all CPUs, but is that even true any more? Certainly other large systems (e.g., from Sun or Data General) let you run multiple versions of the OS concurrently on a single box as you see fit.
http://reality.sgi.com/sgiquake
Keep in mind that until a month ago, SGI's top-of-the-line graphics board sets (MXE and IR2) were the same designs that originally appeared as Maximum IMPACT on Indigo2 and InfiniteReality on Onyx R10000. About five years ago, give or take.
During that time, entire graphics hardware companies have come and gone. The really good ones have caught up to, and occasionally surpassed, what SGI was doing in 1994. Impressive. Most impressive. ;-)
Now SGI has released Vpro, which despite having one name is actually two totally different workstation graphics designs. The Vpro you can get in the IA-32 workstations is basically high-bin commodity graphics hardware from a company that shall remain nameless.
But the Vpro that comes in the Octane2 looks outstanding. I haven't had a chance to use it yet, so I won't endorse, but the design specs for the Buzz chip make it look like InfiniteReality performance on the desktop. Way better than anything in the commodity market right now, and way more expensive, too. It's one of those things: if you have to ask how much it costs, you can't afford it.
And so we are all a part of the great Circle of Life.
Something this size does well for meteorological simulations, atomic weapons research, anything that entails MASSIVE numbers of computations. I wouldn't be surprised if you see these in places like NCAR (National Center for Atmospheric Research), NCSA, and the government labs like Lost Alamos and Larry Livermore. I still remember seeing NCSA's purple monster 1024 node cluster of Origin 2000s (using an experimental node bridge)
;)
As for how this is different than a Beowulf cluster, look at the bandwidth! Even with switched 10/100 Ethernet as your Beowulf 'backplane' most switches have just enough backplane bandwidth to handle every 100 Mb connection, some have a little less. sgi has always had amazing bandwidth numbers, this is just taken to the N'th degree.
AND this is one machine, one OS, unlike a cluster of many independent machines, so it's much easier to administer.
These are simply awesome machines, now maybe sgi can sell a bah-zillion of them and I can get my Indy sold
g:wq
What if it is just turtles all the way down?
It's tricky enough to design file-systems that are properly distributed. I did some design for a school thesis for a serverless distributed file-system with useful fault tolerance features. That's pretty tricky in and of itself, even to support UNIX file-semantics. Building on something like that to build a strong and safe RDBMS would be quite a feat.
People _really_ like the single-machine programming paradigm. The OS at every level needs to emulate that behavior as much as possible, regardless of the reality of the situation. Hence the need for a good file-system (see Berkeley xFS for the right approach, or Centravision for a shipping product that looks interesting). RDBMSs are already choked by locking algorithms and contention on SINGLE CPU machines. It should be no surprise that a fast RDBMS that is fully distributed and scalable isn't widely available. To do it right you've got to have transparent internal replication of basically everything. Not just data and meta-data, but even logic. Coming up with a serverless (and thus usefully scalable) scheme that gives strong enough guarantees for RDBMS applications, yet still survives failures correctly and quickly and doesn't bog down the system with locking, will be quite a feat for whoever manages to do it.
Crap name, but I really think this 'brick' implementation is a great idea, and although I don't doubt the backplane/bus adds a certain amount of overhead to the cost, it'd be nice to see this sort of thing on Workstation and Desktop systems. And yes, I know similar things have been tried before (Acorn?)
PC getting slow or out of date? Add a new processor brick, that gets detected and used with just a reboot. Keep the old brick if you want. Graphics too slow? Just bought a second 19" monitor? Add a new graphics brick.
I'm not suggesting this is a cheap or easy solution (yet) but it's a much nicer one than PCI slots, and a tidier one than USB...
Pax,
White Rabbit +++ Divide by Cucumber Error ++
free experimental electronic music netlabel at www.viablehybrid.com
It isn't just the graphics, it's the architecture of the entire box. You cannot compete with the pipelines on an SGI box with an x86. It's just pointless.
Ever wonder why Pixar has so many SGIs? It isn't because they have the extra money to burn. It's because SGI _IS_ the best at graphics. Until you use one for visualization (my department does a LOT of vis work - combat simulation), you have no idea the power of these things.
-- toolie
People are asking things like "why would I use this" and "who wants these?" Let me tell you, in the era of bloatware like Oracle and any of the content management systems out there (possible exception of Zope), the incredible scalability of these systems will be a huge selling point. Oracle, for example, is very careful to build and market their software to be monolithic so that you have to buy big hardware to run it, and then they charge you based on the size of the hardware you're running. Thus, they drive the purchase of huge systems like this, and then charge you up the ass for their "Enterprise class" database.
Believe it or not, this is actually the kind of business model that the Fortune 500 are not only happy with, but demand.
Personally, I'd be happy with a database that could run on a loose, fault-tolerant network of a dozen or so small (e.g. 2-processor Intel or Alpha) systems.
Then again, I'd really like to play with some of SGI's big iron....
NOW it's all so clear to me as to why iD would want to sell off its SGI PowerHaus(tm).
The new models are on the way!
Rami James
Guy with Duh.
--
rJames.org - illustration
Geeky girls are often impressed by the size and power of your computer equipment. However, size is not the most important thing, it's how you operate it.
Better a cholo than in bad company.
Anonymous is uninformed. Digital Domain used a Linux render-farm for Titanic, but as usual at DD, the bulk of the 3D interactive work was done on SGIs (and some Macs, and PCs with NT). This is very typical: renderfarms are whatever the company can get for the lowest cost/rendermark (or equivalent), and they don't use any graphics hardware, just the CPUs. For example, Sun gave Pixar a great deal on a renderfarm ... and they still buy OCTANE2s for their interactive work.
It is trivial to check:
http://www.d2.com/text/faq/main.html
and see what tools they use.
In the last 10 years, consider all of the films that won Academy awards for Computer-Generated special effects, and add in all of those nominated. Of these films, can you name the films that did *not* use SGI?
Finally, to avoid this being an SGI versus Linux thing, do recall that SGI is seriously investing in Linux work and contributing to the community in this turf, so it's not like we dislike it, just the facts.
SGI closed its doors for the last time today despite announcing record profits.
"We just ran out of names beginning with O," said Spokesman Otto Olson, head of names. "The Ohshit and the Omygod were really scraping the bottom of the barrel."
Oliver Ottowan added, "We really should have used a more common letter like T or S."
NVidia made SGI's "VPro" graphics chipsets...
You're half right. There are three-and-a-half flavors of Vpro right now. There's V3/VR3, which is an nVidia board with 32 or 64 MB of DDR RAM.
Then there's V6/V8, also known as Odyssey. These are available only in Octane2. They're an all-SGI design with the Buzz chip-- "OpenGL on a Chip!"-- at the heart.
There's talk of a V12, which I think is supposed to be a two-Buzz version of V8. That, if it happens, will be exactly twice the geometry performance of V8.
Odyssey-- V6, V8, V12-- look on paper like they're light-years ahead of the nVidia stuff you find in the 230/330/530 systems. I say "look on paper" because I haven't used one myself. Disclaim, disclaim.
Finally-- my very own regeneration alcove :)
-j
-sigs of the world unite
Sounds like someone is playing a numbers game to me. The SMP effect is going to chew their lunch on performance. Say you have an application that runs on ten processors. Now, can you imagine the new performance level if you change that to 100 processors? It won't be even close to a 1000% increase. You'll have contention between the processes, and contention (especially) within the kernel. (Hell, I get it with just 12 processors, depending on the application that is running.)
If they know something I don't here, I'd love to see it.
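The diminishing-returns argument above is commonly modeled with Amdahl's law. Here is a minimal sketch; the 5% serial fraction is a hypothetical value chosen for illustration, not a property of any real application, and the model ignores lock and kernel contention, which would only make the numbers worse.

```python
def amdahl_speedup(processors, serial_fraction):
    """Speedup over one processor when part of the work cannot be parallelized."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def relative_gain(old_processors, new_processors, serial_fraction):
    """How much faster the job gets when moving between two processor counts."""
    return (amdahl_speedup(new_processors, serial_fraction)
            / amdahl_speedup(old_processors, serial_fraction))


if __name__ == "__main__":
    serial = 0.05  # hypothetical: 5% of the run time is inherently serial
    gain = relative_gain(10, 100, serial)
    print(f"10 -> 100 processors with {serial:.0%} serial work: {gain:.2f}x faster")
```

With any non-zero serial fraction, ten times the processors gives well under ten times the throughput, so how well the machine scales depends heavily on how parallel the workload really is.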
As great as it is to see SGI's moves to utilize Linux, computers like these demonstrate that Irix still has a place in the larger picture. Irix is really a pretty neat operating system, and frankly, it can scale in ways that Linux just isn't ready to yet. As long as SGI is still making systems like these on the high end, I don't see Irix being displaced anytime soon.
Of course, Irix also has a lot of graphics production tools that you don't find on any OS, Linux included. That's something else that'll keep Irix around, at least until equivalents exist. Ideally, we'd see SGI continue to take steps toward open source/Free software, with Irix components.
Anyway, looks like a pretty cool new system from the people who brought us the original colored computer. Can't wait to get my hands on one of these.
yours,
john