Sun Releases Starcat

Posted by ryuzaki0 on Tuesday September 25, 2001 @06:20AM from the latest-and-greatest dept.

SilentChris writes: "Sun has released the Starcat server, a beast with up to 106 processors running Unix. Anyone have an extra couple [million] bucks lying around?" They're not cheap.

9 of 305 comments

For those beowolf comments by segfaultcoredump · 2001-09-25 06:25 · Score: 5, Informative

Let's remember that this system is not intended to replace a Beowulf cluster of cheap PCs. It is intended to do something that most Beowulf clusters can never do: present a single OS image with half a terabyte of memory that any CPU can access at very high speed.

This is a system that is very good at things like fluid dynamics and massive database operations. It is not a good idea if all you want to do is get to the top of the list for the SETI@Home project.
1. Re:For those beowolf comments by Webmonger · 2001-09-25 07:34 · Score: 4, Informative
  
  In terms of memory bandwidth and latency, they are very different.
  
  The fastest networking technologies do not approach the speed and responsiveness of a memory bus. Yet a cluster design uses networking in place of a memory bus some of the time.
  
  If there's not a lot of data, it doesn't matter much. If there's tons and tons of data, a cluster design is inefficient.
2. Re:For those beowolf comments by tolldog · 2001-09-25 08:47 · Score: 3, Informative
  
  Agreed.
  This would make a killer render system, assuming the renderer can handle that many threads.
  
  This is why beowulf rendering is bad. Network performance for shared memory sucks.
  
  With renders hitting the 2GB+ mark for memory usage, do you really want a network passing that data around?
  
  What could happen with systems like this is that the render time vs. load time would get extremely lopsided: 30 minute loads and under a minute a frame. It would force a rethink of how the render jobs get distributed and run. Best case would be a few of these, one for each department's render needs. But then we are talking 20+ million for rendering. That buys a lot of Intel boxes.
  If I was given one, I would try to use it. But I don't think I could ever seriously suggest buying one. But that is me and my particular application.
  
  --
  -I just work here... how am I supposed to know?
Re:106? by segfaultcoredump · 2001-09-25 06:34 · Score: 5, Informative

The system grows to 106 in the following way:

There are 18 "cpu/memory" boards that hold 4 cpu's each. This brings the system up to a total of 72 cpu's and 576GB of ram.

Now, if you want a server that just does number crunching and don't care about I/O, you can then add "MaxCPU" modules. Each module holds two additional cpu's (no memory) and occupies an hPCI module slot (a hot swap PCI case that can hold what looks like two to four pci cards). You can use up to 17 of the hPCI module slots to hold MaxCPU modules. (There are 18 pci channels on the system, and at least one must be used for accessing the boot disk.)

So there ya have it, 106 cpu's and half a terabyte of ram. I think that in most cases, folks will opt to not use the MaxCPU modules and just stick to the 72 cpu limit.
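
For anyone who wants to check the arithmetic, here is a minimal sketch. The numbers come straight from the post above (32GB per board is just 576GB / 18 boards); the function and constant names are made up for illustration and are not anything Sun ships.

```python
# Figures from the post above; all names here are illustrative only.
CPU_BOARDS = 18             # "cpu/memory" boards
CPUS_PER_BOARD = 4
GB_PER_BOARD = 32           # 576GB / 18 boards
HPCI_SLOTS = 18             # one hPCI slot per pci channel
SLOTS_RESERVED_FOR_IO = 1   # at least one is needed to reach the boot disk
CPUS_PER_MAXCPU = 2         # MaxCPU modules add cpu's but no memory


def starcat_totals(maxcpu_modules=0):
    """Return (cpus, ram_gb) for a box with the given number of MaxCPU modules."""
    max_modules = HPCI_SLOTS - SLOTS_RESERVED_FOR_IO
    if not 0 <= maxcpu_modules <= max_modules:
        raise ValueError(f"maxcpu_modules must be between 0 and {max_modules}")
    cpus = CPU_BOARDS * CPUS_PER_BOARD + maxcpu_modules * CPUS_PER_MAXCPU
    ram_gb = CPU_BOARDS * GB_PER_BOARD  # MaxCPU modules carry no memory
    return cpus, ram_gb


print(starcat_totals(0))    # no MaxCPU modules
print(starcat_totals(17))   # every spare hPCI slot holds a MaxCPU module
```

With no MaxCPU modules this gives 72 cpu's and 576GB; with all 17 it gives 106 cpu's and the same 576GB, since the extra modules add no ram.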
Re:Clarification please by Cheetahfeathers · 2001-09-25 06:39 · Score: 3, Informative

These can be either. It depends on how you configure it. And the fun thing is, you can reconfigure it on the fly. You want a cluster in a box? You got it. You want 2 separate instances of Solaris running, each using 1/3rd the resources of this box, while you pull out the hardware on the rest of the box for maintenance? You got it. This thing is _configurable_. You can hot swap everything except the backplane, pretty much. It's _sweet_.
Re:partitions by Doctor_D · 2001-09-25 07:13 · Score: 5, Informative

According to the specs each processor board holds 4 processors and 32 gigs of memory.

Now, if the Starcat treats domains (partitions) the same as the E10k (I haven't been to training yet on it), then each domain at minimum will consist of 4 processors and 32 gigs of ram, i.e. 1 processor board. Basically these domains are treated as separate boxes as far as Solaris is concerned. You configure a domain to contain, say, 2 system boards, and then when you load Solaris, it sees 8 processors and 64 gigs of memory. This way you can allocate resources as the need fits. But this means it doesn't look like the virtual processor that mainframes present.
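
A rough sketch of that board-granular sizing, assuming the Starcat really does follow the E10k model described above. The function names are hypothetical; this is not a Solaris or Sun tool.

```python
# Per-board figures from the specs quoted above; names are illustrative only.
CPUS_PER_BOARD = 4
GB_PER_BOARD = 32
TOTAL_BOARDS = 18


def domain_resources(boards):
    """Return (cpus, ram_gb) that one domain made of `boards` system boards would see."""
    if boards < 1:
        raise ValueError("a domain needs at least one system board")
    return boards * CPUS_PER_BOARD, boards * GB_PER_BOARD


def carve_domains(board_counts):
    """Split the box into domains; each entry is the number of boards in one domain."""
    if sum(board_counts) > TOTAL_BOARDS:
        raise ValueError("not enough system boards for that layout")
    return [domain_resources(n) for n in board_counts]


print(domain_resources(2))        # the 2-board example from the text
print(carve_domains([6, 6, 6]))   # three equal domains using all 18 boards
```

The point is only that resources come in whole-board steps: a domain grows by 4 processors and 32 gigs at a time, never by a fraction of a board.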

The starcat may deal with processors above 72 in a different way, but I honestly don't know at this time how it deals with them.

Hope this helps answer your question.

--
"If you insist on using Windoze you're on your own."
Re:106 procs, so what by segfaultcoredump · 2001-09-25 08:46 · Score: 4, Informative

The SGI origin has a ccNUMA architecture, which makes it great for some tasks, ok for others, and awful for yet others. (the trick is to make sure that your particular app falls under the 'great' category)

The Sun system is SMP-based: everything connects to a common backplane and each board has equal access to all of the other boards. With the SGI, accessing memory on the local board or on boards in the same cabinet is much faster than hits to memory in remote cabinets.

From what I can tell, Sun is planning on producing a special system board that goes into one of those 18 slots. Thus, with 19 StarCats you can create one big system with 1836 cpu's and 9.7TB of ram (think of a system in the middle that acts as the center of a star). It will most likely be based on a COMA architecture rather than ccNUMA. Like the SGI, memory access will depend on the distance between the requesting cpu and the storage location. The difference is that under COMA, if a cpu requests a particular bit of memory a lot, that page is either migrated or copied to a memory bank on that cpu's memory board. (So if 5 cpu's all need read-only access to the same bit of memory, they can each have their own copy in a local memory bank. Write updates are what make the system a pain in the ass to manage.)
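
To make the copy-on-frequent-read idea concrete, here is a toy model in Python. It is only a sketch of the behaviour described above: the class, the threshold and the invalidate-on-write policy are all assumptions for illustration, and real hardware does this in the coherence protocol, not in software.

```python
from collections import defaultdict


class ToyComaMemory:
    """Toy model: a board that keeps reading a remote page gets its own local copy."""

    def __init__(self, replicate_after=3):
        self.replicate_after = replicate_after   # assumed threshold, purely illustrative
        self.copies = {}                         # page -> set of boards holding a copy
        self.remote_reads = defaultdict(int)     # (page, board) -> remote read count

    def place(self, page, board):
        """Put the only copy of a page on one board."""
        self.copies[page] = {board}

    def read(self, page, board):
        """Return 'local' or 'remote'; copy the page over after enough remote reads."""
        if board in self.copies[page]:
            return "local"
        self.remote_reads[(page, board)] += 1
        if self.remote_reads[(page, board)] >= self.replicate_after:
            self.copies[page].add(board)         # later reads from this board are local
        return "remote"

    def write(self, page, board):
        """Writes are the painful part: every other copy has to be thrown away."""
        self.copies[page] = {board}              # page effectively migrates to the writer
        for key in [k for k in self.remote_reads if k[0] == page]:
            del self.remote_reads[key]


mem = ToyComaMemory(replicate_after=3)
mem.place("page0", board=0)
print([mem.read("page0", board=5) for _ in range(4)])  # three remote reads, then local
mem.write("page0", board=0)
print(mem.read("page0", board=5))                      # remote again after the write
```

Read-only sharing is cheap in this picture because every reader ends up with a local copy; a single write wipes those copies out and everyone starts paying remote latency again.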
Re:SGI Origin 3000 by fgodfrey · 2001-09-25 10:20 · Score: 3, Informative

What are *you* smoking? The Origin 3800 is certainly *not* using the "same back plane a the sun was 2 gens back". In fact, if you were anything other than a hopeless troll, you'd realize that the Origin 3800 doesn't *use* a backplane at all. You get, on a 512p system, 128 I/O channels, each of them supporting up to 12 PCI slots or 4 XIO slots. I can't remember off the top of my head what the bandwidth per channel is but it's on the order of a gigabyte/second (I wanna say it's 1.6 GB/sec, but I might be wrong).

--
Go Badgers! -- #include "std/disclaimer.h"
Where 106 probably comes from by MadDog+Bob-2 · 2001-09-25 13:04 · Score: 3, Informative

I mean to say what's the difference between 106 (what an odd number) ...

Given the way other Sun boxen like the E3500 work, I expect that the 15K has 18 boards, each of which takes 3 modules, either 2xCPU or 8GB RAM.

That means that 72 CPU / 288 GB memory is 18 boards, each with 2 2xCPU modules and one 18 GB memory module, and the box is full.

Since you always need some memory, the most CPUs you can get is 17 boards w/ 6 each and one with 4. Of course, that leaves you with 8GB of memory for your 106 CPUs.

The other end is (17 x 3 + 1 x 2) 8 GB memory modules, for 424 GB on a pair of CPUs.

But that's just a guess...
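
The guess can be checked with a few lines of Python. This only encodes the model guessed at above (18 boards, 3 module slots each, every module either 2 CPUs or 8 GB); the names are made up and none of it is confirmed hardware detail.

```python
# The guessed layout from the post above; all names are illustrative only.
BOARDS = 18
SLOTS_PER_BOARD = 3
CPUS_PER_MODULE = 2
GB_PER_MODULE = 8


def guessed_totals(cpu_modules):
    """Return (cpus, ram_gb) if `cpu_modules` slots hold CPUs and the rest hold memory."""
    total_slots = BOARDS * SLOTS_PER_BOARD
    # Keep at least one CPU module and at least one memory module in the box.
    if not 1 <= cpu_modules <= total_slots - 1:
        raise ValueError(f"cpu_modules must be between 1 and {total_slots - 1}")
    memory_modules = total_slots - cpu_modules
    return cpu_modules * CPUS_PER_MODULE, memory_modules * GB_PER_MODULE


print(guessed_totals(53))   # all but one slot holds CPUs
print(guessed_totals(1))    # all but one slot holds memory
print(guessed_totals(36))   # two CPU modules per board, one memory module per board
```

Under this model the extremes come out as 106 CPUs with 8 GB and 2 CPUs with 424 GB, matching the figures above. The 72-CPU layout comes out at 144 GB with 8 GB modules, so a 288 GB figure would need 16 GB modules instead.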