Sun Releases Starcat
SilentChris writes: "Sun has released the Starcat server, a beast with up to 106 processors running Unix. Anyone have an extra couple [million] bucks lying around?" They're not cheap.
As someone who does nothing with these types of systems, nor follows them, I think it's great that you can have different processor speeds using "partitions."
I wonder if memory is treated the same way... i.e., separated by "partitions," or if you also have a choice to use it as one, large unified memory resource... or, I wonder if memory can be dynamically partitioned... hmm.
Actually, now that I'm thinking about it... are all of the processor partitions considered peers? I mean, are the partitions all treated as if they were a single processor... then treated equally?
to get on the National US ID Card database bandwagon with Oracle... It'll only need to store about 300 million records with DNA, fingerprint, picture for facial recognition software, key escrow, etc...
try { do() || do_not(); } catch (JediException err) { yoda(err); }
"Huh? I understand that the nation's air traffic controllers may need updated equipment in light of the existing crisis, but how hard can scheduling be? I could see a use for a massively parallel monster like this in, say, flow-through or structural analysis or something, but scheduling?"
What you're missing is that this isn't a matter of air traffic control. This is a matter of determining which planes and crews to fly to which locations at what times to maximize revenue. This is a classic, big, nasty travelling salesman problem. The bigger the beast of a machine you get, the closer you get to an optimized solution, i.e., the most passengers willing to pay the most money with the least use of resources. It's a huge problem that needs massive computational power.
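To put a rough shape on why this eats compute, here is a minimal brute-force sketch of a plain travelling salesman problem. It is not how an airline scheduler actually works, and the distance table is made up purely for illustration. The point is only that an exhaustive search has to look at (n-1)! tours.

```python
from itertools import permutations
from math import factorial

def tour_length(order, dist):
    """Total length of a closed tour visiting cities in the given order."""
    return sum(dist[a][b] for a, b in zip(order, order[1:] + order[:1]))

def brute_force_tsp(dist):
    """Try every tour that starts at city 0; only feasible for tiny inputs."""
    n = len(dist)
    candidates = ((0,) + rest for rest in permutations(range(1, n)))
    best = min(candidates, key=lambda order: tour_length(order, dist))
    return best, tour_length(best, dist)

# Hypothetical symmetric distances between 4 cities (made-up numbers).
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

best_order, best_len = brute_force_tsp(DIST)
print(best_order, best_len)

# Number of tours an exhaustive search would check for only 20 cities.
print(factorial(20 - 1))
```

Four cities means six tours. Twenty cities already means 19! of them, which is why real schedulers settle for approximations and throw hardware at getting closer to the optimum.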
Actually, the cheap, unreliable Intel boxes will have far lower TCO if you set it up right.
When one of them fails, we throw it away and replace it with another $1000 server. We put a CD in the drive, and leave. When we come back, it's rejoined the cluster.
Far cheaper than the time spent waiting for the Sun tech who comes, and then bends the fucking pins on a CPU trying to install it.
Take a look at the SGI O3800 if you want your socks blown off for real. And then consider that the O3800 has been available for a while now.
Sun is still playing catch-up. It doesn't really matter though, because even in the high-end Unix mainframe market, marketing means far more than technical ability.
Maintenance is also a lot different for a single system image versus a Beowulf cluster, with tradeoffs either way. To manage a Beowulf cluster that big means managing a minimum of 27 different system images, which means 27 times some types of maintenance activity (I'm assuming 4-proc boxes max, so a more typical config would be 53 or 106 system images). On the other hand, if one of those fails, needs to be upgraded, or whatever, the impact to the cluster is probably pretty minimal. Unless Starcat does something really different, a CPU failure will cause that single image to fail (which is of course why you'd split the box into two and use an HA cluster of some sort :-).
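For anyone checking the 27 / 53 / 106 figures, the arithmetic is just a ceiling division of the 106 processors from the story by the number of CPUs per box. A quick sketch (the function name is only for illustration):

```python
from math import ceil

TOTAL_CPUS = 106  # processor count quoted in the story summary

def system_images(total_cpus, cpus_per_box):
    """Boxes (and therefore OS images) needed to reach total_cpus."""
    return ceil(total_cpus / cpus_per_box)

# 4-way, 2-way and single-CPU boxes respectively.
for cpus_per_box in (4, 2, 1):
    print(cpus_per_box, "CPUs per box:", system_images(TOTAL_CPUS, cpus_per_box), "images")
```

Note that the 4-way case rounds up: 26 boxes would only give 104 processors.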
7 November 2006: The day Americans realized corruption and incompetence weren't addressing 11 September 2001
Just wait for six months. This is the first beast in a series of pseudo-clustered Sunfires. This is roughly a stack of 6800's, and there's going to be a MUCH larger machine released very soon.
"People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
Well, this Sun and all of SGI's line are not homogeneous memory architectures. They're offshoots of NUMA. Most of these machines are 4-way boards with their own local memory.
They can access non-local memory via a high-speed crossbar network (SGI's is called Craylink, dunno about Sun's).
The architectures are pretty similar because when SGI bought Cray, they spun off the lower end stuff to Sun. Ironic that Sun got the reputation for high end servers with the low end of what SGI uses now. An Origin3800 has been available for some time, and blows the doors off this new Sun machine.
Guess it just goes to show that technology is completely irrelevant for market success.
This is in fact the fastest Java server in the world. Check out:
http://www.sun.com/smi/Press/sunflash/2001-09/sunflash.20010925.3.html
-Steve
Yes, there's a buy online button. But that's used to get info so one of their sales droids can contact you. It's not like you can slap it on your Visa card. :)
(Disclaimer, I work a lot with E10Ks, so this post is written mostly from my experience with those.)
The 15K is basically just an improvement on the E10K architecture, from what I've seen and heard from Sun's SSEs. The E10K started out life as the Cray SuperServer, and was sold to Sun for a song. It's not architecturally perfect. The E10K is set up to allow individual system boards to be part of domains (aka partitions), which can make for some great scalability in the domains. I've seen tiny little one-system-board domains, and domains with 13 fully populated system boards in them.
One of the major advantages to this platform is the fact that you can hot-swap everything except the centerplane. (Of course, I've never seen a centerplane fail.) The E10K also has Dynamic Reconfiguration, where you can remove system boards from a running domain, but unless your platform is set up in a certain, specific way, this doesn't work as well as advertised. I've personally never used it. The best thing about the E10K is the use of the System Service Processor, which handles all the administrative tasks for the entire cabinet. I've heard that the SSP is now integrated into the 15K, thus eliminating the need for a separate system to perform these tasks and monitoring.
The only thing I've ever seen this class of system used for is data warehousing. No modeling, no graphics rendering, just Oracle databases. Just because it has a large number of processors doesn't mean they're going to be suitable for every task imaginable. (I used to have a 180MHz Indy R5000 that got 68kkeys/sec in d.net. My 166MMX got something like 350kkeys/sec.) These are workhorse processors, not sports-car style processors.
Though I wonder if Sun's gotten around to fixing that nasty ecache parity error problem with their processors... Having a domain randomly crash because the parity bit on a processor got flipped is no fun when you're dealing with a large production database. I have a feeling that problem will continue to plague them in the 15K.
A good traveller has no fixed plans and is not intent on arriving.