Slashdot Mirror


SGI to Scale Linux Across 1024 CPUs

im333mfg writes "ComputerWorld has an article up about an upcoming SGI machine, being built for the National Center for Supercomputing Applications, 'that will run a single Linux operating system image across 1,024 Intel Corp. Itanium 2 processors and 3TB of shared memory.'"

2 of 360 comments

  1. Sun does more than that by puppetluva · · Score: 4, Insightful

    Sun hardware has additional, wonderful resiliency features, like allowing CPUs to "fail over" to other CPUs in case of failure. The same holds true for memory, network interfaces, etc. Solaris is aware of these hardware features and can "map out" the bad memory and CPUs on the fly (or allow swap-in replacements). The engineers can then replace the broken CPUs/memory/interfaces WITHOUT BRINGING THE MACHINE DOWN. This lends itself to an environment that can enjoy nearly 100% uptime. Finally, since Sun has been doing the "lots of CPUs" thing for many years, their process management and scalability tend to be much better.

    I don't work for Sun; I'm just an SA who deals with both Solaris and Linux boxes. You don't pick Sun for just "lots of CPUs"; you pick it for a very scalable OS and amazing hardware that allows for a very, very solid datacenter. If downtime costs a lot (i.e., you lose a lot of money for being down), you should have Sun and/or IBM zSeries hardware. Unfortunately those features cost a lot, and most times you can use Linux clustering instead for a fraction of the cost and a high percentage of the availability.

  2. Another thing Sun does well.... by passthecrackpipe · · Score: 4, Insightful

    Cache reduction - ehh, cash reduction. One of the prime reasons Sun is losing serious levels of installed base to Linux is not that Linux is better; it is that Sun is bloody expensive - outrageously so. And while most customers had to endure the annual fleecing with gritted teeth - due to lack of alternatives - Sun is now being pummeled out of datacenter after datacenter.

    I have replaced Sun hardware/software combos in the core datacenter for many of our customers, and I can tell you that yes - Sun brings some amazing features to the table - most of which are there to serve old technology. Linux on simple CPUs delivers amazing price/performance (depending on the job, we see an average of 3x to 4x performance increase for 25% of the cost). That means that if I were to spend the same, lifecycle-wise, on a Linux cluster as I would on a big Sun box like the 10k or 15k, I'd end up with 12x to 16x the performance of the Sun solution.
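
    For anyone who wants to check the 12x to 16x figure, here is a quick sketch of the arithmetic. It assumes performance scales linearly with money spent, which real clusters only approximate, and the function name is made up for illustration.

```python
def same_budget_speedup(perf_ratio: float, cost_ratio: float) -> float:
    """Performance multiple if the whole budget goes to the cheaper option.

    perf_ratio: performance of the cheap setup relative to the expensive one
    cost_ratio: cost of the cheap setup relative to the expensive one
    Assumes (hypothetically) that performance grows linearly with spend.
    """
    if cost_ratio <= 0:
        raise ValueError("cost_ratio must be positive")
    return perf_ratio / cost_ratio


# Figures from the comment: 3x to 4x the performance at 25% of the cost.
low = same_budget_speedup(3.0, 0.25)
high = same_budget_speedup(4.0, 0.25)
print(f"{low:.0f}x to {high:.0f}x")  # 12x to 16x
```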

    The same functionality in terms of cpu and ram (and other hardware) failure is available on the Linux cluster, albeit in less graceful form - the magic spell to invoke goes like this:
    shutdown -h now
    If I have 300 machines crunching my data, I can afford to lose a couple, and can afford to have a few hot standbys.
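
    To put a rough number on "afford to lose a couple", here is a minimal sketch. It assumes every node contributes equally and that a hot standby fully replaces a failed node; the function name and the standby count of 3 are hypothetical, not from any real deployment.

```python
def remaining_capacity(nodes: int, failed: int, standbys: int = 0) -> float:
    """Fraction of the original worker count still running after failures.

    Assumes identical nodes and that each standby replaces one failed node.
    """
    if not 0 <= failed <= nodes:
        raise ValueError("failed must be between 0 and nodes")
    if standbys < 0:
        raise ValueError("standbys cannot be negative")
    replaced = min(failed, standbys)
    return (nodes - failed + replaced) / nodes


# 300 workers, 2 of them shut down, no standbys.
print(f"{remaining_capacity(300, 2):.1%}")     # 99.3%
# Same failures with 3 (hypothetical) hot standbys available.
print(f"{remaining_capacity(300, 2, 3):.1%}")  # 100.0%
```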

    Of course, the massively parallel architecture does not work for all applications, and in those cases you would look to use either OpenMOSIX or of course the (relatively expensive) SGI box mentioned in this article.
    --
    People who think they know everything are a great annoyance to those of us who do.