Cringely Wants A Supercomputer in Every Garage
Nate LaCourse writes: "Real good one from Cringely this month. It's on building his own supercomputer, but with some twists." You'll probably also want to check out the KLAT2 homepage to learn more about their Flat Neighborhood Network. And since KLAT2 has been around for nearly a year (check out the poster on this page!), perhaps a 3rd generation is in the works?
Cringely is completely missing the point. KLAT2 uses multiple routes and switches, not channel bonding. And what the project contributes is not the basic idea of using multiple network interfaces (which is decades old), but a specific approach: using genetic algorithms to optimize the network topology. More traditionally, such clusters have used manually designed topologies with known performance bounds.
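For anyone wondering what "using genetic algorithms to optimize the network topology" might look like in practice, here is a toy sketch. It is not KLAT2's actual code. The problem sizes, the fitness function (count the pairs of PCs that share a switch, penalize switches with too many cables) and all the function names are my own assumptions for illustration.

```python
import random
from itertools import combinations

# Hypothetical toy sizes, not KLAT2's real configuration.
N_NODES = 8      # PCs in the cluster
NICS = 2         # network interfaces per PC
N_SWITCHES = 4   # available switches
PORTS = 5        # ports per switch


def random_wiring(rng):
    """One candidate: for each node, the switches its NICs plug into."""
    return [tuple(rng.sample(range(N_SWITCHES), NICS)) for _ in range(N_NODES)]


def fitness(wiring):
    """Node pairs sharing a switch, minus a penalty for overfull switches."""
    shared = sum(1 for a, b in combinations(wiring, 2) if set(a) & set(b))
    load = [sum(s in node for node in wiring) for s in range(N_SWITCHES)]
    overflow = sum(max(0, n - PORTS) for n in load)
    return shared - 10 * overflow


def crossover(mom, dad, rng):
    """Child takes each node's wiring from one parent or the other."""
    return [rng.choice(pair) for pair in zip(mom, dad)]


def mutate(wiring, rng, rate=0.1):
    """Occasionally rewire a whole node at random."""
    return [tuple(rng.sample(range(N_SWITCHES), NICS)) if rng.random() < rate else node
            for node in wiring]


def evolve(generations=200, pop_size=40, seed=0):
    """Keep the fitter half each generation and breed replacements from it."""
    rng = random.Random(seed)
    pop = [random_wiring(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(fitness(best), best)
```

A real design tool would presumably also weigh the traffic pattern of the application, which this sketch ignores entirely.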
Speaking as someone who, yes, has actually worked with the big iron...
Why bother? Remember, Moore's Law is still in effect. Recently, we've hit the point in the curve where supercomputers are neither needed nor cost-effective. That is, the time it takes the industry to deliver a far superior product is now shorter than the average lifespan of your typical supercomputer.
We're living in an age where a single graphing calculator you can buy at Walgreens has more horsepower under the hood than what got us to the moon 30 years ago. Your $2700 PC will be worth $150 within 3 years.
Having a supercomputer in every garage makes about as much sense as taking a rocket fuel-powered dragster to the supermarket for a gallon of milk.
Cheers,
Bowie J. Poag
Clusters are great for embarrassingly parallel applications (i.e., ones whose threads don't communicate with each other much). This includes things like SETI@home and batch rendering of images. Where they don't compare is on applications that communicate a lot, like nuclear physics simulations. This is not to say that it will never change in the future, but for the time being it's still true.
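A back-of-the-envelope model makes the difference concrete. This is only a sketch: the cost model (compute divides evenly across nodes, every step pays a fixed communication cost) and the numbers plugged in are assumptions of mine, not measurements from any real cluster.

```python
def parallel_efficiency(work, comm_per_step, nodes):
    """Fraction of ideal speedup achieved on `nodes` machines.

    Assumed model: the compute part divides evenly across nodes, and
    each step additionally pays a fixed communication cost.
    """
    parallel_time = work / nodes + comm_per_step
    speedup = work / parallel_time
    return speedup / nodes


if __name__ == "__main__":
    # Hypothetical workloads: 100 time units of compute per step on 64 nodes.
    print(parallel_efficiency(100.0, 0.0, 64))  # no communication at all
    print(parallel_efficiency(100.0, 1.0, 64))  # 1 time unit of communication per step
```

With zero communication every node stays busy the whole time. In this model, one time unit of communication per step already wastes more than a third of a 64-node machine.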
Last, and certainly not least, real supercomputers have memory bandwidth that can match the speed of the processor. A Cray or an SGI Origin has an absolutely massive amount of bandwidth from the processor to local memory compared to a PC. That allows a traditional supercomputer to actually *achieve* the fantastic peak performance numbers. On many applications, the working sets are huge and don't fit in cache, so you end up relying on memory being fast. On a PC, it's not. I've heard from sources I consider reliable (though I have no actual numbers to back this up, so it may be rumor only) that one large cluster site sees around 10% or less of peak on a cluster for a nuclear physics simulation, whereas on a vector Cray you can hit ~80% of peak. This means that the cluster needs 8 times the peak rating to deliver the same sustained performance, and when you start multiplying the costs by 8, they start looking like the same price as a real supercomputer.
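To spell out where the factor of 8 comes from, here is the arithmetic as a small sketch. The 10% and 80% figures are the rumored ones above, and the 100 GFLOPS target is a made-up number purely for illustration.

```python
def required_peak(sustained_target, efficiency):
    """Peak rating needed to sustain `sustained_target` at a given fraction of peak."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return sustained_target / efficiency


CLUSTER_EFF = 0.10  # rumored fraction of peak on a PC cluster (unverified)
VECTOR_EFF = 0.80   # rough fraction of peak on a vector Cray

if __name__ == "__main__":
    target = 100.0  # hypothetical sustained GFLOPS you actually need
    cluster_peak = required_peak(target, CLUSTER_EFF)
    vector_peak = required_peak(target, VECTOR_EFF)
    # The ratio of the two peak ratings is VECTOR_EFF / CLUSTER_EFF, about 8.
    print(cluster_peak, vector_peak, cluster_peak / vector_peak)
```

The ratio does not depend on the target you pick, only on the two efficiency figures, so the whole argument stands or falls on whether that 10% number holds up.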
So my point is that building a real supercomputer does not mean grabbing a bunch of off-the-shelf components, slapping them together with a decent network and running Beowulf (or a similar product).
Go Badgers! -- #include "std/disclaimer.h"