Cringely Wants A Supercomputer in Every Garage

Posted by timothy on Wednesday December 26, 2001 @04:05PM from the i'd-call-mine-claire dept.

Nate LaCourse writes: "Real good one from Cringely this month. It's on building his own supercomputer, but with some twists." You'll probably also want to check out the KLAT2 homepage to learn more about their Flat Neighborhood Network. And since KLAT2 has been around for nearly a year (check out the poster on this page!), perhaps a 3rd generation is in the works?

5 of 277 comments

the ignorant are easily amused by markj02 · 2001-12-26 17:37 · Score: 5, Insightful

Cringely is completely missing the point. KLAT2 uses multiple routes and switches, not channel bonding. And what the project contributes is not the basic idea of using multiple network interfaces (which is decades old), but a specific approach: using genetic algorithms to optimize the network topology. More traditionally, such clusters have used manually designed topologies with known performance bounds.
1. Re:the ignorant are easily amused by funnyguy · 2001-12-26 19:32 · Score: 4, Insightful
  
  The FNN, which was created for KLAT2, is not a way of speeding up Ethernet by using multiple network cards. It basically allows full speed (100 Mb/s full-duplex) between any two machines without a 64+ port, full-wire-speed switch, if such a thing even existed. Cringely's network is just 4 channel-bonded network layers. Channel bonding actually has slightly more overhead than FNN. With KLAT2's FNN, each machine is on 4 separate networks. No matter which other machine a given machine needs to communicate with, the two share a common network. Each network is held together with one switch, so there is always a full-speed route to every other computer in the cluster. The OS handles this directly by using /etc/ethers to hard-code the hardware addresses of every computer. Different networks are different subnets, and the network routes are laid out accordingly.... blah blah... I could go on and on, but aggregate.org has more info.
  
  As for the algorithm everyone is talking about: there are some versions which can return a pattern in a second or two on a slow Celeron. Then there are some versions optimized for certain datasets, which take time to run. But generally, you don't need a supercomputer to design an FNN, even with 64+ nodes.
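The property described above (any two machines share at least one switch, while no single switch has a port for every machine) can be checked mechanically. The following is only a minimal sketch: the six-node layout, the node and switch names, and the helpers `uncovered_pairs` and `ports_used` are all hypothetical illustrations, not KLAT2's actual wiring or tooling, and it says nothing about how the genetic algorithm searches for a layout.

```python
from collections import Counter
from itertools import combinations


def uncovered_pairs(nic_switches):
    """Return the node pairs that do NOT share any switch.

    nic_switches maps a node name to the set of switches its NICs plug into.
    An empty result means every pair has a single-switch route.
    """
    return [
        (a, b)
        for a, b in combinations(sorted(nic_switches), 2)
        if not (nic_switches[a] & nic_switches[b])
    ]


def ports_used(nic_switches):
    """Count how many ports each switch needs for this layout."""
    return Counter(s for switches in nic_switches.values() for s in switches)


# Hypothetical toy layout: 6 nodes, 2 NICs each, three 4-port switches.
# A single switch would need 6 ports; here no switch needs more than 4.
layout = {
    "n0": {"s0", "s1"},
    "n1": {"s0", "s1"},
    "n2": {"s0", "s2"},
    "n3": {"s0", "s2"},
    "n4": {"s1", "s2"},
    "n5": {"s1", "s2"},
}

if __name__ == "__main__":
    print("uncovered pairs:", uncovered_pairs(layout))
    print("ports per switch:", dict(ports_used(layout)))
```

A layout-design program then only has to search for an assignment where `uncovered_pairs` comes back empty and no switch exceeds its port count, which is consistent with the point that small cases can be solved quickly on modest hardware.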
2. Re:the ignorant are easily amused by Zeinfeld · 2001-12-27 00:45 · Score: 4, Insightful
  
  Quite. The problem with measuring supercomputer performance is that every single machine in the class is highly optimised to a particular niche. That is the main reason they are so expensive compared to the components: large machines sell in the tens rather than the tens of thousands.
  Anyone can build a machine with a really high processing performance. Just buy a few thousand Xboxes and plug them into the same Ethernet cable. The real issue is how much communications bandwidth you have between the CPUs. Some problems require almost none: the 'trivial parallelism' problems like DEScrack and the Mandelbrot set. In the 1980s we had a machine that had 1000 20MHz processors that could bang out Mandelbrot sets like anything (using the goofy algorithm, not the modern optimizations). But it wasn't much use for anything else.
  The problem with competitions for supercomputers is that they rarely measure the communication bandwidth because (a) it's hard to do and (b) the effect on performance is highly algorithm dependent.
  As for KLAT2's ingenious topology, I once did some research in the area myself when it was the fashion. I tried using minimum diameter graphs, which should in theory have been better than a plain torus. However, as with Bill Dally at Caltech, I concluded that the additional cost of exotic topology (more than double the price) was not really justified by the performance advantage (about 10-30% on a good day).
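For readers unfamiliar with the metric, the "diameter" of an interconnect is the worst-case number of hops between any two nodes, and a minimum diameter graph tries to make that number as small as possible for a given number of links per node. The sketch below only measures the plain torus baseline with a breadth-first search; the helper names `torus` and `diameter` are illustrative assumptions, and it does not attempt to reconstruct the exotic topologies or the cost figures mentioned in the comment.

```python
from collections import deque


def torus(rows, cols):
    """Adjacency map of a rows x cols 2-D torus (a grid with wraparound links)."""
    return {
        (r, c): {
            ((r + 1) % rows, c),
            ((r - 1) % rows, c),
            (r, (c + 1) % cols),
            (r, (c - 1) % cols),
        }
        for r in range(rows)
        for c in range(cols)
    }


def diameter(graph):
    """Longest shortest-path hop count over all node pairs.

    Runs a BFS from every node; assumes the graph is connected.
    """
    worst = 0
    for start in graph:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    # A 64-node torus: the farthest node is 4 hops away in each dimension.
    print(diameter(torus(8, 8)))
```

The same `diameter` function works on any adjacency map, so a candidate topology with the same node count and links per node can be compared against the torus directly.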
  Certainly the many companies that set up to build transputer based processing clusters with high performance switches inside did not seem to go anywhere much.
  Using a high performance router at the core of a processing cluster might be interesting. They are pretty cheap these days and are headed cheaper.
  
  --
  Looking for an Information Security student project suggestion?
  Try http://dotcrimeManifesto.com/
Supercomputing? Why bother. by Bowie J. Poag · 2001-12-26 17:46 · Score: 5, Insightful

Speaking as someone who, yes, has actually worked with the big iron...

Why bother. Remember, Moore's Law is still in effect. Recently, we've hit the point in the curve where supercomputers are no longer needed, nor cost-effective. That is, the time it takes for the industry to deliver a far superior product has eclipsed the average lifespan of your typical supercomputer.

We're living in an age where a single graphing calculator you can buy at Walgreens has more horsepower under the hood than what got us to the moon 30 years ago. Your $2700 PC will be worth $150 within 3 years.

Having a supercomputer in every garage makes about as much sense as taking a rocket fuel-powered dragster to the supermarket for a gallon of milk.

Cheers,

--
Bowie J. Poag
A real supercomputer? Not exactly by fgodfrey · 2001-12-26 18:49 · Score: 5, Insightful

The article would have people believe that all a supercomputer is is a collection of not-quite-modern processors, memory, and an interconnect of some sort. This is simply not the case. If it were, why do many (granted a smaller number than before) people still buy real big iron? The answer is that Cringely's (sp?) collection of processors is not a real supercomputer for the kinds of applications that are associated with traditional machines. Traditional vector supercomputers still have processors that are faster than Pentium 4 class systems. Traditional massively parallel supercomputers (which are the most similar to a cluster) have a number of features not found in your average garage-built cluster, like a truly low-latency interconnect, gang scheduling of entire jobs, and a single system image for users/administrators/processes.
Clusters are great for embarrassingly parallel applications (i.e. ones that have threads which don't communicate with each other much). This includes things like SETI@home and batch rendering of images. Where they don't compare is on applications that communicate a lot, like nuclear physics simulations. This is not to say that that will never change in the future, but for the time being it's still true.

Last, and certainly not least, real supercomputers have memory bandwidth that can match the speed of the processor. A Cray or an SGI Origin has an absolutely massive amount of bandwidth from the processor to local memory compared to a PC. That allows a traditional supercomputer to actually *achieve* the fantastic peak performance numbers. On many applications, the working sets are huge and don't fit in cache, so you end up relying on memory being fast. On a PC, it's not, and I've heard from sources I consider reliable (though I have no actual numbers to back this up, so it may be rumor only) that one large cluster site sees around 10% or less of peak on a cluster for a nuclear physics simulation, whereas on a vector Cray you can hit ~80% of peak. This means that the cluster has to be 8 times more powerful, and when you start multiplying the costs by 8, they start looking like the same price as a real supercomputer.
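The factor of 8 follows directly from the two efficiency figures quoted above. As a small worked sketch (the function names are illustrative, and the 10% and ~80% inputs are the commenter's own unverified, second-hand numbers rather than measurements):

```python
def peak_multiplier(cluster_pct_of_peak, vector_pct_of_peak):
    """How many times more peak performance the cluster needs in order to
    match the sustained throughput of the vector machine."""
    return vector_pct_of_peak / cluster_pct_of_peak


def sustained(peak, pct_of_peak):
    """Sustained throughput for a given peak rating and percent-of-peak."""
    return peak * pct_of_peak / 100


if __name__ == "__main__":
    # Figures from the comment: ~10% of peak on the cluster, ~80% on a vector Cray.
    print(peak_multiplier(10, 80))   # 8.0
    # A cluster rated at 800 units of peak sustains the same as a vector
    # machine rated at 100 units of peak.
    print(sustained(800, 10), sustained(100, 80))   # 80.0 80.0
```

Put another way, under these assumed figures the cluster only wins on price if each unit of its peak performance costs less than one eighth of what a unit of vector peak costs.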

So my point is that building a real supercomputer does not mean grabbing a bunch of off-the-shelf components, slapping them together with a decent network and running Beowulf (or a similar product).

--
Go Badgers! -- #include "std/disclaimer.h"