Slashdot Mirror


Compaq To Build DEC Beowulf Supercomputer

Tower writes: "Compaq Computer (Digital) and the Pittsburgh Supercomputing Center have won a $36 million contract to build a 2,728-processor supercomputer using 1.1 GHz EV68 processors in a 682 node Beowulf setup. Check it out here." This is a different machine than this one: That one was supposed to be used to calculate nuclear explosions, this one will be used by the National Science Foundation to work on biophysics, global climate change, astrophysics and materials science, according to the article.

13 of 99 comments

  1. Re:But why? by Tracy+Reed · · Score: 3
    Home computers (Linux systems) CAN share disks like this if you want to invest in fibrechannel (which may not be as expensive as you think). Check out:

    http://www.globalfilesystem.org

    Very cool technology. I have been following this for quite a while and it shows tremendous promise for solving all kinds of disk scalability problems.

  2. Re:But why? by Shimbo · · Score: 3
    > If you want massively parallel systems then I would honestly think that something like processtree would be a good solution since you can rent a phenomenal block of cpu time.

    A lot of these problems, like climate modelling, can be worked on by partitioning the problem into cells; you just need to fix up the edges on each iteration. Independent systems joined together, particularly with a low-latency interconnect, fit this sort of problem space well.
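    A toy sketch of that "partition into cells, fix up the edges" idea, assuming a 1D diffusion problem and plain Python lists standing in for nodes. The function names and the diffusion constant are made up for illustration; a real code would exchange the edge values over MPI or similar rather than by list indexing.

```python
def step(cells, left, right, alpha=0.1):
    """One explicit diffusion step on a block, given one ghost value per side."""
    padded = [left] + cells + [right]
    return [
        padded[i] + alpha * (padded[i - 1] - 2 * padded[i] + padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


def run_single(cells, steps, boundary=0.0):
    """Reference run: the whole domain lives on one 'node'."""
    for _ in range(steps):
        cells = step(cells, boundary, boundary)
    return cells


def split(cells, parts):
    """Cut the domain into contiguous blocks; the last block takes the remainder."""
    if not 1 <= parts <= len(cells):
        raise ValueError("need between 1 and len(cells) parts")
    size = len(cells) // parts
    blocks = [cells[i * size:(i + 1) * size] for i in range(parts - 1)]
    blocks.append(cells[(parts - 1) * size:])
    return blocks


def run_partitioned(cells, parts, steps, boundary=0.0):
    """Same computation, but each block only sees its neighbours' edge cells."""
    blocks = split(cells, parts)
    for _ in range(steps):
        # The "fix up at the edges": every block receives one value from each
        # neighbour. This is the only data that crosses block boundaries.
        lefts = [boundary] + [b[-1] for b in blocks[:-1]]
        rights = [b[0] for b in blocks[1:]] + [boundary]
        blocks = [step(b, l, r) for b, l, r in zip(blocks, lefts, rights)]
    return [value for block in blocks for value in block]


if __name__ == "__main__":
    start = [0.0] * 12
    start[5] = 100.0
    print(run_partitioned(start, parts=4, steps=50))
```

    The partitioned run gives the same answer as the single-block run, and per iteration each block ships only two numbers to its neighbours no matter how many cells it owns, which is why a low-latency interconnect matters more than raw bandwidth here.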

    Obviously, there are some problems where the dependencies between the data sets are nil, and for those commodity Intel/Athlon/Alpha Linux boxes are ideal. There are still more where they are cost-efficient ;)

    Supercomputing facilities are best equipped with a mixture of these. For some jobs a steamroller is better than a Porsche. When you've got a specific requirement, and lots of money is involved, off the shelf components are not always the best bet.

    > they surely lack the memory bandwidth that makes traditional mainframes and supercomputers so powerful.

    Yes, but these aren't Beowulf clusters. Quadrics hardware is not some cheap and cheerful solution like switched Gigabit Ethernet ;)

  3. Re:Not Beowulf/Linux by Black+Parrot · · Score: 3

    > I dunno -- seems to me like the author is saying that it really is a Beowulf cluster.

    I took it to be a "Beowulf clone" or a "Beowulf-style cluster". AFAIK (please correct me!), "Beowulf" refers specifically to a GPL'd Linux kernel hack, and thus any "Beowulf cluster" would be a Linux cluster. But I would assume it would be more or less straightforward to implement on Unices, at least for parties who have the source code, in which case I would call it a "Beowulf type cluster", or give it a new name altogether. But perhaps the term has been generalized; I think it has already generalized once from referring to "the" Beowulf cluster (the original one), to referring to all clusters built with the same kernel patch.

    OTOH, there was a [epithet of your choice for a moron here] on the Beowulf mailing list for a while, who was adamant that his NT cluster was a "Beowulf" system. I never figured out why he even subscribed, since any exchange of information there would be completely irrelevant to his situation. Shows the importance of bragging rights in the IT world, I suppose.

    --
    Sheesh, evil *and* a jerk. -- Jade
  4. How much power does such a thing use? by Idaho · · Score: 3

    I always wonder how much power goes into these kinds of beasts.

    Let's try to estimate it: 682 systems, each containing 4 processors. I guess that each will need a 300 W power supply. So that makes about 205 kW just for the computers (when working at full speed only, OK)!

    At 110 V this thing would eat about 1,860 amperes, not something you'd like to try at home (imagine the electricity bill :-)
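    The same back-of-the-envelope estimate as a few lines of Python, for anyone who wants to plug in other guesses. The 300 W per node and the 110 V supply are the guesses from above, not measured figures, and the constant names are just for illustration.

```python
NODES = 682            # from the article
CPUS_PER_NODE = 4      # from the article
WATTS_PER_NODE = 300   # guessed power supply rating, not a measured draw
MAINS_VOLTS = 110      # assumed supply voltage


def cluster_power_watts(nodes=NODES, watts_per_node=WATTS_PER_NODE):
    """Total power if every node draws its full guessed rating."""
    return nodes * watts_per_node


def current_amps(power_watts, volts=MAINS_VOLTS):
    """Current needed to deliver that power at the given voltage (I = P / V)."""
    return power_watts / volts


if __name__ == "__main__":
    total = cluster_power_watts()
    print(f"{NODES * CPUS_PER_NODE} processors")
    print(f"{total / 1000:.1f} kW, {current_amps(total):.0f} A at {MAINS_VOLTS} V")
    # prints "2728 processors" and "204.6 kW, 1860 A at 110 V"
```

    This ignores cooling, disks, and the interconnect, so the real figure is presumably higher.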

    --
    Every expression is true, for a given value of 'true'
  5. Test Drive a Beowulf by chamber · · Score: 5
    As a matter of fact, we've just set up a new Beowulf cluster that people can play with. It's got 9 DS10L's (1U rack-mountable Alpha systems), each with a 466 MHz EV6 Alpha, 256 MB of RAM, and two 10,000 RPM UltraSCSI drives. If you're interested, stop by http://www.testdrive.compaq.com/, where you can get all the dirt and get a free shell account on it and our other Test Drive machines.

    Yes, I work for Compaq. No, I don't speak for them.

  6. Re:But why? by dehuit · · Score: 3
    > I have to wonder what the point is in massive beowulf clusters like these. Sure they are fast and give you more Mips than flanders next door, but they surely lack the memory bandwidth that makes traditional mainframes and supercomputers so powerful.

    > If you want massively parallel systems then I would honestly think that something like processtree would be a good solution since you can rent a phenomenal block of cpu time.

    Well, obviously these machines are something in between the extremes you mention, and there are applications for which this is sort of a sweet spot.

    I have used an application for which this type of machine is excellent: molecular dynamics simulations.

    The usual strategy for this type of software is to partition your system by giving every proc a share of the atoms. Then you start calculating forces and motions etc. for each part for a short time period, and then compare them. Many forces extend to neighbouring parts, and atoms can move to other parts, so quite a lot of communication between the nodes is necessary. After exchanging this info, each node can compute the next timestep. This works quite well if most interactions between atoms are relatively short-range.

    This type of app is excellently suited for a large cluster. It is naturally suited for message-passing, so programming it using MPI is easy. If you partition the system well, the memory use of one node is quite small, and fits for a large part in cache. I/O between nodes has to happen quite often, so latency is a problem, which means processtree is obviously not an option.

    These simulations scale quite well to larger molecular systems. Unfortunately, many researchers don't want more atoms in their systems; they want the simulation of their small system done faster. That kind of scaling is bad: if you end up with only a few atoms per node, the communication overhead bogs it down.
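    A rough way to see why few atoms per node hurts, under a deliberately crude assumption: each node owns a cubic block of atoms on a regular grid, and only the outermost layer of the block has to be exchanged with neighbours. Real MD codes (gromacs included) decompose differently, so treat this as an illustration, not a model of any particular program.

```python
def boundary_fraction(atoms_per_side):
    """Fraction of atoms in an n x n x n block that lie on its outer layer.

    Interior atoms form an (n - 2)^3 cube, so everything else is boundary
    that must be communicated to neighbouring nodes each timestep.
    """
    n = atoms_per_side
    if n < 1:
        raise ValueError("a block needs at least one atom per side")
    if n <= 2:
        return 1.0  # no interior at all: every atom is a boundary atom
    return 1 - ((n - 2) / n) ** 3


if __name__ == "__main__":
    for n in (2, 4, 10, 100):
        print(f"{n ** 3:>8} atoms per node: {boundary_fraction(n):.1%} on the boundary")
```

    With a million atoms per node only about 6% of them sit on the boundary, but with a thousand it is nearly half, so splitting a small system over more nodes mostly buys more communication.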

    FYI, here are some old benchmarks of the software I used (gromacs). Although this software is considered to scale excellently, a 64-node machine is only 32 times as fast as a single-node machine...
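    To put a number on that, here is a small sketch that turns "32 times as fast on 64 nodes" into a parallel efficiency, and into the serial fraction that Amdahl's law would need to explain it. Using Amdahl's law here is an assumption made for illustration: communication overhead is not really a fixed serial fraction, so the second number is only a rough yardstick.

```python
def parallel_efficiency(speedup, nodes):
    """Speedup actually achieved per node used (1.0 would be perfect scaling)."""
    return speedup / nodes


def amdahl_speedup(serial_fraction, nodes):
    """Amdahl's law: speedup = 1 / (f + (1 - f) / nodes)."""
    return 1 / (serial_fraction + (1 - serial_fraction) / nodes)


def implied_serial_fraction(speedup, nodes):
    """Solve Amdahl's law for f, given an observed speedup on `nodes` nodes."""
    if nodes < 2:
        raise ValueError("need at least two nodes to infer a serial fraction")
    return (nodes / speedup - 1) / (nodes - 1)


if __name__ == "__main__":
    eff = parallel_efficiency(32, 64)
    f = implied_serial_fraction(32, 64)
    print(f"efficiency: {eff:.0%}, implied serial fraction: {f:.2%}")
```

    So the quoted benchmark corresponds to 50% efficiency, which under this model takes a non-parallel share of only about 1.6% of the work.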

    Sorry if all this is incomprehensible, I guess I want to say too much too fast...

  7. Not Beowulf/Linux by Wookie+Athos · · Score: 4

    If you read the C|Net page carefully you will see it says the machines are to be 4-CPU Compaq boxes running Tru64 Unix.

    The writer did mention Beowulf, but only to say that it was similar.

    __
    Conclusions are easy to jump to. Just be prepared to jump again...

  8. Re:Will this really be supercomputer? by dehuit · · Score: 3
    > At least, they seem to be custom-built by a company that specializes in such things.

    Which has an excellent product page here. 2.35 usec latency for a short message. 340 MB/s peak, 210 MB/s sustained throughput. Fault tolerant redundant links. Tru64, Solaris and Linux support. I know nothing about this, but it sounds impressive to me.

  9. But why? by grahamsz · · Score: 3

    I have to wonder what the point is in massive beowulf clusters like these. Sure they are fast and give you more Mips than flanders next door, but they surely lack the memory bandwidth that makes traditional mainframes and supercomputers so powerful.

    If you want massively parallel systems then I would honestly think that something like processtree would be a good solution since you can rent a phenomenal block of cpu time.

    > Each of these 682 nodes will be running Compaq's Tru64 Unix, which is capable of sharing a single file system

    Wow, if only home computers could share disks like that!!! This actually makes me think that the nodes are operating as independent computers rather than part of a whole... but hey, I'm probably wrong :)

  10. This is not a beowulf cluster by amck · · Score: 3

    The article was vague with the 'souped-up beowulf'. These AlphaServer SC machines are not just connected by fast ethernet, they share a Quadrics switch that provides ~200 MB/s bandwidth with 5us latency per node.
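    For a feel of what those two numbers mean together, here is a sketch using the simplest possible cost model, time = latency + size / bandwidth, with the approximate figures quoted above. The model and the reading of MB as 10^6 bytes are assumptions; real switch behaviour is more complicated.

```python
LATENCY_S = 5e-6        # ~5 us per message, as quoted above
BANDWIDTH_BPS = 200e6   # ~200 MB/s, taking 1 MB as 10**6 bytes (assumption)


def transfer_time(nbytes, latency=LATENCY_S, bandwidth=BANDWIDTH_BPS):
    """Seconds to deliver one message under the latency + size/bandwidth model."""
    return latency + nbytes / bandwidth


def effective_bandwidth(nbytes, latency=LATENCY_S, bandwidth=BANDWIDTH_BPS):
    """Bytes per second actually achieved for messages of this size."""
    return nbytes / transfer_time(nbytes, latency, bandwidth)


def half_bandwidth_size(latency=LATENCY_S, bandwidth=BANDWIDTH_BPS):
    """Message size at which latency and transmission time are equal."""
    return latency * bandwidth


if __name__ == "__main__":
    for size in (64, 1_000, 100_000):
        mb_per_s = effective_bandwidth(size) / 1e6
        print(f"{size:>7} bytes: {transfer_time(size) * 1e6:7.1f} us, {mb_per_s:6.1f} MB/s")
```

    Under this model a message has to be around 1,000 bytes before it reaches even half of the peak bandwidth, so for small, frequent messages the latency figure is the one that counts.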

    Alastair McKinstry
    AlphaServer SC Engineering, Compaq.

    --
    Anyone who believes exponential growth can go on forever in a finite world is either a madman or an economist
  11. Re:Programming this Beast by -brazil- · · Score: 3

    Well, those machines are most commonly employed to solve numerical problems (as in: huge systems of differential equations). For that kind of work, High Performance Fortran can be used. HPF basically consists of extensions to Fortran that allow you to explicitly divide data (i.e. parts of matrices) between nodes and still use standard operations on it. The compiler takes care of the inter-node communication, and if you divided the data wisely, there hopefully won't be too much of it.
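    To illustrate the idea of dividing a matrix between nodes, here is a sketch in Python rather than HPF. It mimics a block distribution of matrix rows (contiguous chunks of ceiling(n / nodes) rows each) and runs a matrix-vector product one chunk at a time. The function names are invented for this example, the "nodes" are just loop iterations, and in HPF the compiler would generate the distribution and communication from directives instead.

```python
def block_ranges(n, nodes):
    """Row ranges (lo, hi) owned by each node under a block distribution.

    Every node gets ceil(n / nodes) contiguous rows; trailing nodes may get
    fewer, or none if there are more nodes than needed.
    """
    if nodes < 1:
        raise ValueError("need at least one node")
    size = -(-n // nodes)  # ceiling division
    return [(min(k * size, n), min((k + 1) * size, n)) for k in range(nodes)]


def matvec_distributed(matrix, vector, nodes):
    """Matrix-vector product where each 'node' computes only its own rows."""
    result = []
    for lo, hi in block_ranges(len(matrix), nodes):
        local_rows = matrix[lo:hi]  # the slice this node would hold in memory
        # Each node needs the whole vector but only its own rows, so the
        # only shared data is the vector itself.
        result.extend(sum(a * x for a, x in zip(row, vector)) for row in local_rows)
    return result


if __name__ == "__main__":
    print(block_ranges(10, 4))
    print(matvec_distributed([[1, 2], [3, 4], [5, 6]], [1, 1], nodes=2))
```

    Dividing by rows keeps each node's work independent here; a poor choice of distribution for the operation at hand is what produces the extra inter-node traffic mentioned above.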

    --

    The illegal we do immediately. The unconstitutional takes a little longer.
    --Henry Kissinger

  12. More on PSC by gotih · · Score: 3

    The PSC has a release here

    I was involved with the Pittsburgh Supercomputing Center in high school. We were given a grant for processing time, something like $40,000, to compute the heat loss of my community due to improper insulation. Admittedly, I was on the fringe of the group, but I know they have been using massively parallel systems for a while. They also had an Internet connection, which is where I first used Lynx.

    At that time they had a T3D and a "DEC supercluster" which was IIRC 256 Digital Alpha computers. They had some other supercomputers but I can't remember what they were. The supercluster was later upgraded to 512 processors. It seems that this is the same thing, updated and built by Compaq (who bought Digital).

    --

    fear is the mind killer
  13. Programming this Beast by acacia · · Score: 3

    I keep hearing about these projects, and the means by which the nodes of these machines are connected, but what I really want to know is how these clusters are programmed. More to the point, how are data and process parallelism implemented (or not) when you are talking about a high-complexity environment and a fairly low level of abstraction?

    I write software for MPP & large scale SMP machines, but I use tools like Ab Initio or Torrent Orchestrate to abstract away much of the complexity for traffic control, checkpointing, hash partitioning data, etc. In my cursory examination of PVM and the MPI implementations, they seem pretty primitive, and the code must be a nightmare to implement properly, much less maintain.

    Is anyone working on a GNU componentized approach similar to the commercial packages I mentioned earlier to take care of this? Is anyone interested in doing this? This could be a pretty cool project.

    The other reservation I have when I look at the whole beowulf architecture is the node latency issue. Unless you have highly partitioned code, with independent processes, these machines are gigantic toasters, spending most of their lives waiting for IO. A well designed, partitioned app should be CPU bound. Most of the business apps I develop don't exhibit these (well partitioned) characteristics all the way through the process. It makes me wonder how effective these machines really are.

    --
    ~Religion is O.K., as long as it gets you laid.