IBM Creates New Fastest Beowulf Cluster

Re:Cost/performance by Indomitus · 2000-03-22 01:22 · Score: 2

I have a friend at SGI who said SGI's bid for this was almost at cost and it was like $1 million. IBM is losing a lot of money on this deal but they get the publicity and the fact that all the geeks at UNM will see Big Blue's logo all over it.

Also, the cost of the boxes is almost unimportant with something like this. You have to take into account the actual construction of the network (usually the most time-consuming part of any supercomputer), and the main cost is the support contract. You have to have people available to fix this guy at a moment's notice whenever it breaks.

SMP on the nodes? by Nicolas+MONNET · 2000-03-21 21:56 · Score: 2

I guess that would depend largely on the software run; however, I believe that SMP will be useful on the nodes. Simple SMP systems are dirt cheap nowadays, and they actually cost less than two UP boxes (1 power supply, 1 MB, 1 bus, 1 network interface, etc.)

Re:What kind of network speed do you need? by otis+wildflower · 2000-03-21 22:45 · Score: 2

How about something like Myrinet? Iffen you gots the $$$ ;)

Your Working Boy,

Cost/performance by larien · 2000-03-21 21:19 · Score: 2

Only 24th in Supercomputer rankings, but what do those 64 Netfinities cost? I'm pretty sure it would come in at under the cost of most of the top 500; I'm estimating 10,000 UKP per box, which is under a quarter of a million. Not bad. However, it must be a bitch to manage 64 separate nodes to make a single 'unit'.

Also, it mentions the limitations of networking; can't you link together 3com (now defunct, I know) switches in a stack to make larger switches? If not, I'm pretty sure that we'll have larger switches in the next few years.
--

Re:Cost/performance by Sensor · 2000-03-21 21:26 · Score: 2

Couldn't see a spec in the article, but we have a couple here (dual Xeon 500, 1 GB RAM, RAID) and their list price is around 16k.

And just to be picky: 64 x 10k = 640k, or over half a million!
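
A quick way to check that arithmetic, as a rough sketch: the 10,000 per-box figure is larien's guess and the 16,000 is the list price quoted above, so neither is a confirmed price for the actual cluster, and `cluster_cost` is just a name made up for this example. It counts boxes only, no network or support.

```python
NODES = 64  # node count mentioned in the thread


def cluster_cost(per_box, nodes=NODES):
    """Total hardware cost for `nodes` identical boxes (boxes only)."""
    return per_box * nodes


estimate = cluster_cost(10_000)    # larien's per-box guess
list_price = cluster_cost(16_000)  # list price quoted in this comment
print(estimate, list_price)
```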

Tom

Re:(OT)How does a first post get marked as redunda by Skinka · 2000-03-21 21:27 · Score: 2

Because the joke is so damn obvious. Christ sakes my mother could have thought of that one.

That's it, the moderators are on drugs by orcrist · 2000-03-21 21:59 · Score: 2

What possessed someone to mark the above post as 'flamebait'?

I literally read it 3 times through to find the 'flamebait' there: nada. The only down-moderation that could have had a sliver of merit would have been 'overrated', but this is ridiculous. Hopefully this gets caught in meta-moderation...

Chris

--
San Francisco values: compassion, tolerance, respect, intelligence

Re:That's it, the moderators are on drugs by Tower · 2000-03-21 22:32 · Score: 2

The only thing I could think of is that the moderator was an SMP designer and took offense to the "SMP is going to run out of steam" comment. Otherwise, I can't see any there either...

--
"It's tough to be bilingual when you get hit in the head."

Re:not Beowulf? by rappleye · 2000-03-22 01:57 · Score: 2

You're making a category mistake. MPI and PVM are message-passing libraries. LINDA is a programming language that uses a tuple space stored in distributed shared memory (see here for more info). HACMP is a completely different beast; see IBM's homepage.

Beowulf != any of these. Beowulf is the idea that one can take commodity, off-the-shelf (COTS) components and build a powerful machine at a price far less than a comparable commercial offering.

Codes run on Beowulf, and really any parallel machine, typically use MPI, PVM, or custom message-passing libraries. The Beowulf idea includes the use of MPI & PVM, among other freely available software packages. Codes that run on shared memory machines typically use the shared memory device of MPI, shared memory, or pthreads.
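
For anyone who hasn't seen the message-passing style, here is a minimal sketch of the send/receive/gather pattern. To be clear, this is not MPI or PVM: threads and `queue.Queue` from the Python standard library stand in for separate nodes and the network, and the names `run_ranks`, `inboxes` and `results` are invented for the illustration.

```python
import queue
import threading


def run_ranks(chunks):
    """Rank 0 sends one chunk to each worker 'rank' and gathers partial sums."""
    n_workers = len(chunks)
    inboxes = [queue.Queue() for _ in range(n_workers)]  # one mailbox per worker
    results = queue.Queue()                               # mailbox for rank 0

    def worker(rank):
        data = inboxes[rank].get()       # blocking receive
        results.put((rank, sum(data)))   # send the partial result back

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n_workers)]
    for t in threads:
        t.start()
    for rank, chunk in enumerate(chunks):
        inboxes[rank].put(chunk)         # send
    partials = dict(results.get() for _ in range(n_workers))  # gather
    for t in threads:
        t.join()
    return sum(partials.values())


total = run_ranks([[1, 2, 3], [4, 5], [6]])
print(total)
```

The workers share no data structures except the mailboxes, which is the point: each one only sees what it is sent, the same discipline a real message-passing code follows across machines.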

For CPU-intensive tasks the Beowulf idea is great. Codes that perform lots of disk I/O suffer, as adding higher performance (i.e. SCSI) disks increases system cost greatly. Communication-intensive tasks perform the worst on Beowulf-style clusters compared to commercial computers, as the interconnect on Beowulf-style clusters can't compare. For a relatively large increase in cost, one can use Myrinet. With Myrinet, bandwidth and latency begin to approach that of the switch found on the IBM SP series of machines.
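
A back-of-the-envelope model shows why the interconnect matters so much more for communication-heavy codes. This is only a sketch: the formula (compute time divided across nodes, plus per-message latency and transfer time) is a common simplification, and every number below is made up for illustration, not a measurement of Ethernet, Myrinet, or the SP switch.

```python
def step_time(work_s, nodes, msgs, msg_bytes, latency_s, bytes_per_s):
    """Estimated time for one step: parallel compute plus per-node messaging."""
    compute = work_s / nodes
    comm = msgs * (latency_s + msg_bytes / bytes_per_s)
    return compute + comm


# Hypothetical links (illustrative values only)
slow = dict(latency_s=100e-6, bytes_per_s=12.5e6)
fast = dict(latency_s=10e-6, bytes_per_s=125e6)

# Hypothetical jobs: same compute, very different message traffic
cpu_bound = dict(work_s=64.0, nodes=64, msgs=10, msg_bytes=1_000)
comm_bound = dict(work_s=64.0, nodes=64, msgs=10_000, msg_bytes=100_000)

for name, job in (("cpu-bound", cpu_bound), ("comm-bound", comm_bound)):
    print(name, step_time(**job, **slow), step_time(**job, **fast))
```

With these assumed values the CPU-bound job barely notices which link it runs on, while the communication-bound job is dominated by the link.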

With high-bandwidth, low-latency interconnect technologies that scale well (e.g. Myrinet), one can build a cluster that outperforms a comparable commercial offering at, say, one quarter to one eighth the price. The difference at that point is software. There's really not a lot out there to configure and administer Beowulf-style clusters, and commercial implementations of some packages beat the pants off of their freely available counterparts (compilers, for example). Until the software situation changes there is still reason to buy your big iron from IBM, SGI, and Sun.

--Jason

Re:not Beowulf? by dublin · 2000-03-23 05:22 · Score: 2

Actually, I'm not making a category mistake (I even noted in my original post that the things I was using as examples were not necessarily interchangeable), but you've made my point for me.

As you note, the real power of distributed/parallel computing comes from the message-passing libraries, most commonly MPI or PVM. Beowulf per se is almost nothing more than a label for the generic concept of distributed computing on Linux. The same thing can be done with any other reasonably modern networked computer you have lying around, even those running Windows - you can even mix OSes in a cluster, although this introduces new and interesting problems. (There are a serious lot of underutilized cycles sitting out there on the corporate world's desktops if they're not running OpenGL screensavers...)

BTW: If the phenomenal success of Sun's E10000 Starfire has taught us anything at all, it's that where I/O is important, a big honkin' SMP box kicks cluster butt! Seriously, the interconnect technology between boxes just *can't* be fast enough to compete effectively with a huge multi-level crossbar packet switch like the ones in the E10K. Sun and the other SMP vendors can win here because they own the domain in which the simpler problem resides...

Don't assume by this that I'm against Beowulf clusters at all - they are a great and amazing thing, but there's more than one way to skin a cat, and Beowulf isn't the only path to Linux distributed computing.

--
"The future's good and the present is nothing to sneeze at." - Roblimo's last ./ post

Re:not Beowulf? by dublin · 2000-03-22 00:28 · Score: 2

The article doesn't say, but despite what you'd think by reading the rantings of the ill-informed 3l337 d00dZ on slashdot, Beowulf isn't even a very good clustering technology for most problems.

There are far more serious, industrial-strength solutions out there, things like MPI, PVM, LINDA, and IBM's own HACMP. (Note these cover a lot of ground and are not necessarily even comparable to one another.)

Beowulf (or any of the others listed above) is not automatically the correct distributed computing methodology. Selecting the proper solution for the job at hand is far more complex than you might imagine. There is a lot more developer activity on some of these than there is on Beowulf - MPI in particular is maturing rapidly and is used for solving big/tough problems in many of the largest companies in the world. (No particular MPI advocacy or bias, it just seems like I run into it more often than the others...)

--
"The future's good and the present is nothing to sneeze at." - Roblimo's last ./ post

Re:IO by Tower · 2000-03-21 22:38 · Score: 2

If you need it, get Ultra3 SCSI with solid-state drives. Sure it'll cost you an entire lifetime's salary, but hey, they're great.

Really though, using a solid-state drive as a cache for a disk subsystem is an easy way to enhance performance, and is already being sold. You perform a write - instant gratification, and with proper caching algorithms, you can get the same thing for reads. A multi-gig SS drive can easily max out a bus. Multi-level caching is a necessity as speeds increase in systems.
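
Here is roughly what that looks like in code, as a sketch only. It assumes a write-back cache with least-recently-used eviction, which is one common choice and not necessarily what any shipping product does; `CachedDisk` and its counters are hypothetical names, and a plain dict stands in for the slow disk array.

```python
from collections import OrderedDict


class CachedDisk:
    """Small fast cache (stand-in for the solid-state drive) in front of a slow store."""

    def __init__(self, backing, capacity):
        self.backing = backing        # dict standing in for the slow disk array
        self.capacity = capacity      # number of blocks the cache can hold
        self.cache = OrderedDict()    # block -> (data, dirty), oldest first
        self.slow_reads = 0
        self.slow_writes = 0

    def _touch(self, block, data, dirty):
        self.cache[block] = (data, dirty)
        self.cache.move_to_end(block)          # mark as most recently used
        if len(self.cache) > self.capacity:
            old, (old_data, old_dirty) = self.cache.popitem(last=False)
            if old_dirty:                      # write back only modified blocks
                self.backing[old] = old_data
                self.slow_writes += 1

    def write(self, block, data):
        self._touch(block, data, dirty=True)   # completes at cache speed

    def read(self, block):
        if block in self.cache:                # hit: no trip to the slow disk
            data, dirty = self.cache[block]
            self._touch(block, data, dirty)
            return data
        self.slow_reads += 1                   # miss: fetch from the slow disk
        data = self.backing[block]
        self._touch(block, data, dirty=False)
        return data


disk = CachedDisk({0: "a", 1: "b", 2: "c"}, capacity=2)
disk.write(0, "A")        # absorbed by the cache
first = disk.read(0)      # served from the cache
disk.read(1)              # miss
disk.read(2)              # miss; evicts block 0 and writes it back
print(first, disk.slow_reads, disk.slow_writes, disk.backing[0])
```

The write returns as soon as the cache has the data, and the slow disk is only touched on a miss or when a modified block gets pushed out.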

In this sort of system, the interconnect fabric (as fast as it is) can still be a little bit of a bottleneck, too... A good cached RAID disk system on the one end can really keep things smoking, though.