Linux Clustering Cabal project

Posted by Roblimo on Friday September 24, 1999 @01:37PM from the no-it's-not-beowulf dept.

RayChuang turned us on to this ZDnet story about the Linux Clustering Cabal project, which, Ray says, is "...the one that will allow Linux server clustering of many server machines. Sounds like just the thing to finally get eBay working reliably and also make John C. Dvorak eat his words about the deficiencies of Linux."

right and wrong by mattdm · 1999-09-24 11:21 · Score: 4

You're right in that clustering is not anything new -- one of the best implementations is in Digital's OpenVMS, which is pretty old-school if you're counting in internet years.

But clustering is very different from the examples you give. It's not running different services on different machines. It is taking a bunch of machines and making them act as one.

Beowulf-style clusters are one way of doing this, but there's a limit to how many nodes you can connect that way and still get performance increases. It scales up, but probably not to thousands of nodes. Now, the LCC people obviously haven't built anything to prove that they can do better, but it sounds like they may have a theoretical improvement.

And, it's only hinted at in the article ("satisfies both commercial data processing and HPC requirements"), but it's also possible that this technology is not only fast but, unlike Beowulf, also provides improved robustness.

This is all vapor now of course. But we'll see. The people working on this have some important projects to their credit.

--
*cough* Clustering 'new'? by Signal 11 · 1999-09-24 08:57 · Score: 5

Umm, not to burst anybody's bubble.. but decentralized computing has been the paradigm for IT for a long time - put your web server on one box, your DNS on another, your mail server on a third (Multiply the number by 4 if you are running NT...), etc.
Clustering isn't ground-breaking technology.. it's been around for a long time. Now, the concept of parallel processing has been around for a long time too... and it doesn't seem like many manufacturers are rushing to get their products working on Beowulf clusters.
This isn't to say it isn't a great idea - it's just that there isn't any support for it. There's plenty of alternatives too. For example:
Webservers: Set up several servers, and an SQL backend (or an NFS mounted partition) to hold the content. For added speed, throw squid over that setup. You can even tell remote caches to access your servers round-robin style by putting in multiple 'A' records.
DNS/mail: Heh. Even the IETF got this one right by suggesting primary and secondary DNS.
Filesharing: There is some work being done on a 'real' Beowulf cluster that acts as something of a decentralized logical file server. For now, use AFS or CODA.. which have all kinds of cool performance benefits. As an aside - both are a helluva lot more stable than the Nightmare File System (NFS).
Printing: They have affordable net appliances to do this (HP print server anyone?), and even some printers support direct access. Failing that, setting up multiple servers for multiple printers works pretty well - This is decentralized by design anyway...
So there you have it... all the staples of the corporate network - "clusterized". New technology? I don't think so. All the examples I gave you are in wide use (and have been for some time!).
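To make the multiple 'A' record trick above concrete, here is a minimal sketch of round-robin selection over a set of addresses. It only illustrates the rotation idea and is not how any particular DNS server or cache implements it; the addresses are made-up documentation-range values and the function names are hypothetical.

```python
from itertools import cycle

# Hypothetical A records for a single hostname (documentation-range addresses).
A_RECORDS = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]


def round_robin(addresses):
    """Return an endless iterator that hands out addresses in rotation."""
    if not addresses:
        raise ValueError("need at least one address")
    return cycle(addresses)


def assign_requests(addresses, n_requests):
    """Map n_requests consecutive requests onto servers, one after another."""
    picker = round_robin(addresses)
    return [next(picker) for _ in range(n_requests)]
```

Note that plain rotation like this spreads load but knows nothing about server health: a dead box keeps receiving its share of requests until its record is removed.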

--
But how does it work? by Matt Welsh · 1999-09-24 08:59 · Score: 5

I would love to see a whitepaper on this. I have spoken a couple of times with Stephen Tweedie about his ideas, and he certainly has a lot of experience (he worked on VMS clusters for a while). However there are many smart people all over the world working on this same set of problems -- Microsoft, IBM, Oracle, Compaq, etc. all spring to mind. A large number of university research projects are working on things that most commercial vendors aren't even thinking of yet -- my own research project at Berkeley being one of them.

For those who want some background on the important issues, I highly recommend Gregory Pfister's book In Search of Clusters. Clustering is a lot harder than most people realize, and people should not ignore the work that's been done before in this area. The important question for LCC is what is fundamentally new in their design. I doubt that the lack of kernel locks is really it.

The thing that remains to be seen is what set of applications they target, and what tradeoffs they make to support those applications. The fundamental issues in clustering have been addressed by a large number of research projects and products, and I'd like to know what's new about LCC.

That being said, I'm happy that some smart people are going after this problem!

Are they trying to duplicate SGI? by LL · 1999-09-24 09:38 · Score: 4

As Matt Welsh noted, it is not exactly a trivial problem. If you look very closely at the article, the LCC wants to occupy a happy ground between the share-nothing crowd (Microsoft, Tandem) and the share-everything crowd (Oracle). The share-nothing paradigm is rather simplistic in its approach and reflects the fact that throwing together a bunch of machines with a cheap interconnect is a comparatively straightforward re-engineering approach. The share-everything paradigm comes from the extension of shared-bus architectures (e.g. Sun Starfire), which enforces a multiple-lock strategy. Companies like SGI have thrown millions of R&D dollars into the middle ground, which is why their cc-NUMA architecture and cellular IRIX are quite popular. I wish the LCC luck, but there is a reason why a successful working solution is expensive: it requires a savvy combination of hardware + software + smart routing (the SGI solution uses a cache directory). You are effectively paying for some very sophisticated know-how as part of every SGI machine.

Given the direction that SGI is heading (Linux for entry-level & apps + IRIX kernel extensions for high-end), I wonder whether the LCC will produce anything practical in a realistic time-frame. This is not to decry their laudable efforts, and I would hope businesses are patient enough to wait for robust and cheap solutions. If nothing else, it will hopefully offer a standardised set of software extensions (a la OpenMP) and coding practices so that a single source tree can support 1 to n processors.

Who knows, they might be able to come up with a few tricks that the pros have missed.

LL

What I need for clustering by Bruce Perens · 1999-09-24 11:01 · Score: 4

I need a replicator for the Zope database. I think Digital Creations is working on a closed-source one for their support-option customers, but of course a Free Software one would be nice to have for the rest of us. Essentially, I'd like to put Zope servers in colocation sites that are distant from each other, and have each of them keep a local copy of a common Zope database, propagating updates to the database to each other, and resynchronizing with each other after a network partition event.
I'd also be interested in hearing about any Free Software databases that can do this sort of synchronization. Thanks.
Bruce

--
Bruce Perens.
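
For readers wondering what "propagating updates and resynchronizing after a network partition" could look like, here is a toy sketch. It is not Zope's storage API and not anything Digital Creations has described; it assumes a simple key/value store, a per-replica logical clock, and a last-writer-wins rule, all of which are illustrative choices. The names Replica and resync are hypothetical.

```python
class Replica:
    """One site's local copy of a shared key/value database (toy model)."""

    def __init__(self, name):
        self.name = name
        self.clock = 0    # logical clock, bumped on every local write
        self.store = {}   # key -> (clock, replica name, value)

    def write(self, key, value):
        """Apply a local update and stamp it with this replica's clock."""
        self.clock += 1
        self.store[key] = (self.clock, self.name, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[2] if entry else None

    def merge_from(self, other):
        """Pull in the other replica's entries, keeping the newer stamp.

        Ties on the clock are broken by replica name so that both sides
        pick the same winner (last-writer-wins, an assumed policy).
        """
        for key, entry in other.store.items():
            mine = self.store.get(key)
            if mine is None or entry[:2] > mine[:2]:
                self.store[key] = entry
        self.clock = max(self.clock, other.clock)


def resync(a, b):
    """Bring two replicas back to the same state, e.g. after a partition."""
    a.merge_from(b)
    b.merge_from(a)
```

Last-writer-wins silently discards one side of a conflicting edit, which may not be acceptable for real content; a production replicator would need a conflict policy suited to the application.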