Flattening Out The Linux Cluster Learning Curve

Posted by michael on Sunday October 31, 2004 @12:05AM from the stomp-stomp dept.

editingwhiz writes "IT Manager's Journal has a good look at a forthcoming book, The Linux Enterprise Cluster, that explains in clear, concise language how to build a Linux enterprise cluster using open source tools. Writer Elizabeth Ferranini interviews author Karl Kopper in a Q&A. Is this more complicated than it appears to be? (IT Manager's Journal is part of OSTG.)"

The problem with clustering in Linux... by Xpilot · 2004-10-31 00:41 · Score: 4, Interesting

...is that there are a gazillion ways to do it, and every cluster vendor comes up with their own way, and there is no agreed-upon standard yet to easily deploy these things (AFAIK). Now the fact that there is no single vendor controlling how clustering works is a good thing, but without a good standard for what a clustered environment will offer the application developer, setting up clusters for different types of applications remains tedious.

Lars Marowsky-Brée had a paper in the proceedings of OLS 2002, entitled "The Open Clustering Framework", describing the problem and a suggested solution. I'm not sure how far standardized clustering has come since then. Does anyone have any insight on the matter?

--
"Backups are for wimps. Real men upload their data to an FTP site and have everyone else mirror it." -- Linus Torvalds
1. Re:The problem with clustering in Linux... by barks · 2004-10-31 01:28 · Score: 4, Interesting
  
  I have yet to meet anyone who's running a homebrew cluster and can tell me which distro they're running... as awesome as they appear to be.
  
  What I gathered from my introductory Operating Systems class was that this was the next frontier and an exciting market to keep an eye on... that, and that creating applications for these setups was not, as you said, standardized yet. Can Linux applications that normally run on a single-box setup migrate relatively easily to a cluster setup yet?
  
  --
  
  Some aim to please, I aim to tease.
2. Re:The problem with clustering in Linux... by jweage · 2004-10-31 02:10 · Score: 2, Interesting
  
  I've personally set up two clusters, both in the 40 CPU range. I've used various versions of Red Hat, and I'm now using White Box Enterprise Linux.
  
  I use a PXE boot system with Anaconda kickstarts to get the software installed. A post-install script then configures everything else on the machine. When it reboots, the machine appears in the cluster and is ready to use. I use the Torque batch scheduling system.
  
  You don't need the cluster toolkits to set up a cluster! DHCP, TFTP and a configured kickstart file work just fine with Red Hat.
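A rough sketch of the PXE half of such a setup, not jweage's actual configuration. It relies on two conventions: PXELINUX looks under pxelinux.cfg/ for a per-machine file named "01-" plus the MAC address in lowercase with dashes, and Anaconda releases of that era accept a ks= kernel argument pointing at the kickstart file. The MAC address, server URL, and kernel/initrd file names below are placeholders made up for illustration.

```python
def pxe_config_name(mac: str) -> str:
    """File name PXELINUX looks up under pxelinux.cfg/ for an Ethernet MAC."""
    # "01" is the ARP hardware type for Ethernet; octets are lowercase and dash-separated.
    return "01-" + mac.lower().replace(":", "-")


def pxe_config_body(kickstart_url: str) -> str:
    """Boot entry that starts the installer and hands it a kickstart file."""
    lines = [
        "default ks",
        "label ks",
        "  kernel vmlinuz",  # placeholder kernel name served over TFTP
        f"  append initrd=initrd.img ks={kickstart_url}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical node and install server, for illustration only.
    print(pxe_config_name("00:11:22:AA:BB:CC"))
    print(pxe_config_body("http://10.0.0.1/ks/node.cfg"))
```

Writing one such file per node into the TFTP root, plus a DHCP entry pointing each node at the TFTP server, is about all the "toolkit" this approach needs.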
3. Re:The problem with clustering in Linux... by mindstrm · 2004-10-31 02:13 · Score: 2, Interesting
  
  The answer is "it depends". First, there is no such thing as a generic "cluster". A cluster is just a bunch of machines cooperating to solve a problem (whether that problem is serving a website, computational physics, or the requirement for redundancy).
  
  For some types of applications, it's easy to visualize how to get a dozen or a hundred computers to help with the problem (serving static web pages). For others, it's not (databases).
4. Re:The problem with clustering in Linux... by Anonymous Coward · 2004-10-31 05:27 · Score: 1, Interesting
  
  It depends on the application.
  
  OpenMosix is a clustering technology that allows you to use regular apps and benefit from the cluster.
  
  It works by migrating processes from one computer to another. So it's like an SMP machine: the fastest any single thread can run is limited by the fastest CPU, but it allows you to do more at once.
  
  For example, say you are compiling the kernel on a 2-system cluster. If you set it up to use only one thread at a time, you get 100% of the original single-machine performance; if you set it up to compile 4 things at the same time, you can get 160%-180% of the performance of a single machine.
  
  All apps can benefit from it; any sort of heavy multitasking can benefit from it. No extra programming needed, just a custom patched kernel and some services.
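The arithmetic behind that kernel-compile example can be sketched as a toy model. It reads the 160%-180% figure as 1.6x to 1.8x of one machine, and it assumes a flat fraction of each busy node is lost to migration and network overhead. The function name and the overhead parameter are illustrative assumptions, not measured openMosix behaviour.

```python
def estimated_speedup(nodes: int, parallel_jobs: int, overhead: float = 0.15) -> float:
    """Toy estimate of throughput relative to a single machine.

    overhead is an assumed fraction lost to process migration and
    network I/O once work is spread over more than one node.
    """
    if nodes < 1 or parallel_jobs < 1:
        raise ValueError("need at least one node and one job")
    # A single process cannot be split, so idle nodes beyond the job count add nothing.
    busy_nodes = min(nodes, parallel_jobs)
    if busy_nodes == 1:
        return 1.0  # nothing migrates: no gain, no migration cost
    return busy_nodes * (1.0 - overhead)


if __name__ == "__main__":
    for jobs in (1, 4):
        print(jobs, "job(s) on 2 nodes:", round(estimated_speedup(2, jobs), 2))
```

With an assumed overhead between 10% and 20%, two nodes running four compile jobs land between 1.8x and 1.6x, while a single job stays at 1.0x no matter how many nodes are available.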
Logistics gone digital? by egnop · 2004-10-31 01:13 · Score: 2, Interesting

End users would often complain about the system's slow response time.
He says, "Because we couldn't print the forms for the warehouse people to select the products to be put on the truck, we'd have dozens of truck drivers sitting in the break room each day for more than 10 minutes."

I actually don't get it; most logistics operations have had wireless for about a decade now...
and the truck driver has no right to a break...
VMS clusters by Anonymous Coward · 2004-10-31 01:24 · Score: 4, Interesting

Want practice with decades-mature enterprise clusters? Why not get a few old VAX or Alpha systems on eBay, and/or fire up a few instances of the simh emulator, then join the free OpenVMS hobbyist program (I recommend the also-free-to-hobbyists Process Software's Multinet TCP/IP stack and server software).
And please, don't be put off by VMS because DCL (your first exposure to a VMS system) feels more awkward than bash (in many ways, it certainly is!). It's in the underlying architecture of the OS where the fruits of tight engineering are really demonstrated.
Re:This is the kind of book we need... by Anonymous Coward · 2004-10-31 02:08 · Score: 1, Interesting

> Umm dude.. Enterprise cluster != beowulf cluster

Oh for fuck's sake spare the geeky overliteral bullshit and grow a sense of humor and perspective. Thank fuck it's not GEEKS who are IT managers but true managers; heaven help us if someone ever asked a geek to look at the big picture in an organisation.

"I fail to see how examining a large painting would enhance our productivity".

Turn your coder brain off once in a while.
Beowulf Newbie Question by Phoenix666 · 2004-10-31 02:29 · Score: 2, Interesting

I read about setting up a cluster about six months back, and they said that you can only really run programs that are specifically designed to run on a Beowulf cluster. It seems like if you could set up a cluster and run any old app on it without special coding, then you'd have your massive adoption of Linux. Plug-n-play supercomputer, using the crappy old boxes gathering dust under the cubicles.

Are there any plans to take Beowulf in this direction? Is it already possible, and I was just reading the wrong FAQ?

--
Do what you can, with what you have, where you are.
Re:This is the kind of book we need... by ocelotbob · 2004-10-31 02:30 · Score: 2, Interesting

*Disclaimer: I am tired. It is 6:30 on a Sunday morning. I have done the one task I gave myself before I allowed myself to sleep, which was to make pawgloves for my Halloween costume. Thus, sanity is overrated right now.
Okay, the classic beowulf cluster is a 4x4 matrix of computers. Now, to have a beowulf of beowulfs, each of those computers on a cluster must be connected to its own 4x4 grid, so you now have a cluster of 256 computers, arranged somewhat suboptimally. Now, in order to communicate with these systems, you are going to need some library functions. Classic beowulfs work well with the industry standard PVM libraries. They can also use OpenMosix if the application is not natively cluster aware. As we are dealing with clusters of clusters, some applications may not function properly if they were designed to work on just a single cluster. So, most likely, we'll end up needing to use a variety of techniques to beowulf-square an application, such as combining PVM and OpenMosix.

--
Marxism is the opiate of dumbasses
Re:This is the kind of book we need... by perlchild · 2004-10-31 04:40 · Score: 2, Interesting

You bring an interesting point up, I wish each book on the topic of clusters mentioned which type(s) of clusters it dealt with...

Looking for a good book on High-Availability clusters would be so much simpler
apps not designed for cluster with lots of state? by mikefe · 2004-10-31 20:49 · Score: 2, Interesting

I have been looking at network filesystem level clustering and failover and NFS, SMB/CIFS and OpenAFS look like good choices for that. With NFS and CIFS you can have an active/inactive fail-over cluster.

I don't know about NFS, but in the case of CIFS, the protocol spec has provisions for renegotiating locks if a connection is broken, though I don't know if there are bugs in win2k/XP clients with Samba 3 servers. OpenAFS can have a sort of active/active setup, but the architecture is such that there is only one server that handles the writes and the rest are read-only. In all of these you can have a semi active/active failover cluster if you move half of the active volumes to the backup server, but this adds a lot to the complexity of your fail-over system.

Those services keep a low to moderate amount of state information on the server. In the case of a graphical (VNC) terminal server, I don't know of any open source projects that will allow a GNOME session to be on one server, have that server go down, and have another server take over its Ethernet MAC and IP address and continue processing where it left off on the backup server. The best I can think of is OpenMosix or maybe OpenSSI, which are two single system image type clustering systems. If anyone knows anything, please reply and let me know. Thanks.
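For reference, the easy half of such a fail-over (noticing that the primary is gone) can be sketched in a few lines. This is a minimal illustration with a made-up class name and timeout, not code from any of the projects mentioned, and it does nothing about the hard part asked about here: carrying a session's in-memory state over to the backup.

```python
from typing import Optional


class HeartbeatMonitor:
    """Backup-side check: treat the primary as dead after `timeout` seconds of silence."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen: Optional[float] = None

    def heartbeat(self, now: float) -> None:
        """Record that a heartbeat from the primary arrived at time `now`."""
        self.last_seen = now

    def primary_is_dead(self, now: float) -> bool:
        if self.last_seen is None:
            return False  # never heard from the primary; do not fail over blindly
        return now - self.last_seen > self.timeout


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout=3.0)  # hypothetical 3-second dead time
    monitor.heartbeat(now=100.0)
    for t in (101.0, 102.0, 104.0):
        print(t, "take over IP" if monitor.primary_is_dead(t) else "primary alive")
```

Timestamps are passed in explicitly so the logic can be exercised without real clocks or sockets. A real deployment would also need fencing so that a slow but living primary does not keep writing after the backup has taken over its address.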

--
There: Something at a specific location.
Their: Owned by someone.
Please make sure your english compiles.
Comments From the Author by KarlKopper · 2004-11-02 06:36 · Score: 2, Interesting

I'd like to jump in here and make a few comments.
First, about the book being a "definitive" guide. I cannot possibly claim to be an expert on every topic in the book--in fact, no one person can. The book is definitive, however, in that project leaders from each of the open source projects participated in editing and reviewing the material for the book.
It is an overbroad statement to say it is the definitive guide for building any and all types of Linux clusters. The book describes how to build a cluster that can be used to run mission critical applications to support an enterprise (it has little or nothing to do with working on the "Big Problem" as Pfister would call it).
(The book took four years to write by the way.)
I do hope it helps with the learning curve, but this is one of the advantages of building what I'm calling a Linux Enterprise Cluster--the system administrator can leverage his/her knowledge of Linux and add concepts that will allow them to build a cluster capable of supporting the enterprise.
I did not invent anything new for this book, and you CAN already find just about everything on-line that is in this book. I started work on the book in 2000 because, at the time, I wanted to have a guide book like this one that would hold my hand through the process of building a cluster that could support mission critical applications running GNU/Linux.
Finally, let me just agree with the comments about the number of nodes ("You don't need 20 nodes if 6 can do the job"). This book is not about building clusters for scientific applications where thousands of nodes and sophisticated batch job scheduling systems are required. How many nodes does it take to build the ideal cluster for your environment? I think that will depend on a lot of things including your budget, the impact of the failure of a single node in the cluster, how many instances of your application can run concurrently on a single node, performance bottlenecks from your node hardware, and so on. In my opinion, the ideal number of cluster nodes for an enterprise cluster--from the system administrator's standpoint--is about 10 (in a pinch you can log on to every node fairly quickly).
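A back-of-envelope sketch of that sizing trade-off, not taken from the book: it assumes the only inputs are the peak number of application instances, how many instances one node can run, and how many node failures the cluster should absorb. The function and parameter names are illustrative.

```python
import math


def nodes_needed(peak_instances: int, instances_per_node: int, spare_nodes: int = 1) -> int:
    """Nodes required to carry the peak load with `spare_nodes` nodes down."""
    if peak_instances < 0 or instances_per_node <= 0 or spare_nodes < 0:
        raise ValueError("counts must be non-negative and instances_per_node positive")
    # Round up: a partly used node is still a whole node to buy and administer.
    working_nodes = math.ceil(peak_instances / instances_per_node)
    return working_nodes + spare_nodes


if __name__ == "__main__":
    # Hypothetical load: 40 concurrent instances, 5 per node, survive one node failure.
    print(nodes_needed(40, 5))
```

Budget and hardware bottlenecks then decide whether it is cheaper to raise instances_per_node with bigger machines or to add more small ones.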
The cluster this book was based on has been in production long enough (over 18 months) to have undergone a complete hardware refresh, by the way; so the text is based on actual experience (not just theory) and, as I mentioned earlier, it has been reviewed by subject matter experts to ensure its technical accuracy.