Flattening Out The Linux Cluster Learning Curve
editingwhiz writes "IT Manager's Journal has a good look at a forthcoming book, The Linux Enterprise Cluster, that explains in clear, concise language how to build a Linux enterprise cluster using open source tools. Writer Elizabeth Ferranini interviews author Karl Kopper in a Q&A. Is this more complicated than it appears to be? (IT Manager's Journal is part of OSTG.)"
...is that there are a gazillion ways to do it, and every cluster vendor comes up with their own way, and there is no agreed-upon standard yet to easily deploy these things (AFAIK). Now the fact that there is no single vendor controlling how clustering works is a good thing, without a lack of a good standard as to what a clustered environment will offer to the application developer, the task of setting up clusters for different types of applications remains a tedious task.
Lars Marowsky-Brée had a paper in the proceedings of OLS 2002 describing the problem and a suggeted solution in his paper entitled "The Open Clustering Framework". I'm not sure how far standardized clustering has come since then. Anyone has any insight on the matter?
"Backups are for wimps. Real men upload their data to an FTP site and have everyone else mirror it." -- Linus Torvalds
End users would often complain about the system's slow response time.
He says, "Because we couldn't print the forms for the warehouse people to select the products to be put on the truck, we'd have dozens of truck drivers sitting in the break room each day for more than 10 minutes.
I actually don't get it, most logistics got wireless for about a decade now...
and the truck driver has no right for a break...
And please, don't be put off by VMS because DCL = your first exposure to a VMS system - feels more awkward than bash (in many ways, it certainly is!). It's in the underlying architecture of the OS where the fruits of tight engineering are really demonstrated.
> Umm dude.. Enterprise cluster != beowulf cluster
Oh for fuck's sake spare the geeky overliteral bullshit and grow a sense of humor and perspective. Thank fuck it's not GEEKS who are it manager's but true manager's, heaven help us if someone asked a geek to ever look at the big picture in an organisation.
"I fail to see how a examining a large painting would enhance our productivity".
Turn your coder brain off once in a while.
I read about setting up a cluster about six months' back, and they said that you can only really run programs that are specifically designed to run on a beowulf cluster. It seems like if you could set up a cluster and be able to run any old app on it without special coding, then you'd have your massive adoption of linux. Plug-n-play supercomputer, using the crappy old boxes gathering dust under the cubicles.
Is there any plans to take beowulf in this direction? Is it already possible, but I was just reading the wrong FAQ?
Do what you can, with what you have, where you are.
Okay, the classic beowulf cluster is a 4x4 matrix of computers. Now, to have a beowulf of beowulfs, each of those computers on a cluster must be connected to its own 4x4 grid, so you now have a cluster of 256 computers, arranged somewhat suboptimally. Now, in order to communicate with these systems, you are going to need some library functions. Classic beowulfs work well with the industry standard pvm libraries. They can also use openmosix if the application is not natively cluster aware. As we are dealing with clusters of clusters, some applications may not function properly if they were designed to work on just a single cluster. So, most likely, we'll end up needing to use a variety of techniques to beowulf squared an application, such as combining pvm and openmosix
Marxism is the opiate of dumbasses
You bring an interesting point up, I wish each book on the topic of clusters mentioned which type(s) of clusters it dealt with...
Looking for a good book on High-Availability clusters would be so much simpler
I have been looking at network filesystem level clustering and failover and NFS, SMB/CIFS and OpenAFS look like good choices for that. With NFS and CIFS you can have an active/inactive fail-over cluster.
I don't know about NFS, but in the case of CIFS, the protocol spec has provisions for renegotiating locks if a connection is broken, but I don't know if there are bugs in win2k/XP clients with samba 3 servers. OpenAFS can have a sort of active/active setup, but the archatecture is such that there is only one server that handles the writes and the rest are read-only. In all of these you can have a semi active/active failover cluster if you move half of the active volumes to the backup server, but this adds a lot to the complexity of your fail-over system.
Those services have a low to moderate amount of state information kept on the server. In the case of a graphical (VNC) terminal server, I don't know of any open source projects that will allow gnome session to be on one server, have that server go down, another server take over its ethernet MAC and IP address and continue processing where it left off on the backup server. The best I can think of is OpenMosix or maybe OpenSSI which are two single system image type clustering systems. If anyone knows anything, please reply and let me know thanks.
There: Something at a specific location.
Their: Owned by someone.
Please make sure your english compiles.
First, about the book being a "definitive" guide. I cannot possibly claim to be an expert on every topic in the book--in fact, no one person can. The book is definitive, however, in that project leaders from each of the open source projects participated in editing and reviewing the material for the book.
It is an over broad statement to say it is the definitive guide for building any and all types of Linux Clusters. The book describes how to build a cluster that can be used to run mission critical applications to support an enterprise (it has little or nothing to do with working on the "Big Problem" as Pfister would call it).
(The book took four years to write by the way.)
I do hope it helps with the learning curve, but this is one of the advantages of building what I'm calling a Linux Enterprise Cluster--the system administrator can leverage his/her knowledge of Linux and add concepts that will allow them to build a cluster capable of supporting the enterprise.
I did not invent anything new for this book, and you CAN already find just about everything on-line that is in this book. I started work on the book in 2000 because, at the time, I wanted to have a guide book like this one that would hold my hand through the process of building a cluster that could support mission critical applications running GNU/Linux.
Finally, let me just agree with the comments about the number of nodes ("You don't need 20 nodes if 6 can do the job"). This book is not about building clusters for scientific applications where thousands of nodes and sophisticated batch job scheduling systems are required. How many nodes does it take to build the ideal cluster for your environment? I think that will depend on a lot of things including your budget, the impact of the failure of a single node in the cluster, how many instances of your application can run concurrently on a single node, performance bottlenecks from your node hardware, and so on. In my opinion, the ideal number of cluster nodes for an enterprise cluster--from the system administrator's standpoint--is about 10 (in a pinch you can log on to every node fairly quickly).
The cluster this book was based on has been in production long enough (over 18 months) to have undergone a complete hardware refresh by the way; so the text is based on actual experience (not just theory) and, as I mentioned earlier, it has been reviewed by subject matter experts to insure its technical accuracy.