Flattening Out The Linux Cluster Learning Curve
editingwhiz writes "IT Manager's Journal has a good look at a forthcoming book, The Linux Enterprise Cluster, that explains in clear, concise language how to build a Linux enterprise cluster using open source tools. Writer Elizabeth Ferranini interviews author Karl Kopper in a Q&A. Is this more complicated than it appears to be? (IT Manager's Journal is part of OSTG.)"
Steepening the curve?
Now it's not just geeks, but also IT Managers who can imagine a beowulf cluster!
I must have missed this, and for anyone else who didn't know: OSTG is the new name for the Open Source Development Network (OSDN), which Slashdot is a part of. They're now called the Open Source Technology Group.
The guy puts a single 10-node cluster together and this qualifies him to write the 'definitive guidebook called "The Linux Enterprise Cluster"'.
Don't think so.
Do not try to read the dupe; that's impossible. Instead, only try to realize the truth.
What truth?
There is no dupe
Installing and administering the various open source tools can be tedious work, especially without documentation of how to put things together.
A quick Google search, though, reveals a lot of free papers and manuals on this very topic.
...is that there are a gazillion ways to do it, every cluster vendor comes up with their own way, and there is no agreed-upon standard yet to easily deploy these things (AFAIK). Now, the fact that no single vendor controls how clustering works is a good thing, but without a good standard for what a clustered environment will offer the application developer, setting up clusters for different types of applications remains a tedious task.
Lars Marowsky-Brée had a paper in the proceedings of OLS 2002, entitled "The Open Clustering Framework", describing the problem and a suggested solution. I'm not sure how far standardized clustering has come since then. Does anyone have any insight on the matter?
"Backups are for wimps. Real men upload their data to an FTP site and have everyone else mirror it." -- Linus Torvalds
Publications like this play an important role in establishing best practices and community, two key enablers of standardization.
These in turn will lead to greater adoption, and more publications. A virtuous cycle.
And please, don't be put off by VMS because DCL - your first exposure to a VMS system - feels more awkward than bash (in many ways, it certainly is!). It's in the underlying architecture of the OS where the fruits of tight engineering are really demonstrated.
I will start by admitting that I am just a dumb university student talking out my ass. I have never set up an enterprise scale cluster.
However, last January we set up a small (six-node) cluster with the help of CLIC. Once we realized the link between Mandrake and consecutive dead CD drives, we installed the cluster in little time.
CLIC might focus a little too much on user-friendliness and a little too little on flexibility, but for our purposes it was great. It sports ganglia, gexec, distcc and MPI (and probably more), and administration and deployment of nodes is a breeze.
I heartily recommend CLIC for student/test/proof-of-concept projects.
Inevitably, high-performance clusters require software designed to run on high-performance clusters. It is better not to think of such a cluster as a single system, but rather as a network of individual machines with a tight network connection. Some of the clustering add-ons for Linux approach and even achieve certain aspects of a "Single System Image" type of configuration, but it's never completely like a single system.
Back in 1997 or so I tried to get as close as I could to a true Single System Image by building off of the Beowulf patchsets combined with patches for distributed SysV IPC/SHM and a globally shared root filesystem using CNFS (cluster-nfs, so that a few essential config files can have unique copies per cluster node). It was very daunting work to get those patches integrated together, and the end result was that without some kind of network interconnect as high-speed and low-latency as a processor's FSB, there was always going to be a big performance hit doing things this way. Of course, if an application happens to be perfect for simple HPC clusters (all CPU-intensive, very little I/O, and the work is easily divisible without tons of IPC between the workers), then it runs fantastically on such a Single System Image cluster, but then again it would have run fantastically on a simple cluster that doesn't look like a single system too. So what the Single System concept really bought me was a nice abstraction layer that made everything easy to deploy, configure and manage. But it came at a severe initial cost in human labour. It's not worth the trouble.
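To put rough numbers on why IPC-heavy jobs suffer, here is a toy model, not a measurement of any real cluster. It assumes the compute divides evenly across nodes and that every message costs one fixed network latency with no overlap; the function name and all the figures are made up for illustration.

```python
def speedup(nodes, compute_s, messages, latency_s):
    """Toy speedup estimate for a job spread over `nodes` machines.

    Assumptions (illustrative only): the compute work divides evenly,
    and each message adds one fixed latency that is not overlapped
    with computation.
    """
    if nodes == 1:
        # A single node does everything locally and pays no network cost.
        return 1.0
    parallel_s = compute_s / nodes + messages * latency_s
    return compute_s / parallel_s


# CPU-bound job with no IPC: scales with the node count.
quiet = speedup(nodes=8, compute_s=100.0, messages=0, latency_s=0.0001)

# Same compute, but a million messages at 100 microseconds each.
chatty = speedup(nodes=8, compute_s=100.0, messages=1_000_000, latency_s=0.0001)

print(f"no IPC:    {quiet:.2f}x")
print(f"heavy IPC: {chatty:.2f}x")
```

Under these assumptions the chatty job ends up slower on eight nodes than on one, which is the "big performance hit" in miniature.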
11*43+456^2
A flat learning curve is a bad thing.
The term "learning curve" was invented by the aerospace industry in the 1930s as a way to quantify improved efficiency from mass production (basically, the more you do a task, the easier it becomes). The term was later adopted by psychology and the social sciences, where most people first encounter it.
In both cases, the horizontal axis of a learning curve represents time or effort, and the vertical axis represents amount learned or productivity. Therefore something that is intuitively obvious in fact has a steep learning curve.
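A quick sketch of the point, assuming a simple saturating model of learning (proficiency = 1 - e^(-rate * hours)). The model and the two rates are my own illustrative assumptions, not anything from the aerospace or psychology literature.

```python
import math


def proficiency(hours, rate):
    """Fraction of a skill mastered after `hours` of effort.

    Assumed model for illustration: proficiency approaches 1.0
    exponentially, and a larger `rate` means the skill is easier.
    """
    return 1.0 - math.exp(-rate * hours)


# Hypothetical rates: an "intuitive" tool versus a "difficult" one.
EASY_RATE = 1.0
HARD_RATE = 0.2

for hours in range(0, 6):
    easy = proficiency(hours, EASY_RATE)
    hard = proficiency(hours, HARD_RATE)
    print(f"{hours} h  easy={easy:.2f}  hard={hard:.2f}")
```

With effort on the horizontal axis and amount learned on the vertical one, the easy tool's curve climbs much faster at the start, so it is the steeper of the two.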
"Learning curve" was a technical term with a specific definition for decades before it was ever a (misused) marketing buzzword.
Thank you for your time :)