Macintosh Clustering
HiredMan writes: "Wired is running an article comparing the setup and admin of Linux Beowulf clusters versus Mac-based clusters. Slant of the article is that the Macs are easier to set up and maintain, and are more flexible. They note that the Linux "how to" manual is 230 pages while the corresponding Apple document is a 1-page PDF file. Dauger Research of former Appleseed fame is mentioned as well, of course. MacSlash is also covering the article. Let the on-topic (for once) Beowulf comments fly..."
Having used the old NeXTSTEP APIs (which I believe have been ported to OS X under the guise of Cocoa), I can say that they are well suited for cluster computing.
I remember Richard Crandall and the Mathematica guy (Wolfram) using Zilla (an old NeXT distributed computing program) to crack the world's largest prime in the mid-nineties...
Anyone know if Zilla is back on OS X?
Also, the Gigabit Ethernet on the motherboard and the large 2MB cache on the PowerPC chips will go a long way toward making these machines a good cluster.
It's been a while since I've done distributed computing (hey, I am out of academia) but OS X will hopefully make the whole shebang easier...
-- Initial build costs are much lower (dual Athlon 2000+ right now without graphics hardware is way cheaper than a dual G4 1GHz).
True.
-- Maintenance costs are much, much lower. Anything goes wrong with a PC node, just swap out that part with another commodity part. Mac repair or parts replacement costs will eat you, especially if you start to have many, many nodes.
Wrong. Commodity parts such as memory and hard drives are exactly the same on the Mac. I have bought memory and hard drives at Sam's club, and they work just fine in my Mac.
Plus you can modify bits of Linux if you need to optimize the behavior of your cluster for the sort of computing you do, which you can't do with Mac OS.
Wrong again. At the level of the OS where you might need to have some custom tweaks (the kernel), you can customize OS X to your heart's content. See Darwin.
Now this article may have been talking about OS 9 clusters, but there is nothing preventing anyone from using OS X.
You think the P4 price/performance is bad, G4's are insane
USC Macintosh Cluster Running the AltiVec Fractal Benchmark achieves over 1/5 TeraFlop on 152 G4's and demonstrates excellent scalability.
KLAT2's complete results are: Rmax=64.459 GFLOPS with 64 Athlon 700MHz with 128MB PC100 CAS2 SDRAM
So a 1 TFLOP Apple machine would cost about $440,000 in hardware for 152 G4 1000 MHz, versus about $160,000 for 270 T-bird 1400 MHz.
The difference, $280,000, could certainly hire someone literate enough to read the long Linux manual.
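The arithmetic behind that comparison can be sketched as follows. This is a minimal sketch that uses only the dollar figures and node counts quoted above; the variable and function names are illustrative. Note that the two clusters were measured with different benchmarks (the AltiVec Fractal Benchmark versus KLAT2's Rmax), so the sketch compares hardware cost only and makes no claim about relative FLOPS.

```python
# Figures as quoted in the comment above; none are independently verified.
G4_NODES = 152
G4_TOTAL_COST = 440_000      # USD, poster's estimate
TBIRD_NODES = 270
TBIRD_TOTAL_COST = 160_000   # USD, poster's estimate


def cost_per_node(total_cost, nodes):
    """Average hardware cost of one node in USD."""
    return total_cost / nodes


g4_per_node = cost_per_node(G4_TOTAL_COST, G4_NODES)
tbird_per_node = cost_per_node(TBIRD_TOTAL_COST, TBIRD_NODES)
difference = G4_TOTAL_COST - TBIRD_TOTAL_COST

print(f"G4: ${g4_per_node:,.0f} per node")
print(f"Thunderbird: ${tbird_per_node:,.0f} per node")
print(f"Difference in total cost: ${difference:,}")
```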
If voting were effective, it would be illegal by now.
... for scientists like myself, this is a very nice thing. Not all of us in the sciences are tech-savvy... I'm probably the one in my 5-person research group who understands the most about *nix. For those of you who don't realize this, many research scientists have to work hard to get their grants and outside money.
So, what does all this mean to us? As an atmospheric scientist, having some serious number crunching power is mighty helpful. Weather modeling is quite the processor-intensive task, and then interpreting the results can take years after all the computing is done, including further computations and visualization routines. In short, we can easily tax our computers.
So, now you know that we need computing power, but money is at a premium for us in many cases, so why shouldn't we just get some cheap Intel boxes and *nix cluster them? Well, we could, but then we'd need to hire a systems admin: someone who is tech-savvy enough to keep everything running decently well for us. That requires another person who REALLY understands what's going on in many cases, which is another salary on the payroll. For us, it all ends up balancing in the end. The $5-10K that we save in clustering our 8 Intel boxes over the Macs is eaten up in one year or less by the guy (or woman) who has to set up the whole thing. So, for us, the ease of setup and use is something that can translate into some good savings, and we don't have to worry as much about having to rely on another person to save us if something goes wrong. That's the benefit of simplicity for us.
I agree that it is important to know, as one person said, "The nature of the beast", but that's something that takes time to do, and when you're not being paid to learn about how to cluster computers, but to figure out how the atmosphere works, then things like "The nature of the beast" are just further complications. I would rather have something that I can slap together, know that it works, and get back to my work, without the interference of others if I don't need it.
And that brings me to another rebuttal, about someone mentioning that if you buy the Macs, you're also going to pay for all the extra Superdrives and video cards and all that. I say to that, "Good." That way, if the cluster doesn't need to be used, then I don't have a bunch of mostly useless boxes sitting around... or if a collaborator comes around and needs a computer, I can just remove one of the computers from the cluster and let them use that for as long as they need. The point is that there are advantages and disadvantages to each setup. Now you've heard some advantages and why the scientific community might care about this. Remember, not everyone here can compile their own kernels and not everyone cares about being able to do that. Some of us, thank the deity of your choice, actually want to do something with this power and not care how it works in depth. To each their own.
-Jellisky
Obviously, you know very little about the Macintosh. You should learn a bit more before you go spouting off flames.
The software used to accomplish the clustering for AppleSeeds is Mac MPI, which is based upon the *standard* for parallel computing, MPI. The reason that the PDF doesn't talk about programming MPI is that there is no need for redundant documentation. Go find a book on MPI if you want to learn to program to that API.
And yes, I will get quite far telling you it's easier to upgrade Mac OS X to its latest version. Thanks to Apple's Software Update control panel, this can all take place automatically according to any schedule you desire. Two clicks of a mouse is all it takes to set this up, as opposed to spending quite a lot of time figuring out how to use the incredibly arcane "apt". In fact, AFAIR, Software Update is now set to operate automatically by default.
Gee, I didn't realize that particle physics simulations involving millions of particles weren't a *real* application...
The fact that your comment has been moderated up to four (so far) is simply an empirical demonstration of the lack of knowledge of most Slashdot readers.
If it can be optimized for AltiVec, almost nothing will be faster than a G4.
Just take a look at these RC5 stats (mid-way down the page). G4s smoke everything, because the RC5 client is optimized for AltiVec, thus it can compute four keys in a single clock cycle. By comparison, Athlons do one key per clock cycle, and Pentium 4s do one key every four clock cycles.
So if you've got an operation that can benefit from the G4's SIMD capabilities, Macs are your best bet.
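To see what those per-cycle figures would imply, here is a rough sketch that multiplies clock rate by keys per cycle. The keys-per-cycle values are the claims made in the comment above; the clock speeds are hypothetical round numbers picked for illustration, not benchmark results, and the names are made up for this example.

```python
from fractions import Fraction

# Keys tested per clock cycle, as claimed in the comment above.
KEYS_PER_CYCLE = {
    "G4": Fraction(4),
    "Athlon": Fraction(1),
    "Pentium 4": Fraction(1, 4),
}

# Hypothetical clock speeds in MHz, for illustration only.
CLOCK_MHZ = {"G4": 1000, "Athlon": 1400, "Pentium 4": 2000}


def mkeys_per_second(cpu):
    """Millions of keys per second = clock in MHz * keys per cycle."""
    return CLOCK_MHZ[cpu] * KEYS_PER_CYCLE[cpu]


for cpu in KEYS_PER_CYCLE:
    print(f"{cpu}: {float(mkeys_per_second(cpu)):.0f} Mkeys/s")
```

Under these assumed clocks, the per-cycle advantage outweighs the G4's lower clock speed; with other clock speeds or a workload that does not vectorize, the ranking could differ.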
Free Hans!
The fact that a manual is shorter doesn't mean that it is a better or easier to install program.
I would agree that comparing manual length is not a reliable way to judge the relative complexity of two programs. The one-page doc is even a "quick start guide," not a complete manual. But I still suspect that the writer is correct that Appleseed clusters are easier to set up and maintain than a Beowulf cluster. Reading over the directions myself, it looked pretty brain-dead simple. Most of that one page didn't even have much to do with the actual installation of the program, but with such complicated tasks as connecting your Mac to an ethernet hub ("For each Mac, plug one end of a cable to the Ethernet jack on the Mac and the other end to a port on the (ethernet) switch.") and noting a few system requirements (CarbonLib 1.2 or OS X 10.1). The installation instructions consist of "Double-click the Pooch Installer and select a drive for installation." Instructions on how to use it consist of dragging and dropping the program you want to run in parallel onto the Pooch app and "click Select Nodes..., select the computers you want to run it on, and, in the Job Window, click on Launch Job."
Besides, if you are going to have a cluster, you want cheap, off the shelf machines such as PCs with plenty of spare parts that can be customised to suit your needs : why pay for a good 3d graphics card in every pc if you are going to do number crunching !
This is only the case if the individual PCs are dedicated nodes and not being used for anything else. Most Appleseed clusters are made up of computers that are primarily used for something else: school Mac computer lab by day, clustered "supercomputer" by night. The cluster that did 233 gigaflops (76 dual G4s, mostly 533s with a few 450s) was simply all of the Macs at UMC working as a cluster over Christmas break. This is where the easy setup and maintenance, and the ability to cobble together computers with different processors and even different OSes (some nodes may be running Mac OS 9 and some OS X), is an advantage. The Appleseed clusters that are made up of dedicated machines are probably discarded computers they already had kicking around, so cost is not an issue there either.
Just have all of your OS X clients boot off of a disk image on a Mac OS X Server machine.
http://www.apple.com/education/k12/networking/differ/index.html#macmanager
This reminds me of an old Mac story.
:^)
The situation was that Guy Kawasaki (an Apple "evangelist" at the time) challenged some PC folks to a "bake off," to determine which system made some tasks easier.
When the day came, Kawasaki sent out a 10-year-old to go head-to-head with the PC geek.
The full details of the story are at http://www.halcyon.com/kegill/mac/win95/faceoff.html
Maybe we should have a new challenge where a Linux geek and a 10-year-old compete to see who can set up a compute cluster the fastest.
Computers are useless. They can only give you answers. -- Pablo Picasso
It's not your fault, because you probably didn't know this, but the USC Mac cluster didn't cost anything near $440,000, and it didn't have any 1000 MHz G4s in it.
At the "Macs in Science and Engineering" user conference at Macworld, they gave the general specs of this cluster: all of the machines were dual-processor, but of different hardware generations. Although the fastest machines were dual 800 MHz on a 133 MHz bus, the majority were slower dual 450 and 500 MHz machines with 100 MHz buses.
Given that all were dual, and ignoring depreciation on the older hardware, the cost would be at most $220,000. If you were using dual 1 GHz G4s, it would still be only $220,000. My notes are on my laptop, but I believe that the actual cost of the USC cluster was less than $200,000.
Also, I assume that you think that the 270 uni-processor T-birds will scale performance linearly as well. I doubt it would only cost ~$600 per node as you would have to use Myrinet or some other fast fabric, and with three and a half times as many nodes, the latencies, hardware, and administration cost would be crippling. I have the same cost argument if you use dual Athlons, as the boards are quite rare, and the node count is almost double the Mac node count.
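The node counts and the halved cost in this reply are easy to check. A minimal sketch, using only numbers from the thread: it assumes, as the reply does, that the 152 G4 processors sit in dual-processor boxes and that the upthread $440,000 estimate priced one box per processor. The names are illustrative.

```python
MAC_CPUS = 152              # G4 processors in the USC cluster
ATHLON_CPUS = 270           # uniprocessor Thunderbirds proposed upthread
CPUS_PER_DUAL_BOX = 2
UPTHREAD_ESTIMATE = 440_000  # USD, priced as if each CPU were its own box

mac_nodes = MAC_CPUS // CPUS_PER_DUAL_BOX             # dual G4 boxes
athlon_uni_nodes = ATHLON_CPUS                        # one CPU per box
athlon_dual_nodes = ATHLON_CPUS // CPUS_PER_DUAL_BOX  # dual Athlon boxes

uni_ratio = athlon_uni_nodes / mac_nodes
dual_ratio = athlon_dual_nodes / mac_nodes

# Same price per box, but only half as many boxes to buy.
dual_mac_cost = UPTHREAD_ESTIMATE * mac_nodes / MAC_CPUS

print(f"Mac nodes: {mac_nodes}")
print(f"Uniprocessor Athlon nodes: {athlon_uni_nodes} ({uni_ratio:.2f}x)")
print(f"Dual Athlon nodes: {athlon_dual_nodes} ({dual_ratio:.2f}x)")
print(f"Cost with dual Macs: ${dual_mac_cost:,.0f}")
```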
Your price/performance assertions don't stand up!
-- Len
Agreed; however, if you'd ever actually tried to use the product you'd realise that this is not the case. Let me walk you through exactly how simple it is in just 10 simple steps:
- Grab a bunch of Macs, a switch and a monitor.
- Plug Macs into the power.
- Plug a keyboard and the monitor into the first mac and turn it on.
- Configure the network through the easy to use Networking Control panel. Or alternatively don't configure it and throw a DHCP server into the mix somewhere.
- Install and run pooch (drag and drop from the disk image it comes on then double click).
- Repeat for each Mac.
- On the last Mac, pick an application you want to run on the cluster, drag and drop it into pooch.
- Select which Macs you'd like to help out with running this program.
- Click start.
- There is no step 10.....
Voila! The best bit about this is that I've never even read the Pooch manual, yet I've still managed to set up my own Mac Beowulf cluster. I've looked into Linux Beowulf clustering a number of times and gotten hopelessly lost and confused despite having respectable Linux knowledge. If you've ever set up a Mac Beowulf cluster, you'll very quickly realise that there is no comparison in ease of use, and anyone who argues otherwise is clearly uninformed.
Like always, don't bash what you haven't tried...
These Xeons feature 512K of L2 Cache. Sure there are Xeons with HUGE amounts of L2 cache, but then we are hitting the $10000 price range. These are workstation machines, not server machines.
I can't compare the Apples to the P4s... P4s don't go dual processor, so the PPC G4 wins here. I can't get a dual-proc P4.
Athlon? None of the vendors I checked have Athlon workstations, so they weren't in consideration.
However, after realizing the lack of Athlons, I remembered that Penguin Computing has a line of Athlon based workstations.
I went to their website, and priced out an Athlon MP system, the Tempest 210MP Workstation.
With 2 Athlon MP 1900+, not really competitive with the new 1 GHz G4s, but close enough for our comparison (and matching your assertion that they are in the same league as them). With 512MB PC2100 RAM, and upgraded to the Gigabit Ethernet card (they have one, might as well try to be fair), my workstation price is $2707.
Congratulations, we have a winner. An Athlon MP 1900+ (running at 1.53 GHz if I recall?) with similar specs to the Apple workstation comes in $300 cheaper. The Apple has some advantages: the better video card and Superdrive are nice features when the machine is recycled as a desktop machine, but for now they are superfluous.
What is the point of my work?
You're all full of shit. Apple's computers are extremely price competitive. They are cheaper than Xeons from the real vendors with similar specs (Xeons had faster RAM, equal L2 cache, no L3 cache, and no gigabit ethernet).
Apple puts out a really competitively priced Unix workstation compared to Linux workstations from major vendors.
Apple puts out really competitively priced consumer machines (iMac/iBook) compared to Wintel machines from major vendors.
You can choose to use an Apple solution or not, but stop spreading the bullshit about Apple being more expensive.