Slashdot Mirror


Macintosh Clustering

HiredMan writes: "Wired is running an article comparing the set-up and admin of Linux Beowulf clusters versus Mac based clusters. Slant of the article is that the Macs are easier to set-up, maintain and are more flexible. They note that the Linux "how to" manual is 230 pages while the corresponding Apple document is a 1 page PDF file. Dauger Research of former Appleseed fame is mentioned as well, of course. MacSlash is also covering the article. Let the on-topic (for once) Beowulf comments fly..."

21 of 612 comments

  1. Oh my God by blowhole · · Score: 5, Funny

    Finally we may rejoice! For once Apple has surpassed the user-friendliness of Linux! Let the merriment begin!

    --
    "Ask me about Loom"
  2. Recent MacSlash Thread by pcolley · · Score: 5, Informative
    MacSlash recently had a thread on a Mac G4 cluster.
    "'Macintosh' and 'Cluster' aren't two words you see together very often. Some enterprising folks at USC have created a cluster of 76(!) dual-processing G4s (56 DP G4/533 + 20 DP G4/450). You can check the info here. Glad to see parallel computing isn't just for the *nix crowd (well, they are running OSX, so technically...). I wonder if they just had 76 G4s lying around, or else there must be some very upset department secretaries."
  3. Re:Manual length and Macs vs. PC by vought · · Score: 5, Insightful

    I don't think this comment is insightful at all, but hey - I don't have moderator points today, so I'll argue.

    The fact that the manual is shorter - VASTLY shorter in this case - does in fact imply that accomplishing a task is easier.

    Here's the skinny: Human factors. A one-page PDF is easier to scan and reference than a 200-page text file without references or pointers. If references and pointers are added along with a TOC, then scanning for specific instructions becomes much easier.

    Comparing a 200-page document written by programmers to a one-page document made possible by a more graceful GUI and architecture, and written by professional tech writers, is ludicrous. Fewer instructions to accomplish the same task = easier. Plain and simple.

  4. Re:Manual length and Macs vs. PC by Triv · · Score: 5, Insightful

    Yes, but according to the article they got a sixth-grader in Hawaii to set one up. Doesn't that say something about ease of use? --Triv

  5. maya, photoshop, etc. on a cluster? by goto11 · · Score: 5, Interesting

    Wouldn't it be great if "plug and play" clustering became a reality? Say your office mates are out to lunch, or there's no one scheduled to use the school computer lab for the next hour and you want to render the effects for your three-hour iMovie, or you want to perform batch despeckle on a few hundred images in Photoshop...
    Nothing against Linux (I use it myself for a router), but a three-day setup for Beowulf clustering isn't a great deterrent if your calculations will be going for a month or two.
    The type of clustering we're talking about here is something that could potentially appeal to the average SOHO or school, where they have five to 500 general-use Macs that have processor cycles to spare.
    My question is this:
    What would it involve to make Mac OS X and every program that runs natively on it to be able to take advantage of clustering right out of the box? If they can natively use multiprocessing, how much of a leap is it to patch the OS to natively support clustering?
    Not only would this be great for techies, but it seems that this would be a great incentive to volume sales from Apple, where they now generally only get one or two Macs per site and the rest are Wintel workstations.

    --
    Why don't you just make 10 louder and make 10 be the top number...and make that a little louder?
    1. Re:maya, photoshop, etc. on a cluster? by keytoe · · Score: 5, Insightful
      What would it involve to make Mac OS X and every program that runs natively on it to be able to take advantage of clustering right out of the box?

      Actually, native Cocoa apps have the capability to be built using distributed objects quite easily. In fact, the mechanisms used for multithreaded communication (NSConnection, NSPort, etc) are the same classes you use to communicate with other processes - on ANY machine.

      The mechanism they use relies heavily on the dynamic nature of Objective-C objects, so I'm guessing it's NOT based on some standard (SOAP, CORBA, RPC, .NET). That would make it hard to integrate it into any cross platform clusters, but we were talking about Photoshop, weren't we?

      So, it boils down to simply this: If you write a Mac OS X app, write it threaded and use Cocoa. If you do that, you'd be amazed what sort of functionality you get for 'free' - including being able to distribute your app across clusters!

      Down with Carbon!

  6. About the same... by alexhmit01 · · Score: 5, Insightful

    Cost of 10 good Intel machines to install Linux on... trivial (probably about $15,000)...

    Cost of 10 good high-end Macs (about $30,000)...

    Both are in the trivial range compared to the costs of time, energy, etc.

    There is a more important question: which machine gives you the most bang for your buck?

    We know that Photoshop runs better on the G4, what about your operation?

    If the Mac gets a 2:1 performance advantage, then the costs are equal. If the Mac out-performs it regardless, you get an advantage.

    For the moment, let's assume that you are getting real machines that are tested, not parts off of a sketchy vendor from pricewatch.com. If you are really trying to build a parallel computer, you want real systems, not junk that may or may not work.

    This also rules out eMachines, or home computers. You are basically in the Compaq Workstation, Dell Workstation, HP Workstation, or IBM Workstation area. You aren't setting up a bunch of Presarios.
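The break-even claim above can be checked with a few lines of arithmetic. This is only a sketch using the round prices quoted in the comment ($15,000 and $30,000 for ten machines); the speedup values in the loop are hypothetical, and the function name is made up for illustration.

```python
# Figures from the comment above: rough prices for ten machines of each kind.
intel_cluster_cost = 15_000  # dollars, ten "good" Intel boxes
mac_cluster_cost = 30_000    # dollars, ten high-end Macs


def cost_per_unit_of_work(cluster_cost, relative_speed):
    """Dollars paid per unit of throughput, taking the Intel cluster as speed 1.0."""
    return cluster_cost / relative_speed


# Speed advantage the Macs need before the two clusters cost the same per unit of work.
break_even_speedup = mac_cluster_cost / intel_cluster_cost

for speedup in (1.0, 1.5, 2.0, 3.0):  # hypothetical Mac speedups, not measurements
    mac = cost_per_unit_of_work(mac_cluster_cost, speedup)
    intel = cost_per_unit_of_work(intel_cluster_cost, 1.0)
    print(f"Mac {speedup}x faster: Mac ${mac:,.0f} vs Intel ${intel:,.0f} per unit of work")
```

Anything above the break-even speedup favors the Macs on hardware cost alone; anything below it favors the Intel boxes, before counting setup and admin time.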

    1. Re:About the same... by Apotsy · · Score: 5, Informative
      "We know that Photoshop runs better on the G4, what about your operation?"

      If it can be optimized for AltiVec, almost nothing will be faster than a G4.

      Just take a look at these RC5 stats (mid-way down the page). G4s smoke everything, because the RC5 client is optimized for AltiVec, thus it can compute four keys in a single clock cycle. By comparison, Athlons do one key per clock cycle, and Pentium 4s do one key every four clock cycles.

      So if you've got an operation that can benefit from the G4's SIMD capabilities, Macs are your best bet.
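To see what those per-clock figures would mean at a given clock speed, here is a rough sketch that takes the comment's numbers at face value (four keys per clock for the G4, one for the Athlon, one every four clocks for the Pentium 4). The clock speed passed in is an assumption for illustration, not a benchmark, and the function names are hypothetical.

```python
from fractions import Fraction

# Keys per clock cycle, exactly as stated in the comment above (not measured here).
KEYS_PER_CLOCK = {
    "G4 (AltiVec)": Fraction(4),
    "Athlon": Fraction(1),
    "Pentium 4": Fraction(1, 4),
}


def keys_per_second(chip, clock_hz):
    """Key rate implied by the per-clock figure at a given clock speed."""
    return KEYS_PER_CLOCK[chip] * clock_hz


def clock_needed_to_match(chip, target_chip, target_clock_hz):
    """Clock speed `chip` would need to equal `target_chip` running at `target_clock_hz`."""
    return keys_per_second(target_chip, target_clock_hz) / KEYS_PER_CLOCK[chip]


one_ghz = 10**9  # assumed clock for the G4 in this illustration
for chip in ("Athlon", "Pentium 4"):
    needed = clock_needed_to_match(chip, "G4 (AltiVec)", one_ghz)
    print(f"{chip} would need {float(needed) / 1e9:.0f} GHz to match a 1 GHz G4")
```

If the per-clock figures hold, clock speed alone cannot close the gap for this workload, which is the point about SIMD-friendly code.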

  7. Re:Manual length and Macs vs. PC by ryanvm · · Score: 5, Funny

    First things first: I'm a big Mac fan [...]

    Mmmmmmm, Big Macs....

  8. How to set up a Mac cluster by webslacker · · Score: 5, Funny

    Step One: Plug them in.

    Step Two: Turn them on.

    Step Three.... there's no Step Three! There's no Step three...

  9. Re:Easier vs. cheaper... by scorpioX · · Score: 5, Informative

    -- Initial build costs are much lower (dual Athlon 2000+ right now without graphics hardware is way cheaper than a dual G4 1GHz).

    True.

    -- Maintenance costs are much, much lower. Anything goes wrong with a PC node, just swap out that part with another commodity part. Mac repair or parts replacement costs will eat you, especially if you start to have many, many nodes.

    Wrong. Commodity parts such as memory and hard drives are exactly the same on the Mac. I have bought memory and hard drives at Sam's club, and they work just fine in my Mac.

    Plus you can modify bits of Linux if you need to optimize the behavior of your cluster for the sort of computing you do, which you can't do with Mac OS.

    Wrong again. At the level of the OS where you might need to have some custom tweaks (the kernel) you can customize OS X to your heart's content. See Darwin.

    Now this article may have been talking about OS 9 clusters, but there is nothing preventing anyone from using OS X.

  10. Re:Manual length and Macs vs. PC by jguthrie · · Score: 5, Insightful
    But if the one-page document is a "Quick Start" guide (and the document is entitled "Pooch Quick Start") and the 230 page book is a detailed technical reference discussing all of the important aspects of designing, building, using, administering, and programming a cluster, as appears to be the case in this instance, then the relative sizes of the documents say absolutely nothing about any human factors.

    In fact, my first inclination is to try to use the Beowulf stuff rather than Pooch simply because such a detailed work exists and is available for Beowulf clusters, but I don't know if any such information exists for Pooch.

  11. Re:Easier vs. cheaper... by SirSlud · · Score: 5, Insightful

    What a surprise that we're in a market based economy.

    The market always wins. The social costs (ease of maintenance, accessibility, at the (granted) cost of performance) are almost always ignored when people vote with individual wallets.

    Natch:

    > Anything goes wrong with a PC node

    That's because stuff goes wrong far more often in a PC environment. I say this with 10 years of computing experience on both platforms. YMMV, and I'm sure I'll collect anywhere from 2 to 200 replies either quoting amazing PC/Linux uptimes or terrible Mac related experiences, but I've worked, at length and in technical situations, with MacOS, Windows, Linux, FreeBSD, HPUX, AIX, Solaris ... and Macs are by far the most reliable platforms in terms of hardware failure or incompatibilities that arise from drivers, etc. (Note: I am excluding all Powerbooks. I'm well aware of the 5300 being the exact opposite of what I'm saying .. those things were more trouble than ANY platform I've ever worked on.)

    > Plus you can modify bits of Linux

    OSX, the kernel is Open Source, so you are free to munge around with it, although I haven't gotten a chance to look deep into it, so I'm not sure of the extent of the validity of this.

    OS9, removing kernel modules from the OS is a simple point and click, although I think there is obviously more code in the base system than on a bare bones Linux system. Again, trade offs are unavoidable.

    It is only because Apple sells their OS as 'easy to use' that people assume this is equivalent to 'non customizable'. Any dedicated mac techie knows that while MacOS ain't as granular as Linux in its customizability, the performance loss in putting your CPU against superfluous tasks pays back in the other advantages of the platform.

    Note that I'm not arguing that MacOS is better to cluster than Linux .. I'm only trying to debunk some of the most commonly lobbed FUD against the Mac platform, especially as it relates to its (supposed) unsuitability to non-multimedia related tasks. :)

    What I love the most is how people expect computers to be cars. I.e., if it's more expensive, it had better be faster. Man, I'll take a slower and more enjoyable and pain-free computing experience any day of the week, which is why my dream setup would be OSX by default, then Linux or some BSD variant (I'm a programmer on FreeBSD), and then Windows. This holds true even in computationally-intensive tasks. If I can't enjoy the experience of doing it, I don't want to do it, even if it can be done faster or cheaper. My happiness and level of stress is more important than speed.

    --
    "Old man yells at systemd"
  12. Meanwhile... by Misch · · Score: 5, Funny

    They note that the Linux "how to" manual is 230 pages while the corresponding Apple document is a 1 page PDF file.

    Meanwhile, documenters have been developing a "What to do with a linux beowulf cluster" list. That document has grown to 230 pages. The corresponding mac list has come up with one idea (And it fits on a 1 page PDF file): "Create a system that allows us to use Photoshop to edit super-high resolution pictures of Natalie Portman eating hot grits."

    (j/k!, and, btw, I'm using a Mac right now. :-)

    --

    --You will rephrase your request for me to go to hell. Goto statements are not acceptable programming constructs
  13. Everybody's missing the point by rho · · Score: 5, Insightful

    The point isn't flexibility: sure you can be more flexible with a Linux-based cluster. You can tweak and tune a Linux-based cluster to meet your specific needs. This is why Google uses such a cluster.

    The point isn't about cost: the real difference between a decent name-brand PC and a Mac is negligible. In the case of these Mac-based clusters, since the clustering software is just another app, a Mac cluster can be set up and torn down quite readily. You come into the lab on Wednesday to find your workstation has been appropriated for the cluster.

    The point is accessibility! If you're a physicist in a small school looking to model some complex interaction, you can rent some computer time from somebody (expensive), build a cluster (very expensive, because you'll have to hire somebody to do it--physicists aren't likely to be Beowulf experts), or use the Mac clustering software (expensive, because you'll have to buy the machines if you don't already have them, but you can do it yourself, quickly, without much bother).

    Accessibility! It's what keeps Apple in business. This is another example of it.

    I'm pretty disappointed in the posters who knock it, because it strikes me that they are a bit put out that they won't remain the Technical Elite because they've got the spare time to read the 230-page Beowulf manual.

    --
    Potato chips are a by-yourself food.
  14. The real point here is... by jellisky · · Score: 5, Informative

    ... for scientists like myself, this is a very nice thing. Not all of us in the sciences are tech-savvy... I'm probably the one in my 5-person research group who understands the most about *nix. For those of you who don't realize this, many research scientists have to work hard to get their grants and outside money.
    So, what does all this mean to us? As an atmospheric scientist, having some serious number crunching power is mighty helpful. Weather modeling is quite the processor intensive task, and then interpreting the results can take years after all the computing is done, including further computations and visualization routines. To put it shortly, we can easily tax our computers.
    So, now you know that we need computing power, but money is a premium for us in many cases, so why shouldn't we just get some cheap Intel boxes and *nix cluster them? Well, we could, but then we'd need to hire a systems admin. Someone who is tech-savvy enough to keep everything running decently well for us. That requires another person who REALLY understands what's going on in many cases, which is another salary on the payroll. For us, it all ends up balancing in the end. The $5-10K that we save in clustering our 8 Intel boxes over the Macs is eaten up in one year or less by the guy (or woman) who has to set up the whole thing. So, for us, the ease of setup and use is something that can translate into some good savings and we don't have to worry as much about having to rely on another person to save us if something goes wrong. That's the benefit of simplicity for us.
    I agree that it is important to know, as one person said, "The nature of the beast", but that's something that takes time to do, and when you're not being paid to learn about how to cluster computers, but to figure out how the atmosphere works, then things like "The nature of the beast" are just further complications. I would rather have something that I can slap together, know that it works, and get back to my work, without the interference of others if I don't need it.
    And that brings me to another rebuttal, about someone mentioning that if you buy the Macs, you're also going to pay for all the extra Superdrives and video cards and all that. I say to that, "Good." That way, if the cluster doesn't need to be used, then I don't have a bunch of mostly useless boxes sitting around... or if a collaborator comes around and needs a computer, I can just remove one of the computers from the cluster and let them use that for as long as they need. The point is that there are advantages and disadvantages to each setup. Now you've heard some advantages and why the scientific community might care about this. Remember, not everyone here can compile their own kernels and not everyone cares about being able to do that. Some of us, thank the deity of your choice, actually want to do something with this power and not care how it works in depth. To each their own.

    -Jellisky

  15. Re:Manual length and Macs vs. PC by overunderunderdone · · Score: 5, Informative

    The fact that a manual is shorter doesn't mean that it is a better or easier to install program.

    I would agree that comparing manual length is not a reliable guide to judge the relative complexity of two programs. The one-page doc is even a "quick start guide", not a complete manual. But I still suspect that the writer is correct that Appleseed clusters are easier to set up and maintain than a Beowulf cluster. Reading over the directions myself, it did look pretty brain-dead simple - most of that one page didn't even have much to do with the actual installation of the program but with such complicated tasks as connecting your Mac to an ethernet hub: "For each Mac, plug one end of a cable to the Ethernet jack on the Mac and the other end to a port on the (ethernet) switch." and noting a few system requirements (CarbonLib 1.2 or OS X 10.1). The installation instructions consist of "Double-click the Pooch Installer and select a drive for installation." Instructions on how to use it consist of dragging and dropping the program you want to run in parallel onto the Pooch app and "click Select Nodes..., select the computers you want to run it on, and, in the Job Window, click on Launch Job."

    Besides, if you are going to have a cluster, you want cheap, off the shelf machines such as PCs with plenty of spare parts that can be customised to suit your needs: why pay for a good 3d graphics card in every pc if you are going to do number crunching!

    This is only the case if the individual PC's are dedicated nodes and not being used for anything else. Most Appleseed clusters are made up of computers that are primarily being used for something else. School Mac computer lab by day; clustered "supercomputer" by night. The cluster that did 233 gigaflops (76 dual G4's, mostly 533's with a few 450's) was simply all of the Macs at USC working as a cluster over Christmas break. This is where the easy set up, maintenance and the ability to cobble together computers with different processors and even different OS's (some nodes may be running MacOS 9 and some nodes may be running OS X) is an advantage. The Appleseed clusters that are made up of dedicated machines are probably discarded computers they already had kicking around so cost is not an issue there either.
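For a sense of scale, the 233 gigaflops figure can be broken down per processor. This is a back-of-the-envelope sketch: the 56/20 split between 533 MHz and 450 MHz machines is taken from the MacSlash quote earlier in this thread, and the variable names are just for illustration.

```python
# Cluster makeup as quoted earlier in the thread: 76 dual-processor G4s.
nodes_533mhz = 56
nodes_450mhz = 20
cpus_per_node = 2
total_gflops = 233  # aggregate figure quoted in the comment above

total_cpus = (nodes_533mhz + nodes_450mhz) * cpus_per_node
gflops_per_cpu = total_gflops / total_cpus

# Sum of all processor clocks, in MHz, to estimate floating-point ops per clock cycle.
aggregate_mhz = cpus_per_node * (nodes_533mhz * 533 + nodes_450mhz * 450)
flops_per_cycle = (total_gflops * 1000) / aggregate_mhz  # MFLOPS divided by MHz

print(f"{total_cpus} processors, about {gflops_per_cpu:.2f} GFLOPS each")
print(f"about {flops_per_cycle:.2f} floating-point operations per clock cycle per processor")
```

Whether the quoted figure is a theoretical peak or a measured result is not stated in the thread, so treat the per-cycle number as a consistency check rather than a benchmark.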

  16. My Manual is Smaller than Your Manual! by Geoff · · Score: 5, Insightful

    I recall, back when CD-ROMs were fairly newfangled, the "manual" that came with the CD, if it was a dual-platform disk, often offered an interesting contrast.

    The Windows instructions would go on for pages, discussing running the installer application, how to get the right drivers, etc.

    The Macintosh instructions were usually:

    1. Insert the disk
    2. Double-click on the icon

    I never understood why Apple didn't market that advantage heavily.

    --

    Computers are useless. They can only give you answers. -- Pablo Picasso

  17. Why am I taking the bait... by alexhmit01 · · Score: 5, Insightful

    10 good Intel machines will not cost less than $10,000. For scientific work, I don't consider eMachines or your grey-boy solutions a "good" system.

    So, I took the bait... I went to Compaq's site and spec'ed out an equivalent workstation. Note, I'm not souping up the video card or CD-ROM like the Apple workstations. No need to waste money.

    Compaq Evo Workstation W6000, Intel Xeon 2.00 GHz/512K processor, dual processor... Upgrading to 512MB RAM. $3521.00. Note that this machine only has 10/100 networking. The Apple has Gigabit. This should matter in a cluster.

    Dell Workstation 530. Intel Xeon 2.0 GHz x2, 512MB RAM, and an upgraded sound card (Dell won't sell a dual-proc workstation without an $80 soundcard upgrade... weird). Dell did let me downgrade the video card and monitor... Price: $3878.00. Unlike Compaq, I could buy the Dell workstation with Linux (supported) instead of NT and needing to swap OSes.

    Next I went to Big Blue. They push Linux, they should sell me good Linux workstations. When I bought my last round of Penguin Computing machines (to run OpenBSD and Linux) I looked at IBM first...

    IBM's only dual processor workstation, the IBM Intellistation M Pro 6850 Tower. With a second 2.0 GHz Xeon processor, $5218.

    Real computers cost money. Flaky machines that hardware lock from time to time do not. You can't compare the Apple workstations to the bottom-barrel systems.

    In fact, at $1300 for the lowend iMac (700 MHz G4), admittedly with a silly flatscreen for this project, or $2300 for the midrange (933MHz) G4, Apple hits some good price points for this.

    Look, the new G4s (in the 933MHz and 1GHz-dual models) are sporting a 2MB L3 cache! That's damned impressive. A 2MB L3 cache should make cache misses SO infrequent that the slower memory bus speed is irrelevant.

    Look, if you need lots of power, you used to need to spend millions. You're not going to cut corners on your machines. You're looking at $3500 for an Intel dual-Xeon based solution or $3000 for the dual-G4 based Apple solution. Sure you get an unneeded Superdrive, but who cares? When the project is over, I bet you everyone in the lab is happy to take one of the Superdrives home...

    Geeze people, get a grip.

    Apple's G4 workstations are not the same quality as the computer you have in your room in your parent's house. These are real machines with:

    Gigabit Ethernet (very significant for a cluster, and unlike the PC's 32-bit, 33 MHz bus, real machines like the Apple, Compaq, or Dell workstations have 64-bit OR 66 MHz (sometimes both) PCI busses so you can actually USE the Gigabit Ethernet).

    The Apple's L3 Cache has 2MB DDR SDRAM at up to 500MHz, this is much faster than the 266MHZ DDR in PCs and comparable to the PC800 RDRAM in the Dell/IBM workstations. Sure the System RAM is slower, but a 2MB L3 cache makes this less relevant.

    The Superdrive, Firewire, and Video cards are all unnecessary here, but they are actually really nice features if these machines will be reassigned as desktop machines when the project is over. You could buy new PowerMacs when the G5s ship within 6 months and reassign these as desktop machines. The real workstations are the same. Your $45000 cluster of crap machines won't take you very far. They are trash when replaced, and if the machine hasn't been QC'd? Well, time to explain that your project needs to start over.
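The PCI point in the list above comes down to simple bandwidth arithmetic. The sketch below computes theoretical peak figures only (bus width times clock), using the nominal 33 and 66 MHz clocks from the comment; sustained throughput on real hardware is lower, and the bus is shared with other devices. The function name is made up for illustration.

```python
def pci_peak_mb_per_s(bus_width_bits, clock_mhz):
    """Theoretical peak PCI bandwidth in MB/s: bytes per transfer times millions of transfers per second."""
    return bus_width_bits / 8 * clock_mhz


# Gigabit Ethernet line rate: 1000 Mbit/s is 125 MB/s in each direction.
GIGABIT_ETHERNET_MB_PER_S = 1000 / 8

for width, clock in ((32, 33), (64, 33), (32, 66), (64, 66)):
    peak = pci_peak_mb_per_s(width, clock)
    ratio = peak / GIGABIT_ETHERNET_MB_PER_S
    print(f"{width}-bit / {clock} MHz PCI: {peak:.0f} MB/s peak, {ratio:.2f}x one direction of Gigabit Ethernet")
```

On paper a 32-bit, 33 MHz bus barely covers one direction of a gigabit link with nothing left for disks or anything else on the bus, which is why the wider or faster busses matter for cluster nodes.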

    Come on people... Quake != scientific computing

  18. Correction... by LenE · · Score: 5, Informative

    It's not your fault, because you probably didn't know this, but the USC Mac cluster didn't cost anything near $440,000, and it didn't have any 1000 MHz. G4's in it.

    At the "Macs in Science and Engineering" user conference at Macworld, they gave the general specs. of this cluster, and all of the machines were dual processors, but of different hardware generations. Although the fastest machines were dual 800 Mhz. on 133 MHz. bus, the majority were slower dual 450 and 500 Mhz. machines with 100 Mhz. buses.

    With the fact that all were dual, and ignoring depreciation on the older hardware, the cost would be at most $220,000. If you were using Dual 1 GHz. G4's, it would still be only $220,000. My notes are on my laptop, but I believe that the actual cost of the USC cluster was less than $200,000.

    Also, I assume that you think that the 270 uni-processor T-birds will scale performance linearly as well. I doubt it would only cost ~$600 per node as you would have to use Myrinet or some other fast fabric, and with three and a half times as many nodes, the latencies, hardware, and administration cost would be crippling. I have the same cost argument if you use dual Athlons, as the boards are quite rare, and the node count is almost double the Mac node count.
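The node-count and cost comparison above can be laid out numerically. This is a rough sketch: the 76-node figure for the USC cluster comes from other comments in this thread, the $220,000 ceiling and ~$600 per node are the numbers quoted here, and nothing below accounts for interconnect, admin time, or scaling losses.

```python
mac_nodes = 76                      # dual-G4 nodes, per other comments in this thread
mac_cluster_cost_ceiling = 220_000  # "at most" figure from the comment above, in dollars

athlon_nodes = 270                  # uni-processor T-bird nodes in the proposal being rebutted
athlon_cost_per_node = 600          # the ~$600 per node figure the comment doubts

node_ratio = athlon_nodes / mac_nodes
dual_athlon_ratio = (athlon_nodes / 2) / mac_nodes  # if dual-Athlon boards were used instead

mac_cost_per_node = mac_cluster_cost_ceiling / mac_nodes
athlon_hardware_total = athlon_nodes * athlon_cost_per_node

# Per-node budget left for networking before the Athlon cluster reaches the Mac ceiling.
headroom_per_node = (mac_cluster_cost_ceiling - athlon_hardware_total) / athlon_nodes

print(f"Athlon cluster has {node_ratio:.2f}x the nodes ({dual_athlon_ratio:.2f}x with dual boards)")
print(f"Mac cluster: at most ${mac_cost_per_node:,.0f} per node")
print(f"Athlon boxes alone: ${athlon_hardware_total:,}, leaving ${headroom_per_node:,.0f} per node for the interconnect")
```

Under these assumptions the cheaper boxes leave only a couple of hundred dollars per node for the fast fabric, so the comparison hinges on what that networking actually costs.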

    Your price/performance assertions don't stand up!

    -- Len

  19. Re:Manual length and Macs vs. PC by TheAJofOZ · · Score: 5, Informative
    Short documentation doesn't necessarily mean a simpler product

    Agreed, however if you'd ever actually tried to use the product you'd realise that this is not the case. Let me show you through exactly how simple it is in just 10 simple steps:

    1. Grab a bunch of Macs, a switch and a monitor.
    2. Plug Macs into the power.
    3. Plug a keyboard and the monitor into the first mac and turn it on.
    4. Configure the network through the easy to use Networking Control panel. Or alternatively don't configure it and throw a DHCP server into the mix somewhere.
    5. Install and run pooch (drag and drop from the disk image it comes on then double click).
    6. Repeat for each Mac.
    7. On the last Mac, pick an application you want to run on the cluster, drag and drop it into pooch.
    8. Select which Macs you'd like to help out with running this program.
    9. Click start.
    10. There is no step 10.....
    Voila! The best bit about this is that I've never even read the pooch manual, yet I've still managed to set up my own Mac Beowulf cluster. I've looked into Linux Beowulf clustering a number of times and gotten hopelessly lost and confused despite having respectable Linux knowledge.

    If you've ever set up a Mac Beowulf cluster you'll very quickly realise that there is no comparison in ease of use and anyone who argues otherwise is clearly uninformed.

    Like always, don't bash what you haven't tried...