Macintosh Clustering
HiredMan writes: "Wired is running an article comparing the set-up and admin of Linux Beowulf clusters versus Mac based clusters. Slant of the article is that the Macs are easier to set-up, maintain and are more flexible. They note that the Linux "how to" manual is 230 pages while the corresponding Apple document is a 1 page PDF file. Dauger Research of former Appleseed fame is mentioned as well, of course. MacSlash is also covering the article. Let the on-topic (for once) Beowulf comments fly..."
This article sounds biased. The fact that a manual is shorter doesn't mean that the program is better or easier to install.
In fact, as far as I'm concerned, I wouldn't go with a solution claiming to make computer clusters "easy" with a one page manual.
Besides, if you are going to have a cluster, you want cheap, off the shelf machines such as PCs with plenty of spare parts that can be customised to suit your needs: why pay for a good 3D graphics card in every PC if you are going to do number crunching?
"The obvious mathematical breakthrough would be development of an easy way to factor large prime numbers." Bill Gates,
I think that an important thing to remember when taking into consideration the higher cost of Apple hardware is that it costs so little to maintain over the long run.
Just one day of heavy tech support can make up the difference in cost compared with a comparable 'off the shelf' PC.
I think that the earlier statement "you want cheap, off the shelf machines such as PCs with plenty of spare parts that can be customised to suit your needs" misses the point in two ways: first, the long-run maintenance issues I mentioned; second, Apple is so standardized that most of your existing spare parts are still useful.
I can't comment on whether or not a Mac cluster is easier to create or maintain (since I've never used a Mac cluster), but I'd prefer a Linux cluster running PC hardware, because:
-- Initial build costs are much lower (dual Athlon 2000+ right now without graphics hardware is way cheaper than a dual G4 1GHz).
-- Maintenance costs are much, much lower. Anything goes wrong with a PC node, just swap out that part with another commodity part. Mac repair or parts replacement costs will eat you, especially if you start to have many, many nodes.
Plus you can modify bits of Linux if you need to optimize the behavior of your cluster for the sort of computing you do, which you can't do with Mac OS.
My $0.02.
STOP . AMERICA . NOW
... is the ease of use. Tech Professionals can't make a living supporting the platform.
... However, he hasn't done any consulting yet because all of his clients have figured it out for themselves. All they need are a few G4 Macs, some Ethernet cables, a hub and the Pooch software. Getting it up and running is as simple as installing the software and configuring it through a couple of dialog boxes. ..."
Before someone accuses me of saying they never break, always work flawlessly, and the like: They do need support. It's just that the ideal career environment is when there is more work than workers. An underwhelmed support staffer soon finds the company wants him to help unload pallets in his spare time.
When all the IT staffers know one platform, what do you think they're going to recommend come upgrade time?
From the article:
"
Cost of 10 good Intel machines to install Linux on... trivial (probably about $15,000)...
Cost of 10 good high-end Macs (about $30,000)...
Both are in the trivial range compared to the costs of time, energy, etc.
There is a more important question, which machine gives you the most bang for your buck?
We know that Photoshop runs better on the G4, what about your operation?
If the Mac gets a 2:1 performance advantage, then the costs are equal. If the Mac out-performs it regardless, you get an advantage.
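The break-even claim above is simple arithmetic, and it can be sketched in a few lines. This is only an illustration using the poster's rough cluster prices; the function names and the `relative_speed` figures are assumptions made up for the example, not benchmarks of any real application.

```python
def cost_per_unit_of_work(cluster_cost, nodes, relative_speed):
    """Dollars paid per unit of throughput, where one baseline node = 1.0."""
    return cluster_cost / (nodes * relative_speed)


def break_even_speedup(mac_cost, intel_cost):
    """Per-node speedup the Macs need so both clusters cost the same per unit of work."""
    return mac_cost / intel_cost


# Rough prices from the post; the speed ratios are hypothetical.
intel = cost_per_unit_of_work(15_000, nodes=10, relative_speed=1.0)
mac_same_speed = cost_per_unit_of_work(30_000, nodes=10, relative_speed=1.0)
mac_twice_as_fast = cost_per_unit_of_work(30_000, nodes=10, relative_speed=2.0)

print("Intel:", intel)
print("Mac at equal speed:", mac_same_speed)
print("Mac at 2x speed:", mac_twice_as_fast)
print("Break-even speedup:", break_even_speedup(30_000, 15_000))
```

Plug in a measured speed ratio for your own application to see which side of the break-even point you land on.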
For the moment, let's assume that you are getting real machines that are tested, not parts off of a sketchy vendor from pricewatch.com. If you are really trying to build a parallel computer, you want real systems, not junk that may or may not work.
This also rules out eMachines, or home computers. You are basically in the Compaq Workstation, Dell Workstation, HP Workstation, or IBM Workstation area. You aren't setting up a bunch of Presarios.
I don't think very many people will choose to use the Mac for clustering _even if_ it is easier than other platforms as this article seems to suggest.
Macs are luxury computers. They are generally more expensive than their custom PC counterparts, and Apple limits the BTO options that you can use to reduce the price of their G4 towers.
If you wanted to cluster 10 G4 towers, you'd be paying for 10 superdrives, 10 3d accelerated video cards, 10 snazzy cases etc etc. Most people building a cluster will want each system to only have the components they need: processor, memory, network IO, backplane bandwidth etc. You won't want to pay for components you won't use (like 9 extra superdrives).
So unless Apple decides to offer special deals for those who want clustering, I think the economics of the situation will work against Macs and in favour of x86 PCs running Linux where the economies of scale conspire to lower component costs to the minimum.
And do note that the manual (if it's the Beowulf book everyone cites) is mostly about how to PROGRAM it (e.g., includes an intro to MPI).
Comparatively, the Beowulf books talk about what kind of network infrastructure you'll want for different types of applications, different standard communication libraries to use between the nodes, automatic administration of nodes, how to make redundant nodes, etc.
You also want to have an infrastructure for automatically loading software on computers, perhaps booting off the network... none of this is available on that PDF. Perhaps even not possible.
And you won't get very far telling me that it's easier to upgrade OS X to OS X.1 or whatever, where you have to go around with a CD and reboot every computer on a 1024-node cluster, compared with just having them all run "apt-get dist-upgrade".
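To make the "have them all upgrade themselves" idea concrete, here is a minimal sketch that builds one remote upgrade command per node. It assumes Debian-style nodes reachable over passwordless ssh as root; the hostnames (`node0001` and so on) and the helper name are hypothetical. The sketch only builds and prints the commands, it does not run anything.

```python
import shlex


def upgrade_commands(nodes, user="root"):
    """Build one ssh command per node that refreshes package lists and upgrades."""
    remote = "apt-get update && apt-get -y dist-upgrade"
    return [["ssh", f"{user}@{node}", remote] for node in nodes]


# Hypothetical naming scheme for a 1024-node cluster.
nodes = [f"node{i:04d}" for i in range(1, 1025)]
commands = upgrade_commands(nodes)

print(len(commands), "commands; first one:")
print(shlex.join(commands[0]))
```

To actually execute them you could pass each list to `subprocess.run(cmd, check=True)`, ideally a few nodes at a time so a bad package doesn't take down the whole cluster at once.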
In a nutshell: if you need a high-performance computing cluster, you need to go with a Linux-based beowulf cluster. Perhaps on Apple hardware, perhaps on Alpha, probably on x86. If, on the other hand, you want a toy that can run a fractal program really fast (perhaps povray too) and don't have a real application then this Mac cluster is probably what you need.
-- Erich
Slashdot reader since 1997
Was building a super computer supposed to be easy? Chances are if you have a reason to build one you would have the technical ability to follow a 230 page user manual. Then again, maybe "Super computers for dummies" would have a bigger audience than I'd expect.
thirsty*i^2
"Ya I finished that last week, it just doesn't work"
MOSIX clusters are a one-liner to set up, for example. I challenge Apple to beat that!
I'm not sure about Compaq's One-Stop Linux Clustering. I've never got it to compile. But, assuming it can be made to work, I bet it'd be pretty decent, too.
Last, but by no means least, clustering in the Real World tends to be through PVM or MPI, which are platform-independent. Hardly anyone uses OS-specific clustering, because hardly anyone but high-energy physicists ever develops large clusters in the first place!
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
How is a Mac "easier to set up" in a Beowulf cluster than a group of identical PCs?
I can see where the author might make a point to say that the Mac is nice to use for a cluster because Mac hardware doesn't really change much from box to box, but the same could be said for a group of identically built PCs. In fact, most real-world (read: not your bedroom) Beowulf cluster nodes are NOT loosely conglomerated machines with wildly different capabilities from node to node. Most clusters are planned out well in advance, with each node precisely equal in terms of its hardware and horsepower.
"It's easy to set up because all of your nodes are the same with a Mac!!" ceases to be a valid "advantage" when the same can be said of a group of SGI O2 boxes, a group of Sun E10K boxes, or a group of lowly 386 PC boxes.
Besides, "it's see-thru orange!!!" shouldn't top your list of reasons to purchase Macs for your cluster. You buy a pile of 1U rackmounts, because you normally don't have a whole room to dedicate to a cluster. (duh)..
Cheers,
Bowie J. Poag
If you read the one page pdf file, it assumes you already have a network of OSX boxes set up. The same thing in linux would look like:
Requirements: Linux Network with rsh enabled, preferably with firewall and IP Masquerade.
1.) Download jobmanager and bWatch rpm's
2.) Do a rpm -ivh *.rpm
3.) Add list of nodes to .rhosts files on every node
4.) List all nodes in /etc/hosts files
5.) In a terminal issue: jr -q [process command]
Voilà! You're doing distributed computing!
! == goatse.cx
Actually, native Cocoa apps have the capability to be built using distributed objects quite easily. In fact, the mechanisms used for multithreaded communication (NSConnection, NSPort, etc) are the same classes you use to communicate with other processes - on ANY machine.
The mechanism they use relies heavily on the dynamic nature of Objective-C objects, so I'm guessing it's NOT based on some standard (SOAP,CORBA,RPC,.NET). That would make it hard to integrate it into any cross platform clusters, but we were talking about Photoshop, weren't we?
So, it boils down to simply this: If you write a Mac OS X app, write it threaded and use Cocoa. If you do that, you'd be amazed what sort of functionality you get for 'free' - including being able to distribute your app across clusters!
Down with Carbon!
Culture is more than commerce
The point isn't flexibility: sure you can be more flexible with a Linux-based cluster. You can tweak and tune a Linux-based cluster to meet your specific needs. This is why Google uses such a cluster.
The point isn't about cost: the real difference between a decent name-brand PC and a Mac is negligible. In the case of these Mac-based clusters, since the clustering software is just another app, a Mac cluster can be set up and torn down quite readily. You come into the lab on Wednesday to find your workstation has been appropriated for the cluster.
The point is accessibility! If you're a physicist in a small school looking to model some complex interaction, you can rent some computer time from somebody (expensive), build a cluster (very expensive, because you'll have to hire somebody to do it--physicists aren't likely to be Beowulf experts), or use the Mac clustering software (expensive, because you'll have to buy the machines if you don't already have them, but you can do it yourself, quickly, without much bother).
Accessibility! It's what keeps Apple in business. This is another example of it.
I'm pretty disappointed in the posters who knock it, because it strikes me that they are a bit put out that they won't remain the Technical Elite because they've got the spare time to read the 230-page Beowulf manual.
Potato chips are a by-yourself food.
Photoshop is optimized for the G4. That was my point. We're not clustering Quake here. We're talking about special purpose applications that do scientific calculations.
If your application does better on the Intel, you are likely better off considering a Linux cluster. However, if it isn't much better, you might be better off with the Mac cluster by adding a few more machines to compensate... depends on the costs of time.
If you are running an application that, LIKE Photoshop, does better on the G4, you will see the price performance favor the Mac line. That's my point.
If this market was a decent size, I bet Apple could get some really competitive cluster systems. It would be nice to see an Apple dual or quad G4-1 GHz, with a CD-ROM, ATI Rage 128, and Gigabit Ethernet for the scientific community.
They could make the machine without PCI slots and fit in a 1U case for OS X processing goodness.
However, the reality is that the extras (better video card, Superdrive, etc.) don't add much to the Apple's price. The right form factor, on the other hand, could make them tremendous cluster machines.
Alex
I'm an Apple user, and I agree, hiding is a good thing. I have little or no desire to know HOW a computer works, I just want it to work. Just as I have little or no desire to know HOW a car works, a TV, a stereo, or a fridge.
Obviously, it is nice to know these things, especially if you find them interesting or desirable to know. But the point is, Apple believes that the gears, wheels, and cogs of a computer are not what the computer is about. It is about performing previously complex tasks in as simple and intuitive a manner as possible.
Others (and this means most of the people reading this) are actually interested in the guts and the inner workings of a computer. For those of you that are, enjoy it. It's a free country. Just be glad computer geeks like you, and Apple goofballs like myself, can share polite conversation over a decent cup of coffee.
As an Apple user, I concern myself with the inner workings of other things, not the computer itself. This isn't to say that I'm some sort of computer dingbat, but I'm not into coding, the command line, or any such pursuits. You concern yourself with the inner workings of things that some Apple users may find boring. Tit-for-tat, six of one, half a dozen of the other...
I, and other computer users, believe that a computer should be so powerful that it is actually EASIER to use. And with every advance that Apple, Microsoft, the Linux & *nix communities, the processor designers and engineers, the programmers, the manufacturers, and the forward thinking researchers out there make, the better it gets. Hopefully this means the EASIER they get, as well.
Try pricing a replacement motherboard out of warranty. Also, the parent post to you was not about adding components to a standard system. It was about getting a stripped down system from Apple for cluster use. Don't need a CDROM or 3D video for that. I gotta admit though, it does look easy to set up. The big question mark is how good is their SDK for recompiling code for parallel processing.
I recall, back when CD-ROMs were fairly newfangled, the "manual" that came with the CD, if it was a dual-platform disk, often offered an interesting contrast.
The Windows instructions would go on for pages, discussing running the installer application, how to get the right drivers, etc.
The Macintosh instructions were usually:
I never understood why Apple didn't market that advantage heavily.
Computers are useless. They can only give you answers. -- Pablo Picasso
Everyone here seems to be suggesting that the manuals indicate nothing. "Apple has weak docs!" seems to be the summary. But can we entertain the notion that perhaps while 1 page is too short, 230 pages is far too long? If so, is this because the people who wrote the manual are not professional authors, and got too wordy? Or is it because Linux just isn't usable enough?
And whatever you think, isn't it reasonable to suggest that making Linux more intuitive and the manuals more succinct might help rid us of idiot lusers who won't RTFM? They won't really go away, but if we actually take usability seriously, perhaps developers can get half those people to solve their own problems. Wouldn't this be a good thing? I guess that's a rhetorical question -- I am sure it is a good thing. I spend my entire workday building apps for people, and one usability tweak can mean the difference between 20 nagging people a day and 2. My team even has blacklisted a couple people in the company, whose projects are always time-sinks to build and time-sinks to maintain. Why? Because those people are control freaks who won't let us fix usability errors, and my team ends up spending their days on support. If you can build something intuitive and usable, both the users and the developers will be much happier.
My Greasemonkey scripts for Digg &
Or...they could keep doing what they're doing, be successful, and have all of the people be jealous that the coolest OS out there doesn't run on their ugly beige hardware.
Okay, that was a bit of flamebait. But still, Apple has a good thing going now. They piss some people off, but they are being pretty darned successful compared to the other major box manufacturers. I honestly don't think a new Dell product could get the cover of Time.
There should be a moratorium on the use of the apostrophe.
Max V.
NeXTMail/MIME Mail welcome
The size is what really kills them.
A 45U rack will hold 45 1U dual-CPU systems. Even more of the server-blade type systems (280 of the Compaq in a 42U rack).
The only way to rackmount a G4 that I can find is at Marathon Computer. A set of replacements for the "handles" for $225 or a whole new case which is 4U but is $550. Given a 45U square-hole 19" rack, you could squeeze in 11 dual CPU G4s.
I don't care what your performance fantasies are about the G4 systems, they're not more than 4x faster than dual x86 systems.
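The density argument above reduces to integer division on rack units. A minimal sketch, using the 45U rack, 1U PC servers and the 4U G4 case from the post; the function name is made up for illustration, and it ignores space for switches, power and cooling.

```python
RACK_UNITS = 45  # rack height from the post


def systems_per_rack(unit_height, rack_units=RACK_UNITS):
    """How many systems of a given height (in U) fit in one rack."""
    return rack_units // unit_height


pc_systems = systems_per_rack(1)   # 1U dual-CPU x86 servers
g4_systems = systems_per_rack(4)   # dual G4s in the 4U replacement case

# Per-system speedup a G4 would need to match the x86 rack's total throughput.
required_speedup = pc_systems / g4_systems

print(pc_systems, "x86 systems vs", g4_systems, "G4 systems per rack")
print("G4 must be about", round(required_speedup, 2), "times faster per system")
```

This is where the "not more than 4x faster" remark comes from: with 45 systems against 11, each G4 box would have to outrun an x86 box by a bit more than four to one just to tie per rack.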
It's interesting, all the comments I've read so far, including yours, seem to deal with this as a dichotomy between Linux/Intel and OS10/PPC. Don't forget you can run Linux on PPC. For a high performance dedicated cluster that would definitely be an option I would look at.
Of course, there are situations where the Mac software has advantages that will really shine. Like if your "Cluster" is really just the lab machines at the college, acting as a cluster when not being used for DTP and Video editing or whatever. In that case the ease of setting this up with Mac OS10 would be a real plus.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Friends don't let friends enable ecmascript.
10 good Intel machines will not cost less than $10,000. For scientific work, I don't consider eMachines or your grey-boy solutions a "good" system.
So, I took the bait... I went to Compaq's site and spec'ed out an equivalent workstation. Note, I'm not souping up the video card or CD-ROM like the Apple workstations. No need to waste money.
Compaq Evo Workstation W6000, Intel Xeon 2.00 GHz/512K processor, dual processor... Upgrading to 512MB RAM. $3521.00. Note that this machine only has 10/100 networking. The Apple has Gigabit. This should matter in a cluster.
Dell Workstation 530. Intel Xeon 2.0 GHz x2, 512MB RAM, and an upgraded sound card (Dell won't sell a dual-proc workstation without an $80 soundcard upgrade... weird). Dell did let me downgrade the video card and monitor... Price: $3878.00. Unlike Compaq, I could buy the Dell workstation with Linux (supported) instead of NT and needing to swap OSes.
Next I went to Big Blue. They push Linux, they should sell me good Linux workstations. When I bought my last round of Penguin Computing machines (to run OpenBSD and Linux) I looked at IBM first...
IBM's only dual processor workstation, the IBM Intellistation M Pro 6850 Tower. With a second 2.0 GHz Xeon processor, $5218.
Real computers cost money. Flaky machines that hardware lock from time to time do not. You can't compare the Apple workstations to the bottom-barrel systems.
In fact, at $1300 for the lowend iMac (700 MHz G4), admittedly with a silly flatscreen for this project, or $2300 for the midrange (933MHz) G4, Apple hits some good price points for this.
Look, the new G4s (in the 933MHz and 1GHz-dual models) are sporting a 2MB L3 cache! That's damned impressive. A 2MB L3 cache should make cache misses SO infrequent that the slower memory bus speed is irrelevant.
Look, if you need lots of power, you used to need to spend millions. You're not going to cut corners on your machines. You're looking at $3500 for an Intel dual-Xeon based solution or $3000 for the dual-G4 based Apple solution. Sure you get an unneeded Superdrive, but who cares? When the project is over, I bet you everyone in the lab is happy to take one of the Superdrives home...
Geeze people, get a grip.
Apple's G4 workstations are not the same quality as the computer you have in your room in your parent's house. These are real machines with:
Gigabit Ethernet (very significant for a cluster). Unlike the PC's 32-bit, 33 MHz bus, real machines like the Apple, Compaq, or Dell workstations have 64-bit OR 66 MHz (sometimes both) PCI busses, so you can actually USE the Gigabit Ethernet.
The Apple's L3 cache is 2MB of DDR SDRAM at up to 500MHz; this is much faster than the 266MHz DDR in PCs and comparable to the PC800 RDRAM in the Dell/IBM workstations. Sure, the system RAM is slower, but a 2MB L3 cache makes this less relevant.
The Superdrive, Firewire, and video cards are all unnecessary here, but they are actually really nice features if these machines will be reassigned as desktop machines when the project is over. You could buy new PowerMacs when the G5s ship within 6 months and reassign these as desktop machines. The real workstations are the same. Your $45000 cluster of crap machines won't take you very far. They are trash when replaced, and if the machine hasn't been QC'd? Well, time to explain that your project needs to start over.
Come on people... Quake != scientific computing
Isn't that a coincidence. I just came from a Mac discussion forum which was discussing the same linked article on the Mac Head site.
I show up in Linux city and what do I find? Well, I find a lot more messages, but that doesn't mean a thing.
All I have to do is take the ones on the Mac board, switch Mac and Linux, and do the same here. They're interchangeable.
The Wintel chappies are bug-eyed with glee and laughing it up at us dumb kiddies.
Heh. Lotsa Linux types haul an iBook around. And lotsa Mac sites run Linux on their servers. Does that suggest any thing? Maybe we should check out these other guys, maybe?
Why are the penguinites and mac heads banging? Maybe.....just maybe, there's a little objectivity here in 10% of the posts. The others are either ill informed or prejudiced.
Yeh? Well I posted about the same dumb message you just read on the mac head board too.
heh.
well, then that's great that you're also paying Apple's inflated prices for:
67 56k modems (not optional)
67 Superdrives (DVD-RAM, not optional)
67 GeForce4 video cards (not optional)
67 sets of hyper-inflated Apple RAM which you could otherwise get from any other vendor at half the price. (512 Meg, not optional on that model).
If Apple would work a deal where I could get the same boxes without these add-ons for say, $1500 a piece, THEN we could make a deal on a cluster.
Not to mention, you'd probably want to hack the OS in some way so that you could kill CPU-hog Aqua.
I'm just trying to point out that Apple's desktop machine is not necessarily optimal for this kind of application.
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
>a brand new computer than to hire me for 1 day to fix it
.. it almost makes me want to point out how, despite one's salary, employers consistently underestimate the costs of labour (because the goal is to drive it ever cheaper) and typically overestimate the true cost of hardware. Add to that the completely ignored (in a free market) social costs depending on your selection, and you've got some monopolies that are very difficult to explain from a total-cost perspective ...
Indeed! Well, I'm salary, but it's still about 10 days of my time = new computer.
Nothing speaks more about the falsehood of the market choosing wisely than the tech sector, as it relates to the perception of technology costs versus people costs. Who cares if it's 75% as fast if I need to spend less time thinking, caring, stressing over it.
It's kind of funny
"Old man yells at systemd"
t.
There's a few people saying the cost of a Linux cluster of similar computing power would be much less than a cluster of Mac towers. That is completely wrong, and here's why:
1.) Power vs. cost. The G4, with AltiVec-enabled MPI code, can blast data through in 128-bit chunks. Steve Jobs loves to term this the "Velocity Engine", and it is much, much more powerful when doing solid number crunching -- exactly what would be taking place on these clusters. It's not as amazing for day to day operations, but the capability is there to quadruple the data flow of a traditional processor when doing clustered computing. Typical AMD/Intel processors can just not do this.
2.) Maintenance. This is key. I maintain a Linux cluster and have worked with others in the past, and wonderful as they are, they require lots of maintenance. It's pure and simple math. You probably built all 16 or whatever nodes with individual parts made by various companies, and inevitably, each of those elements will have problems. This makes debugging and fixing hardware problems unbelievably painful, especially when you also have to deal with multiple parts vendors. When you use Apple Power Macs, ALL hardware problems can go through ONE support source, and that's Apple. Plus, they are pre-built, tested, and refined in Apple's R&D labs far before they make it to your cluster room. This saves such incredible amounts of time and money, it definitely pays for the extra cost of the computers themselves. I wish I could explain to you the sheer pain of keeping a cluster alive which constantly had one part go bad here and there -- but one part, sixteen computers, each with eight or nine significant custom-attached parts... well, it meant a lot of troubleshooting time, a lot of replacement time, and having to deal with far too many different companies to get the parts and support I needed.
3.) MacOS X. Clustering under previous MacOS versions was, despite the best efforts of AppleSeed, absolutely reprehensible. The operating system was simply not designed to do massive computing projects, and it was not efficient at all. Definitely not worth it despite the work of the pioneers in the field. With OS X, you now have a BSD operating system, one that has done clustered parallel computing for over a decade. MPI, with AltiVec enhancements; gcc with multiprocessor compilation support, you name it, it now runs under OS X and, with the operating system natively supporting the G4, it does it DAMNED fast.
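The "quadruple the data flow" figure in point 1 follows from how many values fit in one 128-bit vector register. The sketch below is only a toy model in plain Python, not AltiVec code: it counts how many operations an element-wise add would take when four 32-bit values are processed per step. The function names are hypothetical, and real speedups depend on memory bandwidth and how well the code vectorizes.

```python
VECTOR_BITS = 128  # width of one AltiVec register


def lanes(element_bits, vector_bits=VECTOR_BITS):
    """Number of elements of the given size that fit in one vector register."""
    return vector_bits // element_bits


def vector_add(a, b, element_bits=32):
    """Add two equal-length lists one register's worth at a time.

    Returns the result and the number of vector operations it took.
    """
    width = lanes(element_bits)
    out = []
    ops = 0
    for i in range(0, len(a), width):
        out.extend(x + y for x, y in zip(a[i:i + width], b[i:i + width]))
        ops += 1
    return out, ops


data = list(range(8))
result, vector_ops = vector_add(data, data)
print("lanes for 32-bit values:", lanes(32))
print("vector ops:", vector_ops, "vs scalar ops:", len(data))
```

In this idealized model eight additions take two vector steps instead of eight scalar ones, which is the best case the post is describing.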
"What the heck do you know," you might ask. Again, I maintain a 16-node Linux cluster for a plasma simulation group at the University of Colorado, and am also the CU campus rep for Apple Computer. I am well-versed in both OS X and Linux, and their scientific computing environments, and have experience in clustering in both environments. I am in the process of establishing a scientific computing initiative at CU, and I am doing it on behalf of Apple because the G4s (and soon, G5s) are simply the best platform for multi-platform scientific and high-intensity computing.
The best saving grace from a sysadmin's point of view, is that I will never have to worry about maintaining the variety of parts in those damned Linux clusters. The operating system is wonderful for scientific computing, yes, but there's simply no cost-effective way to purchase and maintain Linux-based PC hardware that could ever compare to the Mac. From an overall perspective, and this is definitely the most important aspect, those who are using massive parallel clusters of computers need their data crunched fast, and the G4 processor, combined with AltiVec-enhanced code, is simply the fastest way to crunch data, straight and simple.
I hope that clears up the issues for people, because that's how it is. Just the facts, ma'am.
Ryan Bruels
Apple Campus Representative
University of Colorado, Boulder
bruels@mac.com * 303-332-5434
"All your base are belong to this file I send in order to have your advice."
The original 128K Macintosh came with a thin manual and a cassette tape (which you played along with a movie running on the screen). This was enough. One of the first Mac commercials showed a PC, with a stack of books falling on the table, and a Mac, with the thin manual floating down.
However, they made the same error that you make: thinking that people select for ease of use. They don't. This is what happens:
The sum total of this is what I call "the Acolyte effect." An Acolyte is someone studying for the priesthood. Computer acolytes are attracted by the pseudo-mystical nature of software; learning its ins and outs is for them a rush. The choice of computers and software becomes a social hierarchy.
Presumably, most of the annoyance at Apple in this community comes from the whole whacking-unauthorized-clone-makers-with-a-big-stick attitude it adopted.
Very hacker-unfriendly, and more monopolist than Microsoft. In PC land, there are almost always at least three suppliers for every major component. (e.g. CPUs: Intel, AMD, VIA, Transmeta; Motherboard chipsets: Intel, AMD, VIA; etc...)
You may have a point about the jealousy, though. Although my hardware is neither beige nor ugly -- and each of my components was selected at my choosing -- I have to admit that it would be kind of neat to mess around with OS X for a while. Now, if only Apple would let down its sometimes-whimsical sometimes-Microsoft-esque-monopolist schizophrenia for long enough to realize that it could really change the world and make a killing at the same time by entering the PC OS market... But, alas; Star Trek was crushed long ago in favour of the misguided hardware company vision.
Heck... If they leave the price/performance ratio wins and the majority market share to PC land, the Dells of the world will gladly reward Apple with all of the "cover of Time" success it wants. =)