Macintosh Clustering
HiredMan writes: "Wired is running an article comparing the set-up and admin of Linux Beowulf clusters versus Mac based clusters. Slant of the article is that the Macs are easier to set-up, maintain and are more flexible. They note that the Linux "how to" manual is 230 pages while the corresponding Apple document is a 1 page PDF file. Dauger Research of former Appleseed fame is mentioned as well, of course. MacSlash is also covering the article. Let the on-topic (for once) Beowulf comments fly..."
What about cost? The cost (monetary, not time) of setting up a Linux cluster vs. a Mac cluster?
I think there are pros and cons of both clusters.
Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
Why haven't I heard about these Beowulf "clusters" on Slashdot before?
The Linux manual is a Beowulf cluster of Mac manuals.
This article sounds biased. The fact that a manual is shorter doesn't mean that it is a better or easier to install program.
In fact, as far as I'm concerned, I wouldn't go with a solution claiming to make computer clusters "easy" with a one page manual.
Besides, if you are going to have a cluster, you want cheap, off the shelf machines such as PCs with plenty of spare parts that can be customised to suit your needs: why pay for a good 3D graphics card in every PC if you are going to do number crunching?
"The obvious mathematical breakthrough would be development of an easy way to factor large prime numbers." Bill Gates,
...beowulf cluster of beowulf cluster jokes.
> Let the on-topic (for once) Beowulf comments fly...
Sorry d00d, but it ain't no fun when it's legal.
Sheesh, evil *and* a jerk. -- Jade
Finally we may rejoice! For once Apple has surpassed the user-friendliness of Linux! Let the merriment begin!
"Ask me about Loom"
but i guess u cant achieve the same density spacewise... and i feel that drilling holes in
a G4 case is a sin...
i'd like to see that document... so in 1 page it is going to tell me how to do everything concerning this clustering, or i just plug them together and assume the options are all 'optimized' for me???
i read the whole 230 pages of the linux how to and still find myself asking 'well how do i do XXXXXXXX'
my guess is the pdf is an ad for the ipod and imac claiming 'seamless integration with your new cluster of macs'
MARIJUANA, SHROOMS, X: ONLINE?! - E
Honestly, I'm surprised. Yes, it's easier on Mac OS X - the company spent millions of dollars developing and refining an imperfect GUI that succeeds in bringing more transparent administration to a machine.
Whoo. That's not tough.
If linux is to make further inroads (and I by all means wish Apple luck in the same) against Microsoft in the server arena, contributors must work towards this goal. It's the interface, stupid! I don't care how many geeks' grannies can send e-mail from the command prompt, but the MCSE-in-a-box crowd isn't going to go for it if it isn't simple to set up. Reading a 200-page howto isn't going to cut it, especially with the level of technical writing skill out there....
1000 deadugly apples lined up. Then I'd rather go for the slower pentium based ones.
I think that an important thing to remember when taking into consideration the higher cost of apple hardware is that it costs so little to maintain over the long run.
Just one day of heavy tech support can make up the difference in cost compared with a comparable 'off the shelf' pc.
i think that the earlier statement "you want cheap, off the shelf machines such as PCs with plenty of spare parts that can be customised to suit your needs" misses the point in two ways. first, the long-run maintenance issues i mentioned; second, apple is so standardized that most of your existing spare parts are still useful.
Having used the old NeXTSTEP APIs (which I believe have been ported to OS X under the name Cocoa), I can say that they are well suited for cluster computing.
I remember Richard Crandall and the mathematica guy (Wolfram) using Zilla (an old Next distributed computing program) to crack the world's largest prime in the mid nineties...
Anyone know if Zilla is back on OS X?
Also, the Gigabit Ethernet on the motherboard and the large 2MB cache on the PowerPC chips will go a long way toward making these machines a good cluster.
It's been a while since I've done distributed computing (hey, I am out of academia) but OS X will hopefully make the whole shebang easier...
I can't comment on whether or not a Mac cluster is easier to create or maintain (since I've never used a Mac cluster), but I'd prefer a Linux cluster running PC hardware, because:
-- Initial build costs are much lower (dual Athlon 2000+ right now without graphics hardware is way cheaper than a dual G4 1GHz).
-- Maintenance costs are much, much lower. Anything goes wrong with a PC node, just swap out that part with another commodity part. Mac repair or parts replacement costs will eat you, especially if you start to have many, many nodes.
Plus you can modify bits of Linux if you need to optimize the behavior of your cluster for the sort of computing you do, which you can't do with Mac OS.
My $0.02.
STOP . AMERICA . NOW
imagine a mosix cluster of dual-gigahertz g4s!
Wow, imagine one of these clustered machines running ALL BY THEMSELVES!
Strange, it doesn't seem to have the same comedic value this way...
In flyspeck 2pt font, printed onto A0 paper....
Pooch Quick Start
Requirements: Macintoshes running OS X 10.1 or later with proper connections to the Internet. (If the Macs are on an isolated network, manually configure their Network system preferences to use unique IP addresses from 192.168.1.1 to 192.168.1.254.)
Installation: Double-click the Pooch Installer. Repeat for additional Macs on the same local area network.
Congratulations! You have just built your first parallel computer. Now to test it:
1. Select a parallel application: Download the AltiVec Fractal Carbon demo and drag it from the Finder to the Pooch alias icon on the desktop.
2. Select nodes: To add other nodes, click on Select Nodes from the Job Window that just appeared to invoke the Node Scan Window. Double-clicking on a node moves it to the node list of the Job Window.
3. Launch: Click Launch Job in the Job Window to start your parallel job. Pooch should be distributing the code and launching the parallel application.
Congratulations! You are now operating your first parallel computer.
pooch@daugerresearch.com
http://daugerresearch.com/pooch/
Copyright © 2001 Dauger Research, Inc.
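For anyone wiring up an isolated network as the Requirements line of the quick start describes, here is a minimal sketch of handing out unique addresses from the 192.168.1.1 to 192.168.1.254 range. It is not part of Pooch; the function name and the node names are made up for illustration, and it only uses Python's standard ipaddress module.

```python
import ipaddress


def assign_addresses(node_names, network="192.168.1.0/24"):
    """Map each node name to a unique host address in `network`.

    Hypothetical helper for illustration; not a Pooch feature.
    """
    # hosts() skips the network and broadcast addresses, so a /24
    # yields 192.168.1.1 through 192.168.1.254.
    hosts = iter(ipaddress.ip_network(network).hosts())
    mapping = {}
    for name in node_names:
        try:
            mapping[name] = str(next(hosts))
        except StopIteration:
            raise ValueError("more nodes than available addresses") from None
    return mapping


if __name__ == "__main__":
    # Node names are placeholders.
    for name, addr in assign_addresses(["mac-a", "mac-b", "mac-c"]).items():
        print(name, addr)
```

You would still type each address into that Mac's Network system preferences by hand; the sketch only guarantees that no two nodes get the same one.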
... is the ease of use. Tech Professionals can't make a living supporting the platform.
From the article:
"... However, he hasn't done any consulting yet because all of his clients have figured it out for themselves. All they need are a few G4 Macs, some Ethernet cables, a hub and the Pooch software. Getting it up and running is as simple as installing the software and configuring it through a couple of dialog boxes. ..."
Before someone accuses me of saying they never break, always work flawlessly, and the like: They do need support. It's just that the ideal career environment is when there is more work than workers. An underwhelmed support staffer soon finds the company wants him to help unload pallets in his spare time.
When all the IT staffers know one platform, what do you think they're going to recommend come upgrade time?
I looked into using G4's for this purpose recently, but decided against it because of the lack of good Fortran compilers, either for OSX or Linux.
Price and simplicity have their place, but generally not at the expense of functionality and performance.
They note that the Linux "how to" manual is 230 pages while the corresponding Apple document is a 1 page PDF file.
Sounds like an old Apple commercial called "Manuals" (sorry, I spent ten minutes Googling and still did not come up with link to the ad) that showed an IBM PC with a stack of huge binders thundering down from the sky into a heap next to it... then panned over to the 128K Mac, as its single, thin manual fluttered down like a feather in comparison.
~Philly
While I've never set up a Beowulf cluster myself, I do know that Mosix is very easy to set up. It basically just involved recompiling the kernel (one of the easiest, and yet most frightening sounding things one can do on a Linux box), editing a text file, and installing a daemon. I disbanded my Mosix gang though, since two of the three boxes had only 16MB of RAM and Pentium 133 CPUs, and I wasn't noticing enough of a speed improvement in kernel compiles to justify the transition from a comfortable 747 noise level to a painful F-14 with afterburners.
That said, I don't think point-and-click people have any business setting up a cluster. The ability to use a CLI says something about your intelligence (or at least your desire to use a CLI ;p), and those who have no desire to learn even simple bash, probably aren't smart enough to need a cluster or use it wisely. Even if we're talking about graphics houses setting up clusters of Macs to do their big renders, I still say, "Hire a professional." (Plug:) Like me.
A solution to the problem with music today
Finally Apple has hardware (powerful G4s and gigabit networking) and software (Mac OS X with preemption, protection, and a mature TCP/IP stack) that can really handle this sort of thing.
I mean, this shit flew under Mac OS 9 and 400MHz G3s. Now we have Mac OS 10.1 and *dual* GHz machines with Gigabit ethernet. I can't imagine the power.
Wouldn't it be great if "plug and play" clustering became a reality? Say your office mates are out to lunch, or there's no one scheduled to use the school computer lab for the next hour and you want to render the effects for your three-hour iMovie, or you want to perform batch despeckle on a few hundred images in Photoshop...
Nothing against Linux (I use it myself for a router), but a three-day setup for Beowulf clustering isn't a great deterrent if your calculations will be going for a month or two.
The type of clustering we're talking about here is something that could potentially appeal to the average SOHO or school, where they have five to 500 general-use Macs that have processor cycles to spare.
My question is this:
What would it involve to make Mac OS X and every program that runs natively on it to be able to take advantage of clustering right out of the box? If they can natively use multiprocessing, how much of a leap is it to patch the OS to natively support clustering?
Not only would this be great for techies, but it seems that this would be a great incentive to volume sales from Apple, where they now generally only get one or two Macs per site and the rest are Wintel workstations.
Why don't you just make 10 louder and make 10 be the top number...and make that a little louder?
Available from here
I think...I think it's in my basement. Let me go upstairs and check. -M.C. Escher (1898-1972)
Uh, the article has the link to the one page PDF.
For instance, what makes it easier, the OS? Is it the Aqua part, or is everything needed present in Darwin? What would be the advantages and disadvantages of a Darwin-based cluster versus an Apple cluster or a Beowulf cluster? A Darwin PPC cluster versus a Darwin x86 cluster?
Cost of 10 good Intel machines to install Linux on... trivial (probably about $15,000)...
Cost of 10 good Highend Macs, (about $30,000)...
Both are in the trivial range compared to the costs of time, energy, etc.
There is a more important question, which machine gives you the most bang for your buck?
We know that Photoshop runs better on the G4, what about your operation?
If the Mac gets a 2:1 performance advantage, then the costs are equal. If the Mac out-performs it regardless, you get an advantage.
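The break-even point in the comment above can be checked with a couple of lines of arithmetic. This is only a sketch using the poster's rough hardware prices ($15,000 and $30,000 for ten machines each); the function names are made up for illustration, and "relative speed" stands for whatever benchmark matters to your own workload.

```python
def cost_per_unit_of_work(cluster_cost, relative_speed):
    """Hardware dollars spent per unit of throughput."""
    return cluster_cost / relative_speed


def break_even_speedup(expensive_cost, cheap_cost):
    """How much faster the pricier cluster must be to match cost per unit of work."""
    return expensive_cost / cheap_cost


PC_COST = 15_000   # ten Intel machines, figure from the comment
MAC_COST = 30_000  # ten high-end Macs, figure from the comment

if __name__ == "__main__":
    needed = break_even_speedup(MAC_COST, PC_COST)
    print("Macs must be", needed, "times faster to break even")
    # At exactly that speedup, both clusters cost the same per unit of work.
    print(cost_per_unit_of_work(MAC_COST, needed),
          cost_per_unit_of_work(PC_COST, 1.0))
```

Anything above that ratio favours the Macs on hardware cost alone; anything below it favours the PCs, before counting setup and maintenance time.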
For the moment, let's assume that you are getting real machines that are tested, not parts off of a sketchy vendor from pricewatch.com. If you are really trying to build a parallel computer, you want real systems, not junk that may or may not work.
This also rules out eMachines, or home computers. You are basically in the Compaq Workstation, Dell Workstation, HP Workstation, or IBM Workstation area. You aren't setting up a bunch of Presarios.
I'm no genius but it doesn't take one to realise that the top priority for people setting up clusters isn't how easy they are to set up initially but how they behave once they are in use.
To most sysadmins, the performance and reliability of the cluster once it's up and running would be top of priority list.
Whether the cluster took 5 minutes or half a day to set up would be irrelevant compared to how quickly they could do the task in hand and how much ongoing maintenance was required. (I'm not saying that Macs aren't reliable, so don't flame me for that, just that, in the real world, people think about these things.)
Additionally, a fair proportion of real world clusters won't be built from scratch using brand new boxes. More than likely, several machines in the new network will have been appropriated from elsewhere. The likelihood that these will be all Macs is low, whereas the chance that they'll be able to run Linux is much higher.
And even if you were building from scratch, cost would be a factor. Last time I checked, you could get far more bang for your buck buying PCs than Macs.
Again, I'm not saying that you shouldn't build a Mac cluster only that, given the alternatives, I doubt the few pros outweigh the many cons.
"Accept that some days you are the pigeon, and some days you are the statue." - David Brent, Wernham Hogg
This macintosh clustering app (pooch) is both an amazing piece of technology and a remarkably stupid idea, from what I can tell (in the article and the ONE PAGE of documentation provided).
All you have to do is write an application for pooch (that, for example, does your linear algebra homework, or perhaps pingfloods slashdot.org) and run it on all the cable/dsl mac machines that now run pooch because of slashdot.org, and enjoy the amazing technology!
Now, if what I have outlined isn't possible, please let me know; this is all from the article and the incredibly meager documentation I have read. But as usual, it looks like the security ramifications for this are enormous, perhaps worse than other common and incredibly boneheaded ideas, such as auto-updating software, and executing code in e-mails from random people...
pb Reply or e-mail; don't vaguely moderate.
This is not surprising at all. NeXT used to ship software with their OSes that connected computers running NEXTSTEP (and later OpenStep) in a cluster over a network (they also had a really sweet derivative that would do batch rendering for Pixar's RenderMan over a cluster of NS computers). Even though it was just a demo app to show what clustering was and how it worked, it only took about 15 min to set up and get running, and it worked pretty well and was decently configurable.
I was starting to wonder when some of the cool software for OpenStep would surface on OSX, since it is essentially an upgraded OpenStep version that runs on Mac hardware.
I don't think very many people will choose to use the Mac for clustering _even if_ it is easier than other platforms as this article seems to suggest.
Macs are luxury computers. They are generally more expensive than their custom PC counterparts, and Apple limits the BTO options that you can use to reduce the price of their G4 towers.
If you wanted to cluster 10 G4 towers, you'd be paying for 10 superdrives, 10 3d accelerated video cards, 10 snazzy cases etc etc. Most people building a cluster will want each system to only have the components they need: processor, memory, network IO, backplane bandwidth etc. You won't want to pay for components you won't use (like 9 extra superdrives).
So unless Apple decides to offer special deals for those who want clustering, I think the economics of the situation will work against Macs and in favour of x86 PCs running Linux where the economies of scale conspire to lower component costs to the minimum.
We are talking about Apple and Macintosh here. Try an orchard :)
-
ping -f 255.255.255.255 # if only
And they don't note that the manual (if it's the Beowulf book everyone cites) is mostly about how to PROGRAM it (e.g., includes an intro to MPI).
Are there any *useful* tasks that can be done on beowulf clusters? (besides password/encryption cracking or seti at home, fractal generation)
Is beowulf ever used for (photo-realistic) 3d rendering, weather prediction, etc?
p r m t h s
but can you imagine a Beowulf cluster of Beowulf clusters? That would be sweet. Or something.
Comparatively, the Beowulf books talk about what kind of network infrastructure you'll want for different types of applications, different standard communication libraries to use between the nodes, automatic administration of nodes, how to make redundant nodes, etc.
You also want to have an infrastructure for automatically loading software on computers, perhaps booting off the network... none of this is available on that PDF. Perhaps even not possible.
And you won't get very far telling me that it's easier to upgrade OS X to OS X.1 or whatever where you have to go around with a CD and reboot every computer on a 1024-node cluster, compared with just having them all "apt-get dist-upgrade"
In a nutshell: if you need a high-performance computing cluster, you need to go with a Linux-based beowulf cluster. Perhaps on Apple hardware, perhaps on Alpha, probably on x86. If, on the other hand, you want a toy that can run a fractal program really fast (perhaps povray too) and don't have a real application then this Mac cluster is probably what you need.
-- Erich
Slashdot reader since 1997
was building a super computer supposed to be easy? Chances are if you have a reason to build one you would have the technical ability to follow a 230 page user manual. Then again, maybe "Super computers for dummies" would have a bigger audience than I'd expect.
thirsty*i^2
"Ya I finished that last week, it just doesn't work"
i don't want to argue with what the article claims, but the whole thing reads like a paid advertisement for apple. (not an apple slam - an observation about the article.) but i suppose news includes editorials.
The article contains 5 instances of the phrase "dauger said" and doesn't seem to have any other sources. dauger is a guy looking "to commercialize his expertise in Macintosh cluster computing." should we be surprised about his bias? does the existence of a 1 page .pdf explaining osX clustering preclude anyone from writing a 230 page book entitled, "how to build a better mac osX cluster"? i think not.
i guess when stories are low, advertising gets cheaper. it would be nice to see some benchmarks instead of just claimed superiority of a specific machine/configuration over a general class of machines/configurations.
now imagine a beowulf cluster of complaints like this. sorry, couldn't resist
you probably shouldn't have read this.
The Zilla program came on NeXTstations and NeXTcubes and was aimed at providing networked users with such solutions by multithreading the app's objects over the network. :-)
Guess the solution you're discussing is actually inherited from this one
Trolling using another account since 2005.
...shit, nevermind. Too late again...
... he's fairly uninformed on clustering. He claims that you have to have the exact same kernel version on a linux beowulf cluster or it grinds to a halt... ... this is, of course, bullshit. Our 96 node cluster here uses different kernels.
And that's just a single example of his lack of experience with clustering...
* Firewire connection networking
* Gigabit ethernet networking
* Numbercrunching processors
Ditching the screen and stack large numbers
in racks might be a problem, how about power
requirements?
Speaking for myself, I only need one screen and
one computer with a diskdrive but I'd like to
see better ways of using multiple computers
(as long as any one program can crash one of them)
i've never setup a cluster at all, but i have installed redhat at least 20 times over the past few years.
more recently i've noticed an option to install a cluster from the nice new X-based GUI installer. i've never had more than one box lying around, but i was wondering if anyone has tried this route from the setup before. i would have thought that the nice little cluster icon in the GUI setup would have meant an easy (relative) installation and not a 230 page manual.
does anyone have any experience with installing like this? if so, i would love to hear about it.
Don't mistake desire for capability or intelligence. I'm sure you don't desire the chore of taking out the garbage but, does that mean that you lack the necessary intelligence to perform the task? I hope not.
Having had to master the commands and syntaxes of several, at least 8, different CLIs I have absolutely no desire to learn *yet* another one. That, however, has absolutely no bearing on my intelligence.
If you would like, I can write a CLI for clustering OSX. I can make it very archaic and very cumbersome. Do you desire to learn the interface that I create, or do you lack the intelligence to do so?
Around here, that is known as the gay pride parade.
"Adequacy.org: Where congenital stupidity is not an option, but a requirement."
...Designer Massimo Zanigni has announced a way to use his clothing to mop up the floor. With an instruction manual of only seventeen words, it's sure to offer stiff competition to complicated, difficult-to-use products such as the Platinum PVA Mop, which, while currently the top-selling mop for cleanup purposes, and although it radically outperforms Massimo Zanigni rags, comes with a thirty-five page instruction manual, including several paragraphs of legalese.
Has anyone that actually knows what they are doing tried to do serious number crunching under OS X?
I'm extremely disappointed with the floating point performance on both my G4 and iBook...especially when using libm.
If you're going to reply with some Photoshop benchmarks...I will laugh in your general direction.
It means it is easy to take a bunch of macs, tie them together, and say hey, I've got a parallel computer.
That's very different from actually building a cluster that works with your application, is optimized for performance, and features high-availability-compatible choices, issues that the 230-page book probably addresses.
What they mean is that it is an easy to install piece of software, just as many shareware programs for Windows are, but useless nonetheless.
"The obvious mathematical breakthrough would be development of an easy way to factor large prime numbers." Bill Gates,
Nope, not me, I must be someone else...
Dood, you're getting a Dell! Dork.
Fucking idiots. Why are we comparing a free set of tools to something which is sold commercially for $100 per node? I'm sure there are plenty of commercial products for clustering which are easier / faster / better / prettier / whatever than the free tools used to build Beowulf clusters on Linux. If you find the software worth the money, then buy it. If you want something unencumbered, open source, and free, then use free tools. Another point which is glossed over is what support there is (or more likely, isn't) for very high speed interconnect networks like Myrinet on the Mac platform. Linux clusters are certainly more of a pain in the ass to get running, but they are infinitely more adaptable, extendible, and tunable.
Step One: Plug them in.
Step Two: Turn them on.
Step Three.... there's no Step Three! There's no Step three...
for example, if beowulf were extremely easy to install and mosix were not then maybe i'd opt for beowulf without knowing that my current situation calls for mosix.. i hope you get the idea.
MOSIX clusters are a one-liner to set up, for example. I challenge Apple to beat that!
I'm not sure about Compaq's One-Stop Linux Clustering. I've never got it to compile. But, assuming it can be made to work, I bet it'd be pretty decent, too.
Last, but by no means least, clustering in the Real World tends to be through PVM or MPI, which are platform-independent. Hardly anyone uses OS-specific clustering, because hardly anyone but high-energy physicists ever develop large clusters in the first place!
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
How is a Mac "easier to set up" in a Beowulf cluster than a group of identical PCs?
I can see where the author might make a point to say that the Mac is nice to use for a cluster because Mac hardware doesn't really change much from box to box, but the same could be said for a group of equal-built PCs. In fact, most real-world (re: not your bedroom.) Beowulf cluster nodes are NOT loosely conglomerated machines with wildly different capabilities from node to node. Most clusters are planned out well in advance, with each node precisely equal in terms of its hardware and horsepower.
"It's easy to set up because all of your nodes are the same with a Mac!!" ceases to be a valid "advantage", when the same can be said of a group of SGI O2 boxes, a group of Sun E10K boxes, or a group of lowly 386 PC boxes.
Besides, "it's see-thru orange!!!" shouldn't top your list of reasons to purchase Macs for your cluster. You buy a pile of 1U rackmounts, because you normally don't have a whole room to dedicate to a cluster. (duh)..
Cheers,
Bowie J. Poag
Imagine a Beowulf Cluster of XT's vs a cluster of MAC 2Si's.... any takers on which would win?
... oh wait, it's closed source.
You could get like 50 RC5 keys done a day with about a cluster of 20.
Off to port OS X to MAC 2Si
---- The geek shall inherit the Earth.
Oh lookie, Mac Slash has been slashdotted.
This is not the greatest sig in the world, no. This is just a tribute.
There's precious little left on the web about this, but NeXTSTEP had distributed computing installed, by default, on every box all the way back to the '030 cube in 1988. Anyone remember Zilla?
I'd be curious to know whether Pooch is an evolution of the Zilla codebase/protocol, and whether Zilla was used for any specific projects within large NeXT customers like the Stanford Linear Accelerator Center or Morrison-Knudsen.
Back when I marketed NeXT boxes, it was little more than a scientific curiosity, with few if any case studies to go by.
Yes, it means that you sit around learning all the minutia of arcane commands instead of trying to get actual work done.
Try using the aNAL connector.
It looks like my "Can you imagine a Beowolf cluster of these" post in regards to the G4 story a while back actually was on topic.
~ now you know
It's pretty neat. And super easy to install. Yet, most customers ask me to do it, knowing that they could do it themselves. The manual is not 1 page long but it's not bigger than 10. So why do they come to me?
Basically, because they want the most performance out of their clusters. What's the point of clustering two (or more) machines if they're not going to be tuned for your specific application requirements. Now, does that oh-so-easy to install MAC cluster (and a few other "clustering" mechanisms out there) really allow for fine-tuning on the cluster performance? Do they allow you to give priorities to specific synchronization tasks? Do they synchronize at all? Can you control how often they do it? Can you have it synchronize whenever *anything* changes on *any* machine in the cluster?
And the list goes on and on.
The reason why customers keep coming back is because the documentation doesn't tell them how to optimize the Application or OS or whatever cluster to THEIR SPECIFIC NEEDS.
And that's good (for me).
.sig
Isn't the real question what the better solution is for a given task? Evaluating the cost of pcs relative to macs seems to sidestep the issue, as does debating whether a product with a boatload of limitations and a one-page instruction sheet is easier to install than a complete and open Linux-based solution. Ease of installation and cost of hardware are important factors, but not necessarily deciding ones. I want to know what the best way to accomplish a given task is, and that's going to depend heavily on the nature and scope of the task.
Sounds like the old adage "when all you've got is a hammer, everything starts looking like a nail." Sounds like all he's got is Macs...
Always doubt the universal solution.
http://drteknikal.blogspot.com/
I've noticed many are pooh-poohing Mac clusters vs. Linux clusters due to cost issues. I agree that, in general, if I was setting up a brand new cluster I'd probably use AMD boxes because of price BUT there may be three exceptions that would change my mind.
1. I already have a bunch of old Macs lying around. Why not use them? Macs have a significant presence in the science/research arena that will only increase with the advent of OS X, so there are bound to be old Macs lying around in companies that are most likely to utilize a cluster.
2. The type of number crunching I require lends itself well to the 128-bit vector processing unit on the G4 (Apple calls it Velocity Engine). x86 chips cannot compete head-to-head with a G4 when it comes to tasks that can be optimized for Velocity Engine.
3. Perhaps I'm in a situation where I'd rather spend my money on more expensive hardware up front than re-task my big dollar scientists/engineers/IT guys for a week while they figure out setting up a Linux cluster when they are needed elsewhere, like their normal jobs.
or you can just use QNX, which does this natively
As usual, I've been had from the lack of journalism on slashdot and the sites they point to; thanks for pointing out that the real manual is 46 pages long, and not ONE. :)
My imagination originally came up with a similar scenario, and then it all FIT! That's why the POLAR ICE CAPS are MELTING! It isn't global warming; it's a BEOWULF CLUSTER! I figure they have TUNNELS connecting the supercomputing centers to the ESCAPE ROUTES for the ARK.
Work on the ark continues...
pb Reply or e-mail; don't vaguely moderate.
Well, kind of. Dauger's Pooch is expensive and didn't easily support what I wanted to do (running signal processing tasks to support MRI research) so I used the libraries to write my own. The system does work very well and is quite easy to set up. Better yet, there's no reason to take the Mac off the secretary's desk -- let it stay there. During the day it's a word processor, at night it does Fourier Transforms.
Ethernet has very high latency, at about 10 milliseconds. Projects like PAPERS and the KLAT2 use the parallel port to connect compute nodes because of the much lower 1 ms latency.
Ok, no parallel port on Macs... but I wonder how do Firewire ports perform?
There are 10 types of people in this world, those who can count in binary and those who can't.
Are you feeling lucky, punk?
pb Reply or e-mail; don't vaguely moderate.
I've just read the article, and added my grain of salt for bias, but most people here fail to realize that hardware costs are *very cheap* in relation to human costs. If what they say is true, it's worth the extra price on hardware.
If you have a kernel with mosix http://www.mosix.org/ patched into it, and then added a nice gui, I could see it as a similar thing to this 'Pooch' software.
Compare it to PVM http://www.csm.ornl.gov/pvm/ (where I think the real work of clustering on linux is done), which usually requires code that is highly specialized for the application. Not just a recompile.
Parallel problems require a completely different approach or you end up with worse performance than it running on a single machine (due to bandwidth usually).
What kind of cluster are we talking about?
Grid, tree, hypercube..
This article is just pure Mac FUD the more I think about it.
-jj-
Considering that the "Beowulf" applications are really just a collection of software that can compile on almost any platform, and really just need to be on hosts that can rsh or ssh in a trusted fashion to one another, it's preposterous to say it runs in a less complicated fashion on Apple. It runs the same. It uses the same manual. It's the same software. Compiling PVM or MPI on any of them is about the same. And, it doesn't give a dang if you're using kernel 2.5.3 on one and 1.1 on another and FreeBSD or Ultrix on another as long as the local machines can run your app and they can talk to one another.
But, the true Beowulf COTS fan (unless you're getting your Macs "surplus") knows the cheapest price to performance is x86. You can make an AMD cluster for less than $500 a node that will smoke anything that Apple can match in price, because nodes don't need video cards, USB, sound, CD-ROM, or even hard drives, and you can't get a Mac without all that.
You think the P4 price/performance is bad, G4's are insane
USC Macintosh Cluster Running the AltiVec Fractal Benchmark achieves over 1/5 TeraFlop on 152 G4's and demonstrates excellent scalability.
KLAT2's complete results are: Rmax=64.459 GFLOPS with 64 Athlon 700MHz with 128MB PC100 CAS2 SDRAM
So a 1 tflop apple machine would cost about $440,000 in hardware for 152 G4 1000mhz -vs- 270 Tbird 1400mhz at about $160,000.
The difference, $280,000 could certainly hire someone literate enough to read the long linux manual.
If voting were effective, it would be illegal by now.
I support a department of 250 workstations and printers. We are 80% Mac and I can manage this pretty well all by myself.
Think about that for a second: One person can manage IT support for a large department.
What this has meant is that, because I haven't had to spend as much time cleaning viruses and fixing cheaply made Wintel hardware and software, I can work on improving overall computer infrastructure instead of just fixing what we have.
In practical terms, over three years, my department now has four computer classrooms, 100BaseT networking where we used to have 10Base2 coaxial, safe and secure servers, and wireless networking. This is a direct result of supporting Macs.
"A good use for these [ancient] machines is to recycle them and one way to recycle is to create a bigger faster machine with them."
Not if your primary concern is getting the most FLOPS/$. Given that a brand-new $1000 computer will be something like 10 times as fast as your old ones, at the same power consumption, it doesn't take very long before your new computer pays for itself with the money you save in electricity not running 9 additional machines.
Consider:
150 Watts (low for a PC, probably average for a Mac) x $0.10/KWH x 24 Hr/day x 30 day/mo. x 10 machines = $108 per month. Your $1000 new machine will pay for itself in less than a year, from electrical savings alone.
Of course, this assumes dedicated compute servers running all the time. If you run the cluster software as a background task on desktop machines with many users, it's a different story.
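The back-of-envelope figure above is easy to check in a few lines of Python. This is only a sketch using the numbers from the post (150 W per machine, $0.10/kWh, 24 hours a day, 30 days a month); the function names are made up for illustration, and it assumes the one new machine draws the same 150 W as each old one.

```python
def monthly_power_cost(machines, watts=150, dollars_per_kwh=0.10,
                       hours_per_day=24, days_per_month=30):
    """Electricity cost in dollars for running `machines` boxes for a month."""
    kwh = machines * watts * hours_per_day * days_per_month / 1000
    return kwh * dollars_per_kwh


def payback_months(new_machine_price, old_machines):
    """Months until one new machine pays for itself by replacing the old ones.

    Only the machines that get switched off save money, so the saving is
    the power bill for (old_machines - 1) boxes.
    """
    saved_per_month = monthly_power_cost(old_machines - 1)
    return new_machine_price / saved_per_month


if __name__ == "__main__":
    print(f"10 old machines: ${monthly_power_cost(10):.2f} per month")
    print(f"payback on $1000: {payback_months(1000, 10):.1f} months")
```

Counting only the nine machines actually retired, the saving comes out a bit under the $108 quoted, and the payback still lands inside a year.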
It would be a really good idea to make clustering easier, but there is a trade-off between easiness and performance. Making the creation of clusters easy ("a few G4 Macs, some Ethernet cables, a hub and the Pooch software.") by only talking about the easy-to-use software and not optimized network topology (correct me if i'm wrong but the Beowulf handbook probably covers a lot of that) will definitely keep performance quite low.
BTW. on the wired site it says:
while almost the first sentence in the 1-page-pdf says:
Do you have any non-biased (i.e. not from Apple) figures to back this up? Or are you just talking out of your ass?
Building a true multi-user environment (I mean with multiple people at multiple machines) isn't all that easy. I doubt support costs are really less.
I've seen people say this before. But I personally doubt it's anything other than random Apple hype (like the 230 page manual vs the 1 page PDF, even though much shorter Beowulf docs exist).
autopr0n is like, down and stuff.
If you read the one page pdf file, it assumes you already have a network of OS X boxes set up. The same thing in linux would look like:
Requirements: Linux Network with rsh enabled, preferably with firewall and IP Masquerade.
1.) Download jobmanager and bWatch rpm's
2.) Do a rpm -ivh *.rpm
3.) Add list of nodes to .rhosts files on every node
4.) List all nodes in /etc/hosts files
5.) In a terminal issue: jr -q [process command]
Voila! You're distributed computing!
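For what it's worth, the .rhosts and /etc/hosts files mentioned above are easy to generate from a single node list. A rough sketch in Python, assuming a hypothetical four-node cluster on a private 192.168.1.x network and a made-up user name; it only builds the text and leaves copying it to each node up to you.

```python
def hosts_lines(nodes):
    """Build /etc/hosts entries ("address<TAB>hostname") from a {hostname: address} dict."""
    return [f"{addr}\t{name}" for name, addr in sorted(nodes.items())]


def rhosts_lines(nodes, user):
    """Build .rhosts entries ("hostname user") that trust `user` from every node."""
    return [f"{name} {user}" for name in sorted(nodes)]


# Hypothetical node names and addresses, purely for illustration.
NODES = {f"node{i:02d}": f"192.168.1.{10 + i}" for i in range(1, 5)}

if __name__ == "__main__":
    print("\n".join(hosts_lines(NODES)))
    print("\n".join(rhosts_lines(NODES, "cluster")))
```

Keeping one list and generating both files from it avoids the usual failure where a node is in one file but not the other.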
! == goatse.cx
They note that the Linux "how to" manual is 230 pages while the corresponding Apple document is a 1 page PDF file
...could this just mean Apple left out 229 pages of important information?
I mean who cares how many pages a reference manual is? I would rather have a complete manual than an incomplete one.
I Heart Sorting Networks
They note that the Linux "how to" manual is 230 pages while the corresponding Apple document is a 1 page PDF file.
Meanwhile, documenters have been developing a "What to do with a linux beowulf cluster" list. That document has grown to 230 pages. The corresponding mac list has come up with one idea (And it fits on a 1 page PDF file): "Create a system that allows us to use Photoshop to edit super-high resolution pictures of Natalie Portman eating hot grits."
(j/k!, and, btw, I'm using a Mac right now. :-)
--You will rephrase your request for me to go to hell. Goto statements are not acceptable programming constructs
No, it isn't that plain and simple, and you must be a manager, or even a business owner, to think like that.
How is a one page document going to detail background, concepts, initial setup, not to mention troubleshooting? Every setup will have problems. These are all detailed in most of the technical docs for linux I have ever written.
It will be like the Jeff Goldblum "3 easy steps to get on the internet" iMac advertisement. Using a PC is that easy also, if you already assume the rest of the knowledge of everything being in the box, and the software is already understood. A reasonable person doesn't make these assumptions in a technical document, those ludicrous claims are saved for MARKETING.
I would expect more than 1 page of instructions for a system that was already put together, that I was inheriting from a predecessor.
The point isn't flexibility: sure you can be more flexible with a Linux-based cluster. You can tweak and tune a Linux-based cluster to meet your specific needs. This is why Google uses such a cluster.
The point isn't about cost: the real difference between a decent name-brand PC and a Mac is negligible. In the case of these Mac-based clusters, since the clustering software is just another app, a Mac-cluster can be set up and torn down quite readily. You come into the lab on Wednesday to find your workstation has been appropriated for the cluster.
The point is accessibility! If you're a physicist in a small school looking to model some complex interaction, you can rent some computer time from somebody (expensive), build a cluster (very expensive, because you'll have to hire somebody to do it--physicists aren't likely to be Beowulf experts), or use the Mac clustering software (expensive, because you'll have to buy the machines if you don't already have them, but you can do it yourself, quickly, without much bother).
Accessibility! It's what keeps Apple in business. This is another example of it.
I'm pretty disappointed in the posters who knock it, because it strikes me that they are a bit put out that they won't remain the Technical Elite because they've got the spare time to read the 230-page Beowulf manual.
Potato chips are a by-yourself food.
Every time I walk by a Windows lab, you should see the "CLUSTER" they have going on in there! ;c)
A cluster of clusters!
Answering my own question, I found this PDF on google about the performance of IEEE 1394 (Firewire). It says that Firewire can have latency as low as 125 microseconds, and bandwidth as high as 50MB/sec.
So why not network a cluster of G4s together with firewire?? Seems like it would perform much better than ethernet.
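A quick way to compare the interconnects mentioned in this thread is the usual first-order model: time per message = latency + size / bandwidth. The sketch below plugs in the Firewire figures from the PDF (125 microseconds, 50MB/sec) and the roughly 10 ms Ethernet latency claimed earlier in the thread; the 12.5MB/sec Ethernet bandwidth is my own assumption (100BaseT at full wire speed), and real numbers will depend heavily on the hardware and protocol stack.

```python
def transfer_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """First-order estimate: fixed latency plus time to push the payload."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


# Figures quoted in the thread, except Ethernet bandwidth (assumed 100BaseT).
LINKS = {
    "ethernet": {"latency_s": 10e-3, "bandwidth_bytes_per_s": 12.5e6},
    "firewire": {"latency_s": 125e-6, "bandwidth_bytes_per_s": 50e6},
}

if __name__ == "__main__":
    for size in (64, 64 * 1024, 16 * 1024 * 1024):
        for name, link in LINKS.items():
            t = transfer_time(size, **link)
            print(f"{name:8s} {size:>10d} bytes: {t * 1000:.3f} ms")
```

Small messages are dominated by the latency term and big ones by the bandwidth term, so which link "wins" depends on how chatty the parallel code is.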
There are 10 types of people in this world, those who can count in binary and those who can't.
A Ford Escort is also easier to set-up, maintain and more flexible than a Lamborghini Diablo but how the hell does that make it good?
I DL'd and read the manual. It really does seem just that easy. It costs $100+ per node, but you pays fer yer time & headaches, doncha?
The faster the machine and traffic the better of course, but you could do this with the cheapest iMac ($799 new, ~$400 used) or a bunch of cubes (banking finally on their close packing ability) if you want Altivec in the mix.
Gosh, a reason to make a headless iMac2 - that would be quite the aesthetic eh? Seventy six of those snuggling on a ping pong table...
Communication can be over Airport, too - so you can imagine ad hoc Mac Clustering being set up during the first half of every Jobs keynote - you know, the part where he just says stuff - to go thru all possible iterations of the product to be intro'd in the second half of the keynote...
"Win treats sysadmins better than users. Mac treats users better than sysadmins. Linux treats everyone like sysadmins."
... for scientists like myself, this is a very nice thing. Not all of us in the sciences are tech-savvy... I'm probably the one in my 5-person research group who understands the most about *nix. For those of you who don't realize this, many research scientists have to work hard to get their grants and outside money.
So, what does all this mean to us? As an atmospheric scientist, having some serious number crunching power is mighty helpful. Weather modeling is quite the processor intensive task, and then interpreting the results can take years after all the computing is done, including further computations and visualization routines. To put it shortly, we can easily tax our computers.
So, now you know that we need computing power, but money is a premium for us in many cases, so why shouldn't we just get some cheap Intel boxes and *nix cluster them? Well, we could, but then we'd need to hire a systems admin. Someone who is tech-savvy enough to keep everything running decently well for us. That requires another person who REALLY understands what's going on in many cases, which is another salary on the payroll. For us, it all ends up balancing in the end. The $5-10K that we save in clustering our 8 Intel boxes over the Macs is eaten up in one year or less by the guy (or woman) who has to set up the whole thing. So, for us, the ease of setup and use is something that can translate into some good savings and we don't have to worry as much about having to rely on another person to save us if something goes wrong. That's the benefit of simplicity for us.
I agree that it is important to know, as one person said, "The nature of the beast", but that's something that takes time to do, and when you're not being paid to learn about how to cluster computers, but to figure out how the atmosphere works, then things like "The nature of the beast" are just further complications. I would rather have something that I can slap together, know that it works, and get back to my work, without the interference of others if I don't need it.
And that brings me to another rebuttal, about someone mentioning that if you buy the Macs, you're also going to pay for all the extra Superdrives and video cards and all that. I say to that, "Good." That way, if the cluster doesn't need to be used, then I don't have a bunch of mostly useless boxes sitting around... or if a collaborator comes around and needs a computer, I can just remove one of the computers from the cluster and let them use that for as long as they need. The point is that there are advantages and disadvantages to each setup. Now you've heard some advantages and why the scientific community might care about this. Remember, not everyone here can compile their own kernels and not everyone cares about being able to do that. Some of us, thank the deity of your choice, actually want to do something with this power and not care how it works in depth. To each their own.
-Jellisky
Please keep in mind who this software is targeted to. It appears to be for scientists and researchers. These folks really do not need to be able to build a beowulf cluster from the ground up. They also often have access to macs. This provides a good solution to a problem. Whether or not it is better or worse than solution 'x' is not as critical if it helps people get the job done with the equipment on hand.
Microsoft brought us Windows XP. I bought a Mac.
Imagine a Beowulf cluster of Mac Clusters...
(Sorry, couldn't resist)
This can get more recursive:
Imagine a Beowulf cluster of linux systems, each emulated under soft-pc on the mac cluster...
Blink£%^£$%^"$%£$%&*()..
- Paul
Semi-related topic. Get several Macs running Linux and cluster them. Includes about the best instructions you can find for getting Linux to run on Nubus Macs, too.
Constitutionally Correct
Photoshop is optimized for the G4. That was my point. We're not clustering Quake here. We're talking about special purpose applications that do scientific calculations.
If your application does better on the Intel, you are likely better off considering a Linux cluster. However, if it isn't much better, you might be better off with the Mac cluster by adding a few more machines to compensate... depends on the costs of time.
If you are running an application that, LIKE Photoshop, does better on the G4, you will see the price performance favor the Mac line. That's my point.
If this market was a decent size, I bet Apple could get some really competitive cluster systems. It would be nice to see an Apple dual or quad G4-1 GHz, with a CD-ROM, ATI Rage 128, and Gigabit Ethernet for the scientific community.
They could make the machine without PCI slots and fit in a 1U case for OS X processing goodness.
However, the reality is that the extras (better video card, Superdrive, etc.) don't add much to the Apple's price. However, the right form factor could make them tremendous cluster machines.
Alex
A school that has 20 nice Macs in their HR office, and 60 in a lab that is locked overnight can (for no cost, and very little effort) leverage these at night for large scale number crunching...
That is, as long as they can get buy-in from the administration to install it.
Are you saying uninformed idiots have trouble getting consulting gigs?
:P
It's too bad I haven't got mod points, I'd give you +1 funny. Thanks for bringing a smile to my day
autopr0n is like, down and stuff.
I've not really thought this through, but sometimes it's occurred to me that with *nix (in particular Linux) being "harder" to set up is actually a Good Thing, since it means it cannot be done by a moron.
It is alleged that a 5 week old chimpanzee could get an MCSE, which they do, and the next minute they're traipsing round server rooms earning nearly as much as someone who really does know what they're doing.
I think.
I'm an Apple user, and I agree, hiding is a good thing. I have little or no desire to know HOW a computer works, I just want it to work. Just as I have little or no desire to know HOW a car works, a TV, a stereo, or a frig.
Obviously, it is nice to know these things, especially if you find them interesting or desirable to know. But the point is, Apple believes that the gears, wheels, and cogs of a computer are not what the computer is about. It is about performing previously complex tasks in as simple and intuitive a manner as possible.
Others (and this means most of the people reading this) are actually interested in the guts and the inner workings of a computer. For those of you that are, enjoy it. It's a free country. Just be glad computer geeks like you, and Apple goofballs like myself, can share polite conversation over a decent cup of coffee.
As an Apple user, I concern myself with the inner workings of other things, not the computer itself. This isn't to say that I'm some sort of computer dingbat, but I'm not into coding, the command line, or any such pursuits. You concern yourself with the inner workings of things that some Apple users may find boring. Tit-for-tat, six of one and half a dozen of the other...
Myself, and other computer users believe that a computer should be so powerful that it is actually EASIER to use. And with every advance that Apple, Microsoft, the Linux & *nix communities, the processor designers and engineers, the programmers, the manufacturers, and the forward thinking researchers out there make, the better it gets. Hopefully this means the EASIER they get, as well.
If you have a scientific cluster, you don't want to be swapping things out. You don't want to take nodes offline because a video card fried. You want a system that is going to work.
I just priced out some Compaq Workstations yesterday and compared them to Apple Powermacs (Apple's workstations) for doing some OpenGL game development.
Apple Powermac with dual monitors and the upgrades we'd want... $5k. Compaq Workstations... $5k.
In the price-conscious area, Apple's iMacs/iBooks offer a good solution at a reasonable price. You can't compare Apple's workstation line with your "look ma, I built it myself" machine.
Apple does QC. You don't. You and your screwdriver do not meet scientific requirements for reliability and predictability. If a node fries, you likely need to start over again. You can't just try to fix the damage.
Linux is great, OS X is great. They are very different UNIXes in different markets.
Alex
I just finished tearing down and re-building a twenty node linux cluster in less than half-a-day.
Also I set it up the first time by myself as my first experience. I did everything from building the nodes to installing linux and clustering software, and that only took two days.
I almost laugh at the thought of a mac running parallel. The whole point of cluster computing is to provide a cheap means of high performance computing. Hey, not that macs wouldn't be great, but come on man, they are expensive. All you need for a cluster to work is decent RAM, motherboard, and hard drive, and a cheap (in my case $5) video card.
If I had better cooling in the room I probably would have gone with AMD, as they have much better floating point performance than Intel.
I guess that if I just had the money to spend on a super-cluster I would go with twenty Octane 2 dual processors. At about 12000 apiece that is pretty damn expensive. Oh yeah, my cluster top to bottom was under twenty thousand (including cisco router), and I had to re-buy RAM 'cause I cut corners there the first time.
what?
a 1 page pdf - I love that kind of stuff. I knocked up an Appleseed cluster at work just for the fun of it - took me about 20 minutes. If only I had an application... Clustering for the rest of us!
That was classic intercourse!
>"It took NASA's Jet Propulsion Laboratory two weeks
> to put together a 16-node Linux cluster." he
>added. "I could do the same thing in less than an
>hour."
Then JPL was either building the systems from whitebox components, or is completely incompetent. I built a 20 node cluster in about 1.5 days, including the OS install on all of the nodes.
>Dauger added that Linux clusters are extremely
>fragile: If all the machines in the cluster
>aren't running the same version of the kernel,
>everything grinds to a halt. By contrast, a
>Macintosh cluster can be made from a mix of G3
>and G4 Macs running Mac OS 9 or X.
Excuse me???
My cluster is currently running 2 different linux kernels (2.4.18, 2.4.9), two different processing architectures (alpha and x86) and I occasionally throw an SGI O2K into the mix. Sure, the x86, alpha, and SGI binaries need to be compiled separately, but it hardly "grinds to a halt".
>Dauger said Mac clusters have better bandwidth
>than similarly configured Linux clusters. They
>can transfer bigger chunks of data between nodes
>but their latency is less (The individual bytes
>of data are transferred less rapidly).
Huh??
And now let's look at the cost.
I can build dual athlon nodes for about $500/cpu
Let's assume his claim of 70% faster is true (I doubt that number, but anyway). Can he build G4 nodes for $700/cpu?
From the documentation, it looks like you just hook in another Mac via Ethernet, give it a different IP address, install and run Pooch, and voila, new cluster node. How does that compare to a Beowulf cluster?
Not all subscribers are male... and not all geeks are homosexual even if they hate talking about sports or cars....
"If the Mac gets a 2:1 performance advantage [but it costs twice as much as a PC], then the costs are equal."
Let Computer A have
Price = X
Performance = Y
Let Computer B have
Price = 2 X
Performance = 2 Y
The value of the computers IS NOT THE SAME.
Price is a one-time, up front cost.
But I get the benefit of performance for the life of the computer.
So if I can save $5 per hour on the more expensive computer (for example, if I can get more jobs done on the fast computer), and I work 40 hours a week, 50 weeks a year, right there I save $5 x 40 x 50 = $10,000!
So saving $5 an hour is big bucks, and if the computer costs you twice as much, you still come out lots ahead. Saving even $1 an hour (a $2000 per year additional savings) will thus justify the more expensive Mac (at $3000) over the PC (at $1500).
i'm no econ guy, so maybe i've misused "savings" at times (instead of profit, or some other term of art).
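The arithmetic in this post can be written out as a tiny break-even calculation. This is just a sketch using the post's own figures (40 hours a week, 50 weeks a year, a $3000 Mac vs a $1500 PC); the function names are invented for the example, and it ignores things like depreciation and interest.

```python
def annual_savings(dollars_per_hour, hours_per_week=40, weeks_per_year=50):
    """Yearly value of saving `dollars_per_hour` while the machine is in use."""
    return dollars_per_hour * hours_per_week * weeks_per_year


def break_even_hours(price_difference, dollars_per_hour):
    """Working hours needed before the pricier machine has paid back its premium."""
    return price_difference / dollars_per_hour


if __name__ == "__main__":
    premium = 3000 - 1500  # Mac price minus PC price, from the post
    for rate in (1, 5):
        hours = break_even_hours(premium, rate)
        print(f"${rate}/hour: ${annual_savings(rate)} per year, "
              f"break-even after {hours:.0f} hours ({hours / 40:.1f} weeks)")
```

The one-time premium is fixed while the savings keep accumulating, which is why even a small hourly gain eventually wins.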
Zilla is mentioned at Apple here. Sounds like they are still playing with it.
Not surprising that they have it. Crandall wrote it (and pieces of Mathematica), and was Chief Scientist at NeXT. Now he's a "Distinguished Scientist" at Apple. Odds are he's got his own lab and a budget to do with as he pleases.
Also from that article: "Zilla was not used to find record-setting prime numbers, as is often supposed; instead, it was used to develop, through factoring and other number-theoretical calculations, certain cryptographic systems, tests, and algorithms such as Fast Elliptic Encryption (FEE), described below."
Funny thing that. I'd thought it was used for prime searches as well. When I had him as a professor at Reed he complained when the divide by zero bug was revealed in the Pentiums. Turns out he had to throw out a bunch of the searching he'd done on clusters at NeXT because some of the boxes he'd been running on were x86 based. Details of his ongoing research are at his site: http://www.perfsci.com.
-Noah
And I quote "But according to Dauger, Linux clusters require a PhD to set up and to run."
Yeah, I guess there wouldn't be any qualified people among those running Tokamak fusion simulations or 100 million mutually interacting particle simulations.
A diskless linux system is cake to set up, and as far as different kernels are concerned, the article is clueless: you can use LAM/MPI to mix different platforms (i.e. sun, sgi, intel linux, alpha linux) in a single cluster.
Disclaimer: I have a Ph.D.
nohup rm -rf ~/. >& zen &
You are attempting to determine the value of X.
I will increase my wealth/happiness by $10000 with computer A. I will increase my wealth/happiness by $9000 with computer B. If Computer A and B both cost $3000, I buy A. If B only costs $2000, I am indifferent. If B only costs $1000, I buy B. I determine what gives me the most value.
However, in this case we are comparing two clusters, one of x86 machines running Linux with one of Apple PPC machines running OS X. In either case I am buying many computers.
I need to do X operations per second. How much x86 hardware would this take? How much would it cost? How much Apple PPC hardware would this take? How much would it cost?
You are right that MOST computer buyers look at the price and not the benefit. Almost ANY productivity increase from the Apple makes it a good choice, even if it costs an extra $1000-$2000 for the machine.
However, in this particular case, we are discussing clusters. We are buying a certain amount of computer power. We should compare the variable costs of power ($X/gigaflop, or whatever unit you want to use), plus the fixed costs of setup time, and compare.
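As a rough illustration of that comparison (variable cost per unit of compute plus a fixed setup cost), here is a small Python sketch. Every number in the example run is a placeholder chosen for illustration, not a measurement of any real x86 or PPC node, and the function name is made up.

```python
import math


def cluster_cost(target_gflops, gflops_per_node, price_per_node,
                 setup_hours, hourly_rate):
    """Return (nodes needed, total cost) for a given amount of compute.

    Hardware cost scales with the number of nodes; setup is treated as a
    one-time cost in staff hours.
    """
    nodes = math.ceil(target_gflops / gflops_per_node)
    hardware = nodes * price_per_node
    setup = setup_hours * hourly_rate
    return nodes, hardware + setup


if __name__ == "__main__":
    # Hypothetical inputs: cheap nodes with long setup vs pricier nodes with short setup.
    for label, args in {
        "cheap nodes, slow setup": (100, 1.0, 500, 80, 50),
        "pricey nodes, fast setup": (100, 2.0, 1500, 8, 50),
    }.items():
        nodes, total = cluster_cost(*args)
        print(f"{label}: {nodes} nodes, ${total}")
```

For a small cluster the fixed setup cost can dominate; for a large one the per-node price does, so the answer depends on how much computer power is being bought.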
Alex
I'm surprised that there is even 1 page!
Most Mac users I know have never looked at a computer manual before.
"A plan fiendishly clever in its intricacies"- Homer Simpson
Just have all of your OS X clients boot off of a disk image on a Mac OS X Server machine.
http://www.apple.com/education/k12/networking/differ/index.html#macmanager
I recall, back when CD-ROMs were fairly newfangled, the "manual" that came with the CD, if it was a dual-platform disk, often offered an interesting contrast.
The Windows instructions would go on for pages, discussing running the installer application, how to get the right drivers, etc.
The Macintosh instructions were usually:
I never understood why Apple didn't market that advantage heavily.
Computers are useless. They can only give you answers. -- Pablo Picasso
Macs have always been simple to cluster. Almost all macs have the ability to netboot, simply by booting while holding down the N key. Using this feature in conjunction with a dhcp/tftp server, you can boot a remote kernel with a ramdisk, which will automatically build the node. Terra Soft Solutions, makers of Yellow Dog Linux, offer a node management suite called Black Lab for Yellow Dog Linux which automates the entire procedure.
Anyone have a bunch of iMacs laying around?
Well, the problem here is the part "tasks that can be optimized for Altivec".
:-)
There are very, very few problems that are completely linear and streamlined to the point where you can vectorize them completely. (If this wasn't true, everyone would be using cray machines).
Further, it assumes you take the time to code these parts in the altivec assembly wrapper functions specific to Motorola hardware.
AND it assumes you don't want to do anything in double precision...
AND this setup couldn't use the industry-standard MPI message passing interface?
Cool... that probably leaves you with all the people building clusters to run a carefully selected set of photoshop filters....
For everything else we can just compare specfp95:
733 MHz G4: 23.9
1 GHz G4, est.: 32.6
1.7 GHz Athlon: 50.3
There's a reason why Apple doesn't publish specfp results
Everyone here seems to be suggesting that the manuals indicate nothing. "Apple has weak docs!" seems to be the summary. But can we entertain the notion that perhaps while 1 page is too short, 230 pages is far too long? If so, is this because the people who wrote the manual are not professional authors, and got too wordy? Or is it because Linux just isn't usable enough?
And whatever you think, isn't it reasonable to suggest that making Linux more intuitive and the manuals more succinct might help rid us of idiot lusers who won't RTFM? They won't really go away, but if we actually take usability seriously, perhaps developers can get half those people to solve their own problems. Wouldn't this be a good thing? I guess that's a rhetorical question -- I am sure it is a good thing. I spend my entire workday building apps for people, and one usability tweak can mean the difference between 20 nagging people a day and 2. My team even has blacklisted a couple people in the company, whose projects are always time-sinks to build and time-sinks to maintain. Why? Because those people are control freaks who won't let us fix usability errors, and my team ends up spending their days on support. If you can build something intuitive and usable, both the users and the developers will be much happier.
My Greasemonkey scripts for Digg &
The ability to use a CLI says something about your intelligence... those who have no desire to learn even simple bash, probably aren't smart enough to need a cluster or use it wisely... I still say, "Hire a professional." (Plug:) Like me.
Congratulations: you have won the *NIX Bigot Arrogance Award for most asinine comment on slashdot. There was a lot of competition but the judges have selected your comment as the overall winner. The assertion that use of a CLI equates to intelligence was an impressive display of arrogance in itself but concluding with a plug for your own "professional" services was truly the pièce de résistance.
And this is news because....
dinner: it's what's for beer
digging around, I finally got to the order form. pricing is: $150+(N-1)x$100, or $150 for the first seat, $100 per each node thereafter.
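That pricing formula as a one-liner, in case anyone wants to budget a cluster. A minimal sketch in Python using only the figures above ($150 for the first seat, $100 for each additional node); the function name is my own.

```python
def pooch_license_cost(nodes):
    """Total license cost in dollars: $150 for the first seat, $100 per extra node."""
    if nodes < 1:
        raise ValueError("need at least one node")
    return 150 + (nodes - 1) * 100


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(f"{n} nodes: ${pooch_license_cost(n)}")
```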
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
Typical mac article. All about "performance" and performance/clock, but nothing about performance/price.
The simple fact that they don't mention it says enough.
The whole idea of commodity hardware clusters revolves around the lower cost per gflop.
Using PPC that is 2 times as fast and 4 times as expensive makes it uninteresting for a large group of researchers.
Some will choose PPC because fewer nodes means less space, but most will go for cost reduction, and stay with Intel (or better: AMD).
Again and again, the moderators show that they are either:
1. Incompetent
2. Biased
3. Both
There was nothing in this parent's post that in any way suggested flamebait or trolling. He had an opinion, it was valid, and he stated it. There were no "death to Mac's" or anything of the type. Modding this as flamebait shows how wonderful censorship can be.
Now go ahead and mod this down, so nobody will notice the hypocrisy on this thread.
I may have missed a previous post on this, but the 1 page Quickstart document assumes that you have a nice distributed application. In my experience, useful distributed applications don't write themselves and the expense (dollars and time) of creating the software will dwarf any expense involved in setting them up (unless you are doing things on a distributed.net kind of scale).
The size is what really kills them.
A 45U rack will hold 45 1U dual-CPU systems. Even more of the server-blade type systems (280 of the Compaq in a 42U rack).
The only way to rackmount a G4 that I can find is at Marathon Computer. A set of replacements for the "handles" for $225 or a whole new case which is 4U but is $550. Given a 45U square-hole 19" rack, you could squeeze in 11 dual CPU G4s.
I don't care what your performance fantasies are about the G4 systems, they're not more than 4x faster than dual x86 systems.
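The density argument is just integer division. A small Python sketch using the figures from this post (a 45U rack, 1U dual-CPU x86 boxes, 4U rackmount G4 cases, two CPUs per node in both cases); the function name is made up, and it ignores space for switches, power and cooling.

```python
def cpus_per_rack(rack_units, units_per_node, cpus_per_node=2):
    """How many CPUs fit in a rack when each node takes `units_per_node` U."""
    nodes = rack_units // units_per_node
    return nodes * cpus_per_node


if __name__ == "__main__":
    x86 = cpus_per_rack(45, 1)  # 1U dual-CPU systems
    g4 = cpus_per_rack(45, 4)   # 4U rackmount G4 cases
    print(f"x86: {x86} CPUs per rack, G4: {g4} CPUs per rack, "
          f"ratio {x86 / g4:.2f}x")
```

So per rack, each G4 would have to be roughly four times as fast as an x86 CPU just to break even on floor space.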
They note that the Linux "how to" manual is 230 pages while the corresponding Apple document is a 1 page PDF file.
Yes. Wonderful. This says nothing. This is one of "those" statistics. The Linux "how to" could be 230 pages because it not only tells you how to set it up, but gives you advice on customizing, creating optimized programs, hacking the kernel, and FAQs covering every single problem or question you might have.
The Mac PDF might be an almost blank page that says, "Call tech. support." Furthermore, why mention that it's a PDF at all? Are you saying that it's somehow better to use a proprietary document format (e.g. Proprietary Document Format - PDF, get it?) instead of plain text? Is the information somehow MORE relevant because it's in PDF?
Please. I've seen neither, but all this tells me is that someone wouldn't know a relevant comparison if it widdled on his shoes and stole his wallet.
Jake
Dating: while( 1 ){ call_girl(); get_rejected(); drink_40(); } return 0;
I browsed through it briefly when KLAT2 was announced on /. - but didn't come across the PAPERS stuff.
That is a cool project - even cooler was WAPERS - parallel clustering using modified parallel port switchboxes and custom cables - cheap interconnect hardware, to say the least! Even PAPERS didn't look that hard to implement (basically the same kind of system, but using AND gates to tie everything together, resulting in a "safer" system less likely to burn out "non-compliant" ports) - plus you get cool blinking lights!
Hmm - here is an idea - imagine making a PAPERS interface on a "per-machine" basis that fits into a 5 1/4 inch bay (like a bay bus device) - basically, split up the PAPERS box, then build custom interconnect cables (might need two cables per box?) - a real nice custom high-speed interconnect.
I need to look further into this interconnect, and see how it fares against others - cool...
Reason is the Path to God - Anon
The money saved by using a free OS is quickly eaten up by the salary of someone who has to make them run smoothly, which is damning if you're a small business with only a few employees, or in your case, a research group.
-
the Mac manual was published using a 1 point font, OTOH the beowulf manual is large, double spaced 36 point.
Contain my voice. Place my user into your foe list.
Linux has the endless instructions and most windows games I try install themselves just by putting the cd in.
What I find interesting is that someone creating, say, linux cluster server software, doesn't 'market' it using a kde install and administration tool. they could have a command line version as an add on for those that need it.
Ah yes, of course, what if you don't like kde, think it sucks and have fvwm instead. Yah, probs.
It's a difference in mentality.
Ease of install/setup vs some other way like just the way it used to be.
J
That's because the x86 is more of a standard. Let's say that node 15 of my massive cluster bursts into flames for no apparent reason. I can replace it with any old 'off-the-shelf' computer, and, at the very least, the 'architecture' is the same. With the exception of various slot/socket layouts, the PC is more 'interchangeable'. If my Athlon overheats, I can run into practically any computer store, buy the same chip, and pop it in where the old one was. If my iMac processor bursts into flames, I'd most likely have to take it to a special Mac place.
Another 'disadvantage' of the PowerPC platform (and this one won't really affect me, or many Slashdotters for that matter) is that you can't run Windows on the PPC (Mac) platform. My choice for a clustered operating system would most definitely be UNIX-based, but surely some would like to run Windows on their cluster. With x86, that's possible.
________________________________________________
suwain_2
Can you imagine Beowulf cluster of those things?
The book to which he's referring covers everything from the ground up, from basic clustering principles and theory, to building and administering a cluster, to specific programming API's used when creating cluster-aware applications.
The one-sheet document he describes is nothing but how to install a particular set of client software on a Mac, connect it to an already-working cluster, and submit pre-written software to the cluster.
The former is intended for engineers who are looking to learn how to build a cluster from scratch - regardless of the platform they're building it on. The latter is an excellent example of the type of end-user documentation those engineers should write for the people who will be using that cluster when it's built - regardless of the platform it's built on.
It's interesting, all the comments I've read so far, including yours, seem to deal with this as a dichotomy between Linux/Intel and OS10/PPC. Don't forget you can run Linux on PPC. For a high performance dedicated cluster that would definitely be an option I would look at.
Of course, there are situations where the Mac software has advantages that will really shine. Like if your "Cluster" is really just the lab machines at the college, acting as a cluster when not being used for DTP and Video editing or whatever. In that case the ease of setting this up with Mac OS10 would be a real plus.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Friends don't let friends enable ecmascript.
Yellow Dog, makes Black Lab linux for cluster Macs. Go there, take a look, they also point you to a company that makes racks.
10 good Intel machines will not cost less than $10,000. For scientific work, I don't consider eMachines or your grey-boy solutions a "good" system.
So, I took the bait... I went to Compaq's site and spec'ed out an equivalent workstation. Note, I'm not souping up the video card or CD-ROM like the Apple workstations. No need to waste money.
Compaq Evo Workstation W6000, Intel Xeon 2.00 GHz/512K processor, dual processor... Upgrading to 512MB RAM. $3521.00. Note that this machine only has 10/100 networking. The Apple has Gigabit. This should matter in a cluster.
Dell Workstation 530. Intel Xeon 2.0 GHz x2, 512MB RAM, and an upgraded sound card (Dell won't sell a dual-proc workstation without an $80 soundcard upgrade... weird). Dell did let me downgrade the video card and monitor... Price: $3878.00. Unlike Compaq, I could buy the Dell workstation with Linux (supported) instead of NT and needing to swap OSes.
Next I went to Big Blue. They push Linux, they should sell me good Linux workstations. When I bought my last round of Penguin Computing machines (to run OpenBSD and Linux) I looked at IBM first...
IBM's only dual processor workstation, the IBM Intellistation M Pro 6850 Tower. With a second 2.0 GHz Xeon processor, $5218.
Real computers cost money. Flaky machines that hardware lock from time to time do not. You can't compare the Apple workstations to the bottom-barrel systems.
In fact, at $1300 for the lowend iMac (700 MHz G4), admittedly with a silly flatscreen for this project, or $2300 for the midrange (933MHz) G4, Apple hits some good price points for this.
Look, the new G4s (in the 933MHz and 1GHz-dual models) are sporting a 2MB L3 cache! That's damned impressive. A 2MB L3 cache should make cache misses SO infrequent that the slower memory bus speed is irrelevant.
Look, if you need lots of power, you used to need to spend millions. You're not going to cut corners on your machines. You're looking at $3500 for an Intel dual-Xeon based solution or $3000 for the dual-G4 based Apple solution. Sure you get an unneeded Superdrive, but who cares? When the project is over, I bet you everyone in the lab is happy to take one of the Superdrives home...
Geeze people, get a grip.
Apple's G4 workstations are not the same quality as the computer you have in your room in your parent's house. These are real machines with:
Gigabit Ethernet (very significant for a cluster, and unlike the PC's 32-bit, 33 MHz bus, real machines like the Apple, Compaq, or Dell workstations have 64-bit OR 66 MHz (sometimes both) PCI busses so you can actually USE the Gigabit Ethernet).
The Apple's L3 Cache has 2MB DDR SDRAM at up to 500MHz, this is much faster than the 266MHz DDR in PCs and comparable to the PC800 RDRAM in the Dell/IBM workstations. Sure the System RAM is slower, but a 2MB L3 cache makes this less relevant.
The Superdrive, Firewire, and Video cards are all unnecessary here, but they are actually really nice features if these machines will be reassigned as desktop machines when the project is over. You could buy new PowerMacs when the G5s ship within 6 months and reassign these as desktop machines. The real workstations are the same. Your $45000 cluster of crap machines won't take you very far. They are trash when replaced, and if the machine hasn't been QC'd? Well, time to explain that your project needs to start over.
Come on people... Quake != scientific computing
Dude, have you ever opened one of those things up? 1 picture, maybe 2, and 2 lines of text.
Don't make up this nonsense about needing extra pages.
The administration software would have to be installed and configured, or another couple pages for that...
Once again: a paragraph at best.
If I buy a turn key system that I will have no idea, nor desire to know how to run
That is just a flat out unjustified comment. You shouldn't try to sneak equivocations in like that.
Turnkey != lack of understanding.
Truth is, if you are prone to setting up clusters, you are a cut above Joe Homeuser. I doubt anyone's grandmother would be doing this. G4 are used a lot in academic settings, and anyone who is doing that could probably "field strip" any machine, be it DIY or mac.
While I agree with you that simpler does not always mean something is the best, the lack of undue complexity does. (fewer fail points, lower TCO, etc.)
______
Once: you're a philosopher. Twice: a pervert.
If short docs == good usability, then Linux must have the most usable set of apps on the planet, given how many have either no docs or so little they're not worth reading.
Isn't that a coincidence. I just came from a Mac discussion forum which was discussing the same linked article on the Mac Head site.
I show up in Linux city and what do I find? Well, I find a lot more messages, but that doesn't mean a thing.
All I have to do is take the ones on the Mac board, switch Mac and Linux, and do the same here. They're interchangeable.
The Wintel chappies are bug eyed with glee and laughing it up at us dumb kiddies.
Heh. Lotsa Linux types haul an iBook around. And lotsa Mac sites run Linux on their servers. Does that suggest anything? Maybe we should check out these other guys, maybe?
Why are the penguinites and mac heads banging? Maybe.....just maybe, there's a little objectivity here in 10% of the posts. The others are either ill informed or prejudiced.
Yeh? Well I posted about the same dumb message you just read on the mac head board too.
heh.
It's not your fault, because you probably didn't know this, but the USC Mac cluster didn't cost anything near $440,000, and it didn't have any 1000 MHz G4s in it.
At the "Macs in Science and Engineering" user conference at Macworld, they gave the general specs of this cluster, and all of the machines were dual processors, but of different hardware generations. Although the fastest machines were dual 800 MHz on a 133 MHz bus, the majority were slower dual 450 and 500 MHz machines with 100 MHz buses.
Given that all were dual, and ignoring depreciation on the older hardware, the cost would be at most $220,000. If you were using dual 1 GHz G4s, it would still be only $220,000. My notes are on my laptop, but I believe that the actual cost of the USC cluster was less than $200,000.
Also, I assume that you think that the 270 uni-processor T-birds will scale performance linearly as well. I doubt it would only cost ~$600 per node as you would have to use Myrinet or some other fast fabric, and with three and a half times as many nodes, the latencies, hardware, and administration cost would be crippling. I have the same cost argument if you use dual Athlons, as the boards are quite rare, and the node count is almost double the Mac node count.
Your price/performance assertions don't stand up!
-- Len
All of a week ago, I went to a talk where the man, Dauger himself, got up in front of a bunch of professors and explained why Mac clusters were the best thing in the world. The Wired article reads just like his presentation. He even had a copy of the Beowulf book at the presentation, and handed out copies of his one page manual. There is no comparison. His manual says, basically, to install Pooch and reap the rewards. I found something interesting. The USC cluster that was mentioned was our Language Center Lab (I'm a student at USC). They ran a fractal benchmark. The thing that I found interesting was this, and maybe someone can help me out here. The language lab doesn't have any dual processor machines, and doesn't have clock speeds anywhere near 1 GHz on any of the machines. It's my understanding that all of these Macs pump out about 1 GFLOP each. There are 56 machines in the lab. 1 GFLOP * 56 machines = 56 GFLOPS peak. Dauger's benchmark said the cluster was pumping out 223 GFLOPS. What am I missing?
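The arithmetic in the comment above is easy to check with a quick sketch. The 1 GFLOP per machine figure is the poster's own estimate and 223 GFLOPS is the benchmark number as quoted from the presentation; the variable names are only for illustration.

```python
machines = 56                 # machines in the lab, per the comment
est_gflops_per_machine = 1.0  # poster's rough estimate, not a measured value
reported_gflops = 223.0       # benchmark figure quoted in the comment

# Peak the poster expects from the lab as described
expected_peak = machines * est_gflops_per_machine

# What each machine would have to deliver to reach the reported figure
implied_per_machine = reported_gflops / machines

print(expected_peak, round(implied_per_machine, 2))
```

Under those numbers each machine would have to sustain roughly 4 GFLOPS, about four times the poster's estimate, which is exactly the gap being asked about.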
Dragging a sensationalist article like this out of somewhere with the obvious intent to start a Holy War is crass.
If anything, mac users need to stop touting that OS X will make all the *nix users "Finally See The Light" about macs, and *nix users need to stop touting that OS X will make mac users "Finally See The Light" about *nix.
We should work together more.
You're absolutely right: READING TOO HARD!
In the header for the slashdot article: "Slant of the article is that the Macs are easier to set-up, maintain and are more flexible." I think it is funny that you call the article's conclusion a slant just because it's in favor of the Mac. I'm sure your header would state the conclusion as absolute fact (LINUX easier to set-up, maintain and is more flexible!) if the article's conclusion had been the opposite. You should have used an unbiased header like "Article concludes ..." Is slashdot a news site or a propaganda site?
Zet
Why do you continue to egg these Mac zealots on with these stories? Jesus H Christ, just shut up with these Mac stories and let these idiots relax a bit.
Sorry, but this just doesn't seem like a very stable or efficient setup to me. First, Pooch appears to be Carbonized...which has a decent amount (IMHO) of legacy problems/sluggishness that it'll drag into OS X. Second (kinda related to the first item) it's using OpenTransport. OT was usable for its time, but good grief *don't* use it now...especially when you have sockets just sitting there in OS X waiting to be used. OT is piss-poor compared to sockets. Finally, I really have to wonder about how multiple processors are being taken advantage of. Since we're talking Carbon, it may very well be via a high level call. My point: all the legacy fluff is going to be extremely inefficient compared to a good ol' pthread. The last thing I want to run on my G4/dual-800 w/OS X is something that has even the scent of Carbon calls.
;-)
Some would say 'hey, it works though; look at the results'. Well, so what? You can boot into WinMe 60% of the time too...bfd.. Nice to see this, but it misses the mark.
I'd rather stick with a Un*x-based implementation than something like this....be it MacOS X, Linux, or, um...nothing at all
Indeed. One of the Macintosh labs at Carnegie Mellon is actually called the "Apple Orchard"... more suitable, now just wait for Apple to market it.
Beowulf was predated by "Zilla.app", which shipped on NeXTStep 2.0. Richard Crandall used Zilla on any workstation that was idle, anywhere on NeXT's network (idle being defined as "the screen saver was running"), to find the 13 Fermat number, among other things.
So, this kind of (relatively) low-cost clustering began on Mac OS X's predecessor.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
These Xeons feature 512K of L2 Cache. Sure there are Xeons with HUGE amounts of L2 cache, but then we are hitting the $10000 price range. These are workstation machines, not server machines.
I can't compare the Apples to the P4s... P4s don't go dual processor, so the PPC G4 wins here. I can't get a Dual proc P4.
Athlon? None of the vendors I checked have Athlon workstations, so they weren't in consideration.
However, after realizing the lack of Athlons, I remembered that Penguin Computing has a line of Athlon based workstations.
I went to their website, and priced out an Athlon MP system, the Tempest 210MP Workstation.
With 2 Athlon MP 1900+, not really competitive with the new 1 GHz G4s, but close enough for our comparison (and matching your assertion that they are in the same league as them). With 512MB PC2100 RAM, and upgraded to the Gigabit Ethernet card (they have one, might as well try to be fair), my workstation price is $2707.
Congratulations, we have a winner. An Athlon MP 1900+ (running at 1.53 GHz if I recall?) with similar specs to the Apple Workstation comes in $300 cheaper. The Apple has some advantages, the better video card and Superdrive are nice features when the machine is recycled as a desktop machine, but for now they are superfluous.
What is the point of my work?
You're all full of shit. Apple's computers are extremely price competitive. They are cheaper than Xeons from the real vendors with similar specs (Xeons had faster RAM, equal L2 cache, no L3 cache, and no gigabit ethernet).
Apple puts out a really competitively priced Unix workstation to Linux workstations from major vendors.
Apple puts out really competitively priced consumer machines (iMac/iBook) compared to Wintel machines from major vendors.
You can choose to use an Apple solution or not, but stop spreading the bullshit about Apple being more expensive.
What most of us hate about Apple is that they make it impossible to unhide them, to get into the guts of the thing and change it as we see fit
On any machine running Mac OS X, go to /Applications/Utilities/Terminal, and launch it.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
File an "Enhancement/Feature request" at bugreporter.apple.com. The more we get, the higher the priority.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
Fear: When you see B8 00 4C CD 21 and know what it means
Looks like x86 machine code. CD 21 is INT 21 in assembly, which prints a $-terminated string in MS-DOS. What are the other three bytes? I haven't played with that stuff since high school.
$x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
$x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;
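For the question about the other three bytes, here is a minimal sketch that decodes just this sequence, assuming it is 16-bit real-mode x86 code as found in a DOS .COM program. The `decode` helper is made up for the example and only knows the two opcodes that appear here (B8 = MOV AX, imm16 and CD = INT imm8); it is not a general disassembler.

```python
from typing import List

def decode(code: bytes) -> List[str]:
    """Decode a byte string containing only MOV AX, imm16 and INT imm8."""
    listing = []
    i = 0
    while i < len(code):
        opcode = code[i]
        if opcode == 0xB8:
            # MOV AX, imm16: the 16-bit immediate is stored little-endian
            imm16 = int.from_bytes(code[i + 1:i + 3], "little")
            listing.append(f"mov ax, 0x{imm16:04X}")
            i += 3
        elif opcode == 0xCD:
            # INT imm8: software interrupt with a one-byte vector number
            listing.append(f"int 0x{code[i + 1]:02X}")
            i += 2
        else:
            raise ValueError(f"opcode 0x{opcode:02X} not handled in this sketch")
    return listing

listing = decode(bytes.fromhex("B8004CCD21"))
print("\n".join(listing))
```

That puts 4Ch in AH and 00h in AL before the interrupt. Under MS-DOS, INT 21h with AH=4Ch is the "terminate with return code" call (return code in AL), whereas the $-terminated string print is AH=09h, so this sequence exits the program rather than printing anything.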
Or don't buy the model with the Superdrive. G4s start at $1,250 for the education market. It's a cluster, people, just add more machines to make up for the performance difference between bleeding-edge and price/performance champ, and you come out way ahead.
Lies about crimes
I don't know what university or what department you work for, but at every single one I have seen, they buy brand-name desktops. Heck, my department is computer science, and even we buy Dells. Biology and Physics sure as hell aren't going to start building pricewatch boxes or buying bargain-basement computers with shady support.
Lies about crimes
Clustering assumes reliability.
:)
Reliability of Mac OS? Are you kidding?
And the other point, does anybody here know that Mac does not directly mean Mac OS? Has anybody here heard about Linux or BSD on Mac? If not, then you are not far away from people who don't know any OS other than the ones from M$
One more point: check the performance and memory consumption of Blackdown Java under Linux/x86 and Linux/PPC, especially with green threads. I know your choice for application server platform after that - Linux/PPC! Am I wrong?
To answer the question of what a Mac would be used for, the answer is quite a lot. Most cluster-based stuff is homegrown applications, which can be written for OSX as easily as most OSes. But beyond that, there is actually a huge call for rendering farms for programs such as After Effects and Maya that film companies use to create films (more importantly, the films I actually want to see, the ones where things fling through space and explode, not the ones where things are passed around a coffee table while people discuss important issues of sexual politics).
I know Linux just had a big win with Dreamworks, but Macs are huge in F/X industry. And if clustering brings new avenues to cheaper special effects, that means more special effects. And that is just good.
As for it being easier than Linux, it probably is. No point in crying about it, let's get a Beowulf-out-of-the-box solution. I agree that Macs aren't customizable enough for my taste, but this doesn't mean there can't be a default configuration of BW that would work immediately and could be tweaked later.
Ah but see, the Macs don't need to be the same and it's still just as easy. You should be comparing setting up a bunch of random Macs and a bunch of random PCs. Even if you have identical PCs that's not the only advantage. The big advantage is that you don't have to go off and configure a whole heap of stuff, you just drag and drop the program you want, select the nodes to run it on and click start.
Perhaps you should try actually setting up a Mac Beowulf cluster before claiming it isn't easier...
The funny part is that I had just finished watching a presentation on AltiVec where the presenter stated that 16x is your theoretical max and that most people cannot get more than 8-10x.
Go AltiVec!
Your Mac cluster machines would be better off booting from the same machine. Your cluster machines don't even need hard drives (aside from that memory caching.)
So much for your insane idea of having to update 1024 machines.
Instead your 'easy' linux solution becomes 1024 times more work than the mac solution.
GG.
http://en.wikipedia.org/wiki/2004_U.S._Election_c
From what I understand, Maya is a popular Mac program, and it takes quite a while to do the crunching for that program. Now, artists are always clamoring for faster machines in order to bring such times down. I wonder if it would be possible to run Maya or Photoshop in Pooch. Also, I wonder how well Pooch would handle running on a headless Darwin system (OSX is nice and all, but the GUI does have some operating overhead). This way, instead of buying a new computer for a 50% increase in speed, they could buy it for a 100% increase (assuming new computer is 1.5X faster than old, and the old won't be able to contribute its all). I could see this working. Of course, the apps would have to be completely re-coded to see a real speed benefit, but having the computer simply work as a headless slave that handles compiles/renders/etc., whilst the human continues to use the front one sounds like a good idea...
BlackGriffen
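The 50% versus 100% estimate above works out as follows. This is only a back-of-the-envelope sketch: the `old_efficiency` parameter (the fraction of the old machine's speed that actually helps once it becomes a headless helper) and the function name are made up for illustration, and the model assumes the job splits perfectly across both machines.

```python
def combined_speed(old_speed: float, new_speed: float, old_efficiency: float) -> float:
    """Throughput of new machine plus old helper, with perfectly divisible work."""
    return new_speed + old_speed * old_efficiency

OLD = 1.0  # old machine as the baseline
NEW = 1.5  # new machine is 1.5x the old one, per the comment

replace_only = NEW / OLD                      # just swapping machines
ideal_pair = combined_speed(OLD, NEW, 1.0)    # old machine contributes fully
hedged_pair = combined_speed(OLD, NEW, 0.5)   # old machine contributes half

print(replace_only, ideal_pair, hedged_pair)
```

Replacing the machine gives 1.5x (a 50% increase), the ideal pairing gives 2.5x, and the comment's 100% increase (2.0x) corresponds to the old machine contributing about half of its speed.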
:(
http://en.wikipedia.org/wiki/2004_U.S._Election_c
There's a few people saying the cost of a Linux cluster of similar computing power would be much less than a cluster of Mac towers. That is completely wrong, and here's why:
1.) Power vs. cost. The G4, with AltiVec-enabled MPI code, can blast data through in 128-bit chunks. Steve Jobs loves to term this the "Velocity Engine", and it is much, much more powerful when doing solid number crunching -- exactly what would be taking place on these clusters. It's not as amazing for day to day operations, but the capability is there to quadruple the data flow of a traditional processor when doing clustered computing. Typical AMD/Intel processors can just not do this.
2.) Maintenance. This is key. I maintain a Linux cluster and have worked with others in the past, and wonderful as they are, they require lots of maintenance. It's pure and simple math. You probably built all 16 or whatever nodes with individual parts made by various companies, and inevitably, each of those elements will have problems. This makes debugging and fixing hardware problems unbelievably painful, especially when you also have to deal with multiple parts vendors. When you use Apple Power Macs, ALL hardware problems can go through ONE support source, and that's Apple. Plus, they are pre-built, tested, and refined in Apple's R&D labs far before they make it to your cluster room. This saves such incredible amounts of time and money, it definitely pays for the extra cost of the computers themselves. I wish I could explain to you the sheer pain of keeping a cluster alive which constantly had one part go bad here and there -- but one part, sixteen computers, each with eight or nine significant custom-attached parts... well, it meant a lot of troubleshooting time, a lot of replacement time, and having to deal with far too many different companies to get the parts and support I needed.
3.) MacOS X. Clustering under previous MacOS versions was, despite the best efforts of AppleSeed, absolutely reprehensible. The operating system was simply not designed to do massive computing projects, and it was not efficient at all. Definitely not worth it despite the work of the pioneers in the field. With OS X, you now have a BSD operating system, one that has done clustered parallel computing for over a decade. MPI, with AltiVec enhancements; gcc with multiprocessor compilation support, you name it, it now runs under OS X and, with the operating system natively supporting the G4, it does it DAMNED fast.
"What the heck do you know," you might ask. Again, I maintain a 16-node Linux cluster for a plasma simulation group at the University of Colorado, and am also the CU campus rep for Apple Computer. I am well-versed in both OS X and Linux, and their scientific computing environments, and have experience in clustering in both environments. I am in the process of establishing a scientific computing initiative at CU, and I am doing it on behalf of Apple because the G4s (and soon, G5s) are simply the best platform for multi-platform scientific and high-intensity computing.
The best saving grace from a sysadmin's point of view, is that I will never have to worry about maintaining the variety of parts in those damned Linux clusters. The operating system is wonderful for scientific computing, yes, but there's simply no cost-effective way to purchase and maintain Linux-based PC hardware that could ever compare to the Mac. From an overall perspective, and this is definitely the most important aspect, those who are using massive parallel clusters of computers need their data crunched fast, and the G4 processor, combined with AltiVec-enhanced code, is simply the fastest way to crunch data, straight and simple.
I hope that clears up the issues for people, because that's how it is. Just the facts, ma'am.
Ryan Bruels
Apple Campus Representative
University of Colorado, Boulder
bruels@mac.com * 303-332-5434
"All your base are belong to this file I send in order to have your advice."
well besides the cost of macos that's another reason why linux sucks.
You're right... drat those units...
the latency on ethernet is about 10 microseconds, not milliseconds... on the parallel port it's 1 microsecond... on firewire it's 125 microseconds... which means ethernet is better than firewire and parallel ports are better than ethernet (from a low latency view)
There are 10 types of people in this world, those who can count in binary and those who can't.
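Since the mix-up was about units, here is a small sketch that puts the three latency figures from the comment above into one unit and ranks them. The numbers are the poster's, not measurements, and the dictionary name is just for the example.

```python
# Latencies as quoted in the comment, all in microseconds
latency_us = {
    "ethernet": 10.0,
    "parallel port": 1.0,
    "firewire": 125.0,
}

# Lowest latency first
ranking = sorted(latency_us, key=latency_us.get)

# The same figures in milliseconds, to show the size of the original unit slip
latency_ms = {name: us / 1000.0 for name, us in latency_us.items()}

for name in ranking:
    print(f"{name}: {latency_us[name]} us = {latency_ms[name]} ms")
```

On these figures the parallel port comes first, ethernet second and firewire last, matching the ordering given in the comment.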
Interesting page ya link to there. But it proves the x86 PC is better than the Mac, because the Mac box took longer to open. If you're building a cluster, then by definition you have a lot of boxes to open! 40 seconds here, 40 seconds there, it all adds up...
Oh, and just try to fly to a SPA meeting with a box-cutter these days. Since PCs come with pre-opened boxes, their margin would be even wider.
Look at all the excuses it lists whenever the Mac took a long time: "We didn't have a knife ready", "Max was confused", blah blah. But Jim Louderback doesn't get the same consideration?! At each step, there should be a footnote of explanation: That Jim was crippled by having to use Windows. When one contestant has to use Windows and the other doesn't, it isn't a fair contest!! If Louderback had been allowed to use a Mac too, he could have wiped the floor with that smartass kid. So please, don't post links to this kind of misleading crap.
The Portable Distributed Objects (PDO) environment is great, fantastic, incredible, but it's stretching a point to call it clustering; if PDO is clustering, then RPC and Sun's GRID should be, as well... but they're not. Not really, anyway.
PDO is a kick-ass way to distribute processing load across many systems. It's easier than CORBA (heh), cleaner than DCOM, and not so riddled with security holes as RPC is. Like most development tools out of the NeXT world, it feels like something written by a couple of brilliant developers who needed the functionality, as opposed to the six-committees-and-a-quorum-vote crap that CORBA shoves down our throats. But distributed computing != clustering. What we got here is a better way to manage distributed computing. It's not Beowulf, it's not an E10K with 16 domains and IDN, it's not even Wolfpack. Even so, don't dismiss it. NeXT was ahead of its time. Maybe the rest of us have caught up to it.
-Baka!
http://www.daugerresearch.com/pooch/PoochManualX.1 .pdf
Well, Dauger Research is touting their 1 page manual and right they should. The simplicity in setting up this cluster is pretty amazing. The link is to a 46 PAGE technical document that goes into much greater detail. Still a couple of hundred pages shorter than the referenced Linux manual.
Now, if people would stop bashing Apple's documentation and realize that it is Dauger Research who wrote the documentation for Pooch, I'd be very appreciative.
Pooty tweet
I think more to the point is you can use Macs that your organization probably already has. If you're at a school that's got a big pile of Macs in the library or graphics labs or something you can turn them into a super computer by night and still have them usable by students during the day. The same can't be said for the highly tuned Beowulf "all the systems need to be identical and within four feet of one another or the doppler shift over the copper wire will fuck something up" system. Beowulfs are cool and in some cases are very effective (when you have the money to buy and build a new system) but if you need to use stuff you already have (and you've got Macs) the Appleseed is a good choice.
I'm a loner Dottie, a Rebel.
I think this guy has some marketing blood in him..
"There's a book called How to Build a Better Beowulf that's 230 pages long and tells you how to set up clusters with Linux," Dauger said. "We have a one-page manual (PDF) that shows you how to do it on PowerMacs."
I went ahead and looked at the PDF, and I admit, it does look easy. Essentially, the PDF states the system requirements, as well as installation instructions (run setup on this certain program on all machines). The next part is a 3 step process: first, start a parallel application; second, select the nodes to use; and lastly, hit the launch button.
First of all, he picks a book that I can't seem to find anywhere, at least on amazon or bn.com. I found one that was similar, How to Build a Beowulf (239 pages), and from what the reviews say, it *teaches* you about clustering computers. One would think that this guy *should* mention a book that is comparable to the information available in his PDF, but no, he chooses a book which is essentially something you could probably teach a college class on.
I'm not really saying that it's not as good, as I don't have the expertise to really know the differences between the 2, which works better, etc., but this guy is spouting crap in my opinion.
Ok, my next problem with this. This company or guy or whoever created a program to do this. He sees something that is on Linux already, applefies it (making it so a 6th grader could do it) and then tries to compare the 2. Obviously whoever wrote beowulf didn't write it so a 6th grader could install it. From what I remember reading about beowulf, someone from NASA wrote it. I could be wrong, but I'm sure it was someone in a scientific field of some sort that needed to do this. So, the area the beowulf programmer wanted this to be used in was a fairly technical environment, where the people setting this up would know that it's a complex job, but it's probably fairly configurable as to how it's going to work. Now, could I not create a very similar program for linux? Could I not, essentially, keep everything as default or whatever, and make a wizard that was a 3 step process, that made a cluster in linux? He makes it sound like there is something special with Apple computers that makes this thing work, but from what I read, it's just a program that links them together. You don't need anything special that an apple has, he just decided to write it there.
This is really how I see it: let's say that there is some sort of wizbang technology out there, and it's a fairly complex, tunable technology, not for the faint of heart. I create a similar technology, but I keep all the tunable settings, and everything, at a default level. I make it extremely easy to set up, because I'm only making the user provide the very basic information to get it to work; any sort of *settings* are the same across the board. I then start getting websites to write reviews bashing the more complex system because there is one available that only has an on and off button.
I dunno, it's probably pretty cool, but the only place he really mentions this being used at is in grade schools (and I'm kinda pissed that in 6th grade we didn't do a damn thing with computers at all, and kids these days in 6th grade are building supercomputers). Also, from his PDF, you're limited to the IP address range of 192.168.1.1 to .254. Does this mean it has a limitation of about 253 machines, or has he not yet programmed the button that takes you to an area where you can *configure* that stuff..
I would love to see a team of NASA's engineers go against this single guy in a battlebot type of competition with that statement. Give the whole entire team of NASA engineers to build a Beowulf cluster in 2 weeks, and give this guy an hour to make his with Pooch and Apple computers. Then have them both run a predetermined test (that both groups can look at while building their cluster) and see whos cluster *performs* better.
It seems like to me that if I wanted a bigger dick, I would buy a bunch of apples and tell all my friends how easy it was to get this cluster working. If I wanted to get work done, I would use something that I know works, is used in a lot of places, and I know can work fast.
Ah well, that guy sucks, the technology might be pretty slick, im not really smart enough to be able to pick apart the systems or anything, but that seems like a Mr. Ego way of trying to sell your product...
Cost of 10 good high-end Macs (about $30,000)...
Reading Beowulf in a PDF file on a cluster of these: priceless.
The original 128K Macintosh came with a thin manual and a cassette tape (which you played along with a movie running on the screen). This was enough. One of the first Mac commercials showed a PC, with a stack of books falling on the table, and a Mac, with the thin manual floating down.
However, they made the same error that you make: thinking that people select for ease of use. They don't. This is what happens:
The sum total of this is what I call "the Acolyte effect." An Acolyte is someone studying for the priesthood. Computer acolytes are attracted by the pseudo-mystical nature of software; learning its ins and outs is for them a rush. The choice of computers and software becomes a social hierarchy.
you said: Not to mention, you'd probably want to hack the OS in some way so that you could kill CPU-hog Aqua.
the proof is in %top
Processes: 46 total, 3 running, 43 sleeping... 189 threads 03:17:24
Load Avg: 0.91, 0.68, 0.51 CPU usage: 7.0% user, 13.9% sys, 79.1% idle
SharedLibs: num = 124, resident = 29.8M code, 2.25M data, 7.20M LinkEdit
MemRegions: num = 4221, resident = 118M + 8.61M private, 84.9M shared
PhysMem: 64.7M wired, 114M active, 555M inactive, 734M used, 34.5M free
VM: 1.66G + 56.9M 9675(0) pageins, 617(0) pageouts
PID  COMMAND    %CPU  TIME     #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
1146 top        10.4% 0:02.80  1   14    14     216K  328K  456K  1.37M
1141 tcsh       .0%   0:00.13  1   16    16     480K  656K  948K  5.74M
1105 Radio User 3.4%  12:38.38 11  107   221    10.9M 15.7M 19.3M 81.5M
959  iTunes     .0%   48:45.56 8   141   224    9.35M 13.0M 14.7M 72.4M
952  SecurityAg .0%   0:00.46  1   66    75     940K  7.61M 2.74M 57.7M
889  JavaBrowse .0%   0:09.47  3   92    151    4.21M 11.1M 9.98M 65.3M
862  BBEdit 6.5 .0%   1:42.18  5   118   173    7.14M 13.1M 11.8M 120M
832  Terminal   .8%   0:54.66  6   123   247    2.65M 9.80M 6.26M 63.1M
807  OmniWeb    0.0%  9:46.23  38  221   996    46.6M 35.4M 72.3M 148M
805  Eudora 5.1 .0%   12:15.22 7   124   167    5.78M 12.9M 9.21M 86.3M
802  netstat    0.0%  0:09.20  1   12    14     60K   336K  264K  1.32M
801  PTHClock   .0%   2:59.75  1   64    78     1.18M 6.80M 9.60M 53.9M
800  iTunesHelp .0%   0:00.15  1   46    38     356K  2.79M 956K  30.2M
799  SystemUISe 2.6%  12:30.23 6   157   202    4.51M 7.64M 6.29M 65.8M
798  Dock       .0%   3:43.34  3   116   147    2.61M 10.9M 7.42M 60.2M
my mac right now.
I used to have a better sig than this, but I got tired of it
It was so fast that it was classified as a weapon and couldn't be exported to countries such as China, Iraq and North Korea.
Is it just me, or is that awesome?
That, plus the fact that clustering is so easy that sixth graders can literally do it, makes Macs seem very attractive to me.
I think the power of a Beowulf cluster is much greater. I don't have any facts, but given the bloat in MacOS... Second, why did the Mac people turn this into a Mac vs. PC war? There's no reason you can't run a Beowulf cluster on Macs.
Maybe Steve would feel more comfortable releasing OS X for the PC if the Mac people were more anti-Windows and less anti-PC.
"And we have seen and do testify that the Father sent the Son to be the Savior of the World"
1 John 4:14
He said something potentially "bad" about Macs. It doesn't matter if it might be true. It doesn't matter if it's said in a civil, calm, mature, and professional fashion. It doesn't matter at all.
Don't you know what these Mac zealots are like? They're worse than Linux zealots, and they've been around longer. They will not tolerate any criticism of the Macintosh in any fashion. As far as they are concerned, the Macintosh has been and will always be unconditionally superior to the PC in every way possible, Mac OS X represents the pinnacle of operating system engineering, and Steve Jobs is to be worshipped like Jesus. Heck, some of them might believe Steve Jobs is Jesus!
It's always fun when an Apple article appears on Slashdot. I just switch to -1, nested mode, and search for all occurrences of "flamebait" in the article. I always come across some juicy criticisms of Apple and the Mac which Mac addicts with mod points are trying to suppress.
It's strange, because by marking these posts as "flamebait", they are singling them out, making them much easier to find and read. But, they clearly don't want them to be read, or why else would they mod them down?
Oh well... I've never had a high opinion of Mac zealots and their stupid Mac Addict magazine either. I shouldn't expect much intelligence from them.
Well I did. Our cluster is intended for solving complex problems in radiation transport theory. Applications range from Monte Carlo simulations (trivially parallelizable) to the numerical solution of coupled differential equations (damn hard to parallelize). We are using every tool in the book: PVM (Parallel Virtual Machine), MPI (Message Passing Interface), as well as various batch queuing systems. The language of choice for this kind of computation is FORTRAN.
My point is the following: if you need a cluster for scientific research, you have to do a lot of your own programming and customization. Many of the comments posted here are about the high cost of configuring a Linux cluster. That argument simply does not apply in this case. The overhead spent on actual configuration is negligible in comparison to the time spent on actual coding. Moreover, the tools mentioned above make the task of parallelization much easier.
I admit, I don't know much about Macs, but: are there any good FORTRAN compilers for the Mac (comparable to HPF)? It seems that Pooch's drag'n'drop approach does not give me much control over how the subprocesses are spawned. What sort of libraries and toolkits are provided?
One more point: using SMP machines as cluster nodes is not necessarily the best way to go. MPI, for example, does not like threads much. At any rate, it is very hard to write an application that will fully utilize the high bandwidth that SMP offers while simultaneously having lower bandwidth utilization between Ethernet-connected nodes.
Lastly, for all the /. geeks, here is our configuration:
24 nodes, each node having:
Epox 8KHA+
512 MB Mushkin high performance CAS2 RAM
Athlon XP 1900+
20 GB Maxtor HDD
Linksys Gigabit NIC
$45 el cheapo case (we had a good previous experience with the particular model though)
Overall hardware cost ~$16000
OS: RedHat 7.2
My 0.02 Dinars
Let's see here...
I can go to Grand Vitesse Systems' online store and buy a 2U, dual 1GHz Mac with a gig of RAM and all the other Apple goodness (gigE, SuperDrive et al.) for around $3500.
For comparison, we next go to Dell and price out a similar 2U Wintel server, namely the PowerEdge 2550. Put in dual 1.4GHz Intel Pentium IIIs (G4s will eat this for breakfast), 1GB of RAM, Red Hat 7.2 pre-installed, and basic everything else, and what do we get? $4,871!!! Granted, this comes with an 18GB SCSI 10K drive vs. the Mac's 80GB and 40GB ATA hard drives, but I think you can get a SCSI controller and an 18GB HD for less than $1300.
Face it, since OS X, Macs have been better than anything that runs on Intel for any application.
--InfinityEdge
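For anyone who wants to check the numbers being thrown around, here is a quick sketch that only redoes the arithmetic on prices already quoted in this thread: the ~$3500 Mac, the $4,871 Dell, and the ~$16000 24-node Athlon cluster. The variable and function names are made up for illustration, and the nodes are not equivalent hardware (dual-CPU rack servers vs. single-CPU white boxes), so treat it as a rough comparison, not a benchmark.

```python
# Prices in USD, copied from comments in this thread. Nothing here is a new quote.
MAC_NODE = 3500         # 2U dual 1GHz G4 ("around $3500")
DELL_NODE = 4871        # PowerEdge 2550, dual 1.4GHz Pentium III
ATHLON_CLUSTER = 16000  # "~$16000" hardware total for the Athlon cluster
ATHLON_NODES = 24


def per_node(total_cost, nodes):
    """Average hardware cost of one node."""
    return total_cost / nodes


def cluster_cost(node_price, nodes):
    """Hardware cost of a cluster built from identical nodes."""
    return node_price * nodes


price_gap = DELL_NODE - MAC_NODE
athlon_per_node = per_node(ATHLON_CLUSTER, ATHLON_NODES)

print(f"Dell minus Mac, per node: ${price_gap}")
print(f"Athlon cluster, per node: ${athlon_per_node:.2f}")
print(f"24 Mac nodes:  ${cluster_cost(MAC_NODE, ATHLON_NODES)}")
print(f"24 Dell nodes: ${cluster_cost(DELL_NODE, ATHLON_NODES)}")
```

The per-node gap is what the "less than $1300" remark about the SCSI hardware is being weighed against.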
"Not only was the performance faster than the Pentiums but it was comparable to the performance achieved on some Crays," the team said in a report.
Ummm... Ummmm... Man I wish I had a witty retort, but they must be comparing new Macs with old Crays...
A cluster of those new iMacs would be pretty cool, cooler if they could rotate their screens to follow the sun... like a big field of sunflowers.
Realistically speaking, how many cluster users are using their cluster for an application that commercial off-the-shelf software will be available for?
"The software used to accomplish the clustering for AppleSeeds is Mac MPI, which is based upon the *standard* for parallel computing, MPI."
There is a laundry list of parallel computing standards. MPI is on the list. Hint: MPI is supported directly on Linux. I wonder which one has the bigger software repository?
"The reason that the PDF doesn't talk about programming MPI is that there is no need for redundant documentation. Go find a book on MPI if you want to learn to program to that API."
Yes. But the poster was trying to indicate that the literature comparison is a bit stacked, no? =)
"as opposed to spending quite a lot of time figuring out how to use the incredibly arcane "apt"."
More gooey distributions (e.g. Red Hat, Mandrake, etc.) include gooier automatic updating tools.
(But, presumably, if one has difficulty comprehending a simple debian command-line utility, one is clearly not qualified to understand source code for a particle physics simulation coded in a high level language, and should be thrown off the project, right?)
Using SMP machines doesn't REQUIRE you to thread your applications, thus fucking up your MPI performance. You could have your program fork itself as a separate process, or just run a separate instance from another directory or some such, and the kernel on the SMP system will load balance and keep each process running on a different processor. This approach is of course going to work a lot better with Monte Carlos than differential equations. Anyway, to answer your question: if you use Pooch you can use any library you've got available on your Macs. Just like building Beowulf apps, you load the nodes up with whatever libraries you need for the application and it will go ahead and use them as needed. If you're using OS X you can use Cocoa or Java as an object passing system to get data from somewhere to somewhere else, although this isn't exactly ideal for heavy math applications.
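To make the one-process-per-CPU idea concrete, here is a minimal sketch, assuming a toy Monte Carlo estimate of pi stands in for the real simulation. It uses Python's standard multiprocessing module rather than MPI or Pooch, and the names count_hits and estimate_pi are invented for the example. Each worker is a separate OS process with its own seeded random generator, so the kernel is free to schedule them on different processors and no threads are involved.

```python
import multiprocessing
import random


def count_hits(job):
    """Worker: throw random points at the unit square and count
    how many land inside the quarter circle of radius 1."""
    seed, samples = job
    rng = random.Random(seed)  # each process gets its own generator
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(workers, samples_per_worker):
    """Run one independent process per worker and combine the counts."""
    jobs = [(seed, samples_per_worker) for seed in range(workers)]
    with multiprocessing.Pool(processes=workers) as pool:
        hits = pool.map(count_hits, jobs)
    return 4.0 * sum(hits) / (workers * samples_per_worker)


if __name__ == "__main__":
    # Two workers, as on a dual-processor node.
    print(estimate_pi(workers=2, samples_per_worker=100_000))
```

The only communication is the final sum of hit counts, which is why this style suits Monte Carlo work and does little for tightly coupled differential equation solvers.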
I'm a loner Dottie, a Rebel.
Presumably, most of the annoyance at Apple in this community comes from the whole whacking-unauthorized-clone-makers-with-a-big-stick attitude it adopted.
Very hacker-unfriendly, and more monopolist than Microsoft. In PC land, there are almost always at least three suppliers for every major component. (e.g. CPUs: Intel, AMD, VIA, Transmeta; Motherboard chipsets: Intel, AMD, VIA; etc...)
You may have a point about the jealousy, though. Although my hardware is neither beige nor ugly -- and each of my components was selected at my choosing -- I have to admit that it would be kind of neat to mess around with OS X for a while. Now, if only Apple would let down its sometimes-whimsical sometimes-Microsoft-esque-monopolist schizophrenia for long enough to realize that it could really change the world and make a killing at the same time by entering the PC OS market... But, alas; Star Trek was crushed long ago in favour of the misguided hardware company vision.
Heck... If they leave the price/performance ratio wins and the majority market share to PC land, the Dells of the world will gladly reward Apple with all of the "cover of Time" success it wants. =)
. . . to a normal person/hobbyist (ie not a scientist, not working on a degree, etc.), what exactly is a cluster good for?
Been wondering about that for a while now, but nobody I've asked seems to get the question.
In case that's the case here too, let me rephrase: why would a private citizen build a cluster for his/her home?
xScruffx
>> Is it assumed that apple seeds will not be connected to the internet in any way, nor have any wireless access point attached to them?
Commands between Pooches are protected by an internal, rotating 512-bit encryption key based on the registration name of the user. Therefore, Pooches only of the same registration name can talk to each other. The idea is like shared access to an office resource. The demo version is relatively insecure because, well, anybody can download it. For more info, see the dox at:
http://daugerresearch.com/pooch/download.html
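For readers wondering what "a key based on the registration name" could look like in principle, here is a purely hypothetical sketch. It is NOT Pooch's actual algorithm, which is not described in this thread. It assumes the key is derived with HMAC-SHA512 (which happens to produce 512 bits) from the registration name plus a rotation counter, and it only authenticates commands rather than encrypting them. The names derive_key, sign and verify are invented for illustration.

```python
import hashlib
import hmac


def derive_key(registration_name, period):
    """Hypothetical: derive a 512-bit key from the shared registration
    name and a rotation counter (e.g. a message or time-slot number)."""
    return hmac.new(
        registration_name.encode("utf-8"),
        str(period).encode("ascii"),
        hashlib.sha512,
    ).digest()


def sign(key, command):
    """Tag a command so a peer holding the same key can check it."""
    return hmac.new(key, command, hashlib.sha512).digest()


def verify(key, command, tag):
    """Accept the command only if the tag matches under our key."""
    return hmac.compare_digest(sign(key, command), tag)


# Two nodes registered under the same name derive the same key...
sender_key = derive_key("Physics Lab", period=7)
receiver_key = derive_key("Physics Lab", period=7)
tag = sign(sender_key, b"launch job")
print(verify(receiver_key, b"launch job", tag))

# ...while a node with a different registration name does not.
outsider_key = derive_key("Someone Else", period=7)
print(verify(outsider_key, b"launch job", tag))
```

The point of the sketch is only the access model described above: nodes sharing a registration name can talk to each other, and others cannot.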
There are plenty of AppleSeed clusters connected to the Internet. I've accessed the one at UCLA from lots of places: using my PBG4 via Airport from my couch in Pasadena, from Toronto, Canada, from Garching, Germany (outside Munich)....
There is someone who charges $250/client for Linux on Intel Beowulf-type software.
Dean
I haven't really seen many posts referring to the efficiency of the clustering software. This would be a major factor in my choice.
And,
Why does everyone assume that the cluster would be a full-time cluster? Doesn't anyone else run a network with daytime workstations and a nighttime cluster?
It doesn't matter if the howto for beo-clusters is 200+ pages compared to 1 page for OS X clusters.
1. I haven't read the beo-howto, but mostly you do not read the *whole* howto to get things running. You pick the topics that apply to you.
2. You can buy beo-clusters now from vendors, so if you really are not into tech stuff, you can let them manage it. It will cost you, but it will work. Good.
3. What happens if a distribution comes out that is specialised for building beo-clusters and makes this as easy to set up as Apple has done now? Whatever Apple is doing on OS X, we can do too. It's a Unix; if we had all their sources, they would probably compile with a little tweaking.
On a long enough timeline, the survival rate for everyone drops to zero.
It's often worthwhile to do this test:
Compile two versions of your code, one using double precision, the other using all single precision.
Compare the accuracy of the final results, and decide if the performance penalty of using double precision is worth the extra accuracy.
Don't get me wrong, I know double precision is essential for some problems. But I also know engineers who code everything double precision by default, even though 95% of the time single precision results would be every bit as good.
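A minimal sketch of that kind of test, assuming a toy computation (summing 1/k) stands in for the real code. Instead of compiling two builds, it emulates single precision in pure Python by rounding every intermediate result to a 32-bit float with the struct module. The helper names to_single and harmonic_sum are invented for the example, and math.fsum provides a correctly rounded reference sum.

```python
import math
import struct


def to_single(x):
    """Round a Python float (a double) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def harmonic_sum(n, single=False):
    """Sum 1/k for k = 1..n, optionally rounding each step to single."""
    total = 0.0
    for k in range(1, n + 1):
        term = 1.0 / k
        if single:
            term = to_single(term)
            total = to_single(total + term)
        else:
            total += term
    return total


n = 100_000
reference = math.fsum(1.0 / k for k in range(1, n + 1))
double_result = harmonic_sum(n)
single_result = harmonic_sum(n, single=True)

double_err = abs(double_result - reference) / reference
single_err = abs(single_result - reference) / reference

print(f"double: {double_result!r}  relative error {double_err:.2e}")
print(f"single: {single_result!r}  relative error {single_err:.2e}")
```

Whether the single-precision error is acceptable depends entirely on the problem, which is the whole point of running the comparison before paying the performance penalty everywhere.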
That that is is that that that that is not is not.
You could manage all the machines on a LAN/WAN from a GUI, and you could have each machine run your app all the time, or only when the screensaver kicked in. It was kickass stuff for its time (~1990).
Robert
Awesome furniture, accessories and cabinetry in Santa Rosa, CA: http://humanity-home.com/
It was necessary for me to evaluate G4 architectures for embedded distributed applications about three years ago. The first system available was a PowerMac G4 (350MHz). We put YellowDog Linux on it and then set up the small Beowulf cluster (master and four rack slaves). The installation of YDL was probably the most time consuming (me as a newbie). SSH was my most difficult topic.
I found the NFS pretty simple to administer and the launch of MPI applications from the shell easy using all the open source MPICH with CH_P4 (TCP/IP). If I wanted to review all the MPICH docs and Linux docs I'm sure it would be many hundreds of pages.
I have always liked the Mac hardware and simple to use software. The software described seems very easy to use and administer and I applaud the developers. However, the post seems a bit biased towards the developer and the one page HOWTO.
However this tool only works under MacOS. My clusters are portable to Linux, Solaris, VxWorks OS using Pentium, PowerPC, Sparc, etc. Portability is a distinct advantage when going from workstations to embedded deployment. I am also unsure if the entire MPI v1.2 standard functions are available on the MacOS platform.
Steve Prause
CSP Inc.