Xgrid Agent for Unix
mac-diddy writes "Someone on Apple's mailing list for Xgrid, Apple's clustering software, just announced an 'Xgrid agent for Linux and other Unix platforms' available for download. There are still some issues being worked on like large file support, but it does allow you to simply add a Unix node to your existing Xgrid cluster. Just goes to show that when companies embrace open standards and code, the world doesn't fall apart."
Somewhat silly, but wouldn't you incur a bit of overhead mixing machines of different endianness? I suppose for non-communication-intensive algorithms this wouldn't be a big deal.
Not really. Everyone uses network byte order for communication, so you won't have more overhead in a mixed system than you would in a homogeneous system.
"They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
Well, Windows runs on Athlons and Xeons... and technically it runs on a G5 (the Xbox 2 dev kit that MS distributed runs a modded NT kernel on a dual G5), so it's possible to do it on Windows, but why would you want to?
basically, for your kind of applications: nothing.
I doubt you compile applications that big
photoshop: get an smp instead and plugins that support it
quake,mame: u kidding get a faster gpu instead
In the past, as I have moved between jobs, I've written a number of Object->relational mapping tools.
After a while they cease to be fun to write, and you'd rather just get on with writing code that does something instead of infrastructure. By using and contributing to OSS projects, you can use the same code no matter what company you end up at. Because the code is portable it can become part of the package you can offer to a potential employer - they not only get an employee but potentially one who can be productive almost right away because they are familiar with the tools they'll be using, with no cost to the company for said tools.
So it makes life easier for you, less re-work. And it makes life easier for employers, as they get richer products sooner. And if the employee becomes really proficient at a widely used OSS project they can write their own way through consulting or training.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
You don't need XGrid for faster compilation - Developer Tools already includes distcc
"[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz
zero for home use. grid stuff is currently only good for science apps, although openmosix on linux is pretty cool for a linux network (not sure if it works on macs). 10/100 cuts it alright, wireless is even better, just add openswan to protect your data
Well firstly, since Windows is more than just a kernel, just because the dev kit has it does not mean Windows as a whole does. Secondly, Athlons, P4s, Celerons and Xeons are all the same architecture, ia32. If you had used the P4 and, say, Itanium or Opteron/A64 you would have a point, which is why I qualified the statement in the beginning with 'to a large extent.'
"I use a Mac because I'm just better than you are."
Actually, no it is quite feasible if you do it on a large scale and depending on what you use the cluster for. Big Mac and the Army cluster are two examples of where a mac cluster can be cheaper.
Jesus was a compassionate social conservative who called individuals to sin no more.
This troll is getting old. MS does not and never did own 40% of Apple. They bought a large chunk of non voting shares in exchange for making IE Apple's default browser. As soon as the 3 year contractual agreement was up, MS sold the shares, and for a decent profit.
There are many other open source cluster/queuing systems available.
The one I prefer is OpenPBS. It works very well for engineering compute clusters, and there are many different resource schedulers available which use the PBS job and node management system.
I wouldn't say that - I find it pretty amusing you've been registered at ./ for so long and are still so wrong.
p.s. I know I should reference - how about 'MS owns fuck all anymore' - will this do?
The Mothership
Not quite. "Network byte order" is big endian. So on big endian ppc's, which macs are, all those "ntos" macros, etc., expand to NOPs. Once you introduce little endian machines into the mix, they start doing real work to transform internal representations for the wire.
The real tragedy is when you have homogeneously little endian machines; e.g., a network that only has PCs on it. An integer gets byteswapped twice to end up in exactly the same byte order it was all along.
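For anyone who wants to see the byte-order point rather than argue about it, here is a small sketch. It is in Python only for convenience (the thread is about the C functions); `socket.htonl`, `socket.ntohl` and `struct.pack` are standard library calls, and the value 0x0A0B0C0D is an arbitrary example, not something from Xgrid.

```python
import socket
import struct
import sys

value = 0x0A0B0C0D  # arbitrary example value

# Network byte order is big-endian: most significant byte first on the wire.
wire = struct.pack("!I", value)

# The same integer laid out in this machine's native byte order.
native = struct.pack("=I", value)

# On a big-endian host the two layouts are identical, so the conversion has
# nothing to do; on a little-endian host the bytes come out reversed.
conversion_is_noop = (wire == native)

# Sender converts host -> network, receiver converts network -> host.
# Between two little-endian hosts those are two swaps that cancel out.
round_trip = socket.ntohl(socket.htonl(value))

print(sys.byteorder, wire.hex(), native.hex(), conversion_is_noop, round_trip == value)
```

On a PC the last two fields should print `False True`; on a big-endian PowerPC Mac they should print `True True`.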
Can anybody confirm if the linux and unix ports are smp aware?
(I wrote the xgridagent).
As the other poster said, XGrid does not care what the binary does (so it can be smp aware, multi-threaded, whatever). However, the xgridagent itself is not explicitly smp aware, but it is multi-threaded. Each task is started in its own thread and depending on the OS(?) I guess they could spread to other CPUs. The other aspect of the question is "Does the Unix XGrid agent support MPI like Apple's GridAgent for OS X?". It does not and I can't say for sure how difficult it would be to support it. However, since all communication is done via the XGrid protocol, I don't see what would prevent it from being implemented. But other things need to be done first.
The most pressing issue is to fix the annoying "large message" issue which makes the agent hang (while it waits forever for the controller to accept more frames). I am convinced it is trivial, I just don't know enough about BEEP to fix it. I am hoping somebody who knows BEEP will take a look at xgridagent-profile.c and fix the xgridagent_SengMSG() function and send me the patch.
Daniel Côté
So on big endian ppc's, which macs are, all those "ntos" macros, etc., expand to NOPs. Once you introduce little endian machines into the mix, they start doing real work to transform internal representations for the wire.
not quite.
first, i think you mean "ntohs" (and ntohl and friends).
second, they are not macros. they are, in fact, real functions (in glibc, bsd libc, and windows' winsock library). i'd imagine it's the same on macs.
third, a macro that does nothing is not expanded to a NOP, it is simply removed by the preprocessor.
so, assuming the macs are conforming to bsd networking standards, ntohs is required to be a function, so there is still a function call per conversion (which is much more costly than doing the actual byteswap).
The real tragedy is when you have homogeneously little endian machines; e.g., a network that only has PCs on it. An integer gets byteswapped twice to end up in exactly the same byte order it was all along.
a real high performance implementation (ie, the kernel) would not use ntohl, it would implement a similar byteswap macro. a byteswap can be done on x86 in one instruction, so it is fairly trivial to do.
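For reference, the swap itself is just four masks and shifts, which is why a single instruction (`bswap` on x86) can do it. A minimal Python sketch of the arithmetic, with `bswap32` being a name made up for this example:

```python
def bswap32(x: int) -> int:
    """Reverse the four bytes of a 32-bit unsigned integer."""
    x &= 0xFFFFFFFF  # keep only the low 32 bits
    return (
        ((x & 0x000000FF) << 24)    # lowest byte becomes the highest
        | ((x & 0x0000FF00) << 8)
        | ((x & 0x00FF0000) >> 8)
        | ((x & 0xFF000000) >> 24)  # highest byte becomes the lowest
    )


swapped = bswap32(0x0A0B0C0D)
print(hex(swapped))  # 0xd0c0b0a
```

Applying it twice gives back the original value, which is the double swap complained about above.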
Personally, I found PBS to be the best open source solution last time I had to choose, but that was just prior to the Sun buyout of GRD, so things may have changed. [My current employer rolls their own batch scheduler, so I haven't had a need to survey the field for a few years.] There are also some things Condor rocks at (cycle scavenging, userspace checkpoint/restart/migration) which none of the others even attempt, so it's definitely worth a look for some sites.
If you're paying $$ for your batch scheduler, LSF pretty much trumps all of them, but the price is too steep for me.
To utilize Xgrid, the application has to be written for it
Not so, not so.
If your problem is embarrassingly parallel, chances are you can use Xgrid to run it right now.
For example, let's say you're rendering a 3D animation. (I haven't done real 3D work since the PowerAnimator days, so pardon me if some of my jargon is antiquated.) You've got a scene file on which you can run a render command. A command-line argument tells the renderer which frame to render.
No problem. Just use Xgrid's Xfeed plugin. Xfeed lets you set up a job that runs a single command with a variety of command-line arguments. You tell Xfeed that you want to run the "render" command with "-f" and the numbers 1 through 720.
Xgrid goes to the first available machine on the grid and says, "Run render -f 1." Then it goes to the second machine and says, "Run render -f 2." And so on, until there are no available machines. Then it waits until a machine becomes available and says, "Run render -f n."
As each output file (a frame, in this case) becomes available, Xgrid (the client application itself, I mean) collects them in whatever directory you specified when you submitted the job.
The cool part comes when you realize that this isn't a cluster. It's a grid. That means machines can come and go as they please. If this job is running overnight, when I come in the next morning and sit down at my workstation, the agent on my computer stops the job and de-registers itself. The job goes back in the controller's queue for processing on whatever the next available machine is.
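The fan-out described above can be sketched as a toy simulation. To be clear, this is not Xgrid's or Xfeed's actual API or protocol: `make_tasks`, `run_grid`, the agent names and the `leaves_after` parameter are all made up for illustration. It only shows the idea of one command, many argument values, and a task going back in the queue when an agent drops out.

```python
from collections import deque


def make_tasks(command, flag, values):
    """One argument list per value, e.g. ['render', '-f', '1']."""
    return [[command, flag, str(v)] for v in values]


def run_grid(tasks, agents, leaves_after=None):
    """Simulate a controller handing out tasks round by round.

    leaves_after maps an agent name to the round in which it drops out;
    the task it was handed in that round goes back on the queue.
    Returns a dict mapping each task (as a tuple) to the agent that ran it.
    """
    leaves_after = leaves_after or {}
    queue = deque(tasks)
    active = list(agents)
    done = {}
    round_no = 0
    while queue:
        if not active:
            raise RuntimeError("no agents left to run the remaining tasks")
        round_no += 1
        for agent in list(active):  # iterate over a copy; agents may leave
            if not queue:
                break
            task = queue.popleft()
            if leaves_after.get(agent) == round_no:
                # Agent de-registers mid-task: requeue and stop using it.
                queue.append(task)
                active.remove(agent)
            else:
                done[tuple(task)] = agent
    return done


tasks = make_tasks("render", "-f", range(1, 721))
done = run_grid(tasks, ["mac-a", "mac-b", "linux-c"], leaves_after={"mac-b": 3})
print(len(done), "frames rendered")
```

In this run the hypothetical `mac-b` leaves in round 3, so the frame it was holding gets picked up later by one of the other two agents.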
And you don't have to have any special software for this. It can be done right now with the tools that already exist in Preview 2.
I write in my journal
- Tjp
I am in wallow with my inner money grubbing capitalistic pig. ... Oink!
for example, to halve the video compression time of iMovie when making a DVD.
Video compression is a difficult task to parallelize. If each frame were compressed individually it'd be easy: just send an uncompressed frame to a node and get the compressed frame back. But that's not how it works.
Now, for something like Pixlet, which is frame-based, there's the possibility of distributing the task. But you will never use Pixlet. It was designed to compress 2K or 1080 material losslessly at a ratio of about 2:1. Very specific tool for a very specific purpose.
So using Xgrid for video compression isn't going to be the wonder that you might wish it could be.
The other packages require a bit of planning, whereas Xgrid excels at locating nearby resources for pawning off processing tasks. Rendezvous (ZeroConf) is exactly about the need for ad hoc networking. Xgrid extends that to the cluster...
There exists no way of exchanging information without making judgments. --Bene Gesserit Axiom
Maybe. I don't know enough about the internals of how compression works, but that sounds plausible enough to fool me.
Aye, you don't need to be running an XCode application. My friend and I were running XGrid on our PowerBooks (867 MHz Ti and 1.5 GHz Al), and he was able to write something that would allow us to use our combined power for Blender rendering. It was rather awesome, breaking 2.something GHz on G4 processors.
Mind you, I don't know how he did it, as I am still a code monkey-in-training.
"Either that or OpenMosix"
Xgrid treats the cluster as one processor, while OpenMosix assigns each thread to a CPU that's not doing much work.
Facts do not cease to exist because they are ignored. -Aldous Huxley
Here are three gotchas that can make this sort of thing less appealing than it may seem at first:
Problem type: The problem may not be well suited to running on a bunch of PCs (especially when the agent app isn't allowed to take 100% of the machine's resources to accomplish the task) over typical office networks. Basically if the app needs to communicate frequently with other nodes, or if a huge data set is involved (or both), latency or bandwidth issues might outweigh the possible advantage of putting more CPUs to work.
Security: The data may be highly sensitive, in which case you might not want to put it on ordinary desktop PCs that might have untrustworthy users, spyware, etc.
Configuration: The configuration of your office's PCs may vary enough to make the cost of getting a companywide desktop cluster working unacceptably high. You'd have to pick a few target configurations and settle for that. Hopefully drivers and such wouldn't matter as much as CPU, RAM, disk, and OS version, but there are still companies that are just now getting their desktops updated to Win2K. There's also the headache of installing yet another required application on a large number of heterogeneous machines, which is virtually guaranteed to result in confusing installation problems. Oops, our app crashes if the user has this or that service pack installed. Oops, our app requires strong encryption. You could build your app on top of some sort of moderately portable framework or VM or whatever but that will have system requirements too, and probably will have some surprising gotchas when deployed in a real-world environment.
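A back-of-the-envelope way to check the first gotcha (problem type) before bothering with the other two: estimate wall time as compute split across nodes, plus data transfer, plus per-message latency. This is a deliberately crude model under made-up numbers; `grid_time` and every figure below are assumptions for illustration, not measurements of any real grid.

```python
def grid_time(compute_s, nodes, data_mb, bandwidth_mbps, latency_s, messages):
    """Rough wall-time estimate in seconds.

    Compute is split evenly across nodes; all data crosses one shared link;
    each message costs one latency delay. Real networks are messier.
    """
    transfer_s = data_mb * 8 / bandwidth_mbps  # megabytes -> megabits
    return compute_s / nodes + transfer_s + messages * latency_s


# One hour of work on a single machine, no network involved.
single = grid_time(3600, 1, 0, 100, 0.001, 0)

# Same work on 10 nodes, but 20 GB must cross a 100 Mbit/s office LAN.
heavy_data = grid_time(3600, 10, 20000, 100, 0.001, 0)

# Same work on 10 nodes with only 100 MB of data.
light_data = grid_time(3600, 10, 100, 100, 0.001, 0)

print(single, heavy_data, light_data)
```

With these numbers the heavy-data job spends far longer moving bytes than computing, so ten machines buy you less than a 2x speedup, while the light-data job gets close to the full 10x.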
Unless ntohs is an inline function. Most compilers will optimize out inlines that return their calling argument unchanged.
unless you're linking your libc statically, it can't be inline. it similarly can't be inline if you use a function pointer to it in some fashion.
Of course reality differs and they are actually null macros on OS/X
then osx has a broken bsd socket implementation. ntohs should be a function. that is, you should be able to take a function pointer to it and all the others (something you cannot do with a macro), and any code that relies on this will break on osx.
Compressing video is very easy to do in a parallel manner. The first step is to perform a DCT (or DWT or similar) on each frame. This is embarrassingly parallel, especially for DCT where each macro block (usually an 8x8 pixel square) can be done in parallel. Next some form of quantisation is applied to the result. This, again, can be done in parallel on a per-frame basis. Finally, delta frames are computed for inter-frame compression on codecs that support it (MPEG and friends). Since key frames are usually a fixed number of frames apart you can simply have a number of nodes running these in parallel for each key frame block. If they are not, then the problem is a little more tricky but basically doable (I'd imagine that you'd start at fixed points, and insert key frames as required on fixed size blocks, possibly requiring some backtracking).
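The key-frame-block idea above can be sketched like this. It is a toy, not a real codec: "frames" are short lists of integers, `encode_gop` just stores the first frame whole and the rest as differences, and the thread pool only stands in for grid nodes (threads in CPython won't actually speed up this arithmetic). All function names and the block size of 12 are assumptions for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_gops(frames, gop_size):
    """Cut the frame list into blocks that each start with a key frame."""
    return [frames[i:i + gop_size] for i in range(0, len(frames), gop_size)]


def encode_gop(gop):
    """Toy encoder: keep the first frame whole, store the rest as deltas."""
    key = gop[0]
    deltas = []
    prev = key
    for frame in gop[1:]:
        deltas.append([b - a for a, b in zip(prev, frame)])
        prev = frame
    return key, deltas


def decode_gop(encoded):
    """Rebuild the frames of one block from its key frame and deltas."""
    key, deltas = encoded
    frames = [key]
    for delta in deltas:
        frames.append([a + d for a, d in zip(frames[-1], delta)])
    return frames


# 30 fake frames of 4 "pixels" each.
frames = [[n, n + 1, n * 2, 7] for n in range(30)]
gops = split_into_gops(frames, gop_size=12)

# Each block depends only on its own key frame, so blocks can be encoded
# independently; pool.map returns results in the original order.
with ThreadPoolExecutor(max_workers=4) as pool:
    encoded = list(pool.map(encode_gop, gops))

decoded = [frame for enc in encoded for frame in decode_gop(enc)]
print(len(gops), "blocks,", len(decoded), "frames round-tripped")
```

The point is only that no block needs anything from its neighbours, which is what makes the fixed-interval case easy to farm out.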
I am TheRaven on Soylent News
Actually, that is completely false:
As 30 seconds of Googling will tell you, distcc [is] a fast, free distributed C/C++ compiler.
As they have done with KDE's KHTML engine in Safari, so is Samba's distcc engine being used in XCode.
Care to try again ?
:-)
DO NOT LEAVE IT IS NOT REAL
Well, there's Darwin, their (improved, IMnsHO) version of BSD.
Rendezvous is their (improved) version of ZeroConf.
Safari runs on the KHTML engine. Apple made some improvements and gave them back to the KHTML people, who thanked and praised Apple.
They've worked to improve gcc on PPC-based machines.
They also provide the standard tools like apache, perl, python, etc etc etc, with OS X. I don't know if they have worked on these specifically, but it wouldn't surprise me in the least.
The Independent: Reverend Spooner Arrested in Friar Tuck Incident - ISIHAC, Historical Headlines
If Apple breaks this intentionally (meaning not for adding significant, enhanced functionality) in their next release, I will stand with you as an anti-Apple nay-saying zealot and deride them all up and down /.
-Potentially recovering Mac zealot (it's so hard with WWDC right around the corner :-( )
sigs are for fools and trolls. no signature is *always* appropriate. you should turn them off in your preferences.
Receiver swaps.
In DCE RPC, the receiver does the byte swapping, if necessary. One of the main reasons Windows network services are built on DCE RPC is that between homogeneous systems, there's no swapping taking place: all that data goes out in host byte order, and there's no such thing as network byte order.
One of the big arguments about this had to do with Windows machines on Intel not "playing fair" with systems that natively implement network byte order as their host byte order. When talking to Intel boxes, these machines incur additional overhead.
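A rough sketch of the "receiver swaps" scheme, for contrast with always converting to network byte order. The one-byte tag used here is a toy, not the real DCE RPC header layout, and `send` and `receive` are names invented for the example.

```python
import sys

# Toy one-byte byte-order tags; NOT the actual DCE RPC wire format.
LITTLE, BIG = b"L", b"B"


def send(value, sender_order=sys.byteorder):
    """Sender writes the 32-bit integer in its own byte order and tags it."""
    tag = LITTLE if sender_order == "little" else BIG
    return tag + value.to_bytes(4, sender_order)


def receive(message, receiver_order=sys.byteorder):
    """Receiver reads the payload natively and swaps only if the tag differs.

    Returns (value, swapped).
    """
    tag, payload = message[:1], message[1:]
    sender_order = "little" if tag == LITTLE else "big"
    value = int.from_bytes(payload, receiver_order)
    swapped = sender_order != receiver_order
    if swapped:
        # Reverse the four bytes to recover the sender's value.
        value = int.from_bytes(value.to_bytes(4, "little"), "big")
    return value, swapped


# Two little-endian hosts: no swap anywhere.
print(receive(send(0x0A0B0C0D, "little"), "little"))
# Little-endian sender, big-endian receiver: exactly one swap, on receipt.
print(receive(send(0x0A0B0C0D, "little"), "big"))
```

The orders are passed in explicitly so both cases can be tried on one machine; by default each side would use its own `sys.byteorder`.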
This also gives a big disadvantage to servers whose byte order doesn't match that of their predominant clients.
Actually, from a computational overhead point of view, a more correct approach would have been to have "client swaps to server byte order", to put the computational overhead on the most efficient side of the link for it (by offloading the most computationally loaded component, the server).
As far as I recollect, this lost out in committee to people who were arguing against it in order to have leverage to enforce vendor lock-in for both clients and servers. 8-(.
-- Terry
He probably did it kinda like this:
http://www.atpm.com/10.06/blender.shtml
***General Consultant to the Human Race*** My opinions are free. You get what you pay for.
unless you're linking your libc statically, it can't be inline. it similarly can't be inline if you use a function pointer to it in some fashion.
Don't be silly. (Do the moderators actually think before scoring +1?) gcc is perfectly capable of inlining functions even when glibc is dynamically linked. It can also inline functions whose address is taken, just by generating a separate copy. Any other compiler clever enough to have inlines is very likely to do the same.
Better let the BSD team know that then, because they'll surely want to make sure their code complies with the "bsd socket implementation" spec you mention. Or... and here's a crazy idea, you could realise you're wrong and that Apple didn't decide to deliberately break the BSD code they used and actually have a very similar implementation to the BSD code.
Source: http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/sys/e* Nothing is foolproof to a sufficiently talented fool *
This was posted on the Apple SciTech and Clusters lists:
Apple and a third-party partner are looking to target a few key
applications with the hope of developing parallel versions that would
benefit from computational clusters. As many of you know,
embarrassingly parallel algorithms like BLAST are easily written to
take advantage of clusters. There is a large set of problems, however,
where this is not the case. We would like to find some of these more
difficult applications and find a way to parallelize them using some
interesting technologies developed by our partner.
I'd like to solicit feedback from the members of these mailing lists
with respect to choosing two or three "killer applications" that, if
parallelized, would present an immediate value to their respective
users. We have a few in mind, but I'd like to leave the question
open-ended. Any science is equally applicable -- bioinformatics,
molecular dynamics, physics, engineering, etc. We would prefer to work
with open source applications.
Feel free to reply to me directly, or to the entire list.
Regards,
Matt
--
Matt MacInnis
Research and HPC Manager
Higher Education
Apple Computer, Inc.
Office 408-974-6322 / Mobile 408-203-1001