Renderfarm Setup Tips?
"In the hardware side, we still haven't made a choice between using AMD's Opteron or Apple's Xserve G5 (they have some very nice and price convenient cluster nodes which seem to be ideal for this kind of job), with Linux. As for the networking between them, is Gigaethernet enough or should we be going for Fiber? The software used to manage the render queues is another important point as well: I've been looking into Rush, and even though it's a commercial package, it works on all of the platforms we currently use (W2k/XP, Irix, OS X and Linux). But then there is also Dr. Queue, which is open source and is supported on at least the *NIX members of the aforementioned OS's. Other options include RenderPal and Pixar's RenderMan, but I would prefer an F/OSS alternative. Finally, it's worth noting that we'll be using the renderfarm for Maya and Adobe AfterEffects."
Peace
... but it was rejected. How do you deal with terabytes of data (50+ TB), all in a single directory tree, all must be accessible to every node? This is larger than what you can store on a single filer. Also, for performance reasons, the data must be separated across multiple filers. Currently we use lots of symlinks to tie it all together into a single logical directory tree, but that's a really ugly solution. There's got to be a better way. Right? Anyone?
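For anyone who hasn't seen the symlink approach described above, here is a minimal Python sketch of it. The function names, the directory names and the idea of a `placement` table are all hypothetical, and it assumes POSIX-style symlinks on the nodes. It illustrates the workaround, not a better answer to the question.

```python
import os

def link_tree(logical_root, placement):
    """Create logical_root/<name> -> <filer path> symlinks for each placement entry.

    `placement` is a hypothetical table mapping a top-level directory name
    to the physical location on whichever filer holds that data.
    """
    os.makedirs(logical_root, exist_ok=True)
    for name, target in placement.items():
        link = os.path.join(logical_root, name)
        if os.path.islink(link):
            os.remove(link)  # re-point the link when data moves between filers
        os.symlink(target, link)

def resolve(logical_root, relative_path):
    """Return the physical path behind a logical one, following symlinks."""
    return os.path.realpath(os.path.join(logical_root, relative_path))
```

Every node would need the same filers mounted at the same paths for the links to resolve identically everywhere, which is part of why this gets ugly at 50+ TB.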
___
If you think big enough, you'll never have to do it.
There is a fiber interface (Myrinet) to each node used by the MPI crowd, but our rendering group doesn't use it; they seem content with the performance of Ethernet over copper. Your needs may be different, of course, but latency isn't really an issue for rendering, and copper should provide all the bandwidth you need.
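A quick back-of-the-envelope check of the bandwidth claim, as a Python sketch. The per-frame data size and render time you plug in are your own numbers (the ones in the note below are made up for illustration), and the calculation assumes full line rate with no protocol overhead, so treat the result as a lower bound on transfer time.

```python
def transfer_seconds(megabytes, link_gbit_per_s=1.0):
    """Seconds to move `megabytes` over a link at full line rate (no overhead)."""
    megabytes_per_second = link_gbit_per_s * 1000 / 8  # 1 Gbit/s = 125 MB/s
    return megabytes / megabytes_per_second

def io_fraction(megabytes, render_seconds, link_gbit_per_s=1.0):
    """Share of a frame's wall-clock time spent on the wire rather than rendering."""
    wire = transfer_seconds(megabytes, link_gbit_per_s)
    return wire / (wire + render_seconds)
```

With a hypothetical 500 MB of scene data per frame, `transfer_seconds(500)` comes to 4 seconds on Gigabit Ethernet, which is small next to a render that takes minutes. Keep in mind that the file server's uplink is shared by every node pulling data at once, so that is the link to size carefully.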
I'm not knowledgeable regarding all the software packages you list there, but I'm wondering if any of them would really take advantage of a 64-bit kernel (either on Opteron or G5-PPC970). Of course you can put a PPC version of Linux on the Xserve, but not without sacrificing nearly all Apple-provided management. If you expand the cluster to a large number of nodes, or even if you keep a small number of nodes but place it in a remote location, Xserve running Linux would be painful to manage (no remote power-off/-on, remote console problems). Xserve is shiny and has the requisite blue LEDs, but an AMD or Intel box (from the right vendor) would be much easier to manage remotely.
My question is - when is someone going to make a blade with PCI-Express and enough room for the latest batch of cards from NVidia and ATI?
Seriously - this is going to become a huge issue, as more rendering is pushed out to the stream processor that is the GPU.
Education is the silver bullet.
We have a 50-node dual 3 GHz render farm, 20 on W2k and 30 on Linux 2.4.20 (OK, Red Hat 9). We are currently using Muster (www.vvertex.com), but are not that happy with it. It was reasonably priced, and the developers are very helpful, but it's not quite there yet. I have been seriously considering building my own using a SQL database (currently PostgreSQL, but may switch to MySQL) and Perl. A render manager is really just a database with a bit of network sockets and scripts to run occasionally. A simple concept that is probably going to come back and bite me :)
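To make the "just a database" idea concrete, here is a rough sketch of the queue half. It swaps in Python and the standard-library `sqlite3` module in place of the Perl and PostgreSQL mentioned above, purely so it runs anywhere. The `frames` table, its columns and the function names are all made up for illustration, and the socket and script parts are left out.

```python
import sqlite3

def open_queue(path=":memory:"):
    """Create the hypothetical frames table if needed and return a connection."""
    db = sqlite3.connect(path)
    # status is one of: queued, running, done
    db.execute(
        """CREATE TABLE IF NOT EXISTS frames (
               id     INTEGER PRIMARY KEY,
               job    TEXT NOT NULL,
               frame  INTEGER NOT NULL,
               status TEXT NOT NULL DEFAULT 'queued',
               node   TEXT
           )"""
    )
    db.commit()
    return db

def submit(db, job, first, last):
    """Queue one row per frame in the inclusive range first..last."""
    with db:  # commits on success, rolls back on error
        db.executemany(
            "INSERT INTO frames (job, frame) VALUES (?, ?)",
            [(job, f) for f in range(first, last + 1)],
        )

def claim(db, node):
    """Hand the oldest queued frame to `node`; return None if nothing is queued.

    Single-dispatcher sketch: with several dispatchers, the SELECT and the
    UPDATE would need a lock or one atomic statement to avoid double claims.
    """
    with db:
        row = db.execute(
            "SELECT id, job, frame FROM frames "
            "WHERE status = 'queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        db.execute(
            "UPDATE frames SET status = 'running', node = ? WHERE id = ?",
            (node, row[0]),
        )
        return row

def finish(db, frame_id):
    """Mark a claimed frame as rendered."""
    with db:
        db.execute("UPDATE frames SET status = 'done' WHERE id = ?", (frame_id,))
```

A dispatcher would call `submit(db, "shot_010", 1, 240)` once per job, then `claim(db, "node01")` whenever a node asks for work and `finish(db, frame_id)` when the frame comes back. The parts that tend to bite are the ones not shown: dead nodes holding 'running' frames, retries, and priorities.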
--
Hehe, but to me, fast IS nice. =)
I'm not a Machead, I'm an x86head. Always will be.
- It's not the Macs I hate. It's Digg users. -
This is a fun idea in the abstract, but if you're looking for concrete advice, you need to give us some concrete data. Most importantly, -what- are you rendering?
If you're at a university and you're doing some sort of bioinformatics visualization, use whatever the researchers are most comfortable with. The odds are good that this is whatever the CS department was teaching on 5 years ago. Probably Suns or Windows machines. Slave... errr, grad student labor is cheap, so use an OSS scheduling and job management system if you can.
At most other places, a similar rule applies: use whatever the users are most comfortable with. If you're using Mac workstations and software, then it may make sense to go with a G5 rendering farm. If you're using Windows... well, okay. Windows render farms still suck, but at least buy PCs to leave your options open. Unless you're a really large organization (that is, the sort that doesn't have to resort to Ask Slashdot for research), you probably want to use products that come with support contracts. $20k/year is a pretty good deal when compared to keeping a full-time support person for the same task.
--
We did this because we primarily use Discreet's 3dsmax (with Brazil and V-Ray) and Eyeon's Digital Fusion. We have found that most existing render farm solutions do not support these two packages very well -- thus we decided to develop our own custom solution. We also support After Effects, Alias|Maya, AIR and other RenderMan compliant rendering packages.
Of interest to the general Slashdot crowd may be that this Deadline Render Management Solution is based on the open source (BSD License) Exocortex C# library originally released with this C# 3D Engine. Deadline is built with C# in the hopes that using Mono we will be able to start supporting Linux with minimal extra effort.
I'll be reading all the posts on this Slashdot thread but I would also appreciate any direct feedback on our current beta product. We also found solutions such as Rush and Smedge to be less than user friendly in many respects. Thus we have tried as best as we could to increase a 3D package that is not well supported by most render farm management solutions -- except for Discreet's Backburner (which we found not that scalable).
Welp, your post sparked some debate here at the office. I'm at a small studio making a full length animated movie. One aspect of it we've been chewing on is what to get in terms of render farm down the road. I just had a few questions for you, if you don't mind:
1.) The words 'dual' and 'Opteron' both surprised us. We were kind of under the impression that maybe single proc machines would be better for a render farm. We were really curious why dual was chosen over single. Did the extra cost end up being worth it?
2.) You mentioned that Opteron was more efficient than Xeons. I just had to ask: Was the particular software you were using particularly tuned to Opteron (i.e. 64-bit?) or was the 32-bit side of it just pleasant to work with? Any more insight you can share with me about the use of Opteron would be most helpful.
3.) Did you guys end up buying a bunch of machines from a place like IBM or something, or was it more like "we bought the components and assembled ourselves..?" If it's the former, how'd you like the service?
4.) Any regrets or things you'd do differently next time around?
5.) Why are you getting rid of the machines used for Riddick? Or did I read that wrong?
Appreciated,
NanoG
"Derp de derp."
You're right. It probably is. However, Slashdot is supposed to be a community, right? So what's wrong with asking the community for help? That is one of the things community is about.
I don't see the problem with asking here. You can actually get a lot more insight from a lot of different backgrounds in one place. Yeah, you have to weed through the posts to find the gems, but moderation helps some with that.
You elitist pigs are starting to bug me. We've all needed help from time to time, yourself included I'm sure. Don't knock others for asking for help.
CromeDome
Are you able to tell us which productions these machines were involved in rendering?
These particular machines were just used for The Chronicles of Riddick. Computer technology advances so fast, has lowered in price so quickly, and movie post-production schedules are so long (six to nine months, typically) that we typically don't use any particular machines for more than a couple of films.
Also, in the interest of understanding how much it costs to set up a significant render farm, how much does this sort of thing cost? Is it all in the PCs, or would the backplane infrastructure cost surprise us a lot?
In fact the dominant cost, at least for us, is not the render boxes themselves. The network is a significant expense, as is the data server system. An even larger expense, though, are the licenses for the rendering software. Top-of-the-line rendering systems like RenderMan (for 3D) and Shake (for 2D) cost thousands of dollars per node. And then there are significant infrastructure costs in just electrical wiring and cooling.
At least in the 10-to-50 server range, I would say that the costs are pretty linear. As you get bigger than that, you can start to see some economies of scale.
At some point, it becomes profitable to start developing in-house software tools instead of buying licenses. Digital Domain's Nuke system was originally developed as a renderer for Flame, for example, so that the expensive Flame machines could be used for the interactive work and the batch rendering could happen on commodity hardware. For Riddick, we developed our own smoke-rendering system rather than use RenderMan, to free up render licenses for other parts of the movie.
I'm afraid that an explicit cost-per-node breakdown would get into details that we keep confidential, but this should give you an overview of our situation.
Thad Beier
Hammerhead Productions
P.S. We don't do Videos, we make Films.
I love Mondays. On a Monday, anything is possible.
What did you think of the freeware options, e.g. Aqsis?
"Orthodoxy is unconsciousness" - Orwell
Well, I should have mentioned that our largest line item, by far, is the animators who drive all of this software. At least on a project of the scale of Riddick, it was best to use software with which our animators would be most productive. For rendering, that means RenderMan. There are many people who have strong experience in writing shaders and doing lighting in RenderMan -- and RenderMan is practically bullet-proof after decades now of work.
We did try a couple of other rendering alternatives. We hoped to be able to use the new Cg shading language to do hardware rendering for some of the shots. We hired an absolute wizard named Hal Bertram to help us with that, but in the end decided that the hardware (at the time) was still not good enough to do final rendering. Interestingly, with the new NV40 chips incorporating Vertex Shader 3.0 and Pixel Shader 3.0 capabilities, almost all of those limitations have been overcome, and we'll be examining hardware rendering anew for our future projects. It seems inevitable to me that in the near future, we will be using hardware rendering for most of our shots -- the quality of what can be rendered with hardware is improving at a breathtaking pace.
We also looked at some other non-free and free (although not open-source) software renderers, some even claiming to be compatible with RenderMan. While they are truly amazing efforts by small teams, we very quickly ran into devastating problems with each of them. I really hate to mention who they are, because they are trying so hard -- but they're not there yet.
One renderer that I'm extremely excited about is Splutterfish's Brazil. It renders some spectacular images mind-bogglingly fast. In particular, it does what has come to be known as "ambient occlusion" so fast that I just can't believe it. Unfortunately, it only runs on the Operating System That Will Not Be Named.
So, in the end, we spent the money for RenderMan to make best use of the really expensive resource, our animators. Having a familiar environment, and one where everything worked every time with no "gotchas", was worth it. [OK, we did run into some RenderMan surprises...but nothing too serious].
Thad Beier
Hammerhead Productions
thad@hammerhead.com
I love Mondays. On a Monday, anything is possible.