Pixar Eclipses Sun with Linux/Intel
lieutenant writes "Pixar Animation Studios is replacing servers from Sun in its render farm with eight new blade servers from Rackspace. In all, the blade system contains 1,024 Intel 2.8GHz Xeon processors, and it runs the open-source Linux operating system. Pixar has ported its Renderman software to run on Linux." I'd love to see their electric bill ;)
For around $25,000 you too can make Pixar quality movies (+ the cost of those servers). https://renderman.pixar.com/
Not 1024 CPUs in one box. Each CPU sits on a "blade" card and acts like a separate system. It's a big cluster.
1024 is 2^10. Computers operate in binary, and 1024 is an "even" number when you consider binary.
If you have a task that can be easily partitioned off (oh like each individual frame would be an easy break for this) you can send each piece to a different machine, allowing you to parallelize the task.
This is a poor man's version of NUMA (Non Uniform Memory Access) created and popularized by Sequent (now a division of IBM) where rather than have a single pool of addressable memory, you have multiple pools of memory, some with very fast access, some with slower.
What I am wondering is what do they do for the cluster cross connect. In large scale cluster environments, this tends to be a significant bottleneck. In large scale clusters you start seeing things like HIPPI, VIA, and soon to be Infiniband... wonder what this is stocked up with
I have mod points and I am not afraid to use them
As far as I know Rackspace is a managed hosting company. Rackable Systems makes servers - Yahoo and Google both use them. Anyone know if the article has it wrong, and Pixar is actually using Rackable machines?
1,024 Intel 2.8GHz Xeon processors... I'd love to see their electric bill
Well, ignoring the power requirements of RAM, bus controllers, network adapters, hard disks which are probably used for boot only...
Intel rates these things for 74.0W thermal dissipation, which is a pretty good measure of the electrical power consumed... since, unless something is badly wrong, your Xeon chip will not dissipate energy as light or sound.
74W x 1,024 = 75,776W continuous.
Assume they're on 24/7. Assume a cost of $0.06 per kWh, including distribution, debt retirement, Ontario's capped electric rates, etc.
There are 30 days in the average month. There are 24 hours in the average day [grin]. Therefore, there are 720 hours per month.
720 hours @ 75,776W = 54,558,720Wh, or about 54,559kWh.
Just a little over $3,200 per month.
I'd imagine it's less than that; their electric rate is probably somewhat less based on their consumption. But consider that the depreciation on that hardware is probably a greater monthly expense than the electricity to power it...
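For anyone who wants to check the arithmetic, here is a quick sketch with the watt to kilowatt conversion made explicit. The inputs are the figures from the post (74W per CPU, 1,024 CPUs, 720 hours, $0.06 per kWh); the function name is made up for illustration, and it still ignores everything except the CPUs.

```python
# Rough monthly electricity cost for the CPUs alone.
# All inputs are the figures quoted in the post, not measured values.
WATTS_PER_CPU = 74.0        # Intel's thermal rating, per the post
CPU_COUNT = 1024
HOURS_PER_MONTH = 30 * 24   # 720
DOLLARS_PER_KWH = 0.06      # assumed rate from the post


def monthly_cost(watts_per_cpu, cpu_count, hours, rate_per_kwh):
    """Return (total_kw, kwh, dollars) for a constant load."""
    total_kw = watts_per_cpu * cpu_count / 1000.0  # W -> kW
    kwh = total_kw * hours
    return total_kw, kwh, kwh * rate_per_kwh


total_kw, kwh, dollars = monthly_cost(
    WATTS_PER_CPU, CPU_COUNT, HOURS_PER_MONTH, DOLLARS_PER_KWH
)
print(f"{total_kw:.3f} kW continuous")
print(f"{kwh:,.2f} kWh per month")
print(f"${dollars:,.2f} per month")
```

Dividing by 1000 before multiplying by hours keeps the result in kWh, which is the unit the rate is quoted in.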
I'm glad Linux is ready for Pixar, because Linux sure ain't ready for the desktop.
Fire and Meat. Yummy.
My god, I thought they had trouble scaling Linux that far. Seriously. How the hell do you do that when "stock" Linux doesn't like 8 CPUs?
Because it's not a single system image. Rendering movies is easy to parallelize because you don't need to have one scene rendered before you can render the next; all the information you need is in the model file.
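A minimal sketch of that idea, assuming each frame depends only on its own inputs. The names render_frame and render_shot are hypothetical, and a thread pool stands in for the separate machines; a real farm would hand frames out through a queue manager, and nothing here is Pixar's actual setup.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(frame_number):
    """Stand-in for a real render: depends only on its own frame number."""
    # A real job would load the model file for this frame here.
    return f"frame_{frame_number:04d}.tif"


def render_shot(first, last, workers=8):
    """Render frames first..last inclusive on `workers` independent workers."""
    frames = range(first, last + 1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in frame order even though the
        # individual frames may finish in any order.
        return list(pool.map(render_frame, frames))


outputs = render_shot(1, 24)
print(len(outputs), outputs[0], outputs[-1])
```

No worker ever waits on another worker's result, which is why adding machines keeps helping.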
I must be lost here, but most of the render farms I've seen that use Sun products use them for network storage, though Sun is even losing that market share these days. I think what people are starting to realize is that just because you paid a whole lot for it, doesn't mean you got "The Best".
Supercomputers of 5 years ago can be built today from computers that are being thrown away, set up as a computing cluster. Obviously the 40 trillion dollar supercomputers of the good old days, paid for by the government, aren't the supercomputers of today.
Ignore the "p2p is theft" trolls, they're just uninformed
They have half-depth 1U boxes. That's right, two servers in 1U, back to back.
Includes space between the two for cabling and cooling.
They specialize in delivering easy to manage (physically) racks of highly commoditized systems.
(I work with them in a reseller relationship)
Imagine a 71U rack (minus 1U for a switch), with 142 boxes, all dual proc. 284 procs in a rack!
Man, I wish they'd put the right link in there.
Striving to achieve a lower state of consciousness
My god, I thought they had trouble scaling Linux that far. Seriously. How the hell do you do that when "stock" Linux doesn't like 8 CPUs?
I often see this misconception about multiprocessor machines. Some machines have a true tightly coupled multiprocessor architecture with a shared memory space, like big iron machines from SGI, Sun, and HP. These can be used to run a multithreaded process to speed up time-to-solution for a task. The speed-up is subject to the usual Amdahl's Law restrictions. The blade server machines, like the ones Pixar is using, are 'tightly bolted' multiprocessors which share mechanical components and power supplies, but they effectively look like separate computers. Possibly some of the blades have shared-memory multiprocessors, but no more than 2-4 CPUs per blade. Separate instances of the OS run on each blade.
For easy to partition tasks like computer graphic rendering, each frame render task can be run single threaded, and there can be many tasks running at the same time. The time-to-solution for a single rendered frame is not reduced by parallelization, but the overall throughput is increased by multiple tasks.
Nine women cannot make a baby in one month, but nine women can make nine babies in nine months.
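To put rough numbers on the difference, here is a small sketch contrasting Amdahl's Law for one job with plain throughput for many independent jobs. The 95% parallel fraction and the 2 hours per frame are made-up illustration values, not anything from Pixar.

```python
def amdahl_speedup(parallel_fraction, n_cpus):
    """Amdahl's Law: speedup of ONE job whose parallel part is `parallel_fraction`."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cpus)


def farm_throughput(n_cpus, hours_per_frame):
    """Frames finished per hour when each CPU renders its own frame."""
    return n_cpus / hours_per_frame


# Hypothetical numbers, chosen only for illustration.
print(amdahl_speedup(0.95, 1024))   # can never exceed 1 / 0.05 = 20
print(farm_throughput(1024, 2.0))   # 512.0 frames per hour
```

The single job tops out just under 20x no matter how many CPUs you throw at it, while the farm's throughput grows linearly with CPU count even though each frame still takes its full 2 hours.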
I recently attended a talk by Google's chief engineer. They have approximately 15,000 x86 machines running Linux at seven data centers in the United States.
Weird failures occur so often, such as disks returning garbage without the controller informing the OS, that Google does a checksum on _every_ data structure in their user-level software. He also talked about how Linux is good enough for them, but it doesn't perform well with respect to I/O under heavy load. He says they like Linux because they have the source code, and that they minimize excessive I/O loads on their machines. Nobody asked why they don't use FreeBSD, but I suspect it's because Linux has better hardware support and Google builds their own machines with numerous different components based on the latest technology.
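The talk didn't say which checksum they use or how it is stored, so the following is only a sketch of the general idea: keep a checksum next to each record in user space and verify it on every read. CRC32 via Python's zlib is my own choice, and pack/unpack are hypothetical names.

```python
import zlib


def pack(payload: bytes) -> bytes:
    """Prefix the payload with a 4-byte CRC32 of its contents."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return crc.to_bytes(4, "big") + payload


def unpack(record: bytes) -> bytes:
    """Return the payload, or raise ValueError if the checksum does not match."""
    if len(record) < 4:
        raise ValueError("record too short")
    stored = int.from_bytes(record[:4], "big")
    payload = record[4:]
    if zlib.crc32(payload) & 0xFFFFFFFF != stored:
        raise ValueError("checksum mismatch: data corrupted")
    return payload


record = pack(b"index shard 42")
assert unpack(record) == b"index shard 42"

# Simulate a disk silently flipping one bit in the payload.
damaged = record[:-1] + bytes([record[-1] ^ 0x01])
try:
    unpack(damaged)
except ValueError as err:
    print(err)
```

The point is that the application catches the corruption itself instead of trusting the disk controller to report it.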
Sun isn't about raw CPU power. For that we have POWER and x86. Sun is about massive scaling. Sure, 1 POWER4 or P4 or Athlon beats an Ultrasparc. And 8 USIIIs lose out to 8 POWER4s or Xeons or Hammer CPUs. But Intel and AMD drop off at about 8P systems (though ItaniumII can handle larger systems, and Opteron can scale past 8P with a HT bridge), and the POWER architecture scales to hundreds of processors. Sun though can pack a thousand chips in a single system image, with plans to scale to 4096 (IIRC) within the next 2 years.
I'm sure Sun would love to have a high-performance CPU to field against massive clusters being deployed for highly parallelizable tasks such as rendering, but the fact is that's not where their strengths lie. Huge tasks which cannot be efficiently split are what Sun is good at, tasks where superb scalability in terms of both CPU power and memory are an absolute must.
For more, read Ace's Hardware's excellent volume multiprocessor articles:
Part 1
Part 2
Part 3
High-speed Road Trip (18.000KPH)
Pixar is on the right track. I do ASIC verification, mainly on Sun boxes (fastest USparc IIIs, multi-processor, 14GB of memory, etc). Lately, I have been running the exact same jobs on an LSF-enabled Linux farm of Intel boxes.
The improvement is a 3-4 times speedup, i.e. 8 hour Sun jobs take 2 hours on Intels.
For the price of one dual processor Sun workstation, you can get ten Intel boxes running Linux.
Not only is the speedup great, I need fewer licenses to run the CAD software (doing multiple regression jobs). Since a license seat per CAD tool can run from 30K to 200K each plus a 10% a year maintenance fee, the savings are huge.
Changing over to linux was trivial. I like and have used Suns for years and Suns were a major player in this industry. But I firmly believe that this paradigm is going to be a SUN KILLER!
Residences aren't generally penalized for poor power factor like commercial operations are. Normal residential meters measure true wattage. Even if your power factor is lousy, you'll only get billed for actual watt-hours used.
The parent post is somewhat misleading and more than a little spotty, but it got modded up, so I feel I should clarify.
... we hate you because of your strives for world domination, but then you go and support linux ... bastards we just love to hate you.
That Sun had tried renderman (or whatever they call it) to run on 32 bit processors and it was a horrible disaster. Something about how it seemed more feasible and cost efficient to use Sun until the days in which the competing 64 bit processors became cheaper.
Renderman is a standard for exporting frames to a renderer. Pixar's implementation is called Photorealistic Renderman. Sun is not involved in this at all. It has run on x86 procs, as well as Linux, for quite a while now. Renderers are relatively easy to port, especially between different Unixes. I am not sure if there are speed advantages to 64 bit computers, or if it is just accuracy and memory like always, which is still a big advantage for a renderer (can anyone clarify?). I have a PRman rendered image on my desktop right now on my 450 MHz PIII. The above quote is pretty much completely false.
Doesn't DreamWorks use this type of technology already?
The technology is just running off-the-shelf software and hardware. Different parts of DreamWorks do use Linux heavily.
Damned MPAA members
This is horribly misinformed. I don't have the energy to go into the whole issue here, but suffice to say that this is wildly misplaced frustration. First of all, Pixar is not a member of the MPAA. They have a deal with Disney, which is. That attitude would be fitting and understandable with Disney for various reasons, but making Pixar your enemy is just wrong (except when they sued Larry Gritz personally to hold off competition to Renderman). The same goes for Visual Effects companies. ILM, Imageworks, Digital Domain, PDI, Pixar, Rhythm and Hues, Weta, etc. are the best thing that's happening to Linux right now. They are so far removed from the wrongdoings of the MPAA that it's like me blaming someone for crime when their friend's dad is part of the NRA. They are doing only good for Linux, and they are not hypocrites. They do have deals with studios that are in turn part of the MPAA. Not everything is perfect, and these issues are not something that they as companies are, should be, or will be concerned about. They are also starting to contribute to Linux, and I am confident more will come as Linux matures in their pipeline. Building up anger towards Visual Effects companies perpetuates the stereotype of free software advocates being zealots without understanding the whole issue.
Agreed. Nor are traditional business apps. Most traditional business apps (Sun's bread and butter market) require a higher bandwidth-to-compute ratio than can be accomplished by regular Intel gear.
In these cases, just throwing a higher clock speed at the CPU does not really do a whole lot. The backplane, layers of cache and interconnects need to be faster.
Sun knows this, as do the people that buy higher end servers.
Movie rendering is different, and therefore a Linux cluster may do just fine.
Is renderman open source yet?
Renderman is a specification, not a product. There are various open-source efforts to implement the renderman specification, but they all seem to be dormant at the moment. See here.
Have you got your LWN subscription yet?
Why the hell 1024 processors?? Why not 1000??
1024 nodes makes a perfect 10 dimension hypercube. Hypercubes can have major advantages for speeding communications within sub-cubes, which can speed certain types of parallelized applications. Also with this architecture you can avoid a central switch system.
However, you would have to buy 10 ethernet cards per machine, which would be hard to pull off with blades, and I can't think off the top of my head of a reason why a hypercube would help with frame rendering. It might be a data server locality thing... but either way, they have their reasons.
Ignore my comment about 10 ethernet cards per machine... you could avoid that and still build a hypercube.
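For the curious, a small sketch of the textbook hypercube layout: give each of the 1024 nodes a 10-bit ID and link it to every node whose ID differs in exactly one bit. That is one link per dimension, which is where the figure of 10 links per node comes from. This only illustrates the topology; nothing suggests Pixar's cluster is actually wired this way.

```python
DIMENSIONS = 10
NODE_COUNT = 1 << DIMENSIONS   # 2**10 = 1024


def neighbors(node, dimensions=DIMENSIONS):
    """Nodes whose ID differs from `node` in exactly one bit."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hops(a, b):
    """Minimum hop count between two nodes: the Hamming distance of their IDs."""
    return bin(a ^ b).count("1")


print(NODE_COUNT)                # 1024
print(neighbors(0))              # [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
print(hops(0, NODE_COUNT - 1))   # 10, the worst case
```

With 1000 nodes instead of 1024 some IDs would be missing, so the cube would have holes and the one-bit routing trick would no longer work cleanly.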
Actually there is one hosted at Sourceforge that is very active, called Aqsis. There were a couple of other projects like gman that never took off, or were just University projects. Aqsis is making good progress:
Aqsis
There are a few other implementations that also run on Linux, like AIR, the aforementioned RenderDotC (which I believe Cinesite used), and 3Delight. Hopefully a product like Liquid (from a guy that worked at Weta), which is a Maya to RIB translator (kinda like MTOR), will also take off, which could help in making a more powerful combo.
They write their own rendering software, and ported it to Linux for this switch. I'm sure they could have done a PPC port instead, if that's what was needed.
It's hard to be religious when certain people are never incinerated by bolts of lightning.
Sorry Guys... This article looks to be a bit off base!
/. I thought you guys did better about checking this kind of thing out! Just because it's on C-Net doesn't mean it's accurate. Well, kudos to whoever really got this job.
-- Not an Official RS response --
I work for Rackspace Managed Hosting, the company that the "Rackspace" link in the C-Net article points to. This kind of cluster is not consistent with our business. We are mostly focused on web-centric managed hosting vs. colocation. A rendering cluster is something that, from my experience, we've never done. Also, we don't carry blade servers. C'mon.
Matthew Montgomery
Rackspace Managed Hosting.
In addition, Itanium performance for CPU-bound applications is bad.
Last time I checked the Spec CPU benchmarks, Itanium2 was the leader for floating-point performance. Check them out...they may not be the leader right now but Itanium2 is no slouch.