Inside the Lucasfilm datacenter
passthecrackpipe writes "Where can you find a (rhetorical) 11.38 petabits per second of bandwidth? It appears to be inside the Lucasfilm datacenter. At least, that is the headline figure mentioned in this report on a tour of the datacenter. The story is a bit light on the down-and-dirty details, but mentions a 10 gig Ethernet backbone (adding up the bandwidth of a load of network connections seems to be how they derived the 11.38 petabits per second figure; in that case, I have a 45 gig network at home), power utilization as a key differentiator when buying hardware, a "legacy" cycle of a couple of months, and 300TB of storage in a 10.000 square foot datacenter. To me, the story comes across as somewhat hyped up, a "look at us, we have a large datacenter", "look how cool we are" kind of thing. Over the last couple of years, I have been in many datacenters, for banks, pharma and large enterprise to name a few, that have somewhat larger and more complex setups."
Only a few boxen are used for rendering and effects. The rest is to track and calculate sales of Star Wars merchandise.
Is that all? Most datacenters that house more than one large customer usually start at about 300TB; that's nothing to write home about. Most customers using SAP use a lot more.
I'll just assume it runs Linux. Did it say in TFA?
There are many corporate data centers larger and more powerful than that; it is much more impressive if the entire thing can run one giant application. Still, I'm pretty sure that Google's new datacenter wipes its ass with a datacenter the size of this one.
San Diego Supercomputer Center (SDSC) has 2 petabytes of online storage, with 400TB for researchers. They have 18PB of archival tape storage.
http://www.enterprisestorageforum.com/hardware/features/print.php/3634881
Still... I like datacenters. The hum of equipment. 65 degree temps and lower. I once had my cube relocated to a tape library. Quiet... peaceful place.
Is that the speed you can talk at?
A rhetorical question does not expect an answer...
...so maybe "rhetorical bandwidth" is a nice way of saying that the data flows only in one direction? ;-)
...would post this as a news item. Front page, too.
Let's break this submission down...
"Hi. I found this article on the web that totally didn't impress me, I think they fiddled with the numbers to make themselves look better than they are, and overall I really couldn't give a shite."
Yes. Obvious front page material for a Sunday!
300TB of storage in a 10.000 square foot datacenter
You can fit 300TB in a single rack these days... or is that a 10 square foot datacenter?
Well passthecrackpipe, if you and your vast knowledge of large scale datacenters are not impressed with the story, why the hell did you submit it?
10.000 square feet for a datacenter is not very impressive. The datacenter that I work in did a relatively modest 100,000 square foot EXPANSION which was the result of absorbing an adjoining atrium. I suspect that the power equipment and air handlers may take up 10,000 square feet.
and format it?
No folly is more costly than the folly of intolerant idealism. - Winston Churchill
So why submit this if you don't like it? Why not at least title it "Lucasfilm thinks it's soooo great."? I'm sure you've seen bigger data centers, and you can type 500 lines of code a minute, and maybe you defeated a ninja in hand-to-hand combat, but for the rest of us "normal" nerds it's still neat to read about the machines that get the work done in a business. Of course it's hyped up, it's a press release disguised as news. Take it for what it is, relax, and try to imagine those 2,000 servers in a secret cave under your house, manipulating the stock market in your favor. That's what I do.
How about theoretical? *yawn*
Why all the negativity toward Lucas? Jar Jar's dead, man, let it go. George said he was sorry already. I think it's a good story. It's absolutely fascinating to me to see how they make movies today, how much data gets pushed around, and how they make sure that the creative people have access to what they need, when they need it. And they do all this to support incredible time schedules, with boatloads of cash riding on every second. I don't know how anyone can say that this isn't an impressive operation. As for Lucas thinking they are so great... well, they pretty much are. I'd say that being the organization that created the special effects for tons of blockbuster movies and being nominated for several major movie industry awards pretty much gives them some bragging rights.
300TB storage and 11 petabits/s bandwidth.
This means
A) they can push their entire storage through the network in 300*8Tb/(11Pb/s)=200ms.
or
B) the article author does not have a clue.
I think an analogy would be: I drive back and forth to work every day, or 400 times a year. My speed on each trip is 60mph, so in a year my speed is 60x400 or 24000mph.
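For anyone who wants to check option A, here is a minimal Python sketch of the arithmetic. It assumes decimal units (1 TB = 10^12 bytes, 1 Pb = 10^15 bits) and takes the headline figure at face value; the function name is just illustrative.

```python
def transfer_seconds(storage_tb: float, bandwidth_pbps: float) -> float:
    """Seconds needed to push storage_tb terabytes through bandwidth_pbps petabits/s."""
    storage_bits = storage_tb * 1e12 * 8     # decimal terabytes -> bits
    bandwidth_bps = bandwidth_pbps * 1e15    # petabits/s -> bits/s
    return storage_bits / bandwidth_bps


# Figures quoted in the thread: 300TB of storage, 11.38 Pb/s headline bandwidth.
seconds = transfer_seconds(300, 11.38)
print(f"{seconds * 1000:.0f} ms to move the entire store once")
```

That comes out at roughly a fifth of a second, which is in line with the 200ms estimate above and is why the headline number looks like a sum of port speeds rather than anything the storage could actually feed.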
don't cut it off www.mgmbill.org
Guess they have not been to a hospital data center yet. Should check out someone like Dow Chemical.
barf
As in reference to THX 1138?
Of course, it could just be a coincidence.
I've got a fever and the only prescription is more COBOL.
I have to wonder how many systems they have. They accomplish a great deal with what is a fairly small area. I would guess that each computer has major RAM and is simply NFSed back to a central server.
What I have found funny is the number of ppl who are speaking of how big their centers are. Offhand, I tend to suspect that those centers could go on a MAJOR f%^&ing diet and need to have their budgets cut to a fifth. And finally, it is time to fire a bunch of the incompetents who cannot run a tight center.
I prefer the "u" in honour as it seems to be missing these days.
...running FC6 x64.
Why? Because my rig has never so much as contained - much less rendered - an image of Jar Jar Binks.
Pwned.
I am very small, utmostly microscopic.
The datacenter at one of my employer's satellite sites has four CLARiiONs, at 2 racks each, a 5-bay DMX-3, and a 4-bay XP1024, for 380TB raw, in 3,200 sqft, along with thirty racks of servers, a P595 mainframe, and several multi-rack computing clusters. There's plenty of cooling and it's really not THAT crowded. Managing to pack 10-12 racks of storage into a 10,000 sqft data center is not anything noteworthy.
Now I'm disappointed. I had hoped Masi Oka would be working there.
This is the Lucasfilm datacenter. That number finds its way into all sorts of Lucas-related material.
Will it run Vista? Sounds like they might need to upgrade!
Does anybody else find it questionable that he said Pirates of the Caribbean required approx. 50TB of storage, and the next one will require 25% more... but then goes on to say that there is total storage space of 300TB in the data center? That's basically enough to store six movies of equivalent size to Pirates, so where are all the rest of the movies they make stored?
There's considerable unhappiness in San Francisco about Lucasfilm's operation. It's in the Presidio, which used to be a military base and is now a national park. It's the only national park which has to make a profit, due to a Bush Administration deal. Letterman Army Hospital was torn down to make room for the Lucasfilm facility. The San Francisco Bay Guardian complains about this constantly, as they try to keep the Presidio from turning into an industrial park. The Lucasfilm move to the Presidio was something of a dot-com boom excess, when people thought SF was the place to be.
Pixar, in Emeryville, Tippett, in Berkeley, and Dreamworks, in Redwood City, are the innovative animation companies in the Bay Area. And of course, there's EA, SCEA, and some other game companies. Lucasfilm doesn't seem to get much attention.
There are data centers in San Francisco proper with far more storage, too. The Internet Archive has several petabytes of storage. There's a large colocation facility at the 6th St. offramp from I-280.
They wanted me to move across the continent from a place with average cost of living and a 10 minute commute to work in San Francisco (right in the city, not even an outlying area) for about a 15% increase in pay. The only way I could afford that would be to take on a 2-3 hour commute and even then I'd have to run an even tighter ship, financially speaking, than I do now.
I suppose they were counting on the "cool factor". The job was cool, but not so cool I was willing to stick a stake through the heart of my family. Right after this, I read that Lucas donates 170 million to his alma mater. Hey George, why not donate 10% less and actually pay your people something more since you're insisting on setting up shop right in the freaking Presidio?
600 Tbyte of disk in total can't be right. I wrote an application a couple years ago that has 6 terabytes of disk allocated to it to cache its work. This was for a single app. Admittedly, we worked with fairly big data files where I was working, but I've got to think Lucasfilm's files are way larger than my 1-2 gig files.
The majority of Lucasfilm's processing power is used for graphics generation for ILM (unlike, say, Google). I think the hidden message of this article is that they could get a huge screen and projector, play, for example, Crysis on full settings including 64x antialiasing and anisotropic filtering, at 4320p with 22.2 surround sound, and say to Sony, "That's TrueHD!"
http://en.wikipedia.org/wiki/UHDV
http://en.wikipedia.org/wiki/22.2
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
I guess that depends on what the internal temperature of the machines will be at the hottest parts of the day. If the AC and outside thermal insulation aren't up to snuff, you might need to go into a day with some pretty cold temperatures just so your servers stay viable, heat-wise, by the end of the day.
As somebody who (ab)uses that particular rig daily, the article misses the point about what's so awesome about the system.
It's a good sized datacenter, but what it's able to support in processing ability is the impressive part, and that the fat bandwidth runs at capacity almost all of the time by the demands of processing jobs. Proprietary software doles out jobs 24/7 to thousands of procs all over campus-- including artists' desktop machines-- for heavy duty computation: rendering and simulation and whatever it takes.
I can't imagine a facility where so many people are creating and pumping so much data around.
I toured their new facility in San Francisco. They have over 300 10Gbps ports and all PCs are connected via gigabit. Their datacenter was 2/3 full of dual-Opteron servers running SuSE Linux (though they were considering switching). Their server room was spotless. No cables were visible anywhere, but I did see a Roomba moving about the floor. The fellow who ran it said that since they're ILM, they have to have droids.
The facility was absolutely beautiful. When going between two buildings on an overhead walkway I saw the Golden Gate Bridge with a nice orange sunset behind it. I wish I had had my camera with me.
They said that they have many dedicated OC-48 pipes to various studios and can handle just about any format, since every studio uses their own format. They convert it to their own internal format, which I believe they open sourced.
When they moved from Skywalker Ranch, it was completely seamless. They had an OC-192 (10 Gbps) link running between the old and new facilities as more and more equipment was migrated to the new facility, but people continued to work at the old one.
-Aaron
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
The point of the story was to display ILM data crunching power as impressive for a POST PRODUCTION house. Not "the greatest data center in the world". Compared to any other post production house, ILM is pretty darn impressive.
but at a resolution of 4096x2160 pixels, each frame takes a while to generate.
Where'd you come up with this resolution? I've never worked on a movie where the final rez was higher than 2K. I can only think of one set of elements I rendered at 4K -- a bunch of badly aliasing Mental Ray renders. A halfway decent renderer will let you get away with rendering at a lower rez than needed for the final comp.
Not to say your point isn't a good one though. Google's velcro and duct tape solution to server farms isn't really appropriate for the needs of 3D rendering. Plus, visual effects studios just don't have Google money to throw at custom farm solutions. On their scale, it's much more effective to just pay Dell or HP to take care of it.
Never mind the fact that there are much larger and more complex setups out there, as others have pointed out. Never mind the fact that Star Wars was a ripoff of a Japanese pulp science fiction novel.
Theoretical bandwidth is a chimera. All the cars on Los Angeles freeways at a given time, carrying boxes of tapes -- now that's some theoretical bandwidth. What matters is achieved write and read capacity -- I believe the record is 14.5 Gb/s sustained.
Goddammit, as soon as I post in this topic I see something like this.
Someone mod this troll down, please.
Nobody else has this sig.
TFA talks about 2000 servers equipped with 10 Gbps network cards.
11.38 Pbps is 11380 Tbps or 11380000 Gbps. This means that each
server has 569 network interfaces !! This is total bullshit. If
they had said they had 10*2000*2 = 40 Tbps, it would have been
based on more real (though irrelevant) data.
I hate it when ignorant journalists post meaningless data for public
consumption.
Willy
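The numbers in the comment above are easy to reproduce. This is a rough Python sketch, not anything from TFA: the constant and function names are made up for illustration, and the `directions=2` default assumes the factor of 2 in the 40 Tbps figure stands for full duplex, which the comment does not actually say.

```python
HEADLINE_GBPS = 11_380_000   # 11.38 Pbps expressed in Gbps
SERVERS = 2000               # server count quoted from TFA
NIC_GBPS = 10                # one 10 Gbps network card


def nics_per_server(total_gbps: int, servers: int, nic_gbps: int) -> float:
    """How many NICs each server would need to reach the headline figure."""
    return total_gbps / (servers * nic_gbps)


def aggregate_tbps(servers: int, nic_gbps: int, directions: int = 2) -> float:
    """Sum of all NIC speeds in Tbps; directions=2 is the assumed full-duplex factor."""
    return servers * nic_gbps * directions / 1000


print(nics_per_server(HEADLINE_GBPS, SERVERS, NIC_GBPS), "NICs per server")
print(aggregate_tbps(SERVERS, NIC_GBPS), "Tbps if you just add up the cards")
```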
We techies really suck at negotiations. Sadly, the more hard core we are, the less business savvy we appear to be. I have been stuck around 100K. A friend of mine with less education and experience was offered a job at MS. He was originally offered 85K (this was 8 years ago). He said no and held out for 150K, stock options, and benefits. They came around and re-offered him. I do not know exactly what it was (per contract, he was not allowed to say), but he says that it was more than what he wanted. After seeing the house that he picked up in the Seattle area, I believe him. For all I know, he had MS give him the down payment for it.
Considering the team that he was on, I was more surprised that he was offered so low at first, but that is what business ppl do. We all have to learn when and how to negotiate better. Perhaps CS/CE should take up classes on this.
I prefer the "u" in honour as it seems to be missing these days.
All I can think of is that it's their internal standard for a wide screen HD feed.
Just an idea.
Yes Francis, the world has gone crazy.
Though anything in the SF Bay Guardian should be taken with a grain of salt, it should be noted that publication blames now-Speaker Nancy Pelosi (D-San Francisco) for the Presidio arrangement, not the Bush-41 Administration. Since the legislation was passed during the Clinton-42 administration, blaming it on either Bush is farfetched.
But the course taken wasn't unreasonable. The Presidio was already developed when it was a military base. Turning it into a traditional, naturalist national park would have required un-development, destroying pre-existing housing, buildings and roads. (Restoring it to its native grassy sand dunes would have required deforestation.) Mixing updates of the prior development with other expansion of public use and re-naturalization made sense -- and given the immense value of just the already-developed real estate, having the whole project pay its own way should have been a no-brainer.
Care to mention which Japanese pulp sci-fi novel was ripped off? Just curious.
Imageworks has almost twice as much gear as they have. They are just blowing smoke out their ass.
300TB is still nothing, even with insane redundancy.
For example, assume you've got Sun Fire X4500 servers, and that you take them with the stock 500GB drives. 24TB per 4U server.
Now let's assume that you run RAIDZ2 on each server, dedicating 2 of the 48 drives to parity. That's 23TB per server. Now let's assume you want some redundancy, as in, completely separate failover capacity. You mirror every single server with hot standbys. Easy with ZFS, you can mirror the file system in real time without any issues. So we get an effective 11.5TB per 4U server.
Now, we need 300TB, so that requires 27 servers. 108U, or three 42U racks, with each rack having 6U for network infrastructure, load balancers to do the failover to the hot-standby units (or even localized UPS if you want double UPS coverage).
3 racks. With pretty decent redundancy that allows two drives per server to fail before you need to switch over to the hot standbys, and that's ignoring the fact that each X4500 server has redundant power supplies and allows hot swapping of drives and power supplies. THREE RACKS.
Explain to me why they need a 10,000 sq. ft. datacenter for three racks worth of servers? Yeah, I know, AC, UPS, generators, all that stuff takes up room, but I could fit a lot more than 300TB in a 10k sq. ft. datacenter.
Sure, I'm no expert, and I might be missing some important considerations, but even I could set up a 300TB high-uptime file storage network without needing an entire datacenter to do it.
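The sizing above can be written out as a short Python sketch. The defaults are the figures from this comment (48 drives of 500GB per 4U server, 2 drives of parity, every server mirrored to a hot standby, 6U per 42U rack reserved for other gear); the function names are hypothetical and this is back-of-the-envelope math, not a storage design.

```python
import math


def servers_needed(target_tb, drives=48, drive_tb=0.5, parity_drives=2,
                   mirrored=True):
    """Servers required to offer target_tb of effective capacity."""
    usable_tb = (drives - parity_drives) * drive_tb   # 23TB after parity
    if mirrored:
        usable_tb /= 2                                # hot standby halves it: 11.5TB
    return math.ceil(target_tb / usable_tb)


def racks_needed(servers, server_u=4, rack_u=42, reserved_u=6):
    """Racks required, keeping reserved_u per rack free for network gear etc."""
    return math.ceil(servers * server_u / (rack_u - reserved_u))


servers = servers_needed(300)
print(servers, "servers in", racks_needed(servers), "racks")
```

Changing `parity_drives`, `mirrored` or `drive_tb` shows how sensitive the rack count is to the redundancy assumptions.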
That's right. If you have a renderer that does adaptive supersampling decently (aka not Mental Ray), you can render below your final resolution and get away with it. It seems like you have a lot of experience, so I can't imagine that you haven't been somewhere at 5am trying to get 3 hour renders out for a post session at 9am. Those are the times you're thankful you have a renderer where you can actually get away with rendering some elements at .75 or .9 rez and still have them look good.
And what the hell are you talking about "roll their own solutions, hardware and software"? If you watched TFV in TFA, you would have seen that ILM flashed Dell badges among others. And as everyone knows, Pixar was in bed with Sun for years. Granted, I don't really keep up with what other people buy anymore, and I can't speak for Weta, Framestore, and the other smaller shops you mention. The companies I've worked for had deals with HP, Compaq (nee Digital), and Dell (x2). They were all film studio size deals, which might explain the different experience.
You should go on record instead of posting as an AC, I'd love to actually have an on the record conversation with someone else in the business.
* This thread is useless without pics! *
We want our nerd porn!
dragonhawk@iname.microsoft.com
I do not like Microsoft. Remove them from my email address.
Now let's assume that you run RAIDZ2 on each server, dedicating 2 of the 48 drives to parity.
No-one would do this in a production environment where performance was even a passing concern (or if they did, they shouldn't have) - firstly, because parity-based RAID configurations are (relatively) slow and secondly because RAIDZ[2] (or parity-based in general, really) arrays shouldn't be any bigger than about 8 drives in total, or performance (and reliability) start to go downhill.
It's reasonable to assume you lose at least half the raw space in redundancy using RAID10 (or equivalent), to get the best performance - probably plus another drive or two for hot spares. So double your estimate of 3 racks to 6.
(I agree 300TB is nothing particularly impressive, however.)
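As a sanity check on the "3 racks becomes 6" estimate, here is a hedged Python sketch of the RAID10 variant. It assumes the same 48 x 500GB, 4U servers as the comment being replied to, two hot spares per server, the hot-standby mirror kept on top of RAID10, and 36U usable per rack; those are assumptions for illustration, and `raid10_racks` is a made-up name.

```python
import math


def raid10_racks(target_tb, drives=48, drive_tb=0.5, hot_spares=2,
                 standby_mirror=True, server_u=4, usable_rack_u=36):
    """Return (servers, racks) needed for target_tb of effective capacity."""
    # RAID10 stores every block twice, so usable space is half the array.
    usable_tb = (drives - hot_spares) * drive_tb / 2
    if standby_mirror:
        usable_tb /= 2            # second full copy on a hot-standby server
    servers = math.ceil(target_tb / usable_tb)
    racks = math.ceil(servers * server_u / usable_rack_u)
    return servers, racks


servers, racks = raid10_racks(300)
print(servers, "servers in", racks, "racks")
```

Under these assumptions the count lands on six racks with or without the hot spares, so the doubling looks about right.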
Right after this, I read that Lucas donates 170 million to his alma mater. Hey George, why not donate 10% less and actually pay your people something more since you're insisting on setting up shop right in the freaking Presidio?
He could have invested that same 10% in the world's best script writer for the Prequels and thereby realized a 10x ROI from now-former-fanboys actually buying the DVDs of his movies, and thereby been able to both raise pay and donate more money. But ego is a terrible vice.
This from a guy who wrote his college entrance essay on George Lucas and wanted to work for ILM (before he got smart and grew up).
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
It probably runs Gentoo so they use up that bandwidth and CPU power upgrading all the time.
5000 CPUs isn't particularly impressive either when you consider 1U quad core quad processor machines. Up to 672 CPU cores per rack. Pretty insane power requirements as far as density is concerned, though.
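The 672 figure falls out of a full 42U rack of 1U boxes with four quad-core sockets each. A quick Python sketch, assuming every rack unit holds a server (no room for switches or power) and treating the 5000 CPUs as 5000 cores, which is an assumption on my part:

```python
import math


def cores_per_rack(rack_u=42, server_u=1, sockets=4, cores_per_socket=4):
    """Cores in one rack completely filled with identical servers."""
    return (rack_u // server_u) * sockets * cores_per_socket


def racks_for(cores, per_rack):
    """Whole racks needed to house the given number of cores."""
    return math.ceil(cores / per_rack)


per_rack = cores_per_rack()
print(per_rack, "cores per rack;", racks_for(5000, per_rack), "racks for 5000 cores")
```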
Let's not forget that the Sun X4500 is a quad-processor beast itself, and you're not going to saturate all 16 (possible) cores with storage demands, unless you're processing the data locally.
A more dedicated processor such as the Cell (which would seem to be ideally suited for software rendering) would reduce the space/power/heat requirements significantly, although I'm not sure of what kind of enclosure you'd need. Assuming you're not shoving PS3s into a datacenter, the Cell comes on PCI-X cards. I'm sure there is some enclosure designed to maximize the number of PCI-X cards in a limited space.
I don't think their datacenter is bad, by any means, just that the technology exists to do what they're doing in much less space (and possibly cheaper).