How Facebook Runs Its LAMP Stack
prostoalex writes "At QCon San Francisco, Aditya Agarwal of Facebook described how his employer runs its software stack (video and slides). Facebook runs a typical LAMP setup where P stands for PHP with certain customizations, and back-end services that are written in C++ and Java. Facebook has released some of the infrastructure components into the open source community, including the Thrift RPC framework and Scribe distributed logging server."
As I recall, some of their code was made open source in 2007, although not deliberately.
About how much has Facebook saved by using Open Source Software? I ask because I am not familiar with licensing costs from competing solutions. Thanks!
I haven't watched the presentation, so I don't know if this is answered there, but it's hard to pin down precisely how many servers Facebook operates. That said, one estimate puts the expected power usage of their recently acquired second datacenter at 6 megawatts, which is said to be twice the usage of their current datacenter. Realistically, this probably equates to a cluster of around 5,000 machines in the current datacenter.
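For anyone who wants to sanity-check that guess, here is a rough sketch of the arithmetic in Python. The 6 MW figure, the 2x ratio, and the 5,000-machine guess come from the estimate above; the function name and the reading of the result as an all-in power budget per machine (cooling and networking included) are my own assumptions, not anything Facebook has published.

```python
# Back-of-the-envelope check of the datacenter estimate above.
# All inputs are the rough figures quoted in the comment, not measured data.

def implied_watts_per_machine(second_dc_mw: float, ratio: float, machines: int) -> float:
    """Return the power budget per machine implied by the estimate.

    second_dc_mw: expected power draw of the second datacenter, in megawatts
    ratio:        how many times larger that is than the current datacenter
    machines:     guessed machine count in the current datacenter
    """
    current_dc_watts = (second_dc_mw / ratio) * 1_000_000
    return current_dc_watts / machines


# 6 MW at twice the current usage means about 3 MW for the current datacenter.
watts = implied_watts_per_machine(second_dc_mw=6.0, ratio=2.0, machines=5000)
print(f"Implied budget per machine: {watts:.0f} W")
```

If that per-machine figure looks too high or too low for a server plus its share of cooling, the machine count should be adjusted accordingly.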
If they ran a proprietary stack, per-machine costs would likely be limited to Windows Server Web Edition; other software would not be needed on all machines (depending on cluster architecture, of course), so it would be a trivial cost in comparison. Retail for the Web Edition is $399; I think we could expect such a high-profile user to qualify for a 50% discount. Across roughly 5,000 machines, this would put their software costs at about $1M. Considering that they're believed to have spent over 100 times this on hardware and support costs over the last year, I doubt this would be a particular concern. Purchase price is not a factor in why Facebook does not run on proprietary software.
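The licensing figure can be reproduced with a few lines of Python. This is only a sketch of the estimate above: the $399 retail price, the 50% discount, the roughly 5,000 machines, and the "100 times" multiplier are the guesses from this thread, and the function name is made up for illustration.

```python
# Rough licensing estimate using the guessed figures from this thread.

def license_cost(retail_usd: float, discount: float, machines: int) -> float:
    """Total one-off license cost for a cluster at a flat volume discount."""
    return retail_usd * (1.0 - discount) * machines


software = license_cost(retail_usd=399.0, discount=0.5, machines=5000)

# The comment assumes hardware and support spending of over 100x the license cost.
hardware_floor = 100 * software

print(f"Estimated license cost:     ${software:,.0f}")
print(f"Implied hardware/support:  >${hardware_floor:,.0f}")
print(f"Licenses as share of both:  {software / (software + hardware_floor):.1%}")
```

Even if the discount were zero, the license cost would only double, which still leaves it at a small fraction of the assumed hardware and support spending.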
That has little to do with the infrastructure and more to do with the site design. Please don't blame the sys engineers/admins for the poor interface design.
Well, the fact that they gave a talk about their LAMP stack tells you that they consider engineering more important than site design. Furthermore, a poor choice of infrastructure makes doing good site design hard.
And that's my point: Facebook is evidently driven by system stuff and programmers, while it should be driven by site design.
Clearly, $MY_SPECIALTY should drive the entire system! They made a big mistake by allowing $OTHER_SPECIALTY to take precedence. Everyone knows that only $MY_SPECIALTY should dominate all design plans. Duh.