How Facebook Runs Its LAMP Stack
prostoalex writes "At QCon San Francisco, Aditya Agarwal of Facebook described how his employer runs its software stack (video and slides). Facebook runs a typical LAMP setup where P stands for PHP with certain customizations, and back-end services that are written in C++ and Java. Facebook has released some of the infrastructure components into the open source community, including the Thrift RPC framework and Scribe distributed logging server."
As I recall, some of their code was made open source in 2007, although not deliberately.
Whatever they're doing, it's not working too well. Sure, they manage to serve the pages, but the user experience is confusing and it seems to take them forever to roll out new and improved versions.
Every few days I run into whole sections of core Facebook functionality that are just plain broken for hours. Earlier this week, my main page wouldn't load for most of the day. And every couple of weeks I'm greeted with a "Sorry, you can't log in right now." message.
As an architect, I decided to view the presentation so that I could learn new things about scalability and architecture. The presentation came across as very amateurish and lacked any serious technical depth. Facebook seems to be stitched together as a set of "solution du jour" technologies without any real architecture behind it. Too many languages, frameworks and other gems. These guys took the notion of the right language for the task to an extreme. I have to believe that code releases into production are a big challenge for these folks.
I need new glasses.
Anyway, I think the answer is 'Simply by existing'?
About how much has Facebook saved by using Open Source Software? I ask because I am not familiar with licensing costs from competing solutions. Thanks!
"Every few days I run into whole sections of core Facebook functionality that are just plain broken for hours"
What response did you get when you reported it to the Bug Reporting site?
Ah, the sweet ironies (and hypocrisies) in life. There's something beautifully creepy about a person fighting so hard against the same thing they fought so hard to create. In today's case, the culprit is Mark Zuckerberg, the young man more responsible than perhaps any other for his generation's obsession with displaying itself publicly on the internet. The New York Times has reported that a judge turned down Facebook's request to have "unflattering documents" about Zuckerberg removed from the website of Harvard magazine 02138.
At the center of the issue is an article in 02138 about Facebook's evolution and the subsequent lawsuit from classmates asserting Zuckerberg stole the idea and computer source code to begin his own project. The New York Times calls the article "sympathetic to the plaintiffs's account and questions the validity of Mr. Zuckerberg's claims."
The 02138 article also contains Zuckerberg's handwritten application to Harvard, and a journal that "contains biting comments about himself and others."
Perhaps Gawker summarized it best, saying, "This is the same dude who made billions from a website that allows you to let everyone in your friend network know when you are peeing."
And now he's mad that a private persona he would like to keep that way has entered the public domain. Yes, the sweet ironies and hypocrisies in life: why do we love them so much?
"Facebook seems to be stitched together as a set of "solution du jour" technologies without any real architecture behind it. Too many languages, frameworks and other gems. These guys took the notion of the right language for the task to an extreme. I have to believe that code releases into production are a big challenge for these folks"
What's 'hodge podge' about a highly customized solution? That is precisely what LAMP is all about. It does seem to work for them, and with Facebook supporting 200 million active users, it is a good example of an Open Source success, so they must be doing something right.
I know this should be the job of tags, but to help put this in context, remember the recent uptime comparison that showed Facebook with pretty decent availability compared to other social networking sites. I'd say it takes the admins a fair amount of discipline and perseverance to attain those kinds of numbers. (Of course, it probably has nothing to do with the uptime of their various sundry and mostly useless modules, but I'd guess that's a different set of admins than the ones that care for the core LAMP platform.)
This almost makes Facebook geek "cool" in my book, but I guess all the non-geek "cool" kids who use it already think so.
We at Facebook know how to program, unlike the programmers for slashdot.org. I wrote the code that deploys the system and believe me, it's great and not fraught with problems as some here suggest. LAMP is a great tool, and we leverage it for what it is.
When I first saw the post, I thought it said how Facebook RUINS its LAMP stack.
I think that has to do with my experience with the apps and how often things time out in that regard. It's a little frustrating and I'm sure it has nothing to do with the guys at Facebook, but it is interesting to find how that third-party experience affects my subconscious.
Linux - because it doesn't leave that Steve Ballmer aftertaste.
That has little to do with the infrastructure and more to do with the site design. Please don't blame the sys engineers/admins for the poor interface design.
Well, the fact that they gave a talk about their LAMP stack tells you that they consider engineering more important than site design. Furthermore, a poor choice of infrastructure makes doing good site design hard.
And that's my point: Facebook is evidently driven by system stuff and programmers, while it should be driven by site design.
Well, I'll be - the first LAMPCJ Stack - make it too big to fail.
For all of you fellow architect bods out there, this is how you do it:
PHP - California, Texas and France
C++ - New Jersey and Tibet
Java - California, India, and Somalia
Now, what does this variable name in Somali represent?
Her lips were softer than a duck's bill, but her quacks
PHP is the most popular language on the planet for a good reason - transaction rate.
If code is written in any language such that the app cannot handle more than 12 transactions per second, it's time to find another profession instead of blaming the language.
Depending on the application, PHP can handle several hundred transactions per second on *one* machine. It is common knowledge that Java requires far more resources than PHP to achieve a typical transaction rate.
Both Java and PHP are interpreted languages because this is how you create a cross-platform language.
Each gets compiled to bytecode which gets executed in an OS-specific VM.
Java bytecode is compiled manually. PHP bytecode is compiled automatically using an encoder. In both cases code is compiled once and reused.
We chose PHP for our website because of its efficiency in terms of development (e.g. no class generation step for programmers) and execution overhead. You don't need as much memory or CPU to run a typical PHP server.
Frankly, most websites do not need an app server. Wikipedia uses PHP, not Java. It is not a 'simple' website that you say PHP is suited for.
Well, now we see who is pumping M$ on Slashdot.
It looks like turfers are turning their poison pens towards Sun, just as near a decade ago they turned towards the late Novell.
Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.
This is both informative (a believable in-the-ball-park analysis of Facebook operational costs) & insightful (price is not enough to be the best solution)
There's also Cassandra, the Java-based distributed database that they've made open source, first on Google Code and now as an Apache project (in incubation).
I guess a story on /. with only 75 comments after 7 hours pretty much answers that question, eh?
Big Media and Cable are exactly like this too. I had the gross misfortune to work for a "media search" subsidiary of a massive nation-wide cable company. There were a handful of sharp programmers, don't get me wrong, but the rest of them were flat out disgustingly bad. Apparently nobody knew the first thing about software engineering, algorithms, let alone database design. I had the distinction of being in charge of all internal systems, run by a manager who was the most incompetent I have ever run into in 15+ years.
One choice example of sheer stupidity was this: in order to read through a search result set, they read 100 rows. Then, to get the next 100, they REread the first 100, threw them away and continued with the second set of 100. Repeat for the 300th, 400th records.
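For anyone wondering what the sane version of that looks like, here is a minimal sketch, assuming the results come from an ordinary SQL cursor. The table, its columns and the helper name `read_in_batches` are all made up for illustration, and SQLite stands in for whatever store that shop actually used. The idea is just to keep one cursor open and pull the next 100 rows from it, so nothing is ever fetched twice.

```python
import sqlite3


def read_in_batches(conn, query, batch_size=100):
    """Yield successive batches from one open cursor, so no row is fetched twice."""
    cursor = conn.execute(query)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# Hypothetical data: an in-memory table standing in for a search result set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO results (title) VALUES (?)",
    [(f"item {i}",) for i in range(250)],
)

# Each batch picks up where the previous one stopped.
batches = list(read_in_batches(conn, "SELECT id, title FROM results ORDER BY id"))
batch_sizes = [len(batch) for batch in batches]
```

If the cursor can't stay open between requests, the usual alternative is to remember the last id seen and ask for rows with a larger id, which also avoids re-reading the earlier pages.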
They created these massive Berkeley DBs but completely trashed them so that you couldn't actually use the hashes for anything. So it was basically a linear scan. The underlying technology was JBoss-based and they didn't have so much as the first clue how to even do that right.
I was shown the door after I wouldn't shut up about how fscking inexcusably incompetent the lot of them were (incl. their non-existent management). I'm not the least bit unhappy about leaving them. I hope they fail and do so spectacularly. It probably doesn't have too much bearing on it but nearly all of the development team were foreigners. I don't know how many of them were H-1B but even if they weren't, it was a profound example of how Russians, Indians, and other Slavic programmers can be every bit as shit as their American counterparts.
Haha, you honestly can't see a difference between willingly posting things publicly and having other people dig up your private life and publishing it?
Just to be different, I run: Linux, Lighttpd, Sqlite, Kepler
Homonyms are fun!
You're driving your car, but they're riding their bikes there.