How Facebook Runs Its LAMP Stack
prostoalex writes "At QCon San Francisco, Aditya Agarwal of Facebook described how his employer runs its software stack (video and slides). Facebook runs a typical LAMP setup where P stands for PHP with certain customizations, and back-end services that are written in C++ and Java. Facebook has released some of the infrastructure components into the open source community, including the Thrift RPC framework and Scribe distributed logging server."
As I recall, some of their code was made open source in 2007, although not deliberately.
About how much has Facebook saved by using Open Source Software? I ask because I am not familiar with licensing costs from competing solutions. Thanks!
To quote a joke on Slashdot:
"Is there anything Java cannot make slow?"
Domestic spying is now "Benign Information Gathering"
Ah, the sweet ironies (and hypocrisies) in life. There's something beautifully creepy about a person fighting so hard against the same thing they fought so hard to create. In today's case, the culprit is Mark Zuckerberg, the young man more responsible than perhaps any other for his generation's obsession with displaying itself publicly on the internet. The New York Times has reported that a judge turned down Facebook's request to have "unflattering documents" about Zuckerberg removed from the website of Harvard magazine 02138.
At the center of the issue is an article in 02138 about Facebook's evolution and the subsequent lawsuit from classmates asserting Zuckerberg stole the idea and computer source code to begin his own project. The New York Times calls the article "sympathetic to the plaintiffs's account and questions the validity of Mr. Zuckerberg's claims."
The 02138 article also contains Zuckerberg's handwritten application to Harvard, and a journal that "contains biting comments about himself and others."
Perhaps Gawker summarized it best, saying, "This is the same dude who made billions from a website that allows you to let everyone in your friend network know when you are peeing."
And now he's mad that a private persona he would like to keep that way has entered the public domain. Yes, the sweet ironies and hypocrisies in life: why do we love them so much?
Whatever they're doing, it's not working too well. Sure, they manage to serve the pages, but the user experience is confusing and it seems to take them forever to roll out new and improved versions.
That has little to do with the infrastructure and more to do with the site design. Please don't blame the sys engineers/admins for the poor interface design.
To paraphrase the answer:
"Sun's stock price plummet."
PHP is the most popular language on the planet for a good reason - transaction rate.
If code is written in any language such that the app cannot handle more than 12 transactions per second, it's time to find another profession instead of blaming the language.
Depending on the application, PHP can handle several hundred transactions per second, on *one* machine. It is common knowledge that Java requires far more resources than PHP to achieve a typical transaction rate.
Her lips were softer than a duck's bill, but her quacks
While I think Facebook is nothing more than one big popularity contest, I have to agree.
At least most of the stuff on Facebook's website works.
With slashdot, half the time clicking on a comment to expand it doesn't work unless I refresh several times or copy and paste the link into a new browser.
The right hand sidebar will say 'freshmeat' and show stuff from linux.com and vice versa.
At first I thought this was because I still used IE and that was the problem, being that Slashdot doesn't cater to IE users, fine. So after I switched to Chrome I figured it wouldn't be an issue, yet it's not any different.
I still can't expect expanding a comment to work, I still get crap listed as fossfor.us showing freshmeat entries, 'get more comments' doesn't do shit half the time.
As I've said countless times, programming in PHP and using MySQL 99% of the time means you don't know what you are doing. There are, however, those few large sites that use it that can actually justify its usage because it fits, but only if you actually know what you're doing.
I have websites powered by PHP, ASP.NET, ASP, Java, and C. Some of those are good fits for what they do, some of them aren't and I've learned that the hard way. I've also learned that in most cases things are written because a developer 'knows' a specific language. My personal opinion is, if you only 'know' one language, you aren't a programmer. A real programmer can use just about any language given a good reference manual, and can be proficient in that language rather quickly after starting to work with it.
Unfortunately, most people who call themselves programmers aren't. They just happen to be able to get by with a language they've been spoon fed in the past long enough to hack out some POS that barely manages to get the job done and will drive any sane programmer absolutely mad when they get stuck taking over after the original devs are found to be incompetent.
Makes you wonder how many online services have failed because of arrogance and ignorance of the developers.
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
That has little to do with the infrastructure and more to do with the site design. Please don't blame the sys engineers/admins for the poor interface design.
Well, the fact that they gave a talk about their LAMP stack tells you that they consider engineering more important than site design. Furthermore, a poor choice of infrastructure makes doing good site design hard.
And that's my point: Facebook is evidently driven by system stuff and programmers, while it should be driven by site design.
Clearly, $MY_SPECIALTY should drive the entire system! They made a big mistake by allowing $OTHER_SPECIALTY to take precedence. Everyone knows that only $MY_SPECIALTY should dominate all design plans. Duh.
If your site infrastructure is influencing how you design, you've made some sort of monolithic error along the way. Good code completely separates the content from the design. It's not like they've just hacked up a Wordpress install (which seems to go out of its way to tie content and design together) - Facebook employs hundreds if not now thousands of programmers; it's pretty safe to assume there's at least one UI/UX specialist on board as well.
All things considered, I'd actually say that Facebook's design is pretty decent, but that's of course a matter of opinion. A lot of the code that went into that design sucks, but that's what happens when you have to support IE6. Regardless, I think it's great that they're sharing knowledge about how they've managed to use and customize an infrastructure to support 200,000,000 users, especially with the amount of traffic they have to deal with. That's well beyond the scale that many governments have to worry about!
How are sites slashdotted when nobody reads TFAs?
It takes pretty much 0 work to make LAMP continue to function. It's, for all practical purposes, set it up once (properly) and forget it.
It takes work to make the applications on top of it function continually, as that's where the change occurs. LAMP isn't going down on its own; it'll appear to 'go down' because the 'mostly useless modules' that work along with it fail, not because LAMP does.
I would expect the admin(s) that care for 'the core LAMP platform' spend most of their time doing other stuff. In reality, there's probably only more than one of them to avoid any single person holding too much knowledge and to maintain coverage while that person isn't at work. I just can't imagine they do a whole lot of work 'keeping it running', with the exception of handling database growth and performance, which is more likely handled by the people who design and work with the applications that use that database.
You do realize that you can do massive improvements to PHP perf in the space of 5-10 minutes without a recompile right...? The idea that PHP is "slow" is FUDtastic. Of course it's slow if all you're doing is letting it interpret every time, but with APC or another caching mechanism it's interpret once, run-the-bytecode every other time. Massive speed improvements.
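The "interpret once, run the bytecode every other time" idea is not specific to PHP. Here is a minimal sketch of it in Python, offered as an analogy and not as how APC itself is implemented: the `BytecodeCache` class, the `hello.page` name, and the `output` variable convention are all made up for illustration, and Python's built-in `compile()` and `exec()` stand in for PHP's compile and execute steps.

```python
class BytecodeCache:
    """Toy opcode cache: compile each script once, then reuse the code object."""

    def __init__(self):
        self._cache = {}        # script name -> compiled code object
        self.compile_count = 0  # how many times we actually compiled

    def run(self, name, source, variables):
        code = self._cache.get(name)
        if code is None:
            # Cache miss: pay the parse/compile cost once.
            code = compile(source, name, "exec")
            self._cache[name] = code
            self.compile_count += 1
        # Every request executes the cached bytecode with its own variables.
        namespace = dict(variables)
        exec(code, namespace)
        return namespace["output"]


# Hypothetical "page" whose result depends on per-request input.
PAGE_SOURCE = 'output = "Hello, " + user + "!"'

cache = BytecodeCache()
first = cache.run("hello.page", PAGE_SOURCE, {"user": "alice"})
second = cache.run("hello.page", PAGE_SOURCE, {"user": "bob"})
print(first, second, "compiles:", cache.compile_count)
```

Both requests get their own output, but the source is only compiled on the first one. A real opcode cache also has to notice when the file on disk changes; that part is left out of this sketch.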
"You can either have software quality or you can have pointer arithmetic, but you cannot have both at the same time."
PHP, as a language, is more than capable of handling four requests per second (which can be said of pretty much anything other than punch cards).
Writing bad code in PHP, however, will of course slow things way down. Just like not having indexes on your databases, or doing stupid/unnecessary JOINs. Or not caching properly (see: Wordpress). Writing fast and efficient code in any language is easy enough provided you're a skilled programmer. Facebook, unfortunately, started off as Zuckerberg paying a friend with some web skills to build out a system, and it grew so quickly that replacing the code (or, rather, the DB schema) with something that doesn't suck probably became near-impossible. If you write code with scalability in mind, it's not a tremendous problem.
Of course, nothing is going to cope well with the sheer volume that Facebook deals with. There's plenty you can do along the way to help yourself out, which Facebook may or may not have done. You can bet that nobody thought the site would ever have 200MM users when the first lines of code were written; they probably never expected 1% of that. Writing intelligent code is the most important part of scalability - writing smart DB queries and minimizing the number required probably being the biggest part of that. Having your MySQL servers, instead of PHP, do some calculations in queries (hashes, query-related math, etc.) usually doesn't hurt, since you're generally offloading CPU-intensive operations to a disk-bound machine (i.e., one that has spare cycles).
There's all sorts of tricks and optimizations. Some are language-specific, and some aren't. But making bad decisions early on is a lot harder to fix than an inefficient foreach loop.
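The point about missing indexes is easy to see for yourself. The sketch below is an illustration under stated assumptions, not anything from Facebook: it uses SQLite through Python's standard `sqlite3` module instead of MySQL, and the `users` table, its columns, and the `idx_users_email` index are invented for the example. It asks the database for its query plan before and after the index exists.

```python
import sqlite3

# In-memory database with a made-up table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

QUERY = "SELECT name FROM users WHERE email = ?"


def query_plan(connection, sql, params):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# Without an index the only option is to scan the whole table.
plan_without_index = query_plan(conn, QUERY, ("user500@example.com",))

# With an index on the filtered column the lookup can use it.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_with_index = query_plan(conn, QUERY, ("user500@example.com",))

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan varies between SQLite versions, but the first plan should describe a scan of `users` and the second should mention `idx_users_email`. MySQL offers `EXPLAIN` for the same kind of check.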
They have somewhere in the region of 5,000 servers in their main datacenter and (I believe) others scattered around the world, but restricting it to just that main center, that means each server is handling around 4 requests per second
I somewhat doubt every single one of them is a dynamically driven webserver. Probably at least half are databases, search servers, caching servers, backend appservers, file servers, CDN type stuff, backup servers, hot spares, admin servers, staging machines, etc.
For example: Newzbin has 5 webservers in main rotation; it also has 7 search servers (plus one development machine with similar specs), 6 database machines, 2 backend systems running most of our cronjobs, 2 admin servers, 1 web development server, and 2 systems for building and deploying OSes from. As far as load is concerned, the backend stuff is far more important than the frontend. Sure, we could rewrite the main site in Java or Scala or C++ and get away with 3 webservers and still be N+2, but trust me, those extra two or three webservers are not a significant cost next to that of development.
I can either spend £5k on extra equipment (plus occasionally boosting our space and bandwidth costs, but those are dominated by other systems already), or I can spend £70k a year on another developer, who *still* won't allow us to match our development speed with PHP, and then rewrite tens of thousands of lines of code, likely into much more.
Much of our backend is written in C. That's where the big payoffs for efficient languages are, not a bit of database-limited HTML rendering. Judging by how many big sites are still running PHP, Python and Ruby for their frontends, this would seem to be the case elsewhere, too.
True. But writing cache code is not easy and makes your code more brittle. It increases the likelihood that a user will interact with the website and do something, say "update my profile", only to find when they click "save" that their profile hasn't updated yet because your cache sucks. Then you have to plaster your site with bullshit messages about "please allow 30 seconds to see the change".
But what is far, far, far worse is you are allocating programming resources to non-features. Caching is a non-feature that adds zero value to your website. Your users don't interact with your cache. They interact with your website--and I bet if you are like any moderately complex site, you've got all kinds of bugs that annoy the hell out of them. So rather than allocate your developer time to fixing those annoying bugs (thus adding value) or adding new features (thus adding value), you are stuck pissing away time optimizing bullshit your users never see.
So yeah. You can cache the fuck-all out of your website. But only by stealing developer time away from working on features that make your users happy. Of course if you wrote the thing in C instead of PHP, you'd have a different set of development problems of which I could only have nightmares about.
In other words, engineering is always a tradeoff. Use PHP (and MySQL) and piss away developer time on caching the fuck around their weakness. Use a compiled language like C and piss away developer time doing fuck-if-I-know because you didn't free mallocs or had to write a template language from scratch or some insane shit like that. Pick your poison!
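The "click save and the profile hasn't updated" failure described above can be reproduced in a few lines. This is a toy sketch with hypothetical names (`ProfileStore`, `CachedProfiles`, and the `invalidate_on_write` flag are invented for the example; a plain dict stands in for whatever cache a real site would use). It shows the stale read, and one common way around it: dropping the cached entry whenever the underlying record is written.

```python
class ProfileStore:
    """Stand-in for the database (hypothetical)."""

    def __init__(self):
        self._rows = {}

    def load(self, user_id):
        return self._rows.get(user_id)

    def save(self, user_id, profile):
        self._rows[user_id] = dict(profile)


class CachedProfiles:
    """Read-through cache in front of the store."""

    def __init__(self, store, invalidate_on_write):
        self._store = store
        self._cache = {}
        self._invalidate_on_write = invalidate_on_write

    def get(self, user_id):
        if user_id not in self._cache:
            self._cache[user_id] = self._store.load(user_id)
        return self._cache[user_id]

    def update(self, user_id, profile):
        self._store.save(user_id, profile)
        if self._invalidate_on_write:
            # Drop the cached copy so the next read goes back to the store.
            self._cache.pop(user_id, None)


def demo(invalidate_on_write):
    profiles = CachedProfiles(ProfileStore(), invalidate_on_write)
    profiles.update(1, {"city": "Boston"})
    profiles.get(1)  # warms the cache
    profiles.update(1, {"city": "Palo Alto"})
    return profiles.get(1)["city"]


stale = demo(invalidate_on_write=False)  # user still sees the old value
fresh = demo(invalidate_on_write=True)   # user sees what they just saved
print(stale, fresh)
```

This only covers a single process. With many web servers sharing a cache, the invalidation has to reach every copy, which is where the brittleness being complained about comes from.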
As you say, there is a tradeoff. It doesn't matter if you're fighting the need to cache intelligently in PHP, or the need to get everything right because you're developing a complete solution in C (or whatever) or the need to interface to someone else's system for serving pages if you're using something in between. It also doesn't matter if you're using a servlet technology, or you're punching bits out on a paper tape and feeding it into a machine which converts it into EBCDIC and... you get the idea: don't fuck up.
In any case the whole argument is fucking stupid because: PHP is not implemented in PHP. And Facebook is not implemented in pure PHP. See summary: Facebook runs a typical LAMP setup where P stands for PHP with certain customizations. At some point you have to ask yourself how many wheels you want to reinvent. If you extend PHP you can reinvent fewer wheels. I'm not sure it's the right answer, but I'm sure it's not a horribly wrong one. I'm also absolutely certain that barring some massive development in processing the future is only going to involve more parallelism and more clustering, and that if you expect PHP to scale on a single machine you're a bozo.
What I have personally noticed about using PHP is that a single page load can consume an absolutely insane amount of memory. This problem, too, is mitigated or eliminated by aggressive use of caching. In order to cache properly you need to do something intelligent with your data store, which I think is where most people fall down. Looking into the mishmash that most CMSes produce in the db is enough to make you weep. I long for an elegant object-oriented CMS based on practically anything, but the simple truth is that PHP is by far the easiest thing to get going without spending any money and that has probably done more than anything else to propel it to the head of the FOSS class, at least in terms of popularity. A staggering number of quite excellent websites seem to be built with it as well.
In summary, I reject the notion that PHP is a serious limiting factor for the majority of websites, and I believe that most of those for whom it is a limiting factor have failed to understand PHP. (Not that I'm any PHP guru.) It's true that a clustered web application is significantly more complex than something which is not clustered. However, it's also [potentially] far more scalable. At some point you simply run out of machine. When you can't get anything better from Sun (AFAICT they make the single machines which can handle the most threads today) you're going to have to cluster, even if it's only to two machines. At that point you'll have far more complexity invested in having a single system image to work with and the pain of moving to a cluster will be magnified that much more as well. If you accept the notion that clustering is today and for the foreseeable future the best way to handle scalability (which I admit is at this point not a proven notion, but is at least a well-supported theory) then the idea that PHP is a major limiting factor is just plain silly. Sun is circling the drain, and everyone else is concentrating on clustering. Your call...
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
I was thinking the opposite - they have developed an architecture that is modular enough to allow them to develop different pieces using different technologies, yet they all work together pretty seamlessly. I'd say that's quite an accomplishment!
"Slow down, Cowboy! It has been 3 years, 7 months and 26 days since you last successfully posted a comment."
I guess a story on /. with only 75 comments after 7 hours pretty much answers that question, eh?
Both Java and PHP are interpreted languages because this is how you create a cross-platform language.
Each gets compiled to bytecode which gets executed in an OS-specific VM.
Java is JIT compiled to native code, whereas PHP is bytecode interpreted. The difference is more than an order of magnitude. In fact, judging by this comparison, in many cases Java is about 100 times faster than PHP.
Frankly, most websites do not need an app server. Wikipedia uses PHP, not Java. It is not a 'simple' website that you say PHP is suited for.
Wikipedia is presenting uncustomised content to most users. It runs a huge squid cache in front of its PHP servers. If it tried to run PHP for each user it would crawl. I run mediawiki locally on an AMD Athlon64 2200+. It takes ~0.2 seconds of 100% CPU time to process a simple request. There is simply no way Wikipedia could run without content caching.
This is not to say that the task of serving that content is cheap. But they're doing a lot better than Facebook; they're serving 30,000 requests/sec with only 350 servers. The difference, I suspect, is mostly down to the amount of caching they perform.
Facebook is much less able to cache content. It doesn't have a squid front end because relatively few users see the same exact content, unlike for Wikipedia; most users are logged in most of the time and see pages customised for themselves.
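For anyone who wants the numbers in this thread side by side, here is the arithmetic as a short script. Every input is a figure quoted in the comments above (5,000 servers at roughly 4 requests/sec each for Facebook, 30,000 requests/sec on 350 servers for Wikipedia, ~0.2 CPU-seconds per uncached MediaWiki request). The last step is a rough estimate resting on a loud assumption: that a Wikipedia server running PHP would be no faster than the single Athlon64 mentioned and could handle only one request at a time. It also ignores the earlier objection that not every server is a web server.

```python
# Figures quoted in the thread.
facebook_servers = 5_000
facebook_rps_per_server = 4
wikipedia_rps = 30_000
wikipedia_servers = 350
mediawiki_cpu_seconds = 0.2  # uncached request on the commenter's machine

# Total request rate implied by the Facebook figures.
facebook_rps = facebook_servers * facebook_rps_per_server  # 20000

# Wikipedia's average load per server, and how it compares.
wikipedia_rps_per_server = wikipedia_rps / wikipedia_servers  # about 85.7
ratio = wikipedia_rps_per_server / facebook_rps_per_server    # about 21.4

# Ceiling for uncached requests at 0.2 CPU-seconds each, one at a time.
uncached_rps = 1 / mediawiki_cpu_seconds  # 5.0

# ASSUMPTION: each Wikipedia server is no faster than that machine.
# Then this fraction of requests must be answered without running PHP.
min_cached_fraction = 1 - uncached_rps / wikipedia_rps_per_server  # about 0.94

print(f"Facebook total: {facebook_rps} req/s")
print(f"Wikipedia per server: {wikipedia_rps_per_server:.1f} req/s ({ratio:.1f}x)")
print(f"Uncached ceiling: {uncached_rps:.1f} req/s")
print(f"Implied cached share: {min_cached_fraction:.1%}")
```

Real servers have more cores and faster CPUs than that, so the true cached share could be lower. The point is only that per-server throughput in that range is hard to explain without a cache doing most of the work.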