Slashdot Mirror


How Facebook Runs Its LAMP Stack

prostoalex writes "At QCon San Francisco, Aditya Agarwal of Facebook described how his employer runs its software stack (video and slides). Facebook runs a typical LAMP setup where P stands for PHP with certain customizations, and back-end services that are written in C++ and Java. Facebook has released some of the infrastructure components into the open source community, including the Thrift RPC framework and Scribe distributed logging server."

11 of 111 comments

  1. Open source by Norsefire · · Score: 5, Funny

    As I recall, some of their code was made open source in 2007, although not deliberately.

  2. One question: by bogaboga · · Score: 4, Interesting

    About how much has Facebook saved by using Open Source Software? I ask because I am not familiar with licensing costs from competing solutions. Thanks!

    1. Re:One question: by julesh · · Score: 5, Informative

      About how much has Facebook saved by using Open Source Software? I ask because I am not familiar with licensing costs from competing solutions. Thanks!

      I haven't watched the presentation, so I don't know if this is answered there, but it's hard to pin down precisely how many servers Facebook operates. That said, the expected power usage of their recently acquired second datacenter is estimated at 6 megawatts, put at twice the usage of their current datacenter. Realistically, this probably equates to a cluster of around 5,000 machines in the current datacenter.

      Costs per machine are likely to be restricted to Windows Server Web Edition; other software would not be needed on all machines (depending on cluster architecture, of course) so would be a trivial cost in comparison. Retail for the web edition is $399; I think we could expect such a high profile user to qualify for a 50% discount. This would put their software costs at about $1M. Considering that they're believed to have spent over 100 times this on hardware and support costs over the last year, I doubt this would be a particular concern. Price of purchase is not a factor in why Facebook does not run on proprietary software.
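
The estimate above can be reproduced as a quick back-of-the-envelope calculation. This is only a sketch that plugs in the figures quoted in the comment (6 MW, 5,000 machines, $399 retail, 50% discount, a 100x hardware multiple); none of them are confirmed numbers, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted in the comment above.
# Every constant is the commenter's estimate, not a confirmed number.
SECOND_DC_POWER_MW = 6.0                      # estimated second datacenter
CURRENT_DC_POWER_MW = SECOND_DC_POWER_MW / 2  # "twice the current usage"
MACHINES = 5_000                              # guessed current cluster size
RETAIL_PRICE_USD = 399                        # Windows Server Web Edition
DISCOUNT = 0.50                               # assumed volume discount
HARDWARE_MULTIPLE = 100                       # "over 100 times this"


def licensing_cost(machines, retail_price, discount):
    """Total licence spend for one discounted licence per machine."""
    return machines * retail_price * (1 - discount)


def watts_per_machine(power_mw, machines):
    """Power budget per machine implied by the estimate."""
    return power_mw * 1_000_000 / machines


software = licensing_cost(MACHINES, RETAIL_PRICE_USD, DISCOUNT)
print(f"licences: ${software:,.0f}")
print(f"hardware and support (lower bound): ${software * HARDWARE_MULTIPLE:,.0f}")
print(f"power per machine: {watts_per_machine(CURRENT_DC_POWER_MW, MACHINES):.0f} W")
```

On these assumptions the licence bill lands just under $1M, which is the "about $1M" in the comment.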

  3. Re:whatever by AHuxley · · Score: 4, Funny

    To quote a joke on Slashdot:
    "Is there anything Java cannot make slow?"

    --
    Domestic spying is now "Benign Information Gathering"
  4. Re:whatever by Tubal-Cain · · Score: 4, Funny

    To paraphrase the answer:
    "Sun's stock price plummet."

  5. Re:the blame is with management by Anonymous Coward · · Score: 5, Insightful

    That has little to do with the infrastructure and more to do with the site design. Please don't blame the sys engineers/admins for the poor interface design.

    Well, the fact that they gave a talk about their LAMP stack tells you that they consider engineering more important than site design. Furthermore, a poor choice of infrastructure makes doing good site design hard.

    And that's my point: Facebook is evidently driven by system stuff and programmers, while it should be driven by site design.

    Clearly, $MY_SPECIALTY should drive the entire system! They made a big mistake by allowing $OTHER_SPECIALTY to take precedence. Everyone knows that only $MY_SPECIALTY should dominate all design plans. Duh.

  6. Re:the blame is with management by Firehed · · Score: 4, Insightful

    If your site infrastructure is influencing how you design, you've made some sort of monolithic error along the way. Good code completely separates the content from the design. It's not like they've just hacked up a WordPress install (which seems to go out of its way to tie content and design together) - Facebook employs hundreds, if not by now thousands, of programmers; it's pretty safe to assume there's at least one UI/UX specialist on board as well.

    All things considered, I'd actually say that Facebook's design is pretty decent, but that's of course a matter of opinion. A lot of the code that went into that design sucks, but that's what happens when you have to support IE6. Regardless, I think it's great that they're sharing knowledge about how they've managed to use and customize an infrastructure to support 200,000,000 users, especially with the amount of traffic they have to deal with. That's well beyond the scale that many governments have to worry about!

    --
    How are sites slashdotted when nobody reads TFAs?
  7. Re:Not very well by Firehed · · Score: 4, Interesting

    PHP, as a language, is more than capable of handling four requests per second (which can be said of pretty much anything other than punch cards).

    Writing bad code in PHP, however, will of course slow things way down. Just like not having indexes on your databases, or doing stupid/unnecessary JOINs. Or not caching properly (see: WordPress). Writing fast and efficient code in any language is easy enough provided you're a skilled programmer. Facebook, unfortunately, started off as Zuckerberg paying a friend with some web skills to build out a system, and it grew so quickly that replacing the code (or, rather, the DB schema) with something that doesn't suck probably became near-impossible. If you write code with scalability in mind, it's not a tremendous problem.

    Of course, nothing is going to cope well with the sheer volume that Facebook deals with. There's plenty you can do along the way to help yourself out, which Facebook may or may not have done. You can bet that nobody thought the site would ever have 200MM users when the first lines of code were written; they probably never expected 1% of that. Writing intelligent code is the most important part of scalability - writing smart DB queries and minimizing the number required probably being the biggest part of that. Having your MySQL servers, instead of PHP, do some calculations in queries (hashes, query-related math, etc) usually doesn't hurt, since you're generally offloading CPU-intensive operations to a disk-bound machine (i.e., one that has spare cycles).
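
As a rough illustration of "minimizing the number of queries" and letting the database do the math, the sketch below contrasts one query per user (summing in application code) with a single aggregate query. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL, and the `posts` table, its columns, and the function names are hypothetical.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database (SQLite stands in for MySQL here)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, likes INTEGER)"
    )
    # Index the column we filter and group on, so lookups avoid a full scan.
    conn.execute("CREATE INDEX idx_posts_user ON posts (user_id)")
    conn.executemany(
        "INSERT INTO posts (user_id, likes) VALUES (?, ?)",
        [(1, 3), (1, 5), (2, 7), (3, 1), (3, 1), (3, 2)],
    )
    return conn


def likes_per_user_naive(conn, user_ids):
    """One round trip per user, with the summing done in application code."""
    totals = {}
    for uid in user_ids:
        rows = conn.execute(
            "SELECT likes FROM posts WHERE user_id = ?", (uid,)
        ).fetchall()
        totals[uid] = sum(likes for (likes,) in rows)
    return totals


def likes_per_user_single_query(conn):
    """One round trip in total; the database does the aggregation."""
    rows = conn.execute(
        "SELECT user_id, SUM(likes) FROM posts GROUP BY user_id"
    ).fetchall()
    return dict(rows)


conn = make_db()
print(likes_per_user_naive(conn, [1, 2, 3]))
print(likes_per_user_single_query(conn))
```

Both functions return the same totals; the second issues one query regardless of how many users there are, which is the kind of saving the comment is pointing at.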

    There's all sorts of tricks and optimizations. Some are language-specific, and some aren't. But making bad decisions early on is a lot harder to fix than an inefficient foreach loop.

    --
    How are sites slashdotted when nobody reads TFAs?
  8. Re:Not very well by Fweeky · · Score: 4, Interesting

    They have somewhere in the region of 5,000 servers in their main datacenter and (I believe) others scattered around the world, but restricting it to just that main center, that means each server is handling around 4 requests per second

    I somewhat doubt every single one of them is a dynamically driven webserver. Probably at least half are databases, search servers, caching servers, backend appservers, file servers, CDN type stuff, backup servers, hot spares, admin servers, staging machines, etc.
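
To make the objection concrete: if only a fraction of the machines actually serve dynamic pages, the per-webserver rate rises accordingly. The sketch below uses the 5,000 servers and 4 requests per second from the quoted comment; the web-server fractions are assumptions taken from the "at least half are something else" guess, and the function name is invented for illustration.

```python
def per_webserver_rate(total_servers, avg_rate_per_server, web_fraction):
    """Requests per second each web server handles if only `web_fraction`
    of the fleet serves dynamic pages (all inputs are rough estimates)."""
    total_rate = total_servers * avg_rate_per_server
    web_servers = total_servers * web_fraction
    return total_rate / web_servers


# Figures from the thread: 5,000 servers averaging 4 requests/second each.
print(per_webserver_rate(5000, 4, 1.0))   # every machine is a web server
print(per_webserver_rate(5000, 4, 0.5))   # half are web servers
print(per_webserver_rate(5000, 4, 0.25))  # a quarter are web servers
```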

    For example: Newzbin has 5 webservers in main rotation; it also has 7 search servers (plus one development machine with similar specs), 6 database machines, 2 backend systems running most of our cronjobs, 2 admin servers, 1 web development server, and 2 systems for building and deploying OSes from. As far as load is concerned, the backend stuff is far more important than the frontend. Sure, we could rewrite the main site in Java or Scala or C++ and get away with 3 webservers and still be N+2, but trust me, those extra two or three webservers are not a significant cost next to that of development.

    I can either spend £5k on extra equipment (plus occasionally boosting our space and bandwidth costs, but those are dominated by other systems already), or I can spend £70k a year on another developer, who *still* won't allow us to match our development speed with PHP, and then rewrite tens of thousands of lines of code, likely into much more.
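
The trade-off described here can be sketched in a few lines. The N+2 sizing rule and the two cost figures come from the comment; the peak load and per-server throughput numbers are hypothetical values chosen only so that the result matches the 5-versus-3 webserver counts mentioned above.

```python
import math


def servers_for_n_plus(peak_rps, rps_per_server, spares=2):
    """Servers needed to carry the peak load, plus `spares` for redundancy."""
    return math.ceil(peak_rps / rps_per_server) + spares


# Hypothetical throughput figures, picked to reproduce the 5-vs-3 counts.
PEAK_RPS = 300
PHP_RPS_PER_SERVER = 100      # assumption
REWRITE_RPS_PER_SERVER = 300  # assumption

php_servers = servers_for_n_plus(PEAK_RPS, PHP_RPS_PER_SERVER)
rewrite_servers = servers_for_n_plus(PEAK_RPS, REWRITE_RPS_PER_SERVER)

# Cost figures quoted in the comment, taken at face value.
EXTRA_EQUIPMENT_GBP = 5_000
DEVELOPER_PER_YEAR_GBP = 70_000

print(f"PHP: {php_servers} servers, rewrite: {rewrite_servers} servers")
print(f"one developer-year costs {DEVELOPER_PER_YEAR_GBP / EXTRA_EQUIPMENT_GBP:.0f}x "
      f"the extra equipment")
```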

    Much of our backend is written in C. That's where the big payoffs for efficient languages are, not in a bit of database-limited HTML rendering. Judging by how many big sites are still running PHP, Python and Ruby for their frontends, this would seem to be the case elsewhere, too.

  9. Re:Not very well by drinkypoo · · Score: 4, Interesting

    As you say, there is a tradeoff. It doesn't matter if you're fighting the need to cache intelligently in PHP, or the need to get everything right because you're developing a complete solution in C (or whatever) or the need to interface to someone else's system for serving pages if you're using something in between. It also doesn't matter if you're using a servlet technology, or you're punching bits out on a paper tape and feeding it into a machine which converts it into EBCDIC and... you get the idea: don't fuck up.

    In any case the whole argument is fucking stupid because: PHP is not implemented in PHP. And Facebook is not implemented in pure PHP. See summary: Facebook runs a typical LAMP setup where P stands for PHP with certain customizations. At some point you have to ask yourself how many wheels you want to reinvent. If you extend PHP you can reinvent fewer wheels. I'm not sure it's the right answer, but I'm sure it's not a horribly wrong one. I'm also absolutely certain that, barring some massive development in processing, the future is only going to involve more parallelism and more clustering, and that if you expect PHP to scale on a single machine you're a bozo.

    What I have personally noticed about using PHP is that a single page load can consume an absolutely insane amount of memory. This problem, too, is mitigated or eliminated by aggressive use of caching. In order to cache properly you need to do something intelligent with your data store, which I think is where most people fall down. Looking into the mishmash that most CMSes produce in the db is enough to make you weep. I long for an elegant object-oriented CMS based on practically anything, but the simple truth is that PHP is by far the easiest thing to get going without spending any money, and that has probably done more than anything else to propel it to the head of the FOSS class, at least in terms of popularity. A staggering number of quite excellent websites seem to be built with it as well.
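
For readers unfamiliar with the idea, the caching described above can be sketched as a small time-to-live cache placed in front of an expensive page render. This is a minimal single-process illustration, not how any particular site does it; a real cluster would more likely use a shared cache, and the class, method, and key names here are all hypothetical.

```python
import time


class TTLCache:
    """Tiny in-process cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._store = {}     # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for `key`, calling `compute()` only on a
        miss or after the entry has expired."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._store[key] = (now + self._ttl, value)
        return value


def render_home_page():
    """Stand-in for an expensive, memory-hungry page render."""
    return "<html>home</html>"


cache = TTLCache(ttl_seconds=60)
page = cache.get_or_compute("home", render_home_page)        # rendered
page_again = cache.get_or_compute("home", render_home_page)  # served from cache
```

The second call never runs the render function, so its memory and database cost is paid at most once per TTL window rather than once per page load.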

    In summary, I reject the notion that PHP is a serious limiting factor for the majority of websites, and I think most of those for whom it is a limiting factor have failed to understand PHP. (Not that I'm any PHP guru.) It's true that a clustered web application is significantly more complex than something which is not clustered. However, it's also [potentially] far more scalable. At some point you simply run out of machine. When you can't get anything better from Sun (AFAICT they make the single machines which can handle the most threads today) you're going to have to cluster, even if it's only to two machines. At that point you'll have far more complexity invested in having a single system image to work with and the pain of moving to a cluster will be magnified that much more as well. If you accept the notion that clustering is today and for the foreseeable future the best way to handle scalability (which I admit is at this point not a proven notion, but is at least a well-supported theory) then the idea that PHP is a major limiting factor is just plain silly. Sun is circling the drain, and everyone else is concentrating on clustering. Your call...

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  10. Re:Yeah, Blame the Language by julesh · · Score: 4, Informative

    Depending on the application, PHP can handle several hundred transactions per second, on *one* machine. It is common knowledge that Java requires far more resources to achieve a typical transaction rate, than PHP.[citation needed]

    This is just bullshit. A Java-based server will typically require a fairly constant 64MB more RAM than an equivalent PHP server, but other than this the Java system will outperform PHP in every sense. If the content generation is even remotely complex, Java can be up to 100 times faster, which translates to a 100 times higher transaction rate.

    Sure, PHP can handle several hundred transactions per second, if your script is <?php echo "hello world"; ?>. This benchmark of a non-trivial e-commerce application shows that Java can easily handle 500 requests per second on a small 2000-era 4-CPU cluster. A modern quad-core server should be handling at least 20 times that rate, absent any improvements in Java architecture since then (and there have been many; this test was run on Java 1.1, which was hideously slow compared to modern Java versions), and ignoring the performance improvement from not having to load balance requests at the front end or access the database server across the network.