Slashdot Mirror


Facebook Rewrites PHP Runtime For Speed

VonGuard writes "Facebook has gotten fed up with the speed of PHP. The company has been working on a skunkworks project to rewrite the PHP runtime, and on Tuesday of this week, they will be announcing the availability of their new PHP runtime as an open source project. The rumor around this began last week when the Facebook team invited some of the core PHP contributors to their campus to discuss some new open source project. I've written up everything I know about this story on the SD Times Blog."

13 of 295 comments

  1. High performance in scripting languages? by BadAnalogyGuy · · Score: 5, Insightful

    At some point, if you are lucky enough, you will require extremely high performance from your web pages. You start out coding HTML in Notepad and move on to Perl CGI then on to PHP with scripting embedded right in the generated HTML. All the time you gain programming crutches at the expense of processing speed, and for a while this is a great tradeoff.

    But one day you start having server hiccups because your scripts can't keep up with your traffic. Sites like Amazon have already run into this and have moved away from scripting languages and back to system languages. Running applications directly on the CPU instead of relying on a runtime to translate (at best) bytecode into machine instructions means maximizing CPU cycles.

    So I wonder what long-term benefit there is in improving the language runtime.

    1. Re:High performance in scripting languages? by sakdoctor · · Score: 5, Insightful

      1. Static HTML
      2. PHP
      3. ???
      4. Rewrite the PHP runtime

      Truth is that step 3 involves a whole load of steps where 90% of the problem will be database bound. Compiled languages are not going to be the magic solution in a real world situation.

    2. Re:High performance in scripting languages? by gbjbaanb · · Score: 3, Insightful

      Compiled languages are not going to be the magic solution in a real world situation.

      Whilst that's perfectly true, it's only true to a point. Lots of people run eAccelerator or APC on their PHP sites, simply to improve performance. If these pre-compile caches didn't do anything for performance then you'd be totally right, but as they do, you've got to appreciate that replacing the PHP with a compiled language will make a significant difference.

      As always, don't guess where performance problems live, measure them. Often you'll be surprised, especially as load increases.
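To make "measure, don't guess" concrete, here is a minimal sketch in Python (not PHP) that times the phases of a request separately. The names `timed`, `fake_db_query`, `render` and `handle_request` are hypothetical, and the sleep only stands in for a real database round trip.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, sink):
    """Accumulate the wall-clock time spent inside the block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        sink[label] = sink.get(label, 0.0) + (time.perf_counter() - start)


def fake_db_query():
    # Hypothetical stand-in for a database round trip.
    time.sleep(0.05)
    return [1, 2, 3]


def render(rows):
    # Hypothetical stand-in for the page-generation code.
    return "".join(f"<li>{row}</li>" for row in rows)


def handle_request(sink):
    """Serve one fake request, recording per-phase timings in `sink`."""
    with timed("database", sink):
        rows = fake_db_query()
    with timed("render", sink):
        html = render(rows)
    return html
```

Calling `handle_request(timings)` with an empty dict fills it with one entry per phase, which shows where the time actually goes before anyone decides what to rewrite.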

    3. Re:High performance in scripting languages? by amn108 · · Score: 4, Insightful

      Check your facts better next time please.

      PHP does not have to execute scripts from scratch on every request. The APC cache API transparently caches JIT-compiled PHP script interpretations, and these are run instead upon request.
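The idea of caching compiled scripts can be sketched by analogy in Python, using the built-in `compile` and `exec`. This is only an illustration of the concept, not how APC is implemented; the `ScriptCache` class and the script name are hypothetical.

```python
class ScriptCache:
    """Compile each script source once and reuse the compiled code object."""

    def __init__(self):
        self._compiled = {}      # (name, source) -> code object
        self.compile_count = 0   # how many times we actually compiled

    def run(self, name, source, env=None):
        key = (name, source)
        code = self._compiled.get(key)
        if code is None:
            # Cache miss: pay the parsing/compilation cost once.
            code = compile(source, name, "exec")
            self.compile_count += 1
            self._compiled[key] = code
        env = {} if env is None else env
        exec(code, env)  # Cache hit or miss, execution uses the compiled form.
        return env
```

Running the same source twice through one `ScriptCache` compiles it only once; changing the source produces a new cache key and a fresh compilation.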

      Apache does not have to compile a regular expression for each URL request. When you specify RewriteRule directives in virtual server or server context, it compiles them (or however else it wishes to cache these) on server startup, and that is it. During the entire lifetime of the server, the specified rules are no longer interpreted.
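The same compile-once pattern for rewrite rules, as a rough Python sketch. It does not model Apache's actual mod_rewrite behaviour; the `Router` class and the example rule are assumptions made for illustration.

```python
import re


class Router:
    """Compile all rewrite patterns once at startup, then reuse them per request."""

    def __init__(self, rules):
        # rules: iterable of (pattern, replacement template) pairs.
        self.rules = [(re.compile(pattern), target) for pattern, target in rules]

    def rewrite(self, path):
        for pattern, target in self.rules:
            match = pattern.match(path)
            if match:
                # Expand backreferences such as \1 in the template.
                return match.expand(target)
        return path  # No rule matched: leave the path untouched.
```

A router built with the rule `(r"^/article/(\d+)$", r"/article.php?id=\1")` maps `/article/42` to `/article.php?id=42` without recompiling the pattern on each call.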

      Other than that I agree that the current style of server infrastructure we are "enjoying" can be improved at least 100-fold, database access included.

      * Truly persistent (across HTTP requests) user memory is a good step in the right direction - extend the lifetime of all scripted objects to persist across the entire application lifetime (i.e. forever - servers are not supposed to die)

      * Many database requests are so primitive that these could bypass the TCP socket layer and benefit from extra speed at the cost of no longer being asynchronous. Most PHP developers use blocking database query APIs anyway, even though non-blocking calls are available.

      * Compile scripts and run from compiled cache, preferably as machine code. Think about the environment - all those quad-core datacenters wasting cycles, because programmers are supposedly very expensive. Well, they are not so expensive that a good team cannot get together and rethink this. They can. If nothing else, they should worry about the environment too - it is not cheap.

      * Offer asynchronous calls where these make sense (i.e. where they can benefit from hardware parallelization). Those devs who know how to make use of them will be happy to speed up their applications.

      In short, cache everything, whenever you can. Memory is cheap. Cache SQL query strings, cache script compilation output, cache, cache, cache.
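A minimal sketch of caching results keyed by the SQL query string, written in Python with the standard `sqlite3` module. The `QueryCache` class, the `users` table and the `run_query` helper are hypothetical, and invalidation is left as a manual call.

```python
import sqlite3


class QueryCache:
    """Memoize query results in memory, keyed by (SQL string, parameters)."""

    def __init__(self, run_query):
        self._run_query = run_query
        self._results = {}
        self.hits = 0
        self.misses = 0

    def fetch(self, sql, params=()):
        key = (sql, tuple(params))
        if key in self._results:
            self.hits += 1
            return self._results[key]
        self.misses += 1
        rows = self._run_query(sql, params)  # Only reached on a cache miss.
        self._results[key] = rows
        return rows

    def invalidate(self):
        # Call after writes; otherwise cached rows can go stale.
        self._results.clear()


# Hypothetical in-memory database used only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")


def run_query(sql, params):
    return conn.execute(sql, params).fetchall()
```

The trade-off is staleness: memory may be cheap, but any write to the underlying table has to invalidate the matching entries, which this sketch does only wholesale.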

  2. So is it a fork or isn't it? by erroneus · · Score: 3, Insightful

    Sounds like Facebook rewrote PHP and then invited PHP core developers to adopt it as their core development platform? I can't imagine that went over all that well... probably hit a number of them in the pride region. And the article said it is to be released as open source, but failed to mention the license. Will this be some sort of twisted "FriendFace Public License" or some perversion?

    This is not what is meant when a party contributes to an open source project. "Here, I rewrote it for you. It's better. Now just throw away everything else you've done and use this." Really?

  3. One man effort by hey · · Score: 3, Insightful

    So there is one guy at Facebook doing this PHP rewrite. It must be possible to figure out who he is. Have they hired any high profile PHP developers?

  4. Re:Assembler is High Performance by erroneus · · Score: 4, Insightful

    Actually, that isn't necessarily true. It might be true in a linear sense, but when it comes to juggling different threads and the like, assembly language as I knew it wasn't all that capable of describing the process all that well. Assembly language is a go-cart with rocket boosters.

    I would truly like to see an assembly language revival though. I truly would. It would be a return to sensibilities in programming. It would be a return to being careful with memory usage with improved focus on small efficient programming. It would be a really good thing. I just don't see it happening.

    The purpose of these more complex languages is to describe more symbolically the processes to be executed by the machine. Assembly language was among the worst at that -- if there wasn't a very detailed set of comments for nearly every line of code, it would be nearly impossible to follow in source. These more complex languages will always have their place and purpose. Trying to make them more efficient is a good and useful thing. Now if we were talking about writing the PHP interpreter in assembler, I'd say you had a winning compromise.

  5. Three sources of scripting language inefficiency by tepples · · Score: 5, Insightful

    I don't know what the fascination is with scripting languages on the Linux platform or with FOSS in general, but it results in slow programs

    Speed of development is faster in a scripting language, and in developed countries, below a certain scale, throwing hardware at it is cheaper than throwing programmers at it. The point of the article is that Facebook is above that scale, and programmers to write a new PHP interpreter have become cheaper than adding hardware+power+cooling.

    with flaky UIs.

    Citation needed. True, they often use a different widget set from the rest of the desktop (e.g. Tk from Tcl and Python and Swing from Java), but the popular widget sets also have scripting language bindings. How can one really tell the difference between a wxWidgets or GTK app written with Python vs. C++?

    I like to use refurbished/recycled machines; which means that I'll have an old P4, 512M RAM and a slow bus.

    Do these use more electric power than, say, an Acer Aspire Revo? The power consumption of a Pentium 4 and the power to remove the heat it generates can become an issue, especially for a server that's turned on 24/7.

    Many times, applications written in a scripting language, whether it be Perl, Python, PHP, or whatever, will hang often and then start working.

    There are three causes for this, and you can distinguish them with 'top' or 'Task Manager' or something else that can count CPU time and page file accesses:

    • Swapping: More dynamic languages tend to use more general data types, which incur memory overhead. For instance, they might use UTF-16 strings instead of 8-bit strings with an assumed encoding. Or they might use double-precision floating point instead of short integers for large arrays. This might cause a program to run out of RAM and fail over to the disk more often.
    • Garbage collection: This covers ways of determining which resources are no longer in use by any active part of the program. Python, Objective-C, and lately C++ primarily use an incremental garbage collection method called reference counting, which keeps track of the number of things that "know about" an object. But some other language interpreters use only tracing garbage collection, which in the naive form causes the application to pause and make a list of all live objects once memory allocations exceed some amount. This will cause a CPU spike.
    • Blocking: A lot of the APIs available to programs in scripting languages don't have well-known non-blocking versions. For example, a host name lookup might freeze the program until it finishes. The only way to work around these is to start a thread. This will cause a pause with 0 CPU and 0 disk.
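The thread workaround for blocking calls mentioned in the last bullet can be sketched in Python as follows. The `slow_lookup` function is a hypothetical stand-in for a blocking host name lookup; it sleeps instead of touching the network.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_lookup(host, delay=0.2):
    """Hypothetical blocking call: waits with ~0 CPU and 0 disk, then returns."""
    time.sleep(delay)
    return f"resolved:{host}"


def run_demo(host="example.org"):
    """Run the blocking call on a worker thread while the caller keeps working."""
    ticks = 0
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(slow_lookup, host)
        while not future.done():
            # The main thread stays responsive here (e.g. it could redraw a UI).
            ticks += 1
            time.sleep(0.01)
        return future.result(), ticks
```

`run_demo()` returns the lookup result together with the number of loop iterations the main thread completed while waiting, which is the work that a direct blocking call would have frozen.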
  6. Re:PHP is slow (check), now what.... by jimicus · · Score: 3, Insightful

    Why not just stash your farm of slow php systems behind some heavy duty caching appliance(s)?

    Something like aicache might fit the bill.

    When each iteration of your application generates more content dynamically than the last (and you want to continue down that route), the benefit of caching starts to drop quite quickly.

  7. JavaScript by tepples · · Score: 3, Insightful

    Flaky UIs - click on a button and nothing happens. Or things not drawing properly.

    I've seen buttons do nothing and redraws fail even in compiled programs.

    A refurb machine is about a third the cost of a new machine

    By "cost", do you include or exclude the cost of power and cooling? And do you include or exclude the cost of replacing failed components? Capacitors die.

    scripting languages are not appropriate for large applications with GUIs.

    One scripting language has a huge deployment advantage over everything else: ECMAScript. It interacts with Document Object Models exposed by various runtime environments, and it's sandboxed so that users can more or less safely run a program without getting an administrator to install it. You might know it as JavaScript (ECMAScript + HTML DOM) or ActionScript (ECMAScript + SWF DOM). Or would you rather go back to ActiveX, where the web site sends the equivalent of a compiled DLL to each user, which runs with the user's full privileges and doesn't run on anything but a convicted monopolist's operating system?

  8. Re:Screw PHP, I write everything in C by Galactic+Dominator · · Score: 3, Insightful

    I don't know why this has not yet been linked

    Just taking a shot in the dark here, but I'll attempt an answer. The reason no one else linked to it is because you're the only one who considers it obligatory. Slashdot regulars will know that this type of thread happens on a near daily basis and with all due respect to xkcd there is simply no need to make another tired attempt at karma whoring.

    --
    brandelf -t FreeBSD /brain

  9. Re:Assembler is High Performance by javacowboy · · Score: 4, Insightful

    I'm a Java developer with 10 years of experience developing enterprise grade server applications. We use Java, like the majority of Fortune 500 companies, because a Java app can be maintained with a development team greater than 1 coder, common memory coding errors and behaviours are avoided, a large API library prevents us from having to re-invent the wheel constantly, and the JVM is battle-tested in large deployments.

    But, no, I guess I'm just a kid who doesn't know how to code.

    --
    This space left intentionally blank.

  10. Re:"Java" and ".Net" as an inspiration? No, thank by Abcd1234 · · Score: 4, Insightful

    Speed of "Java" and ".Net"? Is it a joke?

    No, it's not.

    "Java" hangs all the time

    No it doesn't.

    and the ".Net" code to do a simple task is so convoluted that it is just ridiculous.

    No, it's not.

    Honestly, you really have no fucking clue what you're talking about, do you?