Slashdot Mirror


Facebook's HipHop Also a PHP Webserver

darthcamaro writes "As expected, Facebook today announced a new runtime for PHP, called HipHop. What wasn't expected were a few key revelations disclosed today by Facebook developer David Recordon. As it turns out, Facebook has been running HipHop for months and it now powers 90 percent of their servers; it's not a skunkworks project, it's a live production technology. It's also not just a runtime, it's also a new webserver. 'In general, Apache is a great Web server, but when we were looking at how we get the next half percent or percent of performance, we didn't need all the features that Apache offers,' Recordon said. He added, however, that he hopes an open source project will one day emerge around making HipHop work with Apache Web servers."

25 of 304 comments

  1. GUI applications by sopssa · · Score: 5, Interesting

    While there are already several libraries for creating windows and interfaces with PHP, and for bundling them into an executable file, this might greatly improve that area of PHP too. Besides being faster, compiling to machine code also protects your source.

    Along with making it work with Apache Web servers I hope someone works on this aspect too. PHP is really nice and fast to write. *ducks from the c/c++ coders*

    Definitely interesting project.

    1. Re:GUI applications by shutdown -p now · · Score: 4, Funny

      PHP is really nice and fast to write. *ducks from the c/c++ coders*

      You should duck from Python and Ruby coders. The C++ guys are too busy beating up Java schmucks. ~

    2. Re:GUI applications by dgatwood · · Score: 4, Insightful

      What makes PHP nice is that it is so close to C. For people who are comfortable working in C, PHP is just a few dollar signs away.

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    3. Re:GUI applications by MightyMartian · · Score: 5, Funny

      What makes PHP nice is that it is so close to C. For people who are comfortable working in C, PHP is just a few dollar signs away.

      Which is like saying an anus is almost like a vagina...

      Feel free to take that analogy the distance.

      --
      The world's burning. Moped Jesus spotted on I50. Details at 11.
    4. Re:GUI applications by ls671 · · Score: 4, Insightful

      Application servers based on Java are heavy only on start up, the allocated memory is then reused which makes it light on system load once started.

      Java uses some of its memory to cache machine code in order to re-execute it the next time it is needed and this also makes it light on system load.

      Simply by using top, you could understand what I am talking about. Java uses more memory, but it is otherwise very light on system load, and guess what?

      Machines typically have 4 GB of RAM nowadays.

      Most people bitching about Java being heavy do not understand what I am trying to explain to you here ;-)

      --
      Everything I write is lies, read between the lines.
    5. Re:GUI applications by eihab · · Score: 5, Insightful

      Hopefully he has upgraded to the "once in a while" switch replacement technique.

      That struck me as weird, because as a programmer you usually start with conditionals and then move on to loops. I had a hard time believing that someone would know of "while(true)" and not "else if".

      So I decided to run some tests over dinner. I'm no C++ programmer but here's how I went with this.

      First I wrote a tests.cpp that looks like this:

      #include <iostream>
      int main () {
          int subType, mainType = 11;

          // Slashdot_Filter_Sucks: editable section
          while (true) {
              if (mainType == 7) {
                  subType = 4;
                  break;
              }
              if(mainType == 9) {
                  subType = 6;
                  break;
              }
              if(mainType == 11) {
                  subType = 9;
                  break;
              }
              break;
          }
          // Slashdot_Filter_Sucks: end of editable section

          // The post is cut off after "std :: cout"; printing subType is assumed.
          std::cout << subType << std::endl;
          return 0;
      }

      I compiled that and it resulted in an 8120-byte binary that ran in 0.005ms.

      I thought about other obvious and simple ways to write this code and I created four more versions that are identical except for the code between the dividers (I had pretty asterisk lines but Slashdot's junk filter made me take it off). They are:

      testif.cpp (test using an if/else statement):

      if (mainType == 7) subType = 4;
      else if (mainType == 9) subType = 6;
      else if (mainType == 11) subType = 9;

      testifonly.cpp (no else, only ifs):

      if (mainType == 7) subType = 4;
      if (mainType == 9) subType = 6;
      if (mainType == 11) subType = 9;

      testswitch.cpp (using a switch statement):

      switch(mainType) {
          case 7: subType = 4;   // no break: falls through to the next case
          case 9: subType = 6;   // no break: falls through to the next case
          case 11: subType = 9;
      }

      testp.cpp (subtract 3 from mainType since that seemed like a pattern):

      subType = mainType - 3;

      I compiled everything using g++ then I ran time ./output. All the versions ran on average in 0.005ms, however, the binary sizes were different:

      #ls -l (ordered by size)
      8072 testp
      8109 testifonly
      8120 tests
      8121 testif
      8125 testswitch

      Ok, no case here in terms of size. So I tried compiling again with -O3, and the results were:

      #ls -l (ordered by size)
      8024 testp_o3
      8024 tests_o3
      8025 testif_o3
      8029 testifonly_o3
      8029 testswitch_o3

      Here it seems that the subtraction and the weird while/break method have the smallest file size. Without code context, one can imagine that subType was to be left alone if mainType was not 7, 9 or 11, which would mean the subtraction code wouldn't work in that scenario.

      Now, I don't know the intricacies of C++ or Assembly, but I have to wonder if this was the work of a moron or someone who knew exactly what they were doing and did so for a reason.

      Again, without context, none of this matters.

      --
      If you can't mod them join them.
  2. A stupid question... by __aaclcg7560 · · Score: 3, Insightful

    For all the trouble you're going through to convert PHP into C++ (300,000 lines and 5,000 unit tests), wouldn't programming in C++ in the first place be easier?

    1. Re:A stupid question... by shutdown -p now · · Score: 4, Insightful

      As a programming language, PHP is simple. Simple to learn, simple to write, simple to read, and simple to debug.

      PHP is not a simple language. A hallmark of a simple language is consistency, and PHP is anything but - I won't even touch on the mess that is the standard function library, but just the language itself. For example, this gem, taken directly from the language spec, regarding array indices/keys:

      A key may be either an integer or a string. If a key is the standard representation of an integer, it will be interpreted as such (i.e. "8" will be interpreted as 8, while "08" will be interpreted as "08"). Floats in key are truncated to integer. The indexed and associative array types are the same type in PHP, which can both contain integer and string indices.

      This is awesome on many levels. The obvious fubar is the treatment of "8" vs "08" (and note that, while it is clearly obvious when a string literal is used in the source code, how about a string variable, or other expression computed at runtime?). But the bit about silent float->int truncation is also interesting, especially the "silent" part. Combined with rounding errors and the overall non-obviousness of binary floating-point arithmetic (especially to a typical PHP coder), this design decision is just hilarious.

      I've long held the opinion that C/C++ rules on mixed signed/unsigned arithmetic and comparisons are a good example of awful language design, but PHP beats that by a margin so large it's not even funny.

      Oh, I also don't know of any other language that has what effectively amounts to syntactic sugar for try/catch with an empty catch block. Good programming practices FTW!

      I find it curious, by the way, that PHP coders like to compare the language to C++ or Java - where it actually has some subjective advantages, such as dynamic typing - but very rarely to Perl, Python or Ruby, where all such advantages disappear, but design flaws immediately stand out.

    2. Re:A stupid question... by MightyMartian · · Score: 3, Insightful

      PHP is a lot better environment to develop new features quickly and doesn't get you into so many security pitfalls. And they're already using C++ for some parts of the site:

      Except for the problem that, historically, PHP has been one big vast security pitfall.

      --
      The world's burning. Moped Jesus spotted on I50. Details at 11.
    3. Re:A stupid question... by Cyberax · · Score: 4, Informative

      An experienced C++ programmer rarely creates memory leaks, and they are easily detected by a variety of tools.

      Also, for PHP-style programs it might be easier to just restart a child server process each N requests. So memory leaks are of even less concern.

      The main problem is compilation speed. C++ compilers are just plain slow.

    4. Re:A stupid question... by shutdown -p now · · Score: 4, Insightful

      Because it's really easy to create memory leaks and similar bugs in C++.

      It's very easy to get rid of memory leaks in C++, as well. A very simple rule: never, ever write a type declarator with a * in it. In other words, no raw pointers - use Boost/TR1 shared_ptr, or roll out your own, it doesn't matter - just use it consistently. At that point, you can still get reference cycles (which are also leaks), but you can do that just as well in PHP before 5.3, which also uses reference counting with no GC for cyclical references. And the usage of 5.3 so far is minuscule.

      Alternatively, just use any of third-party tracing GCs, such as Boehm.

      By the way, from personal experience, I find that languages with built-in reference counting and no cycle detection (those I know of are VB6 and PHP) are actually more prone to memory leaks when coding than languages with explicit memory management. The reason is that, in, say, C++, coders are actually aware of issues such as memory allocation, and view smart pointers as convenient helpers, not as some kind of magic fairy dust. Because of that, the question of "what happens if two smart pointers reference each other" is a rather obvious one, and the issue is noticed and rectified early on. In contrast, in VB6 and PHP, you don't have to deallocate explicitly, so refcounting is magic - many people don't even understand how the algorithm works! - that is, until you run into a cyclic reference that leaks...

    5. Re:A stupid question... by dgatwood · · Score: 3, Informative

      The @ syntax is not a try/catch. PHP doesn't stop execution when it encounters errors opening files and stuff. It merely blasts a warning message to the output stream (web client). The @ operator suppresses that output. It's equivalent to sending the perror() after a failed fopen() call to /dev/null. Whether the command succeeds or fails, control still returns to your code after the statement. The @ operator merely suppresses the error message generated by PHP so that you can display a more appropriately formatted and/or more useful error message (or not display a message at all if the failure is expected). In a production environment, most people disable the warning output from PHP anyway, making it basically a no-op except during debugging.

      If you folks want an argument against PHP, you're all going about it wrong. Probably the best argument against PHP is that it makes it easy to design yourself into a corner---putting code into the middle of HTML templates that suddenly needs to be able to set a header field and "whoops, that has already been sent", putting code into the middle of templates that needs to change the content up a level, and "whoops, have to add a hack over here to fix that", etc. The result is that for really simple sites, PHP is awesome, but it has real problems scaling to more complicated designs without dropping the templates (or at best, including them on the back end after you do a lot of compute processing up front to set things up, choose a template programmatically, etc.).

      Or you could simply attack it for being a lot slower than C and leading to design patterns that waste lots of memory. For example, associative arrays are simple and easy to use, but 90% of the time, there are much simpler data structures that can do just as well. If your data structures are small, no problem. If you deal with something big, the difference in memory pressure between a clean, lightweight binary tree (even without balancing) and an associative array can result in an order of magnitude impact in performance (or two or three).

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    6. Re:A stupid question... by abulafia · · Score: 3, Insightful

      No, I think I'll stick with attacking it for being a truly crappy language. I don't care that it is slow or wastes memory. If you're paying $20/month for your dance-school-business-calendar installed and customized by a local teenager, the idea of writing a web app in C for efficiency is silly. Likewise, whatever @ is intended for, I most certainly expect people have abused and will abuse it in exactly the way described to "fix" problems. People endlessly bash, for instance, Perl as being write only, and there's truth to that. But there is truth to that because the language tends to encourage hard to read code. You can say that's not the intent, and you'd be right, but that doesn't matter. (Though I do still love Perl.)

      I do agree that PHP is fine for toy web sites, and that people get themselves in trouble using the executable web page model because they don't know what they're doing. These things are true for the same reason: PHP is full of poorly thought out magic that allows people to get in over their head, and doesn't provide the tools to easily dig back out. I'm all for making programming more accessible, but encouraging people to foot-bullet themselves in predictable ways doesn't strike me as a good approach.

      I dislike it for other reasons, but for instance Ruby on Rails is a much more solid approach, in my opinion - the path of least resistance is generally the right thing to do, once a newbie internalizes the MVC idea and a couple conceptual points the learning curve is pretty gentle, and Ruby is a pretty well constructed language that lets people grow into using more conceptually useful techniques over time without the up-front demands of learning, say, Lisp.

      (While I'm chasing people off my lawn, the whole RoR mindset seems to lead people down a rabbit hole of writing dumb little DSLs -- who on earth thinks a toy language for generating CSS is a good idea? You just push yourself one more indirection layer away from what's going on and end up dinking around with yet another silly new syntax for your effort. Muppet coding at its worst.)

      --
      I forget what 8 was for.
  3. Well, I Guess "HipHop" Is the New Champ! by RobotRunAmok · · Score: 5, Funny

    And here I never thought that anything could ever take the award for "Most Stupidly Named Software" away from the Ubuntu distros.

    Congrats again, HipHop! Can I get a Fist-Bump?!

    1. Re:Well, I Guess "HipHop" Is the New Champ! by NevDull · · Score: 4, Insightful

      My guess is that it was probably a progression from "Haiping's PHP" to HPHP to HipHop.
      Two syllables vs four or more... looks like they're not just computing more efficiently, but also speaking more efficiently!

  4. Re:Ambitious by sopssa · · Score: 3, Insightful

    He said they were struggling just to get half a percent more performance with Apache. That had nothing to do with "HipHop".

    In reality their CPU usage dropped by 50% on average:

    With HipHop we've reduced the CPU usage on our Web servers on average by about fifty percent, depending on the page.

  5. Ah ha! by bernywork · · Score: 3, Funny

    This explains a lot, a couple - few months ago, I started getting complaints about "potentially virus infected" / "unscanable" zip files when being served content from facebook.com and fbcdn.net etc.

    They probably changed how they were sending data out of the web server with zip compression around then, and it all started falling over at that point....

    I was wondering what the change was....

    --
    Curiosity was framed; ignorance killed the cat. -- Author unknown
  6. Facebook Still Runs Terribly Slow by ztransform · · Score: 3, Funny

    Chat routinely freezes up the browser, and people appear offline when they are online.

    I frequently get error messages from pages that won't dynamically load (there is something wrong with the server, or such message).

    Facebook doesn't need a half percent increase in performance, they need a lot more!

  7. Re:Ambitious by PhiberOptix · · Score: 3, Informative

    Sure, hardware's cheap, but when you have over 30k* servers, a 1% saving on them might be worth their coders' time.

    * http://www.datacenterknowledge.com/archives/2009/10/13/facebook-now-has-30000-servers/

  8. From TFA... by BitHive · · Score: 3, Funny

    As a programming language, PHP is simple. Simple to learn, simple to write, simple to read, and simple to debug. We are able to get new engineers ramped up at Facebook a lot faster with PHP than with other languages, which allows us to innovate faster.

    hahahahahhahahahhahahahahhahahahahhahahah

  9. Re:i can hear it now by bill_mcgonigle · · Score: 4, Insightful

    the php haters: "look how awful php is, you need to convert everything into c++ before you can use it in really large scale deployments!"

    "Look how awful C++ is, you have to write bits in assembly to get it to really run."

    "Look how awful assembly is, you really optimize when you can write machine opcodes."

    And the microcode guys just glare out from their caves with their glowy little eyes in incredulity.

    Elsewhere is heard, "You guys still use CPUs? It's the GPU decade, dude."

    And somebody down the hall builds an ASIC to solve a specific problem and thinks he's so smart.

    But, the analog EE understands his elegant circuit doesn't enable a team of 200 developers to build the top social networking site.

    --
    My God, it's Full of Source!
    OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
  10. Re:Ambitious by shutdown -p now · · Score: 3, Interesting

    They wrote their own webserver/php-interpreter

    They didn't write a PHP interpreter. They wrote a PHP-to-C++ translator.

    Also, I presume that "one-to-half percent" refers to further optimization opportunities after they've done that.

  11. Re:i can hear it now by ancientt · · Score: 4, Interesting

    I'm both sides.

    I love PHP when I need to throw something together fast (like today) but don't expect a lot of heavy use. I love PHP when I want to get some handy tools that I can easily hack into doing what I really want. Still, when I have a significant project, and server load starts to matter, I loathe trying to use PHP and would usually rather write it as Perl, sometimes even compiling (gasp) Perl into something about as efficient and a whole lot more reliable than if I tried to write it in C. If I were really serious, I'd write it in C, but a day's work in C is 30 minutes in Perl or 10 in PHP.

    Choice of language for me is about return on investment. I'm not a grand programmer, and I don't have the luxury of getting comfortable programming; if I'm programming, it means that I'm not spending time on the dozens of other issues confronting our IT department. Most of the time if we're doing any sort of serious project, we're buying service from somebody who does it better than I have time to, probably better than I could.

    If this HH thing (no, can't stand to type the real name) gets momentum then it could be really good for shops like ours. We could turn the tools we don't have time to do well into things that don't suck so much and the tools that we wouldn't think were worth the hardware into things we can afford to run.

    --
    B) Eliminate all the stupid users. This is frowned upon by society.
  12. duration of leak by pikine · · Score: 4, Interesting

    We're talking about C++ as a CGI script. Who cares about memory leaks that only last for the duration of an HTTP request, which is a fraction of a second? The real problem with memory leaks is when you have a long-running process like single-process web browsers.

    --
    I once had a signature.
  13. Analogy of the year by oldhack · · Score: 3, Interesting

    "Which is like saying an anus is almost like a vagina..."

    I bow down in respect. Somebody mark this post for posterity. It's only Feb 2, but this has to be the Analogy of the Year.

    --
    Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.