Slashdot Mirror


Facebook Rewrites PHP Runtime For Speed

VonGuard writes "Facebook has gotten fed up with the speed of PHP. The company has been working on a skunkworks project to rewrite the PHP runtime, and on Tuesday of this week, they will be announcing the availability of their new PHP runtime as an open source project. The rumor around this began last week when the Facebook team invited some of the core PHP contributors to their campus to discuss some new open source project. I've written up everything I know about this story on the SD Times Blog."

66 of 295 comments

  1. is this being used now? by Anonymous Coward · · Score: 5, Funny

    Is this what they're using on the newly redesigned site? Because if so, it's pathetically slow. Facebook is one of those places that with every attempt to "improve" things somehow manages to make it worse and worse. They're a perfect candidate for a Microsoft buyout.

    1. Re:is this being used now? by Daengbo · · Score: 5, Informative

      Try the "Lite" version. It's much faster, and doesn't have that annoying chat bar.

    2. Re:is this being used now? by rel4x · · Score: 2, Informative

      Facebook has a few problems. Overuse of ajax combined with this absolutely bizarre habit of including dynamic javascript at random points in the script. These lead to slower runtimes, especially with older browsers where (upon encountering a JS file) they completely stop doing everything else to execute it.

      --

      Before you mod me funny, think, perhaps I was insightfully funny?
  2. Screw PHP, I write everything in C by Anonymous Coward · · Score: 5, Funny

    PHP is for lazy developers. I develop my webapps in C and I even wrote my own httpd to improve performance.

    1. Re:Screw PHP, I write everything in C by biryokumaru · · Score: 4, Funny

      C is for lazy developers. I develop my webapps in JWASM and I even wrote my own httpd to improve performance.

      --
      When you're afraid to download music illegally in your own home, then the terrorists have won!
    2. Re:Screw PHP, I write everything in C by eclectro · · Score: 4, Funny

      I develop my webapps in C and I even wrote my own httpd to improve performance.

      C is for lazy coders if you ask me. I hand code and tune assembly language programs on an altair 8800 flipping switches on the front panel. As all real programmers should do.

      --
      Take the cheese to sickbay, the doctor should see it as soon as possible - B'Elanna Torres, "Learning Curve"
    3. Re:Screw PHP, I write everything in C by sopssa · · Score: 4, Funny

      JWASM is for lazy developers. I develop my webapps in machine code and I even wrote my own internet to improve performance.

    4. Re:Screw PHP, I write everything in C by biryokumaru · · Score: 5, Funny

      Servers are for lazy developers. I develop my webapps in my head and I even deliver the pages manually to improve performance.

      --
      When you're afraid to download music illegally in your own home, then the terrorists have won!
    5. Re:Screw PHP, I write everything in C by deniable · · Score: 4, Funny

      Excuse me, I'm still waiting for the last page and it's almost bed time.

    6. Re:Screw PHP, I write everything in C by Anonymous Coward · · Score: 5, Funny

      Webapps are for lazy developers. I sip my coffee, causing a multiply entangled photon to collapse and resolve the location of countless electrons throughout the universe, spurring various exotic species of butterflies to flap their wings a twitch differently, which in turn subtly alters the flow of the viscid gaseous matter in Earth and various other planets, affecting the touch organs of living matter that can feel moving fluid, which messenger or nervous impulses relay to their minds, creating customer-tailored web content for my clients.

    7. Re:Screw PHP, I write everything in C by Galactic+Dominator · · Score: 3, Insightful

      I don't know why this has not yet been linked

      Just taking a shot in the dark here, but I'll attempt an answer. The reason no one else linked to it is because you're the only one who considers it obligatory. Slashdot regulars will know that this type of thread happens on a near daily basis and with all due respect to xkcd there is simply no need to make another tired attempt at karma whoring.

      --
      brandelf -t FreeBSD /brain
    8. Re:Screw PHP, I write everything in C by BRSQUIRRL · · Score: 3, Funny
    9. Re:Screw PHP, I write everything in C by rpetre · · Score: 3, Informative

      I'd like to point out that long before xkcd there was userfriendly, and that in my circle we still like to end this sort of joke by saying "magnets" and giggling. The "Edward Lorenz, the butterfly and the chaos theory" punchline seems a bit forced (unless you go for the 'M-x butterfly' twist to make the emacs guy get the attention ;) )

    10. Re:Screw PHP, I write everything in C by S.O.B. · · Score: 4, Funny

      Nice try. I rewrote the universe to include php and httpd in the kernel.

      Reboot in 3...2...

      --
      Some of what I say is fact, some is conjecture, the rest I'm just blowing out my ass...you guess.
    11. Re:Screw PHP, I write everything in C by quickOnTheUptake · · Score: 2, Insightful

      you must be *REALLY* new here

      --
      Mod points: Guaranteed to remove your sense of humor.
      Side effects may include gullibility and temporary retardation
    12. Re:Screw PHP, I write everything in C by mkosmul · · Score: 2, Funny

      Actually programming in your head is for lazy developers. I didn't write anything, I only proved that my algorithm is correct.

    13. Re:Screw PHP, I write everything in C by that+this+is+not+und · · Score: 2, Funny

      You're hitting the firewall. That's the decoy page you're seeing.

      Telnet is blocked in inetd.conf, you'll have to ssh.

    14. Re:Screw PHP, I write everything in C by supernova_hq · · Score: 2, Funny

      Too late, I wrote it as a module and modded it in during runtime.

  3. High performance in scripting languages? by BadAnalogyGuy · · Score: 5, Insightful

    At some point, if you are lucky enough, you will require extremely high performance from your web pages. You start out coding HTML in Notepad and move on to Perl CGI then on to PHP with scripting embedded right in the generated HTML. All the time you gain programming crutches at the expense of processing speed, and for a while this is a great tradeoff.

    But one day you start having server hiccups because your scripts can't keep up with your traffic. Sites like Amazon have already run into this and have moved away from scripting languages and back to system languages. Running applications directly on the CPU instead of relying on a runtime to translate (at best) bytecode into machine instructions means maximizing CPU cycles.

    So I wonder what longterm benefit there is in improving the language runtime.

    1. Re:High performance in scripting languages? by sakdoctor · · Score: 5, Insightful

      1. Static HTML
      2. PHP
      3. ???
      4. Rewrite the PHP runtime

      Truth is, step 3 involves a whole load of steps where 90% of the problem will be database bound. Compiled languages are not going to be the magic solution in a real world situation.
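
The "90% database bound" point can be made concrete with Amdahl's law. The sketch below is only an illustration: it assumes that "90% database bound" means 90% of request time is spent waiting on the database, and the function name `overall_speedup` is made up for this example.

```python
def overall_speedup(accelerated_fraction: float, local_speedup: float) -> float:
    """Amdahl's law: whole-request speedup when only part of the work gets faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    untouched = 1.0 - accelerated_fraction
    return 1.0 / (untouched + accelerated_fraction / local_speedup)


# Assumed split: 90% of request time in the database, 10% in script execution.
script_fraction = 0.10
for factor in (2, 10, 1000):
    gain = overall_speedup(script_fraction, factor)
    print(f"script {factor}x faster -> request {gain:.3f}x faster")
```

Under that assumed split, the whole request can never get more than 1/0.9 (about 1.11) times faster, no matter how fast the script part becomes.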

    2. Re:High performance in scripting languages? by sopssa · · Score: 4, Interesting

      Exactly, and it's not like there is so much heavy processing CPU-wise. Facebook has probably calculated that they can get enough performance out of recoding the runtime (even 1% is large enough for a site as large as Facebook). While doing that they also create a faster runtime that everyone can use. Everyone moving to write their sites in C/C++ doesn't make any sense.

      Also a lot of the site structure can be cached in memcache or accelerating proxies like squid, so you aren't actually interpreting PHP code lots of the time. Facebook also did a lot of work towards memcache, because they are mostly a DB heavy site, not CPU.

    3. Re:High performance in scripting languages? by gbjbaanb · · Score: 3, Insightful

      Compiled languages are not going to be the magic solution in a real world situation.

      Whilst that's perfectly true, it's only true up to a point. Lots of people run eAccelerator or APC on their PHP sites, simply to improve performance. If these pre-compile caches didn't do anything for performance then you'd be totally right, but as they do, you've got to appreciate that replacing the PHP with a compiled language will make a significant difference.

      as always, don't guess where performance problems live, measure them. Often you'll be surprised, especially as load increases.

    4. Re:High performance in scripting languages? by amn108 · · Score: 4, Insightful

      Check your facts better next time please.

      PHP does not have to execute scripts from scratch on every request. The APC cache transparently caches the compiled opcodes of PHP scripts, and these are run instead upon request.

      Apache does not have to compile regular expression for mapping each URL request, when you specify RewriteRule directives in virtual server or server context it compiles them (or however else it wishes to cache these) on server startup and that is it. During the entire lifetime of the server, the specified rules are no longer interpreted.

      Other than that I agree that the current style of server infrastructure we are "enjoying" can be improved at least 100-fold, database access including.

      * Truly persistent (across HTTP requests) user memory is a good step in the right way - extend the lifetime of all scripted objects to persist across entire application lifetime (i.e. forever - servers are not supposed to die)

      * Many database requests are so primitive that these could bypass the TCP socket layer and benefit from extra speed at the cost of no longer being asynchronous. Most PHP developers use blocking database query APIs anyway, even though non-blocking calls are available.

      * Compile scripts and run from compiled cache, preferably as machine code. Think about the environment - all those quad-core datacenters wasting cycles, because programmers are supposedly very expensive. Well, they are not so expensive that a good team cannot get together and re-think this. They can. If nothing else, they should worry about the environment too - it is not cheap.

      * Offer asynchronous calls where these make sense (i.e. where they can benefit from hardware parallelization). Those devs who know how to make use of them will be happy to speed up their applications.

      In short, cache everything, whenever you can. Memory is cheap. Cache SQL query strings, cache script compilation output, cache, cache, cache.
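
A minimal sketch of the "cache, cache, cache" idea, using Python's standard `functools.lru_cache`. The `fetch_profile` function and its return value are hypothetical stand-ins for an expensive database query, not anything Facebook or PHP actually provides.

```python
from functools import lru_cache

query_executions = 0  # counts how often the pretend "database" is really hit


@lru_cache(maxsize=1024)
def fetch_profile(user_id: int) -> tuple:
    """Hypothetical stand-in for an expensive database query."""
    global query_executions
    query_executions += 1
    return (user_id, f"user-{user_id}")


# Five requests, but only two distinct users.
for uid in (1, 2, 1, 1, 2):
    fetch_profile(uid)

print(query_executions)            # real lookups performed
print(fetch_profile.cache_info())  # hits, misses and current cache size
```

The sketch leaves out invalidation, which is the hard part in practice: a cached result has to be dropped or refreshed once the underlying row changes.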

    5. Re:High performance in scripting languages? by Anonymous Coward · · Score: 2, Informative

      You should check your facts first.

      I did not say PHP had to PARSE each script per request -- which would happen without an opcode cacher. It has to EXECUTE the script, opcode cached or otherwise, from scratch on each request.

      So there is no way for PHP to hold application data in memory between requests, except by using shared memory or a memcache. Other languages like Python, Java, etc... allow you to instantiate your classes and their data upon startup, and then call methods per request, without having to instantiate all the classes over and over again in each request.

      Of course, sometimes you have to, but PHP does not even allow you to pre-instantiate those classes that do not need to be rebuilt on every request.

      So on each request, the application has to load its configuration, even if it is stored as PHP array in some config.inc.php. It has to re-evaluate the arrays (construct the array, build hash keys, etc...) even when opcode cached.

      Granted, certain important extensions do keep pools of resource handlers between requests, like PDO, memcached, etc...

      Also, I did not mention Apache. I said (PHP) applications that map URL patterns to controllers -- aka central index.php that patches URL to controllers and their methods. That is different from Apache rewriting that maps URL patterns to PHP scripts.
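
The difference between the two execution models described above can be sketched in a few lines. This is an illustration in Python, not PHP: `CONFIG_TEXT`, `load_config`, `handle_request_fresh` and `PersistentApp` are all made-up names, and the JSON string stands in for something like a config.inc.php file.

```python
import json

CONFIG_TEXT = '{"db_host": "localhost", "page_size": 20}'  # stand-in for a config file
parse_count = 0  # how many times the configuration was rebuilt


def load_config() -> dict:
    global parse_count
    parse_count += 1
    return json.loads(CONFIG_TEXT)


def handle_request_fresh(path: str) -> str:
    """Share-nothing model: every request rebuilds its configuration."""
    config = load_config()
    return f"{path}: {config['page_size']} items"


class PersistentApp:
    """Long-lived process model: configuration is built once at startup."""

    def __init__(self) -> None:
        self.config = load_config()

    def handle_request(self, path: str) -> str:
        return f"{path}: {self.config['page_size']} items"


for p in ("/a", "/b", "/c"):
    handle_request_fresh(p)
fresh_parses = parse_count

parse_count = 0
app = PersistentApp()
for p in ("/a", "/b", "/c"):
    app.handle_request(p)
persistent_parses = parse_count

print(fresh_parses, persistent_parses)
```

Three requests cost three configuration builds in the first model and one in the second. The gap grows with the amount of setup work each request repeats.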

    6. Re:High performance in scripting languages? by Lennie · · Score: 4, Informative

      Facebook added to memcache the ability to use UDP instead of TCP. They also changed MySQL so one replication-command from one datacenter to the next would also invalidate what is in memcache on that location.

      At some point they had so much traffic from their webservers to their backend systems that they saturated their internal network and were dropping UDP.

      That's the kind of problems/scale they deal with, I'm surprised PHP wasn't their biggest bottleneck before (they did some work on PHP already, but not something like this).

      After all, Facebook is the second site after the www.google.com search page (which handles 'just' one task), and Google has a pretty much custom-built platform.

      --
      New things are always on the horizon
    7. Re:High performance in scripting languages? by thetoadwarrior · · Score: 2, Informative

      I'm sure a lot of intensive stuff is done in a system language but Amazon still uses Perl. Google use Perl and Python through their sites.

      There's no need to use a system language for everything. Facebook is probably using PHP on its own and that's just not wise for a site like that.

    8. Re:High performance in scripting languages? by Eil · · Score: 2, Interesting

      What you call a crutch, most developers call an enormous time saver. The web moves so fast that you simply cannot afford to take forever developing it just so that the code executes efficiently. Sure, PHP, Perl, Python, and Ruby are slower than C or C++. But for at least the last decade, hardware has been cheap enough that it makes a lot more business sense to just throw more servers at the problem than it does to delay your product launch for a year and/or double your programming staff while you make everything "perfect" in a lower-level language. Most of the problems around making web apps scalable to millions of concurrent users have been solved or will be in the near future. (CDNs, memcache, load balancers, etc.) When you find bottlenecks, you rewrite those specific parts using a better design or a lower-level language. If your developers are any good, they will have modularized the code, making such upgrades relatively painless. Trying to optimize the entire codebase for performance before it's even out in the wild ensures that it will never get there.

  4. Assembler is High Performance by Murdoch5 · · Score: 5, Funny

    Don't start talking about high performance and then naming languages that don't have the chance to deliver. What you really need to do is just program the entire web page in Assembler and then you're going to have speed and performance that can't get any faster. If your developers are noobs and can't use real languages and they're just Object Oriented kids who can't work on memory and need to access everything through abstracted methods, then fire them and get in some embedded developers who know speed = good code and good languages. If you don't want to use assembler then use good old C!

    You want speed use languages that can deliver and don't try to rewrite slow scripting languages to do the job of the trusted old methods, assembler and C.

    1. Re:Assembler is High Performance by erroneus · · Score: 4, Insightful

      Actually, that isn't necessarily true. It might be true in a linear sense, but when it comes to juggling different threads and the like, assembly language as I knew it wasn't all that capable of describing the process all that well. Assembly language is a go-cart with rocket boosters.

      I would truly like to see an assembly language revival though. I truly would. It would be a return to sensibilities in programming. It would be a return to being careful with memory usage with improved focus on small efficient programming. It would be a really good thing. I just don't see it happening.

      The purpose for these more complex languages is about being able to more symbolically describe the processes to be executed by the machine. Assembly language was some of the worst about that -- if there wasn't a very detailed set of comments for nearly every line of code, it would be nearly impossible to follow in source. These more complex languages will always have their place and purpose. Trying to make them more efficient is a good and useful thing. Now if we were talking about writing the PHP interpreter in assembler, I'd say you had a winner compromise.

    2. Re:Assembler is High Performance by Anonymous Coward · · Score: 2, Interesting

      You are kind of person I would never hire. Period.
      Assembly doesn't mean speed. There is only the potential, nothing more. It has become increasingly difficult to develop in assembly as the architecture and complexities involved (drivers, APIs, hardware, devices, underlying OS, etc.) have evolved so much. Decent optimizing compilers nowadays outperform hand-written assembly. Furthermore, in commercial software, development time is pretty much the most important factor. If you could do in 1 day same thing that would take 2 weeks with assembly? The choice is clear. Not to mention concerns about portability, maintainability, extensibility, ...

      So really everything considered you should seriously rethink your ideals about developers because any seriously minded person just laughs at statements like yours.

      Sorry, but true.

    3. Re:Assembler is High Performance by javacowboy · · Score: 4, Insightful

      I'm a Java developer with 10 years of experience developing enterprise grade server applications. We use Java, like the majority of Fortune 500 companies, because a Java app can be maintained with a development team greater than 1 coder, common memory coding errors and behaviours are avoided, a large API library prevents us from having to re-invent the wheel constantly, and the JVM is battle-tested in large deployments.

      But, no, I guess I'm just a kid who doesn't know how to code.

      --
      This space left intentionally blank.
    4. Re:Assembler is High Performance by jc42 · · Score: 2, Insightful

      I've been involved in a number of projects that were prototyped in a scripting language (usually perl or python) and then rewritten in C for performance, with the disappointing result that the C code ran slower. I've also seen the same with C -> assembly a few times.

      The explanation is fairly straightforward. The low-level-language experts (including me a few times) may have known their language well, but they'd never looked into the perl/python/cc code to learn the algorithms used there. It turned out that the implementers of perl/python/cc have developed some rather sophisticated algorithms for some of the time-wasting operations (e.g. table lookup) that were unknown to the asm programmers.

      If they'd recoded the same algorithm used in the interpreters and compilers for the higher languages, they'd probably have won the contest, because there's no doubt that there's still some wasted cpu time in the higher languages' code. But, as others have pointed out, very often the algorithm used is a better predictor of speed than the language used.

      So high-level vs. low-level language is a bit of a bogus distinction. The actual speed of the code is a combination of the algorithm and the efficiency of the implementation of the language. And some languages have several implementations with different efficiencies.
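
The table lookup case mentioned above is easy to illustrate. The sketch below is not the code from any of those projects. It just counts the work done by a naive linear scan (a hypothetical `linear_lookup` helper) against a lookup in Python's built-in hash table, `dict`.

```python
def linear_lookup(pairs, key):
    """Scan a list of (key, value) pairs; return (value, comparisons made)."""
    comparisons = 0
    for k, v in pairs:
        comparisons += 1
        if k == key:
            return v, comparisons
    return None, comparisons


pairs = [(f"key{i}", i) for i in range(10_000)]
table = dict(pairs)  # hash table, built once

# Worst case for the scan: the wanted key is the last entry.
value_scan, steps = linear_lookup(pairs, "key9999")
value_hash = table["key9999"]  # one hash computation, typically a single probe

print(value_scan == value_hash, steps)
```

A scan like this stays slow when it is rewritten in C or assembly, because the number of comparisons grows with the table. The hash lookup does roughly the same small amount of work whatever the size.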

      --
      Those who do study history are doomed to stand helplessly by while everyone else repeats it.
  5. So is it a fork or isn't it? by erroneus · · Score: 3, Insightful

    Sounds like Facebook rewrote PHP and then invited PHP core developers to adopt it as their core development platform? I can't imagine that went over all that well... probably hit a number of them in the pride region. And the article said it is to be released as open source, but failed to mention the license. Will this be some sort of twisted "FriendFace Public License" or some perversion?

    This is not what is meant when a party contributes to an open source project. "Here, I rewrote it for you. It's better. Now just throw away everything else you've done and use this." Really?

  6. HyperPHP, or HPHP by hkz · · Score: 4, Interesting

    According to that article posted recently about Facebook's master password being 'Chuck Norris', the project is indeed a compiled PHP that goes by the name of HyperPHP, or HPHP. It will supposedly lower the load on the servers by 80% and speed up things 5x, according to the unnamed source in the original blog post.

  7. Misleading Summary (surprise!) by Anonymous Coward · · Score: 5, Informative

    From TFA: UPDATE: After sifting through the comments here and elsewhere, I'm inclined to agree with the folks who are saying that Facebook will be introducing some sort of compiler for PHP.

    Not a fork. Not as newsworthy as implied.

    1. Re:Misleading Summary (surprise!) by paziek · · Score: 2, Informative

      If that's so, then they are reinventing the wheel, since there is already a PHP compiler available, which is also open source: http://www.roadsend.com/home/index.php

  8. Was revealed 3 weeks ago by insider by diretalk · · Score: 5, Informative

    This PHP compiler item was revealed three weeks ago by a Facebook employee. Read at http://therumpus.net/2010/01/conversations-about-the-internet-5-anonymous-facebook-employee/?full=yes

  9. One man effort by hey · · Score: 3, Insightful

    So there is one guy at Facebook doing this PHP rewrite. It must be possible to figure out who he is. Have they hired any high profile PHP developers?

  10. VM's by MattBD · · Score: 2, Interesting

    Would a language that runs in a VM, like Java, Scala or C#, be faster? After all, Twitter rewrote their backend in Scala and they seem to have gotten better performance.

  11. Three sources of scripting language inefficiency by tepples · · Score: 5, Insightful

    I don't know what the fascination is with scripting languages on the Linux platform or with FOSS in general, but it results in slow programs

    Speed of development is faster in a scripting language, and in developed countries, below a certain scale, throwing hardware at it is cheaper than throwing programmers at it. The point of the article is that Facebook is above that scale, and programmers to write a new PHP interpreter have become cheaper than adding hardware+power+cooling.

    with flaky UIs.

    Citation needed. True, they often use a different widget set from the rest of the desktop (e.g. Tk from Tcl and Python and Swing from Java), but the popular widget sets also have scripting language bindings. How can one really tell the difference between a wxWidgets or GTK app written with Python vs. C++?

    I like to use refurbished/recycled machines; which means that I'll have an old P4, 512M RAM and a slow bus.

    Do these use more electric power than, say, an Acer Aspire Revo? The power consumption of a Pentium 4 and the power to remove the heat it generates can become an issue, especially for a server that's turned on 24/7.

    Many times, applications written in a scripting language, whether it be Perl, Python, PHP, or whatever, will hang often and then start working.

    There are three causes for this, and you can distinguish them with 'top' or 'Task Manager' or something else that can count CPU time and page file accesses:

    • Swapping: More dynamic languages tend to use more general data types, which incur memory overhead. For instance, they might use UTF-16 strings instead of 8-bit strings with an assumed encoding. Or they might use double-precision floating point instead of short integers for large arrays. This might cause a program to run out of RAM and fail over to the disk more often.
    • Garbage collection: This covers ways of determining which resources are no longer in use by any active part of the program. Python, Objective-C, and lately C++ primarily use an incremental garbage collection method called reference counting, which keeps track of the number of things that "know about" an object. But some other language interpreters use only tracing garbage collection, which in the naive form causes the application to pause and make a list of all live objects once memory allocations exceed some amount. This will cause a CPU spike.
    • Blocking: A lot of the APIs available to programs in scripting languages don't have well-known non-blocking versions. For example, a host name lookup might freeze the program until it finishes. The only way to work around these is to start a thread. This will cause a pause with 0 CPU and 0 disk.
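
The thread workaround for the blocking case can be sketched as follows. `slow_lookup` is a hypothetical stand-in for a blocking call such as a host name lookup. It sleeps instead of touching the network so the example runs anywhere.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_lookup(name: str) -> str:
    """Hypothetical stand-in for a blocking call such as a host name lookup."""
    time.sleep(0.2)  # the worker thread waits here, using no CPU
    return f"address-of-{name}"


ticks = 0
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_lookup, "example.org")
    # The main thread stays free (e.g. to redraw a UI) while the lookup blocks.
    while not future.done():
        ticks += 1
        time.sleep(0.01)
    result = future.result()

print(result, ticks)
```

Without the worker thread, the main loop would sit frozen for the whole lookup, which is the "0 CPU and 0 disk" pause described above.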
  12. PHP is slow (check), now what.... by Ritz_Just_Ritz · · Score: 4, Interesting

    Why not just stash your farm of slow php systems behind some heavy duty caching appliance(s)?

    Something like aicache might fit the bill.

    1. Re:PHP is slow (check), now what.... by jimicus · · Score: 3, Insightful

      Why not just stash your farm of slow php systems behind some heavy duty caching appliance(s)?

      Something like aicache might fit the bill.

      When your application is with each iteration generating more content dynamically than it was before (and you want to continue down that route), the benefit of caching starts to drop quite quickly.

    2. Re:PHP is slow (check), now what.... by slim · · Score: 2, Insightful

      Facebook does as much caching as it can - I mean, they're not daft. They're probably the world's greatest experts on large scale MySQL + memcached.

      But sometimes cached data isn't good enough. Facebook users expect their statuses, messages and comments to reach their friends within seconds.

  13. Complexity and cost of embedded approach by tepples · · Score: 3, Interesting

    If your developers are noobs and can't use real languages and they're just Object Oriented kids who can't work on memory and need to access everything through abstracted methods, then fire them and get in some embedded developer

    Embedded developers tend to 1. work on smaller, more focused systems, and 2. charge more. For one thing, a module inside Facebook deals with data types more complex than those in the firmware of a car engine's microcontroller. And below a certain scale, the money you save by hiring noobs (and taking the tax credit for recent graduates if available) can pay for throwing more hardware at the problem.

    1. Re:Complexity and cost of embedded approach by CptPicard · · Score: 3, Interesting

      true, but on the other hand most of what I've seen is that the embedded developers can program the higher level languages with more care anyway

      My impression tends to be that the best overall programmers are those with a solid understanding of algorithmics theory, programming language design in general (meaning they have had exposure to all kinds of solutions), and most interestingly, tend to have an understanding of functional programming. The true programmer gods I have come across have always been Lispers, almost without exception.

      On the other hand, I never understood what is supposedly so educational and intellectually important in things like assembly. If one only learned that, it wouldn't still mean that one could actually use it for anything... it's just "manipulate state in registers and RAM by making use of extremely rudimentary basic operations". The transformations into machine code from higher-level program solution descriptions are much more consistently handled by a compiler than a human, and as that is manual, automatable work, it may be more important to study compiler construction... (which Lisp is pretty good for)

      --
      I want to play Free Market with a drowning Libertarian.
  14. Assembler? Really? by Atmchicago · · Score: 5, Interesting

    Assembly language isn't platform-independent. It's really easy to screw up and hard to optimize. And it's not much faster than C/C++. The issue at hand is balancing the cost of writing the code with the cost of running it. I don't see how the cost of writing and maintaining software in assembly language will ever compete with the costs of C/C++, potential speed increases and all. Object-oriented languages make small performance sacrifices in return for much greater maintainability, and that's how it should be. Scripting languages take this even further, and for these large websites have lost their advantage. The only time assembly will prevail is when we return to incredible memory constraints, but even embedded systems pack tons of memory now so I don't see that being an issue.

    --

    You can lead a horse to water, but you can't make it dissolve.

  15. Re:Is compiled PHP even possible? by BadAnalogyGuy · · Score: 3, Interesting
  16. Re:Facebook's architecture is the problem, not PHP by pmontra · · Score: 3, Interesting

    I'm no PHP fan but I won't be surprised if FB decided that optimizing the interpreter and investing resources in new functionality is a better business decision than investing in a giant rewrite of what they have now. That would effectively stop them for many months in the best case, or double their costs as a team keeps adding features to the PHP architecture and another one plays catch-up in another language. But maybe they also have some plan to rewrite some core components in a faster language, like twitter did porting the backend tasks from Ruby to Scala.

    We could say that they started with the wrong technology, but using PHP Zuckerberg was able to deliver what turned out to be a successful product back in 2004. Had he written it in Java he could have missed a window of opportunity and people could be using some different social network now. The same logic applies to twitter's choice of Ruby, which by the way they still use for the frontend. Many recent interpreted languages (I'm thinking about Ruby) trade execution speed for speed of coding and delivering products. Many products totally fail and many others don't get successful enough to need optimizations, so IMHO speed of delivery is the key factor: deliver, get customers, get money and only then we'll think about making our servers run fast.

    Ah... If only FB's new interpreter could access instance variables without that redundant $this-> construct that clutters all OO PHP code...

  17. JavaScript by tepples · · Score: 3, Insightful

    Flaky UIs - click on a button and nothing happens. Or things not drawing properly.

    I've seen buttons do nothing and redraws fail even in compiled programs.

    A refurb machine is about a third the cost of a new machine

    By "cost", do you include or exclude the cost of power and cooling? And do you include or exclude the cost of replacing failed components? Capacitors die.

    scripting languages are not appropriate for large applications with GUIs.

    One scripting language has a huge deployment advantage over everything else: ECMAScript. It interacts with Document Object Models exposed by various runtime environments, and it's sandboxed so that users can more or less safely run a program without getting an administrator to install it. You might know it as JavaScript (ECMAScript + HTML DOM) or ActionScript (ECMAScript + SWF DOM). Or would you rather go back to ActiveX, where the web site sends the equivalent of a compiled DLL to each user, which runs with the user's full privileges and doesn't run on anything but a convicted monopolist's operating system?

    1. Re:JavaScript by drinkypoo · · Score: 3, Interesting

      One scripting language has a huge deployment advantage over everything else: ECMAScript.

      This is a nice lead-in to my point, though; Facebook is one of those websites that make it look like Firefox is murdering your machine. In reality, it's sites which misuse Javascript. (Sorry, but ECMAScript is just too unwieldy. I wouldn't say it. Too bad, because it desperately needs renaming.) If I leave Slashdot open all night, nothing bad happens. If I leave a long Facebook tab open all night, I have to close it before I can use my browser in the morning, even on my shiny new 4GB RAM system, let alone on my 1GB machines. If they want to improve the user experience, they should try cleaning up their crap Javascript.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  18. Re:Gotten by Aladrin · · Score: 2, Informative
    --
    "If you make people think they're thinking, they'll love you; But if you really make them think, they'll hate you." - DM
  19. Writing the single biggest bottleneck in asm by tepples · · Score: 2, Insightful

    If you could do in 1 day same thing that would take 2 weeks with assembly? The choice is clear.

    Unless the two weeks of hand-tweaking the assembly language code of your program's single biggest bottleneck would reduce your program's system requirements so that twice as many users can use it. Such a case is reportedly common in video game development, where the increased revenue is often worth it.

    Not to mention concerns about portability

    "Portability" has more than one meaning. There's portability of the code, or its suitability for execution in multiple environments whose hardware isn't compatible. For this, you can keep a fallback implementation of each asm module in C. That's useful for running test cases such as whether the asm version still works correctly or whether it's worth continuing to maintain. The other kind of portability is the ability to run on small, battery-powered devices. These tend to have underpowered CPUs to save manufacturing cost and increase battery life, and the code to run on these CPUs must be extremely efficient in order for the application to be responsive. Go try to make a software 3D renderer on a handheld device with a 16.8 MHz ARM CPU and tell me you don't need assembly anymore.

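    The fallback idea above can be sketched roughly as follows. This is a minimal illustration in Python rather than C and assembly; `blend_reference`, `blend_optimized`, and `check_equivalence` are hypothetical names, and the "optimized" routine is only a stand-in for a hand-tuned module. It uses the overflow-free averaging trick `(x & y) + ((x ^ y) >> 1)` that is common in hand-written inner loops.

```python
import random


def blend_reference(a, b):
    """Portable fallback: average two byte strings element by element."""
    return bytes((x + y) // 2 for x, y in zip(a, b))


def blend_optimized(a, b):
    """Stand-in for the hand-tuned version (hypothetical).

    Computes the same floor average without ever forming x + y,
    so it cannot overflow an 8-bit register.
    """
    return bytes((x & y) + ((x ^ y) >> 1) for x, y in zip(a, b))


def check_equivalence(trials=200, seed=0):
    """Compare the optimized routine against the fallback on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randrange(0, 64)
        a = bytes(rng.randrange(256) for _ in range(n))
        b = bytes(rng.randrange(256) for _ in range(n))
        if blend_reference(a, b) != blend_optimized(a, b):
            return False
    return True
```

    Running `check_equivalence()` after every change to the tuned routine is the "does the asm version still work" test; timing the two functions against each other answers whether the tuned one is still worth maintaining.
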
  20. Re:Gotten by Snarf+You · · Score: 2, Funny

    I'm guessing you haven't gotten laid recently.

  21. Re:Facebook's architecture is the problem, not PHP by kobaz · · Score: 2, Insightful

    Instead of putting a band-aid on the current architecture

    But that's exactly how you run a successful system.

    1) Design product to meet needs of your audience
    2) Design the implementation that you think will handle the load the best (with lots of load testing and simulations to make sure it meets expected demand)
    3) Build product
    4) Watch it behave in the wild... Realize that actual demand is considerably higher than expected demand and will continue to grow
    5) Performance slows with more users... you need a solution that will push the date of catastrophic overload further into the future, to buy time to work on *really* fixing the problem
    6) Migrate to a new or adjusted architecture that will solve this current problem
    7) Go to step 4

    Facebook is on step 5. You make it sound like scripting languages are the cause of slow products. Yet in reality, the main bottleneck is generally the database. If Facebook rewrote everything in C or some other non-scripting language, not only would it be an incredibly long process, but the end result would be far less beneficial than if they revamped their existing technologies and worked to up database performance. There is no ultimate solution for scaling a product. You need to be constantly adjusting your strategies, implementations, and systems to cope with resource usage.

    --

    The goal of computer science is to build something that will last at least until we've finished building it.
  22. They should spend more on the upload tool by crossmr · · Score: 3, Interesting

    That thing is a broken buggy piece of garbage. Any time I go out to an event or something and want to upload anything more than half a dozen photos, it inevitably blows up on random photos for no reason (completely fresh off the camera unedited photos). I have to babysit the upload and instead of just hitting select all and letting it go, I end up having to upload it in chunks of 5 photos at a time.

    1. Re:They should spend more on the upload tool by stimpleton · · Score: 4, Interesting

      Modern users demand upload progress feedback. Which the HTML spec cannot do. The solution is a bevy of hoary hacks on the server end, usually by using a cache or tmp file callback. The value is then read periodically from the client as a javascript page load in an iframe.

      For PHP this is the APC Cache module. You send an id with your file upload form then "Load that page using that ID" till the progress gets to 100%. According to the docs the module can poll at a period of "0 seconds" meaning as fast as possible. This halves upload speed.

      On the client end, the old HTML way (no feedback) was a simple form with a submitted page. If you arrive at the submit page then the upload worked. The new way is 50-60k of javascript, which is a collection of fragile code. Yahoo's GUI upload for example. Try modding their code and your GUI *will* fail. The file may or may not upload.

      Given the modern web is *all about* uploading user submitted media, I am amazed there aren't headlines "Mozilla forgets everything and rebuilds file upload in partnership with Apache...then thinks about HTML 5"
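
      The polling scheme described above can be sketched without any network code. This is a toy simulation in Python, not the APC module's API: the `progress` dict stands in for the server-side cache, and `receive_upload`, `poll_percent`, and `simulate` are hypothetical names chosen for illustration.

```python
import io

# Stand-in for the server-side cache: upload_id -> (bytes_received, total_bytes)
progress = {}


def receive_upload(upload_id, stream, total, chunk_size=4):
    """Read the upload in chunks, recording progress after each chunk.

    Written as a generator so that a poller can run between chunks.
    """
    received = 0
    progress[upload_id] = (0, total)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        received += len(chunk)
        progress[upload_id] = (received, total)
        yield


def poll_percent(upload_id):
    """What the client's periodic request would return for this upload id."""
    if upload_id not in progress:
        return None
    received, total = progress[upload_id]
    return 100 if total == 0 else received * 100 // total


def simulate(data, chunk_size=4):
    """Interleave one poll with every received chunk and collect the results."""
    seen = []
    for _ in receive_upload("abc123", io.BytesIO(data), len(data), chunk_size):
        seen.append(poll_percent("abc123"))
    return seen
```

      Every poll is an extra request competing with the upload itself, which is why a very short polling period can eat into upload speed.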

      --

      In post Patriot Act America, the library books scan you.
    2. Re:They should spend more on the upload tool by raju1kabir · · Score: 4, Informative

      Just scale your photos down to 1024x768

      Scale your photos down to 604x453, which is the size Facebook displays them at, and you will get to control the sharpness and image quality.

      Upload at any other size, and Facebook will re-sample them with some very cheap algorithm and apply aggressive compression and they will look like ass.

      Try it, you'll be amazed how much better your photos suddenly look.

      I normally use "convert -strip -sharpen 0.3 -quality 85 -geometry 604x604" before uploading - it just takes a second, and makes a huge difference.
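
      To see what size a photo ends up at, the fit-inside-a-box arithmetic can be sketched like this. It is an illustration only: `fit_within` is a hypothetical helper, the rounding is an assumption and may differ from ImageMagick's by a pixel, and unlike a plain `-geometry 604x604` this version never enlarges a smaller image.

```python
def fit_within(width, height, box=604):
    """Scale (width, height) to fit inside a box x box square.

    Preserves the aspect ratio and never enlarges (an assumption of
    this sketch). Returns the new (width, height) in whole pixels.
    """
    scale = min(box / width, box / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

      A 4000x3000 landscape shot comes out at 604x453, the display size mentioned above, and a portrait shot at 453x604.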

      --
      "Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
  23. Resin Quercus by parryFromIndia · · Score: 4, Interesting

    Caucho Resin has a mostly pluggable replacement for PHP which is written in Java. It adds web friendly features to PHP like distributed sessions and load balancing. Given the JVM JIT is already plenty fast and the benchmarks show that Java/PHP beats regular PHP handily - I wonder if Facebook considered using it at some point.

    1. Re:Resin Quercus by parryFromIndia · · Score: 3, Informative

      Here is the URL in case people are interested in checking this out - http://www.caucho.com/resin-4.0/doc/quercus.xtp .
      In summary:
      It is OpenSource, 100% Java and it brings all the advantages of using a JVM to PHP - performance (JIT), Safety, Scalability (clustering/load balancing), quality tools (Development, Profilers). One can use most of the Java technologies in PHP to ease development even further - XA Transactions, JNDI, Connection pooling, object caching for example.

      Besides, improving performance of this pure Java PHP implementation ought to be easier than improving the PHP runtime. (Java6 onwards the available tools to debug and optimize Java applications have made significant progress. jmap/jhat , easy heap dumps on OutOfMemory, Object Query Language etc. already come bundled with the JVM and then there are Eclipse and NetBeans GUI profilers.)

      Also worth checking out Dr. Cliff Click's extensive Java vs. C performance blog post - http://blogs.azulsystems.com/cliff/2009/09/java-vs-c-performance-again.html .

  24. Akamai sucks by Turmoyl · · Score: 3, Interesting

    If Facebook really wants to speed up the customer experience all they need to do is remove Akamai from their content delivery network (CDN). That's where my browser is always stuck in a Waiting status when I notice a connectivity issue.

  25. Re:Facebook's architecture is the problem, not PHP by CptPicard · · Score: 2, Insightful

    The key architectural performance issues in large web apps like Facebook are about scalability by clustering and parallelism and caching... usage of proper higher-level languages helps in this (think how pure-functional programming removes shared state and Google's mapreduce for example), while using a lower-level language may give a speedup on single individual machines but makes the architectural problems harder to tackle.

    --
    I want to play Free Market with a drowning Libertarian.
  26. Re:Is compiled PHP even possible? by TheRaven64 · · Score: 2, Informative
    The interpreter we have uses direct AST interpretation, which is pretty slow. On a simple test program (a parser), it took 0.96 seconds of CPU time in the interpreter, 0.023 seconds in JIT-compiled code (pretty primitive so far, doesn't use any profiling info) and something a bit less than that in statically compiled code. Since running that benchmark, I've made a few improvements to the compiler, so it's probably a bit faster now.

    For a recursive Fibonacci calculation, implementing the same algorithm in C, Objective-C and Smalltalk took 2.35, 6.60, and 5.69 seconds, respectively (calculating fib(30) 100 times), with about a 50% variation margin on successive runs. The Smalltalk version was not always faster than the ObjC version (it was most of the time; not sure why, probably some weirdness with the Smalltalk code happening to line up with cache boundaries better), so it's safe to consider them roughly the same speed.

    It's worth noting that the Smalltalk version, unlike the C and Objective-C versions, will never suffer from integer overflow. Tweaking the benchmark so that it computes fib(47) in the three versions, the timing results are: 50, 130, and 280 seconds, respectively.

    The difference is that the Smalltalk version generates the correct answer, while none of the others do. Personally, I'd rather have slow code generating the correct answer, but maybe that's just me. It is, of course, possible to write code in C that would check for overflow (in this case it's relatively easy, you can just test whether the sign bit flipped because you're just adding two positive integers), but returning something that is either an integer or an arbitrary-precision value in C is a bit harder and you'd end up with at least four times as much code to make the C version, and a lot more potential for bugs.

    By the way, calculating fib(47) with a sensible algorithm in Smalltalk takes a tiny fraction of a second, highlighting the fact that good algorithms are usually more important than good compilers.
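
    Both points can be sketched in a few lines of Python, whose integers are arbitrary-precision like Smalltalk's. Assumptions of this sketch: the C and Objective-C versions used a 32-bit signed `int`, and overflow wraps around in two's complement (in C, signed overflow is formally undefined, so wrapping is only the typical outcome). The function names are hypothetical.

```python
def fib_naive(n):
    """The benchmark's algorithm: exponential-time double recursion."""
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)


def fib_iter(n):
    """A sensible algorithm: linear time, arbitrary-precision result."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def wrap_int32(x):
    """Reduce x to the value a 32-bit two's-complement int would hold."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def fib_iter_int32(n):
    """Same loop, but every addition wraps as assumed for a 32-bit int."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, wrap_int32(a + b)
    return a
```

    fib(46) is 1836311903 and still fits in 32 bits; fib(47) is 2971215073, which does not, so the wrapped version returns a negative number. The sign flip after adding two positive values is exactly the overflow test mentioned above.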

    The compiler targets the (GNU) Objective-C ABI, so Smalltalk and Objective-C classes can be used interchangeably (you can, for example, subclass an Objective-C class with Smalltalk and then call the Smalltalk methods from Objective-C). Some of the improvements I've recently made to the Objective-C runtime mean that the compiler can now emit code to do polymorphic inline caching and speculative inlining. It doesn't yet do either, but in benchmarks these reduce the cost of a message send to within a hair of the cost of a C function call. For most uses, Objective-C is already fast enough, so I'll probably only implement these as a profile-driven optimisations and enable them for hot code paths where the message sending overhead is actually important.
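
    For readers unfamiliar with the term, the shape of a polymorphic inline cache can be sketched as below. This is a toy model in Python, not how the compiler described here does it: a real implementation emits the class checks as machine code at the call site, and `CallSite`, `Circle`, and `Square` are hypothetical names.

```python
class CallSite:
    """One message-send site with a small per-class cache of resolved methods."""

    def __init__(self, selector, max_entries=4):
        self.selector = selector
        self.max_entries = max_entries
        self.cache = {}   # receiver class -> resolved method
        self.hits = 0
        self.misses = 0

    def send(self, receiver, *args):
        cls = type(receiver)
        method = self.cache.get(cls)
        if method is None:
            # Slow path: full method lookup, then remember the result.
            self.misses += 1
            method = getattr(cls, self.selector)
            if len(self.cache) < self.max_entries:
                self.cache[cls] = method
        else:
            # Fast path: class seen before at this site, skip the lookup.
            self.hits += 1
        return method(receiver, *args)


class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):
        return 3 * self.r * self.r  # crude pi, keeps the arithmetic exact


class Square:
    def __init__(self, s):
        self.s = s

    def area(self):
        return self.s * self.s
```

    A site that only ever sees a handful of receiver classes pays for the full lookup once per class and takes the fast path afterwards, which is where the near-function-call cost comes from.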

    I'm giving a talk about this at FOSDEM in the GNUstep developer room next weekend.

    --
    I am TheRaven on Soylent News
  27. What the heck version of PHP were they using? by mgkimsal2 · · Score: 2, Insightful
    From that article:

    PHP is an example of a scripted language. The computer or browser reads the program like a script, from top to bottom, and executes it in that order: anything you declare at the bottom cannot be referenced at the top.

    This was true in PHP3, but since PHP4, functions declared at the bottom of a file have been available from the start of that file's execution. Everything gets compiled into an intermediate form before execution.

  28. Re:"Java" and ".Net" as an inspiration? No, thank by Abcd1234 · · Score: 4, Insightful

    Speed of "Java" and ".Net"? Is it a joke?

    No, it's not.

    "Java" hangs all the time

    No it doesn't.

    and the ".Net" code to do a simple task is so convoluted that it is just ridiculous.

    No, it's not.

    Honestly, you really have no fucking clue what you're talking about, do you?

  29. Re:Assembly language revival by epine · · Score: 2, Insightful

    Really the only time you have to handle assembly in a PC application is when you're implementing a just in time compiler, and it's becoming the fashion to let LLVM do that for you.

    That's an interesting combination of overstating and understating the case.

    For one thing, your favourite C/C++ compiler likely contains a hand optimized memcpy() routine, down to assembly if it exposes a worthwhile gain, or coded in C with or without intrinsics if it doesn't. Many C/C++ compilers contain hand-optimized floating point routines, even more so in the embedded world. Plus there are many performance libraries out there to handle the heavy lifting in multimedia, mathematics, and encryption, some of which are vendor tuned to the n'th degree. It's been a while since I've used an Intel library, but this is likely one of the breed:

    Intel MKL

    As for LLVM, I'd say it's more than fashion. The differences in performance characteristics from one micro-architecture to another are nightmares to cope with at the assembly language level. The average tablet computer these days could probably play Kasparov to a draw, and there are still macho programmers out there who think they can do register assignment and live range analysis better than your compiler? Dude, if you've got that much talent, roll up your sleeves and fix the freaking compiler. Hopefully LLVM will solve that old problem of first having to swallow the gcc ast syntax enzyme.

    Tautology #1: I can beat my computer at chess => your chess computer sucks (or it's running on your wristwatch).

    Tautology #2: I can beat my compiler at coding a non-trivial loop => your compiler sucks.

    Unless your goal in life is to win rigged competitions, LLVM is a lot more than a fashion statement.

  30. Re:Assembler? Really? by Billly+Gates · · Score: 2, Insightful

    On today's modern processors you won't gain much performance from assembly. A Core 2 Duo simply reads the x86 instructions and converts them to RISC, and much of the optimization happens in the compiler and on the fly during execution. You can always gain some speed, but it's nowhere near what you could gain just a decade ago.

    What also needs to be taken into account is the cost and time to rewrite years of development work from scratch. Sunk costs drive accountants crazy and threaten the job of any IT manager.

    Instead of starting from scratch, it's better to use tens of millions of dollars of existing code.

    It's nice from an engineer's perspective, but Facebook is a corporation and money needs to come first and foremost.

    Also, assembler can crash a system and freeze it. The point of switching to NT or Unix was the stability of using managed C APIs rather than Windows 95, which had assembly code that could freeze your computer.