Facebook Rewrites PHP Runtime For Speed
VonGuard writes "Facebook has gotten fed up with the speed of PHP. The company has been working on a skunkworks project to rewrite the PHP runtime, and on Tuesday of this week, they will be announcing the availability of their new PHP runtime as an open source project. The rumor around this began last week when the Facebook team invited some of the core PHP contributors to their campus to discuss some new open source project. I've written up everything I know about this story on the SD Times Blog."
Is this what they're using on the newly redesigned site? Because if so, it's pathetically slow. Facebook is one of those places that with every attempt to "improve" things somehow manages to make it worse and worse. They're a perfect candidate for a Microsoft buyout.
PHP is for lazy developers. I develop my webapps in C and I even wrote my own httpd to improve performance.
At some point, if you are lucky enough, you will require extremely high performance from your web pages. You start out coding HTML in Notepad and move on to Perl CGI then on to PHP with scripting embedded right in the generated HTML. All the time you gain programming crutches at the expense of processing speed, and for a while this is a great tradeoff.
But one day you start having server hiccups because your scripts can't keep up with your traffic. Sites like Amazon have already run into this and have moved away from scripting languages and back to system languages. Running applications directly on the CPU instead of relying on a runtime to translate (at best) bytecode into machine instructions means maximizing CPU cycles.
So I wonder what longterm benefit there is in improving the language runtime.
Don't starting talking about high performance and then naming languages that don't have the chance to deliver. What you really need to do is just program the entire web page in Assembler and then your going to have speed and performance that can't get any faster. If your developers are noobs and can't use real languages and there just Object Oriented kids who can't work on memory and need to access everything through abstracted methods, then fire them and get in some embedded developer who know speed = good code and good languages. If you don't want to use assembler then use good old C!
You want speed use languages that can deliver and don't try to rewrite slow scripting languages to do the job of the trusted old methods, assembler and C.
According to that article posted recently about Facebook's master password being 'Chuck Norris', the project is indeed a compiled PHP that goes by the name of HyperPHP, or HPHP. It will supposedly lower the load on the servers by 80% and speed up things 5x, according to the unnamed source in the original blog post.
From TFA: UPDATE: After sifting through the comments here and elsewhere, I'm inclined to agree with the folks who are saying that Facebook will be introducing some sort of compiler for PHP.
Not a fork. Not as newsworthy as implied.
This PHP compiler item was revealed three weeks ago by a Facebook employee. Read at http://therumpus.net/2010/01/conversations-about-the-internet-5-anonymous-facebook-employee/?full=yes
I don't know what the fascination is with scripting languages on the Linux platform or with FOSS in general, but it results in slow programs
Speed of development is faster in a scripting language, and in developed countries, below a certain scale, throwing hardware at it is cheaper than throwing programmers at it. The point of the article is that Facebook is above that scale, and programmers to write a new PHP interpreter have become cheaper than adding hardware+power+cooling.
with flaky UIs.
Citation needed. True, the often use a different widget set from the rest of the desktop (e.g. Tk from Tcl and Python and Swing from Java), but the popular widget sets also have scripting language bindings. how can one really tell the difference between a wxWidgets or GTK app written with Python vs. C++?
I like to use refurbished/recycled machines; which means that I'll have an old P4, 512M RAM and a slow bus.
Do these use more electric power than, say, an Acer Aspire Revo? The power consumption of a Pentium 4 and the power to remove the heat it generates can become an issue, especially for a server that's turned on 24/7.
Many times, applications written in a scripting language, whether it be Perl, Python, PHP, or whatever, will hang often and then start working.
There are three causes for this, and you can distinguish them with 'top' or 'Task Manager' or something else that can count CPU time and page file accesses:
Why not just stash your farm of slow php systems behind some heavy duty caching appliance(s)?
Something like aicache might fit the bill.
Assembly language isn't platform-independent. It's really easy to screw up and hard to optimize. And it's not much faster than C/C++. The issue at hand is balancing the cost of writing the code with the cost of running it. I don't see how the cost of writing and maintaining software in assembly language will ever compete with the costs of C/C++, potential speed increases and all. Object-oriented languages make small performance sacrifices in return for much greater maintenance, and that's how it should be. Scripting languages take this even further, and for these large websites have lost their advantage. The only time assembly will prevail is when we return to incredible memory constraints, but even embedded systems pack tons of memory now so I don't see that being an issue.
You can lead a horse to water, but you can't make it dissolve.
Caucho Resin has a mostly pluggable replacement for PHP which is written in Java. It adds web friendly features to PHP like distributed sessions and load balancing. Given the JVM JIT is already plenty fast and the benchmarks show that Java/PHP beats regular PHP handily - I wonder if Facebook considered using it at some point.
Modern users demand upload progress feedback. Which the HTML spec cannot do. The solution is a bevy of hoary hacks on the server end, usually by using a cache or tmp file callback. The value is then read periodically from the client as a javascript page load in an iframe.
For PHP this is the APC Cache module. You send an id with your file upload form then "Load that page using that ID" till the progress gets to 100%. According to the docs the module can poll at a period of "0 seconds" meaning as fast as possible. This halves upload speed.
On the client end, the old HTML way(no feedback) was a simple form with a submitted page. If you arrive at the submit page then the upload worked. The new way is 50-60k of javascript, which is a collection of fragile code. Yahoo's GUI upload for example. Try moding their code and your GUI *will* fail. The file may or may not upload.
br Given the modern web is *all about* uploading user submitted media, I am amazed there arent headlines "Mozilla forgets everything and rebuilds file upload in partnership with Apache...then thinks about HTML 5"
In post Patriot Act America, the library books scan you.
Speed of "Java" and ".Net"? Is it a joke?
No, it's not.
"Java" hangs all the time
No it doesn't.
and the ".Net" code to do a simple task is so convoluted that it is just ridiculous.
No, it's not.
Honestly, you really have no fucking clue what you're talking about, do you?
Scale your photos down to 604x453, which is the size Facebook displays them at, and you will get to control the sharpness and image quality.
Upload at any other size, and Facebook will re-sample them with some very cheap algorithm and apply aggressive compression and they will look like ass.
Try it, you'll be amazed how much better your photos suddenly look.
I normally use "convert -strip -sharpen 0.3 -quality 85 -geometry 604x604" before uploading - it just takes a second, and makes a huge difference.
"Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS