On PHP and Scaling
jpkunst writes "Chris Shiflett at oreillynet.com summarizes (with lots of links) a discussion about scalability, brought about by Friendster's move from Java to PHP. Chris argues that PHP scales well, because it fits into the Web's fundamental architecture. 'I think PHP scales well because Apache scales well because the Web scales well. PHP doesn't try to reinvent the wheel; it simply tries to fit into the existing paradigm, and this is the beauty of it.' (The article is also available on Chris' own website.)"
PHP will not inherently lead to scalability problems; however, if you ever try to create any applications that use a DFS-type algorithm, they can happen. PHP (I know it is web-based, so I shouldn't ask too much) does not allow for the extremely simple solutions to DFS-type algorithms that are apparent to most users. Many will end up with too many "while()" statements and bring down script efficiency exponentially.
I've seen a Friendster stack trace before, when the app was running slow at 5 am. For those of you who don't know what this is, it's when Java runs into an error and tells you where your program died. It was really funny. Basically there was a servlet and a call to Database.java, and on line 8000 of Database.java they were calling MySQL directly. Real nice architecture, NOT!
First of all: every time I see the term "scalability", the writer treats it as if it only meant "bigger". We have to think not only of getting bigger, but also of getting smaller.
PHP has wide support for many RDBMSs, APIs, and operating systems, but it is only a language. A language doesn't scale; it's the platform that scales.
That's why I see PHP/Apache/Unix scaling far better than (for example) ASP/IIS/NT: the first platform can run on anything from a PDA to a high-performance minicomputer; the second can run on anything from an i686 (Pentium support was removed?) to the best PC-architecture computer you can buy. That's the difference: a wide-option platform versus a closed-option platform.
The first platform will probably have performance leaks and will not squeeze every last bit of performance out of the machine it runs on, but its scalability potential lies in the fact that it can run on whatever you throw at it. Maybe J2EE or other platforms will run faster than PHP on the same hardware, but PHP will scale there too and will stand shoulder to shoulder with them.
That's why I don't like to evaluate scalability from the "speed" point of view, but from the "where it runs" point of view.
I worked in a small shop developing web apps, and while it wasn't mission critical stuff like banking, it wasn't exactly brainless "dump data from MySQL" stuff either. I was lucky that my boss wasn't picky about languages. But if anyone I work with doubts the power and simplicity of PHP, I usually bring up Yahoo.
IMHO, PHP rocks. It's suitable for pretty much any and all web development. It can be used for quick hacks, or you can code it like a pro with objects and stuff.
The term "scalable" has become an industry buzzword. It is fruitless to argue whether something is scalable or not if there is no clear definition. It's like arguing whether you believe in freedom or not. Of course most people in the world will say they believe in freedom, but if you ask 100 people to define it you will get 100 different answers (the Bush administration has had a field day with this because the minute you oppose them, they accuse you of not believing in freedom; by their definition, of course).
It is impossible to say PHP is or is not scalable unless a definition can be agreed on. And with "scalable" in its current buzzword status, I don't see that happening very soon.
developers will tend to munge sql calls into the templates, blow off any MVC separation, and get a system that is very hard to keep going for more than a few revisions.
Yes, that is tempting. But, conversely, it's a very useful capability for small projects. For larger projects, you just need to ensure you have the discipline not to use the capabilities.
For instance, here is a site I developed in PHP using a strict model-view separation. There is direct linkage between view and controller and controller and model -- I couldn't be bothered to sort that out for a project of limited size like that one. In a larger project, I'd probably devise some kind of mechanism for that.
You can write unmaintainable code in any language you choose. Discipline is the key.
My main issue with PHP scalability is the lack of a global context for app-level caching.
Sure, you can toss more database servers at the task, but a little caching would often (app dependent, of course) give a significantly more efficient solution.
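To make the point about app-level caching concrete, here is a minimal sketch of what a global context buys you. It is written in Python rather than PHP purely for illustration, and it assumes a long-lived server process, which is exactly the thing a per-request PHP script does not have. The names `_CACHE`, `cached`, `loader`, and `ttl_seconds` are hypothetical, not part of any real framework.

```python
import time

# Module-level state: it survives between requests only because the
# server process itself is long-lived (an assumption of this sketch).
_CACHE = {}


def cached(key, loader, ttl_seconds=60.0, now=time.monotonic):
    """Return the cached value for key, calling loader() only on a miss or expiry."""
    entry = _CACHE.get(key)
    current = now()
    if entry is not None and current - entry[0] < ttl_seconds:
        return entry[1]  # still fresh: no database round trip
    value = loader()  # hypothetical expensive call, e.g. a database query
    _CACHE[key] = (current, value)
    return value
```

A request handler would call something like `cached("front_page_items", run_query)`, so repeated requests within the time-to-live never touch the database. Whether that is a win depends on how stale the data is allowed to be.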
I think a real-world example could settle this debate. Look at the story on the JBoss Nukes project. It explains the CPU utilization and speed of the PHP version and how moving to a J2EE implementation decreased the wait times dramatically.
It's difficult to argue with facts.
While I am personally gratified that someone is making the case for PHP vs. Java, I think the whole idea of attributing scalability (as in, works for lesser and greater numbers) to a language is wrong.
Scalability depends on how you write your code. If your algorithms are good, your system will scale, and if they aren't, it will not. Any language that doesn't let you write good algorithms cannot be expected to be generally useful, but I think neither PHP nor Java falls in that category.
Finally, I think scalability is really not what's important, but rather performance. When developing tailor-made applications, I only care whether they require more or fewer resources for the number of requests they actually get, not for higher or lower loads. Of course, for libraries, operating systems, etc. the argument is different.
Please correct me if I got my facts wrong.
Even better than JSP and other technologies is to use Jakarta's Tapestry as the presentation layer. Tapestry rocks and I look forward to having something like that in PHP. Right now PHPTAL is close. The ability to define a page as components (almost in GUI terms) and then define event callbacks and so forth really makes life better.
Tapestry for the view, Spring for the control, and Hibernate for the model is a combination that is hard to beat with PHP. Sooner or later all these technologies will be used no matter what the underlying language.
Well, you can make PHP which scales like crap and Java which scales as high as your bandwidth will allow, and vice versa. Java has architectural differences which make it potentially better suited to scaling high (both in terms of handling lots of users and in managing lots of complexity), but you need to have some clue to actually exploit them.
It's like comparing MySQL and Oracle; they both do largely the same thing, but Oracle's a lot more advanced and aimed a lot higher. From the article summary, it sounds like this guy just doesn't need that extra power. Good for him, I hope he's happy, but like you I don't share his low standards in languages -- my last few bits of web development have been with Ruby and FastCGI, and I'm not looking forward to my next bit of PHP maintenance.
Smarty is an extra layer of complexity; I think a well-written site should avoid it.
PHP is itself a template language, so Smarty buys you nothing in that respect except another (badly-designed) pseudo-language to learn. When I used Smarty I had to drop down to PHP all the time and then it struck me how silly the whole thing was (my designer knows a little PHP, but was just confused by Smarty).
The output caching thing can be solved by a 5-line function that renders your PHP template, captures the output using the ob_* functions, and rewrites the copy on disk whenever it is stale. You can also create a hierarchy of dynamic sites and spider them (with wget, etc.) to create a parallel hierarchy of static pages. (I have used this for a customer that wanted fast static item pages on their e-commerce site; it improved the site tremendously.)
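For anyone who wants to see the shape of that caching function, here is a rough sketch. It is a Python analog rather than the PHP original: `contextlib.redirect_stdout` stands in for the `ob_start()` / `ob_get_clean()` pair, and the names `cached_render`, `render`, `cache_path`, and `max_age_seconds` are made up for the example. Treat it as one possible way to do it under those assumptions, not as the code from my site.

```python
import contextlib
import io
import os
import time


def cached_render(render, cache_path, max_age_seconds=300):
    """Return the output of render(), reusing the copy on disk while it is fresh."""
    try:
        fresh = time.time() - os.path.getmtime(cache_path) < max_age_seconds
    except OSError:
        fresh = False  # no cache file yet, so it counts as stale
    if fresh:
        with open(cache_path, encoding="utf-8") as f:
            return f.read()
    # Capture everything the template prints (the role the ob_* functions play in PHP).
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        render()
    html = buffer.getvalue()
    with open(cache_path, "w", encoding="utf-8") as f:
        f.write(html)
    return html
```

The template itself stays an ordinary function that prints its output; the caching lives entirely in the wrapper, which is why it is easy to bolt on later.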
My "template rendering" code is barely one page of PHP; look how bloated Smarty is.
Some people benefit from Smarty, but my advice is, You Ain't Gonna Need It! Keep your code as *simple* as possible. Then when you need to add caching, it's easy to add.
I agree, but I've also seen the opposite - fairly simple projects completely buried in the complexity of multi-tiered architecture.
An example was a project I "inherited" a few years back that was written with ASP for the presentation layer, business logic in COM objects, MS-SQL stored procedures for the database calls and MS-SQL for the backend database. It needed three developers to maintain all the different parts, and a simple change like displaying an existing database field on a web page meant changing code in three different places.
Debugging was also several orders of magnitude more complex - I remember a serious crash bug we were chasing for days, which we only solved by rewriting the thing as a single ASP script with direct db calls so we could see what the bloody thing was doing. Obviously we had access to the existing working multi-tier code as a base, but we did this in only a few hours.