Yahoo Moving to PHP
Erek Dyskant writes "Yahoo has decided to switch from a proprietary system written in C/C++ to PHP for their backend scripting. Here's the notes from a presentation by a Yahoo engineer at PHP Con 2002."
Going from something speedy and efficient to PHP.
Why not switch to J2EE? Obviously, this is an extremely large enterprise web-app. They could take full advantage of all EJBs and Webapp clustering. I just don't see why you'd use PHP, when J2EE has so much more of an advantage on an enterprise level.
On reading the slide show, the reason not to pick J2EE:
you can't really use Java w/o threads
Threads support on FreeBSD is not great
Is this really a bad thing?
Especially for the advantages EJBs give you??
Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
It runs on all platforms, it is widely supported and deployed and you can get it at every webhoster.
And there are more Apache servers running the PHP module than IIS servers altogether:
securityspace
Apache/PHP is marginalizing IIS and all other servers.
Both Microsoft lovers and monopoly-whiners will hate it, but those are the facts.
to trigger happy mods - again, not a troll, I'm curious.
sic transit gloria mundi
Viewing Yahoo's press releases, you can see no information that leads us to believe that they are switching at all?
Is this a story just to grab attention to the presentation?!?
Yeah, I'm a Republican AND a geek. It is possible.
I'm glad Yahoo is moving to OSS and recognizes the dangers of proprietary software.
I'm a Perl guy, and it was very interesting to note that:
1. Perl beat PHP in all of their performance tests
2. They listed TMTOWTDI as a "con" yet,
3. One of the requirements was a language that didn't require a CS degree to use. TMTOWTDI helps that, I've noticed.
I'm saddened that Perl has lost a major cheerleader but at least it isn't MS technology.
Even so, I can actually see how PHP is more appropriate. For a site with lots of content, with code mixed in, PHP's "code in the page" model is more ideal. I've had to reinvent something similar in Perl many times, appropriate for whatever I'm working on at the time (I don't like Mason, I prefer my own solution.)
I can see how a solution such as mine - where I prepare an output hash of data then show a webpage by opening and printing the file, using s/// to insert my hash contents with a search/replace method, isn't exactly ideal for Yahoo's high-content needs.
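For anyone who hasn't seen the "fill a hash, then search/replace it into a page" approach described above, here is a rough sketch of the idea in Python rather than Perl. The `{{name}}` placeholder syntax, the `render` function, and the sample values are all made up for illustration; the poster's actual markers aren't given.

```python
import re

def render(template: str, values: dict) -> str:
    """Replace each {{name}} placeholder with values[name].

    Placeholders with no matching key are left untouched, so a
    missing value is visible in the output instead of vanishing.
    """
    def substitute(match):
        key = match.group(1)
        return str(values.get(key, match.group(0)))

    return re.sub(r"\{\{(\w+)\}\}", substitute, template)

# Hypothetical page and data, standing in for a file read from disk.
page = "<h1>{{title}}</h1><p>Hello, {{user}}! {{missing}}</p>"
output = render(page, {"title": "News", "user": "Erik"})
print(output)  # <h1>News</h1><p>Hello, Erik! {{missing}}</p>
```

It works, but every page view pays for a regex pass over the whole template, which is part of why it doesn't scale nicely to a content-heavy site.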
While PerlScript somewhat solves this problem, I remember it being buggy and certainly not as mature as PHP in that regard.
I can't say that I think this is a mistake on Yahoo's part - more like, I think if they wanted to, they could solve Perl's shortcomings and reap the benefits that Perl has over PHP. I guess they're just not interested.
The presentation was a little vague, wish I knew more about the details of their decision.
# Erik
Praise the Lord and pass the pretty-printer! I'm not a PHP fan, but I don't think any of us can make a strong argument against it, except that it's not a general-purpose language, and thus falls into the same geek category as Cold Fusion, Office macros, and, well, ASP. There's a very strong bias against using tools crafted for the job when the job is defined as a presentation method.
If you like, blame the tacit geek belief that any language they learn should allow inline assembler, have CORBA bindings, multithread, and let you hack a serial port monitor to control intelligent coffeemakers.
"Freedom is kind of a hobby with me, and I have disposable income that I'll spend to find out how to get people more."
I can't for the life of me figure out why so many people pick it for web apps.
I guess everyone is smoking crack except you. Seriously, why does MySQL get all this smack talk? I use it because it's easy, every language I know of has bindings for it, it's fast enough, and it's stable. PLEASE spare me your "But XXX does that too, not to mention bla bla bla!" No, I won't switch, because I learned MySQL first (as I'm sure many others have) and so far it hasn't let me down.
python -c "x='python -c %sx=%s; print x%%(chr(34),repr(x),chr(34))%s'; print x%(chr(34),repr(x),chr(34))"
Besides being a move to an open-source-based solution, which is good for the community as a whole, this will be good for the development of Enterprise Level PHP. It has been my observation (perhaps a somewhat inaccurate one?) that PHP has long been overlooked as a solution for high-volume sites...and perhaps rightly so with PHP in its current state. But no doubt Yahoo! intends to make PHP work for them. I can just hope that some of their innovation trickles back down to the mainstream development of PHP where it can be used to further improve its implementation.
PHP is like PERL made for the web, it has easier access to databases than any other language I know of
I disagree; Perl's DBI interface is *far* simpler (and the functions are not DB-specific like PHP's). (I think PHP solved that in the not too distant past though)
Take a look at the top downloads at SourceForge. What is the most downloaded server-side web application?
For those of you too lazy to click the link, the answer (at the moment) is phpBB. #2 is Webmin (Perl), #3 is phpMyAdmin, and #4 is PHP-Nuke. (I'm not counting JBoss as #4 because JBoss is the server itself rather than a web app designed to run on a server).
So, we have
1) PHP
2) Perl
3) PHP
4) PHP
BTW You can get #1 and #4 bundled together as LiquidNuke.
I'm simply curious - for most jobs that MySQL is used for, there are better free databases (sometimes not by much, but that's not the point), yet MySQL seems to be the only free RDBMS anyone's ever heard of. I am trying to picture why, and asking people involved (it seems like) with the decision process in a company actually using it, seems like a good way to find out, no?
sic transit gloria mundi
I use it because it's free, postgres isn't quite as well integrated with php, and I don't need the extra features of a more complete SQL engine for what I do anyhow (no transactions, for example). Plus the documentation for MySQL is great, because there are so many users.
Jeremy
Undisputed. But there are languages that go out of their way to make it easier and perl doesn't. Not a criticism of perl necessarily, just another point in the decision making process.
I personally love perl - I use it every day and it's usually one of the first technology choices for new projects. But coding in a team environment is often still a very hard thing (not every one out there is that great at doing the wonderful things you listed, and firing people left and right is often not an option), and I'll take all the help I can get there.
sic transit gloria mundi
Their reasoning for throwing out Java seems like utter nonsense. What on Earth does FreeBSD have to do with anything?
They used a red herring argument about thread support on FreeBSD (they should change OS's anyway) to discount what's obviously the best choice - Java technology.
I hope more than this one PHP cheerleader is making the decisions on this.
In fact one of the reasons they didn't use Java instead of PHP is that Java on FreeBSD isn't up to par.
So it's time to invest in Walnut Creek... no wait it's BSDI... no wait it's WindRiver... oh damn it! I give up.
*BSD is dying...
Well, your items 1 and 4 seem identical to me, and this is also something that depends on the design much more than the language.
My application designs always separate presentation from logic, but that's because I choose to. It might not be worth the extra complication for simpler solutions. After all, being able to embed PHP in HTML is part of why it has been so successful. However, this is just a feature of the language, not an enforced characteristic. For some reason, this seems to be a big misconception for many people.
You can have functions that generate the presentation in your format of choice (HTML, XML, etc.), static files that you include, or whatever. I've written code in most every Web scripting language (though very little with ASP), and I never felt more or less restricted in this regard with any of them. No language automatically designs your application for you.
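To make the "functions that generate the presentation in your format of choice" point concrete, here is a small sketch in Python (the same shape works in PHP or Perl). The function names and the sample stories are invented for illustration; the only point is that the logic function returns plain data and never touches markup.

```python
from html import escape
from xml.sax.saxutils import escape as xml_escape

def top_stories(limit):
    """Logic layer: returns plain data and knows nothing about markup."""
    # Made-up sample data standing in for a database query.
    stories = [
        {"title": "Yahoo & PHP", "comments": 612},
        {"title": "BSD <still> alive", "comments": 45},
    ]
    return stories[:limit]

def as_html(stories):
    """Presentation layer, HTML flavour."""
    items = "".join(
        f"<li>{escape(s['title'])} ({s['comments']})</li>" for s in stories
    )
    return f"<ul>{items}</ul>"

def as_xml(stories):
    """Presentation layer, XML flavour, fed by the same data."""
    items = "".join(
        f'<story comments="{s["comments"]}">{xml_escape(s["title"])}</story>'
        for s in stories
    )
    return f"<stories>{items}</stories>"

print(as_html(top_stories(1)))  # <ul><li>Yahoo &amp; PHP (612)</li></ul>
print(as_xml(top_stories(2)))
```

Nothing in the language forces this split; it is a design choice, which is the point being made above.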
Oh, and PHP has always had native database support for several databases. This is actually one of the complaints against it, as people like to write code that fits an abstraction layer instead, allowing them to switch databases. However, find some performance studies, and look at how well PHP performs with query-intensive applications.
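For readers unsure what "code that fits an abstraction layer" buys you, here is a rough Python analogy (not PHP, and not Perl's DBI). Python drivers share the DB-API interface from PEP 249, so a function written against a connection object only needs a different `connect` call to move databases. The table, the data, and the `count_comments` helper are made up; note that the `?` placeholder style is what `sqlite3` uses and other drivers may differ, so the portability is not perfect.

```python
import sqlite3

def count_comments(conn, story_id):
    """Uses only cursor(), execute(), fetchone() and close() from the DB-API."""
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM comments WHERE story_id = ?", (story_id,))
    (total,) = cur.fetchone()
    cur.close()
    return total

# Only this setup is driver-specific. conn.execute()/executemany() are
# sqlite3 shortcuts, used here just to build a throwaway in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (story_id INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?)",
    [(1, "first"), (1, "second"), (2, "other")],
)

print(count_comments(conn, 1))  # 2
```

The trade-off mentioned above still applies: a native, database-specific API can skip this layer entirely, which is one reason it may be faster for query-heavy pages.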
Your other points are all valid, though I for one have never understood the praise for OO.
I myself have been an avid user of PHP for many years and I love it, but true, there are many that despise it.
Why?
Because like Mr. Radwin (the author of this presentation) says, PHP is simple to use. It has quite a bit of error protection and it deals with sloppy code. The elite programmers among us hate this - they see people who have not spent the last 12 years of their life learning a language but are producing the same (or similar) results. PHP itself is great, and the fact that a corporation like Yahoo! has decided to use it over all the other alternatives just reinforces that.
To make a pun demonstrates the highest understanding of a language
I am a huge PostgreSQL fan myself, but for mostly-read databases (like most web databases) MySQL is hard to beat.
Mostly valid points, and as far as security is concerned, I think more security problems come from the code written by the engineers themselves, not from the underlying language. I've seen some pretty insecure Perl code in my day.
The only reason I'm replying to you is because of this sentence: "...in my experience PHP is faster, more secure, more feature rich, way easier to compile and maintain, and takes far less code to accomplish the same things as Perl."
As far as faster goes, according to the benchmarks that are included in the article, YSP, which seems to be their mod_perl solution, was faster (though not by a huge amount) than PHP. The one place where it did worse than PHP was memory footprint, though that seemed to be by the same small margin that it beat PHP by in speed.
Secure: I refer to my previous statements, I think security is a bigger issue with the end code than the underlying language. Either way, both of these languages have been very good about getting fixes for security problems out, and publishing security problems so we can avoid being victims of them.
Feature Rich: CPAN is my only point here. From my experience, CPAN has more modules and more mature feature sets than PHP. I'm sure that will change over time, but I have found it much easier to get new functionality/features from CPAN than any other language.
Easier to compile and maintain: Compilation isn't really all that much of an issue with perl or PHP. If you're talking about compiling in mod_perl vs. mod_php, both of them were easy as could be for me, and both can be done through RPM. If you're talking about your actual application code, I honestly don't think that is an issue for either language, and a moot point. As far as maintenance is concerned, well written perl code is very easy to maintain. PHP code tends to have HTML & code in the same file, which I've found causes no end of headaches when working on a project that has separate template/HTML coders and PHP/Perl coders. It can be written to separate those two, and IMHO, should always be written that way, but in my experience, it rarely is. Similarly, I'm not a fan of Mason with mod_perl because I find the same problem crops up there.
far less code: Honestly, I can't comment on this one, so I'll take your word for it. I can generally get a lot of things done with little perl code, however, and I've never sat there for days working on the same function wishing that I had to write fewer lines.
Perhaps this will lessen the snobbery that exists regarding PHP over more bloated languages. The same snobby attitude that would have stopped HTML in its tracks in the 90's.
Let's bring back the pioneering attitude that made the web what it is today...
GO YAHOO!
[preparing to get modded to -20]
Seems a bit strange that when you have 612 developers (!) you would rule out ASP simply because of the cost of buying Windows - plus, I'm sure MS would give them a sweet deal. Surely developer productivity and turnaround time is the most important thing?
I'm not saying they should have used ASP, just saying it's a strange basis for a decision. And they didn't even look at ASP.NET which solves the separation of code from layout better than anything I have seen.
Read reviews of shopping cart software
Let's see, web development is a) parsing strings, and b) concatenating them. Which of these is Java good at? Well, neither. For the former, nothing beats a language with built-in regular expressions. Yes, I've used the ORO library from the Apache Foundation. Yes, it's a solid implementation of Perl 5 regexes -- but it's implemented in Java (slooooowww), and it's a pain in the ass to escape everything twice, once because it's a Java string literal and again because it's a regular expression. And even aside from those two problems, compare the following snippets:
    String foo = "bar";
    Perl5Util re = new Perl5Util();
    if ( re.match( "/\\s*b\\+z/", foo ) )
        foo = re.substitute( "s/[ \\t]+//g", foo );
...and in Perl:
    my $foo = 'bar';
    $foo =~ s/\s+//g if ( $foo =~
Cleaner, tidier, more readable, and the Perl will execute in one-fifth of the time. I got stuck working on a large project with JSP, and we ended up pushing a lot of stuff out into Perl scripts because working with strings in Java is so slow. I'm not talking about small improvements; I'm talking about "when we did this in Java, the user thought the server had hung, but now the user doesn't notice the wait".
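The "escape everything twice" complaint is easy to show in any language where regexes live inside ordinary string literals. Here is a small Python sketch, reusing the patterns from the Java snippet; the test strings are made up. A raw string (`r"..."`) sidesteps the first layer of escaping, which is roughly what a built-in regex literal gives you in Perl.

```python
import re

# Ordinary string literal: each backslash must be doubled to survive
# the string parser before the regex engine ever sees it.
escaped_twice = "\\s*b\\+z"

# Raw string literal: backslashes pass through untouched.
raw = r"\s*b\+z"

same_pattern = escaped_twice == raw
print(same_pattern)  # True

matched = bool(re.search(raw, "  b+z"))
print(matched)  # True

# Strip runs of spaces and tabs, like the substitute call above.
stripped = re.sub(r"[ \t]+", "", "b a\tr")
print(stripped)  # bar
```

This only illustrates the readability point; it says nothing either way about the speed claims.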
And I HATE Perl. But after using Java, I hate Java more. The only thing Java's got that Perl ain't got is OOP features. No, Perl has no OOP features. They have a hilariously ill-conceived imitation that's such a pain in the ass to use that the tutorial says "most Perl programmers will never define a class; only wizards do that". Yeah, well, in any well-designed language (or even a lame but rational one like Java), defining a class is trivial. If you fuck up a feature so badly that only "wizards" have the patience to learn it, yeah, sure, only wizards will use it. That just proves that the language designer is a fool. Frankly, I doubt very much that Larry Wall or anybody else involved in the design of Perl 5 had a firm grasp on what OOP is about, what it's for, and why people use it. It's like asking an Eskimo to design a garage. Go ask an Eskimo to design an igloo, and you'll get one hell of a solid igloo (as Perl is one hell of a procedural quick'n'dirty-text-processing language), but when you want a garage, hire an architect who owns a house that has one, and keeps his car in it, too. Common sense.
PHP's not Perl, though. I'm not thrilled with it as a language. I hate Microsoft even more than Larry Wall, but ASP with JavaScript is not a bad way to do web pages (JS is less text-friendly than Perl, but far, far more so than Java, and it's vastly better suited than Java to the kind of quick development these Yahoo! guys are talking about wanting; it's got proper closures (better closure implementation than Perl, by far) and a clean, simple, and powerful OOP implementation -- you can define new classes pretty much on the fly. JS is a very expressive and pleasing language to write code in, and any halfassed JS interpreter will be at least somewhat faster than even the best JVM). ASP.NET (so-called) apparently has FINALLY, after YEARS of cluelessness, learned from JSP and started including some of the cool architecture that JSP has (letting pages inherit stuff from a parent class, for example, which is real nice in a large project; there's also other modularization goodies) -- but now you don't have to wade through Java shit to use it. Too bad it's a Microsoft product. But you take it as it comes, eh?
Oh, and those 4,500 servers? Much lower TCO than those Big Iron dinosaurs. Furthermore, they can be replaced gradually and (once again) cheaply. You don't need it all happening in the one box, because it's a million separate Apache processes spitting out little HTML pages to a million different clients. Centralizing that buys you nothing. No one Apache process has to give a damn about any other.
On some of the larger projects I have worked on, where integration is important and a key to the success of the product, JAVA seems to be the best bet.
Not to say that things couldn't be done in PHP, they probably can... but I have had a lot of luck in writing all my business logic and middleware in JAVA and then using JSP or Servlets + Velocity for presentation. The thing is, it's not something that someone can do without a middleware engineer and an implementation engineer.
I have been writing Java middleware code for almost 3 years now; some of it integrates into web-based services, some of it ties into legacy workflow systems, and some even ties into an IBM mainframe. I just can't IMAGINE doing all of that in PHP... I would have been laughed out the door of my company, as a matter of fact, with a pink slip in my hand.
When it comes to pulling in 3rd party libraries for various tasks that come up, JAVA is first rate.
I worked for a company that had a pretty complex logistics based system that integrated with a German logistics company.. it was ALL done in PHP.. I couldn't believe it when I saw it to be honest, but to say the least... it was VERY difficult to manage the application as it grew to many hundreds of classes and pages. The company ended up moving to an EJB/JSP solution on websphere I think, and eventually was able to cut out about 1/2 of their engineers because the API became quite manageable by fewer people.
You can't call JAVA hype any more than you can call COBOL, FORTRAN, or C/C++ hype, because the impact JAVA is having on the industry at the moment is on that level, IMHO.
NOW.. if the project doesn't really reach beyond basic web applications, yes, even very large companies have such projects.. I see nothing wrong with PHP. It's actually a breath of fresh air when I need to hack something out quick and simple. I use HORDE+IMP for my own personal email and the email server for my wife on my linux box.
This slashdot intro suggests that Yahoo currently only uses C/C++. This isn't correct. Many of Yahoo's services use other languages such as Python, Perl and LISP.
You've never heard of ColdFusion? Or do you think ColdFusion isn't a "real" language, because it isn't hard to learn?
Actually, I learn what I am paid to learn. My employer never cared about cold fusion so I never learned it. I know often people comment about everything, but I stick to what I know and what I know is the languages which can easily gain me employment.
What's the big advantage of PHP over C++? The author mentions C++ being "cumbersome" and "prone to buffer overflows". That's a load of BS.
If you have a proper set of string, socket and associative array libraries, C++ works just as well as PHP and offers a whole lot more, if only the ability to check for existence of variables. Plus you don't have to type these f?c?i?g dollar signs in front of every variable.
The other argument, "memory leaks degrade server performance" indicates bad programming, which is not going to be improved in PHP, but which can be solved in aforementioned libraries. Simply don't allocate anything dynamically outside the libraries. Plus, PHP has its own memory problems, if you don't take care of your arrays. Of course, memory will be freed as soon as the program stops, but that holds for C++ as well and memory management by process termination is another sloppy practice.
So PHP is easier to use in Apache, but that should not be a reason for making such an important change to your code.
And then comes the biggest joke of them all: the list of criteria! Do they really want us to believe that data types in PHP are better than in C++? Do they really think that PHP has "a pleasant syntax"? Or is this simply a red herring?
I can imagine using PHP for web-sites where being neat and efficient simply doesn't pay off, but for the "world's largest web site"...
I'm amazed.