Yahoo Moving to PHP
Erek Dyskant writes "Yahoo has decided to switch from a proprietary system written in C/C++ to PHP for their backend scripting. Here are the notes from a presentation by a Yahoo engineer at PHP Con 2002."
Obviously open source DOES apply to the corporate world.
Ignore the "p2p is theft" trolls, they're just uninformed
I'm sure it must be easier to find someone who knows PHP than it is to find someone who does cgi-bin C/C++ to maintain their sites. We use PHP/ASP for many of the internal applications that monitor our network systems, and integrate it with MySQL. Works very well.
Eep, why J2EE? It's slow, it's a memory hog, it doesn't deliver on the write-once-run-anywhere promise of Java because of the vendors' differences. Perhaps most importantly for them, you really can't use Java w/o threads, and thread support on FreeBSD is not great. Read that over again. That means that Java doesn't scale well for you if your OS's threads don't scale well for you. If you're running FreeBSD, then that's the case, which further limits Java's abysmal performance.
11*43+456^2
isn't going to give any type of boost over a proprietary C/C++ app
It wasn't a proprietary C/C++ app, it was a proprietary C/C++ scripting language.
Performance should be the same or better, if I understand correctly.
S
PHP's biggest security problem is its users. PHP is a powerful and easy to learn and use tool, which means it attracts a lot of new users. And the more new users you have, the more new user mistakes are made.
PHP has made a great step forward in disabling register_globals by default. Unfortunately, a lot of legacy code isn't built for this.
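For anyone wondering why that default matters, here is a rough sketch of the problem. It is Python rather than PHP and only mimics the effect of register_globals (request parameters silently becoming variables); `handle_request`, the `authorized` flag and the password are all made up for illustration.

```python
# Illustration only: mimics the *effect* of PHP's register_globals in Python.
# handle_request, "authorized" and the password are hypothetical names/values.

def handle_request(query_params, register_globals):
    scope = {}
    if register_globals:
        # Every request parameter silently becomes a variable in scope.
        scope.update(query_params)

    # Legacy-style code: "authorized" is only set on success and is never
    # initialised to False first, so a pre-existing value survives.
    if query_params.get("password") == "s3cret":
        scope["authorized"] = True

    return bool(scope.get("authorized"))


attack = {"authorized": "1"}  # think: page.php?authorized=1
print(handle_request(attack, register_globals=True))   # True: check bypassed
print(handle_request(attack, register_globals=False))  # False
```

With the setting off, code has to read request input explicitly, which is exactly the change that legacy scripts relying on the old behaviour were not written for.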
I skimmed the comments so far and it seems like some people don't have a very high opinion of PHP. It's one thing to feel like something is better, but to despise it baffles me.
I do programming in PHP and have found it not only to be useful, but quite an upgrade over ASP. Is there something inherently bad about PHP that should make me shy away from it, or is it more of a religious debate?
It will be nice to go to a large corporate client that is looking for an enterprise solution (what the hell does that even mean) and say something like:
"I'd recommend using PHP and Postgres on the backend of the project. Given Yahoo's recent success, I think the platform is powerful, secure, and cost-effective."
I realize that what Yahoo does in reality is irrelevant, but executives like to hear that kind of shit. Of course that assumes Yahoo can make it all work well, time will tell.
Cloud City Digital: DVD Production at its cheapest/finest
The concept of Industry Standard isn't defined by "running on all platforms".
It means the software has a near monopoly on web development. It's popular, but so are CGIs, Cold Fusion, Flash, VBScript, JavaScript, and of course JSPs.
What irks me is that people haven't abandoned HTML for all but display. HTML was designed to be stateless; info wasn't remembered as the browser jumped from one page to the next. To overcome this, all sorts of gross, kludgy, slow and complicated technology has been created (including JSPs, PHP, etc, etc) to overcome the inherent statelessness of the web.
The most interesting technology I've seen (and one that I hope will put these lame ducks out of their misery) is Curl, a programming language that runs in a plug-in (yes, sort of like Java, but more advanced, with fewer of the drawbacks). It was started at MIT via a US DARPA-funded project, and includes Timothy Berners-Lee, the creator of the World Wide Web and Director of the W3C, as one of the founders.
I can't wait for the Internet to go back to what it's good at - serving up pictures of pretty, naked women.
No, I don't work for CURL, or even for a company that uses the technology. I just think it's a better mousetrap.
I agree, I just thought I'd point out that this doesn't change the fact that perl is HELL to maintain for larger projects :)
Bullshit, or at least bullshit that Perl makes it harder to maintain than any other language.
Code properly, document correctly and adhere to the same design rules for any other maintainable project (which includes firing the assholes who think that obfuscated perl has a place in a maintainable project) and you will have no more difficulty in maintaining a Perl project than you will any other.
The fact that perl lets you create a mess may be open to debate, but it certainly doesn't mean it will be a mess.
Standard Java stereotype. Java was slow a long time ago, not today. That gross assumption alone should get you modded down.
Standard Java propaganda. The Java language is plenty fast (relative to the other solutions discussed), but most of the I/O libraries are still hideously slow. Ignoring that completely should get you modded down.
Even a fairly slow computational language like Python drops Java out of the running for typical high-volume web site usage, simply because of I/O problems. Java is quite suitable in low-volume settings with stiff transactional requirements or heavy computational requirements--any setting where high I/O costs are amortized by several-order-of-magnitude higher page generation costs. It's a bad choice for a very high-volume site which basically wants to paste several database sources together into a template and shove it out the pipe; Yahoo! falls pretty squarely into that camp for most of its pages.
They also have many components written in various domain-appropriate languages or that they don't want to rewrite for whatever reason; JNI is still pretty heavyweight, and if you have a lot of language interop requirements Java isn't a great choice (though if you're willing to sacrifice some JVM portability this can often be worked around, especially if the other benefits of Java outweigh the cost of implementation).
On top of that, using EJB/J2EE will kill performance even more, which means that actually getting the feature benefits of Java requires handing away even more performance.
All that's without even addressing the "requires tons of threads" problem; multiplexed I/O is pretty new to Java, and there's no good multiprocess API. Both of those are major problems, though hopefully multiplexed I/O will mature quickly. But until there's a good multiprocess API, Java's going to be unsuitable for a number of applications (and sticking to a platform-independent mentality instead of a platform-agnostic mentality makes implementing an efficient multiprocess API very difficult indeed).
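For readers unfamiliar with the term: multiplexed I/O means one thread watches many connections and services whichever is ready, instead of parking a thread on each connection. A minimal sketch follows, in Python rather than Java and not based on anything Yahoo runs. It uses the standard `selectors` module; `socket.socketpair()` stands in for real client connections, and `echo_upper_all` is a made-up name.

```python
import selectors
import socket


def echo_upper_all(num_clients=3):
    """Serve several connections from a single thread via readiness events."""
    sel = selectors.DefaultSelector()
    clients = []
    for i in range(num_clients):
        # socketpair() gives two connected sockets: a fake client and the
        # server's end of that "connection".
        client, server_side = socket.socketpair()
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ)
        client.sendall(f"request {i}".encode())
        clients.append(client)

    handled = 0
    while handled < num_clients:
        # One call reports every connection that has data waiting.
        for key, _events in sel.select(timeout=1.0):
            conn = key.fileobj
            data = conn.recv(1024)
            conn.sendall(data.upper())
            sel.unregister(conn)
            conn.close()
            handled += 1

    replies = [c.recv(1024).decode() for c in clients]
    for c in clients:
        c.close()
    sel.close()
    return replies


print(echo_upper_all())  # ['REQUEST 0', 'REQUEST 1', 'REQUEST 2']
```

The thread-per-connection alternative would need one blocked thread per socket, which is where the OS's thread scalability starts to matter.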
Worst of all are the memory issues, but those are well-known enough not to be worth rehashing.
Sumner
rage, rage against the dying of the light
You care to back up any of the claims you're making? I have seen J2EE in production environments deployed with great success. There is nothing inherently slow about J2EE in general. "Java's abysmal performance"? In what context is Java's performance abysmal? I won't contest that for a number of tasks it is not optimal, but for server application programming tasks it really shines.
I just don't buy outright arguments like that at face value. It is *NOT* well understood or believed that what you state is true among any large groups of professional developers with proven experience deploying J2EE apps. Proof please.
Trust me, I love PHP. I wrote a book on PHP and think it can do great things, but for enterprise level applications and for quite a few tasks it just isn't there.
Jeremy
According to the slides, the only negative thing they had to say about Java (J2EE / JSP / etc.) is that FreeBSD has really lousy thread support (and proper J2EE solutions require threading)...
To me, that seems like a really stupid, short-sighted way to approach the problem. If Java is the best solution for them (which I think it would be), then why not move to an operating system that properly supports it?
Why hamstring yourself to an inferior solution just because you don't want to give up FreeBSD? That's like complaining that your Pinto is too slow -- but you'd rather fill it with Premium gas to get a little performance boost instead of getting a better car.
And what's up with 4500 servers? What a nightmare! Who in their right mind would want to deploy and manage 4500 servers? If they were really serious about this, they'd upgrade to a couple dozen big-iron IBM mainframes (like one of these!), where it can run hundreds of virtual Linux instances (if needed)...
I guess it goes back to the old saying "When you only have a hammer, everything looks like a nail"...
Dude, this isn't some little backwater ecommerce site, this is the most hit site in the world. I think it's safe to say they considered the performance. (BTW check slides 30-34 of the link for that exact info)
-Ted
-=-=- Quantum physics - the dreams stuff are made of.
3. One of the requirements was a language that didn't require a CS degree to use. TMTOWTDI helps that, I've noticed.
I have to disagree strongly here. TMTOWTDI generally means that two implementations of the same design are different enough that someone without a lot of experience probably wouldn't be able to tell that they were the same thing.
Having standard ways to do things makes it a lot easier to understand what's going on and makes it a lot easier to do things. Even in perl, people try to find a common way to do things, and it often ends up being regular expressions, even where there are far easier solutions.
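A small sketch of the point about regular expressions versus easier solutions, in Python rather than Perl. Both functions pull the extension off a filename; the function names are invented for this example.

```python
import os.path
import re


def ext_regex(filename):
    # Regular-expression version: capture what follows the last dot,
    # as long as no path separator comes after it.
    match = re.search(r"\.([^./\\]+)$", filename)
    return match.group(1) if match else ""


def ext_stdlib(filename):
    # Standard-library version: splitext returns (root, ".ext").
    return os.path.splitext(filename)[1].lstrip(".")


for name in ("index.php", "report.tar.gz", ".bashrc"):
    print(name, repr(ext_regex(name)), repr(ext_stdlib(name)))
```

The two agree on ordinary names but not on `.bashrc`: the regex reports `bashrc`, while `splitext` treats a leading dot as part of the name and reports no extension. Two "equivalent" implementations that a newcomer would struggle to recognise as the same thing, and that are not quite the same thing anyway.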
-- The world is watching America, and America is watching TV.
I've mostly explored JSP, Zope, and PHP. JSP is cool, tons of support, it feels like and acts like it's the enterprise solution. As such, it's a logical choice for a lot of things because if you need a hammer it's nice to have a sledge hammer. The reality, at least as I've seen it, is it's a bitch. It's huge, it's slow, it takes a super computer to really run, I've seen a fair share of sloppy JSP. It's cool, has all the gizmos that java has and it also has all the gizmos that java has. It seems like you need a ton of crap to build a lot of java stuff, even things from Sun like the JMAPI need 3 or 4 30MB downloads before you can build them and get them working, maybe that's just me complaining though. I'm also not sure what kind of vibe I get about Sun and java as a whole technology any more, I'm not saying it's going away or anything like that but it's not the goose that laid the golden egg anymore either. I don't know I'd tie my cart to java if my cart was as big as yahoo! Again, just my opinions, my C++ and assembly (of all things) skills have taken me farther the last couple years and got me jobs when there weren't any, java has just filled out the resume. While I'm knocking one of the most popular platforms out there let me also throw out the java developer base issue. Java was like a dot.com programming language, in no time it instantly had a huge developer base; how quickly do you think they'll run to the next great thing when/if it happens? I've wondered what would happen if sun started charging for the JDK. Or if .Net 2.0 really rocks and mono takes off.
Zope. What to say about it. It's the bomb. It's also Python which is huge and on the cusp of going really huge, but hasn't yet. It's its own custom thing. It has a ton of cool parts you can drop in to it. It's probably my favorite. It is also a pain in the ass going to zope.org downloading something and trying to get it to work. It's like they have their own little sourceforge.net running in zope space and it makes the number of available parts look bigger than it really is. It's getting better but there is a lot of dead stuff on there. It also won't drop in to Apache that easily, you usually use their custom server and transport layer. It's not so bad but it's nice to be on mainstreet; it's more trustworthy. Other than that, it rocks, it's just a bit tough to sell it to someone who knows some of the buzz. If it were all up to me Zope would be the next big thing but it doesn't look like it's all up to me.
Then I stumbled on to PHP and it kind of rocked my world when I first started screwing around with it. For simple kinds of web things, like dumping some tables out of a database or something, it kicks the hell out of anything else I've seen. It seems like a few lines of PHP and it's done. No magic web server/container, just the apache server on your redhat box. Then some of the tools and kits that have been put together with it make it a much more compelling application platform. Zope really appeals to my aesthetic sense of software engineering, I like python, I like the structure and the object nature; it just hasn't caught on like the wildfire I think it is. PHP is close to it in terms of potential and reusable stuff and it's like the second coming of perl. There are still the stock issues, is it fast enough? can it scale? will it last? It seems like those answers are yes. Can it scale better than JSP? I bet for a shop like yahoo! there isn't a comparison; I bet PHP wins unless they triple the amount of RAM that they have or switch off of FreeBSD boxes to S/390s or Sun "Enterprise servers." Also, PHP has such a grass roots following and has really grown up slowly compared to java, I don't see a lot of PHPers really dumping it anytime soon as it is. Now that Yahoo! is involved, PHP may go up to that next level.
- There's More Than One Way To Do It - This is a feature, not a flaw! Perl is much more flexible and powerful than PHP. Maintainability comes from coding standards, not language limitations.
- poor sandboxing, easy to screw up server - Perl can create sandboxes with the Safe module... (And if there's any rough edges, Yahoo's engineers could probably handle it.)
- wasn't designed as web scripting language - So what?? With mod_perl and HTML::Mason or TT2, Perl fits this niche well, without PHP's predisposition towards mixing code and data.
These excuses for not using Perl are hardly compelling; they sound like rationalizations. Perl is a more natural fit for Yahoo's needs, especially considering that they already have 3 million lines of Perl code.
But they plowed ahead with PHP, and what did they learn?
- very easy to get some pages up quickly - Expected, but Perl would have been nearly as easy, and probably much easier for their existing Perl programmers.
- But mixed app/presentation problematic - PHP code and HTML forever intertwined - Surprise, surprise! This is exactly why PHP is inappropriate for enterprise applications. PHP encourages such shortsighted design. Beginners like it, but engineers should know better.
- PHP != Perl - The "implement twice" problem - They knew that they had 3 million lines of Perl in the backend; why didn't they leverage it? This was avoidable.
- PEAR != CPAN - repository smaller, less mature than CPAN - Again, this was a foreseeable problem.
- Surprises for people used to coding Perl - It's not just that some semantics differ. Experienced Perl programmers forced to work in PHP have to live with the frustration of having to write ugly convoluted code for things that would be clear and simple in Perl. PHP 4 filled in many gaps, but it just doesn't work as well as Perl does. (I speak from experience here.)
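To make the "mixed app/presentation" complaint above concrete, here is a rough sketch of the alternative that template systems such as HTML::Mason or TT2 aim for: the page skeleton lives in a template, and the code only supplies data. It is in Python with the standard `string.Template`, purely as an illustration; `PAGE`, `load_headlines` and `render` are hypothetical names and the headlines are dummy data.

```python
from html import escape
from string import Template

# Presentation: a page skeleton with named placeholders and no logic in it.
PAGE = Template("<h1>$title</h1>\n<ul>\n$items\n</ul>")


def load_headlines():
    # Application logic: stands in for a database or feed lookup.
    return ["PHP at Yahoo", "Perl & CPAN"]


def render(title, headlines):
    # Escape the data, then hand it to the template; the template itself
    # never runs any code.
    items = "\n".join(f"  <li>{escape(h)}</li>" for h in headlines)
    return PAGE.substitute(title=escape(title), items=items)


print(render("Top stories", load_headlines()))
```

The list items are still built in code here, so the separation is only partial; a fuller template language would move that loop into the template too.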
So let's see. Their problems with PHP basically boil down to the fact that it's not Perl. (Despite the claims of PHP advocates, it's just not an equivalent substitute.) Of course, any experienced Perl programmer familiar with PHP could see these issues coming from miles away. They rejected Perl as an option, claiming that it wouldn't be maintainable, then discovered the amount of discipline required for PHP -- would following good coding standards for Perl really have taken any more discipline?
Perl was a natural fit for their needs, and the obvious choice. The entire presentation seems to be an exercise in rationalization, attempting to justify a poor strategic decision. They should have used Perl. (Even now, they should probably abandon PHP and use Perl instead, to save themselves from getting further entrenched into this bad decision...)
Deven
"Simple things should be simple, and complex things should be possible." - Alan Kay