Choice of Language for Large-Scale Web Apps?
anyon wonders: "PHP is the most popular language for the web. eBay uses ISAPI (C), Google uses C/C++ (search), Java (gmail), and Python. Microsoft uses ASP (what else?). For a small web site, it really doesn't matter. What's your take on language choice for large-scale web applications? Maybe language choice is irrelevant, only good people (developers) matter? If you can get the same good quality people, then what language would you choose? Considering the following factors: performance, scalability, extendibility, cost of development (man-month), availability of libraries, cost of libraries, development tools? Has there been a comprehensive comparison done?"
For everything.
Apple uses WebObjects for its online store and the iTunes store. Consider that those are under a lot of stress. They seem to be the biggest examples of its use, so I don't know how it performs in other situations. But as an all-around package, it seems to be pretty good.
No question about it!
performance, scalability, extendibility, cost of development (man-month), availability of libraries, cost of libraries, development tools
Performance? Assembly will give you the best performance, followed by C and C++. None of the three has great support for web apps.
However, Java is almost exclusively being used for large enterprise websites. It's powerful enough to handle the big jobs, and using the appropriate app server will give you great performance.
Cost of development is heavy in initial development, but pays for itself in maintenance. Most libraries and APIs are free in Java (Struts, Spring, Hibernate, Tapestry, etc.). I'd say they are second to Perl in terms of freely available and powerful libraries and APIs.
Development tools? Just check out the (free!) Eclipse platform.
In my mind there is no question that Java (more specifically J2EE) is the best option for general large scale enterprise applications.
Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
For large scale applications, Java, C/C++, Perl, and PHP just don't cut it. You should really check out mod_fortran. Everything you love about Fortran with none of the hype.
(yes I program with this monstrosity of a system)
"Have you ever thought about just turning off the TV, sitting down with your kids, and hitting them?"
Most people tend to forget to take a productivity point of view and let themselves be guided by whatever is available or what's cool. If you follow a productivity approach it will help you make the trade-off decisions between interpreted languages like PHP and compiled languages like C/C++, with ASP and Java somewhere in between.
There is a balance between development and production. When you go live, if your web app is well designed, it should be easy to add hardware to compensate for performance issues (a server is about US$2,000, or the equivalent of 10-20 hours of developer time).
The single most important piece of advice after recommending that you spend more time on designing the app: don't get married to the language. Be prepared to use PHP to develop quickly and understand what works and what doesn't for your web-app. Once you have solved the usability bugs, investigate how you can drive efficiency by choosing a different language or not.
There is no template for what is the best environment, only your common sense, and oh... did I mention that you should spend more time designing your app?
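To put numbers on the hardware-versus-developer-time point above, here is a rough sketch of the arithmetic. The function names are made up for illustration, and the only figures used are the ones quoted in the post (a US$2,000 server, 10-20 developer hours).

```python
def implied_hourly_rate(server_cost_usd, developer_hours):
    """Hourly developer cost at which the server and the hours cost the same."""
    return server_cost_usd / developer_hours


def cheaper_to_add_hardware(server_cost_usd, tuning_hours, hourly_rate_usd):
    """True if buying one more server costs less than the tuning work would."""
    return server_cost_usd < tuning_hours * hourly_rate_usd


# The post's figures imply a developer cost of roughly US$100-200 per hour.
low = implied_hourly_rate(2000, 20)   # 100.0
high = implied_hourly_rate(2000, 10)  # 200.0
```

By this reckoning, any optimization that would take more than a couple of developer-days is worth comparing against simply adding a box, assuming the app really does scale out.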
I use PHP myself because it focuses on one thing and doesn't get distracted by trying to do more than it's built to do: serve dynamic web pages.
Sure, you can use it to dynamically generate images, PDFs, and a lot more, but these things tend to slow it down and detract from what it is meant to do. They should be handled by third-party apps, preferably on a different server; that way you separate your processes and keep PHP focused on its task.
Plus, with the improvements in the Zend engine and its object-oriented programming, PHP is now comparable to, and sometimes even faster than, Java.
People will say that it doesn't scale, but they base this opinion on a preset prejudice or on the scalability of the underlying architecture. But PHP's engine is actually more compact than the JVM because it has less to focus on, and thus can scale alongside Apache, the entire way.
And with tons of larger companies moving to PHP, it has proven it can handle the load.
My only complaint, though, is developers who try to do EVERYTHING in PHP. With all the added modules, it does have the potential, but do you really want to waste processing power letting PHP handle all these extra tasks? Use PHP for dynamic webpages; for any added processing you need to do, I suggest moving to a secondary app, preferably built in C/C++ or even Java. That way you get the most bang for your buck.
This is my sig. There are many like it but this one is mine.
Repeat after me: Java is not JavaScript, JavaScript is not Java.
Wrong.
AJAX asynchronously calls any server-side technology without needing a page redraw. It could be Perl, ASP, or anything else that can respond to an HTTP request.
Please read the docs about Ajax before telling me something that has nothing to do with it.
Please follow your own advice.
Ajax, which stands for Asynchronous JavaScript and XML, does not necessarily imply Java on the backend. Many Web application frameworks, such as Ruby on Rails, include Ajax helpers. I'm sure many Java Web app frameworks have also added support for it.
Adaptive Path has a nice article introducing Ajax called Ajax: A New Approach to Web Applications.
Java:
front end - Tomcat running JSPs (JSTL or Velocity for templating)
in the middle - Spring and Spring MVC
Closer to database - Hibernate.
Ideally, everything running in same JVM. Add more servers for scalability front-ending them with load balancer with sticky sessions.
No J2EE fluff, easy to find people, good productivity.
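For anyone wondering what "sticky sessions" amounts to, here is a minimal sketch of one way a balancer could pin a session to a server: hash the session ID and pick a backend from the result. The backend names and the `pick_backend` helper are hypothetical; real load balancers often use a cookie or source-IP rule instead, so treat this as an illustration of the idea only.

```python
import hashlib

# Hypothetical pool of identical app servers (Tomcat instances in the post's setup).
BACKENDS = ["app1:8080", "app2:8080", "app3:8080"]


def pick_backend(session_id, backends=BACKENDS):
    """Map a session ID to one backend deterministically.

    The same session ID always lands on the same server, so session state
    held in that server's memory stays reachable.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

One caveat with plain modulo hashing: adding or removing a server remaps many existing sessions, which is one reason cookie-based stickiness is common in practice.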
You can certainly make a large, high-traffic site in Python. But not with Zope. Zope is brutally slow, and the only thing you can do about it is shove a cache in front of it, which does nothing to help speed up user-specific content.
Just use a decent Python web framework with a real web server; Zope is a waste of time.
As many as possible. Use PHP for the front end, Perl for input parsing, Euphoria for the graphics, JavaScript on the client-side, Moo for the database and Python for the glue to hold things together. Every language has strengths and weaknesses.
Noooooo!
It will just produce a job ad that says:
Required: 3+ years experience in PHP, Perl, JavaScript, Euphoria, Moo, and Python.
Then when they can't find any individual to fit the bill (surprise!), they will lobby Congress for more visa workers so that they can hunt the entire globe for the "best and brightest".
(Hmmmmm. What the hell is "Moo"?)
Table-ized A.I.
You are mixing up the language with the modules. There is a reason why PHP comes without all those additional modules... so you can decide what you want it to do. If you want to add all those modules to PHP and make it do all that, then you have to do it yourself. But the base install does not include them. In fact it no longer includes MySQL support in it and that too must be added as a module.
:)
:)
As far as your opinions on PHP not scaling, tell that to IBM, Avaya, Hewlett Packard, Disney, Sprint and the others who get millions of hits a day using PHP. Seems to me if sites that get millions of hits a day can handle the bandwidth using PHP, that it JUST MIGHT be able to scale.
And as far as worst security history, you again confuse bad programming with the language it is written in. For this analogy, C# and VB still hold that title. Just because the language allows you to make mistakes in your programming does not mean it is the language's fault when you create a recursive function that loops perpetually.
I suggest trying a course in logic; it makes your programming better and your argumentative rhetoric make more sense.
Java is called a language but in this context it is more of a platform which, frankly, is older, more robust and better thought-out than anything PHP has to offer--at this point. I believe PHP is great for small to medium scale web sites, but once you start to deal with the large structures that enterprise systems require, PHP is just not an option--if you want packages already available to you which are thought-out, mature and stable, like all the various J2EE solutions available.
PHP very well may be faster for an individual page--but what are you comparing that to? Tomcat set up to use JSP? Well, there's a lot of infrastructure there that a PHP developer is probably not going to use for a simple dynamic page. And the fact is, PHP is incorporating a lot of 'heavier' OO features now whose effective use is debatable when considering web apps tied to the HTTP protocol--why build and tear down your entire OO structure every time you load a page? To do that intelligently you want an application server caching these objects...and then we start talking about Java and all the years it has on PHP there.
So, I'm really just saying--some things are right for some projects, others for other projects. Choose wisely.
Let's not forget that PHP has the worst security history of any language; there are constant exploits and there's nothing you as a PHP user can do about it.
Constant exploits? For PHP, or for crappily written content management systems (ahem, phpnuke) that happen to be written in PHP?
CERT has issued two advisories for PHP itself: CA-2002-05 and CA-2002-20. Looking through the changelog I see only a handful of security fixes.
Like most languages, it's possible to write insecure code. I've seen code that executes stuff on the command line, right from a GET string. It's just as possible to write secure code.
One problem with PHP is that it's a simple language, and a lot of beginners with no experience pick it up and can use it to write applications. Knowing nothing about software development or security issues, they tend to write bad, insecure code. This has nothing to do with the language; it simply has to do with the developers. If Python or Ruby came into incredibly widespread use (i.e., available on pretty much any hosting account you can buy, like PHP is), then you'd probably see the same thing happening. It doesn't say anything about the languages; it's simply a matter of inexperienced developers writing bad code.
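The "command line straight from a GET string" problem exists in any language. As a sketch of the safer pattern, here it is in Python rather than PHP: validate the parameter against a strict pattern and build an argument list rather than a shell string. The `build_lookup_argv` helper and the choice of `nslookup` are assumptions for illustration; the code only constructs the argument list and never runs anything.

```python
import re

# Conservative hostname pattern: letters, digits, dots and hyphens only,
# starting and ending with a letter or digit.
HOSTNAME_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9.-]{0,251}[A-Za-z0-9])?")


def build_lookup_argv(host_param):
    """Turn an untrusted request parameter into a safe argument list.

    Raises ValueError instead of passing anything unexpected along.
    """
    if not HOSTNAME_RE.fullmatch(host_param):
        raise ValueError("rejected host parameter: %r" % host_param)
    # A list of arguments, not a single string, so no shell ever parses it.
    return ["nslookup", host_param]
```

The resulting list could then be handed to `subprocess.run(argv)` without `shell=True`, so input such as `example.com; rm -rf /` is rejected up front and never reaches a shell.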
Speak before you think
A solution I like is to write a Python backend that is exposed to the frontend as XML-RPC. Then use the language your designers find easiest to work in for front-end coding, usually PHP.
Python is great for the backend because it has good namespace support, which helps a lot for big complex programs. PHP, on the other hand, is well known and extremely easy for doing various web-scripting type tasks. I have a little PHP function that gets called by the PHP server for every page (without needing to be in the code exposed to the PHP coders) that simply passes the page inputs to Python over XML-RPC and puts the response into a global variable. Then the PHP coders just display the results however they need to, based on the inputs and outputs.
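A minimal sketch of the backend side of such a setup, using the XML-RPC modules in the Python 3 standard library (`xmlrpc.server` and `xmlrpc.client`). The `handle_page` function, its arguments, and the `call_backend` helper are hypothetical; `call_backend` stands in for the per-page PHP hook described above and is written in Python only so the example is self-contained.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def handle_page(page, inputs):
    """Hypothetical backend entry point: takes a page name and its inputs."""
    if page == "greeting":
        return {"message": "Hello, " + inputs.get("name", "guest")}
    return {"error": "unknown page: " + page}


def start_backend():
    """Start the XML-RPC backend on a free local port in a background thread."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(handle_page, "handle_page")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_backend(port, page, inputs):
    """Stand-in for the front-end hook: forward page inputs, return the response."""
    with ServerProxy("http://127.0.0.1:%d/" % port) as proxy:
        return proxy.handle_page(page, inputs)
```

Because the front end only sees a struct of results, the template authors never need to touch the application logic, and the backend can be moved to another machine by changing the URL.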
Some nice benefits of such a split system are that it's easy to keep UI logic separate from application logic and it's easy to split your application up over multiple servers so that it can scale to any load. For example, you might have two PHP servers, three Python servers, and a DB server dividing the load. Normal load balancing techniques work just fine for deciding how the machines talk to each other. Pretty nice to be able to just throw another server in where it's needed if you suddenly find a 9/11-type day where your site is getting unexpectedly high loads.
Of course you can split your processing up into more levels if you need to. I like to abstract out all my queries into their own XML-RPC interface that sits in front of the DB so as to not allow direct access to the DB for security reasons. Anyone trying to hack the DB would have to use my stored queries and work through my XML-RPC interface rather than being able to access the DB directly. If you're dealing with sensitive information, it's just another layer of protection. If you have to access third-party systems that use some nonstandard method of communicating, then it can help to keep your code clean if you create a proxy interface between those systems and your own that speaks XML-RPC. This way the code for speaking to that other system is a completely separate code base and your main code base is kept clean.
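The stored-queries idea can be sketched roughly as follows: callers name a query and supply parameters, and never send SQL themselves. The query names, the table, and the use of SQLite are all assumptions for illustration; in the setup described above, `run_stored_query` would be the function exposed over XML-RPC.

```python
import sqlite3

# Hypothetical whitelist: callers pick a query by name, never send SQL.
STORED_QUERIES = {
    "customer_by_id": "SELECT id, name FROM customers WHERE id = ?",
    "customer_count": "SELECT COUNT(*) FROM customers",
}


def run_stored_query(conn, name, params=()):
    """Run a named, pre-written query with bound parameters."""
    if name not in STORED_QUERIES:
        raise KeyError("unknown stored query: %s" % name)
    cursor = conn.execute(STORED_QUERIES[name], tuple(params))
    # Lists rather than tuples, so the result marshals cleanly over XML-RPC.
    return [list(row) for row in cursor.fetchall()]


def make_demo_db():
    """Build a small in-memory database for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.commit()
    return conn
```

Since parameters are bound rather than spliced into the SQL text, a caller who compromises the front end is still limited to the queries on the list.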
At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
I actually like PHP for large-scale web apps. However, I agree that many PHP programmers do create unmanageable code. This is, however, a programmer issue rather than a language issue.
I started writing HERMES (a CRM framework/app) in PHP and it is now over 20k lines and when I have time to add enhancements it will grow again. The code is incredibly manageable simply because the complexity of the application meant that I had to divide the code into four main areas (each handled in different sets of files):
1) Main engine(s)/UI framework
2) UI generation code/data input screens
3) UI event handling code
4) Core object logic.
This way, if you want to change the user interface, you just change the user interface. System-wide changes get made in one place, while screen-specific changes get made somewhere else.
Everything is relatively well abstracted, so the code is very manageable.
Now, other languages have very specific problems associated with them:
1) Scripted languages in general: slow performance
2) Compiled languages in general: Requires rebuild before changes take effect, so testing and retesting is slowed down.
3) Java/.Net/Byte-code languages: Worst of #1 and #2 above.
4) Python: Performs a little better than most scripting languages, but there are times when its reference-based structure can cause bugs to be very difficult to find.
5) PHP: Many PHP programmers write readable but unmaintainable code.
6) Perl: Many Perl programmers write maintainable but unreadable code.
7) LISP: See Perl only even more so.
8) ASP. ASP is only really useful in large apps when paired with COM objects written in C++ or VB. So you have the problems with a scripted language combined with the problems of compiled languages.
But again, many of the worst issues are programmer rather than language issues. Then again, depending on your project, you may have to eliminate possibilities because of language capabilities.
LedgerSMB: Open source Accounting/ERP
"All tools are hammers. Except screwdrivers which are chisels."
"Nature doesn't care how smart you are. You can still be wrong." - Richard Feynman
You say it's "simply not true" but don't actually give any reasons.
Now, I've never used IDEA for a prolonged period of time - I couldn't get into it, and was happy enough with Eclipse not to worry. (The fact that Eclipse is free helps - it would be difficult to persuade my company to pay for loads of licences for IDEA when Eclipse is perfectly all right and free.)
I do, however, use Visual Studio .NET 2003 and Eclipse in daily work. Here are just a few reasons I much, much prefer Eclipse to VS.NET:
1) Refactoring. Yes, there are tools available to help - but it's free and bundled into Eclipse.
2) Organise imports. Even with VS 2005 having some limited support, it doesn't help nearly as much as it should.
3) Built-in unit testing tools. Using TDD.NET to fire up NUnit GUI (or any of the other things it can do) is much, much uglier than the built-in support for JUnit in Eclipse.
4) Ant support in Eclipse. Our Java build script is *so* much nicer than the nastiness VS.NET encourages. I'm looking forward to investigating the VS 2005 integration with MSBuild.
5) "Hold down ctrl to make anything a hyperlink" - want to go to where a method, variable, class etc is declared? Just hold down ctrl and click. Navigation was never simpler.
6) Search for all references (etc) - in theory there's "go to definition" in VS.NET 2003, but half the time it doesn't work when you're in a large solution, and I don't believe there's any way of finding all references.
7) The VSS plugin for Eclipse is actually better in my view than the VS.NET support... much easier to understand the configuration, change it on a per project basis etc.
8) Launching Tomcat in a debugger with Eclipse (even without any extra plugins) seems a lot more reliable than trying to make sure that IIS has actually caught up with changes. Why do web projects need IIS to be running even to open in VS.NET? It's crazy.
9) Quick Fix and other source options - get Eclipse to write code for you, fix code for you, extract constants, etc. Fantastic stuff - especially in test-first development, where you can write code which uses the API you *want* to exist, then tell Eclipse to create the shell of that API for you.
10) Compile on save with a really good incremental compiler. This saves huge amounts of time. Oh, and changes really do happen, unlike in VS.NET where if you change an embedded resource, a normal build sometimes picks up the change but sometimes doesn't. (Not to mention VS.NET locking access to files it's built quite often, meaning you can't rebuild them without restarting VS.NET - particularly in terms of XML documentation.)
These are not esoteric features which are hardly ever used - although I could list loads of those too, if you want. These are things I use *every day*. My pair programmer and I are *always* saying how much easier our C# work would be if VS.NET supported the features above. Half of them aren't even in VS 2005 beta 2, as far as I can see - or at least aren't as well implemented. Funnily enough, I can't remember the last time we said something similar the other way round...
So, I've given some of my reasons why I think Eclipse isn't just a step ahead of VS.NET, but leaps and bounds. Now, why do you think VS.NET is better than Eclipse, and do you really not care about the above features?
It is Saturday, and instead of being out in the sunshine, taking in rays, talking to women, GOING OUTSIDE, here we are, in front of our screens debating about which language to build our web apps with? Can we suck enough?
Don't bother replying, because when this damn compile is done, I am going outside if it kills me. I won't be here to read any replies, dammit.
The only reason people think they use ISAPI is because that's what they originally used, and an executive decision was made to not break any existing links at any time, ever. Check the Powered by Java image. The /ws/eBayISAPI.dll that you see in all of the requests just invokes a servlet.
If I were your boss, I would hire an intern and have him rewrite your apps from scratch with a single, maintainable language. Once he is done, I would hire him for half of what I pay you, then give you the boot. Job security through incompetence?
A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
[fill_in_the_blank] is the way to go. See [blank].org for more. For anyone who's built custom sites based on [blank], I think they would agree with me. [blank] is really easy to use for building big apps for use in web stuff, and [blank] provides an easy-to-code-for application framework that saves lots of time and money.
Best of all, it is [blank]-oriented so that you just snap functionality together like Lego blocks to get an instant app that runs at the speed of light almost right out of the box! And [blank] scales to every user on the entire planet. And it plugs into XML.
Only a DeVry graduate would use anything different. Go with [blank]!