The Fastest Web Language On The 'Net?
"Basically, we are not experts in any one language, but have quite a bit C/C++/Perl experience. The target platform will be Apache on *nix, but a portable solution would be good. At the moment we have the engine coded in C for CGI which then interfaces with MySQL to store game data. We are thinking of hacking in FastCGI support for a good boost in speed, but we feel a complete recode will be neccesary, as the amount of players in the game will soon be hitting 5 figures."
"At this point we pretty much know CGI alone is out of the question from a speed standpoint, so we are looking for something a bit more robust. We have heard that mod_perl may be a good solution, but have also heard the same for Python, PHP, C++, etc, so if anyone has experience with dynamic content like this, and has some suggestions and comments as to the merits of your choice, we would appreciate it."
Meanwhile, on the other side of the galaxy, slartiblartfast asks a similar question of his improbability computator: "I have been wondering for a while if anyone has some really good metrics on the relative performance capabilities of the different scripting languages. By scripting languages I mean Perl, PHP, JSP, ColdFusion, ASP etc., and by performance I mean how many pages each one can serve per second for the same hardware and load test. Every benchmark I have seen was commissioned by the creators of the technology that eventually won the test, i.e. the guys implementing the technology that won just happened to be on the core development team for the product. Now I just can't swallow that sort of thing, so I thought I'd ask here. Has a truly independent test been done that didn't favour one technology over another, or that at least invited the best from each area to build and optimize the site to be tested?"
Careful. There are lies, damned lies, statistics... and then there are benchmarks. It's a quote that's been seen often enough here on Slashdot, but it still has its own bit of wisdom to impart.
Writing your own Apache module will get rid of two of those three, and the third as well if you stick with C/C++ for your module. Note that you should still follow good design principles: the mod_whatever should just be a mechanism for passing data between Apache and the code that implements your application. The module is not the application; it's the means to get Apache to exchange data with your application.
"But remember, most lynch mobs aren't this nice." (H.Simpson)
-- Joe
it is the fastest
--
Je t'aime Stéphanie
Well, if you want top speed, you can only get that from a compiled development platform. Most web environments have grown up as interpreted solutions in order to make changes easier (good old internet time). So if you care most about speed, you want to look at a couple of options. The first is creating your own ISAPI extension if you're looking at NT, or your own DSO if you're looking at Apache. You can code either of these in C. If you're also looking at using a traditional database behind the scenes, then take a good look at Delphi / Kylix. Delphi creates the fastest web apps while still allowing applications to be developed quickly. There are tradeoffs if you take a compiled approach (for example, you have to restart the web server if you make a code change). There are many in-between solutions you may want to look at, like ASP or FastCGI.
Without real tests, your changes are likely to have little or no effect on overall performance.
Although it's nice to speed up your program execution with changes like CGI to FastCGI, good design will benefit you the most.
What's a good design? Write your code so that you can run it on multiple servers with a web redirector in front of them. Try not to depend too much on fancy SQL logic, as it is difficult to scale your databases. Instead, try to stay out of the database as much as possible, and when you do have to use the database, split up your schema such that it wouldn't be that hard to run multiple database servers. Another good thing to keep in mind with MySQL is to avoid overly complex queries. MySQL flies with simple selects on indexed fields. Extremely complex updates can really tie up your database.
Now that you understand good design, how do you code your CGI end? For ultimate speed, you could write Apache modules in C, but mod_perl is only trivially slower and much easier to develop in. One stipulation: if you are getting deep into the guts of Apache with things like internal redirects or many-layered handlers, I'd advise using C, but it doesn't look like you'll be doing that.
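As a rough illustration of splitting data so that several database servers can share the load, here is a minimal Python sketch. The host names, the players table and its columns are all made up for the example, and the %s placeholder assumes a MySQL-style driver; nothing here comes from the original poster's game.

```python
import zlib

# Hypothetical database hosts; a real deployment would read these from configuration.
DB_SERVERS = ["db0.example.com", "db1.example.com", "db2.example.com"]


def server_for_player(player_id, servers=DB_SERVERS):
    """Pick a database server for a player using a stable hash of the player ID."""
    # crc32 gives the same answer in every process, unlike the built-in hash() for str.
    key = str(player_id).encode("utf-8")
    return servers[zlib.crc32(key) % len(servers)]


def player_query(player_id):
    """Return (server, sql, params): a simple select on an indexed key column."""
    # Table and column names are assumptions made for this sketch.
    sql = "SELECT name, score FROM players WHERE player_id = %s"
    return server_for_player(player_id), sql, (player_id,)
```

Any web server in the pool can call player_query() and get the same answer, so no single machine has to hold every player. One caveat of plain modulo hashing: adding a server later moves most players to a different host, so a real system would probably want a lookup table or consistent hashing instead.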
Many "web languages" are page-centric. PHP and ASP are like this. Other "web languages" take application languages and tie them to a page-centric mode. mod_perl does this, as does ASP+COM. For a lot of applications, this isn't really a problem because the application flow maps nicely to the page flow. When the application does things which can be presented on a web page, but whose behavior is not easily modelled in a page-view manner, then you start to see kludgy implementations.
Java allows you to code in a manner appropriate to the part of the problem you are solving. If you have, for example, a game-play engine that runs in the background, you can easily spawn a Thread for it that will run just like any other Java Thread without any limitations due to being a "web program."
This allows a design where the game engine is nicely abstracted and isolated from the front end. This also makes it easier to have a team of people in charge of making the game cool for users and another team making the gameplay itself cool.
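The separation described here, a game engine running in the background with the web front end only reading from it, can be sketched in a few lines. The comment talks about Java Threads; the sketch below uses Python's threading module to show the same shape, and the GameEngine class, its tick interval and its snapshot() method are all hypothetical names invented for the example.

```python
import threading


class GameEngine:
    """Hypothetical game engine that advances the world in its own thread."""

    def __init__(self, tick_seconds=0.01):
        self._tick_seconds = tick_seconds
        self._lock = threading.Lock()
        self._turn = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        # Game logic lives here, independent of any page request.
        # Event.wait() returns False on timeout, so the loop ticks until stop() is called.
        while not self._stop.wait(self._tick_seconds):
            with self._lock:
                self._turn += 1

    def snapshot(self):
        """Called by the front end to render a page: a copy of the current state."""
        with self._lock:
            return {"turn": self._turn}
```

The front-end team only needs snapshot(); the gameplay team only touches _run(). The lock keeps a page request from reading the state halfway through a tick.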
On a side note, EJBs can impose a lot of infrastructure and programming overhead that's unnecessary if you don't need the services of a full-blown Component Transaction Monitor. You can frequently do what you want by using regular Java classes or Java RMI.
But if everything you do is going through the equivalent of "CGI", then forget Apache. HTTP is far too easy a protocol to implement (hell, it's the protocol used for lots of "embedded" servers in stuff like Napster and Shoutcast). Implement your own HTTP server where all requests automatically go directly to an engine for processing, and take Apache and all that configuration out of the loop. You'd effectively have two servers running -- an Apache server to handle throwing images and static pages around, and a second home-grown server that directly serves up the application data. Doing this won't change the fact that your database engine is your primary bottleneck, but it will reduce all other bottlenecks by quite a bit.
Apache is a general purpose system, and does it pretty damn fast, but for a true special-purpose system, it's best to implement your own special-purpose server.
The "embedded server" for Java follows the same principle. Maybe W3C has some implementation code in C that may prove useful.
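To give a feel for how little is needed for a special-purpose server of this kind, here is a minimal Python sketch that parses the HTTP request line itself and hands the path straight to an engine function. The engine() function, the /status path and port 8080 are assumptions made for illustration. It handles one connection at a time, only understands GET, and ignores headers, so treat it as a starting point rather than something to put in front of 10,000 players.

```python
import socket


def engine(path):
    """Hypothetical game engine entry point: maps a request path to a reply body."""
    if path == "/status":
        return "players online: 0\n"
    return "unknown command\n"


def build_response(raw_request):
    """Parse the HTTP request line and return a complete HTTP/1.0 response as bytes."""
    try:
        request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
        method, path, _version = request_line.split(" ")
    except (UnicodeDecodeError, ValueError):
        return b"HTTP/1.0 400 Bad Request\r\nContent-Length: 0\r\n\r\n"
    if method != "GET":
        return b"HTTP/1.0 405 Method Not Allowed\r\nContent-Length: 0\r\n\r\n"
    body = engine(path).encode("utf-8")
    head = (
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n\r\n"
    )
    return head.encode("ascii") + body


def serve(host="127.0.0.1", port=8080):
    """Accept one connection at a time and pass each request straight to the engine."""
    with socket.create_server((host, port)) as listener:
        while True:
            conn, _addr = listener.accept()
            with conn:
                conn.sendall(build_response(conn.recv(65536)))
```

Calling serve() blocks and answers requests such as http://127.0.0.1:8080/status. Apache (or anything else) can keep serving images and static pages on the usual port while this process answers only the game requests.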
"But remember, most lynch mobs aren't this nice." (H.Simpson)
-- Joe
I don't know about speed numbers (everything I've done server side has been extremely fast) but development time is great with JSP/Servlet/EJB. It's easy to build a great OO design, implement it, and deploy it on gobs of web/app servers. It's really a shame Sun is giving Java such a bad name around hard core GNU/Linux peeps. It's such a pretty, robust, fun environment to code in. Try it. You'll like it. Or you'll vomit.
In reality, language choice has much less of an impact on the speed of an application than the design. Even a language that's twice as fast can be ten times as slow with a bad design. Some languages make certain designs easier to express, so just pick a language that lets you design the way you want.
The *first* thing you need to do is make sure the design is right. No matter how fast the language is, the number of new users and new features will outstrip any incremental improvements. Even if you make it three times as fast, eventually there will be three times as many users.
The only lasting solution is to design it so it scales. If you don't, you'll be chasing the increasing loads by praying incremental optimization and faster new hardware will keep you ahead of the curve. If you build a successful site, it probably won't.
Consider Slashdot a classic case to study.
It's a common misconception that Assembler is faster than C. Good compilers know how to group instructions together so that they execute faster on the given processor. It's quite hard to do by hand.
In fact it's research to that effect, a few years ago, that led to the development of RISC machines.
A good assembly programmer could still outdo a compiler when he really focussed. But the compilers knew MOST of the tricks, and applied them consistently everywhere. In competition with assembly programmers - even good ones - the program that had been through the compiler normally came out significantly ahead.
Given this, and the greater portability of things like Unix (which was mostly in C with some minimal assembly where needed), assembly code was mostly dropped except where it was unavoidable (like OS routines to get the stack arranged after an interrupt so you could get back into C).
But given that the compiler was generating essentially all the code anyhow, it made sense to design computers with simplified ("reduced") instruction sets, rather than extended ("complex") sets of feature-prone instructions. Sometimes it would take several RISC instructions to do the work of a CISC instruction. But the compiler could generate it, so it was no skin off the programmer's nose.
With the compiler to do the work, a RISC computer could be very simple internally. This meant it could be very small. That meant the parts could be close together, so it could run faster with a given technology, and that it could be moved to a faster technology sooner, when the production yield for a BIG chip was still too low but the yield on a SMALL one was adequate.
The extra instruction fetches were a problem. But instruction caching kept the inner loops in the machine, so there was still a big net gain.
Design is a major issue when talking performance, but there's more to it than that. The poster mentioned using MySQL on the backend. That means there's quite a bit of work to do before we even start mentioning design.
... NOW you can start your design.
Someone with a fair understanding of data analysis needs to go through and figure out what the data storage needs are. Now pick your database: MySQL is reasonably fast for small databases on small machines. But it reaches its breaking point relatively quickly. My experience indicates that PostgreSQL is the next step up the ladder. With a user base in the 5 figure range, I would run Postgres on its own machine and watch it closely. If it seems to have problems keeping up (and you're not on too small a machine) you'll have to start looking at a big database (e.g. Oracle).
Also, the other hardware you're running on has some performance implications. Do you have a large amount of physical memory? The more information you can keep in memory, the faster your system. How are your disk file systems laid out (NFS? RAID?)? When you do have to go to the file system, the answers to these questions will affect performance.
Now we can talk about languages and delivery mechanisms.
You mentioned keeping an eye towards portability. Unfortunately, there are trade-offs there as well. If you want speed, portability is your enemy. Java and Perl are great languages (I use them and recommend them often), but they are relatively poor performers. You can pretty much eliminate any interpreted language (e.g. Tcl) and any web scripting language (e.g. PHP, ASP, ColdFusion).
The heavy lifters are still C and C++. But even if you write your CGI in C, you're still incurring the CGI penalty (which is very expensive).
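The CGI penalty is mostly the cost of starting a new process for every request. The Python sketch below gives a rough way to see it on your own machine by comparing a call inside a running process against launching a fresh process each time. The function names are made up for the example, and it launches the Python interpreter rather than a compiled C binary, so the gap it shows will be wider than for a C CGI program; the shape of the result is the point, not the exact numbers.

```python
import subprocess
import sys
import time


def handle_request():
    """Stand-in for the real request handler; it does almost no work."""
    return "ok"


def time_in_process(n):
    """Seconds to call the handler n times inside an already-running process."""
    start = time.perf_counter()
    for _ in range(n):
        handle_request()
    return time.perf_counter() - start


def time_process_per_request(n):
    """Seconds to start a fresh interpreter n times, as plain CGI does per request."""
    start = time.perf_counter()
    for _ in range(n):
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return time.perf_counter() - start
```

Printing time_in_process(100) next to time_process_per_request(100) shows the difference directly. FastCGI, mod_perl and Apache modules all remove this cost in the same way: the process that handles the request is already running.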
If you insist on using Apache, then start by writing an Apache module in C or C++. Even faster than that is to skip Apache altogether and write the entire server yourself. You want this to be web delivered still, which is fine as the HTTP protocol isn't too difficult to implement.
Once you've figured all this out
Esperanto--the universal language!
Oh, you meant programming language.
[Man, what a bitch it would be to try to code in Magyar...]
--
-- Geof F. Morris
Server-side script rarely consumes a lot of processor cycles. I believe the database server and other libraries that you call out to make a much larger impact on speed.
It's all a matter of optimizing the slowest part for the largest gain. Optimizing the script will result in much less improvement than, say, switching to a faster database server.
Next in line would be the web server that is hosting the application. Some scripting languages are possibly more efficient than others, but that only matters if you're doing a lot of processor-intensive things within the script (mathematical calculations, etc.), which is rarely the case.
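Finding the slowest part means measuring it first. Here is a minimal Python sketch of per-stage timing inside a request handler. The stage names and the sleep() calls are placeholders standing in for real script and database work, and the durations are invented, so it only demonstrates the measuring technique.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per named stage.
timings = {}


@contextmanager
def timed(stage):
    """Add the time spent inside the with-block to the total for this stage."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = timings.get(stage, 0.0) + time.perf_counter() - start


def handle_request():
    # The sleeps are placeholders for real work; the durations are made up.
    with timed("script"):
        time.sleep(0.001)
    with timed("database"):
        time.sleep(0.05)


def slowest_stage():
    """Name of the stage that has consumed the most time so far."""
    return max(timings, key=timings.get)
```

After a few hundred real requests, slowest_stage() tells you where optimization effort will actually pay off, which per the comment above is usually the database rather than the script.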