AJAX Applications vs Server Load?
Squink asks: "I've got the fun job of having to recode a medium-sized (500-1000 users) community site from the ground up. For this project, gratuitous use of XMLHttpRequest appears to be in order. However, with all of the hyperbole surrounding AJAX, I've not been able to find any useful information regarding server load [Apache + MySQL] when using some of the more useful AJAX applications, such as autocomplete. Is this really a non-issue, or are people neglecting to discuss this for fear of popping the Web2.0 bubble?"
After doing quite a bit of AJAX-type work for my employer, the best advice I can give you is to cache. The most common things will be queried the most often, so caching is the key. If you're using PHP and MySQL, use something like eAccelerator for PHP (less important) and a properly tuned MySQL query cache (most important!). And remember, not everything AJAX has to query a database.
I've been toying around a bit with AJAX, and it really depends on what you are doing. Autocomplete should ideally be implemented using an indexed table of common words, or something like that, since if it does anything complex, it will be dog slow because of the large number of transactions. Also, client-side caching is a good way to make sure the amount of network traffic doesn't get out of hand. You can do some cool things with very little JavaScript, like my English to Elvish interactive translator.
Other AJAX concepts actually make things faster. I've been implementing a forum that never reloads. When you write an entry and press the submit button, an XMLHttpRequest is sent containing the new post and the id of the last received post. The reply contains all new posts, which are then appended to the innerHTML of the content div tag. Less CPU time is spent regenerating virtually identical pages over and over, and less data is sent over the network.
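To make the "send the last received id, get back only the new posts" idea concrete, here is a minimal server-side sketch in Python. It is not the poster's code: the `Thread` class, its in-memory list and the `submit` method are hypothetical stand-ins for whatever table and request handler the real forum uses, and it assumes post ids increase with every new post.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Post:
    post_id: int  # assumed to grow by one with every new post
    author: str
    body: str


class Thread:
    """In-memory stand-in for the forum's post table (hypothetical)."""

    def __init__(self) -> None:
        self._posts: List[Post] = []
        self._next_id = 1

    def add_post(self, author: str, body: str) -> Post:
        post = Post(self._next_id, author, body)
        self._posts.append(post)
        self._next_id += 1
        return post

    def submit(self, author: str, body: str, last_seen_id: int) -> List[Post]:
        """Store the new post, then return only the posts the client lacks."""
        self.add_post(author, body)
        return [p for p in self._posts if p.post_id > last_seen_id]
```

The client would append the returned posts to the page and remember the highest id it has seen, so the next request only pays for what changed.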
There isn't any useful information out there because it all depends on what you are doing.
Take a typical web application for filling in forms. One part of the form requires you to pick from a list of things, but the list depends on something you entered elsewhere in the form. In this instance, you might put the choice on a subsequent page. That's one extra page load, and needless complication for the end user. Or, you can save the form state, let them pick from the list, and bring them back to the form. That's two extra page views and saving state. Or, you can use AJAX, and populate the list dynamically once the necessary information has been filled in. That's no extra page views, but a (usually smaller) JSON or XML load.
In this instance, using AJAX will usually reduce server load. On the other hand, something like Google Suggest will probably increase server load. Without knowing your application and its common use patterns, it's impossible to say. Even the exact same feature can behave differently in two different applications: autocomplete can reduce server load when it reduces the overall number of searches, but that's dependent upon the types of searches people are doing, how often they make mistakes, how usual it is for people to search for the same thing, and so on.
I'm sorry, what?
Another suggestion is to auto-complete only after 0.5 seconds with no typing. That way, rather than auto-completing s, sl, sla, slas, slash, slashd, slashdo, the user who knew exactly what they wanted doesn't load down your server with spurious requests.
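A rough way to see what the 0.5 second rule buys you is to replay a typing session and count which requests would actually go out. The sketch below is a Python simulation, not browser code; the function name `debounced_queries` and the `(timestamp, text)` input format are assumptions made for illustration.

```python
def debounced_queries(keystrokes, quiet_period=0.5):
    """Return the texts for which an autocomplete request would be sent.

    keystrokes: list of (timestamp_in_seconds, text_in_box) in time order.
    A request fires only if no further keystroke arrives within
    quiet_period seconds of the previous one.
    """
    sent = []
    for i, (stamp, text) in enumerate(keystrokes):
        is_last = i == len(keystrokes) - 1
        if is_last or keystrokes[i + 1][0] - stamp >= quiet_period:
            sent.append(text)
    return sent


# A fast typist who pauses once after "slash" and then finishes the word.
typed = [(0.0, "s"), (0.1, "sl"), (0.2, "sla"), (0.3, "slas"), (0.4, "slash"),
         (1.2, "slashd"), (1.3, "slashdo"), (1.4, "slashdot")]
requests = debounced_queries(typed)
```

In this session eight keystrokes turn into two requests instead of eight. In a browser the same effect usually comes from resetting a timer on every keystroke.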
For this project, gratuitous use of XMLHttpRequest appears to be in order.
All this hyperbole surrounding AJAX is just that - hyperbole. I dunno exactly what your requirements are, but the first thing you can do to ensure that AJAX XML requests don't bog down your system is to decide whether you really need all that fancy AJAX stuff. It's neat stuff, but the majority of web apps can still be done the conventional way, without wasting time on AJAX code.
Uhm, you should never give blanket advice like that. This is the simplest brute-force way to optimize an app:
STEP 1: develop a set of benchmarks.
STEP 2: adjust something
STEP 3: see if it improves your benchmarks. If not, roll it back. REPEAT STEP 2
How can you possibly improve your app if you can't even tell when you've improved it? PHP accelerators may or may not help (actually I would recommend AVOIDING PHP because of the difficulty in dealing with persistent compiled PHP code). MySQL query cache may or may not help (in some of my apps, the query cache *lowered* performance).
You can improve on this basic formula by the way. For instance, you can use benchmarks to *identify* which parts of the app are the slowest so you can focus your energy just on the slowest stuff. But the basic premise is the same: benchmarks are the most important thing you can do. Devote 1/3 of your time and budget, at least, to developing the benchmarks. Throw in some automated testing too, while you're there.
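The benchmark, adjust, compare, roll back loop can be sketched with nothing more than Python's standard `timeit` module. This is only an illustration: `benchmark` and `keep_if_faster` are made-up helper names, and a real web app would time whole requests rather than a single function.

```python
import timeit


def benchmark(func, repeat=5, number=1000):
    """Best-of-`repeat` wall-clock time for `number` calls of func."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


def keep_if_faster(baseline, candidate, repeat=5, number=1000):
    """STEP 3 in code: keep the change only if the benchmark improves.

    Returns (chosen_function, baseline_seconds, candidate_seconds).
    """
    base_time = benchmark(baseline, repeat, number)
    cand_time = benchmark(candidate, repeat, number)
    chosen = candidate if cand_time < base_time else baseline
    return chosen, base_time, cand_time
```

Taking the minimum of several runs is a common way to damp noise from other processes; it does not replace testing under realistic load.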
...but then you had to go and say "Web 2.0" and I dissolved into fits of laughter.
Just remember that. It's not half a request; it's a full request. The easiest way to think about it is to imagine that, instead of using AJAX, you reload the page.
Now that isn't quite true as you only reload part of the page.
The common example is Google Suggest. Instead of a list of searches, let's try a list of products. Say you use AJAX against a database of 1000 products and you have 5 users hitting the database through it. If you just did a SELECT each time, it would be really bad: at least 5 database hits per second. In the old environment it would have been 1 hit every second (assuming it took 5 seconds to fill in the form). So in this case you've increased your database load by more than 5 times (if you used LIKE instead of = in the SQL).
To get around this you have a number of options. Here are some of the ideas I've seen:
1. Add columns to the product table with 1, 2, 3, 4 characters and index the columns. This means you can use = instead of like which is faster.
2. Hard code the products in an array.
3. Use hash files. E.g. create 10 hash files for different lengths. Then check the length of the input, load the correct hash file and look up the key.
The basic idea is to do some kind of work upfront to decrease the overhead of each call, because you'll end up having a lot more requests.
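Options 1 and 3 both boil down to precomputing an exact-match lookup keyed on the first few characters. A small Python sketch of that idea follows; the names `build_prefix_index` and `suggest`, and the choice of an in-memory dict instead of indexed columns or hash files, are assumptions for illustration only.

```python
from collections import defaultdict

MAX_PREFIX = 4  # mirrors the "1, 2, 3, 4 characters" columns in option 1


def build_prefix_index(names, max_prefix=MAX_PREFIX):
    """Do the work upfront: map every 1..max_prefix character prefix to names."""
    index = defaultdict(list)
    for name in sorted(names):
        key = name.lower()
        for length in range(1, min(max_prefix, len(key)) + 1):
            index[key[:length]].append(name)
    return dict(index)


def suggest(index, typed, limit=10, max_prefix=MAX_PREFIX):
    """Exact-match lookup on the prefix, then a cheap filter on the bucket."""
    typed = typed.lower()
    if not typed:
        return []
    bucket = index.get(typed[:max_prefix], [])
    return [name for name in bucket if name.lower().startswith(typed)][:limit]


products = ["Slashdot mug", "Slate tile", "Sled", "Apple"]
index = build_prefix_index(products)
```

With the index built once, `suggest(index, "sl")` is a dictionary lookup rather than a LIKE scan, which is the same trade the numbered options above make.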
It's going to be tempting to use a lot of AJAX, especially if it sounds fun. In reality though, you should be considering user experience, since this is a community site. Don't use an AJAX call where someone might expect a page refresh.
With that said, it's best to try to cache frequently accessed items in memory (regardless of whether you're doing AJAX calls). ASP.NET does a good job of this--I don't know what you're programming in, but definitely find out how to cache so that you don't have to read the database all the time. This reduced our database server load from 55% to 45% upon implementation (it's separate from the web server).
To specifically answer your question, the thing that's fast about AJAX is mostly perceived. Yes, you'll reduce calls, but at the sacrifice of having to code things twice: once for users with JS, once for those without. Use it in places where it's senseless to reload an entire page, for example opening a nested menu. Searches that aren't done by keyword are good as well. As has been said above, delay a server request until the user is done typing so that you can reduce calls. Remember, it's still a hit on your server; it just doesn't have to get all the rest of the crap on the page.
To reduce bandwidth, use JSON instead of XML, and only pass the headers that you need into the AJAX call. To reduce server strain, cache frequently accessed database calls/results. Also, other non-AJAX JavaScript can help reduce calls, such as switching between "tabs" with some display:none action instead of reloading a page.
The answer is not gratuitous AJAX, the answer is thinking through how people will most commonly use your site, and making those parts easiest (so users don't have to redo things, therefore wasting your server capacity/bandwidth). Take things that shouldn't have to refresh the page and make them work using javascript, AJAX or not. Depending on how crappily things are coded now, you should see between a 15 and 35% reduction in server load and database calls.
The trick in minimizing server traffic is to come up with the right remote data granularity, i.e. don't fetch too much or too little data on each trip. At one extreme you'd fetch essentially your entire database in a single call and keep it around on the client, wasting both its memory and the bandwidth to get data that will mostly go unused. At the other extreme you simulate traditional APIs, which typically get you what you want in very piecemeal fashion, requiring one function call to get this bit of data, which is required by the next function, which in turn returns a struct required by a third function, and so on until you finally have what you really want.
The happy medium is somewhere in between. Come up with functions that return just the right amount of data, including sufficient contextual data to not require another call. For a contacts-type app you would provide functions to read and write an entire user record at a time, as well as a function to obtain a list of users with all the required columns to display them in a single call. You will generally find it more efficient, in both bandwidth and client-side processing, to tailor the remote functions towards the UI that needs them, fetching or uploading just the required data for a particular application screen or view. Once you have a decent remote function architecture you will no doubt have considerably less server traffic, since practically only raw data makes the trip anymore.
An example is an application that displays a list of cities in a state after a user selects the state. (1) If you send ALL the data to the client at once, it's a large file transfer and takes a long time. This produces a heavy load all at once. (2) If you code it to refresh the whole page after the selection, then it is a smaller initial load, but on the 'refresh' you are sending the whole page plus the new data. (3) If you use AJAX, you only have to send the initial small request (not the heavy load) and then the second request for the part of the page that needs updating.
Between 2 & 3, #3 is better because it reduces the second hit on the server and network as it does not have to resend parts of the page that have already been sent. Between 1 and 3 you actually will have more hits on the server but #3 will result in less data being sent across the network.
The biggest problem with #2 is that sometimes refreshing a whole page ( onchange ) confuses users. Yes, this may sound weird, but I have had people tell me this.
The biggest problem with #3 is that if the server request fails, you must code for this and if you don't the user may not know what happened. Also how do you handle a retry on something like this?
Make some simple test scripts using something like wget, and capture the response time with PasTmon or ethereal-and-a-script, one test for each transaction type, while at the same time measuring cpu, memory and disk IO/s.
At loads that wget or a human user will generate, 1/response time equals the load at 100% utilization of the application (not 100% cpu!), so if the average RT is 0.10 seconds, 100% utilization will happen at 10 requests per second (TPS).
For each transaction type, compute the CPU, Memory, Disk I/Os and network I/Os for 100% application utilization. That becomes the data for your sizing spreadsheet.
If you stay below 100% load when doing your planning, you'll not get into ranges where your performance will dive into the toilet (:-))
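The arithmetic behind the sizing spreadsheet is short enough to write down. The Python sketch below assumes, as a simplification that is not in the post, that each resource scales linearly with request rate; the function names and the example per-request costs are made up.

```python
def saturation_tps(avg_response_time_s):
    """Requests per second at 100% application utilization: 1 / response time."""
    return 1.0 / avg_response_time_s


def resources_at_saturation(avg_response_time_s, per_request_cost):
    """Scale measured per-request costs up to the saturation rate.

    per_request_cost: dict such as {"cpu_seconds": 0.02, "disk_ios": 3}.
    Linear scaling is an assumption; memory in particular may not follow it.
    """
    tps = saturation_tps(avg_response_time_s)
    return {name: cost * tps for name, cost in per_request_cost.items()}


# One row of the spreadsheet for a transaction with a 0.10 s response time.
row = resources_at_saturation(0.10, {"cpu_seconds": 0.02, "disk_ios": 3})
```

You would compute one such row per transaction type and plan to stay under the resulting totals.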
--dave
This is from a longer talk for TLUG next spring
davecb@spamcop.net
I'm in the latter stages now of my first serious professional project using AJAX-style methods. In my experience so far, it can go either way in terms of server load versus a traditional page-by-page design. It all depends on exactly what you do with it.
For example, autocompletions definitely raise server load as compared to a search field with no autocompletion. Using a proper autocomplete widget with proper timeout support (like the Prototype/Scriptaculous stuff) is a smart thing to do - I've seen home-rolled designs that re-checked the autocomplete on every keystroke, which can bombard a server under the hands of a fast typist and a long search string. But even with a good autocomplete widget, the load will go up compared to not having it. That's the trade-off. You've added new functionality and convenience for the user, and it comes at a cost. Many AJAX enhancement techniques will raise server load in this manner, but generally you get something good in return. If the load gets too bad, you may have to reconsider what's more important to you - some of those new features, or the cost of buying bigger hardware to support them.
On the flip side, proper dynamic loading of content can save you considerable processing time and bandwidth in many cases. Rather than loading 1,000 records to the screen in a big batch, or paging through them 20 at a time with full page reloads for every chunk, have AJAX code step through only the records a user is interested in without reloading the containing page - big win. Or perhaps your page contains 8 statistical graphs of realtime activities in PNG format (the PNGs are dynamically generated from database data on the server side). The data behind each graph might potentially update as often as every 15 seconds, but more normally goes several minutes without changing. You can code some AJAX-style scripting into the page to do a quick remote call every 15 seconds to query the database's timestamps to see if any PNGs would have changed since they were last loaded, and then replace only those that need updating, only when they need to be updated. Huge savings versus sucking raw data out of the database, processing it into a PNG graph, and sending that over the network every 15 seconds as the whole page refreshes just in case anything changed.
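The cheap "has anything changed?" call described above might look like the following Python sketch. It is only one way to do it: `stale_graphs` is a hypothetical name, and the two dicts stand in for the client's record of when it loaded each PNG and the database's last-update timestamps.

```python
def stale_graphs(client_loaded_at, server_updated_at):
    """Return the names of graphs whose data changed after the client loaded them.

    Both arguments map graph name -> timestamp (any comparable number).
    A graph the client has never loaded is always reported as stale.
    """
    never = float("-inf")
    return sorted(
        name
        for name, updated in server_updated_at.items()
        if updated > client_loaded_at.get(name, never)
    )


loaded = {"cpu": 100, "mem": 100}
updated = {"cpu": 100, "mem": 115, "disk": 90}
to_reload = stale_graphs(loaded, updated)
```

Only the graphs named in the reply need their PNG regenerated and re-sent; the other polls cost one small timestamp query each.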
I'm not the first person to have this idea, but this brings up the question: do we need to define a new tier, e.g. something like a presentation services tier? I think so. I'm not going to go into it because it's pretty self-explanatory, but honestly I don't think AJAX is going away, and if you are going to make asynchronous calls in this fashion, you need hardware to back it up.
Speaking of fads, I note that as of 02:06 EST, www.houseofzeus.com fails to validate as XHTML 1.0 Transitional.
You're going to want to separate out your web server if you are going to face any real load. A good mod_perl implementation with a PostgreSQL (or even MySQL) database can give you the kind of dynamic speeds you need. Since the AJAX queries usually translate to a single call, you can probably get much more performance than the older style where each page had to make several queries.
To serve up the webpage, I think you should go with a static HTTP server if you can. If you can't I would use a different server because there are different queries and a different load characteristic. For starters, people can wait 2 seconds while a new page loads, but they will get antsy if they have to wait more than 1/2 a second for a trivial AJAX query to complete.
Don't mention Web 2.0; it is utter stupidity.
People talked about RSS web server loads versus advertising revenue about 2 years ago on slashdot, so I hardly think people are that stupid.
Also, say every page is (at best, which I doubt in your case) 50 KB and each AJAX call is 500 bytes of traffic. Decide whether that AJAX call saved an entire page refresh (on your site, a page is probably 120 KB; customer pages with ads can be 200 KB).
So the initial download, even at worst (or best), would be 50 KB and each call 500 bytes, so you can see the percentage of overhead is small. And if the call SAVED a refresh, then you have saved 49.5p, which is good for half a pint on Fridays between 12 and 2 at the little willow on hidge street.
good day.
Web 2.0 is made of ... 600 million unwanted opinions in realtime ... emergent blook juice ...Magic pixie dust (a.k.a . Tim O'Reilly's dandruff)
Paul Moore
Web 2.0 is made of
Ian Nisbet
Web 2.0 is made entirely of pretentious self serving morons.
Max Irwin
Web 2.0 is made of
Jeramey Crawford
- and a load of other things, see http://www.theregister.co.uk/2005/11/11/web_two_p
And like 'podcasting' has a lot of twats fighting over who thought of the grand scheme (while ordinary people were making mp3's and letting people download them without the need for twatish words or syndicated xml), people will fight over who was the one who needs all the attention over web 2.0
I like the one about pretentious self-serving morons and 600 million unwanted opinions.
Web0.002 is like the web, only with a lower signal:noise ratio.
Does anyone find the fucktarded way ingaydget puts every fucking keyword to a link back to its search engine in every story a bit uber-google-gay?
I personally don't like engaydget poisoning my results with their own brand of love-my-ads juice. I ad block that site so harshly, I feel sorry for them.
It's those same idiots who spend all their time talking about how the web has failed to deliver its promises, and in reality are just trying to figure a way they can get all the money by patenting stuff folk have been doing for years.
Because they are so dumb it all looks non-obvious to them; 1 click ordering is so dumb nobody bothered doing it, but hey- the customer (dolt) likes it, so as Amazon were the first senseless idiots to actually do it they get to patent it!
Sam
blog.sam.liddicott.com
This isn't directly related to your question, but it's something that most people experimenting with "AJAX" seem to be overlooking. It's too easy to fall into the trap of using XMLHttpRequest to do everything just because you can, but by doing that you are restricting yourself to a small set of browsers that actually support this stuff. This doesn't include many of these phone/PDA browsers that are becoming more common.
Also worth noting is that changing the DOM can cause confusion to users of aural browsers or screen readers. In some cases this doesn't cause a major problem; for example, if you have a form page where choosing your country then changes the content of the "select region" box that follows, the user will probably be progressing through the form in a logical order anyway and so the change, just as in a visual browser, won't be noticeable. However, having a comment form which submits the comment using XMLHttpRequest and then plonks some stuff into the DOM will probably not translate too well to non-visual rendering, as the user would have to backtrack to "see" the change.
Of course, depending on the application this may not matter. Google Maps doesn't need to worry about non-visual browsers because maps are inherently visual. (though that doesn't actually use AJAX anyway!) Google Maps would be useful on a PDA, however. I'm not saying "avoid AJAX at all costs!" but please do bear in mind these issues when deciding where best to employ it. Most of the time it really isn't necessary.
The poster stated his claim based on experience.
This experience holds for most web applications (AJAX or not), as anyone who has worked on any large web application can attest. Creative use of caching has proven time and again to be the most effective way to reduce server load. (For some reason spitting out a byte array is faster than calling a database and building a document with the results.)
I'm curious how "you can use benchmarks to *identify* which parts of the app are the slowest". This could be done by *profiling*, not benchmarking. As for benchmarks, they are effective at measuring the effectiveness of a change. They don't tell you where to make it.
I think spending 1/3 of your dev time on benchmarks is more than a bit overkill. These should take very little time and should be based on expected use cases. Developing load tests for very large applications rarely takes more than a week or two.
As for optimization, the general rule is:
1. Run a standard use case or load (JMeter?).
2. Profile.
3. Optimize the code with the heaviest use (identified by the profiler).
4. Repeat.
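For the profiling step, here is a minimal sketch using Python's standard `cProfile` and `pstats` modules. The poster mentions JMeter for generating load and does not name a profiler, so treat the tool choice and the `profile_call` helper as assumptions.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=5, **kwargs):
    """Run func under the profiler and return (result, text report).

    The report lists the `top` entries sorted by cumulative time, which is
    usually where the "heaviest use" code shows up.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return result, out.getvalue()
```

A benchmark then tells you whether the optimization you made at the reported hot spot actually helped.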
There are a lot of good points posted in here. Caching on the client and on the server are two big things for a good application that is using XHR. A good database design is also key if you do not want to use "like", which slows down the search. In Ajax In Action (as discussed on Slashdot here), the project in chapter 10 shows how to limit postbacks with an auto-suggest by using the client side efficiently. The basic idea is to examine the results returned: if the count is under a certain number, it uses JavaScript regular expressions to trim down the dataset instead of hitting the server. Plus there is a limit on the number of results returned, so it speeds up response time.
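The trimming trick can be sketched outside the browser too. The Python below is not the book's code; `SuggestCache`, its `fetch` callback and the `limit` value are hypothetical. It relies on one observation: if the server returned fewer results than the limit, the list is complete, so any longer input that starts with the same prefix can be filtered locally with a regular expression.

```python
import re


class SuggestCache:
    """Client-side cache that avoids server calls when it already has all matches."""

    def __init__(self, fetch, limit=15):
        self._fetch = fetch      # hypothetical callable(prefix, limit) -> list of str
        self._limit = limit
        self._prefix = None      # prefix of the last server query
        self._results = []
        self._complete = False   # True if the server had no further matches
        self.server_calls = 0

    def suggest(self, typed):
        have_all = self._complete and self._prefix is not None
        if have_all and typed.lower().startswith(self._prefix.lower()):
            # Trim the cached dataset locally instead of hitting the server.
            pattern = re.compile("^" + re.escape(typed), re.IGNORECASE)
            return [r for r in self._results if pattern.match(r)]
        self.server_calls += 1
        self._results = self._fetch(typed, self._limit)
        self._prefix = typed
        self._complete = len(self._results) < self._limit
        return list(self._results)
```

If the server hit the limit, the cache cannot know what was cut off, so it asks again on the next keystroke.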
One thing I cannot get through people's minds enough when I do my talks is that Ajax is not going to be a "client-based app" on the web. The main reason is going to be network traffic getting in the way of your request. Imagine a dial-up user in India with your server sitting in the United States. The request is going to have to travel to the other side of the world and back at the slow speeds of dial-up. Testing on your localhost is going to look great until you get on an outdated shared server hosting multiple applications with a full network load. Yes, we are talking small requests pinging the server, but 1000 users with a 10-letter word could mean death if you designed the system badly!
I love XHR, cough Ajax, but you need to look at what you are dealing with. The design of an XHR app can kill you if you do not think it out fully.
My 2 cents,
Eric Pascarello
Coauthor of: Ajax In Action
I am not quite sure what the question is, but I am fairly confident that AJAX is not the answer. IMO AJAX is a freak of nat..., computer science.
AJAX's reliance on ECMAScript seems like a shaky foundation at best. I imagine debugging ECMAScript can be quite clunky, and even if tool support might solve this problem at some point, there is no guarantee that browsers will interpret ECMAScript the same way; it seems like an embrace-and-extend waiting to happen.
I will not venture too far into the dynamically vs. statically typed language discussion other than stating that personally I prefer strongly typed languages.
I get the impression AJAX is a quick-and-dirty solution to a problem that requires something more advanced.
It seems like AJAX is an attempt to overcome the shortcomings of thin clients using the technology that had the widest market penetration, without considering whether the technology was the appropriate tool for the job.
I am afraid that we will have to live with AJAX for a long time. A tragedy similar to VHS's victory over Betamax, where an inferior technology beat a superior one.
I wonder if something like a next-generation X-Server browser plugin or a thick-client Java framework might not have been better suited for the job. I can't help but feel like AJAX is somehow trying to force a round peg through a square hole.
If you are worried about load, take the time to think about what really requires a round trip to the server, and what can more easily be done by populating some data on the webpage and then using that directly with JavaScript. My organization recently paid a lot of money to a certain overblown web design consulting firm (the one that botched a certain popular humor news site's website), and they wanted us to use AJAX to autocomplete all of our forms, without making the simple connection that we only have about 100-200 elements in each autosuggest and could just put that data on the page and scrape it out with JavaScript, thus avoiding the extra server load and the lag time in the web browser. Make sure you see through the hype of AJAX, and don't use it when you don't need to.
If you use AJAX for everything, yes, your load will go through the roof. However, the good news is you probably don't need full blown AJAX to get most of the functionality you want. And, as other folks have pointed out, caching can make things a lot less painful where you really need it.
Good luck...
MediaMatters.org just launched a redesign in which they use some hairy JavaScript code that parses the DOM's visible and invisible fields to populate their autosuggest: http://mediamatters.org/issues_topics/media_personalities
For example, if you're pulling site news or software changelog entries from a table, and those entries don't get updated very often (like less than twice a day), then it certainly makes sense to cache that data. Even if the news/changelog that appears to the user is slightly out of date, the cache will update itself within a short period of time and any new info will then appear. Nobody loses anything just because new news doesn't immediately show up on a site. But your server does save a lot.
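A cache of this kind needs very little code. The Python sketch below is one possible shape, with made-up names (`TTLCache`, `loader`, `ttl_seconds`); the five-minute lifetime in the usage comment is an arbitrary example, not a recommendation from the post.

```python
import time


class TTLCache:
    """Hold one loaded value and refresh it only after ttl_seconds have passed."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader        # hypothetical callable that runs the real query
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the expiry logic can be tested
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache is empty or stale: hit the database once and remember the result.
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value


# Example: news = TTLCache(load_news_from_db, ttl_seconds=300)
# where load_news_from_db is whatever function runs your query.
```

Every request inside the lifetime is served from memory, so the database sees one query per interval no matter how many visitors or AJAX calls ask for the news.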
Steve Magruder, Metro Foodist
If you're suggesting that programmers just use Ajax "because they feel like it" without examining the 1) value and 2) repercussions, then this programmer will have to absolutely disagree with that approach. On the other hand, web programmers should certainly explore possibilities with Ajax and not reject it out of hand.
Steve Magruder, Metro Foodist
Even those of us who strive generally to achieve standards compliance will often put site features ahead of compliance, leaving compliance as a leftover task. Compliance is not a simple thing. Anyone who has been a web programmer for any length of time knows the difficulty of balancing features versus standards compliance, not to mention the ease of making little mistakes that fail the standards testing tools.
Steve Magruder, Metro Foodist
Hi Dave - tried to find a way to contact you outside of this, but have been unsuccessful to date - apologies all around if I'm using the wrong communication channel...
Your post confirmed a hunch I've had for a while. I've been trying to figure out a way to measure CPU, I/O, RAM and other resources on our web servers to get reasonable application benchmarking data. I've been told it's next to impossible, and I've had the hardest time finding any info on this type of benchmarking on the net (beyond gut feelings, and vmstat and top, which only provide performance snapshots). Yet I'm a strong believer that "you can't manage something you don't measure" (quoted from somewhere...), and you seem to have found concrete ways to do just that. Any chance you could share some of your references or resources on this server/app benchmarking stuff? I'd be very grateful for any insight.
Cheers,
Dax (dharvey@comminit.com)
http://www.devx.com/asp/Article/296170 4/29/393.aspx
http://www.port80software.com/200ok/archive/2005/
My opinion: I would think folks haven't done enough apps to know what is what, and the only people saying much are going to be the Web2.0 folks themselves (unlikely to own up to it quite yet) or the few folks like these, sitting at an interesting vantage point to see lots of application and site efforts in conjunction with knowledge of acceleration, caching, compression, etc. Anybody else seen similar postings, thoughts, papers? It does concern me a bit given how I have seen polling abused in JS.
-M.J.F.
I've been implementing a forum that never reloads.
Quite interesting, but I'd have my reservations with an all-AJAX forum. IMHO forums shouldn't break the REST behaviour of the traditional web model in very many places. I want the navigation buttons to work, and I want to be able to bookmark the URL and feel confident that when I visit that URL it will contain the posting that was there before. Yes, there are ways around this in AJAX but to fix what it inherently breaks takes more effort than I think should be required (I think those are the most important things to address--if you insist on AJAX make sure it takes care of the browser history/navigation and bookmark support).
As for minimising traffic, I've found I get the biggest "return on investment" (time and money-wise) by properly using CSS. Being an established standard, CSS suffers from fewer cross-browser issues, it degrades more gracefully and doesn't break important web behaviours. The CPU time to generate a well-formed HTML document that uses CSS for all its layout is not worth worrying about, and network traffic is reduced significantly because the CSS is cached, so the number of TCP/IP packets used to send the bare HTML document is not that much different from all the XMLHttpRequest chatter.
That being said, your project looks very intriguing for other reasons. If your forum is more of a chatroom than a slash-style website then it could be very cool indeed--like MSN or Yahoo chatrooms with threads all updating in realtime. Perhaps you have features like that in mind. However it ends up, best of luck to you!
http://www.mortbay.com/MB/log/gregw/?permalink=ScalingConnections.html
Basically, a server that is used to handling X customers each making a request every 2-3 minutes will get a multiple of that, because the requests are coming in much more frequently.
You will need to tune the server for a much higher throughput (more listeners/threads/workers) to deal with AJAX.