Building/Testing of a High Traffic Infrastructure?
New Breeze asks: "I'm currently working on my first web 'application', and have discovered that I know less than nothing about setting up the infrastructure to manage a high traffic system.
Where does one go to learn about setting up the infrastructure required to host something like Slashdot? Or do you just say, 'Not my area!' and help them find a consultant?"
"My experience is pretty much limited to:
I haven't a clue. The last place I worked with on something like this hired a high dollar consultant who spend a huge pile of their money setting up a load balanced, oracle parallel server redundant everything system.
How do you test it? I've worked where they actually had a room with hundreds of systems on racks that they would configured to run test transactions against different servers and software builds for stress testing, but that's not in my budget..."
1. Install the web server on one box, the database on the same box if it's a small installation or a separate box if performance seems like it will need it. Add more memory and processors based on SWAG criteria. (Scientific Wild Ass Guess)I had a potential customer ask what I would recommend if they wanted to self host, they have around 300 remote locations and would have multiple users from each location hitting the application at the same time, so saying a couple of beefy servers probably isn't the right answer.
2. Contract with a hosting company.
I haven't a clue. The last place I worked with on something like this hired a high dollar consultant who spend a huge pile of their money setting up a load balanced, oracle parallel server redundant everything system.
How do you test it? I've worked where they actually had a room with hundreds of systems on racks that they would configured to run test transactions against different servers and software builds for stress testing, but that's not in my budget..."
Seriously...they know all about serving up content on high traffic sites. Not only is it high traffic, but it's rather big files that they're delivering. When we're testing the networks that we set up, both wired and wireless, we often visit pr0n sites for our benchmarks.
"He uses statistics as a drunken man uses lampposts...for support rather than illumination." - Andrew Lang
First step, do the math.
What was once a "high volume" app may be nothing for modern equipment. You're talking about on the order of 1K concurrent users (300 sites * several users per site).
If "use" means manually typing data into forms, viewing mostly static pages, etc. this isn't really a very "high volume" application, and a single decent server should handle it.
If, on the other hand, "use" means constantly running complex queries against a billion item data set, you're doomed.
So where do you fall in this spectrum?
Coming up next...where's the bottleneck?
-- MarkusQ
I'm currently working cooking in a restaurant, and have discovered that I know less than nothing about performing stomach surgery. Where does one go to learn about the techniques and tools necessary for curing stomach cancer? Or do you just say, 'Not my area!' and help them find an oncologist?
Seriously.. you have a lot to learn, and a lot of what you need to know just comes from experience which you can't get from a book.
First: learn how everything works. When you click a link in your "application" (why the quotes?), what happens? For instance, does it run a Controller object? If you're using a language like Ruby or Perl, is it "pre-compiled" or does it have to interpret a script on each hit? Does the controller then go to the database and populate variables, then insert them into a template, then render the template? Is the template cached? How are your database settings? Enough memory for joins? Are all your queries using the appropriate indexes? Are you familiar with your database's performance-measuring variables and tools? Are you pulling more data than you need to in each query?
Once you have an understanding of what's happening, then you can start measuring. Where are the bottlenecks? This is a very important thing to keep in mind in programming or system architecture: DON'T OPTIMIZE UNLESS YOU NEED TO! Keep your system and code as simple as possible. For instance don't cache things in your program (making it more complicated and harder to maintain) unless you have a BENCHMARK IN HAND showing a performance bottleneck.
You might not need to move your database to another machine. What you need to do depends on your app.
Yes, you will need to do a lot of testing to identify your "first round" of bottlenecks. You need to build a lot of diagnostics into your app to help you identify how long different steps take.
Always deploy your app in stages, one site at a time, until you start identifying some problems. Then fix those problems before continuing deployment. Never "flip a switch" and reveal any change all at once.
Good luck!
1: Gradual growth. Find bottleneck, remove it. Repeat. Make sure to start with a growable database and web site technology, but that shouldn't be too tough. Also, stay ahead of the game, always with overcapacity, both to cover for outages and for sudden growth spurts.
2: Instant growth from 0 to thousands+: Hire someone who knows what they are doing. In the first scenario, you have the time to learn what is actually going on, which is an advantage. In this one, you don't, and the customer base is to big (i.e., $$$) to screw with.
That basically covers it. Specific advice will vary widely based on databases and web technology deployed, so just about any other specific advice you get here is as likely to be wrong for you as right.
What constitutes 'high traffic' for you?
I've been developing a high traffic site (well, maybe medium traffic) at about 1.5 million transactions per month. We have customers using the site all over North America, plus a few in Europe and Asia, and the whole thing is hosted internally off of our 10MB link.
We have each 'tier' clustered as a pair of servers - 1Ghz/256M is more than sufficient for our 2 Apache servers. 3Ghz/1GB is our Tomcat tier, and I'm not sure what the DB runs on, but they're the beefiest servers of all the tiers.
Within the app architecture, try to ensure that you can scale to more servers. We have the ability to add more servers to any of the above tiers without any changes, plus any long-running processes (complicated reports and such) get dispatched to a fourth layer of servers we call 'backend' (by RMI). These 'backend' servers can be low-end (300mhz/256M are fine), because they run non-time-critical tasks and generally might email their results or whatever.
In this way, we've avoided the EJB complications while also having full redundancy at every level. There was some custom framework involved, but it's been working well. Our application was complex enough to warrant an advanced framework (similar to Struts, except we wrote ours before Struts came out), yet EJB seemed too heavy for what we wanted to accomplish. Of course it didn't hurt that the only thing we paid licenses for was the DB.
Importantly, though, this was the right solution *for us*. It's serving us well, and already scaling well beyond the number of customers we originally anticipated would be using it. While this meets our needs fairly well, it may or may not be the right type of solution for what you're looking for, particularly because I don't know what your application is supposed to do.
You can accomplish anything you set your mind to. The impossible just takes a little longer.
First off, I'd say you're doing this bass-ackwards. You really should have already answered all these and many other questions before ever laying fingers to keyboard.
It depends on lots of things. Who's going to manage the self-hosted host? If they have an IT dept. maybe they can provide the hardware sizing. In any case you will first need to establish the usage patterns and then go forward from there.
http://tinyurl.com/3t236
- Over 750 requests/second on 29 - servers average >20 requests/second each (Yes I know some are not http servers) . Compare that to some commercial solutions.
- commodity hardware
- squid for cacheing/load balancing, feeding Apache
- multi-tiered archtieture
- dual Opteron for the master mysql database