Building/Testing of a High Traffic Infrastructure?
New Breeze asks: "I'm currently working on my first web 'application', and have discovered that I know less than nothing about setting up the infrastructure to manage a high traffic system.
Where does one go to learn about setting up the infrastructure required to host something like Slashdot? Or do you just say, 'Not my area!' and help them find a consultant?"
"My experience is pretty much limited to:
1. Install the web server on one box, the database on the same box if it's a small installation or a separate box if performance seems like it will need it. Add more memory and processors based on SWAG (Scientific Wild Ass Guess) criteria.
2. Contract with a hosting company.
I had a potential customer ask what I would recommend if they wanted to self-host. They have around 300 remote locations and would have multiple users from each location hitting the application at the same time, so saying 'a couple of beefy servers' probably isn't the right answer.
I haven't a clue. The last place I worked with on something like this hired a high-dollar consultant who spent a huge pile of their money setting up a load-balanced, Oracle Parallel Server, redundant-everything system.
How do you test it? I've worked where they actually had a room with hundreds of systems on racks that they would configure to run test transactions against different servers and software builds for stress testing, but that's not in my budget..."
This is a very interesting topic. I just started my new job, coming from an internship. There we had a web server, a database server, a dev box and a log processing box for WebTrends analysis. But now at my new job I'm being introduced to high level PIX boxes, F5 load balancers, redundant web servers, transaction servers, etc. One thing I just learned the other day is that they use the F5 to handle SSL encryption/decryption instead of relying on the web servers. I never knew that was possible. But eventually I want to be able to do all that my boss does right now. Anything less is less than perfection...MUAHAHA.
I run a site which peaks above 5,000 page views/second. That part is static, and runs thttpd. No problems at all.
The other part is dynamic. It runs on Apache (load balanced, no problem) with a PostgreSQL server. If you don't need its features, "just say no"!
It is the single part in our system that causes the most problems. When your tables grow semi-large (less than 800k rows) and you do a few joins, it chooses strange - and slooooow - ways to execute your queries. Combine that with a few journalists who want to insert and update articles, and you have a sysadmin's worst nightmare.
Another really good tool for stress testing web apps is Microsoft's Web Application Stress Tool. It allows you to configure testing for a set of different virtual users, and also supports HTTPS, stores cookies if you want, etc. An all-round, well-featured tool. One of the best features for testing a load-balanced app is its ability to seamlessly distribute the testing load across multiple client machines, thus providing a realistic load.
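For anyone who wants to see what a "virtual user" amounts to before picking a tool, here is a minimal sketch in Python. It is not how the Web Application Stress Tool works internally; it just shows the idea of several simulated users hitting a page concurrently and recording status codes and latencies. The tiny built-in server, the function names and the default numbers are all assumptions made for illustration, and a real test would point at your own application from several client machines.

```python
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Opener that ignores any proxy settings so the local demo server is hit directly.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class HelloHandler(BaseHTTPRequestHandler):
    """Stand-in for the application under test (hypothetical)."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the console quiet during the run


def virtual_user(url, requests_per_user, think_time):
    """One simulated user: fetch the page, pause, repeat."""
    results = []
    for _ in range(requests_per_user):
        start = time.perf_counter()
        with OPENER.open(url, timeout=5) as resp:
            resp.read()
            status = resp.status
        results.append((status, time.perf_counter() - start))
        time.sleep(think_time)
    return results


def run_load_test(url, users=5, requests_per_user=4, think_time=0.01):
    """Run all virtual users concurrently; return (status, seconds) pairs."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [
            pool.submit(virtual_user, url, requests_per_user, think_time)
            for _ in range(users)
        ]
        return [result for future in futures for result in future.result()]


def demo():
    """Start the stand-in server on a free port and run a small test against it."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        return run_load_test(url)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns one `(status, seconds)` pair per request, which you can sort to look at the slowest responses. A single client machine will run out of steam long before a real server farm does, which is why distributing the load across clients matters.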
I've worked on a very high traffic system. At one point we were pushing 100MBPS in traffic. I had about 15 servers, 1 database server, and a load balancer. The traffic was mostly static HTML pages, with a bit of PHP/MySQL for about 1/10th of the traffic.
We had a master database server whose data was distributed to all the webservers. When reading from the database, each webserver would read its own local copy. MySQL replication kept the data on the local webservers fresh.
Updates to the database were easy as only a small number of users were doing any updates. All updates were able to go through one server and wrote directly to the master database.
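The read-local, write-to-master split described above can be sketched roughly like this in Python. This is not the code we ran; the host names and the `QueryRouter` class are made up for illustration, and it only decides where a statement should go based on its first keyword.

```python
# Statements that only read data and can safely go to the local replica.
READ_VERBS = {"SELECT", "SHOW", "DESCRIBE", "EXPLAIN"}


class QueryRouter:
    """Pick a database host for a statement (hypothetical helper)."""

    def __init__(self, master, local_replica):
        self.master = master
        self.local_replica = local_replica

    def target_for(self, sql):
        """Reads go to the local copy; everything else goes to the master."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        return self.local_replica if verb in READ_VERBS else self.master


# Host names below are placeholders, not the real setup.
router = QueryRouter(master="master-db", local_replica="localhost")
```

With this, `router.target_for("SELECT * FROM articles")` gives `"localhost"` and any `INSERT`, `UPDATE` or `DELETE` gives `"master-db"`. Keep in mind that replication lags a little, so a read right after a write may still see the old row on the local copy.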
The load balancer was managed by the hosting company. It simply made sure that all the webservers shared the traffic load. Any webserver that died for whatever reason would automatically stop getting traffic sent to it.
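Since the balancer was the hosting company's box, I can't say how it was implemented, but the behavior is easy to sketch in Python: hand out servers in rotation and skip any that have been marked dead. The class and server names here are hypothetical, and a real balancer would run its own health checks instead of waiting to be told.

```python
import itertools


class RoundRobinBalancer:
    """Rotate through servers, skipping ones marked as down (illustrative only)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        """Stop sending traffic to a server that failed its health check."""
        self.healthy.discard(server)

    def mark_up(self, server):
        """Put a known server back in rotation once it recovers."""
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        """Return the next healthy server in rotation."""
        if not self.healthy:
            raise RuntimeError("no healthy servers available")
        for server in self._cycle:
            if server in self.healthy:
                return server
```

For example, with `["web1", "web2", "web3"]` and `web2` marked down, requests simply alternate between `web1` and `web3` until `mark_up("web2")` is called.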
Everyone has to start somewhere right?
What's your background? There are lots of different ways to solve every problem. I think it's much more of an assessment of what kind of problems you're good at solving. If you think you can conceptualize what your system needs to do, and evaluate different components objectively, do it.
Coming from someone who's implemented some massive testing infrastructures and custom tools, worked on computational biology frameworks, and is currently working on fault-tolerant, scalable SIP-based telephony systems and protocol development: it's really just like any other massive project. Go incrementally and solve one problem at a time. If you're good with databases and know where they excel, do it; otherwise use data structures. If you are strong with Perl and Apache, base it on Linux (perhaps with MySQL); otherwise go to a bookstore, pick up books on a couple of easy components and stick with what you're good at. I personally also recommend actually getting maintenance on open source products you're not incredibly familiar with, as a little help goes a long way.
So anyway, again, above all, go with what you're good at. If you give some more details perhaps people can make some more concrete recommendations.
Absolutely. Most companies hand the whole thing off to hosting companies that specialize in porn hosting. These places are rooms upon rooms of racks, on raised floors, with 6 times redundant connections, dual power backup systems (generators), and all the fiber you could ever want. They're the best. Take a look at Candid Hosting. They had a few hurricanes go over them, and they didn't bat an eye. Incredible uptime.
I don't respond to AC's.
Most porn companies are clueless... They get a MAJOR SCREWING from hosting companies that charge big $$ to figure out how to handle the load.
Real porn companies don't host, they colocate. And real porn companies - real porn companies - are well advanced beyond your Slashdots and your CNN.coms. They don't push an agenda, they push what serves millions of page views without 500s or login problems or 'nothing to see here, move along' warnings. Porn is always bleeding edge on the technology front. And porn made the internet what it is today.
In my experience (having played at being the highly paid consultant who comes in to fix stuff once you've messed it up) I'd always point the finger at the linkage between components ("components" being items in your architecture, including the people you're using to help you). In a three tier environment (a sensible approach, almost regardless of your technology), the database is often a problem. DBAs jump on that pretty quickly, so what's left? Networks are normally easily sorted, but you may still find your application idles when you expect it to be returning pages faster. Here the linkage plays a part. It's the linkage between the parts (not necessarily the connectivity though) that'll be the issue.
Failing that, make code changes to your application. I haven't seen an application yet that didn't benefit from lots of code tweaking to make it more efficient, use the DB better, generate better SQL, less SQL or whatever.
Either way, the OpenSTA route (or LoadRunner if you can afford it) is the only way to do testing. Setting up the tool is a job in itself, and very worth doing carefully (after all, making a virtual user overly aggressive makes it harder to meet targets, but make it too weak and your system doesn't do what you say it will).
As for all the posts about redundancy, load balancing etc - all good information, and something you will need if you need something like 100% uptime. That said, I know of a bank that ran a system supporting hundreds of concurrent users with a single line of three Sun boxes (+ mainframe) - they got their uptime targets, and at a fraction of the cost of their rivals who have two of everything, and then duplicate that in two buildings (but can't run it for toffee).
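To put a number on "overly aggressive" versus "too weak": in a closed test where each virtual user waits for a response, pauses (think time), then sends the next request, the offered load is roughly users divided by (response time + think time). A quick Python sketch of that arithmetic follows; the figures in the example are invented for illustration, not taken from any real system.

```python
def offered_load(virtual_users, response_time_s, think_time_s):
    """Approximate requests per second generated by a closed-loop load test.

    Each virtual user completes one request every
    (response_time_s + think_time_s) seconds.
    """
    cycle = response_time_s + think_time_s
    if cycle <= 0:
        raise ValueError("response time plus think time must be positive")
    return virtual_users / cycle


# Hypothetical example: 200 users, half-second responses.
realistic = offered_load(200, 0.5, 9.5)   # users who pause about 10 s per page
aggressive = offered_load(200, 0.5, 0.5)  # users who click again almost at once
```

Here `realistic` works out to 20 requests per second and `aggressive` to 200, so the same "200 users" can mean a tenfold difference in load depending on how the think time is set.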
Our sites obviously have to serve millions of people, so they have to be pretty robust. I can't tell you every detail because we're all pretty specialized and don't get to see everything ourselves, but from working with our database guys and network guys, I do have a pretty good 10,000 foot picture of how things work. Here's a general sense of what you'll have to do to really be robust:
1. Your database gets its own server, as powerful as you can afford. If you're a really big site, you're using Oracle, and really, a database cluster rather than a single server. IMPORTANT: Only the DBA can touch the production databases. Developers MUST submit requests to the DBA for any changes. Nobody should be touching a production database from their desktop, other than maybe being able to run queries to check data, and they use a separate, limited login for that. Changes are done by the DBA ONLY.
2. You put a firewall between the database server and your middleware server. The firewall is a dedicated device, and you're careful about the ports you leave open. Only the middleware server and DBA workstations on your intranet can touch the database.
3. Your middleware server(s) are as powerful as you can afford (this will be a theme here) and ONLY run middleware, meaning business rule processing. Everything that touches a database in any way MUST come through middleware -- no direct connections, ever. IMPORTANT: developers don't directly install middleware; network staff only.
4. A firewall (again, dedicated device) between the middleware server and the web server. Only the web server (and network staff workstations on your intranet) are allowed to touch the middleware server.
5. A set of web servers for your websites, as powerful as you can afford (hate to keep repeating this, but if you skimp you'll end up screwing yourself down the road). IMPORTANT: Developers should NEVER have access to production web servers; they should give their stuff to the networking staff when it's ready. Also, if you're doing FTP and such, put it on a separate server.
6. A firewall outside your web server, which only permits port 80 traffic and is twice as paranoid as your other firewalls. Log everything "funny".
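The layering in the six points above boils down to a short allow-list of who may talk to whom. Here is a rough Python sketch of that idea, not our actual firewall configuration: the zone names and every port except 80 are assumptions chosen for illustration (1521 is the usual Oracle listener port; the middleware and admin ports are made up).

```python
# Allowed (source zone, destination zone) pairs and the ports open between them.
# Anything not listed is denied.
ALLOWED_FLOWS = {
    ("internet", "web"): {80},                    # outer firewall: port 80 only
    ("web", "middleware"): {8080},                # hypothetical middleware port
    ("netstaff_workstation", "middleware"): {22},  # hypothetical admin access
    ("middleware", "database"): {1521},           # Oracle listener (assumed)
    ("dba_workstation", "database"): {1521},      # DBAs on the intranet
}


def is_allowed(src, dst, port):
    """Return True only if this flow is explicitly permitted."""
    return port in ALLOWED_FLOWS.get((src, dst), set())
```

So `is_allowed("internet", "web", 80)` is true, while `is_allowed("web", "database", 1521)` is false: the web tier has to go through middleware, and nothing from outside reaches the database directly.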
In general, you'll have to hire some people: someone really good at security, to configure all your firewalls, someone good at setting up load-balancing to set up all three layers, someone to help you set up a good development environment...
One thing lots of people overlook: You'll want a "sandbox", i.e. a dedicated set of test database, middleware server, and web server that your developers can play with when working on their sites. You'll also want to set up a UAT (User Acceptance Test) environment similar to your sandbox, so projects can be moved to UAT for testing before being rolled out to production. You can't do UAT on a sandbox; sandboxes are constantly changing. You need a stable environment for UAT.
Anyway... Hope that helps, it's just advice, you know? Not all of it directly addresses high-volume sites, some of it is about site stability and security, but I think it all ties in together. If your site is being changed by developers, it won't be stable... And if you don't have a paranoid firewall setup, it won't be secure. A lot of webmasters would consider this layout to be (putting it politely) seriously paranoid, but hell, just because you're paranoid doesn't mean they're not out to get you. And, anyway, like I said, high volume does imply these other considerations...
Good luck!
Farewell! It's been a fine buncha years!
Your sharedance software is interesting. I don't know if you are aware of memcached, though (http://www.danga.com/memcached/, by the LiveJournal guys), and if so, did it lack something that prompted you to write your own?