Load Balancing Heavy Websites on Current Tech?
squared99 asks: "I have just delved into some research on a set up for very high traffic websites. I'm particularly interested in how many webservers would be needed at minimum and the type of technology powering them. Slashdot seemed like a good sample site to check out, so I went to Slashdot's technology FAQ to get a starting point. This setup seems to be from 2000, is most likely a bit out of date, and I'm assuming the same number of webservers would not be needed with current server technology. What would experts in the Slashdot community recommend as a required setup to handle Slashdot-like volumes, if they had to do it today using more current hardware? How many webservers could it be reduced to, while maintaining enough redundancy to keep serving pages, even under the heaviest of loads?"
Anything fairly new should be all right; I think the big problem is your pipe size. I mean, if you have 30,000 new connections and only 300 kbps, it's not going to transfer data very well.
It depends on the website hosted. Many dynamic pages with content coming from a database? Or just loads of static pages?
Akamai
http://meta.wikimedia.org/wiki/Wikimedia_servers This is how they do it.
... and you've just missed your greatest opportunity for this by not providing a link to your website! ;-)
Paul B.
Many sites are moving towards utility-based hosting or virtualized setups. The problem with high-capacity sites is that you often end up having to purchase enough servers to deal with peak time, but don't need the servers during off hours. Utility-based hosting services charge you for what you _use_ and allow you to scale as needed. Savvis (http://www.savvis.net/), I know, offers a utility hosting platform based on Inkra, 3Par and blade servers. IBM has a similar setup.
Akamai for static content, and take a look at LiveJournal's setup for dynamic content (master-master replication based on MySQL).
Other people are much more qualified than I to answer the number of servers questions though.
groklaw, wired and slashdot. The holy trinity of work based time wasting.
RTF Server Load Balancing by Tony Bourke. After reading that book you will at least know what you need to look for. Also, you can outsource your load balancing, if that is optimal for your needs, using something like Akamai's servers (Microsoft.com uses Akamai, Netcraft confirms).
Take Pound, a few web server machines, a database server and an NFS server (no Coda, AFS or GFS needed in most cases) and you should be set. This is a setup that I installed for a high traffic website and it is very stable.
From those, you will get an idea of the type and scope of technology the slash teams use to maintain one of the world's most popular sites.
Granted, your team is not as skilled as the crack techs at /. central, but the specs on that page will get you pointed in the right direction.
Yeah, right.
It depends on what you wanna do with a load balanced webserver. If it is serving heavily dynamic content (like PHP), you need good processing power in your webservers. All the webservers should have two gigabit network ports. One should be connected to the switch that is also connected to the load balancer, and the other port should be connected to the backend switch, where you can find the database and NFS server. Put some more RAM into the backend servers. Tweak the NFS server settings (rsize, wsize). Try different MaxClients settings in your Apache configuration, but don't overdo it, because the limitation is not the CPU, but the I/O.
Most important: use mmcache if your site is based on PHP! If I turned off mmcache, our site would be unusable. The performance increase is awesome.
I've worked on both the Windows and Linux development sides of a shop that receives about a million hits a day on each side. On both sides, the bottleneck was always the database.
Both sides used pretty much the same setup for webservers... 4 load balanced webservers with hyperthreading at around 3 GHz (at the moment... always around 75% of the fastest processors out there to save money). These are sitting in datacenters with multiple 10 Gbps connections, and each has a hot-swappable copy of the entire system running at another data center.
I've found SQL Server can take quite a bit more abuse than replicated MySQL, but MySQL is extremely fast. However, we have 4 admins for the MySQL servers. I myself admin the SQL Servers in addition to my programming and tech support roles. Biggest downside to SQL Server is price.
My suggestion: tread very lightly on the database and you'll be able to handle more load than you'd expect. Cron static pages off the database when possible, or have static pages generated automatically when you update your database. Also look into caching mechanisms for frequently used data.
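To make the "static pages off the database" idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the `articles` table, its columns, and the function name `publish_static_pages` are hypothetical, and SQLite stands in for whatever database you actually run. The point is just that a cron job (or a hook fired on update) renders each row to an HTML file once, so the web servers can serve plain files instead of querying the database on every hit.

```python
import os
import sqlite3
import tempfile
from html import escape


def publish_static_pages(conn, out_dir):
    """Render every row of the (hypothetical) articles table to out_dir/<id>.html."""
    os.makedirs(out_dir, exist_ok=True)
    rows = conn.execute("SELECT id, title, body FROM articles").fetchall()
    written = []
    for article_id, title, body in rows:
        html = (
            f"<html><head><title>{escape(title)}</title></head>"
            f"<body><h1>{escape(title)}</h1><p>{escape(body)}</p></body></html>"
        )
        final_path = os.path.join(out_dir, f"{article_id}.html")
        # Write to a temp file in the same directory, then rename it into place,
        # so a web server never picks up a half-written page.
        fd, tmp_path = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(html)
        os.replace(tmp_path, final_path)
        written.append(final_path)
    return written


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
    conn.execute("INSERT INTO articles (title, body) VALUES (?, ?)", ("Hello", "First post"))
    print(publish_static_pages(conn, tempfile.mkdtemp()))
```

Run something like this from cron, or call it from the code path that updates the database, and point the web server's document root at the output directory.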
And as in any programming project: profile and tweak and measure and patch.
We use Foundry ServerIrons. We have two of them set up in an active/standby configuration. We've got approximately 35 servers (5 services) load balanced between the SI's, and average traffic over 50Mb/s just to those services. The SI's are very robust, and I'm quite pleased we got them.
I run quizilla.com, a pseudo-entertainment site that does 60-70 million pages a month, at least 2/3rds being dynamic database backed.
The site FAQ has the gritty details, but basically everything is running on 8 web servers with a cluster of 4 database servers. Mod_perl is used for the most highly trafficked pages, though some less used pages are still static CGIs.
For the way I have it set up, this farm has reached its limit, with the web servers getting pegged pretty constantly during peak hours, and the database servers aren't far behind (mostly due to lack of RAM).
The site makes heavy use of Memcached as well as a homebrew ghetto load balancing system based on Apache mod_rewrite and some ancillary code.
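For anyone who hasn't used a Memcached-style cache, the usual pattern is "look in the cache first, hit the database only on a miss". Below is a rough Python sketch of that pattern. To keep it runnable without a Memcached server, `TTLCache` is a hypothetical in-process stand-in with get/set and expiry; `cached_fetch` and `load_from_db` are illustrative names too, not anything from the site described above.

```python
import time


class TTLCache:
    """In-process stand-in for a Memcached-style get/set cache with expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)


def cached_fetch(cache, key, load_from_db, ttl_seconds=60):
    """Return the cached value for key, calling load_from_db(key) only on a miss."""
    value = cache.get(key)
    if value is None:  # note: a stored None is indistinguishable from a miss here
        value = load_from_db(key)
        cache.set(key, value, ttl_seconds)
    return value


if __name__ == "__main__":
    def slow_lookup(key):
        print(f"database hit for {key}")
        return f"row for {key}"

    cache = TTLCache()
    print(cached_fetch(cache, "quiz:42", slow_lookup))
    print(cached_fetch(cache, "quiz:42", slow_lookup))  # served from cache
```

With a real Memcached cluster the cache lives on separate machines and is shared by all web heads, but the calling code looks much the same.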
If I had my druthers, I'd keep the number of machines but have the web heads be 2.8-3 GHz Xeons or Opterons with 1.5 GB RAM each, and the database servers could be dual 1.8 GHz Xeons with at least 3 GB RAM each. Ideally memcache would be at least 2 GB, but more is always better. From my guess, a setup like that would run my site at 100 million quick pages a month, instead of like now, where pages often take 5 seconds or more.
One big thing that you don't really notice until you try to make things on this scale is that optimization is king -- optimize the hell out of your code. A stray regex might not look expensive, but when it's happening twenty times a second on every machine it quickly adds up.
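One way to check whether a given bit of hot-path code matters is to time it and multiply by the call rate. Here's a small Python sketch along those lines. The request line, the pattern, and the function names are made up for illustration, and the string-ops version is only roughly equivalent to the regex; the 20 calls per second is the figure from the paragraph above.

```python
import re
import timeit


def cpu_share(func, calls_per_second, repeats=2000):
    """Estimate the fraction of one CPU core spent in func at the given call rate."""
    seconds_per_call = timeit.timeit(func, number=repeats) / repeats
    return seconds_per_call * calls_per_second


# Hypothetical hot-path check: is this a GET for an .html page?
LINE = "GET /quiz/12345/results.html HTTP/1.0"
PATTERN = re.compile(r"^GET\s+\S+\.html\s+HTTP/\d\.\d$")


def with_regex():
    return PATTERN.match(LINE) is not None


def with_string_ops():
    parts = LINE.split()
    return len(parts) == 3 and parts[0] == "GET" and parts[1].endswith(".html")


if __name__ == "__main__":
    for name, func in (("regex", with_regex), ("string ops", with_string_ops)):
        share = cpu_share(func, calls_per_second=20)
        print(f"{name}: {share:.6%} of one core at 20 calls/s")
```

The numbers depend entirely on your hardware and your actual patterns, so treat this as a way to measure rather than as a result.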
Code is almost always the weakest link in a big cluster, in that seldom are things sufficiently planned -- I've had huge growing pains since I never planned on scaling past one machine, so when I had to move to 2, 3, 4 and up to 8 it has been a real hassle making things work "right" in a massive cluster. Plan for clustering from the get-go if you have even the slightest inkling it will do high traffic volumes.
Hilary Rosen's speech was about her love of money and her desire to roll around naked in a pile of money.
It is impossible to answer your question unless you define "heavy" traffic.
Some people might consider a hundred thousand pageviews per day to be heavy. Others might consider a million pageviews per day to be heavy.
From experience a hundred thousand for a reasonable application can be handled on one server. A million would probably require 2 to 4.
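Daily totals are easier to reason about as requests per second. A quick Python sketch of the arithmetic follows; the peak factor of 3 is purely an assumption for illustration (real traffic curves vary a lot), and the function names are made up.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rps(pageviews_per_day):
    """Average requests per second if traffic were spread evenly over the day."""
    return pageviews_per_day / SECONDS_PER_DAY


def peak_rps(pageviews_per_day, peak_factor=3.0):
    """Rough peak rate; peak_factor is an assumed ratio of peak to average load."""
    return average_rps(pageviews_per_day) * peak_factor


if __name__ == "__main__":
    for daily in (100_000, 1_000_000):
        print(
            f"{daily:>9,} pageviews/day: "
            f"{average_rps(daily):.1f} req/s average, "
            f"{peak_rps(daily):.1f} req/s at an assumed 3x peak"
        )
```

So a hundred thousand pageviews a day averages a little over one page per second, and a million averages about a dozen, before counting images and other objects per page.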
At my work we use Ultramonkey with LVS-kiss and Mon.
Our hardware infrastructure includes 2 load balancers running in a failover system with 3 web servers in the backend (1.8 GHz, 512 MB RAM, 40 GB HDD, 100 Mbps network). That hosts over 60 million page views a month, and it also supports real-time failover. For monitoring there are tools out there that use MRTG/RRD for cluster statistics.
Check out Mon and Mon.cgi
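For readers new to this, the core idea of a balancer with health checks can be sketched in a few lines of Python. This is not how Ultramonkey, LVS or Mon are implemented; it's a toy round-robin picker where `mark_down` and `mark_up` stand in for what a monitoring tool would trigger, and the class and backend names are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Toy round-robin picker that skips backends currently marked as down."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._cycle = itertools.cycle(self._backends)

    def mark_down(self, backend):
        # A monitor would call this when a health check fails.
        self._healthy.discard(backend)

    def mark_up(self, backend):
        # A monitor would call this when the backend recovers.
        if backend in self._backends:
            self._healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, in rotation order.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backends")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["web1", "web2", "web3"])
    print([balancer.pick() for _ in range(4)])
    balancer.mark_down("web2")
    print([balancer.pick() for _ in range(4)])
```

Failover of the balancer itself (the second box taking over the service address) is a separate problem that this sketch doesn't touch.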
Check out http://www.netscaler.com/. The companies behind the top 10 websites on the internet have; maybe you should too.
Disclaimer, I work for Netscaler, but the customers we have gained should help in your decision.
It's somewhat dated, but the FUD-busting response to the Mindcraft fiasco has all the formulas for figuring out what hardware you need for your pipe. You only need to plug in current processor specs to see what you need. I could only find it in the archives: http://web.archive.org/web/20040409223206/http://cs.alfred.edu/~lansdoct/mstest.html
As previously mentioned, Pound is a wicked, lean load balancer/HA arbitrator that runs well on Linux.
-psy
Outsource to geocities /ducks
Hmm... would your experience be MySQL-based? :P
Carnage Blender serves about 1M pages a day off a single (dual 2.4GHz xeon; 4 GB ram) machine. Those are database-backed pages, with a lot more updates than most read-only-ish sites.
Powered by postgresql. And a lot of tuning.
Some examples here. The examples are heavy on Corporate speak, but you were asking about a large Web/Content architecture, right?
This page shows the server specifications for some of the busiest message boards on the internet, along with what software they use. You'll see the configurations are quite disparate, from the 90+ servers serving the anime fans at Gaia Online on 100% open source software, to the sometimes single server hosting some of the other top 50 forums.
Hey, it's one method.
I was managing a site that received multiple millions of hits per day: 4 Apache mod_perl servers, a MySQL backend (it was getting hammered at 1500 qps, I think, because we tracked lots of user stats for the spreadsheets for the higher-ups), and a 2 TB NFS NAS RAID thing for customer image files. Each server was pumping out 8+ GB per day.
The LVS "director" server was constantly at a load level of 0.0X.
You could run a large site with relatively little hardware. LVS works better than great and the hardware loadbalancer people are praying that more people don't find out about it.
About the only thing "normal" people wouldn't have access to is the bandwidth - everything else you could get used off ebay or craigslist.
Hey man, I love Carnage Blender!
I had to stop playing because it was too addictive, but you've got a cool thing there.
....a Beowulf cluster? :p
But before I switched, I got demos from all three players and put them in a head-to-head contest. I would suggest doing the same. In a lab setting, we couldn't hit the devices hard enough to pick a clear winner based on performance. When I looked at administration and features, the F5 pulled to the front.
The GUI is clear and concise for those who like GUI's. The OS in the old version (4.X) was BSD, while the new version (9.X) is Linux based. You get full root access to the box, so you can write scripts in the shell language of your choice. The last time I worked with the CSS it had a proprietary scripting language; it may be different now.
The power for me as a network guy is the scripting. I can rebuild our entire site in about 30 seconds thanks to the scripts. I've also been able to move some of the administration to admins that normally wouldn't be qualified to work with a load balancer because I can give them scripts and restrict their access so that they can only run the scripts I specify.
SSL proxies are a cool feature as well. It's not documented, but there is an easy process to convert IIS certificates to apache-style certificates for use on the F5. It helps a lot when you are adding/removing servers on a regular basis.
Our site takes enough hits to keep 4 servers loaded all day long. The load average on the F5 never hits 1.0. This is with about 40 Mbps of traffic running through the F5.
Still, with a plan, you only get the best you can imagine. I'd always hoped for something better than that. -CP
One mistake that I see lots of people make is to use a PC-based load balancer. A hardware device (Foundry ServerIron, Nortel Alteon, Cisco CSM, etc.) is well worth the money (especially if you get it on eBay).
- 26 July 2004 article
- 27 Nov 2004 article
There are a few on that site about database server performance, too...
Our web site serves about 3 million pages a day on two gigabit links. The balancing rules are quite complex since the content is split across multiple servers and SANs. We use software load balancers: Zeus ZXTM on Gentoo Linux. The nice thing about software load balancers is that you can easily replace the hardware if it fails. Having a spare PC is way cheaper than a spare load balancer. We are very pleased with ZXTM so far. Very reliable, fast, and very flexible. It uses a PHP-like scripting language to process requests, and you can really handle any specific backend architecture with that.
The answer to this question depends entirely on how heavy each request you serve is. If you are just throwing together some PHP code that pulls information from a few databases, possibly updating some others, on every hit, it could require quite a large number of machines to handle the load. If you are clever, effectively making the results static pages, it may take very few systems.
A good starting place is to just measure it: test how long it takes to serve a page like the ones you expect to be publishing, and come up with an average time per request. Multiply this by the expected number of requests per second and you get a rough idea of the number of machines you need to handle the load. Very rough, but it's a starting place.
For example, the last time my site got slashdotted, users were hitting a page that is generated from a database. Through clever design, these pages get cached quite heavily, so only the first view requires 200-ish ms to generate. After that, unless the database is updated, the pages require more like 10 ms to serve. Serving a static page through Apache requires around 8 ms.
During the heaviest hit period, there was only around 4% CPU load on the server. Without the caching, this probably would have been more like 80%.
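The numbers above can be plugged into a back-of-the-envelope calculation. The Python sketch below is only that: it assumes the per-page time is all CPU time on a single core, that load scales linearly, and that you want to run servers at about 50% busy (the `target_utilization` default is my assumption, as are the function names).

```python
import math


def implied_request_rate(cpu_utilization, seconds_per_request):
    """Requests per second implied by an observed CPU utilization (0.0 to 1.0)."""
    return cpu_utilization / seconds_per_request


def projected_utilization(requests_per_second, seconds_per_request):
    """CPU utilization of one server at the given rate and per-request cost."""
    return requests_per_second * seconds_per_request


def servers_needed(requests_per_second, seconds_per_request, target_utilization=0.5):
    """Servers needed to keep each one at or below target_utilization."""
    busy_servers = requests_per_second * seconds_per_request
    return max(1, math.ceil(busy_servers / target_utilization))


if __name__ == "__main__":
    # 4% CPU at roughly 10 ms per cached page.
    rate = implied_request_rate(0.04, 0.010)
    print(f"implied rate: {rate:.1f} requests/s")
    # Same traffic if every page took the uncached 200 ms.
    print(f"uncached CPU load: {projected_utilization(rate, 0.200):.0%}")
    print(f"servers for uncached pages at 50% target: {servers_needed(rate, 0.200)}")
```

Under those assumptions, 4% CPU at 10 ms per page works out to about 4 requests per second, and the same traffic at 200 ms per page would keep one server about 80% busy, which lines up with the guess above.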
Nobody can tell you what the answer to this question will be for your situation. I can tell you that everyone plans for their new site to be as popular as Slashdot, but I would remind you that trying to come out of the chute able to handle the load of Slashdot is probably a waste of time and money. Sure, if you have a few hundred thousand dollars to spend on hardware you can happily build up an infrastructure that will handle huge loads. However, if it takes a year or two for those loads to come, at that time you can buy the same computing horsepower for probably half what it costs today. In the meantime, why don't you take the time you would have spent architecting this massive database cluster and set of Apache workers, and instead spend it providing content and marketing your site?
It's easy to spend time on the geeky network and computing parts of a design, as a geek, but the marketing and content side is the one that's most likely to make it a slashdot.
Sean