Supporting Tens Of Thousands Of Users With Apache?
"I can only think of a couple of ways of doing this. One is to have an enormous single fileserver and have a cluster of apache web servers that NFS mount the home directories to serve up web pages. Then the users FTP into the main file server to store their web pages. To me, this seems wildly inefficient, and you have no real redundancy if the main fileserver crashes unless you are using a SAN (which is very expensive), or you have a hot backup that is rsync'ed or something. And I'm thinking that rsyncing up to 2TB of data would be an exercise in futility.
My other thought was to have several back end file servers with a fixed number of users on each server, and then send all HTTP requests through an LDAP server first, which would then do a redirect to the machine that user's web page resided on. The big problem then is how to make sure users are FTP'ing into the machine that their account is on. They may also use FrontPage extensions with Apache, and this could complicate things even further.
I know there has got to be a better architecture for this. How do enormous sites like Yahoo and Excite tackle this problem? They have hundreds of thousands of users! Better yet, how could they tackle it with Open Source tools? Would, for example, a Turbo Linux cluster help with this problem at all, or would I still have to replicate the data across every node in the cluster (meaning I'd need up to 2TB of storage for each cluster node!)? Then what happens if they decide they want to add another 10,000 users? I can't find pointers to information, or ideas on how to do this *anywhere*. Can you fellow Slashdotters give me any advice?"
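For what it's worth, the second idea in the question (a fixed number of users per back end, with a directory lookup deciding where each request goes) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the host names, the per-server capacity, and the `UserDirectory` class are all made up, and the in-memory dict stands in for whatever LDAP query a real setup would use. The point is that one lookup can answer both the HTTP redirect and the "which machine do I FTP into" question.

```python
from urllib.parse import quote

# Hypothetical back-end hosts; placeholders, not taken from the question.
BACKENDS = ["fs1.example.com", "fs2.example.com", "fs3.example.com"]
USERS_PER_BACKEND = 10_000  # assumed fixed capacity per file server


class UserDirectory:
    """In-memory stand-in for the LDAP lookup: username -> back-end host."""

    def __init__(self, backends, users_per_backend):
        self.backends = list(backends)
        self.users_per_backend = users_per_backend
        self.assignments = {}

    def add_user(self, username):
        # Existing users keep the server they were first placed on.
        if username in self.assignments:
            return self.assignments[username]
        # Fill servers in order; each one holds a fixed number of accounts.
        index = len(self.assignments) // self.users_per_backend
        if index >= len(self.backends):
            raise RuntimeError("all back ends are full; add another server")
        self.assignments[username] = self.backends[index]
        return self.assignments[username]

    def host_for(self, username):
        # The same answer serves HTTP redirects and FTP logins.
        return self.assignments[username]

    def http_redirect(self, username, path=""):
        # URL a front end could send back in a redirect's Location header.
        host = self.host_for(username)
        return f"http://{host}/~{quote(username)}/{path.lstrip('/')}"
```

Adding another 10,000 users then means appending one more host to the list; nobody already assigned has to move, and no node needs a full copy of the data.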
First of all, I love Linux and I'm not a huge FreeBSD fan. However, FreeBSD with Apache is probably a better choice for such a large site - it's known to hold up with these types of heavy loads.
:-) Then security isn't a concern, since the most they can fuck up is their own stuff.
Also, recommend that they not allow CGI scripting - that would be a NIGHTMARE to support with 30,000 users. Not only would there be a huge number of security holes, imagine the amount of server power that would take.
Of course, if you have a large amount of money to spend, get an S/390 and give each user a virtual machine running Linux.