Slashdot Mirror


Building/Testing of a High Traffic Infrastructure?

New Breeze asks: "I'm currently working on my first web 'application', and have discovered that I know less than nothing about setting up the infrastructure to manage a high traffic system. Where does one go to learn about setting up the infrastructure required to host something like Slashdot? Or do you just say, 'Not my area!' and help them find a consultant?" "My experience is pretty much limited to:
1. Install the web server on one box, the database on the same box if it's a small installation or a separate box if performance seems like it will need it. Add more memory and processors based on SWAG criteria. (Scientific Wild Ass Guess)

2. Contract with a hosting company.
I had a potential customer ask what I would recommend if they wanted to self host, they have around 300 remote locations and would have multiple users from each location hitting the application at the same time, so saying a couple of beefy servers probably isn't the right answer.

I haven't a clue. The last place I worked with on something like this hired a high-dollar consultant who spent a huge pile of their money setting up a load-balanced, Oracle Parallel Server, redundant-everything system.

How do you test it? I've worked where they actually had a room with hundreds of systems on racks that they would configure to run test transactions against different servers and software builds for stress testing, but that's not in my budget..."

231 comments

  1. A Beowolf Cluster of Course by l810c · · Score: 4, Funny

    That one was easy, ...Next

    1. Re:A Beowolf Cluster of Course by Anonymous Coward · · Score: 0

      yeah right, do you know anything about beowulf (not beowolf)? i think not.. please read http://beowulf.org/overview/faq.html before saying such things..
      beowulf clusters work great, but only with software written for them! you can't just run apache, tomcat, postgresql or oracle on such a cluster and expect them to run faster. they probably won't. maybe mosix is an answer, but frankly i don't think any cluster system is adequate for this job.
      if you want high availability and high performance at the same time, you won't get away from load balancers and a redundant setup. sorry, no bonus.

    2. Re:A Beowolf Cluster of Course by Anonymous Coward · · Score: 0

      Hey jackass, it was a joke. Get over it.

  2. Ask a Pr0n serving company by chia_monkey · · Score: 5, Insightful

    Seriously...they know all about serving up content on high traffic sites. Not only is it high traffic, but it's rather big files that they're delivering. When we're testing the networks that we set up, both wired and wireless, we often visit pr0n sites for our benchmarks.

    --

    "He uses statistics as a drunken man uses lampposts...for support rather than illumination." - Andrew Lang
    1. Re:Ask a Pr0n serving company by AndroidCat · · Score: 4, Funny

      Okay, let's put it to a real test. Post some good pr0n links and we'll try to slashdot them!

      --
      One line blog. I hear that they're called Twitters now.
    2. Re:Ask a Pr0n serving company by Anonymous Coward · · Score: 0

      Damn thats the BEST excuse EVARR! But seriously honney, its work!

    3. Re:Ask a Pr0n serving company by Neil+Blender · · Score: 1

      Seriously...they know all about serving up content on high traffic sites. Not only is it high traffic, but it's rather big files that they're delivering.

      True dat. And the answer will be proprietary load balancing and firewall, FreeBSD, mysql, mod_perl, apache and squid.

    4. Re:Ask a Pr0n serving company by myukew · · Score: 1

      for your benchmarks only of course ;)

    5. Re:Ask a Pr0n serving company by Neil+Blender · · Score: 5, Informative

      Okay, let's put it to a real test. Post some good pr0n links and we'll try to slashdot them!

      Having worked for porn sites, I will tell you this: They, more than anybody, will rise to the challenge. Porn = Traffic = Ad Click Throughs = Money. If a porn site sees a sudden rise in traffic, they will drop more servers into their delicately load balanced system without blinking an eye. Porn sites aren't slow. There is a reason why.

    6. Re:Ask a Pr0n serving company by DogDude · · Score: 4, Interesting

      Absolutely. Most companies hand the whole thing off to hosting companies that specialize in porn hosting. These places are rooms upon rooms of racks, on raised floors, with 6 times redundant connections, dual power backup systems (generators), and all the fiber you could ever want. They're the best. Take a look at Candid Hosting. They had a few hurricanes go over them, and they didn't bat an eye. Incredible uptime.

      --
      I don't respond to AC's.
    7. Re:Ask a Pr0n serving company by emmetropia · · Score: 2, Informative

      I've done the porn gig for a while, and I can tell you a "secret" that won't work all that well for a web application. Anything on a porn site that can be served up statically, usually is. If it's a "page of the week" it usually is generated once a week, so that you don't need to pull from the database on every single hit. At least, that's the way it works/worked for us.

      If you do need to hit the database with almost every page load, there are a few simple tricks. These are what I use with PHP, which is the only language we use on our sites. If there's common information that will be called upon a lot, store it in a session variable, as it saves database transactions (not much, but every bit counts). Take a look at how your database transactions are set up. If you have a table with 30 columns, and you only need 8, then select those 8 instead of *. If there's a lot of traffic on a particular page, try persistent connections. Look at an object-oriented way of writing all of your db transactions; it makes big applications a little smaller and easier to work with, and if you find a way to fix something you're doing, you just fix the class instead of every script in the application. There are plenty of other little "tricks", and what I've said doesn't really qualify as tricks, but there's plenty you can do before you start talking multiple load-balanced servers.
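
      A minimal sketch of the "select those 8 instead of *" point, written in Python against an in-memory SQLite database rather than the PHP setup described above. The table name, column names and data are made up purely for illustration.

```python
import sqlite3

def make_demo_db():
    """Build an in-memory table with 30 columns (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f"c{i} TEXT" for i in range(30))
    conn.execute(f"CREATE TABLE members ({cols})")
    row = [f"value-{i}" for i in range(30)]
    placeholders = ", ".join("?" * 30)
    conn.execute(f"INSERT INTO members VALUES ({placeholders})", row)
    conn.commit()
    return conn

def fetch_wide(conn):
    # Pulls every column, whether the page uses it or not.
    return conn.execute("SELECT * FROM members").fetchone()

def fetch_narrow(conn, wanted):
    # Names only the columns the page actually needs.
    # The column names come from a fixed list here, never from user input.
    return conn.execute(f"SELECT {', '.join(wanted)} FROM members").fetchone()

conn = make_demo_db()
wanted = [f"c{i}" for i in range(8)]
print(len(fetch_wide(conn)), len(fetch_narrow(conn, wanted)))
```

      The narrow query moves 8 values per row instead of 30. Whether that matters in practice depends on the row width and the database, so it is worth measuring on the real schema.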

    8. Re:Ask a Pr0n serving company by DickBreath · · Score: 1

      Having worked for porn sites, I will tell you this: They, more than anybody, will rise to the challenge.

      Then please post some links.
      I'm sure Slashdot users will also rise to the challenge.
      (please only post links to sites that can be successfully navigated using one hand.)

      --

      I'll see your senator, and I'll raise you two judges.
    9. Re:Ask a Pr0n serving company by LiquidCoooled · · Score: 1

      I have to laugh at the link you gave.

      On their front page, one of their options is:

      Available by shelves, racks, or private caged suites of various sizes

      Thats for the stranger fetishes!

      --
      liqbase :: faster than paper
    10. Re:Ask a Pr0n serving company by Anonymous Coward · · Score: 1, Insightful

      Objects can create more load in memory. On a particularly heavily loaded site I've seen sessions take up a fairly large share of memory, and objects generally seem to use more memory than standard variables. Yes, the code may be smaller and better written, but that doesn't make it more efficient at conserving resources. For database transactions, objects are normally used because they make it easier to change the connection details. Of course I'm sure I will get stoned by OOP zealots since it is the new(old) thing.

      I agree with the static approach. When content is edited for a mass audience, it can be saved into a static page rather than pulled from a database on every hit; failing that, writing a caching module, or using one that already exists, can reduce overheads significantly. Images viewed as thumbnails can be written out to disk the first time they are viewed, so the resize script does not have to run again and stream the result to stdout. Additionally, just make sure thumbnails are generated upon upload.

      Generally to conserve resources, elegant generic code with objects, session variables and sql queries with multiple joins aren't the way to go. Like the parent says, only use what you need. However if you need more than one table you can always consider merging the tables into one. Okay, yes this goes against normalisation, but normalisation is used to reduce redundant data, not speed up a query or reduce overhead.

      Large savings can be made by ensuring things aren't recursive, by eliminating those loops and hardcoding stuff that might not be changed for a long while. This all sounds like bad programming, but it does reduce the overheads.
      Simple concepts for reducing memory usage: cut down on session vars and reduce sql queries to a minimum. Set connection limits. Ensure you have www log rotation and information. If the server is full, then make sure you're warned about it.

      I guess what I'm trying to point out is, elegant code isn't always fast, and most developers aim for a streamlined, well programmed application. Sometimes, however, inelegant programming can result in optimisations that you might not notice except on a heavily loaded server.

    11. Re:Ask a Pr0n serving company by Barryke · · Score: 2, Insightful

      Candid Hosting is expensive, at least in comparison to Dutch hosting (Amsterdam Internet Exchange)

      --
      Hivemind harvest in progress..
    12. Re:Ask a Pr0n serving company by sevinkey · · Score: 1

      I work for a sister company of one of those, and yeah, we can definitely handle the bandwidth. We have a 1gbps pipe from Level3 and that's filled up before. We get DDOS attacks every day, but they do the "open a connection and forget it" type of attacks since bandwidth isn't a problem.

      My database system gets about 300,000 hits a day, and that's just a small cluster of Windows servers nowadays...

      Mainly the way we do it is to try our best to make a resilient, redundant system, and then monitor for bottlenecks and make workarounds with code (i.e. write to log files and have a cron job interact with the database every 5 minutes, etc). These bottlenecks will bite you a lot harder than the bandwidth.
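
      The "write to log files and let cron talk to the database" workaround might look roughly like this in Python. It is a sketch under assumptions: the `hits` table, the file names and the SQLite backend are invented for the example, and it ignores file locking between the web processes and the batch job.

```python
import os
import sqlite3
import tempfile

def log_hit(log_path, page):
    # Cheap append on the request path; no database round trip.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(page + "\n")

def flush_hits(log_path, conn):
    """Meant to run from cron: load all queued hits in a single transaction."""
    if not os.path.exists(log_path):
        return 0
    # Rename first so hits logged from now on land in a fresh file.
    batch_path = log_path + ".batch"
    os.replace(log_path, batch_path)
    with open(batch_path, encoding="utf-8") as f:
        pages = [(line.strip(),) for line in f if line.strip()]
    with conn:  # one commit for the whole batch
        conn.executemany("INSERT INTO hits (page) VALUES (?)", pages)
    os.remove(batch_path)
    return len(pages)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (page TEXT)")
log_path = os.path.join(tempfile.mkdtemp(), "hits.log")
for page in ["/home", "/join", "/home"]:
    log_hit(log_path, page)
print(flush_hits(log_path, conn))
```

      The trade-off is freshness: the database lags the site by up to one cron interval, in exchange for one bulk write instead of one write per page view.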

    13. Re:Ask a Pr0n serving company by Guitarsenal · · Score: 1, Funny

      "Incredible Uptime" would be an excellent name for a porn site!

    14. Re:Ask a Pr0n serving company by chawly · · Score: 1

      A drunken man uses lamp-posts for support - and to urinate against.

      --
      How many beans make five, anyhow ? ... Charles Walmsley
    15. Re:Ask a Pr0n serving company by jon3k · · Score: 1

      Okay, yes this goes against normalisation, but normalisation is used to reduce redundant data, not speed up a query or reduce overhead.

      The actual term is denormalization, and it can definitely produce some dramatic speed improvements.

      In all seriousness, you really don't see any large porn sites being delivered dynamically. Just static pages generated on some regular interval.

    16. Re:Ask a Pr0n serving company by jon3k · · Score: 1

      Actually, selecting * instead of naming the actual columns can be much more efficient. If you have a high-density table, then just pulling the entire rows and throwing them away after you're done can be a lot more efficient in terms of disk access, although you'll take a larger memory hit. But as we all know, memory is fast, disks are slow.

  3. Post a URL by MarkusQ · · Score: 5, Funny

    Post a URL here and we'll help.

    -- MarkusQ

    P.S. Clever use of the text describing the link may help you control how much traffic you get, from low ("M. Moore Nude!") to high ("SCO caught robbing courthouse").

    1. Re:Post a URL by TRIEventHorizon · · Score: 0

      Daryl McBride Assasinated!!!

      It is a happy day for us all!

      --
      "And so the Trekkies were executed in the mannor most befitting virgins - thrown into volcanoes" - Futurama
    2. Re:Post a URL by TRIEventHorizon · · Score: 0

      parent is supposed to be a joke relevant to grandparent's comment on link description, so my comment is not actually valid in reality

      --
      "And so the Trekkies were executed in the mannor most befitting virgins - thrown into volcanoes" - Futurama
    3. Re:Post a URL by SexyJesus · · Score: 1

      Are you saying that the slashdot crowd wouldn't want to see Mandy Moore nude?

    4. Re:Post a URL by jo42 · · Score: 1

      Here you go:

      - Pure TnA -

    5. Re:Post a URL by nacturation · · Score: 1

      Sure, but who is Daryl McBride? We all know who Darl is, so is that his brother Daryl, or his other brother Daryl?

      --
      Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
    6. Re:Post a URL by sundy58 · · Score: 1

      Amen. I voted Badnarik

  4. I wondereded the same thing by aldousd666 · · Score: 0, Redundant

    My sites have crashed under a 1200 user load, and sometimes act a little fruity due to database concurrency issues even if they don't crash... I'd like to know the answer to your question too

    --
    Speak for yourself.
  5. Well .. by Anonymous Coward · · Score: 0

    Well, this sounds like it's pretty big. So load balancing is probably part of the answer. You can pretty much load-balance anything from firewalls to database-servers. That includes web servers, yes.
    The most "difficult" (read: expensive) part is setting up storage that is both large, reliable and fast enough to support everything.

  6. Simple Flow chart for learning by drachenfyre · · Score: 5, Funny

    1. Submit Link to slashdot with your webserver hosting a lot of large video files supporting the link.
    2. Have link approved (Note - duplicating any story just posted is probably the best way to get approval and lots of people crying dupe)
    3. Learn what caused the webserver to melt and how long it took to melt.
    4. Fix the problem that caused step #3
    5. Repeat 1-4 until server doesn't melt.
    6. Congrats! You've learned how to host a high demand web server.

    1. Re:Simple Flow chart for learning by ReelOddeeo · · Score: 1

      You forgot...
      7. Profit

      --

      Those who would give up liberty in exchange for security and DRM should switch to Microsoft Palladium!
    2. Re:Simple Flow chart for learning by drachenfyre · · Score: 1

      its only Profit if the webserver is for a Pr0n site.

  7. Test using Slashdot itself! by xmas2003 · · Score: 2, Insightful
    In answer to your question about testing, have your web site /.'ed and see how it handles the Slashdot Effect which is a pretty good stress test! ;-)

    P.S. When I first tried to read this story, I got "Nothing for you to see here. Please move along" ... somewhat ironic I'd say ...

    --
    Hulk SMASH Celiac Disease
    1. Re:Test using Slashdot itself! by aldousd666 · · Score: 1

      there are a million webmasters with ad-sponsored sites who'd love to get slashdotted. Many are called, but few are chosen

      --
      Speak for yourself.
    2. Re:Test using Slashdot itself! by Anonymous Coward · · Score: 0

      Of course, the actual adservers tend to fail their Save vs. Death when slashdot does visit.

    3. Re:Test using Slashdot itself! by Jeff+DeMaagd · · Score: 1

      Many are called, but few are chosen

      Many people ask how many of the chosen are paying customers, like Roland Piquepaille.

  8. Consult by XMichael · · Score: 1

    This particular area, given its critical nature to one's business, is not something that inexperienced IT engineers should test and decide on by themselves. There are many folks out there who can give you suggestions or even an assessment in writing for a nominal fee.

    Seriously, what kind of Slashdot reader are you, if you don't already know someone who could help you test / recommend load.

    Go to one of your region's LUGs or them silly Slashdot meetup.com's (-;

  9. Re:weekend gmail invites by Japong · · Score: 3, Informative

    These are very obvious links to a shock site, ignore them and mod parent down. Seriously, AC, don't you get tired of this?

  10. Re:If you are asking this... by Anonymous Coward · · Score: 0

    Then you need to hire a consultant.

    I won't ask it then. Thanks for the warning, that was a close call.

  11. hmm by Anonymous Coward · · Score: 2, Informative

    it really depends on what you need.

    In my experience though hardware (especially memory) and bandwidth come before a superoptimized software front-end & database.

    A good introduction I can recommend is called "Developing IP-Based Services: Solutions for Service Providers and Vendors" - I forget who wrote it. But definitely worth reading on the subject.

    1. Re:hmm by SillyNickName4me · · Score: 1

      > In my experience though hardware (especially memory) and bandwidth come before a superoptimized software front-end & database.

      Both are important, but I don't think they come before a well optimized front-end and database.

      It is really simple. The less time your application needs before it can give an answer the lower your concurrency problems will be.

      This will result in needing both less bandwidth (marginally) and smaller hardware.

      A 'trick' I often employ is using caching in the front-end server. This means that every dynamic page is sent with cache control info, and will be cached at the frontend if appropriate. That way I don't have to make 'statistics of the week' or whatever pages static; they will just be cached for the appropriate amount of time. It also means having a bit more memory in the frontend server, but it saves a lot of load on the backend.

      You need as much bandwidth and memory as your application demands, and then a bit. But what your application demands depends largely on how well it is optimized.
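
      To illustrate the front-end caching idea, here is a toy in-memory cache in Python that keeps a rendered page until its max-age runs out. It only stands in for a real caching proxy; `TTLCache`, `render_stats` and the one-hour lifetime are assumptions made up for the example.

```python
import time

class TTLCache:
    """Tiny front-end cache: keep a rendered page until its max-age expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # url -> (expires_at, body)

    def get(self, url, render, max_age):
        now = self._clock()
        hit = self._store.get(url)
        if hit is not None and hit[0] > now:
            return hit[1]          # still fresh: the backend is not touched
        body = render(url)         # stale or missing: ask the backend once
        self._store[url] = (now + max_age, body)
        return body

calls = []

def render_stats(url):
    calls.append(url)  # stands in for an expensive backend request
    return f"<html>stats for {url}</html>"

cache = TTLCache()
for _ in range(1000):
    cache.get("/stats/week", render_stats, max_age=3600)
print(len(calls))
```

      A thousand front-end hits turn into a single backend render here, which is the whole point: the page stays dynamic, it just isn't regenerated more often than its lifetime allows.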

  12. PLEASE by asadodetira · · Score: 5, Funny

    Please don't tell him. We don't need another slashdot. Servers worldwide surrender

    OK.It's easy. There are three steps involved
    1.Build a low performance infrastructure.
    2.Put a RT sticker and chromed exhaust pipes
    3.Done

    1. Re:PLEASE by alphapartic1e · · Score: 1

      OK.It's easy. There are three steps involved
      1.Build a low performance infrastructure.
      2.Put a RT sticker and chromed exhaust pipes
      3.Done

      You're missing the key element: a "Type-R" sticker. "VTEC" sticker, a plus.
    2. Re:PLEASE by rizzo420 · · Score: 1

      and it has to have a loud exhaust system and a big ass woofer so all you hear is the plastic shit on the car vibrating. tinted windows and cd hanging from rear view mirror optional.

      --
      please me, have no regrets.
  13. If your sites are acting fruity... by Anonymous Coward · · Score: 0, Funny

    You have a serious problem. I suggest you start accept their newfound sexuality and let them be who they are... they say its genetic, and anyway its unlikely your efforts to make them more masculine will only backfire.

  14. Do the math by MarkusQ · · Score: 5, Insightful

    First step, do the math.

    What was once a "high volume" app may be nothing for modern equipment. You're talking about on the order of 1K concurrent users (300 sites * several users per site).

    If "use" means manually typing data into forms, viewing mostly static pages, etc. this isn't really a very "high volume" application, and a single decent server should handle it.

    If, on the other hand, "use" means constantly running complex queries against a billion item data set, you're doomed.

    So where do you fall in this spectrum?

    Coming up next...where's the bottleneck?

    -- MarkusQ

    1. Re:Do the math by Anonymous Coward · · Score: 0

      So where do you fall in this spectrum?

      It's pretty obvious that the original poster is concerned about what to do when he falls just beyond what you can accomplish with one beefy machine running Apache and another beefy machine running Postgres.

      There are a hell of a lot of people running exactly that combo, and we're all wondering the same thing. So far, that combo works pretty darn well, but what are the options when the time comes to double the capacity?

    2. Re:Do the math by ArbitraryConstant · · Score: 1

      "If, on the other hand, "use" means constantly running complex queries against a billion item data set, you're doomed."

      Unless you're Google.

      --
      I rarely criticize things I don't care about.
    3. Re:Do the math by New+Breeze · · Score: 4, Informative

      300 sites, between 12 and 200 concurrent users at a site.

      It's a CRM system, i.e. some basic data entry, some portions are transaction processing. i.e. the workflow portion for the base part of the app is very simply:

      Search for customer by various criteria.
      No customer found, add one.
      Retrieve customer information.
      Add current order information being stored for this customer.
      Process loyalty/discount programs to see if customer qualifies for an award.
      Return award to order entry system for processing.

      There's a lot more to it, but that's the meat of it. It's fairly data intensive; there is a great deal of information stored for customers for use in data mining the collected information. It's primarily web service based, but there is a fairly extensive management and reporting tool that is all HTML based.

      My guess is that the bottleneck is going to be the database. We've done extensive testing with a million-customer sample database, running multiple instances of test applications from 10 other boxes, but that doesn't exactly prove much as it's too predictable.

    4. Re:Do the math by mrjohnson · · Score: 3, Informative

      Aight, I'll take a stab...

      You'll need a hardware SSL loadbalancer, with redundancy:

      http://www.coyotepoint.com/e450.htm

      (Two of those).

      You'll need at least two web servers with CPU and RAM. The requirements on these boxes really depend on the app. I'd make them dualies at least, with fast Xeon processors (Good bang for the buck). A couple gigs of ram each. You can add servers to the load balancer later if you need to. Disk doesn't really matter, but I'd use a SCSI mirrored root volume for reliability.

      The database needs to be redundant, and since you think it'll be the bottleneck, an Oracle RAC setup would seem to fit your needs. I really don't like Oracle from a developer stand-point, but two big servers with Oracle running on raw disks for performance is a tough combo to beat. Expensive and long on the setup and install, though.

      With all of that, you'd have a large initial investment, but something that would grow with your needs. You could add new apps to this setup later on, as well. However, if you needed to run this on the cheap, the Linux Virtual Server project does a lot of this with simple Linux boxes.

      If this is too expensive, the first thing to take out is the hardware SSL. I included it because I want them, not because I have 'em. :-) Check out pound:

      http://www.apsis.ch/pound/

      A couple Linux boxes with failover setup and you've saved a good 40-70 grand. Requires some expertise.

    5. Re:Do the math by MarkusQ · · Score: 1

      So, tens of thousands of concurrent users on a single web application (9,000 to 36,000). The first thing I'd do is check those numbers. That sounds high (e.g., larger than anything Citibank or Intel is running). If the numbers are accurate, you may want to reconsider making it a web app.

      But the real question isn't how many users, it's how much load they'll impose. Assuming that the number of users is correct, it sounds like a POS situation (but 120 registers at one site?), so maybe one full transaction cycle a minute from each user worst case, or 500 per second.

      Do you have any idea how many of these will be pure read (customer found, in the example) and how many will be read-fail-write (new customer)? Remember to de-rate writes, since they will have to enter/verify the customer data.

      Depending on the details, it may make sense to replicate the data at each site (for fast queries), and only propagate updates.
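
      The back-of-the-envelope load estimate can be written down in a few lines of Python. The 300 sites and the 12 to 200 users per site come from the thread, and one transaction cycle per user per minute is the worst case suggested above; the 100 users per site row is just an assumed middle value.

```python
def concurrent_users(sites, users_per_site):
    return sites * users_per_site

def transactions_per_second(users, cycles_per_user_per_minute=1.0):
    # One "cycle" is a full search/retrieve/add-order round trip.
    return users * cycles_per_user_per_minute / 60.0

SITES = 300
for per_site in (12, 100, 200):
    users = concurrent_users(SITES, per_site)
    tps = transactions_per_second(users)
    print(f"{per_site:>3} users/site -> {users:>6} users, {tps:7.1f} tx/s")
```

      At 100 users per site this gives 30,000 users and 500 transactions per second, matching the estimate above. The low and high ends of the range differ by more than an order of magnitude, which is a good reason to check the user counts first.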

      -- MarkusQ

    6. Re:Do the math by Anonymous Coward · · Score: 0

      I once was in charge of building an automatic way of calculating the necessary hardware for a complex web application consisting of several different servers. I thought about it for a while and came up with the following methodology:

      1. Perform a few "typical" sessions, logging all of the requests to each server
      2. Take the data above and analyze it statistically, mainly to know the percentages of each different kind of request (to each server, if you have different types of them). You can also measure the sizes of the "messages" for some bandwidth calculation.
      3. Do a test program to know how many requests a "unit machine" (let's say, a PIII 1GHz) can handle of each kind

      Using the percentages from 2. and the req/s from 3., you should have an estimate about the hardware/network resources that you need, for 1 session. Then multiply for the number of sessions you need; you can also choose the response time, by scaling the req/s.
      You may pick some "traditional" processor/machine benchmarks from the net and scale the results to get the resources for a machine different from the "unit machine".
      Let's say a typical session has only 4 requests, of 2 types, I and II, making up 25% and 75% of the total requests, and takes on average 5 seconds. If a PIII 1GHz can handle 300 req/s of type I and 200 req/s of type II, in 1 second your machine could be able to handle up to (300*25% + 200*75%) / 2 * 5s ~ 562 sessions with average delay of 1s. If you think your users can wait up to 2 seconds, then it may handle 2*562 sessions.
      These are very optimistic values of course, but at least it gives you a reference point. You can scale down the output by a conservative constant.
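
      For comparison, here is one way to code up the same kind of estimate in Python. This is my own sketch, not the formula quoted above: it averages the time per request (so the slower request type weighs more) and divides by all 4 requests in the session, which is why it lands at roughly 270 concurrent sessions instead of 562. The function names are made up.

```python
def mixed_rate(mix):
    """Requests/s one machine sustains for a mix of request types.

    mix: list of (share_of_requests, max_requests_per_second) pairs.
    Averages the time per request, so slow request types weigh more.
    """
    seconds_per_request = sum(share / rate for share, rate in mix)
    return 1.0 / seconds_per_request

def sessions_supported(mix, requests_per_session, session_seconds):
    # Each session issues requests_per_session requests spread over
    # session_seconds, i.e. this many requests per second per session:
    per_session_rate = requests_per_session / session_seconds
    return mixed_rate(mix) / per_session_rate

# Measured figures from the example: 25% type I at 300 req/s,
# 75% type II at 200 req/s, on the "unit machine".
mix = [(0.25, 300.0), (0.75, 200.0)]
print(round(mixed_rate(mix), 1))
print(round(sessions_supported(mix, requests_per_session=4, session_seconds=5), 1))
```

      Either way it is an upper bound on an idle, perfectly tuned box, so the advice about scaling the result down by a conservative constant still applies.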

      Good luck!

  15. Apache Benchmark is your friend by sseremeth · · Score: 5, Informative

    If you want to throw some serious load at your equipment, get a few other systems saturating your network with Apache Benchmark (ab) requests. It gives lots of useful data, like response times, etc. And you're best off toppling the application, trying to find the cause of the failure, and working on that, as someone already suggested. Then rinse and repeat.

    Looks like Apache has updated their tools since the last time I had to do this...

    http://httpd.apache.org/test/
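
    If you'd rather script the load yourself, the core of what a tool like ab does with its `-n` (total requests) and `-c` (concurrency) options can be sketched in Python. This is an illustration only, not a replacement for ab: `http_fetch`, `run_load` and `percentile` are names invented here, and a single Python process will run out of steam long before a real load tool does.

```python
import math
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def http_fetch(url):
    """Fetch one URL and return its HTTP status code.

    urlopen raises an exception for 4xx/5xx responses and timeouts.
    """
    with urllib.request.urlopen(url, timeout=10) as resp:
        resp.read()
        return resp.status

def run_load(fetch, total, concurrency):
    """Call fetch() `total` times using `concurrency` worker threads.

    Returns (results, latencies_in_seconds_sorted_ascending).
    """
    def timed(_):
        start = time.perf_counter()
        result = fetch()
        return result, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        pairs = list(pool.map(timed, range(total)))
    return [r for r, _ in pairs], sorted(t for _, t in pairs)

def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list."""
    rank = math.ceil(pct / 100.0 * len(sorted_values))
    return sorted_values[max(0, rank - 1)]
```

    Usage would be something like `run_load(lambda: http_fetch("http://your-test-box/"), total=1000, concurrency=50)` followed by `percentile(latencies, 95)`, where the URL is a placeholder for a test machine you own.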

    1. Re:Apache Benchmark is your friend by oneishy · · Score: 2, Interesting

      Another really good tool for stress testing web apps is Microsoft's Web Application Stress Tool. It allows you to configure testing for a set of different virtual users, and also supports https, stores cookies if you want, etc. An all-round well-featured tool. One of the best features for testing a load-balanced app is its ability to seamlessly distribute the testing load across multiple client machines, thus providing a really realistic load.

    2. Re:Apache Benchmark is your friend by xanthan · · Score: 1

      Apache benchmark is cool and all, but it isn't nearly sophisticated enough to generate complex traffic and emulate real user behavior with the right headers, etc. You can apply some mad-scripting skills around it and customize it a touch, but that's a slippery slope to doing a lot of work. If all you need is load generation for a single URL, Apache Benchmark (ab -- it compiles with a standard distro of apache) works as does http_load (www.acme.com), webbench (www.webbench.com), and WAST (Microsoft Web Application Stress Tool). There are many others as well.

      If you need something more sophisticated, be prepared to shell out money. This means something like LoadRunner (Mercury software) or the like. There are other vendors that do this too, so shop around.

      If you only need simple load generation and you're doing this from windows, I prefer WAST if you need to tweak the data from some scripts easily since it dumps all of its output into an Access database. If you're doing it from Unix, either ab or http_load will do. http_load also does SSL, I don't recall if ab does.

    3. Re:Apache Benchmark is your friend by gregfortune · · Score: 1

      Jmeter is pretty nice as well.

    4. Re:Apache Benchmark is your friend by ryochiji · · Score: 1

      One thing to keep in mind when you're testing with programs like ApacheBench is that laboratory tests don't necessarily simulate real-world scenarios well.

      For example, a server hooked up to a "client" on your LAN will be able to support a hell of a lot more requests than in the real world. This is because, even if your application responds quickly, your web server process has to stay up to send the output back to the client. In a lab network, this usually takes hardly any time, while an actual modem user hitting your site might take seconds. One solution for this particular problem is to set up a lightweight proxy server, so that users connecting via slow connections will only hog whatever resources the proxy uses, instead of tying up an entire Apache process.

      Anyway, basically what I'm trying to say is, the results that AB gives you are helpful, but don't put too much faith in them.
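
      The slow-client effect is easy to put numbers on with Little's law (busy workers = arrival rate x time each request holds a worker). The request rate and timings below are made-up figures for illustration, not measurements.

```python
def workers_needed(requests_per_second, app_ms, transfer_ms):
    """Little's law: busy workers = arrival rate x time a request holds one.

    Times are in whole milliseconds so the arithmetic stays exact;
    the result is rounded up to a whole worker.
    """
    busy_ms_per_second = requests_per_second * (app_ms + transfer_ms)
    return (busy_ms_per_second + 999) // 1000

RATE = 100      # requests per second (made-up figure)
APP_MS = 50     # time the application needs to build a page (made up)

print(workers_needed(RATE, APP_MS, transfer_ms=1))     # LAN client
print(workers_needed(RATE, APP_MS, transfer_ms=3000))  # modem client
```

      With these numbers the LAN test keeps about 6 server processes busy, while the same request rate from modem users needs about 305. That gap is what the lightweight proxy is meant to absorb.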

    5. Re:Apache Benchmark is your friend by dubl-u · · Score: 1

      Yes, it's a great tool for relatively static sites. For dynamic ones, I'll generally whip together something with HttpUnit or LWP so that I can simulate a user going through a dynamic, multi-step process. They do require more horsepower to generate high load, but accurate user behavior simulation gives you a lot more confidence that your app won't fall over when the hordes come.

  16. Look at the other high load websites by GrAfFiT · · Score: 4, Informative

    Check out what those guys do at Wikipedia. Don't forget to look at their useful links at the bottom.
    Or maybe it's overkill.

    1. Re:Look at the other high load websites by StillAnonymous · · Score: 1

      That's a pretty sweet setup! It also goes to show all those naysayers that Fedora Core can and IS used in production environments with obvious success.

  17. Re:Slashdot erodes grammer skills! by Anonymous Coward · · Score: 0

    If your sites are acting fruity...I suggest you accept their newfound sexuality and let them be who they are... they say its genetic, and its likely your efforts to make them more masculine will only backfire.

  18. Dear Slashdot... by Anonymous Coward · · Score: 5, Insightful

    I'm currently working cooking in a restaurant, and have discovered that I know less than nothing about performing stomach surgery. Where does one go to learn about the techniques and tools necessary for curing stomach cancer? Or do you just say, 'Not my area!' and help them find an oncologist?

    Seriously.. you have a lot to learn, and a lot of what you need to know just comes from experience which you can't get from a book.

    First: learn how everything works. When you click a link in your "application" (why the quotes?), what happens? For instance, does it run a Controller object? If you're using a language like Ruby or Perl, is it "pre-compiled" or does it have to interpret a script on each hit? Does the controller then go to the database and populate variables, then insert them into a template, then render the template? Is the template cached? How are your database settings? Enough memory for joins? Are all your queries using the appropriate indexes? Are you familiar with your database's performance-measuring variables and tools? Are you pulling more data than you need to in each query?

    Once you have an understanding of what's happening, then you can start measuring. Where are the bottlenecks? This is a very important thing to keep in mind in programming or system architecture: DON'T OPTIMIZE UNLESS YOU NEED TO! Keep your system and code as simple as possible. For instance don't cache things in your program (making it more complicated and harder to maintain) unless you have a BENCHMARK IN HAND showing a performance bottleneck.

    You might not need to move your database to another machine. What you need to do depends on your app.

    Yes, you will need to do a lot of testing to identify your "first round" of bottlenecks. You need to build a lot of diagnostics into your app to help you identify how long different steps take.
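
    A minimal sketch of what "build diagnostics into your app" could look like, assuming a Python app (the thread never says what the app is written in). The step names, the `timed` helper and the sleeps are all made up for illustration; the point is only that each step of a request gets its own timer so the slow one stands out.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds and call counts per named step.
step_totals = defaultdict(float)
step_counts = defaultdict(int)

@contextmanager
def timed(step):
    """Record how long the enclosed block takes under the given step name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        step_totals[step] += time.perf_counter() - start
        step_counts[step] += 1

def handle_request():
    # Hypothetical request handler: the sleeps stand in for real work.
    with timed("db_query"):
        time.sleep(0.02)
    with timed("render_template"):
        time.sleep(0.005)

def report():
    """Return (step, calls, average seconds) tuples, slowest step first."""
    rows = [(s, step_counts[s], step_totals[s] / step_counts[s]) for s in step_totals]
    return sorted(rows, key=lambda r: r[2], reverse=True)

if __name__ == "__main__":
    for _ in range(5):
        handle_request()
    for step, calls, avg in report():
        print(f"{step:16s} calls={calls} avg={avg * 1000:.1f} ms")
```

    Whatever tops that report is the first candidate for tuning; everything below it can probably be left alone for now.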

    Always deploy your app in stages, one site at a time, until you start identifying some problems. Then fix those problems before continuing deployment. Never "flip a switch" and reveal any change all at once.

    Good luck!

    1. Re:Dear Slashdot... by Anonymous Coward · · Score: 0

      "DON'T OPTIMIZE UNLESS YOU NEED TO"

      and if you just happen to have time, and it turns out being a simple thing, is there really a reason not to do it? (ie: replacing double quotes with single quotes around a static string in PHP)

    2. Re:Dear Slashdot... by New+Breeze · · Score: 2, Informative

      I think there was a misunderstanding. I know how the application works in great detail. I know that it can be scaled up across multiple machines. It will scale up.

      What I don't know is how to judge what hardware to recommend to someone wanting to self host.

      I'm pretty damn sure that only comes from 1) Testing, and I'm not buying $10's of thousands in hardware to test with and 2) Experience, which I don't have.

      The last place I worked at had a rack of nice quad Xeon processor systems in their lab that they had to take back from a customer when they didn't withstand the onslaught. I'm leaning towards letting a consultant roll the dice, because I'd lose my shirt if that happened.

    3. Re:Dear Slashdot... by Dogers · · Score: 1

      Surely if you speak to the hardware vendors or resellers they could provide you with test hardware, to help encourage the sale?

      --
      I am a viral sig. Please copy me and help me spread. Thank you.
    4. Re:Dear Slashdot... by pjay_dml · · Score: 1

      This should at least be worth a try.

      It's a very competitive market, if one vendor says no, just go to the next.

      Good advice I believe.

    5. Re:Dear Slashdot... by Anonymous Coward · · Score: 0

      How can you know how it works in great detail if it's your first web application?

      Anyway if you don't know how much hardware to use (via testing for instance) and you can't work with the client "after the sale" to build out their hardware, there's not really much you can do! The consultant might not help either, if he's not familiar with the application.

      Figure out a way to test the app, you might not need $10K of hardware.

    6. Re:Dear Slashdot... by Jamesday · · Score: 1

      Sorry, you do need that testing and experience. Nobody here can answer your question without that. Your application is too different from all others for anyone to make really sensible specific recommendations. However, here's a generic try:

      1. Database server with room for lots of disks and RAM, starting out with a pair of drives in RAID 1 and a gigabyte or two of RAM. Add disks and/or RAM as required.
      2. Software written to handle lots of web servers/ page builders. Start out with one and add more as needed.

      Beyond that, you're stuck unless you have some test data. It's not currently good enough to give anyone a meaningful quote but it might be good enough for a customer who understands your situation to work with you on it, if you're willing to replace money with the time to work with the customer.

      A sensible consultant should turn your business down. You're setting them up for a failure unless you're willing to do the required testing or they have the equipment and budget to do it on their own.

    7. Re:Dear Slashdot... by dubl-u · · Score: 1

      I'm pretty damn sure that only comes from 1) Testing, and I'm not buying $10's of thousands in hardware to test with and 2) Experience, which I don't have.

      Don't buy. Borrow. Most high-end hardware vendors will have some way for you to test your code on their gear, especially if that will make the difference for a big sale.

      I'm leaning towards letting a consultant roll the dice, because I'd lose my shirt if that happened.

      That's a very reasonable option.

  19. How to do it with little/no budget by multipartmixed · · Score: 5, Informative

    First off, if this is a "must succeed with no problems" project, all bets are off -- hire an experienced consultant so you have someone to blame. Also, this technique only works when you have the type of site which will *build up* to expected load -- not get turned on instantly.

    This is tough to generalize without knowing specifics, but here goes:

    1. Make sure your application can work correctly when load balanced across multiple boxes
    2. Keep webserving and DB work on different machines
    3. Make sure your application can work with another database without much work (this gives you the option to hire, say, an Oracle DBA and buy an Oracle license if MySQL can't keep up.. does it even support row-locking yet?)
    4. Have extra hardware handy, in the rack. Do NOT turn it on yet.
    5. Observe the application running; determine bottlenecks, tune
    6. If you can't tune it to perform adequately, NOW is the time to break out the extra hardware while re-evaluating the implementation.

    If you throw all your hardware at the problem at once, you get very little warning when the shit starts to hit the fan, and no response scenario. Do NOT make that mistake. Load, test, tune, repair, repeat.

    --

    Do daemons dream of electric sleep()?
    1. Re:How to do it with little/no budget by SoSueMe · · Score: 1
    2. Re:How to do it with little/no budget by coofercat · · Score: 2, Interesting

      In my experience (having played at being the highly paid consultant who comes in to fix stuff once you've messed it up) I'd always point the finger at the linkage between components ("components" being items in your architecture, including the people you're using to help you). In a three tier environment (a sensible approach, almost regardless of your technology), the database is often a problem. DBAs jump on that pretty quickly, so what's left? Networks are normally easily sorted, but you may still find your application idles when you expect it to be returning pages faster. Here the linkage plays a part. It's the linkage between the parts (not necessarily the connectivity though) that'll be the issue.

      Failing that, make code changes to your application. I haven't seen an application yet that didn't benefit from lots of code tweaking to make it more efficient, use the DB better, generate better SQL, less SQL or whatever.

      Either way, the OpenSTA route (or LoadRunner if you can afford it) is the only way to do testing. Setting up the tool is a job in itself, and very worth doing carefully (after all, making a virtual user overly aggressive makes it harder to meet targets, but too weak and your system doesn't do what you say it will).

      As for all the posts about redundancy, load balancing etc - all good information, and something you will need if you need something like 100% uptime. That said, I know of a bank that ran a system supporting hundreds of concurrent users with a single line of three Sun boxes (+ mainframe) - they got their uptime targets, and at a fraction of the cost of their rivals who have two of everything, and then duplicate that in two buildings (but can't run it for toffee).

    3. Re:How to do it with little/no budget by smitty45 · · Score: 1

      "MySQL can't keep up.. does it even support row-locking yet"

      It does, and has, for quite some time. Innodb.

    4. Re:How to do it with little/no budget by denpo · · Score: 1
      "must succeed with no problems" project, all bets are off -- hire an experienced consultant so you have someone to blame
      Perfect stupidity, sure it's better to blame someone when your project is dead, ran out of money and still "bugly try to work". Blindly relying on an "expert" is a bad thing, especially when the expert realises you don't understand, and don't want to understand, what he's dealing with.
      Would make a perfect new computer truism
      --
      //TODO: put sig here
    5. Re:How to do it with little/no budget by multipartmixed · · Score: 1

      > > [If this is a] "must succeed with no problems" project, all bets are
      > off -- hire an experienced consultant so you have someone to blame

      > Perfect stupidity, sure it's better to blame someone

      I disagree. The reason is subtle, yet perfectly clear [to me]. I firmly believe there is no such thing as a project which goes off without a hitch. Hence, when management insist that such a project occurs (despite objections from realists!), the very first thing which should be at the top of your list to do is to hire a scapegoat. If you can hire one who knows WTF he's doing and can really contribute, well, all the better.

      --

      Do daemons dream of electric sleep()?
  20. Interesting.... by killerasp · · Score: 2, Interesting

    This is a very interesting topic. I just started my new job, coming from an internship. There we had a web server, database server, a devbox and a log processing box for WebTrends analysis. But now at my new job I'm being introduced to high level PIX boxes, F5 load balancers, redundant web servers, transaction servers, etc. One thing I just learned the other day is that they use the F5 to handle SSL encryption/decryption instead of relying on the webservers. I never knew that was possible. But eventually I want to be able to do all that my boss does right now. Anything less is less than perfection...MUAHAHA.

  21. Two scenarios: by Jerf · · Score: 4, Insightful

    1: Gradual growth. Find bottleneck, remove it. Repeat. Make sure to start with a growable database and web site technology, but that shouldn't be too tough. Also, stay ahead of the game, always with overcapacity, both to cover for outages and for sudden growth spurts.

    2: Instant growth from 0 to thousands+: Hire someone who knows what they are doing. In the first scenario, you have the time to learn what is actually going on, which is an advantage. In this one, you don't, and the customer base is too big (i.e., $$$) to screw with.

    That basically covers it. Specific advice will vary widely based on databases and web technology deployed, so just about any other specific advice you get here is as likely to be wrong for you as right.

  22. Use a Content Delivery Network by rf600r · · Score: 1

    A CDN from the likes of Savvis (think: Digital Island) or Akamai (Buyer Beware here) will all but completely alleviate flash-crowd pain. Ask for a free trial or a trial period with no commitment and see what I mean. In fact, you're an idiot for not at least looking into this.

  23. no. 1 cause of downtime by morten+poulsen · · Score: 2, Interesting

    I run a site which peaks above 5,000 page views/second. That part is static, and runs thttpd. No problems at all.

    The other part is dynamic. It runs on Apache (load balanced, no problem) with a PostgreSQL server. If you don't need its features, "just say no"!

    It is the single part in our system that causes most problems. When your tables grow semi-large (less than 800k rows) and you do a few joins, it chooses strange - and slooooow - ways to execute your queries. Combine that with a few journalists who want to insert and update articles, and you have a sysadmin's worst nightmare.

    1. Re:no. 1 cause of downtime by Anonymous Coward · · Score: 1, Interesting

      It is the single part in our system that causes most problems. When your tables grow semi-large (less than 800k rows) and you do a few joins, it chooses strange - and slooooow - ways to execute your queries. Combine that with a few journalists who want to insert and update articles, and you have a sysadmin's worst nightmare.

      Interesting... That describes my environment (minus the load balancing, at the moment), but the problems don't exist here. I have several tables with well over 10 million records in them, joined to each other at times with no real impact on the speed of the application.

      I've got to wonder which version of Postgres you're running and how your queries are written.

      (Bah, had to post anon because I blew my last mod point on the discussion.)

    2. Re:no. 1 cause of downtime by mrjb · · Score: 1

      I built a site that was relatively heavy on calculations, dynamically showing stats on collected data. Fortunately I won't need to expect millions of users (because I'm not posting the link here :P )

      Those stats are calculated periodically by a cron job now, so that table joins are no longer needed for serving the data itself. That alone turned the site a lot lighter.

      Of course the cron job still needed to do table joins, but it too was refactored and runs in less than 80 seconds now instead of the former 40+ minutes-- possible due to the fact that the final functional specifications were now known, whilst they weren't at time of the initial build.

      Of course much more can still be done to finetune the site- replacing graphics by text equivalents, pre-generating static versions of often-requested pages, and so on. When at some point it is needed, clustering is an option, but for now the server is idle most of the time.
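
      For anyone wondering what "pre-generating static versions" looks like in practice, here is a rough Python sketch under my own assumptions (not the poster's actual code): a periodic job renders the page once and swaps it into place, and the web server then serves the file with no joins at all. The function names, the stats dictionary and the path in the comment are all hypothetical.

```python
import os
import tempfile
from html import escape

def render_stats_page(stats):
    """Build the HTML once, so serving it later needs no database joins."""
    rows = "".join(
        f"<tr><td>{escape(str(name))}</td><td>{escape(str(value))}</td></tr>"
        for name, value in sorted(stats.items())
    )
    return f"<html><body><table>{rows}</table></body></html>"

def publish(page, path):
    """Write to a temp file, then rename, so readers never see a half-written page."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(page)
    os.chmod(tmp_path, 0o644)   # mkstemp creates the file owner-only; let the web server read it
    os.replace(tmp_path, path)  # atomic when both paths are on the same filesystem

# Example cron-style use (hypothetical numbers and path):
#   publish(render_stats_page({"visits_today": 1234}), "/var/www/stats.html")
```

      The rename step matters more than it looks: without it, a visitor hitting the page mid-write can get a truncated file.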

      --
      Visit http://ringbreak.dnd.utwente.nl/~mrjb/growingbettersoftware to download your free copy of the book
  24. Well... by katzman_NJ · · Score: 1

    I think that the most important thing is to first have a site and worry as it grows... :)

    --
    http://www.terratoday.com - Environmental news, discussions & more!
  25. LAMP + mod_ssl + mod_perl by Anonymous Coward · · Score: 0

    By the way, how do you people install and configure a server with Apache + MySQL + PHP + mod_ssl + mod_perl for a server with a lot of bandwidth?

    And what do you do to keep that server secure? I mean, how do you protect the server so that users can't get out of their homedir using php/perl scripts? Does a chroot solve the problem?

  26. Depends by JediTrainer · · Score: 3, Insightful

    What constitutes 'high traffic' for you?

    I've been developing a high traffic site (well, maybe medium traffic) at about 1.5 million transactions per month. We have customers using the site all over North America, plus a few in Europe and Asia, and the whole thing is hosted internally off of our 10MB link.

    We have each 'tier' clustered as a pair of servers - 1Ghz/256M is more than sufficient for our 2 Apache servers. 3Ghz/1GB is our Tomcat tier, and I'm not sure what the DB runs on, but they're the beefiest servers of all the tiers.

    Within the app architecture, try to ensure that you can scale to more servers. We have the ability to add more servers to any of the above tiers without any changes, plus any long-running processes (complicated reports and such) get dispatched to a fourth layer of servers we call 'backend' (by RMI). These 'backend' servers can be low-end (300mhz/256M are fine), because they run non-time-critical tasks and generally might email their results or whatever.

    In this way, we've avoided the EJB complications while also having full redundancy at every level. There was some custom framework involved, but it's been working well. Our application was complex enough to warrant an advanced framework (similar to Struts, except we wrote ours before Struts came out), yet EJB seemed too heavy for what we wanted to accomplish. Of course it didn't hurt that the only thing we paid licenses for was the DB.

    Importantly, though, this was the right solution *for us*. It's serving us well, and already scaling well beyond the number of customers we originally anticipated would be using it. While this meets our needs fairly well, it may or may not be the right type of solution for what you're looking for, particularly because I don't know what your application is supposed to do.

    --

    You can accomplish anything you set your mind to. The impossible just takes a little longer.
    1. Re:Depends by xanthan · · Score: 1

      This is a great point. One note to add -- when looking at your monthly logs, don't get too amazed at yourself for having 1 million hits. Take a moment and break it down into per-second numbers. You'll be surprised... 1 million hits/mo assuming a normal distribution is only about 1 hit every 2-3 seconds. Not exactly a big load unless each hit generates a huge complex page that sucks down your bandwidth. Figuring a normal 8-5 day M-F, that goes up to 2 hits/sec.

      Suddenly that 1 million hits/mo isn't so daunting.
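
      The arithmetic is easy to check. A quick Python version, assuming a 30-day month and about 22 working days (my assumptions, the post doesn't state them):

```python
HITS_PER_MONTH = 1_000_000
DAYS_PER_MONTH = 30          # assumption: a 30-day month
WORKDAYS_PER_MONTH = 22      # assumption: roughly 22 weekdays
WORK_HOURS_PER_DAY = 9       # 8am to 5pm

def hits_per_second(hits, seconds):
    """Average request rate if the hits are spread evenly over the period."""
    return hits / seconds

all_day = hits_per_second(HITS_PER_MONTH, DAYS_PER_MONTH * 24 * 3600)
work_hours = hits_per_second(
    HITS_PER_MONTH, WORKDAYS_PER_MONTH * WORK_HOURS_PER_DAY * 3600
)

print(f"spread over the whole month: {all_day:.2f} hits/sec (one every {1 / all_day:.1f} s)")
print(f"spread over 8-5, Mon-Fri:    {work_hours:.2f} hits/sec")
```

      That works out to roughly 0.39 hits/sec around the clock and about 1.4 hits/sec if everything lands in business hours, so "2 hits/sec" is a fair rounded-up figure. These are averages only; real peaks will be higher.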

    2. Re:Depends by JediTrainer · · Score: 1

      You'll be surprised... 1 million hits/mo assuming a normal distribution is only about 1 hit every 2-3 seconds. Not exactly a big load unless each hit generates a huge complex page that sucks down your bandwidth. Figuring a normal 8-5 day M-F, that goes up to 2 hits/sec.

      That's right - I know it's not terribly bad. But during the workday, there's some serious peaks and valleys when it comes to our traffic. The numbers you quote assume an even distribution of hits over the 9-5 (actually we normally see traffic 8-8, given the 9-5 workday in 5 time zones).

      I've noted that our peak traffic is usually in the hour starting at 11am EST. It is during this time that we are processing up to 10 transactions/sec, and this is exactly when I've concentrated on monitoring how our servers deal with this traffic. I've been pleased that the servers handle our peak load beautifully.

      --

      You can accomplish anything you set your mind to. The impossible just takes a little longer.
    3. Re:Depends by xanthan · · Score: 1

      Agreed. Peaks and valleys are part of the deal. Pre-IM, I once monitored mail traffic across the course of a day. I noted spikes at around 8:30-9:30a (morning mail glut), 11:45-12:15 (what are you doing for lunch?), and of course, 4:45-5:00 amongst students (hey, what's up for tonight?). The same concept holds true for web apps.

      Going back to what you originally said though, spikes and valleys are all relative. A spike for google.com is very different than a spike for my little web server...

  27. next on ask /. by Anonymous Coward · · Score: 0

    dear /.:

    please do my job for me! thanks!

    AC

  28. A few basic things... by Em+Ellel · · Score: 5, Informative

    These are some very basic thoughts on the subject. They may not be 100% right for you, but will get you thinking in the right way:

    Rule 1 - Three tier architecture is popular for a reason - it works. Offload user interface (web) to dedicated boxes, make application itself run on separate boxes and make database separate

    Rule 2 - When possible, scale horizontally not vertically. Make sure your application is as stateless as possible and is capable of you just dropping in an extra server when needed without a lot of reconfiguring. Make sure you can survive a loss of a server without loss of data. Lots of cheap servers will most always work out better (and cheaper) than one big ass box.

    Rule 3 - Make as much of your application as static as possible. Even pseudostatic data (something that updates every minute or so) should be made static and have a process re-generating it every minute or so. Not wasting your CPU time to render a menu or something on every hit will add up fast under heavy stress.

    Rule 4 - Strip your HTML. For example, some crappier web languages (think ColdFusion) have a tendency of inserting spaces for every line of code etc. A large application running CF (don't ask) would insert enough spaces to make a simple page hundreds of kb in size. Just turning on "the write to output only on demand" option will drop size of the page to next to nothing. So know what it is that you are producing on output and make sure it is lean. Turning on server side compression solves this better, however adds to CPU requirements. On truly stateless web servers this just means you need more web servers. So MAKE YOUR WEB SERVERS STATELESS.

    Rule 5 - Know how many users your upstream connection can handle (in simplest terms - average size of HTML communication * number of users) and make sure you do not exceed it. Limit your connectivity at the load balancer. Having some users not be able to access your site is better than having ALL users not be able to access your site. Make sure you get plenty of bandwidth to spare. If you are setting up a multi-site presence, make sure your intersite communication (a) is not going over the same line as incoming traffic and (b) has sufficient bandwidth and low enough latency to serve the traffic.

    Rule 6 - Professional load testing tools cost big bucks. But if you are careful you can fake it with some open source software. Google it. When testing remember to take into consideration the limitation of your tester system and bandwidth.
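
    To put rough numbers on Rule 5, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function names, the 70% usable fraction (headroom for protocol overhead and spikes), and the example link and page sizes. Measure your own average response size before trusting any result.

```python
def max_requests_per_second(link_bits_per_second, avg_response_bytes, usable_fraction=0.7):
    """Requests/sec the link can carry, leaving headroom for overhead and spikes."""
    usable_bits = link_bits_per_second * usable_fraction
    return usable_bits / (avg_response_bytes * 8)

def max_concurrent_users(link_bits_per_second, avg_response_bytes,
                         seconds_between_clicks, usable_fraction=0.7):
    """Users the link supports if each one requests a page every seconds_between_clicks."""
    rate = max_requests_per_second(link_bits_per_second, avg_response_bytes, usable_fraction)
    return rate * seconds_between_clicks

if __name__ == "__main__":
    # Hypothetical example: 10 Mbit/s uplink, 50 KB average response, one click per 20 s.
    print(max_requests_per_second(10_000_000, 50_000))
    print(max_concurrent_users(10_000_000, 50_000, 20))
```

    With those made-up inputs the link tops out at about 17.5 responses per second, or about 350 users clicking once every 20 seconds. That ceiling is the number to cap connections at on the load balancer.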

    -Em

    --
    RelevantElephants: A Somatic WebComic...
    1. Re:A few basic things... by Anonymous Coward · · Score: 0

      Good advice

    2. Re:A few basic things... by user32.ExitWindowsEx · · Score: 1

      In regards to rule 6, no, they don't...the guy just needs to post a follow-up here and slashdot will do the rest

      --
      "Evil will always triumph because good is dumb." -- Dark Helmet
    3. Re:A few basic things... by CerebusUS · · Score: 1

      The parent is probably the best answer I've read so far, but I thought I could add a few things.

      If your app relies on heavy database usage, you either need to invest the time to make yourself a decent SQL coder / administrator or invest the money to hire one for awhile. Having a farm of webservers capable of handling a million hits an hour doesn't do any good if the application locks the table each of those hits is trying to read. This is an oversimplification, but a good SQL admin will be able to watch your database as you loadtest and tell you where the hotspots are going to be.

      Regarding rule 6 - Load testing is not something you do after your app is done and you want to find out how many people can access it at the same time. That's a common misconception that my boss certainly shares. Load testing is an iterative process. It should be performed throughout the life of the coding project. Each time you do it, you're going to hopefully find a new bottleneck, which you must either decide to live with or re-engineer. For products, look at OpenSTA (free, but you need a really large number of machines to generate the test traffic) or Red Gate Software's ANTS (not free, but much cheaper than the alternatives, such as Mercury)
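
      If you want to see the shape of a load test before picking a tool, here is a toy harness in Python. It is not how OpenSTA or ANTS work internally; it is a sketch assuming threads as "virtual users" and a stand-in `fake_request` function that you would replace with a real client call. All names are made up.

```python
import math
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request():
    """Stand-in for one HTTP request; swap in a real client call when testing a site."""
    time.sleep(0.01)

def run_load(target, virtual_users, requests_per_user):
    """Call target from several threads and return every request's latency in seconds."""
    def one_user(_):
        latencies = []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            target()
            latencies.append(time.perf_counter() - start)
        return latencies

    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        per_user = list(pool.map(one_user, range(virtual_users)))
    return [latency for user in per_user for latency in user]

def summarize(latencies):
    """Median and nearest-rank 95th percentile; the tail is what users complain about."""
    ordered = sorted(latencies)
    p95_index = max(0, math.ceil(len(ordered) * 0.95) - 1)
    return {"count": len(ordered),
            "median": statistics.median(ordered),
            "p95": ordered[p95_index]}

if __name__ == "__main__":
    print(summarize(run_load(fake_request, virtual_users=10, requests_per_user=20)))
```

      Run it after each significant change and keep the summaries; the iteration is the point, and a single run at the end tells you very little.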

  29. Mischevious piggybacking by Silas · · Score: 1

    Hire a consultant to do it right from the start.

    If you can't, this is sort of a mischievous way of doing it, but one that can work well in a pinch. Get your basic requirements down in writing (bandwidth, OS and app software, server requirements, disk space, backup scheme, etc.) and then contact one of the high-end services like Rackspace to ask for a proposal for their services based on those requirements. In the resulting conversations, you'll learn a lot about what kind of infrastructure is "standard" (on the high end anyway), what kinds of costs could be involved, and you'll get what is practically a checklist for what you should consider setting up on your own. Whether or not you use the high-end service or go off and set up your own homegrown setup, you'll find yourself a lot more educated. (And then, of course, you should find ways to shuffle some paying business to the poor sales person whose time you wasted.)

    Silas

    1. Re:Mischevious piggybacking by Anonymous Coward · · Score: 0

      Ask Slashdot -> Ask Rackspace...
      Why pay?

  30. Re:weekend gmail invites by Sique · · Score: 1

    You shouldn't have clicked on the link, just copied the URL into your browser. Sheesh. Even the simplest gags still work around here.

    --
    .sig: Sique *sigh*
  31. RE: Using F5's to encrypt data by Em+Ellel · · Score: 2, Informative

    It may or may not be a great idea depending on your situation. For one - the cost of SSL card for F5 is so high, it may be easier to just get extra servers. For another, I work with some banking applications and having data sent cleartext, even on an inside network directly connected to load balancers is NOT a valid option.

    However if local security can be ignored and you have the money to spend, F5's offer a nice offload of encryption processing. But then again, so do hardware cards for individual servers.

    -Em

    --
    RelevantElephants: A Somatic WebComic...
  32. maybe.. by Anonymous Coward · · Score: 0

    Hire a consultant this time, work with them closely if possible, learn how to do it for next time.

  33. high traffic system by psin+psycle · · Score: 2, Interesting

    I've worked on a very high traffic system. At one point we were pushing 100MBPS in traffic. I had about 15 servers, 1 database server, and a load balancer. The traffic was mostly static html pages, with a bit of php/mysql for about 1/10th of the traffic.

    We had a master database server that was distributed to all the webservers. When reading from the database, each webserver would read its own local copy. MySQL replication kept the data on the local webservers fresh.

    Updates to the database were easy as only a small number of users were doing any updates. All updates were able to go through one server and wrote directly to the master database.

    The load balancer was managed by the hosting company. It simply made sure that all the webservers shared the traffic load. Any webserver that died for whatever reason would automatically stop getting traffic sent to it.
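
    The read-local, write-to-master split described above can be sketched in a few lines of Python. This is only my illustration of the idea, not the code that site ran (that was PHP/MySQL): `ReplicatedDatabase` and `RecordingConnection` are hypothetical names, and the verb check is deliberately naive.

```python
class ReplicatedDatabase:
    """Route writes to the master and reads to this web server's local replica."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE")

    def __init__(self, master, local_replica):
        # Both arguments are hypothetical connection objects exposing execute(sql, params).
        self.master = master
        self.local_replica = local_replica

    def is_write(self, sql):
        """Naive check on the first keyword; real SQL needs something sturdier."""
        return sql.lstrip().upper().startswith(self.WRITE_VERBS)

    def execute(self, sql, params=()):
        target = self.master if self.is_write(sql) else self.local_replica
        return target.execute(sql, params)

class RecordingConnection:
    """Fake connection that records statements, so routing can be shown without a server."""

    def __init__(self, name):
        self.name = name
        self.statements = []

    def execute(self, sql, params=()):
        self.statements.append(sql)
        return self.name

if __name__ == "__main__":
    db = ReplicatedDatabase(RecordingConnection("master"), RecordingConnection("replica"))
    print(db.execute("SELECT title FROM articles"))
    print(db.execute("UPDATE articles SET title = ? WHERE id = ?", ("New", 1)))
```

    One caveat with any setup like this: replication is not instant, so a user who just saved something may briefly read the old version from a replica.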

    --
    Need a website host? Try out http://WebQualityHost.net
    1. Re:high traffic system by smitty45 · · Score: 1

      This setup certainly can go some distance, but at some point, replication can become too much for each read slave.

  34. Re:ahhh ask slashdot... by StillAnonymous · · Score: 2, Informative

    It's also a good opportunity for people to learn from other's experiences. Christ, man, I don't see why people have to hoard their knowledge. What kind of example does that set?

  35. Re:The obvious answer is: by gal1264 · · Score: 2, Interesting

    Everyone has to start somewhere right?

    What's your background? There's lots of different ways to solve every problem. I think it's much more of an assessment of what kind of problems you're good at solving. If you think you can conceptualize what your system needs to do, and evaluate different components objectively, do it.

    Coming from someone who's implemented some massive testing infrastructures and custom tools, worked on computational biology frameworks, and is currently working on fault tolerant scalable SIP based telephony systems and protocol development: it's really just like any other massive project. Go incrementally and solve one problem at a time. If you're good with databases and know where they excel, do it; otherwise use data structures. If you are strong with Perl and Apache, base it on Linux (perhaps with MySQL); otherwise go to a bookstore, pick up books on a couple of easy components and stick with what you're good at. I personally also recommend actually getting maintenance on open source products you're not incredibly familiar with, as a little help goes a long way.

    So anyway, again, above all, go with what you're good at. If you give some more details perhaps people can make some more concrete recommendations.

  36. Test and define your usage by DeBaas · · Score: 2, Informative

    There is a reason why this is a specialty. There isn't a clear answer.

    The answer depends on many factors such as:
    - how heavy are the pages (many pictures?)
    - what's the platform (Lamp/J2EE/etc....)
    - how is the usage? If someone gives you a figure for concurrent users, ask yourself what they mean by that. Some apps have users constantly submitting, others once in a few minutes
    - how are they connected? Reverse proxy can really help for slow connections!
    - if you have performance problems, investigate where the pain really is. Is it the (R)DBMS, or the app server, memory IO.
    - etc. etc.

    Most of all: test! Get something like Grinder or OpenSTA and put some serious load and stress on the setup. See where it hurts.
    Make sure that if you have a problem, you actually fix the right problem. It is ok to add hardware, but you have to know what hardware to get.
    Also many problems can be handled by configuration, such as preventing the system from coming to a crashing halt by limiting the number of connections to what you can handle.
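
    In most stacks that limit is a config setting on the web server or load balancer, but the idea behind it fits in a few lines. A Python sketch under my own assumptions (the class and function names are invented, and the 503-style tuple stands in for a real HTTP response):

```python
import threading

class ConnectionLimiter:
    """Admit at most max_active requests at once; turn the rest away immediately."""

    def __init__(self, max_active):
        self._slots = threading.BoundedSemaphore(max_active)

    def try_enter(self):
        # Non-blocking: returns False instead of queueing when every slot is taken.
        return self._slots.acquire(blocking=False)

    def leave(self):
        self._slots.release()

def handle(limiter, work):
    """Run work() if a slot is free, else return a 503-style refusal."""
    if not limiter.try_enter():
        return 503, "busy, try again shortly"
    try:
        return 200, work()
    finally:
        limiter.leave()

if __name__ == "__main__":
    limiter = ConnectionLimiter(max_active=100)  # hypothetical limit from load testing
    print(handle(limiter, lambda: "page body"))
```

    Refusing quickly keeps the requests you did accept fast, instead of letting everything slow to a crawl together.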

    Look over here: Perl strategy doc. It has some good advice that will also help you in non-Perl environments.

    --
    ---
  37. (File throughput) != (database connectivity) by turnstyle · · Score: 3, Informative
    Popular porn sites likely do lots of bandwidth, but that doesn't necessarily mean lots of database hits.

    Accommodating "high traffic" that is mostly bandwidth intensive is quite a different problem than accommodating traffic that is database intensive.

    --
    Here's what I do: Bitty Browser & Andromeda
    1. Re:(File throughput) != (database connectivity) by LiquidCoooled · · Score: 2

      Don't most porn sites now run database backends to manage the linking and hit rates?
      Those that *ahem* list a "todays best" area at the top of the page, followed by the daily links are certainly DB driven.

      There's management required for referrer chains and user management.
      That's without even getting to the individual gallery/image/movie pages that are decorated with links and adverts depending on where they come from.
      I've even seen sites now with a blogger style entrance and backend tpg style archives.

      They are most certainly fully blown (pun intended?) Content Management Systems, and not just high bandwidth static servers.

      If all you ever see is static porn pages, you're looking in the wrong place....

      --
      liqbase :: faster than paper
    2. Re:(File throughput) != (database connectivity) by Anonymous Coward · · Score: 1, Funny
      Popular porn sites likely do lots of bandwidth, but that doesn't necessarily mean lots of database hits.
      Agreed. Many of the ones I visit use straight HTML pages rather than database driven pages, which helps lower server impact.
    3. Re:(File throughput) != (database connectivity) by Anonymous Coward · · Score: 0

      another person fooled by mod_rewrite!

  38. Profiling/benchmarking by Anonymous Coward · · Score: 1, Informative

    There's no shortcut or substitute for good profiling and benchmarking of your application. If you're doing anything mission critical, SWAG flat out isn't going to cut it. You need to PROFILE to figure out what resources your app uses so you can (a) tune and (b) allocate appropriately. For instance, if your app is making a lot of database queries you can look at ways to cut those down (such as caching responses where possible). And you know you'll need fairly beefy database servers (or conversely, that you can get away hosting the database on the same box handling the web front end).

    BENCHMARKING allows you to size the hardware appropriately. This needs to be done scientifically. Set up whatever architecture - benchmark - if it doesn't meet the expected load, plus reasonable headroom for future growth, plus reasonable slop for load spikes, you can use your profiling results to help spot bottlenecks. Conversely, if you're getting "too good" performance (e.g. some servers are staying idle) you'll know where you can safely cut. The key here is to handle it scientifically. Measure, vary only one variable, then measure again. Rinse. Repeat.
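
    Once you have a measured per-server figure, the sizing itself is simple arithmetic. A rough Python sketch, where the growth factor, spike factor and spare count are placeholder assumptions you would replace with your own numbers, and `servers_needed` is a made-up name:

```python
import math

def servers_needed(expected_peak_rps, measured_rps_per_server,
                   growth_factor=1.5, spike_factor=1.3, spares=1):
    """Servers to buy: benchmarked capacity vs. expected load plus headroom and slop."""
    required_rps = expected_peak_rps * growth_factor * spike_factor
    return math.ceil(required_rps / measured_rps_per_server) + spares

if __name__ == "__main__":
    # Hypothetical: 10 requests/sec at peak, each server benchmarked at 4 requests/sec.
    print(servers_needed(expected_peak_rps=10, measured_rps_per_server=4))
```

    With those example inputs it comes to 6 boxes (5 for load, 1 spare). The formula is only as good as the benchmark feeding it, and it assumes the tier scales linearly, which a shared database often does not.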

    Even if your company goes with a consultant, you need to be deeply involved in the process. Web application performance is too deeply tied to the application (duh) to allow independent evaluation of the infrastructure requirements. You need a top-to-bottom approach, from application architecture, to implementation, to deployment, to get it right. All steps of the process are going to impact performance.
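
    A minimal sketch of the "measure, vary one variable, measure again" loop, in Python. Everything here is an assumption for illustration: `handle_request` is a hypothetical stand-in for one request to your app, and the worker counts are arbitrary. It is not a substitute for a real load-testing tool, just the shape of the experiment.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request() -> None:
    """Hypothetical stand-in for one request; replace with a real call."""
    time.sleep(0.002)


def timed_call(func) -> float:
    """Return the wall-clock seconds one call to func takes."""
    start = time.perf_counter()
    func()
    return time.perf_counter() - start


def run_benchmark(func, concurrency: int, requests: int) -> dict:
    """Fire `requests` calls using `concurrency` workers and summarise latency."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: timed_call(func), range(requests)))
    elapsed = time.perf_counter() - started
    latencies.sort()
    return {
        "concurrency": concurrency,
        "requests": requests,
        "throughput_per_s": requests / elapsed,
        "median_s": statistics.median(latencies),
        "p95_s": latencies[int(0.95 * (requests - 1))],
    }


if __name__ == "__main__":
    # Vary only one variable (concurrency) between runs, then compare.
    for workers in (1, 4, 16):
        print(run_benchmark(handle_request, workers, 200))
```

    Keep the request count and the handler fixed between runs so that any change in throughput or latency can be pinned on the one thing you varied.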

  39. This is the type of question by Anonymous Coward · · Score: 0

    that inevitably brings out all the morons who will tell you to spend more money because that's all they know how to do.
    Why not just try going with less and seeing what happens. I have run several PHP-Nuke sites off of a P166 and Knoppix, yeah that's right, from the CD! no freakin' hard drive at all. And this is off a home DSL line with 128K upstream and a virtual domain.
    Now I confess these sites don't get any significant traffic most of the time, but there are times when I get a few dozen hits an hour and I've never had problems and they've literally been up for years. There is a delay when it hits the CD sometimes, but it's nothing compared to how bad most commercial sites stall while their freaking ad servers choke.
    I would think even a moderate desktop PC and a slightly faster DSL line could handle at least hundreds of simultaneous users on a halfway functional LAMP setup.

    1. Re:This is the type of question by Anonymous Coward · · Score: 0

      Give us a link or shut up!

    2. Re:This is the type of question by Anonymous Coward · · Score: 0

      A few dozen hits an hour on your home DSL line? Is this supposed to be funny? If not then you have absolutely no idea what "high traffic" means.

    3. Re:This is the type of question by deesine · · Score: 2, Insightful


      You've got to be kidding, right?!

      This guy's asking how he might setup a race car for the NASCAR circuit. And you're telling him; forget about $big block engines, forget about $super injected fuel & exhaust flow, forget about $blue-printing the motor...you can get the same performance from your Escort, just press harder on the gas pedal!

      Thanks for the laugh! LOL

      -d

      --
      damaged by dogma
  40. Cisco's docs by devitto · · Score: 1

    Cisco's "How to build a datacentre" should give some insights.

    e.g. Multi-peer BGP'd address space, feeding something like a Cisco 6509, with PIX, IDS, CSS and maybe the SSL modules. A 16-port GigE could then be used for upstream and downstream links, maybe just straight into your ~10 frontend servers, ideally caching reverse-proxies, with connections to another 6509 with GigE, which connects to your content web servers. Obviously databases etc. should also hang off this stuff.

    On the side of all this would be a terminal server (Cyclades are good) for "oh shit" access, and preferably a management network, again, using a totally different switch, and a dedicated line to your Office (preferably 2, one going east, one going west).

    Oh and don't forget power - UPS for the little 30 minute glitches, and generators for really bad times. Good aircon/dust filters and also get some FM200 to make sure that the place doesn't burn to the ground.

    After you've got all this, you should be away, but just like software, MEASURE and UNDERSTAND where the bottlenecks are (leased lines, network, firewall, CPU, memory, BUS, DISK, Web server, database etc.etc.) and know what you can do to get 50% more out of your current solution.

    Enjoy.
    Dom De Vitto

  41. do some math by Sai+Babu · · Score: 1


    Individual building blocks and interconnects are easy to evaluate and once you've done them all you'll have a good idea of the sort of performance to expect. It takes an understanding of how all the pieces work, individually and together. It's more work for your brain than...

    Brute force. Build it, exercise it, see where it breaks, swap out a block, rinse, repeat.

    If you just want things that work, understanding them is the best approach. If you need to convince people with little knowledge and lots of prejudice, the brute force approach is best. Involving them in this manner is more conducive to check signing and referral work to other clueless clients and is, I suspect, the reason we see such idiocy as brute force testing when a little math would reveal.

  42. Re:The obvious answer is: by twigles · · Score: 2, Informative

    Jesus what an asshole this parent poster is. Someone asks for advice and this arrogant guy calls them incompetent for not being born with the knowledge. Someone please mod him troll; this is exactly why non-techies think we're all arrogant.

  43. Can you qualify some of this stuff? by UVABlows · · Score: 2, Insightful

    I don't really understand a lot of the stuff you said (I am not a sysadmin). For example:

    What does it mean to not scale "vertically"? When I read that, the only thing that comes to mind is to put the boxes next to each other, not on top of each other. From context I gather that horizontally means extra machines, but what does vertically mean?

    For "dropping in an extra server when needed without a lot of reconfiguring", what do you mean by "a lot of reconfiguring"? Obviously you need to get the machine, install the os, set up networking, install the web server, setup the web application, point it at the database, etc. How does the application being "stateless" help? I guess, what are some examples of state that an application can have that will make configuring an additional web server difficult?

    Concerning the pseudo static data regeneration, what if the thing that was being updated was only accessed once every half-hour on average? I am assuming then that generating the page on demand would be better?

    I don't really know what you mean by "MAKE YOUR WEB SERVERS STATELESS". I mean, they have to know if a request just came in, where the data is, what time it is etc, and that stuff gives it state. I am assuming you mean something else by stateless but I cannot figure it out.

    Thanks for the help!

    --

    <high-level position here>
    <name of stupid small company here>

    1. Re:Can you qualify some of this stuff? by Em+Ellel · · Score: 3, Informative

      What does it mean to not scale "vertically"? When I read that, the only thing that comes to mind is to put the boxes next to each other, not on top of each other. From context I gather that horizontally means extra machines, but what does vertically mean?

      Horizontal scaling - adding more machines
      Vertical scaling - adding more CPU/Memory/etc to existing machines.

      For example, a horizontally scaled application may have 20 1u 1cpu servers, a vertically scaled one has a Sun E15k heating up the room.

      For "dropping in an extra server when needed without a lot of reconfiguring", what do you mean by "a lot of reconfiguring"? Obviously you need to get the machine, install the os, set up networking, install the web server, setup the web application, point it at the database, etc. How does the application being "stateless" help? I guess, what are some examples of state that an application can have that will make configuring an additional web server difficult?

      Reconfiguring the application, not the servers. A stateless web server does not store any user state, meaning that if a user hits web server A for one request and web server B for another, the user will not know the difference. It also means that if you add another server, you do not need to worry about conflicts, sharing data, etc. Stateless servers can be taken offline or brought online without any fuss. They become a commodity appliance and if you need more, you just get more. In realistic terms this means that if you need state for the application (login, etc.) you either store the state on the client's machine in a cookie (BAD, all sorts of abuse is possible) or, better, store a temporary ID in a cookie (or in the URL) and store the state in the app server or (better) the DB. A lot of web servers and app servers offer clustering to solve the state issue. While this may or may not work, most of the time it is marketing hype that rarely lives up to expectations and adds extra load. It also violates the KISS principle (Keep It Simple Stupid) and will give you more headache than it is worth.
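
      To make the "temporary ID in a cookie, state in the DB" idea concrete, here is a rough Python sketch. The class names are made up for illustration, and `SessionStore` is just an in-memory stand-in for a shared database table; a real setup would put it on the DB tier.

```python
import secrets


class SessionStore:
    """Stand-in for a shared database table keyed by session ID."""

    def __init__(self) -> None:
        self._rows = {}

    def create(self) -> str:
        session_id = secrets.token_urlsafe(32)  # opaque, unguessable ID
        self._rows[session_id] = {}
        return session_id

    def load(self, session_id: str) -> dict:
        return self._rows[session_id]


class WebServer:
    """Holds no per-user state; everything lives in the shared store."""

    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self.store = store

    def login(self, user: str) -> str:
        session_id = self.store.create()
        self.store.load(session_id)["user"] = user
        return session_id  # this value would go to the browser as a cookie

    def whoami(self, cookie: str) -> str:
        return self.store.load(cookie)["user"]


store = SessionStore()
server_a = WebServer("A", store)
server_b = WebServer("B", store)

cookie = server_a.login("alice")  # first request lands on server A
print(server_b.whoami(cookie))    # next request lands on server B
```

      Because neither server keeps anything between requests, either one can be pulled or added without the user noticing.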

      Concerning the pseudo static data regeneration, what if the thing that was being updated was only accessed once every half-hour on average? I am assuming then that generating the page on demand would be better?

      Use your brain. The idea is to lower CPU requirement and potential risk from overloading, not just to use a cool trick. Do whatever works best.

      I don't really know what you mean by "MAKE YOUR WEB SERVERS STATELESS". I mean, they have to know if a request just came in, where the data is, what time it is etc, and that stuff gives it state. I am assuming you mean something else by stateless but I cannot figure it out.

      State implies retained state across MULTIPLE connections/hits. Most applications require state; however, state does not need to be kept on the web servers and sometimes not even on app servers.

      HTH

      -Em

      --
      RelevantElephants: A Somatic WebComic...
    2. Re:Can you qualify some of this stuff? by SuiteSisterMary · · Score: 1
      For "dropping in an extra server when needed without a lot of reconfiguring", what do you mean by "a lot of reconfiguring"? Obviously you need to get the machine, install the os, set up networking, install the web server, setup the web application, point it at the database, etc. How does the application being "stateless" help? I guess, what are some examples of state that an application can have that will make configuring an additional web server difficult?

      When you add a new server, do you need only a) make the server functional, join it to your cluster, and watch it take up slack, or b) rewrite half of your application to reflect the changes?

      Concerning the pseudo static data regeneration, what if the thing that was being updated was only accessed once every half-hour on average? I am assuming then that generating the page on demand would be better?

      Lets say you get a weather condition update from a sensor on your roof every five minutes. Does it make more sense to a) write a dynamic website which queries the equipment via SNMP every time somebody hits the page, or b) write a scheduled job that, once every five minutes, queries the sensor, then writes out a completely static HTML page which people hit?

      If the sensor automatically populates a database every five minutes, should you hit the database each and every request, or hit the database once each five minutes?
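
      Option (b) might look something like the following Python sketch. The sensor read is faked (`read_sensor` is a hypothetical placeholder for the SNMP or database query), and the scheduling itself is assumed to be handled by something like cron.

```python
import html
import os
import tempfile
from pathlib import Path


def read_sensor() -> dict:
    """Hypothetical sensor read; a real job might query SNMP or a database."""
    return {"temperature_c": 21.5, "humidity_pct": 40}


def render_page(reading: dict) -> str:
    """Build a tiny static HTML table from one reading."""
    rows = "".join(
        f"<tr><td>{html.escape(str(k))}</td><td>{html.escape(str(v))}</td></tr>"
        for k, v in reading.items()
    )
    return f"<html><body><table>{rows}</table></body></html>\n"


def publish(page: str, target: Path) -> None:
    """Write to a temp file, then rename, so readers never see a half-written page."""
    # mkstemp creates the file with owner-only permissions; a real job
    # may need os.chmod so the web server can read the result.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(page)
    os.replace(tmp_name, target)  # atomic when both paths are on one filesystem


def regenerate(target: Path) -> None:
    publish(render_page(read_sensor()), target)


if __name__ == "__main__":
    out = Path(tempfile.mkdtemp()) / "weather.html"
    regenerate(out)  # a scheduler would call this every five minutes
    print(out.read_text(encoding="utf-8"))
```

      The sensor gets queried once per interval no matter how many people hit the page, and the web server only ever serves a plain file.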

      I don't really know what you mean by "MAKE YOUR WEB SERVERS STATELESS". I mean, they have to know if a request just came in, where the data is, what time it is etc, and that stuff gives it state. I am assuming you mean something else by stateless but I cannot figure it out.

      A connection is either stateful, or stateless. In other words, does a state get maintained between transactions?

      With the web, it's stateless. If you want to have the web server remember you, you have to store something somewhere, be it a cookie on the client, a bunch of data in a form or in the URL, or whatever. But, if the state is being dealt with by the server, say, you're putting stuff into a shopping cart, so the webserver is keeping your shopping cart in memory, what happens if your next request gets directed to a different webserver? Ooops.

      --
      Vintage computer games and RPG books available. Email me if you're interested.
    3. Re:Can you qualify some of this stuff? by Hulfs · · Score: 1

      I don't really know what you mean by "MAKE YOUR WEB SERVERS STATELESS". I mean, they have to know if a request just came in, where the data is, what time it is etc, and that stuff gives it state. I am assuming you mean something else by stateless but I cannot figure it out.

      Basically, a stateful web application is able to store information about the user/application across multiple separate page loads. Usually, this is referred to as a web session (at least in the java webapp world). This is generally done by using a uniquely identified cookie, by appending a session id parameter to the URL (a GET parameter -- you'll sometimes see something like SESSIONID=somerandomnumeric in the site URL after logging in or something), or by embedding hidden fields into forms to identify the session. By tagging all requests that come from a browser with an id you can then on your server side semi-reliably build up and store information about the user's progress through your webapp.

      Avoiding stateful web transactions benefits you in a distributed environment because the web sessions generally have to be stored in memory on the current server handling a request or persisted to a data store each time information is updated to the session. Option one isn't desired in a distributed environment because you generally can't guarantee that the next request returns to the same machine the session was initiated on if you have a load balanced architecture and option two can require a fair share of processing overhead if your app depends on having to read and write the user's session often.

    4. Re:Can you qualify some of this stuff? by CerebusUS · · Score: 1

      Another note on stateless design. Be sure all your app coders understand the goal there, and understand the security risk in doing it wrong. At one of my former jobs they were working hard to make their code (an e-commerce app) stateless and ended up putting all the shopping cart details into cookies. Including the price of the items in the basket.

      A few weeks later they discovered things were being bought for a dime instead of $30. They fixed it then, but that shouldn't have made it off the design board and into code, but it was a small shop and they didn't follow any sort of standard coding procedure.

    5. Re:Can you qualify some of this stuff? by CmdrGravy · · Score: 1

      I am not qualified at all for any of this but I'll have a go at answering your questions.

      "What does it mean to not scale vertically"

      I think he means that if you need more capacity you should just be able to add another web server say, or another database server and not have to add one of each each time you need more capacity. Not 100% sure of that myself though !

      "dropping in an extra server when needed without a lot of reconfiguring"

      He means that you should just be able to image another server and stick it in there without making any changes to any of the existing servers to accommodate the new one. Your infrastructure and applications shouldn't need to really know or care much about anything in any of the tiers beyond the fact they are communicating with another tier.

      "Concerning the pseudo static data regeneration, what if the thing that was being updated was only accessed once every half-hour on average? I am assuming then that generating the page on demand would be better?"

      No, definitely not. If something is only changing every half an hour then it is much better to regenerate new static content once each half an hour than have every request you get regenerate the content itself on the fly.

      Actually I think I misunderstood your point slightly - things only accessed every half an hour wouldn't exactly be causing any kind of high load but even there it's better to use static content. You are going to have to regenerate the content anyway so you may as well just do it once at a time and place of your choosing than do it an unknown amount of times.

      "MAKE YOUR WEB SERVERS STATELESS"

      He means your webservers should just be sending off the pages to people and that's it. Once the page is sent they should not be expecting anything back from the user. This is because in a load balanced system there's no guarantee any requests from the same person are going to end up at the same web server - maybe between requests it has blown up in which case you wouldn't want to rely on anything which has just been destroyed in the fire to carry on dealing with a particular user. In practice if you require any information about where a particular user is in your ordering system or whatever make sure his client has any necessary order numbers etc and sends them back in his requests - do not keep track of them on the webserver.

    6. Re:Can you qualify some of this stuff? by UVABlows · · Score: 1

      I see now that a distinction is made between "web server" and "app server". A web server serves static content and an app server runs the servlets or php scripts or whatever generates the dynamic content?

      I gather that machines in whichever tier (web, app, or db server) is used by the state-tracking mechanism cannot be turned off while the site is running because the users whose state was stored in the machine being removed would be viewed as new visitors to the site. Is this the reason to make the web servers stateless? Wouldn't you still be able to add new servers to this tier though?

      --

      <high-level position here>
      <name of stupid small company here>

    7. Re:Can you qualify some of this stuff? by Em+Ellel · · Score: 1

      Look up "three-tier architecture".

      A quick rundown:

      Web servers act as the presentation layer, putting together HTML pages. They contain no business logic; they only know how to take a request, ask the app server for information and render this information. In addition to web servers, presentation can be IVR (phone interface), a WAP server for mobile phones, or any other user interface. Because the app server knows nothing about how to present user data, it could care less what the presentation layer is, as long as the presentation layer knows how to form a request and read the response.

      App servers know how to process data - aka business logic. They take a simple request and return data only. They have no idea how to render data in a user viewable way. Ideally app servers store no data in memory - cache sometimes, but not store.

      DB servers. App servers store all data on DB servers. These know nothing of business logic or presentation, just know how to store data and how to quickly retrieve it.

      For real info read up on "3-tier architecture"

      It sounds like you need to get a LOT more education before doing something like this. Not to discourage you but you are not going to find much more than basic pointers on Slashdot. You do need to do your own research and read up on theory and play with this yourself. The info is out there. Google is your friend.

      HTH

      -Em

      --
      RelevantElephants: A Somatic WebComic...
    8. Re:Can you qualify some of this stuff? by UVABlows · · Score: 1

      Was their app still stateless after they fixed the problem or did the solution involve the app keeping state?

      --

      <high-level position here>
      <name of stupid small company here>

    9. Re:Can you qualify some of this stuff? by UVABlows · · Score: 1

      So if a web application needs to keep state (eg someone being logged in), is there a better option than the two you presented or do you just have to pick one even though it will have a downside?

      --

      <high-level position here>
      <name of stupid small company here>

    10. Re:Can you qualify some of this stuff? by CerebusUS · · Score: 1

      They solved the problem by keeping the basket on the SQL server and passing a reference to it in a session cookie. Initially this created a bottleneck at the SQL server level, but our SQL DBA's were able to fix that (Don't ask me how, though :-)
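
      I can't show what they actually wrote, but the general shape of that kind of fix is roughly this Python sketch. The table and function names are invented for illustration, and sqlite3 stands in for the real SQL server. The point is that the cookie carries only an opaque basket reference and the price always comes from the server side.

```python
import secrets
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE catalog (sku TEXT PRIMARY KEY, price_cents INTEGER NOT NULL);
    CREATE TABLE basket_items (
        basket_id TEXT NOT NULL,
        sku TEXT NOT NULL REFERENCES catalog(sku),
        quantity INTEGER NOT NULL
    );
    INSERT INTO catalog VALUES ('widget', 3000), ('gadget', 1250);
    """
)


def new_basket() -> str:
    """Return an opaque reference suitable for a session cookie."""
    return secrets.token_urlsafe(16)


def add_item(basket_id: str, sku: str, quantity: int) -> None:
    # Only the SKU and quantity come from the client; never the price.
    db.execute(
        "INSERT INTO basket_items (basket_id, sku, quantity) VALUES (?, ?, ?)",
        (basket_id, sku, quantity),
    )


def basket_total_cents(basket_id: str) -> int:
    """Price the basket from the catalog table, not from anything in the cookie."""
    row = db.execute(
        """
        SELECT COALESCE(SUM(c.price_cents * b.quantity), 0)
        FROM basket_items AS b JOIN catalog AS c ON c.sku = b.sku
        WHERE b.basket_id = ?
        """,
        (basket_id,),
    ).fetchone()
    return row[0]


cookie_value = new_basket()
add_item(cookie_value, "widget", 2)
add_item(cookie_value, "gadget", 1)
print(basket_total_cents(cookie_value))  # 7250
```

      A customer can still edit the cookie, but the worst they can do is point at a basket that doesn't exist.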

  44. Suggestions by Facekhan · · Score: 1

    If this is just for internal users and telecommuters then you really need to get an idea of how many people will actually be using the app and then put it on a server and simulate the effects of more and more users until it starts to tax the system. Then you can calculate how many users each server can support at 40-60% load and get that many servers behind a load balancing device. If it's only a few servers you can use a router to run the load balancing or get a dedicated load balancing device to do it.
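
    The arithmetic is simple enough to sketch in a few lines of Python. All the numbers below are hypothetical: 300 locations comes from the original question, but 5 concurrent users per location, a 400-user saturation point and one spare box are assumptions you would replace with your own measurements.

```python
import math


def servers_needed(peak_users: int, users_at_full_load: int,
                   target_utilisation: float = 0.5, spares: int = 1) -> int:
    """Servers required so each runs at or below target_utilisation at peak."""
    usable_per_server = users_at_full_load * target_utilisation
    return math.ceil(peak_users / usable_per_server) + spares


# Hypothetical numbers: 300 locations x 5 concurrent users each,
# and a load test showing one server saturates at about 400 users.
print(servers_needed(peak_users=300 * 5, users_at_full_load=400))  # 9
```

    At 50% target load each box is good for 200 users, so 1500 users need 8 boxes, plus the spare.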

    I have had a great experience with Rackspace for managed servers and ServerBeach for unmanaged.
    They will hook you up. I have never had a more knowledgeable group of people on the other end of the phone trying to sell me something and later supporting it. Security question, they conference with a security guy, network question, they conference with a ccie.

  45. Oh yeah... by DogDude · · Score: 1

    Oh yeah, and a lot of these places even have biometric scanners to get into the 24/7 monitoring room and the server rooms. They have standard hardware setups that they generally use pre-ghosted installs of FreeBSD or Windows 2000. Of course, everything is RAID-5 and backed up religiously. The best in the business.

    --
    I don't respond to AC's.
    1. Re:Oh yeah... by Anonymous Coward · · Score: 0
      Oh yeah, and a lot of these places even have biometric scanners to get into the 24/7 monitoring room and the server rooms.


      What you walk up and stick your dick in a hole and if it's you, then the door will open with a pleasant moan?
    2. Re:Oh yeah... by DogDude · · Score: 1

      What you walk up and stick your dick in a hole and if it's you, then the door will open with a pleasant moan?

      There's no fucking around with this much money. It's just business.

      --
      I don't respond to AC's.
  46. Read a lot, ask a lot of questions by ToasterTester · · Score: 2, Informative

    This is one of those areas where there is no set answer. There are lots of articles on the topic, but usually on systems larger than you plan to do. Go to user groups; many in user groups are doing smaller sites, but some might be doing what you are.

    Main thing is define what you call a lot of traffic. A lot to one person isn't a lot to another.

    Then nail down your budget that will be your most defining factor.

    Then when designing use a design that is easy to scale. That way if you are off you can scale with little pain.

    Personally I would put money into the database server; they can be a real pain to scale. Design the web side as a farm even if there are only two web servers to start with. Decide how you plan to load balance. For a couple of web boxes DNS round robin will do, but for anything bigger you have to look at real load balancing options. Also, what is your SLA? That will determine how big your farm needs to be and whether to keep hot or cold spare boxes around. If it's a farm, how are you going to keep content in sync? Then power, cooling, security, and on and on. It's a lot of work, but when it's done and everyone is happy you can't wait for an even bigger project.

  47. Performance planning and scalability by punker · · Score: 2, Informative

    I work for a website that does a lot of traffic (it's a specialized industry, and no it's not pr0n). The site pushes about 10Mbit/s from 9-5 during the week through 6 webservers. There are a couple things you need to look at as far as making a site like that work.
    The first thing you should do is look at your system and determine what your resource drains are. Do you have a database? Is it read-write or read-only? What are your replication and growth options for that app? That affects your scalability at that point and similarly applies to applications like EJB or other app servers. Do you use sessions? Do you have some sort of session aggregator available so sessions could be accessible from multiple webservers? There are lots of things like this you need to find. I, for example, set up separate webservers (tux & apache) to handle static and dynamic content so that my DB connections would not be held by processes not using them.
    The next thing you need to do, is know how your system is used. You should be able to statistically break this out from your logs by looking at a small set of users during testing. I found that 60% of my hits were to one page, and I knew I had to really optimize that (someone mentioned apache bench, which can work very well for testing single pages). Also, you need to know how parts of your site use your resources. If you have a single DB server and multiple webservers, you don't want anything slowing your DB because that cascades back to your webservers. We have pretty strict performance testing guidelines whenever a part of the site is updated, and I recommend doing your performance testing as you go.
    The final thing you need to do, is have a growth plan. Do you know how to setup load balancing for your webservers? Can your DB/app servers be replicated, or do you just need to buy faster hardware as you grow? Do you know your capacity thresholds from your performance testing? If your system is going to grow, you're going to need to be able to answer these questions.
    If you make sure you've got your scalability issues known, and you don't lock yourself into something that can't grow, you should be ok. Beyond that, test for speed under load, and track how your performance changes over time. That will help you know when you need to grow your hardware. HTH.
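
    For the "know how your system is used" step, breaking hits out by page from the logs can start as small as this Python sketch. The log lines are made-up samples in Common Log Format; point the function at your own access log instead.

```python
from collections import Counter

# Hypothetical access-log lines in Common Log Format.
SAMPLE_LOG = """\
10.0.0.1 - - [01/Jan/2004:09:00:01 +0000] "GET /search HTTP/1.1" 200 5120
10.0.0.2 - - [01/Jan/2004:09:00:02 +0000] "GET /search HTTP/1.1" 200 4980
10.0.0.1 - - [01/Jan/2004:09:00:03 +0000] "GET /item/42 HTTP/1.1" 200 2048
10.0.0.3 - - [01/Jan/2004:09:00:04 +0000] "GET /search HTTP/1.1" 200 5001
10.0.0.2 - - [01/Jan/2004:09:00:05 +0000] "POST /login HTTP/1.1" 302 0
"""


def hits_by_path(log_text: str) -> Counter:
    """Count requests per path, skipping lines that do not parse."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split('"')
        if len(parts) < 3:
            continue
        request = parts[1].split()  # e.g. ['GET', '/search', 'HTTP/1.1']
        if len(request) >= 2:
            counts[request[1]] += 1
    return counts


counts = hits_by_path(SAMPLE_LOG)
total = sum(counts.values())
for path, hits in counts.most_common():
    print(f"{path:10s} {hits:3d} {100 * hits / total:5.1f}%")
```

    Whatever lands at the top of that list is the page to optimize and benchmark first.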

  48. Do a lot of testing by loco123 · · Score: 1

    I run a dynamic (auction) website with 240Mb/s peak traffic. However, I got there by 5 years of removing bottlenecks. Still running on Apache/PHP and MySQL (~60 servers).

    To start from zero, I recommend:
    1. Do a lot of testing. Try Microsoft Stress Test - a free tool to record macros on IE and replay them on several machines simultaneously, simulating 100s of clients.

    2. Redundancy. Use LVS and heartbeat for load balancing and failover. Use database replication as well.

  49. Experience is key by xrayspx · · Score: 4, Informative

    Knowledge comes from Experience, and experience comes from Doing.

    Mistakes will be made. The key is in mitigating the effects of those mistakes. Redundancy and Manageability are your two biggest buzzwords here. A good load test and utilization projections are definitely key, but no matter what you think your userbase will be, if it's a public application, you'll almost certainly be wrong. Try to prepare for the most traffic possible.

    Redundancy on every level, including switching infrastructure, is a very good plan. Any decent server sold can use multiple bonded NICs for redundancy. If possible, design your network such that if a switch fails, your network will fail over to another switch, etc.

    I would suggest going to many local datacenters and interviewing each with probing questions relating to your situation. You will find that they are all relatively equal in terms of Standard DC items:
    Diversity of route (physical entrance of cabling into the building) and redundant carriers.

    Cooling

    Power and backup gens

    The things they differ on will be the readiness of their NOC team (do you have to fill out a web-form or call a call-center in East St. Louis to get a problem fixed in San Jose, or can you just "call the NOC and someone goes to your cage"), and the monitoring/alerting they provide their customers for issues on the datacenter network. Infrastructure-wise, most DC's can provide you with Ping/Power/Pipe, but the service and SLAs are where they get points.

    Do a LOT of reading. Depending on your platform, you have many choices. Linux vendors and Microsoft both have good platforms WRT building redundant networks, provided you do your homework.

    Which brings you to manageability. Make sure that you have a deployment framework you can live with right from the start. Deploying code by hand is alright when you have 2 sites in IIS x 3 or 4 machines, but it gets hairy when you have 15 sites x 20 webservers. Make sure you can deploy web content, mid-tier apps, etc, with the "click of a button". This helps to ease the possibility of repetitive mistakes being made. Depending on the app, you may have to roll-your-own, but it's worth it.

    Scalability. Make sure you pick a DC that can grow with you. If you plan to start out with 4 1u rackmount webservers and maybe a 7u DB, plus some storage array, make sure there is "room to move" in the DC without needing to cross-connect all over their facility with a cage here and a couple cabinets on the other end. Scalability testing by your engineers would be a great plan also. During load testing, if you're planning on using 2 mid-tier servers to process "Project X" from the web-users, set up 6 or 8 and load them up with bogus traffic. See how long it takes to kill your DB server.

    Monitoring/analysis. Make sure you have a monitoring system into which you can hook custom monitors and alerts. Of your installation, those parts with the lowest levels of monitoring will be the ones most prone to breakage. Good packages here are NetCool and HP Openview. Expensive though. It's something you can probably write in-house until you need to spend the big bucks for an enterprise package.

    Look to do a lot of reading, but break it into chunks. There is (I hope) no book called "Building and Maintaining High Traffic Enterprise Networks, for dummies, vol2". Every network will be different. But if you componentize your search, you will yield great results. If you look to build your own monitoring or code deployment system, read up on WMI, read Cisco related newsgroups for network layer redundancy, etc.

    Consultant is NOT a dirty word. Make sure you hire one for the right reasons. You do not want someone to come in and "make it so". You want someone with more experience than you have to work WITH you to design a network that you understand, can maintain, and which will scale. There's an art to it, hire Chris van Allsburg, not Picasso, Dali or certainly not Poll

    1. Re:Experience is key by xrayspx · · Score: 2, Funny

      Fuck me. I'm ripping the "K" and "Y" keys from my laptop right now. I can't believe that post. I sound lie some bonehead owledge engineer.

      eep it real.

    2. Re:Experience is key by soloport · · Score: 1

      Uh... I rather enjoyed your post. No need to bash one's self over a good, knowledgeable post. Really.

    3. Re:Experience is key by h4rm0ny · · Score: 1


      Uh... I rather enjoyed your post. No need to bash one's self over a good, knowledgeable post. Really.

      eah, me too. Anone that nowledgeable is worth coping. The clearl now stuff.

      --

      Aide-toi, le Ciel t'aidera - Jeanne D'Arc.
  50. Follow the course! by internet-redstar · · Score: 1
    In Belgium (Europe) there's the Linux 'DEEP SPACE' course, of course :)

    See it at deepspace.linuxbe.com

    1. Re:Follow the course! by mec_cool · · Score: 0

      what a rip off !

  51. Money and time by hoofie · · Score: 1
    If you want to build a high-traffic web site application, you'll almost certainly need loads of money and time.

    At the company I work for, we have re-built our entire web site system and internal systems over the last couple of years. We've gone from single processor compaq server with webapp and DB on one to a load-balanced multi-application server [all dual processor] with primary and backup oracle databases. Why - because our traffic [both paying and just visiting] was expanding dramatically all the time. At least now we have loads of headroom in the system to allow a decent level of growth and we can just drop in additional servers if required.

    The only was to design this stuff is partially by planning and mainly testing. Ensure your application is lean. Ensure it will scale from one server to ten without any problems for users. Load-test the hell out of it. We used a bank of PC's running Grinder in the end and after a lot of effort, we found the major bottleneck. It required the ADDITIONAL investment of two VERY expensive XML/XLS->XML applications boxes to get round it.

    To get back to my first comment, if you are going to do it properly, it is going to cost a lot of money [and save the usual open-source-is-free comments - if you are going to need some serious database capability for example, you are going to pay some serious money]. So if your budget is insufficient, walk away from it.

  52. How beefy? by grahamsz · · Score: 1

    Throwing hardware at the problem is usually pretty cost effective - given that consultants are expensive.

    I've seen a single (fairly dated) 12 cpu sparc box serve up about 600 simultaneous connections for a cgi driven application without faltering.

    Get a system that you can ramp up and keep adding processors and ram to, and you should be able to handle the load that you are talking about with two boxes (one for FE and one for BE).

    1. Re:How beefy? by aminorex · · Score: 1

      Hell, I see a single dual p3 linux box serve up 800
      simultaneous connections for a php app every day.

      --
      -I like my women like I like my tea: green-
  53. My one tip by Anonymous Coward · · Score: 0

    Here's my one tip to save some $:

    If you have a bunch of redundant equipment, that equipment does not need a whole bunch of built-in redundancy.

  54. ipvs, LAMP by dougnaka · · Score: 1
    I'd highly recommend a LAMP setup with IP Virtual Server. My experience says apache/php/mysql (or postgresql) is a good way to scale.
    Buy 2 good load balancers with redundant power supplies, SCSI disks with hardware RAID. Depending on how much database your app needs that's where your hardest to avoid point of failure will be, look into what slashdot does for high performance, I forget the name of the software but it's a distributed caching type system, linux journal had an article about it and it looked very interesting.

    Also, find out how much load you need to support; high traffic means a lot of different things to different people. Use siege to slam your setup once you think it's good. Make charts and graphs of the data; you won't really understand the data until you try to process it into something that you can explain to someone else.

    When it comes down to it your biggest bottleneck will often be your pipe. 2 fast ipvs load balancers, 10 web servers, and 2 big database servers could easily handle more than the 1 ethernet connection your isp provides if you're hosting moderate database sites.

    Also, it's very likely your database performance is going to be the killer once you go into production. Design your schema very well, and test it extensively. Monitor all your queries, find out which ones cost the most, and optimize them. Have a test AND a stage environment similar if not identical to your production one, and USE IT!
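
    A minimal sketch of the "find which queries cost the most" step, assuming you can already pull (query, seconds) pairs out of some log. The sample queries and timings are invented for illustration and are not from any real system.

```python
from collections import defaultdict


def costliest_queries(samples, top=3):
    """Rank queries by total time spent, given (query_text, seconds) pairs."""
    totals = defaultdict(lambda: [0, 0.0])  # query -> [call count, total seconds]
    for query, seconds in samples:
        totals[query][0] += 1
        totals[query][1] += seconds
    ranked = sorted(totals.items(), key=lambda item: item[1][1], reverse=True)
    return [(query, count, total) for query, (count, total) in ranked[:top]]


# Hypothetical timings, e.g. parsed from a slow-query log.
samples = [
    ("SELECT * FROM comments WHERE story_id = ?", 0.5),
    ("SELECT * FROM users WHERE id = ?", 0.25),
    ("SELECT * FROM comments WHERE story_id = ?", 0.75),
    ("UPDATE users SET karma = ? WHERE id = ?", 0.125),
]

for query, count, total in costliest_queries(samples):
    print(f"{total:6.3f}s over {count} calls: {query}")
```

    Ranking by total time rather than per-call time is a deliberate choice here: a cheap query that runs thousands of times can cost more than one slow report.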

    Make sure you know how to use all your tools, there's nothing like trying to search through man pages while your site is down. Make sure you have redundancy in personnel, whether all on staff, or consultants. Make schedules and let people know who is responsible.

    Oh, and monitor it like crazy, from at least 2 different sites that can page you 24x7, and don't ignore your pager at 3am just cuz you're asleep.

    Also, make backups like crazy, the largest percentage of your disk and storage in general should be used by backups! Test restores of your backups on your test environment.

    One final thing then I'll go, don't be afraid of buying things on eBay. Redundancy is worth more than speed when you're in a 24x7 environment. I really like to buy 3 year old servers and fibre disk arrays there for 1/10th the cost new. These were $30k servers 3-4 years ago, now going for $200-800, they have 3 redundant power supplies, hardware RAID controllers, multiple PCI buses, quad processors, 1-4GB ram, and run great. Also the SAN market on eBay is very saturated with sellers, and you can ignore anything that's close to retail price. I've seen 10x36GB disk fibre arrays with full dual redundant power supplies and controllers for $199 buy it now, and not broken crappy ones. I've got a 10x18GB one, software RAID 0 under Linux I get 95MB/sec sustained (20GB files) reads and writes. (In Windows 2000 server and Windows XP I get about 35MB/sec doing the similar software RAID 0, this is one of many reasons you should ALREADY know not to try to use windows in a production environment on the web)

    I'm envious, I love setting stuff like that up! It's my favorite thing to do in IT!!
    Whew, have a good day.

    --
    My Linux Command of the Day site : LCOD
    1. Re:ipvs, LAMP by tf23 · · Score: 1

      look into what slashdot does for high performance, I forget the name of the software but it's a distributed caching type system, linux journal had an article about it and it looked very interesting.

      I think you are referring to memcached.

  55. ...as always, it depends... by Bob+Bitchen · · Score: 3, Insightful

    First off, I'd say you're doing this bass-ackwards. You really should have already answered all these and many other questions before ever laying fingers to keyboard.

    It depends on lots of things. Who's going to manage the self-hosted host? If they have an IT dept. maybe they can provide the hardware sizing. In any case you will first need to establish the usage patterns and then go forward from there.

    --
    http://tinyurl.com/3t236
  56. More scientific results by TheLibero · · Score: 1

    ... can be obtained by using specialized testing tools like Smartbits (just one example out of thousands of them). Basically, these devices/tools generate managed traffic. So, you can direct one of these devices to your servers/networks and start measuring points of saturation. Just spend a few minutes on google searching for testing tools and you will end up with a list of 100s of them to be used in different areas like wired/wireless, security, VoIP, etc .... --- Evil thrives, when good men do nothing!

    --
    "Evil thrives when good men do nothing"
  57. ApacheCon.com - learn from the experts by dirkx · · Score: 2, Informative
    Just hop on a plane to Las Vegas - We're having the ApacheCon (http://www.apachecon.com) this week - with at least half a dozen talks on that topic (in the httpd, java, perl and php fields). Though the more hands on oriented tutorials will already start today - :-)

    A good alternative is the book by O'Reilly - Web Performance Tuning (http://www.website-owner.com/books/servers/webtuning.asp).

    Dw.

  58. Load balancer + content differentiation by chrysalis · · Score: 5, Informative

    I have some experience with administration of web sites with very high traffic. My previous experience was with p0rn sites (lots of sites, lots of concurrent accesses). My current job is at Skyrock / Skyblog, that serves about 25 million pages every day.

    In both jobs, the infrastructure was extremely similar.

    The entry point is one (or more) load balancer.
    A load balancer will not only blindly allow you to have multiple backends. It will also accept client connections, buffer the request, get the data from already established (keepalive) sessions, buffer it, and transmit it in large chunks to the client. This, alone, really helps to reduce the number of Apache processes that are taking resources (especially memory) for nothing.

    The load balancer can also do other things, like protecting the servers against some attacks, plotting the current workload of every backend, compress HTML pages, etc.

    At my previous job, we were using Foundry Serverirons. Now, we are using Zeus ZXTM http://www.zeus.co.uk/ with great success. Although it's very expensive software, it's way cheaper than Foundries, way more configurable, way more user-friendly and we are very pleased with it so far. A single PC handles 300 Mb/s (Linux 2.6 is needed for epoll).

    The load balancer can also be configured to send the requests to this or that server according to the request.

    Thus, servers are dedicated to specific tasks.

    We have a bunch of static servers for static HTML, CSS, images, etc. They run minimal Apache servers, designed for speed, with NPTL and the worker MPM. Non-forking servers like thttpd or lighttpd are also an option. The static servers are mainly old P3 machines, with only 512 MB RAM.

    Then, we have servers for PHP. The Apache they are running is huge (our web sites need a lot of modules), the hosts are dual 3 GHz Xeon with 2 GB RAM and there are some other specific tweaks.

    Content differentiation is important. It's a waste to spawn huge Apache processes to serve static stuff, just because the same host should also be able to serve PHP. Also, tuning (esp. NFS) is very different for static and dynamic content. And as a specialized server often serves the same files, caching is more efficient.
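
    A rough sketch of what "send the request to this or that server according to the request" can look like. This is not the Zeus or Foundry rule language; the pool names and the extension list are assumptions made up for the example.

```python
import posixpath

# Hypothetical list of extensions that the light static servers should handle.
STATIC_EXTENSIONS = {".html", ".css", ".js", ".png", ".jpg", ".gif"}


def choose_pool(path):
    """Pick a backend pool name from the extension of the request path."""
    path = path.split("?", 1)[0]  # ignore the query string
    extension = posixpath.splitext(path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        return "static"
    # Assumption: everything else (.php, directory URLs, unknown types)
    # goes to the heavy PHP-enabled Apache servers.
    return "php"


for request in ["/img/logo.PNG", "/blog/view.php?id=3", "/"]:
    print(request, "->", choose_pool(request))
```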

    We run Gentoo Linux on all web servers, plus one DragonFlyBSD (mostly for testing).

    The same content differentiation is made for SQL servers. One SQL server serves one sort of thing, so that caching is efficient. Also don't forget that on x86, Linux and MySQL can hardly use more than 2 GB of RAM. So with big tables, this is really annoying. We are switching SQL servers to Transtec Opteron-based servers for that.

    On high traffic infrastructures, the I/O is often the bottleneck especially if you serve a lot of different content.

    For our blog service, we had to buy a Storagetek disk array with 56 disks (fiber channel, 15k) in RAID 10. As NFS would introduce too much delay, we directly plugged two web servers to the controller of the disk array. These web servers are the NFS servers for the PHP servers, but they also directly serve the static content.

    The access time of hard disk is really annoying. For shared data, but also for databases. We found that RAID 5 was way too slow (even with the high-end Storagetek/LSI controller) since we have about 1 write for 5 reads. So we had to switch everything to RAID 10. It really performs better, but it's obviously more expensive.
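
    A back-of-the-envelope sketch of why the write share matters. It uses the textbook small-write penalties (4 disk I/Os per write on RAID 5, 2 on RAID 10), which are an assumption here, not measurements from the Storagetek array, and it ignores controller cache and latency.

```python
def backend_ios(reads, writes, write_penalty):
    """Disk I/Os needed for a mix of front-end reads and small random writes."""
    return reads + writes * write_penalty


RAID5_PENALTY = 4   # read old data, read old parity, write data, write parity
RAID10_PENALTY = 2  # write both halves of the mirror

# The mix described above: about 1 write for every 5 reads.
raid5 = backend_ios(reads=5, writes=1, write_penalty=RAID5_PENALTY)
raid10 = backend_ios(reads=5, writes=1, write_penalty=RAID10_PENALTY)
print(raid5, raid10)  # 9 7
```

    Even with this simple count, the same spindles do noticeably more useful work on RAID 10, and the read-modify-write cycle on RAID 5 adds latency on top that this arithmetic does not show.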

    Another bottleneck was the sharing of PHP sessions between all load-balanced PHP servers. We first used a MySQL/InnoDB-based solution, but it scaled poorly. That's why I had to write specific software : Sharedance http://sharedance.pureftpd.org/

    In a high-traffic infrastructure, my hint would be to use many modest, but specialized servers over one huge mega-fast server that does everything. This is way more scalable. And easier to manage, even from a financial point of view. You can b

    --
    {{.sig}}
    1. Re:Load balancer + content differentiation by smitty45 · · Score: 1

      Did you try load balancing Innodb slaves ?

      btw, mysql on opterons is quite excellent, but don't even think of using 2.6.xx kernels on AMD64. just fyi, it was pretty awful when we tried it.

    2. Re:Load balancer + content differentiation by Jamesday · · Score: 1

      How long ago did you try it? There definitely are issues with some builds. We're fine on our master and a SCSI slave but one of the SATA slaves has regular relay log damage (easy to fix but a pain). Too early for those who don't have a pressing need for it, I think, even though it can work.

    3. Re:Load balancer + content differentiation by smitty45 · · Score: 1

      we've been trying it since March:

      here's the issue: (O_DIRECT)

      http://lkml.org/lkml/2004/10/22/19

    4. Re:Load balancer + content differentiation by Jamesday · · Score: 1

      Thanks.

  59. Lessons since '99... by xanthan · · Score: 4, Informative

    You don't mention if you're on the applications side of the world or the network, so I'll cover a little about both.

    1. If you're on the app side, make friends with the network side and vice versa. To understand web site management and acceleration, you will need to know about both parts. Making peace with the other team is crucial to a successful site.

    2. If you are on the app side, start thinking about concurrency from the start. You're going to have not 2-3 users at the same time, but more like hundreds if not thousands. This means that you can't do things like lock up tables and the like in the database. If at all possible write your application so that users don't need to come back to the same server to track their session information. Make sure each request is tracked quickly and easily. Also, differentiate your static content from the dynamic content -- you'll eventually want to cache the static content and life will be easier with static objects being served out of a known location. And please... please, please, please... make sure your app generates clean HTTP headers. Set your cache controls correctly, don't duplicate headers, don't be a smart-ass with your headers. Just use clean headers. ASSUME that there will be proxies between you and the client. ASSUME that you will not be able to control all of them.

    3. Don't forget about megaproxies. Depending on the nature of your site, you're going to have a ton of your users coming from a small handful of addresses. (e.g., AOL) While some megaproxies have fixed the issue of a single user coming out of multiple proxy servers, not all have. This means anything that you use for client IP persistence is broken.

    4. Client IP addresses... don't assume you have them. Don't assume they represent a unique user. They don't. Many load balancers/web accelerators also need to act as proxy and will replace the client IP address anyway. (Don't stress about logging -- any reasonable one will insert the client IP address in a HTTP header that you can extract like X-Forwarded-For:)

    5. Peak load on your web servers. Apache can go fast, scale, blah blah blah... my ass. It's not the web server or operating system that is going to determine your peak performance. It is your application itself. Be prepared to fess up to the reality that your application peak performance is not going to be hundreds or thousands of requests per second unless you go insane with the optimization. (e.g., write your application into the web server and embed the whole thing into the kernel, etc.) Assume you're more likely going to get a few dozen requests/sec per app server. Keep that in mind as you plan server purchases and scaling.

    6. HTTP request does not equal TCP connection. Don't assume that. With HTTP multiplexing like the stuff that Netscaler does (web accelerator), you're going to see most of your requests coming out of a small handful of TCP connections. Make sure your application supports that. Even if you don't use a web accelerator, browsers will do that too. Don't cheat and force the connection closed on every HTTP request, your web server will crap.

    7. This is related to 6, but don't forget that web connections are very short lived compared to what the original designers of TCP were thinking about. As a result, you're going to run into cases where you run out of ephemeral ports (netstat -an will show a ton of ports in TIME_WAIT) even though your machine is idle. This is why HTTP Multiplexing is important -- you don't want a lot of connection churn. Yes, you can tweak your OS settings so that TIME_WAIT expires quickly, but that isn't going to help your overall performance. (TCP connection setup/teardown is a huge burden on a HTTP request that may only span a few packets...)

    8. Look into HTTP acceleration technology from the get go. I've used several different brands and I've found Netscaler's to be the best. They are crazy fast and capable boxes that have a ton of features (like the HTTP multiplexing, SSL acceleration, HTTP compression, web
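
    For point 4 in the list above, a minimal sketch of pulling the client address out of X-Forwarded-For for logging. The function name, the plain-dict headers and the trusted-proxy set are assumptions for illustration; real frameworks normalize header names in their own ways.

```python
def client_ip(headers, peer_ip, trusted_proxies):
    """Return the best guess at the original client address.

    headers: dict of request headers (assumed to use this exact casing).
    peer_ip: address of the TCP peer, i.e. usually the load balancer.
    trusted_proxies: addresses whose X-Forwarded-For header we believe.
    """
    forwarded = headers.get("X-Forwarded-For", "")
    if peer_ip in trusted_proxies and forwarded:
        # The header is a comma-separated chain; the left-most entry is the
        # address the first proxy saw.
        return forwarded.split(",")[0].strip()
    # Anyone can send this header, so ignore it from unknown peers.
    return peer_ip


headers = {"X-Forwarded-For": "203.0.113.7, 10.0.0.5"}
print(client_ip(headers, "10.0.0.1", {"10.0.0.1"}))
```

    Even then, as points 3 and 4 say, the result is only good for logging. Behind a megaproxy it still does not identify a unique user.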

    1. Re:Lessons since '99... by smitty45 · · Score: 1

      All excellent points, and I love Netscalers as well. I will add to..."As a result, you're going to run into cases where you run out of ephemeral ports" ...but remember, this is only a port limit per *IP*. Adding more subnet IPs to the other side of the connection (like on a Netscaler) can help HUGELY to break down that limit.

      and to "6. HTTP request does not equal TCP connection. Don't assume that."

      I will add: HTTP multiplexing (Netscaler or not) will just plainly not work without Keep-Alive connections built into all of the tiers.

      I will add a number 9 and 10.

      9. Hardware load balance all of your lower tier MySQL slave read traffic, and make writes go to the master. This will bring you a very long way.
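
      Point 9 is about a hardware load balancer in front of the slaves. As a software stand-in, here is a small sketch of the same split: writes to the master, reads round-robin across slaves. The class and server names are made up, and it ignores replication lag.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # simple round-robin

    def server_for(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._slaves)
        # INSERT, UPDATE, DELETE and anything unrecognized go to the master.
        return self.master


router = ReadWriteRouter("master", ["slave1", "slave2"])
print(router.server_for("SELECT * FROM stories"))       # slave1
print(router.server_for("select * FROM comments"))      # slave2
print(router.server_for("UPDATE users SET karma = 1"))  # master
```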

    2. Re:Lessons since '99... by smitty45 · · Score: 1

      oh wait:

      10. Do #9 until replication becomes too much. Then, federate your databases and stop load balancing them. Build some smarts into your frontends so they can direct traffic to the right db, which are all masters at this point, slaves are only for backup.

    3. Re:Lessons since '99... by eric2hill · · Score: 1

      7. This is related to 6, but don't forget that web connections are very short lived compared to what the original designers of TCP were thinking about. As a result, you're going to run into cases where you run out of ephemeral ports (netstat -an will show a ton of ports in TIME_WAIT) even though your machine is idle. This is why HTTP Multiplexing is important -- you don't want a lot of connection churn. Yes, you can tweak your OS settings so that TIME_WAIT expires quickly, but that isn't going to help your overall performance. (TCP connection setup/teardown is a huge burden on a HTTP request that may only span a few packets...)

      Set up one server with 16 IP addresses and have Apache listen on all of them. Your inbound connection capabilities go up 16 fold if your server can handle the load. This gets around ports still in time_wait since you have 16x the number of IP's accepting inbound connections.

      --
      LOAD "SIG",8,1
      LOADING...
      READY.
      RUN
  60. look at siege, httperf, and autobench instead by smitty45 · · Score: 2, Informative

    better than ab is siege, which can deal with HTTP/1.1 Keep-Alives, and give more regression-style stats. it's at joedog.org.

    better than siege would be something like httperf, and autobench, which will give you some indication whether or not your client generating the requests is still healthy. autobench also allows you to run multiple instances of httperf on different machines, and then aggregate the numbers after the test.

    remember folks, there are only 65535 (minus 1024) ports that any machine can be using with one IP...that has to be considered as well, including at the load balancer layer.
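
    To put a rough number on that port limit, a sketch under stated assumptions: every closed connection parks its port in TIME_WAIT for 60 seconds (a common Linux figure, other systems differ), and all connections go from one source IP to the same destination IP and port.

```python
def max_new_connections_per_second(source_ips=1, time_wait_seconds=60):
    """Upper bound on sustained new connections before ports run out."""
    usable_ports = 65535 - 1024  # the figure quoted above
    return usable_ports * source_ips // time_wait_seconds


print(max_new_connections_per_second())               # 1075
print(max_new_connections_per_second(source_ips=16))  # 17202
```

    This is why a load balancer or test client that opens a fresh connection per request can stall at around a thousand requests per second per IP, and why keep-alive or extra source IPs raise the ceiling.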

  61. I call BS by Anonymous Coward · · Score: 0

    So at peak traffic you are getting as many hits as Yahoo. Name the domain or admit you are either lying or can't do math.

    1. Re:I call BS by smitty45 · · Score: 1

      if you think Yahoo only does 5k requests per second, then you'd be mistaken.

  62. to get background info...ask vendors by museumpeace · · Score: 1

    As others have suggested, you may already be in over your head. But even to pick a consultant, you need to have a rough idea of the options and their cost/benefit trade-offs. The large vendors: IBM, Sun, Microsoft etc and some second-tier vendors such as Netscape and BEA have overviews of the application and architecture of their products on their respective web sites...that will cost you a day of reading and give you a headache from reading conflicting claims of superiority BUT, you will know the jargon and the current technology. Reading a book or two wouldn't hurt but they tend not to be completely up to date. Also, look up SOA...the buzzword du jour in building web-delivered business services. If you have not googled already, you really ought. My first hit was a comparison of the performance of a dozen web servers with clear graphics and concise info on suggested benchmarking techniques.
    In addition to hardware [do I need RAID? etc], and OS and web server infrastructure issues, don't forget your implementation language choice...what pool of programming skills will be available to write the code? For instance, here is how Perl stacks up but you have many choices these days.
    And above all never forget "SH*T HAPPENS": how and how often and what are you backing up in case of crashes, fires etc.

    --
    SLASHDOT: news for people who can't concentrate on work or have no life at all and got tired of yelling back at the TV.
  63. Adult hosts by base_chakra · · Score: 1

    I'm not so sure I agree. Over the past three years I've done a considerable amount of work in the adult web industry (yes, actual work of the non-fun variety). The most professional hosts that allow adult sites aren't exclusively "adult hosts". Actually, many adult hosts are terribly unreliable and/or unprofessional, and few of them offer managed hosting, which is what this person needs.

    That said, there are a number of extremely reliable, professional managed hosting companies that attract a lot of adult paysite owners. Rackspace Managed Hosting in Texas is one of them; Netgroup Data Center in Denmark is another. (They don't come cheap.)

    But this person seems also to need to know how to implement the software side as well, and I can't say for sure whether even a managed hosting company is going to be able to pick up all the slack. Maybe it's time to call IBM...

  64. Agreed, load balancers are key by Anonymous Coward · · Score: 0

    Another value-add to load balancers is they let you easily swap servers in and out for maintenance. No need to modify DNS etc. I presume for most high volume services, this hardware is now standard.

  65. Different locations... by Anonymous Coward · · Score: 0

    ...for the users will give you some problems with latency.
    Depending on the user origin this can lead to a terrible experience for the users.

  66. Re:ahhh ask slashdot... by spacepimp · · Score: 1

    If you hate ask slashdot, then don't read it. We have all had to learn something in our lives, and at some point it comes from community: asking questions, reading a book, taking a training course, or testing with the help of others. Your lack of a sense of community is highly selfish, and I hope that when you have a question and need information from someone, they refuse to help you. Especially a doctor, when they try to wrench the gerbil out of your ass.

  67. Scaling a high traffic site by DFossmeister · · Score: 2, Insightful

    First, although 300 locations with a few users each may sound like a high-volume site, it is not. I don't want to burst any bubbles, but it simply is not high-traffic in today's world. I work with large e-tailing sites that get 200,000 unique visitors per hour.

    The first step is to determine the type of load you will receive. Is it call-center type traffic, where they will have dedicated staff accessing the application, or will it be more like Internet traffic that comes in waves when it feels like it? If your application fits the call-center model, then you need to know the maximum number of operator-types that will be online at any given time. If it is more like an Internet site, such as Slashdot, then you need to either project the number of sessions per hour, also called the arrival rate, or examine the web logs to find out.

    Concurrent users and arrival rates are not the same--one is the output of the other. In arrival rate mode, the number of concurrent users varies depending on the number of visitors arriving that minute, and the speed of the site. If the site slows down, which it will at a higher rate of visitors, then the sessions will take longer. If the sessions take longer, then visitors continue to come to the site and the number of concurrent users rises. Internet visitors do not know how many users are on the site and certainly won't obey any threshold that you determine.
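
    The relationship between arrival rate and concurrent users is Little's Law: average concurrency equals arrival rate times average session length. A small sketch follows; the 3-minute and 6-minute session lengths are assumptions picked for illustration, and the 200,000 per hour is the figure mentioned above.

```python
def concurrent_users(arrivals_per_hour, session_seconds):
    """Little's Law: average number of sessions in progress at once."""
    return arrivals_per_hour * session_seconds / 3600


# Same arrival rate, but the site slows down and sessions take twice as long.
print(concurrent_users(200_000, 180))  # 10000.0
print(concurrent_users(200_000, 360))  # 20000.0
```

    The arrival rate did not change between the two lines. Only the site got slower, and the concurrency doubled, which is exactly the feedback loop described above.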

    The second step is to test over the Internet, and from as many remote locations as possible. You said that there were to be 300 remote offices. Are these all in the US, or are any of them International? Testing on a local LAN does not tell you much of anything, because there is no latency and everything runs at the speed of your switch. Very few people have 100 megabit connections to the Internet, so it is not realistic to test that way. Real users have a mix of line speeds, and come from a variety of locations. It is best to test from 5 or more geographically dispersed locations, using a distribution of the line speeds that your end users will be using. If each of these 300 sites has a T1, and each site has an average of 3 users, then each user should run at 512Kbps, not 1.54Mbps.

    Lastly, perform realistic transactions on the site, don't just simply hit the home page. Real users on the site will probably start at the home page and traverse the site, doing various things. You should have an idea of what these actions will be, or you can examine the web logs to determine the top 10 paths through the site. Then write scripts for each path and run them proportionately. You also need to build in think or dwell times into each page. Real users don't go from page to page as fast as possible! They take time to fill out forms. A good load test takes into account how familiar a person is with the site and what the person's patience with the site will be. A person using an SSL connection purchasing something has more patience than someone browsing a catalog. By the same token, an operator-type person does not have any choice about whether they can use the site or not, however their productivity will be directly proportional to the speed of the site.
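
    A sketch of "run the paths proportionately, with think times" as a session planner. It only plans a session and does not send any requests. The paths, weights and think-time range are hypothetical and would come from your own web logs.

```python
import random

# Hypothetical paths through the site and their relative weights.
PATHS = {
    "browse": (["/", "/catalog", "/item"], 70),
    "search": (["/", "/search", "/item"], 20),
    "purchase": (["/", "/item", "/cart", "/checkout"], 10),
}


def plan_session(rng, min_think=2.0, max_think=15.0):
    """Pick one path in proportion to its weight and give each page a think time."""
    names = list(PATHS)
    weights = [PATHS[name][1] for name in names]
    name = rng.choices(names, weights=weights, k=1)[0]
    pages = PATHS[name][0]
    return name, [(page, rng.uniform(min_think, max_think)) for page in pages]


rng = random.Random(42)  # fixed seed so a test run can be repeated
name, steps = plan_session(rng)
for page, think in steps:
    print(f"{name}: GET {page}, then wait {think:.1f}s")
```

    A uniform think time is the simplest possible choice. Different ranges per path (longer on forms, shorter for operators who know the site) would be closer to what the paragraph above describes.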

    There are very few open source or free tools that do these things for you. Your options are to 1) wing it as best you can using the SWAG method you described, or 2) seek help. There are various Do-It-Yourself outsourced solutions, such as Test Perspective or some other total outsourced solution. The DIY method will probably get you the best value, but you are subject to your own work, and don't have anyone to blame if things go wrong.

    --
    No Not Again! Its whats for dinner.
  68. most porn companies are clueless by Anonymous Coward · · Score: 0

    Most porn companies are clueless, and can barely handle listing the files in a directory.

    They get a MAJOR SCREWING from hosting companies that charge big $$ to figure out how to handle the load.

    1. Re:most porn companies are clueless by Neil+Blender · · Score: 2, Interesting

      Most porn companies are clueless,...They get a MAJOR SCREWING from hosting companies that charge big $$ to figure out how to handle the load.

      Real porn companies don't host, they colocate. And real porn companies - real porn companies - are well advanced beyond your Slashdots and your CNN.coms. They don't push an agenda, they push what serves millions of page views without 500s or login problems or 'nothing to see here, move along' warnings. Porn is always bleeding edge on the technology front. And porn made the internet what it is today.

    2. Re:most porn companies are clueless by Anonymous Coward · · Score: 0

      Most porn companies are more concerned with cooking up more ways to shave affiliates, cross-sell customers, and slimy dialer/spyware scams to be worried about the day-to-day simpleton work of keeping webservers running.

      There's a good reason the credit card companies are cracking down hard on the porno dudes.

    3. Re:most porn companies are clueless by Neil+Blender · · Score: 1

      Most porn companies are more concerned with cooking up more ways to....

      Aye, and, at least in the past (it's been 5 years since I worked in this
      industry), they push tech to the limits to separate you from your money.

      There's a good reason the credit card companies are cracking down hard on
      the porno dudes.


      This has nothing to do with tech. Porn is about money, porn purveyors want your money, porn users don't want to pay for it. Never have I witnessed a more wretched hive of scum and villainy.

    4. Re:most porn companies are clueless by jobugeek · · Score: 1

      No, credit card companies are getting tired of Joe Asshole getting his rocks off at porn sites then calling the CC company when the bill comes claiming he didn't go there.

      --
      I'm not drunk, I just have a speech impediment. And a stomach virus. And an inner ear infection.
    5. Re:most porn companies are clueless by AndroidCat · · Score: 2, Funny

      I think their current economic model consists of getting rich off of each others' click-throughs and showing thumbnails of each others' thumbnails.

      --
      One line blog. I hear that they're called Twitters now.
    6. Re:most porn companies are clueless by AK+Marc · · Score: 3, Funny

      They get a MAJOR SCREWING from hosting companies that charge big $$ to figure out how to handle the load.

      Well, if you are in the porn industry, you should expect a major screwing. However, I would expect the porn industry to know how to handle a load.

    7. Re:most porn companies are clueless by lastmachine · · Score: 0, Funny

      How come when Joe gets his rocks off, it's Bill who comes?

    8. Re:most porn companies are clueless by winwar · · Score: 1

      "I think their current economic model consists of getting rich off of each others' click-throughs and showing thumbnails of each others' thumbnails."

      Hmmm, this seems oddly like the dot com craze and stock prices. I wonder if the same people will get screwed....

  69. biometric is all just for show by Anonymous Coward · · Score: 0

    Places like equinix put all that crap up for show. There is nothing all that great about hand scans. The weak link is the utter morons that work security in those places.
    I can tell you that more than once I've handed some african immigrant security guard dude my ID at big name datacenter and he's given me back a nametag for another company, and access to their cage!

  70. Lots of factors to consider, primarily budget. by bigtangringo · · Score: 1
    This is a very broad question. The company I work for has about 50K customers on the primary web cluster at a given time. There is some serious money invested in this though as it's our main money maker.

    Here's an overview of our stuff
    • Network core: Dual (failover) Cisco 6509 routers - Network guys tell me those two are about 120K each
    • Web servers: 28 Dell 1750's for corporate sites, operating in a half-in-half-out fashion. The 1750's are about 15k each. We use Cisco CSMs to manage the loadbalancing for these things. I don't know how much the CSMs cost but if I remember right it was over 10k
    • SQL Server(s): Eight 8-way servers at about 50K each, they operate in a primary-master failover mode. Two production, two standby, two read-only (for reporting so execs don't take down production :), and lab. These SQL servers are running MsSQL which is about 14k/proc
    • Data storage: SQL servers store their data on Netapp Filers, two production, two standby, one lab. I know we've invested WELL over 1M on these.


    Well, all that together costs about $2,336,000.
    "What's in your wallet"
    --
    Yes, I am a smart ass; it's better than the alternative.
    1. Re:Lots of factors to consider, primarily budget. by kylegordon · · Score: 1

      The 1750's are about 15k each.
      Remind me not to use your supplier... From dell.com a 1750 starts at $949...

    2. Re:Lots of factors to consider, primarily budget. by Anonymous Coward · · Score: 0

      too bad, really too bad.

      when they go out of business from spending
      too much money, I wont want to purchase their
      used gear on ebay.

      cisco should have been juniper.
      dell should have been supermicro.
      etc..

      heh. long live big spenders, so i can get
      their stuff cheap after chapter 11.

    3. Re:Lots of factors to consider, primarily budget. by bigtangringo · · Score: 1

      starts
      You must not have ever looked at enterprise class hardware. I know if I ever start a business I'll be sure to get the bargain basement equipment for the websites that pull in several thousand dollars a minute.

      --
      Yes, I am a smart ass; it's better than the alternative.
    4. Re:Lots of factors to consider, primarily budget. by Anonymous Coward · · Score: 0

      6509s are switches, not routers.

  71. Benchmarking your app by bigtangringo · · Score: 1

    Check out LoadRunner; ApacheBench is probably far too simplistic if you're doing a serious web application. http://www.wilsonmar.com/1loadrun.htm

    --
    Yes, I am a smart ass; it's better than the alternative.
  72. Why doesnt Slashdot get slashdotted by Anonymous Coward · · Score: 0

    I wonder what is the configuration for slashdot servers that they manage to stay alive while any URLs mentioned on this site go down in seconds?

  73. Partition study by Tablizer · · Score: 1

    First you need to think about how you might partition things. Slashdot can be partitioned by stories because each story forum is generally independent of the others. Thus, if traffic builds up, split the traffic across multiple servers by dividing up which story goes to which server. If you have little e-stores, then you can obviously partition by store because each little store is probably independent of the others.
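
    A rough sketch of the story-to-server split described above. The server names and the plain modulo rule are assumptions for illustration; the post does not say how the mapping should be done.

```python
# Hypothetical server names; nothing in the post names real hosts.
SERVERS = ["web1", "web2", "web3"]

def server_for_story(story_id: int, servers=SERVERS) -> str:
    """Send every request for a given story to the same server."""
    # Plain modulo keeps the mapping stable as long as the server list is fixed.
    return servers[story_id % len(servers)]

assignments = {sid: server_for_story(sid) for sid in range(6)}
```

    The catch with plain modulo is that adding a server moves most stories to a new home, so a real setup might prefer a lookup table or a consistent-hashing scheme.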

    Another thing to consider is to put the web server on one machine and the database server on another.

    The hard part is databases in which there is a lot of interaction and no clear way to divide. When you do a general key-word search on ebay, for example, you have to search across multiple "kinds" of things. In such a case it might be time to get a big-ass Oracle or DB2 system along with needed DB partitioning experts.

    But even an ebay search may be partitioned by having the search server be separate from the auction detail server(s) themselves. It just might result in a delay between item posts and the time the search server gets the info to search on. The frequency of updates between the search servers and detail servers may have to be adjusted for traffic because during peak times frequent updates may not be possible.

    Which brings up another related topic: degrading gracefully. Plan what happens when traffic gets too high. You may want to prioritize services or features such that some lower-priority features are shut down before higher priority ones if things get steamy. For example, if ebay got flooded, then they may want to disallow new items for a while. If you don't plan this, then ALL services may go down.
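
    One way to picture the prioritized shutdown described above. The feature names, priorities, and load thresholds are all made up for the example; pick your own.

```python
# Hypothetical features and priorities (lower number = more important).
FEATURES = {"bid": 1, "view_item": 1, "search": 2, "post_new_item": 3}

def enabled_features(load: float, features=FEATURES) -> set:
    """Return the features to keep switched on at a given load (0.0 to 1.0)."""
    if load < 0.7:
        max_priority = 3   # normal traffic: everything stays on
    elif load < 0.9:
        max_priority = 2   # getting steamy: drop the lowest priority tier
    else:
        max_priority = 1   # flooded: core features only
    return {name for name, prio in features.items() if prio <= max_priority}
```

    At a load of 0.8 this drops only new item posting, which matches the ebay example in the post.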

  74. Re:ahhh ask slashdot... by Anonymous Coward · · Score: 0

    who said i hate slashdot? it's just a funny quote i remembered.

  75. Tips by NerveGas · · Score: 1


    First, separate the web serving from the database server, put them on different machines.

    Second, web serving is easily (and massively) scalable. Buy a file server with a good RAID array (and backups!), then a bunch of front-end web servers. Start with round-robin DNS for load balancing. If you want, move to some LVS-based load balancers for failover, etc.

    Third, database clustering is not an easy thing to do - if your database server doesn't offer good, scalable clustering, then you just have to buy a single, beefy machine.
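
    For the round-robin DNS step in the second point, the effect on clients is roughly the rotation below. This is only a sketch of the idea, not of how a DNS server implements it, and the addresses are invented.

```python
from itertools import cycle

# Made-up private addresses standing in for the front-end web servers.
WEB_SERVERS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

def round_robin(servers):
    """Hand out servers in rotation, like rotating A records."""
    return cycle(servers)

rotation = round_robin(WEB_SERVERS)
first_four = [next(rotation) for _ in range(4)]
```

    Note that nothing here notices a dead server, which is the usual reason to move on to LVS or a hardware load balancer.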

    steve

    --
    Oh, you're not stuck, you're just unable to let go of the onion rings.
  76. coffee by Anonymous Coward · · Score: 0

    drink lots of coffee

  77. Check out Anandtech by Krashed · · Score: 1

    Anandtech has some great articles about how they started and what they used over their few years of existence. You really shouldn't start too big though. Maybe start with a decent database backend and a couple of web servers. http://www.anandtech.com/it/

  78. Learn from LiveJournal.com by smartgoldfish · · Score: 1

    Hi OP,
    you may want to read this from the creator of LiveJournal.com: http://www.danga.com/words/2004_oscon/oscon2004.pdf Good Luck!

  79. Its not that hard.. by BawbBitchen · · Score: 1

    Simple.

    Load balancers (Foundry) - NIC1 Webservers NIC2 - NFS (NetApp) for pages.

    Back to the SAME webservers as above:

    Webservers NIC3 - Load balancers (Foundry - a different set) - NIC1 DB Servers NIC2 - NFS (NetApp) for data.

    For the Webservers just get a lot of cheap 1U boxen and fill them full of RAM so the pages and NFS are cached as much as can be. Run the same image on each box (netbooting is even better - no hard drives to fail). Too much traffic on the frontend? Just add another box. If you netboot everything it makes backups so simple. One backup (Tape) box and a few netboot boxen tossed in and you are good to go.

    I did this years back with huge clusters of Sun Netra T1 boxen (1999 era) and you could not slashdot it. Load gets too high? Unpack and rack a few new boxen and netboot 'em. It is quite simple, easy to manage and very very scalable. The biggest part of this is getting good DB programmers to write the DB part of the setup. This is somewhat the same way that Hotmail ran before the MS takeover.

    If NFS is too slow for some reason, you can do the same thing with Fibre Channel SANs.

    Remember: Simple is better.

  80. Building scalable infrastructure by maokh · · Score: 1
    Yea...ive built a few... ;)

    You may really want to consider hiring a Network Engineer consultant. In particular, someone well versed in Cisco products. My load balancing product of choice is the F5 BigIP, but a Cisco CSM would work too. You can hire a "CCIE", but these guys are always way overpaid and perform just as well as a non-certified seasoned engineer.

    I am not sure how much traffic you actually are looking to handle in terms of megabits or gigabits per second, but let's just assume you need something less than 100 megs. I will also assume there is no need for geographic redundancy.

    Secure a good colocation facility that you will have physical access to. I like colos run by large internet carriers which also provide cheap 100/1000 meg ethernet connectivity to the internet. This is typically cheaper to terminate than bringing SONET out to the suburbs.

    Hire or consult a network engineer. Unless you really want to become a makeshift expert overnight in switching and routing, do not tread here by yourself.

    Think about the future. Your network will need to scale, so make sure the switching and routing can grow as you grow.

    Think about redundancy. Don't just drop in a single router and switch because "it's cheaper". Don't just bring in one circuit because "it's cheaper". But don't go too crazy on redundancy -- compare the financial impact of an outage vs. the cost of equipment. If you lose a million dollars for each outage minute during peak hours, that lousy extra $50,000 for an additional 7206 doesn't look so bad after all.
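
    The outage-versus-equipment comparison above is simple arithmetic. A minimal sketch, assuming you can estimate how many outage minutes per year the extra box would actually prevent (the function names and the second example's figures are mine, not from any vendor):

```python
def expected_outage_cost(cost_per_minute: float, outage_minutes: float) -> float:
    """Money lost to the outage minutes the extra equipment would have prevented."""
    return cost_per_minute * outage_minutes

def redundancy_pays_off(equipment_cost: float,
                        cost_per_minute: float,
                        minutes_avoided_per_year: float) -> bool:
    """True when the avoided losses exceed what the redundant gear costs."""
    return expected_outage_cost(cost_per_minute, minutes_avoided_per_year) > equipment_cost

# The post's example: $1M per outage minute against a $50,000 router.
big_site = redundancy_pays_off(50_000, 1_000_000, 1)
# A small site losing $100 per minute, with an hour of avoided downtime a year.
small_site = redundancy_pays_off(50_000, 100, 60)
```

    With the post's numbers a single avoided minute pays for the router twenty times over; for the small site the same box does not pay for itself.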

    Avoid homebrew equipment where possible. Yea, you could save thousands by not using an F5 BigIP load balancer or making your own VPN box, but when things go wrong, who's going to support it? Is it really worth it? Only purchase hardware that you have extensively tested in your lab environment. Make sure you buy that support contract. Yea, they likely make a profit off of each one, but when things go down, it's their ass, not yours!

    More technically speaking, I would throw a pair of routers on the edge. Create at least two major network segments, external and internal. Use internal non-routable addressing for your internal segment. Put a BigIP between the two segments for load balancing. You can build a "VIP" which contains many different hosts that are dynamically load balanced based on specifications. Boxes can be taken out of load balancing if they do not respond properly to specified requests, or go down. Cisco's CSM can do the same. Stay away from Local Director.
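
    The "VIP" idea above, a pool of hosts where non-responding boxes drop out of rotation, looks roughly like this. It is a toy model, not how a BigIP or CSM is configured; the class and the health-check callable are invented for the sketch.

```python
import random

class VirtualIP:
    """One public address fronting a pool of hosts; unhealthy hosts are skipped."""

    def __init__(self, hosts, check):
        self.hosts = list(hosts)
        self.check = check  # callable(host) -> bool, a stand-in for a real probe

    def healthy_hosts(self):
        return [h for h in self.hosts if self.check(h)]

    def pick(self):
        """Choose a backend for the next request from the healthy hosts only."""
        candidates = self.healthy_hosts()
        if not candidates:
            raise RuntimeError("no healthy backends in the pool")
        return random.choice(candidates)

down = {"web2"}  # pretend web2 stopped answering its health check
vip = VirtualIP(["web1", "web2", "web3"], check=lambda host: host not in down)
```

    A real balancer would probe with an actual HTTP request on a timer and would offer smarter selection than a random pick.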

    You probably don't need a firewall in your network, unless you are restricting backchannel access. Use ACLs on the router, and your load balancing layer between public and privately addressed segments will act as a natural barrier.

    If you are serving web traffic to a lot of slower clients, and want to decrease download time, check out some of the web compression products. It turned some of our 18 second page load times (over dialup) into 4 second load times...

    I am not a DBA, so I cannot advise you on how you can scale above 1 DB machine. I'm sure there is some sort of clustering facility.

    Good luck, and don't do it by yourself.

  81. Hire a consultant by nurb432 · · Score: 1

    Real networking isn't something you pick up overnight, might as well spend the money upfront and do it right.

    --
    ---- Booth was a patriot ----
  82. I thought I'd add my 2cents by ResQuad · · Score: 1

    I work at a company that's a hosted CRM. My first and most important suggestion is: MAKE SURE THE APPLICATION IS EFFICIENT. If your app does a lot of unnecessary crap in the database or whatnot, you are screwed no matter what you use.

    Past that, the suggestions for hardware load balancers are right on. You have (for example) 2 machines hosting the same application, behind a hardware LB. You have a nice SQL machine, and you are set. Add machines as needed (the SQL part gets tricky though).

  83. Diiva = Dynamic JPEGs! by Anonymous Coward · · Score: 0
    http://www.diiva.com

    Most technically advanced pr0n site I've seen.

    1. Re:Diiva = Dynamic JPEGs! by Anonymous Coward · · Score: 0

      Usenet news? what an advanced idea!

    2. Re:Diiva = Dynamic JPEGs! by Anonymous Coward · · Score: 0

      Does GoogleGroups have an RSS feed now?

  84. Large Scale Infrastructures by Floody · · Score: 2, Informative

    1. Foundry ServerIrons at the front-end layer.

    2. Front-end proxying/caching. Not just static content either, take dynamic content that need not be updated often and put it on the front-end in a fashion that does not require over-weight httpds (i.e. no mod_perl). Use session affinity tricks on the front end (such as mod_rewrite with cookies). squid for caching as necessary.

    3. Back-end heavy servers should have a maximum amount of memory, and obviously lower maxclients.

    4. NetApp storage on the back-end, scaled as needed.

    5. http://www.backhand.org/mod_log_spread/

    6. Well designed network topology and aggressive switch partitioning: hint, use vlans and minimize trunking.

  85. Start with slow machines by chaoskitty · · Score: 1

    Testing with slower machines, sometimes purposely putting slow components into the mix (10-base-T between the machines, for instance), will give you an easier way to find the bottlenecks.

    I have a colocated 400 MHz PowerPC 604ev which I use for testing which can push somewhere between 40 and 50 Mbps. It's much easier to get it to its ceiling than my other colo'd server, which is a 1.3 GHz G4 with tons of SCSI disks that can completely saturate the 100 Mbps upstream link. And when the 604ev is too fast, I also have an m68060 Amiga.

    Using a slower machine also makes the effect of optimisations much easier to measure. Having tons of L2 / L3 cache means that synthetic benchmarks sometimes won't even come close to real world performance. Same with having tons of memory and / or an intelligent disk controller, either of which can make measuring disk hits unrepresentative of reality.

    Slower machines have their uses.

  86. you might check out these dudes by Coolmoe · · Score: 1

    Seriously, dedicated service is cheap with plenty of bandwidth. I have had my site hosted with these guys for a long time and the price is right. Here's the URL: www.powerstorm.net

    Not meaning to spam but they might be able to help out for a small budget.

    --
    Got hosting
  87. Consider Akamai by sef · · Score: 1

    Disclaimer: I'm an employee of Akamai, so I'm not unbiased at all.

    Have you considered using a CDN like Akamai? They're in the business of distributing content (dynamic, static, big, small, whatever) for you so you don't have to worry about much of the complexity already mentioned on the thread: load balancers, bw provisioning, hw provisioning...

    Note that all the comments on this thread about writing a good web app still matter a lot. Akamai (or anything else) won't help you if you, for example, write it to be heavily database bound and give it bad locking semantics.

    -- Sef
    (myname at akamai.com)

  88. Consultant would be the best bet by wassy121 · · Score: 1

    Okay, this depends quite a bit on how big a "large" application is going to get. If you are talking more than 2-3 TB of traffic per month, you will want to get a consultant or a team. Plain and simple. However, if you are looking at setting up a medium-sized application server, you may want to look into a Managed Hosting environment. I define medium as 3-10 servers, doing between 500 GB and 2 TB of bandwidth a month. This may be a relatively popular website like theonion.com or FHM (fhmus.com).

    Anything smaller than that may simply require a single systems administrator. Someone with a couple years experience will easily know how to handle a 2-4 machine setup, possibly with a load balancer.

    As with others, my current job may skew my opinion, so please serve with salt.

    --
    --If I said something interesting it probably wasn't correct
  89. an answer... by drasfr · · Score: 3, Informative

    Yes, I know you asked how to find out how to set up a high traffic architecture. I think you came to the right place on Slashdot.

    Although I have never really seen much documentation online, I have set up many architectures in the past, and they are still able to handle very high volume traffic: 10s of millions of page views a day, most of them dynamic.

    It really all depends on ONE factor: money.

    I will give 2 choices, I have implemented both:
    Appropriate budget:
    Frontends/Load Balancing: We had a pair of Big/IPs with SSL accelerator, configured for redundancy, that rocks.
    Behind them, we had a clustered NetApp F840, with gigabit interfaces, on a gigabit network.
    Frontends: We were running Apache, with all the binaries, config, webpages, perl scripts located on the shared filesystem. Each machine was a dual CPU, 2GB memory, 2x36GB SCSI drive. We had 26 of them, double the capacity really needed, so if a machine or two were to go down during the night, there was no need to worry and it could wait for the next day's business hours. Great for peaks as well.

    As a database backend we had an Oracle Cluster on a SUN 6650, 14CPUs, 14GB of Memory, connected to an EMC storage. One machine was configured as the master, the other as a standby with the possibility to take down the primary and mount its filesystem directly from the SAN. Pretty much all the config was on the SAN, on different volumes, and could be mounted on either machine. Each volume had a copy and an hourly update in case of failure of the primary volume.

    Now for a more realistic scenario with low budget:
    - Load Balancer: Get 2 Linux machines, I'd suggest machines with 2GB Memory, 2x36GB Disk, 2x3GHz CPUs, with Linux Virtual Server. (http://www.linuxvirtualserver.org/)
    - Build 2 Linux machines that you would use as NFS servers (if you are short on budget you could also use them as Oracle or MySQL servers), and configure them with 2 external SCSI arrays that can be mounted on either machine. If you are really short on budget, don't use external arrays, but big enough internal drives, and for example use rsync to replicate the data between the 2 of them. (I would personally use LVM, establish a snapshot copy on the master and do an rsync of this snapshot. If you have a database on it, put it in quiet (hot-backup) mode while you do the snapshot.)
    - FrontEnds: Get a couple of machines with 2 CPUs, 2GB memory for example, 2x36GB drives. Configure them to mount the filesystem from the NFS servers.

    - Database: on a budget, use MySQL (Oracle would work too), and configure one machine as master, the other as read-only. Have all your machines query either machine for read-only requests, and go to the master only for write requests.

    If you need more power: configure more frontends, configure more read-only slave database servers. Now if you are write intensive, more than reads, on the database, then it becomes a bit more complicated.
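
    The read/write split above can be sketched as a tiny router in the application layer. This is a sketch under assumptions: the host names are invented, and a real app would make the decision in its database access layer rather than by peeking at the SQL text.

```python
from itertools import cycle

class QueryRouter:
    """Send writes to the master; spread reads over the master and read-only slaves."""

    def __init__(self, master, read_only):
        self.master = master
        # Reads may go to either machine, so the master stays in the read rotation.
        self._readers = cycle([master] + list(read_only))

    def route(self, sql: str) -> str:
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._readers)
        return self.master  # INSERT, UPDATE, DELETE and anything unknown

router = QueryRouter("db-master", ["db-ro1"])
```

    Adding more read-only slaves is then just a longer list; the write path is the part that does not scale this way.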

    if you want to know more, contact me off-list.

  90. My $0.02 by TheLastUser · · Score: 1

    In short:
    1. Keep it simple.
    2. Set up monitoring.
    3. Use a staging server.
    4. Backup data. Backup hardware. Backup staff.
    5. Never believe the traffic "estimates".

    Location:
    Use a decent colo facility. Make sure that the techies seem competent. Confirm that they have multiple network peerings, good bandwidth. Run some traceroutes from locations around the country, if possible, to get a handle on the lag. Ask them about their redundant power, their 24/7 NOC, their strategy for managing DDOS. Try and find a colo that isn't about to go out of business, an empty colo is bad, but so is a full one.

    Hardware:
    Realize that the main contributing factor to your "down time" will be your code, not for instance, a network switch. Remember this when people tell you to set up some sort of complex HA switch configuration, etc. Think more in terms of "hot standby" than "no single point of failure".

    Fully redundant hardware is expensive to buy, expensive to configure, and expensive to maintain. If you need the fault tolerance, then you should have the budget to do it right. If not, then don't blow all the cash on a switch cluster.

    Ask yourself, "What if?" What if the firewall dies? What if the Load Balancer dies? What if the Database dies? Make sure that you can recover within an acceptable amount of time. If that's a week, then maybe you just need a reliable hardware supplier; if it's 1 day, you better have the part at the office; if it's 1 hour, the part better be racked in the cage and configured.

    Software:
    Use what you know. I have seen very large sites created using lots of technologies. They can all do the job, you just have to play by their rules. I would recommend using something that a healthy number of other people are using so that you will get some improvements as time goes on; PHP, Perl, Java, and .NET seem to be popular choices. Don't get hung up on benchmarks claiming that server A is 13% faster than server B. The main thing is that the technology is easy to use and reliable.

    OS:
    Linux of course! :-) All I can say is, make sure that it's stable. Don't choose an OS that exhibits ANY stability issues. I have heard horror stories from people that used NT 4. Anything that needs a "reboot cycle" should be a big red flag. You should only have to reboot when you upgrade the kernel, etc.

    Security:
    I am no guru here, but the main point is to think in layers, firewall externally and between layers. Don't go too nuts, the site has to be usable, but add as much as you can put up with, and have time for.

    Web clusters, load balancers and all that:
    Session is the key to a transactional web site. Sessions are usually maintained via browser cookies.

    A content load balancer will stick a user to a webserver based on the cookie. So you probably want one of those. IP based sticky doesn't work all that well because some providers, AOL, send requests out using multiple IP addresses. An extra wrinkle here is SSL, unless your lb can peer inside the SSL data it won't be able to get at the cookie. So if you are using BOTH http and https on the same session you will need an lb that can peer into ssl data.
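
    Cookie-based stickiness as described above boils down to something like the following. The cookie name and server names are assumptions, and a real content load balancer does this in its own config rather than in Python.

```python
import zlib
from http.cookies import SimpleCookie

def backend_for(cookie_header, backends, cookie_name="session_id"):
    """Pick a backend from the session cookie; return None if the cookie is absent."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    if cookie_name not in jar:
        return None  # no session yet, so the balancer is free to pick any server
    session = jar[cookie_name].value
    # crc32 gives the same number for the same session on every request.
    return backends[zlib.crc32(session.encode("utf-8")) % len(backends)]

BACKENDS = ["web1", "web2", "web3"]  # hypothetical pool
choice = backend_for("session_id=abc123; theme=dark", BACKENDS)
```

    This also shows the SSL wrinkle: the function needs the plaintext Cookie header, which the balancer cannot see unless it terminates or can peer into the SSL stream.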

    Some lbs can also help out with abuse and are crossover security devices. All are routers, and most have access control lists, syn cookies, and other security features. Still, they are generally not designed to be the front line defense, but constitute another layer.

    Database:
    Don't go the Oracle RAC route unless you are going to buy more than 4 CPUs. A 4-way server is cheaper and FAR easier to set up. Maybe get a 2-way Xeon, with n+1 power and with an external RAID array. Then get a shitty single CPU machine with a big internal SCSI disk to use as a backup, in case the main db dies. The backup db can be used as a development db in the meantime.

    Sometimes its easier to split up your user group into a number of clusters than to scale one cluster to service all of the users. If the users do

  91. Re:The obvious answer is: by Anonymous Coward · · Score: 0

    Mate,

    You've lost the plot... All the guy was after was to know where to begin. So he could read up and start somewhere. All you did was say "go to a bookstore, pick up books..." Maybe he was after some titles, some links, not just a useless answer about going to the bookstore.

  92. Where do you go? by scosol · · Score: 1

    You come to me.

    --
    I browse at +5 Flamebait- moderation for all or moderation for none.
  93. Trial by fire, baby! by pyite69 · · Score: 1

    The old cliche:

    Good judgment comes from experience. Experience comes from bad judgment.

    Assuming that your application can run on one machine to start with, you will just need to make sure you have enough bandwidth and enough machine power so you don't have to worry about those details right away. Be ready to hack things to make your application scale better, that is probably the most difficult thing to prepare for. If this is a one person job, you probably want to have a hosting company that will do the hand holding necessary to keep things secure.

    Don't screw around with cheap hardware - make sure you have multiple CPUs, RAID 10, dual power supplies, and a backup machine with a copy of everything.

  94. Re: Using F5's to encrypt data by corbettw · · Score: 1

    For another, I work with some banking applications and having data sent cleartext, even on an inside network directly connected to load balancers is NOT a valid option.

    My last sysadmin job (I'm in bizdev now) was at a brokerage firm. While the solution wasn't implemented before the higher ups pillaged the company, I and the network engineer came up with a way around this issue: use the F5 SSL accelerator to encrypt/decrypt the SSL stream, then use SSH port forwarding to make sure the cleartext data was encrypted between the machines. We never got it into production, but it worked great in the lab.

    One of the nice side effects of F5 boxes being built around FreeBSD (even if the kernel has evolved so much over the years that it's a completely different beast now).

    --
    God invented whiskey so the Irish would not rule the world.
  95. Here's what we do in state government: by crazyphilman · · Score: 2, Interesting

    Our sites obviously have to serve millions of people, so they have to be pretty robust. I can't tell you every detail because we're all pretty specialized and don't get to see everything ourselves, but from working with our database guys and network guys, I do have a pretty good 10,000 foot picture of how things work. Here's a general sense of what you'll have to do to really be robust:

    1. Your database gets its own server, as powerful as you can afford. If you're a really big site, you're using Oracle, and really, a database cluster rather than a single server. IMPORTANT: Only the DBA can touch the production databases. Developers MUST submit requests to the DBA for any changes. Nobody should be touching a production database from their desktop, other than maybe being able to run queries to check data, and they use a separate, limited login for that. Changes are done by the DBA ONLY.

    2. You put a firewall between the database server and your middleware server. The firewall is a dedicated device, and you're careful about the ports you leave open. Only the middleware server and DBA workstations on your intranet can touch the database.

    3. Your middleware server(s) are as powerful as you can afford (this will be a theme here) and ONLY run middleware. This means, business rule processing. Everything that touches a database in any way MUST come through middleware -- no direct connections, ever. IMPORTANT: developers don't directly install middleware; network staff only.

    4. A firewall (again, dedicated device) between the middleware server and the web server. Only the web server (and network staff workstations on your intranet) are allowed to touch the middleware server.

    5. A set of web servers for your websites, as powerful as you can afford (hate to keep repeating this, but if you skimp you'll end up screwing yourself down the road). IMPORTANT: Developers should NEVER have access to production web servers; they should give their stuff to the networking staff when it's ready. Also, if you're doing FTP and such, put it on a separate server.

    6. A firewall outside your web server, which only permits port 80 traffic and is twice as paranoid as your other firewalls. Log everything "funny".

    In general, you'll have to hire some people: someone really good at security, to configure all your firewalls, someone good at setting up load-balancing to set up all three layers, someone to help you set up a good development environment...

    One thing lots of people overlook: You'll want a "sandbox", i.e. a dedicated set of test database, middleware server, and web server that your developers can play with when working on their sites. You'll also want to set up a UAT (User Acceptance Test) environment similar to your sandbox, so projects can be moved to UAT for testing before being rolled out to production. You can't do UAT on a sandbox; sandboxes are constantly changing. You need a stable environment for UAT.

    Anyway... Hope that helps, it's just advice, you know? Not all of it directly addresses high-volume sites, some of it is about site stability and security, but I think it all ties in together. If your site is being changed by developers, it won't be stable... And if you don't have a paranoid firewall setup, it won't be secure. A lot of webmasters would consider this layout to be (putting it politely) seriously paranoid, but hell, just because you're paranoid doesn't mean they're not out to get you. And, anyway, like I said, high volume does imply these other considerations...

    Good luck!

    --
    Farewell! It's been a fine buncha years!
    1. Re:Here's what we do in state government: by crazyphilman · · Score: 1

      I know, bad form replying to myself, but I've been reading some of the specifics in the other posts in this thread and in comparison, my general description of the layout of a robust site seems kind of, well... Generic.

      So, let me apologize for the lack of detail; I'm just a programmer, I don't know that much about individual hardware elements, model numbers, price points, things like that. I was just trying to describe what the overall layout ought to look like, for a robust, secure setup. It's what we use, FWIW.

      Having said that, I'm learning quite a bit by reading some of the other responses. Very interesting!

      --
      Farewell! It's been a fine buncha years!
  96. See how Wikipedia does it on a shoestring by gtoomey · · Score: 3, Insightful
    Look how Wikipedia organises its cluster on a shoestring budget.

    - Over 750 requests/second on 29 servers, an average of >20 requests/second each (yes, I know some are not HTTP servers). Compare that to some commercial solutions.
    - commodity hardware
    - squid for caching/load balancing, feeding Apache
    - multi-tiered architecture
    - dual Opteron for the master MySQL database

    1. Re:See how Wikipedia does it on a shoestring by Jamesday · · Score: 1

      Add Memcached for session state and storing some parsed pages. It significantly offloads the Apaches and database.
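
      The general pattern of caching parsed pages is sketched below. A plain dict stands in for a memcached client, and the render function is a placeholder; this is not MediaWiki's actual code.

```python
class PageCache:
    """Cache-aside: look in the cache first, render and store only on a miss."""

    def __init__(self, render):
        self._store = {}        # stand-in for a memcached client
        self._render = render   # the expensive parse step
        self.misses = 0

    def get(self, title: str) -> str:
        if title not in self._store:
            self.misses += 1
            self._store[title] = self._render(title)
        return self._store[title]

cache = PageCache(render=lambda title: f"<h1>{title}</h1>")
first = cache.get("Main_Page")
second = cache.get("Main_Page")  # served from the cache, no second render
```

      A real deployment also has to expire or invalidate entries when a page is edited, which this sketch leaves out.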

      For capacity: 24 machines did 1100 requests per second and were responding slower than we liked. That's close to the non-surge capacity. For surge to a few pages, the Squids would handle most of it and could go a lot higher.

      The Squids handle 70%+ of the requests, so the Apaches were dealing with about 15 requests per second. Not a lot because some of them are quite expensive in CPU power - PHP doing parsing, not an optimal choice. We do it that way because MediaWiki needs to run on a shared hosting safe mode PHP setup. For ourselves and those with greater control, we're introducing a PHP plugin which offloads much of the CPU-intensive work from PHP.

      While the hardware is commodity rackmount boxes, most of us are interested in using the home PC type of box for the raw CPU work done by the Apaches. We don't really notice the failure of a few Apache boxes (unless it happens to do something like hit most of the memcached machines). Not cost-effective until we've filled the second rack and switch to a room instead of going to a third rack.

      We use DNS round robin and many virtual IPs per box for load balancing the Squids. That's a pain when one fails and we have to manually switch IPs around, so we're thinking of switching to a pair of LVS boxes in front of them in a failover configuration. This should both help perceived site reliability when a box fails and be a bit more even in load balancing.

      The Apache load balancing does the job but it's not as even as we'd like (sensitive to both network topology and Apache speed), so we're also contemplating LVS as a load balancing layer between them and the Squids.

      What we have now has worked well enough to get us to about the top 200-250 sites range. It'd go further but we're conscious that people now depend on us being there all the time, so we're spending increasing amounts of attention and sometimes money on reliability things.

      The limits. Squids: cache miss penalty, storing the missed page to their disk cache. We're investigating more fancy disk setups than single disk SATA to try to increase the capacity per squid - may end up with lots of old and tiny disks per Squid. Or not - will depend on benchmark results. Apaches: pure CPU power and PHP. Database: full text search is by far the biggest load, in part because we're currently using a query which is less efficient than normal MySQL full text search - that'll change soon.

  97. MySQL and 8GB of RAM on X86 by Jamesday · · Score: 2, Informative
    "don't forget that on x86, Linux and MySQL can hardly use more than 2 Gb of RAM" ...unless you want to, as we do at wikipedia.org:
    top - 06:17:34 up 27 days, 18:27, 3 users, load average: 6.46, 6.64, 6.32
    Tasks: 63 total, 2 running, 61 sleeping, 0 stopped, 0 zombie
    Cpu(s): 8.3% us, 2.5% sy, 0.0% ni, 29.5% id, 58.0% wa, 0.4% hi, 1.3% si
    Mem: 8135268k total, 8098184k used, 37084k free, 59776k buffers
    Swap: 2040244k total, 1082360k used, 957884k free, 1364128k cached

    PID USER PR NI nFLT VIRT SWAP RES SHR S %CPU %MEM TIME #C COMMAND
    23983 mysql 15 0 559 6621m 353m 6.1g 18m S 0.2 78.9 11:44 0 mysqld
    That's the master server of wikipedia.org, a few minutes before I posted this reply. Dual Opteron, 8GB, 6x15K SCSI in RAID 10. FC2. 5GB+ for InnoDB. Been that way for many months now, without any major troubles. Not that there haven't been some, but it does the job. A pair of 4GB slaves are FC2 as well. 2.6/FC2 has issues in some builds on some systems, so it's not entirely smooth going. Just to complete the picture:
    MySQL on localhost (4.*) up 0+18:41:33 [06:25:18]
    Queries: 151.4M qps: 2359 Slow: 57.5k Se/In/Up/De(%): 60/00/00/00
    qps now: 1075 Slow qps: 0.2 Threads: 92 ( 12/ 179) 74/00/00/00
    Cache Hits: 26.7M Hits/s: 416.6 Hits now: 183.0 Ratio: 29.3% Ratio now: 23.0%
    Key Efficiency: 98.9% Bps in/out: 29.4k/51.3k Now in/out: 109.2k/486.9k
    You might have intended to exclude Opteron systems from X86 but it's worth clarifying. Lots of excellent points in your post. Like forget RAID 5 for the database servers.
    1. Re:MySQL and 8GB of RAM on X86 by Anonymous Coward · · Score: 0

      I think he was meaning the 32bit X86 based systems, as the Opteron are 64 bit.

  98. Don't guess by dubl-u · · Score: 2, Insightful

    My guess is going to be that the bottleneck is going to be the database, but we've done extensive testing with a million customer sample database running multiple instances of test applications from 10 other boxes, but that doesn't exactly prove much as it's too predictable.

    Don't guess. You have too much riding on it to guess. Build proper test infrastructure. Not only will it pay off now, but it'll be hugely valuable in the future as you change and expand the application.

    If you're not sure what real users will do, get some to try it out and record their activity. Then spend a little time building a load model, where you describe types of users, their activity patterns, and their expectations. (E.g.: "At the end of the month the 800 salespeople will rush to meet their quota, and during a peak hour they'll each do..." Generally I end up with a nice series of spreadsheets, so I can adjust registered users and see peak hits per second come out the other end.) Then simulate the projected load and see where the real bottlenecks are.
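    The spreadsheet idea can also be sketched in a few lines of code. This is only an illustration: the UserType fields, the "manager" row, and every number except the 800 salespeople are made-up assumptions, and a real model would be filled in from recorded user activity.

```python
from dataclasses import dataclass


@dataclass
class UserType:
    name: str
    registered: int               # registered users of this type
    active_fraction: float        # share of them active during the peak hour
    hits_per_active_hour: float   # requests each active user makes in that hour


def peak_hits_per_second(user_types, burst_factor=1.0):
    """Average hits/second over the peak hour, scaled by an optional burst factor."""
    hits_per_hour = sum(
        u.registered * u.active_fraction * u.hits_per_active_hour
        for u in user_types
    )
    return hits_per_hour * burst_factor / 3600.0


# Hypothetical inputs: only the 800 salespeople come from the example above.
model = [
    UserType("salesperson", registered=800, active_fraction=1.0, hits_per_active_hour=90),
    UserType("manager", registered=50, active_fraction=0.5, hits_per_active_hour=36),
]

print(peak_hits_per_second(model))  # 20.25
```

    Changing the registered counts or the burst factor and rerunning gives the same "adjust one end, read the other" behaviour as the spreadsheet.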

    You should be really wary about optimizing without data. As Knuth says, "Premature optimization is the root of all evil." I know a number of people who build very high volume stuff, and I don't know any of them who haven't been frequently surprised at exactly where the bottleneck turned out to be.

    Also, start small and work up. There's no need to build a huge load testing suite all in one go; often you'll learn enough from the first simple tests to point developers and sysadmins in the right direction.

    1. Re:Don't guess by jon3k · · Score: 1

      What software are you feeding this into? A shell script with a bunch of wget's, or a real commercial piece of software?

      Seriously, I'm curious. I'd love to find some decent stress testing software for web apps.

    2. Re:Don't guess by dubl-u · · Score: 1

      What software are you feeding this into? A shell script with a bunch of wget's, or a real commercial piece of software?

      I'm not sure I'd put commercial testing tools in the category of real software; the ones I've seen seem to be relatively lame software dressed up to appeal to corporate managers who want to believe they can buy their way out of a problem.

      For my stress testing, I generally put something together out of Perl's LWP (the UserAgent module) or a custom wrapper around HttpUnit. By the time we get to load testing, this is generally pretty easy, as we use the same tools for automated acceptance testing of the app. We just rip out some interesting chunks of the acceptance tests, optimize the hell out of them, and run them on a bunch of machines.

      I may also throw something like Apache's ab into the mix to simulate the stateless stuff. For example, a Slashdotting consists mainly of people who show up and just look at the first page, and ab is just fine for that if you also have more sophisticated agents that represent the people who actually click on things and try out the site.
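      As a rough illustration of mixing the two kinds of traffic, here is a small Python sketch (not the Perl LWP or HttpUnit setup described above). The function names, the path list, and the worker count are all assumptions; the "clicking" agent only walks a fixed list of paths and does not keep cookies or session state, which a real acceptance-test-derived agent would.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_fetch(url, timeout=10):
    """Fetch one URL and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def timed(fetch, url):
    """Call fetch(url) and return (url, status or error text, seconds taken)."""
    start = time.perf_counter()
    try:
        status = fetch(url)
    except Exception as exc:  # record failures instead of aborting the run
        status = repr(exc)
    return (url, status, time.perf_counter() - start)


def drive_by_agent(base_url, fetch):
    """Stateless visitor: loads the front page once and leaves."""
    return [timed(fetch, base_url + "/")]


def clicking_agent(base_url, paths, fetch):
    """Visitor who clicks through a fixed sequence of pages."""
    return [timed(fetch, base_url + path) for path in paths]


def run_load(base_url, paths, drive_bys, clickers, fetch=http_fetch, workers=20):
    """Run both kinds of agents concurrently and collect every timing record."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(drive_by_agent, base_url, fetch)
                   for _ in range(drive_bys)]
        futures += [pool.submit(clicking_agent, base_url, paths, fetch)
                    for _ in range(clickers)]
        results = []
        for future in futures:
            results.extend(future.result())
    return results
```

      Calling run_load against a test server with, say, 500 drive-bys and 50 clickers gives a list of (url, status, seconds) records to sort or average. Passing a different fetch function makes it possible to try the driver without touching the network.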

  99. Re: Using F5's to encrypt data by Em+Ellel · · Score: 1

    My last sysadmin job (I'm in bizdev now) was at a brokerage firm. While the solution wasn't implemented before the higher ups pillaged the company, I and the network engineer came up with a way around this issue: use the F5 SSL accelerator to encrypt/decrypt the SSL stream, then use SSH port forwarding to make sure the cleartext data was encrypted between the machines. We never got it into production, but it worked great in the lab

    Interesting. The only thing is - is SSH encryption any less computationally intensive than SSL?

    -Em

    --
    RelevantElephants: A Somatic WebComic...
  100. Memcached? by AmVidia+HQ · · Score: 2, Interesting

    Your sharedance software is interesting. I don't know if you are aware of memcached (http://www.danga.com/memcached/, by the LiveJournal guys), though; if so, did it lack something that prompted you to write your own?

    --
    VIVA1023.com | Political Fashion.
    1. Re:Memcached? by chrysalis · · Score: 1

      The main issue with memcached is that everything needs to fit in the process memory.

      This is not an option when sessions are very large (our scripts use them a lot for caching).

      This is also not an option if you need redundancy, or at least if you don't want to lose all sessions if a server is rebooted.

      But Sharedance can be used exactly like memcached, with everything in ram. Just assign a tmpfs volume as the storage area.

  101. Use Math? by LWATCDR · · Score: 1

    Okay, you said 300 sites, each with multiple users hitting the application constantly. How much CPU time does each hit take? How much bandwidth? Frankly, unless the application is really CPU and memory intensive, your limiting factor will tend to be bandwidth, not server speed or memory. Without knowing what the application is, it is hard to tell you what to look at. I would start with the database. Will it handle the peak number of transactions? What about the bandwidth between the database server and the application server? It takes a big server to saturate a gigabit link, so that will not tend to be an issue.
    Figure out the peak transaction rate and the bandwidth/memory/drive space/db transactions, then multiply that by 1.2 to 1.5, and that is what you will need to scale your app. Frankly, I guess that a normal server with an okay database server would support your application. If it is a relatively simple form/database/report style application, 900 to a thousand users does not seem to be that big of a load. What I have seen kill more apps when they try to scale than any other factor is poor database design. If your queries and indexes are good, you should have no problem. If they are not, fix them.
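    The multiply-by-1.2-to-1.5 step, written out as a small sketch. The resource names and peak figures below are invented placeholders, not measurements; only the 1.2 and 1.5 factors come from the paragraph above.

```python
def scale_range(peak, low=1.2, high=1.5):
    """Return (low, high) capacity targets for each measured peak figure."""
    return {name: (value * low, value * high) for name, value in peak.items()}


# Hypothetical measured peaks; substitute numbers from your own testing.
peak = {
    "db_transactions_per_s": 200,
    "bandwidth_mbit_s": 40,
    "memory_gb": 4,
}

for name, (lo, hi) in scale_range(peak).items():
    print(f"{name}: plan for {lo:.1f} to {hi:.1f}")
```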

    --
    See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
  102. Re:weekend gmail invites by Anonymous Coward · · Score: 0

    Hmm, url goes to Google with a "nyud.info" embedded in the URL, whois says that belongs to:
    Registrant Organization:Gay Nigger Association of America
    Created On:08-Sep-2004
    Name is a "Dong Bird", but Google says the phone number it's registered to belongs to someone else. (Of course that doesn't mean too much: my friends' apartment has one guy's name on the phone, but there are 4 guys living there (college, 4 bedrooms).) The phone number is right for the town and city listed, though neither Google nor Yahoo can find the address listed in the whois info, and the address on the phone number isn't the one in the whois. RandMcNalley.com offers that there are addresses for 100-999 S. Coit Road, and a 1200-2998 N. Coit Road.
    So it seems to be falsified Whois info; aren't there rules/laws against that (yet)?

    Page (opened in lynx, I can guess what it is going to be) is just a frame showing "http://www.metahusky.net/~snoof/bk.jpg". The page itself looks like some girl's website (pictures of the family's trip to the zoo with pictures of the kid in the photos; it's all clean there, and nothing obviously says that this picture is supposed to be there).
    So if the image is a shocker, it might be up there without this person's knowledge.