Slashdot Mirror


High-Performance Web Server How-To

ssassen writes "Aspiring to build a high-performance web server? Hardware Analysis has an article posted that details how to build a high-performance web server from the ground up. They tackle the tough design choices and what hardware to pick and end up with a web server designed to serve daily changing content with lots of images, movies, active forums and millions of page views every month."

281 comments

  1. High-performance web server by quigonn · · Score: 5, Informative

    I'd suggest everybody with the need of a high-performance web server to try out
    fnord. It's extremely small, and pretty fast (without any special performance hacks!), see here.

    --
    A monkey is doing the real work for me.
    1. Re:High-performance web server by Electrum · · Score: 4, Informative

      Yep. fnord is probably the fastest small web server available. There are basically two ways to engineer a fast web server: make it as small as possible to incur the least overhead or make it complicated and use every possible trick to make it fast.

      If you need features that a small web server like fnord can't provide and speed is a must, then Zeus is probably the best choice. Zeus beats the pants off every other UNIX web server. Its "tricks" include non-blocking I/O, linear scalability with regard to number of CPUs, platform-specific system calls and mechanisms (acceptx(), poll(), sendpath, /dev/poll, etc.), sendfile() and sendfile() cache, memory and mmap() file cache, DNS cache, stat() cache, multiple accept() per I/O event notification, tuning the socket buffers, disabling Nagle, tuning the listen queue, SSL disk cache, log file cache, etc.

      Which design is better? Depends on your needs. It is quite interesting that the only way to beat a really small web server is to make one really big that includes everything but the kitchen sink.
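
      To make a few of those tricks concrete, here is a minimal Python sketch of a non-blocking listening socket with a larger listen queue, a requested send-buffer size, and Nagle disabled on client connections. This is not how Zeus does it (Zeus is closed source); the function names, backlog and buffer size are assumptions picked for illustration, and the kernel is free to clamp both values.

```python
import socket

def make_listener(host="127.0.0.1", port=0, backlog=1024, bufsize=256 * 1024):
    """Create a non-blocking TCP listener (hypothetical helper, illustrative values)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Request a larger send buffer; the kernel may adjust or cap the value.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bufsize)
    srv.bind((host, port))
    # Tune the listen queue; the kernel may cap the backlog as well.
    srv.listen(backlog)
    # Non-blocking: accept() raises BlockingIOError instead of waiting.
    srv.setblocking(False)
    return srv

def tune_client(conn):
    """Apply per-connection settings: non-blocking I/O and Nagle disabled."""
    conn.setblocking(False)
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return conn
```

      A real server would pair this with an event notification mechanism such as poll() so that it only calls accept() or send() when the socket is ready.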

    2. Re:High-performance web server by twoslice · · Score: 1

      I have no need for a web server if it can't run PHP. Now, if you can get a version that supports PHP into this puppy and crank up the PHP performance to at least twice Zend, then the world is your oyster man...

      --

      From excellent karma to terrible karma with a single +5 funny post...
    3. Re:High-performance web server by trybywrench · · Score: 1

      will Zeus run on linux? How do you get non-blocking I/O out of a blocking file system? Or are you talking about non blocking socket I/O?

      --
      I came to the datacenter drunk with a fake ID, don't you want to be just like me?
    4. Re:High-performance web server by Fefe · · Score: 2, Informative

      fnord supports CGI and PHP can be run in CGI mode.
      Actually, at least two people are using fnord to host a PHP site.

      Don't expect stellar performance, though. PHP is by no means a small interpreter. I guess it would be possible to be fast and PHP compatible with some sort of byte code cache. If there is enough demand, someone will implement it.

    5. Re:High-performance web server by khuber · · Score: 1
      I presume they have a worker thread block on I/O if there isn't async support.

      -Kevin

    6. Re:High-performance web server by Syn Ack · · Score: 1


      Yes, Zeus runs on Linux and....

      HP-UX 10.20, 11.0, 11i, IA-64
      Solaris SPARC 2.6, 7, 8, 9
      Solaris x86 2.6, 7, 8
      Linux x86 glibc2.x
      Linux Alpha glibc2.x
      Linux PowerPC glibc2.x
      IBM AIX 4.3, 5.0
      SGI IRIX 6.4, 6.5
      Compaq Tru64 4.0e,f,g,5.0
      FreeBSD 3.4, 4.2
      OpenBSD 2.8, 2.9
      SCO Unixware 7.1.0
      Mac OS X 10.x
      BSDi 3.0, 3.1, 4.0, 4.1, 4.2

      So basically every UNIX or Unix like operating system on the planet.

      enjoy.

      Syn Ack.

    7. Re:High-performance web server by Electrum · · Score: 2

      will Zeus run on linux?

      Yes. It is a UNIX web server. It does not run on Windows.

      How do you get non-blocking I/O out of a blocking file system? Or are you talking about non blocking socket I/O?

      You don't. Not having non blocking I/O available for the filesystem is one of the most annoying things about UNIX. Though, there are ways around it. Either use a separate thread or process to do file I/O, or use mmap() with mincore().
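
      The "separate thread" workaround can be sketched roughly as below, in Python for brevity. The pool size and the helper names are assumptions, not anything Zeus or fnord actually uses; the point is only that the blocking read happens off the thread that services sockets.

```python
from concurrent.futures import ThreadPoolExecutor

# A small pool of worker threads dedicated to blocking file reads
# (the size 4 is an arbitrary illustrative choice).
_file_pool = ThreadPoolExecutor(max_workers=4)

def _read_file(path):
    # This call may block on disk; it runs in a worker thread.
    with open(path, "rb") as f:
        return f.read()

def read_file_async(path):
    """Return a Future immediately; the caller's event loop stays responsive."""
    return _file_pool.submit(_read_file, path)
```

      The event loop can attach a callback with Future.add_done_callback() and send the data once the read has finished.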

    8. Re:High-performance web server by DancingSword · · Score: 1

      What about thttpd ?

      Has anyone compared it with fnord?

      And yes, I know it hasn't been updated in a while (since May, actually, from the timestamps in the tarball, referring to the beta version), but maybe they don't feel the need to update it for their purposes anymore...

      --
      Messages to/for me ( in me journal )
  2. 10'000 RPM by Nicolas MONNET · · Score: 3, Insightful

    The guys use 10'000 RPM drives for "reliability" and "performance" ... 10k drives are LESS reliable, since they move faster. Moreover, they're not even necessarily that much faster.

    1. Re:10'000 RPM by autocracy · · Score: 4, Funny
      In comparison to what? Yes, they're faster than the 7,200 you probably have - but they only run at 2/3 the speed of most really high end drives (15,000 RPM). Really it's not too bad a trade-off.

      Also, please note that the laws of physics say that it can read more data if the head is able to keep up - and I'm sure it is.

      --
      SIG: HUP
    2. Re:10'000 RPM by khuber · · Score: 3, Informative
      10k drives are LESS reliable, since they move faster.

      Okay, well, you can use ancient MFM drives since they move much slower and would be more reliable by your logic.

      Personally, I'd take 10k SCSI drives over 7.2k IDE drives for a server, no question.

      -Kevin

    3. Re:10'000 RPM by Krapangor · · Score: 5, Funny
      10k drives are LESS reliable, since they move faster

      This implies that you shouldn't store servers in high altitudes, because they move faster up there due to earth rotation.
      Hmmm, I think we know now why these Mars missions tend to fail so often.

      --
      Owner of a Mensa membership card.
    4. Re:10'000 RPM by Nicolas MONNET · · Score: 1

      I'm comparing two comparably built drives; the slower one is the more reliable. Obviously I'm not comparing current high-end drives with technology from 10 years ago.

    5. Re:10'000 RPM by khuber · · Score: 1
      I know. I was just being a butthead :).

      -Kevin

    6. Re:10'000 RPM by Anonymous Coward · · Score: 0

      RTFA
      they end up using 7200rpm IDE drives...

    7. Re:10'000 RPM by jstepka · · Score: 1

      I wouldn't know because I've had Katz blocked for over two years now!

      --
      Justen Stepka
    8. Re:10'000 RPM by Syre · · Score: 5, Insightful

      It's pretty clear that whoever wrote that article has never run a really high-volume web site.

      I've designed and implemented sites that actually handle millions of dynamic pageviews per day, and they look rather different from what these guys are proposing.

      A typical configuration includes some or all of:

      - Firewalls (at least two redundant)
      - Load balancers (again, at least two redundant)
      - Front-end caches (usually several) -- these cache entire pages or parts of pages (such as images) which are re-used within some period of time (the cache timeout period, which can vary by object)
      - Webservers (again, several) - these generate the dynamic pages using whatever page generation you're using -- JSP, PHP, etc.
      - Back-end caches (two or more)-- these are used to cache the results of database queries so you don't have to hit the database for every request.
      - Read-only database servers (two or more) -- this depends on the application, and would be used in lieu of the back end caches in certain applications. If you're serving lots of dynamic pages which mainly re-use the same content, having multiple, cheap read-only database servers which are updated periodically from a master can give much higher efficiency at lower cost.
      - One clustered back-end database server with RAID storage. Typically this would be a big Sun box running clustering/failover software -- all the database updates (as opposed to reads) go through this box.

      And then:

      - The entire setup duplicated in several geographic locations.

      If you build -one- server and expect it to do everything, it's not going to be high-performance.
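
      For the back-end cache layer in that list, the core idea fits in a few lines. The sketch below is in Python and assumes a simple per-entry timeout; the class and method names are made up for illustration, and a production cache would also need size limits and invalidation when the master database changes.

```python
import time

class TTLCache:
    """Cache query results for a fixed number of seconds (illustrative sketch)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so tests can control time
        self._store = {}            # key -> (expiry_time, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]           # still fresh: skip the database
        value = compute()           # expired or missing: run the real query
        self._store[key] = (now + self.ttl, value)
        return value
```

      A web server would call something like cache.get_or_compute(sql_text, run_query), so repeated requests inside the timeout window never reach the database.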

    9. Re:10'000 RPM by Anonymous Coward · · Score: 0

      So, I can pee for better distance.

    10. Re:10'000 RPM by GC · · Score: 2

      How about running the web server from a RAM Disk? That's an age old trick to make speed improvements!

    11. Re:10'000 RPM by ssassen · · Score: 1
      Syre,

      As you probably understood from reading the article, we're not a large corporation but just a small startup company with young and enthusiastic people fresh out of university and aspiring to be all that we can be. We are therefore taking this opportunity to learn from this experience, as the amount and diversity of traffic we've gotten over the past few days are beyond anything we simulated. It is an opportunity for us to learn from any mistakes we've made and track down the bottlenecks.

      We've been going over the Slashdot comments and your response was one we certainly would like to follow up on. So if you're interested we'd love to hear more and welcome any advice you're able to give us. We'll be adding one or more pages to the article with, for example, software tricks for Apache and Linux and other tips and tricks that will help us, and others, to fine-tune their web server.

      Looking forward to your reply.

      Kind regards,

      Sander Sassen

      Email: ssassen@hardwareanalysis.com
      Visit us at: http://www.hardwareanalysis.com

  3. But any web server is high-performance by Ed Avis · · Score: 5, Insightful

    Computer hardware is so fast relative to the amount of traffic coming to almost any site that any web server is a high-performance web server, if you are just serving static pages. A website made of static pages would surely fit into a gigabyte or so of disk cache, so disk speed is largely irrelevant, and so is processor speed. All the machine needs to do is stuff data down the network pipe as fast as possible, and any box you buy can do that adequately. Maybe if you have really heavy traffic you'd need to use Tux or some other accelerated server optimized for static files.

    With dynamically generated web content it's different of course. But there you will normally be fetching from a database to generate the web pages. In which case you should consult articles on speeding up database access.

    In other words: an article on 'building a fast database server' or 'building a machine to run disk-intensive search scripts' I can understand. But there is really nothing special about web servers.

    --
    -- Ed Avis ed@membled.com
    1. Re:But any web server is high-performance by khuber · · Score: 5, Insightful
      With dynamically generated web content it's different of course. But there you will normally be fetching from a database to generate the web pages. In which case you should consult articles on speeding up database access.

      I'm just a programmer, but don't big sites put caching in front of the database? I always try to cache database results if I can. Honestly, I think relational databases are overused, they become bottlenecks too often.

      -Kevin

    2. Re:But any web server is high-performance by NineNine · · Score: 5, Insightful

      Good databases are designed for performance. If databases are your bottleneck, then you don't know what you're doing with the database. Too many people throw up a database, and use it like it's some kind of flat file. There's a lot that can be done with databases that the average hack has no idea about.

    3. Re:But any web server is high-performance by NineNine · · Score: 4, Insightful

      You're absolutely right. Wish I had some mod points left...

      Hardware only comes into play in a web app when you're doing very heavy database work. Serving flat pages takes virtually no computing effort. It's all bandwidth. Hell, even scripting languages like ASP, CF, and PHP are light enough that just about any machine will work great. The database though... that's another story.

    4. Re:But any web server is high-performance by khuber · · Score: 2, Interesting
      Our databases are tuned. Some apps would just need to transfer too much data per request for a SQL call to be feasible.

      -Kevin

    5. Re:But any web server is high-performance by jimfrost · · Score: 5, Interesting
      As you say, databases are usually the bottleneck in a high-volume site. Contrary to what Oracle et al want you to believe, they still don't scale and in many cases it's not feasible to use a database cluster.

      Big sites, really big sites, put caching in the application. The biggest thing to cache is session data, easy if you're running a single box but harder if you need to cluster (and you certainly do need to cluster if you're talking about a high-volume site; nobody makes single machines powerful enough for that). Clustering means session affinity and that means more complicated software. (Aside: Is there any open source software that manages session affinity yet? )

      Frankly speaking, Intel-based hardware would not be my first choice for building a high-volume site (although "millions of page views per month" is really only a moderate volume site; sites I have worked on do millions per /day/). It would probably be my third or fourth choice. The hardware reliability isn't really the problem, it can be good enough, the issue is single box scalability.

      To run a really large site you end up needing hundreds or even thousands of Intel boxes where a handful of midrange Suns would do the trick, or even just a couple of high-end Suns or IBM mainframes. Going the many-small-boxes route your largest cost ends up being maintenance. Your people spend all their time just fixing and upgrading boxes. Upgrading or patching in particular is a pain in the neck because you have to do it over such a broad base. It's what makes Windows very impractical as host for such a system; less so for something like Linux because of tools like rdist, but even so you have to do big, painful upgrades with some regularity.

      What you need to do is find a point where the box count is low enough that it can be managed by a few people and yet the individual boxes are cheap enough that you don't go broke.

      These days the best machines for that kind of application are midrange Suns. It will probably be a couple of years before Intel-based boxes are big and fast enough to realistically take that away ... not because there isn't the hardware to do it (though such hardware is, as yet, unusual) but because the available operating systems don't scale well enough yet.

      --
      jim frost
      jimf@frostbytes.com
    6. Re:But any web server is high-performance by NineNine · · Score: 3, Informative

      Our databases are tuned. Some apps would just need to transfer too much data per request for a SQL call to be feasible.

      I had this problem for a while... Sloppy coding on my part was querying 65K+ records per page. Server would start to crawl with a few hundred simultaneous users. Since I fixed it, 1000+ simultaneous users is no problem at all.
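
      The comment doesn't say what the fix was. One common fix for "tens of thousands of rows per page" is to ask the database for only the rows the page will display. A minimal Python/sqlite3 sketch, assuming a hypothetical articles table with id and title columns:

```python
import sqlite3

def fetch_page(conn, page, page_size=50):
    """Fetch one page of rows instead of the whole table.

    'articles', 'id' and 'title' are hypothetical names for illustration.
    """
    offset = page * page_size
    cur = conn.execute(
        "SELECT id, title FROM articles ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return cur.fetchall()
```

      With this, each page view transfers page_size rows no matter how large the table grows.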

    7. Re:But any web server is high-performance by khuber · · Score: 2, Informative
      Very good info Jim.

      Yeah, my experience is at a relatively large site. We use mostly large and midrange Suns, EMC arrays and so on. There's a lot of interest in the many small server architecture though that is still being investigated.

      -Kevin

    8. Re:But any web server is high-performance by jimfrost · · Score: 5, Informative
      I've seen both kinds and take it from me, many small servers is more of a headache than the hardware cost savings is worth. Your network architecture gets complicated, you end up having to hire lots of people just to keep the machines running and with up-to-date software, and database connection pooling becomes a lot less efficient.

      You save money in the long run by buying fewer, more powerful machines.

      --
      jim frost
      jimf@frostbytes.com
    9. Re:But any web server is high-performance by khuber · · Score: 2, Interesting
      The interest is primarily hardware cost (the big Suns cost over $1m, and EMC arrays are likewise). Another issue is that when you have a few big machines and you do a deployment or maintenance, it's a struggle for the other boxes to pick up the slack. If you had more small servers, you could upgrade one at a time without impacting capacity as much.

      What do you think about handling capacity? Do you see sites with a lot of spare capacity? We'd have trouble meeting demand if we lost a server during prime hours (and it happens).

      -Kevin

    10. Re:But any web server is high-performance by jimfrost · · Score: 5, Insightful
      Yea, big Suns are too expensive and you do need to keep the server count high enough that a failure or system taken down for maintenance isn't a really big impact on the site. I mentioned in a different posting that my cut on this is that the midrange Suns, 4xxx and 5xxx class, provide good bang-for-the-buck for high-volume sites.

      Beware of false economy when looking at hardware. While it's true that smaller boxes are cheaper, they still require about the same manpower per box to keep them running. You rapidly get to the point where manpower costs dwarf equipment cost. People are expensive!

      Capacity is an issue. We try to plan for enough excess at peak that the loss of a single server won't kill you, and hope you never suffer a multiple loss. Unfortunately most often customers underequip even for ordinary peak loads, to say nothing of what you see when your URL sees a real high load.[1] They just don't like to spend the money. I can see their point, the machines we're talking about are not cheap; it's a matter of deciding what's more important to you, uptime and performance or cost savings. Frankly most customers go with cost savings initially and over time (especially as they learn what their peak loads are and gain experience with the reliability characteristics of their servers) build up their clusters.

      [1] People here talk about the slashdot effect, but trust me when I tell you that that's nothing like the effect you get when your URL appears on TV during "Friends".
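
      The "enough excess at peak that the loss of a single server won't kill you" rule is simple arithmetic. A small Python sketch follows; the request rates in the usage note are invented numbers, not measurements from any site discussed here.

```python
import math

def servers_needed(peak_load, per_server_capacity, spares=1):
    """Servers required to carry peak load, plus spares that may fail.

    Both load figures must use the same unit (e.g. requests per second).
    """
    base = math.ceil(peak_load / per_server_capacity)
    return base + spares
```

      For example, a hypothetical peak of 900 requests per second on boxes that each handle 250 needs four servers to carry the load and a fifth so that one can fail or be taken down for maintenance.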

      --
      jim frost
      jimf@frostbytes.com
    11. Re:But any web server is high-performance by jimfrost · · Score: 5, Interesting
      If you're just serving static pages you're right. If you're doing dynamic content then you're wrong.

      But 2.5 million hits a day is still just a moderate volume site to me. One of the sites I worked on sees in excess of a hundred million hits per day these days; it was up over ten million hits per day back in 1998.

      I don't happen to know what Slashdot does for volume, but Slashdot is a very simplistic site when it comes to content production. Each page render doesn't take much horsepower and sheer replication can be used effectively. Things get more complicated when you're doing something like trying to figure out what stuff a user is likely to buy given their past buying history and/or what they're looking at right now.

      If you really think a 4-way Intel box is equivalent to a 12-way Sun, well, it's clear you don't know what you're talking about. You're wrong even if all you're talking about is CPU, and of course I/O bandwidth is what makes or breaks you -- and there's no comparison in that respect.

      --
      jim frost
      jimf@frostbytes.com
    12. Re:But any web server is high-performance by Matey-O · · Score: 5, Insightful

      I think the big problem here is the tendency to DBify EVERYTHING POSSIBLE.

      Like the State field in an online form.

      Every single hit requires a tag to the databases. Why?

      Because, heck if we ever get another state, it'll be easy to update! Ummm, that's a LOT of cycles used for something that hasn't happened in, what, 50 years or so. (Hawaii, 1959)

      --
      "Draco dormiens nunquam titillandus."
    13. Re:But any web server is high-performance by Matey-O · · Score: 2

      "The hardware reliability isn't really the problem, it can be good enough, the issue is single box scalability."

      I dunno, our current major project is running on an ES7000 (8 processors, fully redundant, running Windows Datacenter) It seems pretty beastly to me.

      At the point here where X Unix implementation is x% faster than Y Microsoft implementation, the issue is decided by other factors. As long as either is fast enough to handle the load, n-th degree performance doesn't matter.

      In our case, the company that won the contract specified the hardware, it was part of a total cost contract (you get one amount of money to make this work, work within those boundaries.)

      _Presumably_ that company is happy enough with Windows performance on a 'big iron' box.

      --
      "Draco dormiens nunquam titillandus."
    14. Re:But any web server is high-performance by Anonymous Coward · · Score: 0

      My company is building a high volume ecommerce site using IBM HTTP Server (Apache based) and Websphere on Suse Linux for S390 on a Zseries mainframe. We are replacing a current site that is using Microsoft Site Server (IIS 4) which has many boxes and has become increasingly difficult to maintain. :)

    15. Re:But any web server is high-performance by jimfrost · · Score: 2
      I think we're going to see more and more of this kind of server. The Zseries mainframes running Linux are really interesting because you're not so dependent on scalable SMP capabilities and yet you get the same kind of manageability as if you were working with a big SMP box. Nice.

      I haven't personally done any deployments on such a system, but I like the idea.

      --
      jim frost
      jimf@frostbytes.com
    16. Re:But any web server is high-performance by Anonymous Coward · · Score: 1, Insightful

      This is exactly the stuff you CACHE! But there are VERY GOOD REASONS for putting state/country data in the database.

    17. Re:But any web server is high-performance by TrueKonrads · · Score: 1

      May I add 'bullshit'?

      linuxvirtualserver is a great example of how to use small machines to serve big volume.

                  /--WWW1
      SAN==NFSBOX |--WWW2   LOADBALANCER
                  |--WWW3
                  \---etc...

      Nothing complex to maintain there. Make the www* boxes diskless machines that boot from a bootserver. Load all required apps into RAM (32 MB to spend isn't much). Update apps in bootserver. Reboot to upgrade. That simple, one-man maintainable!

      --
      Lone Gunmen crew.
    18. Re:But any web server is high-performance by Hast · · Score: 3, Informative

      How about reading the FAQ before you start giving out "facts"? Slashdot is running on:
      * 5 load balanced Web servers dedicated to pages
      * 3 load balanced Web servers dedicated to images
      * 1 SQL server
      * 1 NFS Server
      Either the "little 4 way intel" you mention has a serious case of schizophrenia or you're just full of it. (Guess which theory I'm going for.)

      Besides the poster mentioned that those sites /are/ bigger than Slashdot. E.g. the mention that "Getting your URL posted during Friends" is nothing like getting it posted on Slashdot.

      I know I shouldn't feed the trolls, but someone might actually believe this tripe.

    19. Re:But any web server is high-performance by hoover · · Score: 0

      Zope (http://www.zope.org/) has good support for selective caching of ZSQL method result sets (an abstraction layer on top of your DB engine) which works great.

      Cheers,

      uwe

      --
      Ever wondered whats wrong with the world? http://www.ishmael.org/
    20. Re:But any web server is high-performance by Anonymous Coward · · Score: 0

      Ugggh I feel for you.

      BTW Windows Datacenter must be the biggest oxymoron ever. The last place I would consider for my company's most valuable data is a Microsoft server. But to each his own I guess. I'm just not a big fan of Russian Roulette though.

    21. Re:But any web server is high-performance by Aldurn · · Score: 3, Informative

      Aside: Is there any open source software that manages session affinity yet?

      Yes. Linux Virtual Server is an incredible project. You put your web servers behind it and (in the case of simple NAT balancing) you set the gateway of those computers to be the address of your LVS server. You then tell LVS to direct all IPs of a certain netmask to one server (i.e. if you set for 255.255.255.0, 192.168.1.5 and 192.168.1.133 will connect to the same server).
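
      The netmask grouping described above can be illustrated in a few lines of Python. This is not LVS code and does not reproduce how LVS schedules connections; the function names and the modulo-based server choice are assumptions made only to show why the two example addresses end up together.

```python
import ipaddress

def affinity_key(client_ip, netmask="255.255.255.0"):
    """Mask the client address so every host in the same block shares a key."""
    net = ipaddress.ip_network(f"{client_ip}/{netmask}", strict=False)
    return str(net.network_address)

def pick_server(client_ip, servers, netmask="255.255.255.0"):
    """Map the masked address onto a server (illustrative choice, not LVS's)."""
    key = int(ipaddress.ip_address(affinity_key(client_ip, netmask)))
    return servers[key % len(servers)]
```

      Both 192.168.1.5 and 192.168.1.133 mask down to 192.168.1.0, so they are always sent to the same server.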

      The only problem I had with it was that it does not detect downtime. However, I wrote a quick script that used the checkhttp program from Nagios to pull a site out of the loop when it went down (these were Windows 2000 servers: it happened quite frequently, and our MCSE didn't know why :)

      There are higher performance ways to set up clustering using LVS, but since I was lazy, that's what I did.
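
      The downtime check mentioned above was a script around the Nagios HTTP check, which isn't shown here. A rough stand-in in Python, with made-up function names, is to probe each server and keep only the ones that answer:

```python
import urllib.request

def http_probe(url, timeout=2.0):
    """Return True if the URL answers with a 2xx/3xx status.

    urlopen raises an OSError subclass for refused connections,
    timeouts and HTTP error statuses.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return 200 <= resp.status < 400

def healthy_servers(servers, probe=http_probe):
    """Filter the pool down to servers whose probe succeeds."""
    alive = []
    for server in servers:
        try:
            if probe(server):
                alive.append(server)
        except OSError:
            # Treat any network-level failure as "down".
            pass
    return alive
```

      Run periodically, the result can be used to rewrite the balancer's server list; how that list is pushed into LVS is outside this sketch.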
      --
      char sig[120] = "\0"
    22. Re:But any web server is high-performance by Ed Avis · · Score: 2

      One database query per page is not too bad. You can make that scalable and it's certainly a lot less effort than trying to track large amounts of data _outside_ the DB.

      You have a problem when a single page view takes hundreds of database queries (as happened with a certain web toolkit I used to develop on).
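
      One common way a page ends up issuing hundreds of queries is fetching a list and then running one extra query per row. Whether that was the cause in the toolkit mentioned above isn't stated; the Python/sqlite3 sketch below just shows the single-query alternative, using hypothetical posts and users tables.

```python
import sqlite3

def titles_with_authors(conn):
    """Fetch every post title with its author's name in ONE query.

    The per-row alternative would run a separate
    'SELECT name FROM users WHERE id = ?' for each post.
    Table and column names are hypothetical.
    """
    return conn.execute(
        "SELECT p.title, u.name "
        "FROM posts p JOIN users u ON u.id = p.user_id "
        "ORDER BY p.id"
    ).fetchall()
```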

      --
      -- Ed Avis ed@membled.com
    23. Re:But any web server is high-performance by jimfrost · · Score: 3, Interesting
      I have more than a few problems with that idea, but among them are:

      • Diskless systems start to collapse the central servers even by forty or fifty clients. By the time you're talking the thousand or more Intel systems necessary for a big site you're looking at having to have a tiered system just to do software deployments, forget about data serving.

      • Diskless systems don't work well if you have more data than you can realistically afford to store in memory. You start to see practical limits (like hardware limitations) in the low gigabyte range, when most larger websites have static content to deliver in the hundreds of gigabyte range.

      • Applications are notoriously hungry because they have to do a lot of caching to offload the database since databases generally don't scale well. It's pretty common to see our application servers running with 2+ gig heaps, and we'll run one application server per CPU on a system, and you're probably running three or more 6 or 8 CPU systems just for the application server part. Try to make that diskless and you're now talking about machine configurations with something like 30G of RAM ... very expensive and impractical.

      We're talking about a totally different scale, really.

      --
      jim frost
      jimf@frostbytes.com
    24. Re:But any web server is high-performance by otisg · · Score: 1

      That's an interesting point about Intel hardware.
      Google, which is a top 5 web site by traffic, uses Intel hardware. It's got a few thousand Linux boxes running on Intel hardware, so they had to go for something affordable (Linux for the OS, Intel for hw).

      --
      Simpy
    25. Re:But any web server is high-performance by PhotoGuy · · Score: 4, Interesting
      A key question someone needs to ask themselves when storing data in a relational database, is "is this data really relational"?

      In a surprising number of cases, it really isn't. For example, storing user preferences for visiting a given web page; there is never a case where you need to relate the different users to each other. The powerful aggregation abilities of relational databases are irrelevant, so why incur the overhead (performance-wise, cost-wise, etc.)?

      Even when aggregating such information is useful, I've often found off-line duplication of the information to databases (which you can then query the hell out of, without affecting the production system) a better way to go.

      If a flat file will do the job, use that instead of a database.

      --
      Love many, trust a few, do harm to none.
    26. Re:But any web server is high-performance by Anonymous Coward · · Score: 0

      Can you explain more so I don't make the same mistake?

    27. Re:But any web server is high-performance by Anonymous Coward · · Score: 0

      I wouldn't add CF in there as lightweight.
      And the others, ASP and PHP, can gain a little poundage depending on what you're attempting to do with them. But both are able to churn out some hefty numbers.

    28. Re:But any web server is high-performance by jimfrost · · Score: 2
      It's smarter to manage affinity by session, not by IP, since a variety of sources have rotating IPs (most notably AOL, but some business firewalls do it too).
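
      A minimal Python sketch of affinity by session rather than by IP: hash the session identifier (for example a cookie value) and use it to pick the backend. The function name and the hashing scheme are assumptions for illustration, not a description of any particular load balancer.

```python
import hashlib

def backend_for_session(session_id, backends):
    """Pick a backend from the session ID, independent of the client's IP."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

      The same session always lands on the same backend even when the client's address rotates. A plain modulo reshuffles most sessions whenever the backend list changes, so real deployments often use consistent hashing or a session table instead.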

      Anyway, thanks for the tip. I haven't seen the LVS stuff at all yet.

      --
      jim frost
      jimf@frostbytes.com
    29. Re:But any web server is high-performance by jimfrost · · Score: 2
      Yes, Google is one such site, although their runtime is simplistic enough that it's not a really good example of a typical large-volume site. Amazon would be better, or eBay; I know eBay uses larger machines, don't know about Amazon. The only really high volume site I know off the top of my head that uses Intel-based hardware and individual personalization is hotmail and again they're dealing with thousands of servers.

      If you're building a site like that then you've got to make your decision as to whether you'd rather use thousands of Intel servers or a few tens of larger servers. If it were my decision I'd go for the smaller number of larger servers simply because they require a lot fewer IT people to keep running, and every IT person you don't have to hire is another new machine or two you could buy every year. It adds up.

      --
      jim frost
      jimf@frostbytes.com
    30. Re:But any web server is high-performance by johnlcallaway · · Score: 2

      I agree with the assessment about using non-Intel hardware, but disagree with the big vs. little argument, specifically the manpower requirement. Our website uses several automated tools to distribute updates to our webservers and app servers, which are Netras. The Netras all share the exact same Sun image, which is very, very small. All unneeded packages (X, language packs, etc) were removed. Unison is used to keep the web pages and JSP pages synchronized.

      We have had 1 failure (SCSI drive) since implemented 1 year ago. It took us 20 minutes to have the box back up and running (Jumpstart). Granted, we only have 20 now. But based on the amount of time we actually spend working on the machines, one of us could handle 5 to 10 times this amount.

      Now, 100 Netras cost about 600,000. You can't touch any other Sun equipment at that price and get 100 CPUs. A Sunfire15K w/72 CPUs is over $3M, without maintenance. I could afford a couple more admins at those prices.....
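
      Worked out per CPU, using only the figures quoted above and assuming one CPU per Netra (which is how "100 CPUs" reads) and dollars for both prices:

```python
def cost_per_cpu(total_cost, cpus):
    """Hardware cost divided by CPU count (ignores maintenance and staffing)."""
    return total_cost / cpus

netra_per_cpu = cost_per_cpu(600_000, 100)      # 6000.0
sunfire_per_cpu = cost_per_cpu(3_000_000, 72)   # roughly 41,667, a lower bound since the price is "over $3M"
ratio = sunfire_per_cpu / netra_per_cpu         # a bit under 7x
```

      This compares raw CPU count only; it says nothing about per-CPU speed or I/O bandwidth, which is the other side of the argument in this thread.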

      --
      I rarely read replies, it's my opinion and if you thought about your opinion a little more, I'm OK with that.
    31. Re:But any web server is high-performance by vlag · · Score: 1

      I agree on many fronts, but consider having a look at the IBM X440 server. I have seen several of these installations and I can honestly say this is the first Intel box that really scales and competes with the Big Iron. And you can't argue with the price. Check it out:
      http://www.pc.ibm.com/us/eserver/xseries/x440.html

      --
      Do you want to remove linux?
    32. Re:But any web server is high-performance by strobert · · Score: 2

      I have a question: how much manpower (say, in terms of number of sysadmins) do you generally use for a group of 10 mid-range Sun servers, say E4500s?

      The reason I am asking is some experience we had here, where an admin dealing with the Intel/Linux side of things was able to handle about 40 boxes with plenty of room to spare, whereas on the SPARC/Solaris side an admin was dealing with two boxes and wasn't really even able to keep up.

    33. Re:But any web server is high-performance by jimfrost · · Score: 2
      One person can do the maintenance of the main servers with only part-time effort, although generally such operations are well staffed for other reasons. The online servers are only the tip of the iceberg in such an operation -- you also have the database(s) with its associated guru, staging system(s), some number of developers, artists, etc. each with one or more systems, and of course the network infrastructure for such a system is very substantial.

      Keep in mind that with that kind of horsepower you're talking about a pretty darn large site -- like way into the tens of millions of dynamic page views per day. One customer I worked with was handling more than ten million dynamic page views per day on just three systems running at less than half utilization each. (There were three or four smaller boxes doing static content up-front, and a larger database box behind however.)

      The ancillary systems tend to far outnumber the main systems. Generally, at least in the places I've seen, IT handles the lot of them.

      --
      jim frost
      jimf@frostbytes.com
    34. Re:But any web server is high-performance by strobert · · Score: 2

      oh, I understand that. we actually have a half dozen effective mirrors of the production environment for development/testing/etc.

      I was just kind of curious about what manpower ratios you generally use for all of these servers (both main and pre-production/dev/test). I.e. for say 10 servers (say 2 main, the other 8 in use to get the product to the 2), how many sysadmins would you generally see in use?

    35. Re:But any web server is high-performance by jimfrost · · Score: 2
      If it's that few then you could easily get by with only one admin, keeping in mind that he'll have to sleep and go on vacation on occasion. That's a pretty small site though.

      I think the hundred-million dynamic pageviews site had three admins, but they switched hats with other jobs. One did double duty as the group manager, and the other two were part-time programmers. Multiple admins also meant that there was the possibility of time off :-).

      They had a lot of outside help, though, since Exodus was hosting their machines for them and there was another IT department that did desktop management for the rest of the organization.

      --
      jim frost
      jimf@frostbytes.com
  4. gee, i wonder.. by Anonymous Coward · · Score: 5, Funny

    .. if their webservers are as reliable as the ones in the article..
    i guess there's only one way to find out..

    slashdotters! advance! :P

    1. Re:gee, i wonder.. by lesburn1 · · Score: 0

      Gee, /.'ed already

    2. Re:gee, i wonder.. by egreB · · Score: 1

      Hey, karma is for burning, right? They've changed it. The counter couldn't keep up, I guess.. There are a few registered and quite a few anonymous users currently online. Current bandwidth usage: 541.73 kbit/s

  5. That "howto" sucks by Nicolas+MONNET · · Score: 5, Interesting

    There is no useful information in that infomercial. They seem to have judged "reliability" through vendor brochures and in a couple days; reliability is when your uptime is > 1 year.

    This article should be called "M. Joe-Average-Overclocker Builds A Web Server".

    This quote is funny:

    That brings us to the next important component in a web server, the CPU(s). For our new server we were determined to go with an SMP solution, simply because a single CPU would quickly be overloaded when the database is queried by multiple clients simultaneously.

    It's well known that single CPU computers can't handle simultaneous queries, eh!

    1. Re:That "howto" sucks by Meleneth · · Score: 1

      I didn't read the article so this may be wrong but...

      WTF are they doing running the database on the webserver?

      --
      remote access CLI with tools is the only friend you'll ever need.
    2. Re:That "howto" sucks by khuber · · Score: 5, Insightful
      Well, not to mention that high traffic sites usually have a bunch of webservers and then a load balancer in front of them. This article obviously isn't for big league web serving.

      -Kevin

    3. Re:That "howto" sucks by jimfrost · · Score: 5, Informative
      High traffic sites, the ones that are really dynamic anyway, do more than that.

      They start with a load balancer at the front end, or possibly several layers of load balancer. If they run a distributed operation they'll use smart DNS systems or routers to direct requests to the most local server cluster. The server cluster will be fronted by a request scattering system.

      Behind the request scattering system you'll find a cluster of machines whose job it is to serve static content (often the bulk of data served by a site) and route dynamic requests to another cluster of servers, enforcing session affinity for the dynamic requests.

      Behind the static content servers are the application servers. They do the heavy lifting, building dynamic pages as appropriate for individual users and caching everything they can to offload the database.

      Behind the application servers is the database or database cluster. The latter is really not that useful if you have a highly dynamic site as there are problems with data synchronization in database clusters (no matter what the database vendors tell you). But that's ok, single databases can handle a lot of volume if built correctly and caching is done appropriately at the application level.

      And there you have it, the structure of a really large site.
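
      The layering above can be sketched in a few lines of Python. This is only an illustration of the static/dynamic split and session affinity, not how any particular product does it: the server names, the suffix list, and the hash-based affinity rule are all assumptions made for the example.

```python
import hashlib

# Hypothetical names and rules, chosen only for illustration.
STATIC_SUFFIXES = (".html", ".css", ".js", ".png", ".jpg", ".gif")
APP_SERVERS = ["app1", "app2", "app3"]


def pick_app_server(session_id, servers=APP_SERVERS):
    """Map a session to one app server so its requests always land together."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


def route(path, session_id):
    """Static tier answers static files; everything else goes to the app tier."""
    if path.endswith(STATIC_SUFFIXES):
        return "static"
    return pick_app_server(session_id)


if __name__ == "__main__":
    for path in ("/img/logo.png", "/forum/post", "/forum/reply"):
        print(path, "->", route(path, "session-abc"))
```

      Hashing the session id keeps a user's per-session cache on one application server, which is what lets the application tier offload the database. A plain modulo like this reshuffles most sessions whenever the server list changes, so a real deployment would likely want something more stable.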

      --
      jim frost
      jimf@frostbytes.com
    4. Re:That "howto" sucks by khuber · · Score: 1
      Yes, I'm sort of playing a little dumb here ;). We go load balancer to front end web servers to app servers, then through a request router to back end servers. I work on the back end apps. We don't use clustering, but rather home grown replication.

      -Kevin

    5. Re:That "howto" sucks by ssassen · · Score: 1
      Hi Kevin,

      As you probably understood from reading the article, we're not a large corporation but just a small startup company with young and enthusiastic people fresh out of university and aspiring to be all that we can be. We are therefore taking this opportunity to learn from this experience, as the amount and diversity of traffic we've gotten over the past few days are both beyond anything we simulated. It is an opportunity for us to learn from any mistakes we've made and track down the bottlenecks.

      We've been going over the Slashdot comments and your response was one we certainly would like to follow up on. So if you're interested we'd love to hear more and welcome any advice you're able to give us. We'll be adding one or more pages to the article with, for example, software tricks for Apache and Linux and other tips and tricks that will help us, and others, to fine-tune their web server.

      Looking forward to your reply.

      Kind regards,

      Sander Sassen

      Email: ssassen@hardwareanalysis.com
      Visit us at: http://www.hardwareanalysis.com

  6. "Three times the power?" by mumblestheclown · · Score: 5, Insightful
    From the article:

    If we were to use, for example, Microsoft Windows 2000 Pro, our server would need to be at least three times more powerful to be able to offer the same level of performance.

    "three times?" Can somebody point me to some evidence for this sort of rather bald assertion?

    1. Re:"Three times the power?" by khuber · · Score: 5, Interesting
      That was total FUD. The two operating systems have comparable performance on the same hardware.

      -Kevin

    2. Re:"Three times the power?" by NineNine · · Score: 5, Informative

      "Microsoft Windows 2000 Pro"

      I got a good laugh out of this... W2K Pro is the desktop version, not the server version. Wow. Great article. Really well informed author.

    3. Re:"Three times the power?" by SpeedMan · · Score: 1

      I thought "bald" assertions with regards to MS Windows were only made by Steve Ballmer. s/bald/bold

      --
      Regards, SpeedMan
    4. Re:"Three times the power?" by (H)elix1 · · Score: 5, Informative

      That was total FUD. The two operating systems have comparable performance on the same hardware.

      Win2k pro limits you to 10 concurrent TCP/IP connections, Win2K Server has no (artificial) limit but won't cluster, Advanced Server can cluster but I don't know a thing about it..

      Linux has no (artificial) limit... not sure about clustering options there either.

      Found out about the TCP/IP limit when I added SP2 and trashed my evening counter-strike server - this makes a HUGE difference.

    5. Re:"Three times the power?" by sheldon · · Score: 2

      I hope when you're talking about clustering, you don't mean Beowulf?

      I find most Linux advocates don't understand the first thing about clustering, and keep misusing the term. Generally speaking for a web server you need limited clustering, that is you just want to do load balancing. But you also want to monitor the servers such that if one fails you take it out of the loop.

    6. Re:"Three times the power?" by Aldurn · · Score: 2, Informative

      At a website I used to work at, they decided they needed to use Windows 2000 Advanced Server for web clustering. That is, quite possibly, the worst decision they ever made (aside from going with Windows 2000; trust me on this one.)

      Win2k AS Load Balancing (aka WLBS: Windows Load Balancing Service) works by detecting other computers on the network with the same service, and they decide who will handle what request. They both have a primary IP, which is unique, in addition to a "virtual" address, which is the same on all of them. They also have a fake MAC address which is identical on both (makes for interesting ping responses.)

      An interesting thing we noticed about WLBS is that, unless a computer is off the network, it will still be in the cluster. I.e. if IIS fails on one machine, as long as you can ping it, it will still get traffic.

      When we moved from WLBS to LVS, we noticed a 50% drop in average CPU usage. This is probably due to the fact that now the clustering horsepower was moved off the web servers, but still, a free product versus a rather expensive one. And we've had better uptime now than ever before.
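
      The failure mode described above (a box that answers ping but whose web service is dead still gets traffic) is what a service-level health check avoids. Below is a minimal Python sketch of the idea; it is not how WLBS or LVS work internally, and the class, the probe function, and the backend names are all hypothetical.

```python
import urllib.request


def http_probe(backend, timeout=2.0):
    """Check the web service itself, not just whether the host is reachable."""
    with urllib.request.urlopen(f"http://{backend}/", timeout=timeout) as resp:
        return resp.status == 200


class BackendPool:
    """Round-robin over the backends that passed the last health check."""

    def __init__(self, backends, probe=http_probe):
        self.backends = list(backends)
        self.probe = probe              # callable(backend) -> bool
        self.healthy = list(backends)
        self._next = 0

    def _is_up(self, backend):
        try:
            return bool(self.probe(backend))
        except Exception:
            # Connection refused, timeout, HTTP error: all count as down.
            return False

    def run_health_checks(self):
        self.healthy = [b for b in self.backends if self._is_up(b)]

    def pick(self):
        if not self.healthy:
            raise RuntimeError("no healthy backends")
        backend = self.healthy[self._next % len(self.healthy)]
        self._next += 1
        return backend
```

      Calling `run_health_checks()` on a timer takes a server out of the loop as soon as its HTTP service stops answering, even if the machine still responds to ping.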

      --
      char sig[120] = "\0"
    7. Re:"Three times the power?" by Magila · · Score: 4, Informative

      Win2k pro limits you to 10 concurrent TCP/IP connections.

      Whoa! bullshit meter rising! While Win2K does have a limit on TCP/IP connections, it is in the thousands. A limit of 10 would be totally ridiculous, it would cripple the OS for MANY people. Also, most of the traffic for a CS server is UDP so the TCP/IP connection limit isn't going to affect that much at all.

    8. Re:"Three times the power?" by Anonymous Coward · · Score: 1

      IIS, on the other hand, does only support 10 concurrent connections.

    9. Re:"Three times the power?" by elemental23 · · Score: 5, Informative
      The maximum number of other computers that are permitted to simultaneously connect over the network to Windows NT Workstation 3.5, 3.51, 4.0, and Windows 2000 Professional is ten. This limit includes all transports and resource sharing protocols combined. This limit is the number of simultaneous sessions from other computers the system is permitted to host.

      From Microsoft Knowledge Base Article Q122920.
      (Warning: The page layout is broken in Mozilla)

      It's an artificial limitation. The idea is that if you need more simultaneous connections you should buy Win2k Server. In other words, MS wants you to spend more money.

      --
      I like my women like my coffee... pale and bitter.
  7. A little disapointing really by grahamsz · · Score: 5, Insightful

    The article seemed way too focused on hardware.

    Anyone who's ever worked on a big server in this cash-strapped world will know that squeezing every last ounce of capacity out of apache and your web applications needs to be done.

    1. Re:A little disapointing really by januschr · · Score: 2, Insightful

      The article seemed way too focused on hardware.

      Well the name of the website is "Hardware Analysis"... ,-)

      --
      This is my sig. Read it and weep.
    2. Re:A little disapointing really by Zeinfeld · · Score: 2
      The article seemed way too focused on hardware

      Yeah, maybe if the site had not been slashdotted...

      It does not appear that the site considers the most effective way to make a Web server fly: replace the hard drives with RAM. Ditch the obsolete SQL engine and use in-memory storage rebuilt from a transaction log.

      Of course the problem with that config is that an outage tends to be a problem so just duplicate the hardware at a remote disaster recovery site.

      Sound expensive? Well yes, but not half as expensive as some of the systems people put together to run SQL databases...
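
      For readers unfamiliar with the idea, here is a minimal Python sketch of in-memory storage rebuilt from a transaction log. It is one simple way to do it, not a description of any particular product: the class name, the JSON-lines log format, and the file path are assumptions made for the example.

```python
import json
import os


class LoggedStore:
    """Dict held in RAM; every change is appended to a log before it is applied."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()                       # rebuild state from the log
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                if line.strip():
                    self._apply(json.loads(line))

    def _apply(self, op):
        if op["op"] == "set":
            self.data[op["key"]] = op["value"]
        elif op["op"] == "delete":
            self.data.pop(op["key"], None)

    def _append(self, op):
        self._log.write(json.dumps(op) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())         # make the record durable first
        self._apply(op)

    def set(self, key, value):
        self._append({"op": "set", "key": key, "value": value})

    def delete(self, key):
        self._append({"op": "delete", "key": key})

    def get(self, key, default=None):
        return self.data.get(key, default)   # reads never touch the disk

    def close(self):
        self._log.close()
```

      Reads are served entirely from memory and only writes pay for a sequential append. The sketch does not handle a half-written last record or log compaction, both of which a real system would need.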

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
  8. my $0.02 by spoonist · · Score: 5, Informative

    * I prefer SCSI over IDE

    * RedHat is a pain to strip down to a bare minimum web server, I prefer OpenBSD. Sleek and elegant like the early days of Linux distros.

    * I've used Dell PowerEdge 2650 rackmount servers and they're VERY well made and easy to use. Redundant power supplies, SCSI removable drives, good physical security (lots of locks).

    1. Re:my $0.02 by lgftsa · · Score: 1

      Redundant power supplies, SCSI removable drives, good physical security (lots of locks).

      If you need locks on your servers to provide physical security, then you have more urgent problems than wringing the last drop of performance from your webserver.

    2. Re:my $0.02 by khuber · · Score: 3, Funny
      Back alley colocation. It's the only way to afford it these days.

      -Kevin

    3. Re:my $0.02 by Anonymous Coward · · Score: 0

      If you don't like Redhat, use Slackware. Slackware RULES!

    4. Re:my $0.02 by Door-opening+Fascist · · Score: 3, Informative
      RedHat is a pain to strip down to a bare minimum web server, I prefer OpenBSD [openbsd.org]. Sleek and elegant like the early days of Linux distros.

      OpenBSD doesn't have support for multiple processors, which are a necessity for database servers and dynamic web servers. I'd say FreeBSD is the way to go.

    5. Re:my $0.02 by yomahz · · Score: 3, Interesting

      RedHat is a pain to strip down to a bare minimum web server, I prefer OpenBSD [openbsd.org]. Sleek and elegant like the early days of Linux distros.

      Huh?

      for i in `rpm -qa|grep ^mod_`;do rpm -e $i;done

      rpm -e apache
      cd ~/src/apache.xxx
      ./configure --prefix=/usr/local/apache \
      --enable-rule=SHARED_CORE \
      --enable-module=so
      make
      make install

      with mod_so (DSO - Dynamic Shared Object) support, module installation is trivial.

      --
      "A mind is a terrible thing to taste."
    6. Re:my $0.02 by spoonist · · Score: 2

      What I meant by "strip down to a bare minimum web server" was more along the lines of:

      * I don't want freakin' xinetd running

      * I don't want freakin' gpm running

      * I don't want freakin' portmap running

      etc, etc.

      I've got more important things to do with my time than turn off every process known to man that comes installed. OpenBSD already comes with mostly everything turned off.

    7. Re:my $0.02 by SuiteSisterMary · · Score: 3, Informative

      If your server isn't designed with 'security' in mind, including the ability to padlock the chassis, and at least send an SNMP trap when the chassis is opened, then you need to learn that as far as 'computer and data security' is concerned, protecting from external network attacks is actually quite low on the totem pole.

      Or, "If Joe Random Idiot can walk in and rip out the hard drive, who cares how 3117 your firewall and other network protections are."

      --
      Vintage computer games and RPG books available. Email me if you're interested.
    8. Re:my $0.02 by yomahz · · Score: 2

      I've got more important things to do with my time than turn off every process known to man that comes installed. OpenBSD already comes with mostly everything turned off.

      Hmmm... the installation makes it pretty easy to remove these services. All it takes is a couple of clicks of a mouse. Even if it's post install, all you have to do is remove the files from the /etc/init.d and /etc/rc?.d dirs.

      You're right tho', the default probably shouldn't come with everything. You'd think they'd learn a lesson from MS and all the services that they turn on for you by default.

      --
      "A mind is a terrible thing to taste."
    9. Re:my $0.02 by Anonymous Coward · · Score: 0

      for i in `chkconfig --list | awk '/0:/ {print $1}'`;
      do chkconfig --level 0123456 $i off;
      done

      Turning everything off doesn't look so "freakin'" hard to me!

    10. Re:my $0.02 by Anonymous Coward · · Score: 0

      If Joe Random Idiot has even the remote possibility to come anywhere near your servers to rip out the hard drive, you have to learn that as far as 'computer and data security' is concerned, locking your doors and securing your building is way up on the totem pole. You just _don't_ protect your servers physically by sending SNMP traps!

    11. Re:my $0.02 by lgftsa · · Score: 1

      Ummm, yeah. Through the swipe-card door in reception, through the building without being challenged for lack of an ID card, through IT without being challenged, through the restricted (five people) swipe-card door to the machine room.

      BTW, if a nasty has such free access to your machine room, they are just as likely to take the entire 3U/4U server as the drives.

      Larger than that, and you move into Security Through Massivity territory, though. *grin*

    12. Re:my $0.02 by Anonymous Coward · · Score: 0

      Linux and FreeBSD do better at caching data from the filesystem: both can use all inactive RAM to do this. OpenBSD, like NetBSD, still has a fixed-size buffer cache which is statically assigned at boot time and has a default value which is fairly small. NetBSD 1.6 should improve this by incorporating Unified Buffer Cache (UBC) and OpenBSD should quickly follow suit. UBC is not very smart yet however, and can cause excessive paging. In other words, Net/OpenBSD's performance will suck on large data sets for any server having to do file I/O.

    13. Re:my $0.02 by SuiteSisterMary · · Score: 2

      I've seen it happen. Put on a nice business suit, claim to be a consultant, and the SEP field magically kicks into play.

      Like those IBM commercials showing the inside of the network as a round table, and the two thieves come in. "Umm...we're vendors."

      Or, to put it your way, why have the challenge if you've the swipe door? Why have the server room locked if the front door is locked? And so on. Just because you've a firewall doesn't mean you don't tell your database server to only accept requests from the webserver and the admin console. Just because the front door's locked, and the server room door's locked, doesn't mean you shouldn't lock the racks, and the machines.

      You might choose to trust Juan Third Party Repairman to repair the right machine, let alone not fuck something up, accidentally or maliciously, but I don't. For example.

      I guess what I'm trying to say in my own rambling way is, there's no percentage in taking chances.

      --
      Vintage computer games and RPG books available. Email me if you're interested.
    14. Re:my $0.02 by ssassen · · Score: 1
      Hi,

      As you probably understood from reading the article, we're not a large corporation but just a small startup company with young and enthusiastic people fresh out of university and aspiring to be all that we can be. We are therefore taking this opportunity to learn from this experience, as the amount and diversity of traffic we've gotten over the past few days are both beyond anything we simulated. It is an opportunity for us to learn from any mistakes we've made and track down the bottlenecks.

      We've been going over the Slashdot comments and your response was one we certainly would like to follow up on. So if you're interested we'd love to hear more and welcome any advice you're able to give us. We'll be adding one or more pages to the article with, for example, software tricks for Apache and Linux and other tips and tricks that will help us, and others, to fine-tune their web server.

      Looking forward to your reply.

      Kind regards,

      Sander Sassen

      Email: ssassen@hardwareanalysis.com
      Visit us at: http://www.hardwareanalysis.com

  9. So fast and soo goo... by PineGreen · · Score: 0, Redundant

    Yes, their performance server is so good and fast that it has been slashdotted within minutes of posting the article...

    1. Re:So fast and soo goo... by khuber · · Score: 3, Funny
      It's still running. It's just extremely slow. Or maybe it's so fast it's zipping through space-time and it only seems slow from our reference frame.

      -Kevin

    2. Re:So fast and soo goo... by irc.goatse.cx+troll · · Score: 3, Informative

      Server has nothing to do with it.
      10,000 slashdotters * 500k pages = 5gigs in about an hour.
      these figures are both estimates, but you can see that network congestion is obviously more of a bottleneck than their performance server.
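
      The back-of-the-envelope estimate above works out as follows in Python. The visitor count and page size are the post's own guesses; reading "500k" as 500,000 bytes and spreading the load evenly over the hour are assumptions of this sketch.

```python
def required_mbit_per_s(visitors, page_bytes, seconds):
    """Average line rate needed to deliver every page within the window."""
    total_bits = visitors * page_bytes * 8
    return total_bits / seconds / 1_000_000


VISITORS = 10_000        # estimate from the post
PAGE_BYTES = 500_000     # "500k" taken as 500,000 bytes (assumption)
ONE_HOUR = 3600          # seconds

total_gb = VISITORS * PAGE_BYTES / 1e9
rate = required_mbit_per_s(VISITORS, PAGE_BYTES, ONE_HOUR)

print(f"{total_gb:.1f} GB in an hour needs about {rate:.1f} Mbit/s sustained")
```

      That is an average of roughly 11 Mbit/s. Real traffic arrives in bursts, so the peak demand on the link would be higher than this figure.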

      --
      Pain lasts, kid. Its how you know you're alive. Sometimes I think this growing up thing is just pain management-TheMaxx
    3. Re:So fast and soo goo... by blueroo · · Score: 1

      yes, because as we all know "Error 500" means "Network Congestion".

    4. Re:So fast and soo goo... by Anonymous Coward · · Score: 0

      If you notice at the top of their page "Current bandwidth usage: 208.28 kbit/s" This has been well under the 10 Mbps required to pump out about 5 Gigs an hour. I run adult sites and have three RaQ4 servers pushing an average of 20-30 Mbps each. Hardware is the last of your worries when running a high traffic site, as long as you don't bog the server down with database requests and you tune software for maximum performance, utilizing multi-Mbps is very easy. For static content you may also want to look into thttpd. We use it for some of our image serving and it works great.

  10. Re:Troll? Informative is more like it. by OpCode42 · · Score: 2, Interesting

    Every time that you click on a link and get bumped back to the front page here on Slashdot, it's a failure of mysql. So much for high-performance.

    Why hasn't Slashdot changed to postgresql?


    I thought this was a good question, if slightly off-topic.

  11. Strange choice of processors by Ed+Avis · · Score: 5, Insightful

    I know that in the server market you often go for tried-and-tested rather than latest-and-greatest, and that the Pentium III still sees some use in new servers. But 1.26GHz with PC133 SDRAM? Surely they'd have got better performance from a single 2.8GHz Northwood with Rambus or DDR memory, and it would have required less cooling and fewer moving parts. Even a single Athlon 2200+ might compare favourably in many applications.

    SMP isn't a good thing in itself, as the article seemed to imply: it's what you use when there isn't a single processor available that's fast enough. One processor at full speed is almost always better than two at half the speed.

    --
    -- Ed Avis ed@membled.com
    1. Re:Strange choice of processors by khuber · · Score: 1
      Boy I don't know if I'd say that. I really like quad+ boxes. From my view as a developer they seem to work pretty well. Most web serving stuff isn't CPU bound, it's I/O bound. Having a couple processors seems to smooth things out.

      -Kevin

    2. Re:Strange choice of processors by sql*kitten · · Score: 1

      SMP isn't a good thing in itself, as the article seemed to imply: it's what you use when there isn't a single processor available that's fast enough. One processor at full speed is almost always better than two at half the speed.

      It depends. If you are bogged down in a context thrash, dual slower processors will recover more easily than a single fast one. Generally, if you have many processes in the run queue, multiple CPUs perform better than a single one.

    3. Re:Strange choice of processors by minion · · Score: 1

      SMP isn't a good thing in itself, as the article seemed to imply: it's what you use when there isn't a single processor available that's fast enough. One processor at full speed is almost always better than two at half the speed.

      Not necessarily. You're forgetting that each time a new thread or process needs CPU time, the L1/L2 has to be flushed and new data is retrieved from main memory. That's a huge overhead, as main memory is slow compared to L1/L2 cache. When using SMP, a thread or process can happily be on either CPU, so now you went from 1 cache flush for each new process, to a better chance of fewer flushes because you have twice the available processing. Almost all modern OSs use threads and processes (except some RT systems), so even if the CPU is not 100% maxed on both SMPs, your data retrieval from main memory won't hurt you so much.

      --

      -- If we don't stand up for our rights, now, there will be no right to stand up for them later.
    4. Re:Strange choice of processors by Ed+Avis · · Score: 1

      Yeah, you're right, in some cases having two half-speed CPUs is better than one full-speed because of caching. To guarantee equally good performance with all apps, you'd need to make your single processor have not only twice the clock speed, but also twice the size of cache, and make sure your OS's context switches didn't happen often enough for frequent pipeline flushes to affect performance.

      Hang on a bit, you say the L1 and L2 caches have to be flushed on each context switch? That seems crazy. The caches deal with physical addresses not logical ones, right?

      --
      -- Ed Avis ed@membled.com
    5. Re:Strange choice of processors by Anonymous Coward · · Score: 0

      Most modern CPUs have caches that deal with physical addresses. Those that have caches that see virtual addresses (before translation by the MMU) do have to flush the cache on context switches. In other words, most CPUs don't have to do this, but there are probably a few that do.

  12. 20 Minutes After Posting and it's Already /. by LuxuryYacht · · Score: 1, Redundant

    If their servers are so good, why is their site down after only 20 minutes of being /.ed?

    --
    Quidquid latine dictum sit altum videtur
  13. How to make a fool of yourself by noxavior · · Score: 5, Funny

    Step one: Submit story on high performance web servers.
    Step two: ???
    Step three: Die of massive slashdotting, loss of reputation and business


    Still, if someone has a link to a cache...

    --
    Karma:This parrot is dead! (and so is the joke.)
  14. And as the last step... by MavEtJu · · Score: 5, Funny

    ... Don't forget to post an article on /. so you can actually measure high-volume bulk traffic.

    [~] edwin@topaz>time telnet www.hardwareanalysis.com 80
    Trying 217.115.198.3...
    Connected to powered.by.nxs.nl.
    Escape character is '^]'.
    GET /content/article/1549/ HTTP/1.0
    Host: www.hardwareanalysis.com

    [...]
    Connection closed by foreign host.

    real 1m21.354s
    user 0m0.000s
    sys 0m0.050s

    Do as we say, don't do as we do.

    --
    bash$ :(){ :|:&};:
  15. High powered webserver? by Moonshadow · · Score: 5, Funny
    In an hour or so, I'm predicting it will be a high-powered heap of smoking rubble. It's almost like this is a challenge to us.

    Maybe it's their idea of a stress test. It's kinda like testing a car's crash durability by parking it in front of an advancing tank.

  16. Defintion of irony by nervlord1 · · Score: 3, Funny

    An article about creating high performance webservers being slashdotted

    --
    Microsoft IIS is to webserving as KFC is to healthy eating
    1. Re:Defintion of irony by NineNine · · Score: 1

      Well, by using the same brilliant skills of analysis you do, this article is running on Apache, and the webserver is dead. That must mean that Apache is the Taco Bell of the webserver world, right?

      Kid, go back to playing with your Nintendo. You're in over your head here.

    2. Re:Defintion of irony by Electrum · · Score: 2

      Well, by using the same brilliant skills of analysis you do, this article is running on Apache, and the webserver is dead. That must mean that Apache is the Taco Bell of the webserver world, right?

      That would be about right. It's cheap, lots of people use it, but it's certainly not the best.

    3. Re:Defintion of irony by Anonymous Coward · · Score: 0

      There is a great restaurant in my town with some excellent tacos and the best food all around I have ever tasted, but it's quite costly. I would say they are IIS.

  17. server load by MegaFur · · Score: 5, Funny

    Many other people will likely post a comment like mine, if they haven't already. But hey, karma was made to burn!

    According to my computer clock and the timestamp on the article posting, it's only been about 33 minutes (since the article was posted). Even so, it took me over a minute to finally receive the "Hardware Analysis" main page. The top of that page has:

    Please register or login. There are 2 registered and 995 anonymous users currently online. Current bandwidth usage: 214.98 kbit/s

    Draw your own conclusions.

    --
    Furry cows moo and decompress.
    1. Re:server load by Anonymous Coward · · Score: 0

      Please register or login. There are 4 registered and 1590 anonymous users currently online. Current bandwidth usage: 1170.45 kbit/s Oct 19 07:39 EDT

      -- Nice of them to provide stats of their impending implosion!

    2. Re:server load by Anonymous Coward · · Score: 0
      Shall we try a running tally then?
      Please register or login. There are 3 registered and 1758 anonymous users currently online. Current bandwidth usage: 1900.51 kbit/s Oct 19 07:45 EDT
    3. Re:server load by Anonymous Coward · · Score: 0

      think we can drive it up to 10mbps usage at least? (hey, we just hit 3.3, so why not? :P)
      let's begin by linking to their interview with CmdrTaco.
      I'm also sure they'd appreciate it if we posted a comment in their forums. ;)

    4. Re:server load by Anonymous Coward · · Score: 0

      Please register or login. There are 3 registered and 1624 anonymous users currently online. Current bandwidth usage: 3652.79 kbit/s

    5. Re:server load by Queuetue · · Score: 2
      Please register or login. There are 4 registered and 1428 anonymous users currently online. Current bandwidth usage: 1183.73 kbit/s
      Took about 3 minutes, next page would not load.
    6. Re:server load by Anonymous Coward · · Score: 5, Funny

      Please flush my dns entry, or better yet unplug me. There are 0 registered and millions of the slashdot horde currently refreshing their browser and laughing at my stats. Current bandwidth usage: 100 Mbit/s.

    7. Re:server load by fusiongyro · · Score: 5, Interesting

      Well, they're about slashdotted now. They lost my last request, and it says they have almost 2000 anonymous users. I sometimes think the reason I like reading Slashdot isn't because of the great links and articles, but instead because I like being a part of the goddamned Slashdot effect. :)

      Which brings me to the point. Ya know, about the only site that can handle the Slashdot effect is Slashdot. So maybe Taco should write an article like this (or maybe he has?). The Slashdot guys know what they're doing, we should pay attention. Although I find it interesting that when slashdot does "go down," the only way I know is because for some reason it's telling me I have to log in (which is a lot nicer than Squid telling me the server's gone).

      --
      Daniel

    8. Re:server load by Anonymous Coward · · Score: 0

      Please register or login. There are 1 registered and 1874 anonymous users currently online. Current bandwidth usage: 2852.90 kbit/s

      warm up those pipes :)

    9. Re:server load by SnAzBaZ · · Score: 1

      Please register or login. There are 1 registered and 1975 anonymous users currently online. Current bandwidth usage: 4156.09 kbit/s Oct 19 08:47 EDT

    10. Re:server load by stevey · · Score: 3, Interesting

      I seem to remember that there was an article just after the WTC attacks last year, which discussed how Slashdot had handled the massive surge in traffic after other online sites went down.

      From memory it involved switching to static pages, and dropping gifs, etc.

      Unfortunately the search engine on Slashdot really sucks - so I couldn't find the piece in question.

    11. Re:server load by blibbleblobble · · Score: 2

      There are a few registered and quite a few anonymous users currently online. Current bandwidth usage: 6.80 kbit/s Oct 19 12:02 EDT

      Guess they stopped counting. We're supposed to be impressed that their dynamic page with 7 embedded tables and 160 images loads in less than three minutes?

      If only they hadn't copied the review format from Tom's Hardware. Take a 1000-word article, add 2000 words of padding, and split it across 9 pages including an index.

    12. Re:server load by 1110110001 · · Score: 3, Informative

      Maybe the article Handling the Loads, describing how Slashdot kept their Servers up at 9/11, is a bit of the thing you're looking for. b4n

    13. Re:server load by Anonymous Coward · · Score: 0

      Give me a break. I've worked on real sites with 10s of millions of page views a day, and the "slashdot" effect didn't even cause a hiccup. Sites like CNN and Yahoo had trouble on 9/11 because they were doing hundreds of millions of page views, double or even triple their usual load.

      Of course, this article is moronic... take a look at any of the recent web server benchmarks and you see numbers in the 2k-5k/s range... which is more page views per machine than 95% of the sites out there will ever do. Of course, that's static pages, which aren't overly interesting. Anything else means interaction, dynamic content, terabytes of data, etc... and the real bottleneck is always the network, since in the real world you're serving most of your clients down horrible connections through 56k and slower modems, or countries 25 hops away with congestion to match, badly built tcp/ip stacks, broken connections, long timeouts, and your apache server with 150 connections is lucky if only 20 of them are "dead" with these problems... hence the reason for the async i/o based proxies you put in front of your main webservers... maybe.
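
      A rough back-of-the-envelope for the slow-client point above. The 56k modem and the 150-connection Apache pool come from the post; the 50 KB page size is an assumed figure for illustration, and latency and protocol overhead are ignored, so treat this as a sketch rather than a measurement.

```python
# Rough estimate of how slow clients cap a process-per-connection server.
# PAGE_BYTES is an assumption; the modem speed and worker count are from the post.

PAGE_BYTES = 50 * 1024        # assumption: 50 KB per page view
MODEM_BITS_PER_SEC = 56_000   # nominal 56k modem, best case
WORKERS = 150                 # Apache connection slots


def seconds_per_page(page_bytes: int, link_bits_per_sec: int) -> float:
    """Time one worker stays busy pushing a page down a slow link."""
    return page_bytes * 8 / link_bits_per_sec


def max_pages_per_sec(workers: int, busy_seconds: float) -> float:
    """Upper bound on throughput when every worker is tied to a slow client."""
    return workers / busy_seconds


if __name__ == "__main__":
    busy = seconds_per_page(PAGE_BYTES, MODEM_BITS_PER_SEC)
    print(f"one worker busy for {busy:.1f} s per page")
    print(f"ceiling: {max_pages_per_sec(WORKERS, busy):.1f} pages/s")
```

      Under those assumptions each worker is tied up for roughly 7 seconds per page, which caps the box at about 20 pages/s no matter how fast the CPU is. That is the case for putting an async proxy in front to absorb the slow connections.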

      Using Sun 4xxx and 5xxxx machines as webservers? You must be insane. You don't need 12 processors to shuttle network data around. Of course, we also saw higher failure rates on our sun hardware over our shitty intel boxes (I know, 3 intel failures in 3 years out of 120 boxes, that was just horrible)

  18. Not-so high performance by Anonymous Coward · · Score: 0

    ping www.hardwareanalysis.com

    Pinging www.hardwareanalysis.com [217.115.198.3] with 32 bytes of data:

    Request timed out.
    Reply from 217.115.198.3: bytes=32 time=765ms TTL=240
    Reply from 217.115.198.3: bytes=32 time=1038ms TTL=240
    Reply from 217.115.198.3: bytes=32 time=2036ms TTL=240

    Ping statistics for 217.115.198.3:
    Packets: Sent = 4, Received = 3, Lost = 1 (25% loss),
    Approximate round trip times in milli-seconds:
    Minimum = 765ms, Maximum = 2036ms, Average = 1279ms

    1. Re:Not-so high performance by Inda · · Score: 1

      Not trying to be funny matey, but I think your connection is shagged. Not that pinging the server shows much anyway...

      ping www.hardwareanalysis.com

      Pinging www.hardwareanalysis.com [217.115.198.3] with 32 bytes of data:

      Reply from 217.115.198.3: bytes=32 time=39ms TTL=241
      Reply from 217.115.198.3: bytes=32 time=37ms TTL=241
      Reply from 217.115.198.3: bytes=32 time=56ms TTL=241
      Reply from 217.115.198.3: bytes=32 time=39ms TTL=241

      Ping statistics for 217.115.198.3:
      Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
      Approximate round trip times in milli-seconds:
      Minimum = 37ms, Maximum = 56ms, Average = 42ms

      --
      This post contains benzene, nitrosamines, formaldehyde and hydrogen cyanide.
    2. Re:Not-so high performance by chrysalis · · Score: 4, Insightful

      The article is about *WEB* high performance.

      I don't see your point. "ping" has never been designed to benchmark web servers AFAIK.

      My servers don't answer to "ping". Does it mean that the web server is down? Nope... it's up and running...

      "ping" is not an all-in-one magic tool. By using "ping" you can test a "ping" server. Nothing else.

      --
      {{.sig}}
    3. Re:Not-so high performance by Anonymous Coward · · Score: 0


      riiiight, a ping server egh, nice to see your servers (assuming your profile is correct) don't respond to a ping egh ?

      Pinging pureftpd.org [216.136.171.204] with 32 bytes of data:

      Reply from 216.136.171.204: bytes=32 time=158ms TTL=233
      Reply from 216.136.171.204: bytes=32 time=158ms TTL=233
      Reply from 216.136.171.204: bytes=32 time=186ms TTL=233
      Reply from 216.136.171.204: bytes=32 time=157ms TTL=233

      Ping statistics for 216.136.171.204:
      Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
      Approximate round trip times in milli-seconds:
      Minimum = 157ms, Maximum = 186ms, Average = 164ms

      glad you got a grip on tcp

    4. Re:Not-so high performance by Fluffy+the+Cat · · Score: 1

      Since when did ping have anything to do with tcp?

    5. Re:Not-so high performance by Fluffy+the+Cat · · Score: 2, Informative

      Servers will generally carry on pinging even if they're heavily overloaded. Lag or missing packets is generally either a congested or bad link.

    6. Re:Not-so high performance by Anonymous Coward · · Score: 0

      Disabling ICMP REPLY is a bad idea.

      TCP does use it. Or at least some stack implementations do, to solve for the path mtu. That's the largest transmittable unit before somebody along the way fragments the packet. And IP fragmentation sucks (lose one fragment, all need to be resent).

    7. Re:Not-so high performance by chrysalis · · Score: 2

      ICMP REPLY doesn't exist. Maybe you mean ICMP ECHO REPLY which has nothing to do with MTU discovery.
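
      To make the distinction concrete, here is a small sketch of the ICMP (IPv4) messages involved. The type and code numbers are the standard ones from RFC 792 and RFC 1191; the helper names are made up for illustration.

```python
# ICMP (IPv4) message types relevant to the ping vs. path-MTU argument.
ICMP_ECHO_REPLY = 0            # what a host sends back to "ping"
ICMP_DEST_UNREACHABLE = 3      # code 4 below drives path MTU discovery
ICMP_ECHO_REQUEST = 8          # what "ping" sends
CODE_FRAGMENTATION_NEEDED = 4  # "fragmentation needed and DF set"


def used_by_pmtu_discovery(icmp_type: int, icmp_code: int) -> bool:
    """True only for the message path MTU discovery depends on."""
    return (icmp_type, icmp_code) == (ICMP_DEST_UNREACHABLE,
                                      CODE_FRAGMENTATION_NEEDED)


def used_by_ping(icmp_type: int) -> bool:
    """True for the echo request/reply pair that ping relies on."""
    return icmp_type in (ICMP_ECHO_REQUEST, ICMP_ECHO_REPLY)
```

      So dropping echo requests and replies leaves path MTU discovery alone. Dropping all ICMP, including type 3 code 4, is what breaks it.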

      --
      {{.sig}}
    8. Re:Not-so high performance by chrysalis · · Score: 2

      You are pinging Sourceforge.

      --
      {{.sig}}
    9. Re:Not-so high performance by Anonymous Coward · · Score: 0

      "By using "ping" you can test a "ping" server. Nothing else."

      well, you can test connection speed, which is a reasonably useful factor in most webservers.

      As for the processor-time dedicated to apache/php/sql per request, that's something to get from the webserver itself, possibly testing the site design on a machine with logging.
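
      If the goal is to see how the web server itself responds rather than the network path, one option is to time a complete HTTP request instead of an ICMP echo. A minimal sketch using only the Python standard library follows; the function name is hypothetical, and a single request like this is nowhere near a real benchmark.

```python
import time
import urllib.request


def time_http_get(url: str, timeout: float = 10.0) -> tuple[int, float]:
    """Fetch url once and return (HTTP status, elapsed seconds)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()              # include the body transfer in the timing
        status = response.status
    return status, time.perf_counter() - start
```

      Calling time_http_get() with the URL of the page you care about gives the status code and the wall-clock time for one full fetch. That covers DNS lookup, TCP setup, server processing and transfer, none of which ping measures.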

    10. Re:Not-so high performance by Anonymous Coward · · Score: 0

      Good point. Some people just don't want to admit they're wrong. The grandparent should've just left the issue alone when Fluffy the Cat rightly contradicted him. Instead, he made a bigger fool of himself by spewing out more falsehoods.

    11. Re:Not-so high performance by Saint+Aardvark · · Score: 2
      Hehe...for some reason the idea of building a ping server strikes me as funny.

      We tested it in the workshop by hooking it up to a 3Com X500 Terabit Switch, and using over 500 RedHat servers to ping -f. This baby handled it well -- the time we'd spent optimizing the Oracle backend really paid off.

      Yeah. Or maybe I should just have more coffee...

  19. Almost by Anonymous Coward · · Score: 2, Insightful

    > One processor at full speed is almost always better than two at half the speed.

    You can safely drop that 'almost'.

    1. Re:Almost by Anonymous Coward · · Score: 0

      No he can't. Two processors bound with individual NICs is much better than one fast processor bound with two NICs.

    2. Re:Almost by bolthole · · Score: 2
      You can safely drop that 'almost'.

      wrong. For example, when you have a situation where you have lame hardware/drivers that do a lot of busywaits. With a single-cpu system, your system will be completely idle under that situation, no matter what speed cpu you have. Whereas with a dual cpu system, you will be able to get other work done.

      Assuming you have a decent OS, of course.

  20. It took 3 minutes to load by roly · · Score: 0

    Soo slow!

    --
    "With Microsoft, you get Windows. With Linux, you get the full house" - unknown
  21. Alternative HowTo by h0tblack · · Score: 4, Informative

    1. goto here
    2. click buy
    3. upon delivery open box and plugin
    4. turn on Apache with the click of a button
    5. happily serve up lots of content :)

    6. (optional) wait for attacks from ppl for suggesting Apple hardware...

    1. Re:Alternative HowTo by khuber · · Score: 1
      They're evaluating an Xserve at work compared to IBM blades. I'm anxious to see what their results are. I guess the admin software is nice.

      -Kevin

    2. Re:Alternative HowTo by Anonymous Coward · · Score: 0

      Typical mac user.

      What about first checking the apache config first before starting?

      You could even use MS Word to examine it... lest you have to use Terminal.app

    3. Re:Alternative HowTo by h0tblack · · Score: 2

      Definitely sounds like an interesting evaluation exercise.
      I'm of the opinion that it was a great move by Apple to move into this lower end server market. There's a lot of organisations that need some sort of server system for their network, but don't have the resources or the expertise to use some of the more traditional *nix based systems. That isn't to say that these are solely aimed at the "Idiots Guide to running a Server" market. There may be some nice user-friendly management and monitoring tools, but there's a lot under the hood to play with too. In the future there's also some interesting possibilities with clustering and the upcoming PPC970's from IBM. After all, this is really the first 'proper' server offering from Apple, future generations of the Xserve are definitely something to keep an eye on IMHO.

    4. Re:Alternative HowTo by GoRK · · Score: 2

      You forgot at least one step. Pick one to add but not both:

      4.5. Just because we're using a mac webserver, doesn't mean we're free from the responsibility of properly tuning our configuration. Anyone can buy a box of any type that's preconfigured to run apache when you first plug it in. Anyway, we tune the heck out of our Apache so that it will stand up to the load we're expecting.

      or

      7. Wonder what is going wrong when we realize we have no grasp of how our computer or applications actually work.

      ~GoRK

    5. Re:Alternative HowTo by mcowger · · Score: 2, Informative

      You missed a few steps:

      3a) Pull off god awful packaging
      3b) Install in rack with mickey mouse install setup that requires removing the cover from the machine, exposing all the internal electronics while you're at it
      3c) Make sure the system sags in the middle while installed in the rack.


      and

      4a) Wipe OS because you have to before you can set up RAID
      4b) Setup RAID, have the disk set utility fail multiple times with cryptic errors, only to find that Apple's own docs say this is 'normal behavior'
      4c) When disks fail and are removed, must reboot server to single user mode to reconstruct failed data. May or may not work... Apple says 'normal behavior'


      and

      5a) Hope that your machine doesn't exhaust its TCP connection pool, which it will if you make too many SSH connections to it.


      Sorry, I'm just so pedantic today.

      Really, though, the XServes are a cheap attempt at a server that just doesn't work. It's a mickey mouse hack from the beginning. And yes, I have set them up personally. Only 2, because I won't recommend the purchase of any more after THAT experiment.

    6. Re:Alternative HowTo by Anonymous Coward · · Score: 0

      and of course not to forget

      8. Profit!

  22. Slashdotted again by Anonymous Coward · · Score: 1, Informative

    These guys got taken down a few weeks back:

    Hard Drives Evaluated for Noise, Heat and Performance

    I'm sure spreading out their content over nine pages is definitely helping their server load.

  23. Why Apache? by chrysalis · · Score: 5, Informative

    I don't understand.

    Their article is about building a high performance web server, and they tell people to use Apache.

    Apache is featureful, but it has never been designed to be fast.

    Zeus is designed for high performance.

    The article supposes that money is not a problem. So go for Zeus. The Apache recommendation is totally out of context.

    --
    {{.sig}}
    1. Re:Why Apache? by khuber · · Score: 2, Redundant
      Any web server can be good enough as long as you spread the load over enough boxes. Apache is much more flexible than Zeus.

      -Kevin

    2. Re:Why Apache? by jimfrost · · Score: 2
      Apache is more flexible, but in traditional versions (1.x) you have a problem in that a new program instance is used for each request. That makes things like maintaining persistent connections to the application servers really hard.

      Using something like iPlanet each server instance opens a number of connections to each application server in your cluster; you get a nice connection pool that way. With the Apache design (again this is 1.x) you can't use a pool so TCP setup/teardown costs between the web server and the application servers start to be an issue.

      Not that people don't do it, but it's a lot less efficient.

      I can't speak for Zeus, and as I understand it the most recent version of Apache allows threaded deployments that can take advantage of connection pooling, but most high volume sites use IIS or iPlanet as their front end web server.

      --
      jim frost
      jimf@frostbytes.com
    3. Re:Why Apache? by khuber · · Score: 1
      BigIP->IIS->IPlanet->MQ Series->Back end.
      Databases are DB2 on Sun 6500s.

      -Kevin

    4. Re:Why Apache? by khuber · · Score: 1
      Actually that's not entirely true.

      Sometimes there are two levels of app servers between IIS and MQ Series, and it's not always IPlanet.

      -Kevin

    5. Re:Why Apache? by jimfrost · · Score: 2
      You're talking about one particular application I imagine. MQ Series is actually pretty rare in large scale deployments, DB2 is like my third choice in databases, and I'd prefer not to use HTTP servers as the actual application server.

      YMMV.

      --
      jim frost
      jimf@frostbytes.com
    6. Re:Why Apache? by khuber · · Score: 1
      Well you could consider it one application. It's not really. It's essentially a complete generalized back end content system that consists of many servers. MQ provides asynchronous interprocess communication on the back end as well as content-based routing and load management. I don't have an informed opinion on DB2 vs. Oracle vs. whatever, other than that I have complained about DB2's aggressive lock escalation. I have no say in DBMS software.

      -Kevin

    7. Re:Why Apache? by Electrum · · Score: 2

      Any web server can be good enough as long as you spread the load over enough boxes. Apache is much more flexible than Zeus.

      Sure, but if you need 2+ Apache boxes to handle the load of one Zeus box, wouldn't it make more sense to buy Zeus in the first place?

      I would like you to qualify your statement about Apache being more flexible. Zeus is a lot easier to configure than Apache. In what aspects is Apache more flexible?

      When it comes to mass virtual hosting, Zeus beats the pants off Apache. Zeus' configuration is fully scriptable out of the box. Apache's is not. Zeus can do wildcard subservers. Apache cannot. Zeus does not require restarting to make configuration changes or add sites. Apache does. Sites can only be added in Apache if using the very limited mass vhost module.

    8. Re:Why Apache? by Anonymous Coward · · Score: 0
      Quoting:

      Although the old Hardware Analysis web server is retired and no longer actively serving any of Hardware Analysis' content we're keeping it on standby if things indeed get really busy, during heavy traffic spikes. We will use it for serving static content utilizing Linux in-kernel web server called Tux.

    9. Re:Why Apache? by khuber · · Score: 1
      I was pretty much just talking out of my ass, so feel free to ignore me. My employer doesn't use either and I'm out of date on my Zeus knowledge. We don't need virtual hosting.

      Apache has a lot of mods, documentation, and community support. That's what I was thinking of in terms of flexibility. Isn't Zeus also closed source? I imagine the gap has narrowed as Zeus has developed. Also, Jim Frost already mentioned problems using Apache with app servers.

      -Kevin

    10. Re:Why Apache? by Anonymous Coward · · Score: 0

      And you're talking out of your ass again?

    11. Re:Why Apache? by Syn+Ack · · Score: 1


      I agree with another post that says, why deploy 2 Apache boxes for every 1 Zeus box? Zeus beats the snot out of Apache every day of the week. Not to mention the nightmare that it is to manage a large Apache cluster and make sure configuration files are kept in sync etc. Zeus is managed from a central management server and replicates its configs out to the web servers in the cluster. You have 30 web servers, do you? No problem.

      As for Apache being more flexible, apparently you've never used Zeus to any extent.

      Oh and just FYI, Zeus also scales almost linearly on multiprocessor servers. It also runs PHP faster than Apache does, so I don't see where, aside from price, Apache even comes close.

      Syn Ack.

    12. Re:Why Apache? by crucini · · Score: 2
      Apache is more flexible, but in traditional versions (1.x) you have a problem in that a new program instance is used for each request. That makes things like maintaining persistent connections to the application servers really hard.

      Actually, Apache 1.x forks a number of children upon launch, the quantity specified by the StartServers parameter (default 5). It then forks and kills children as necessary to accommodate the load, keeping the number of spare (idle) processes between MinSpareServers and MaxSpareServers. So it always has a pool of spare servers to handle the next connection - it does not fork upon accepting a connection.

      Therefore database handles can be held by the process and used through multiple request/response cycles. Mod_perl users accomplish this transparently with the Apache::DBI module, which overrides the connect method of DBI, causing it to first draw from a pool of cached handles.

      Of course this technique can easily be applied to TCP connections to application servers, or any other reusable resource that takes time to acquire.
    13. Re:Why Apache? by Bedouin+X · · Score: 2

      No it wouldn't. Another webserver would cost less than a copy of Zeus.

      --
      Dissolve... Resolve... Evolve...
    14. Re:Why Apache? by jimfrost · · Score: 2
      This is true if you're using Apache as the application server, but most large web applications use the HTTP server as a front end, serving only static content. They refer requests to a back-end application for dynamic page generation, and often that application is running on a cluster of machines.

      If you're using session affinity to bind a session to a particular application server, which is pretty much a necessity for high-volume applications, then it's to your benefit if each HTTP server can hold a connection open to every application server.

      You can't do that on a 1.x Apache server because you'd end up having one connection for every app server and every Apache instance, and that can easily run into the tens of thousands of connections.

      With Apache, therefore, you usually build a new TCP connection with each request, which is not very efficient.
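
      The arithmetic behind "tens of thousands of connections" can be sketched as below. All the numbers in the usage note are invented for illustration, and the function is a simplification that assumes every worker keeps one open connection to every app server.

```python
def backend_connections(web_boxes: int, workers_per_box: int,
                        app_servers: int, shared_pool_size: int = 0) -> int:
    """Persistent connections needed so every worker can reach every app server.

    shared_pool_size == 0 models a process-per-slot server, where each worker
    process holds its own connection to each app server. A positive value
    models a threaded server where one pool per box is shared by all workers.
    """
    if shared_pool_size > 0:
        return web_boxes * app_servers * shared_pool_size
    return web_boxes * workers_per_box * app_servers
```

      With, say, 10 web boxes of 150 Apache children each and 20 app servers, that is 10 * 150 * 20 = 30,000 open connections. A threaded server sharing a pool of 8 connections per app server would need 10 * 20 * 8 = 1,600.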

      --
      jim frost
      jimf@frostbytes.com
  24. Re:Troll? Informative is more like it. by roly · · Score: 0

    Why hasn't Slashdot changed to postgresql?

    Or better yet, SlashSQL?

    --
    "With Microsoft, you get Windows. With Linux, you get the full house" - unknown
  25. And it just keeps climbing... by nutbar · · Score: 0, Redundant
    New, at Hardware Analysis!
    Watch our high performance webserver get slashdotted, in real time!
    How long until it melts? Let's see if those aftermarket heatsinks really paid off.

    There are 3 registered and 1643 anonymous users currently online. Current bandwidth usage: 1215.81 kbit/s

  26. Koestler by Anonymous Coward · · Score: 0

    There are moments in our lives, especially when one is under some strain (due to the imminent locality of others predominantly), when we might utter a nonsensical word. Eg. I found myself saying the word "unrelentless" just this evening. What I meant to say was "unrelenting" or "relentless". The ghost in the machine is a busy guy nowadays with 6 billion of us.

  27. Building a Better Webserver in the 21st Century by Anonymous Coward · · Score: 0

    Ace's Hardware has (IMHO) a better article about this subject: http://www.aceshardware.com/read.jsp?id=45000240

    1. Re:Building a Better Webserver in the 21st Century by khuber · · Score: 3, Informative
      I hate to do this, but actually MS has put out some good stuff that's relevant to larger sites.

      http://www.microsoft.com/backstage/whitepaper.htm

      -Kevin

    2. Re:Building a Better Webserver in the 21st Century by thona · · Score: 1

      There is nothing wrong with giving MS credit where credit is due. And the fact is that the MS website is one of the most heavily trafficked sites on the planet. It runs great - so they know how to do this. THAT said, most technologies they use are meaningless for smaller sites. They are just SO big :-)

  28. Server running at near 100% load by ssassen · · Score: 5, Informative
    From the SecureCRT console, connected through SSH1, as the backend is giving me timeouts. I can tell you that we're near 100% server load and are still serving out those pages to at least 1500 clients. I'm sure some of you get timeouts or can't even reach the server at all, for that I apologize, but we just have one of these, not a whole rack full of them.

    Have a good weekend,

    Sander Sassen

    Email: ssassen@hardwareanalysis.com
    Visit us at: http://www.hardwareanalysis.com

    1. Re:Server running at near 100% load by khuber · · Score: 1
      The server is responsive now. I wonder what they did.

      -Kevin

    2. Re:Server running at near 100% load by NineNine · · Score: 1

      Ha! 1500 clients swamps the box? Sounds like a shitty webserver to me. My simple little P 1.2 Ghz running W2K gets to about 10% CPU usage with that load. Here's a clue kid: it ain't the hardware, it's the software. And I'm not talking about W2K vs. Linux or Apache vs. IIS. There's some seriously shitty code running that webserver.

    3. Re:Server running at near 100% load by ssassen · · Score: 1
      You obviously don't know what you're talking about, else you wouldn't give that kind of a response. There's over 2000 clients accessing pages with a high hit-count, a flash animation and other dynamic content, that's not an easy task I can assure you.

      Kind regards,

      Sander Sassen

      Email: ssassen@hardwareanalysis.com
      Visit us at: http://www.hardwareanalysis.com

    4. Re:Server running at near 100% load by NineNine · · Score: 1

      Oh right... I don't know what I'm talking about. All I gotta say is that my server handles that same load fine, all driven from a database, and you're the one writing the article about "how to build a high performance webserver" on a webserver that's dead. No need for a dick-measuring contest here.

    5. Re:Server running at near 100% load by waterwheel · · Score: 1

      Not to be confrontational, but isn't that the point? It's straining under 2000 clients. If you're touting high end knowledge, you should be doing better than that. The first thing you need to do is get rid of the flash. That alone should allow you to serve orders of magnitude higher numbers than what you are doing now. And that's what some here are suggesting - there are far more effective ways to serve high volumes than tackling hardware.

    6. Re:Server running at near 100% load by Anonymous Coward · · Score: 0

      I'll bet....Why won't you post the URL then??

      Lets see just how much traffic your box can take for a /. effect.....

      Or are ya a chicken?? :))

    7. Re:Server running at near 100% load by Anonymous Coward · · Score: 1, Funny

      Uhmm, he does post a url. See NineNine.com for details. I think you could learn a bit from the owner of a porn site when it comes to server performance.

    8. Re:Server running at near 100% load by Anonymous Coward · · Score: 3, Insightful

      I'm sorry, but if your server cannot handle 2000 connections then NineNine is right, you have a crappy backend. How is the fact that you have Flash animation relevant? Isn't a 200k flash animation the same as a 200k jpeg from the server's point of view? If your server cannot handle 2000 connections, what business do you have writing an article about "high performance" webservers? It would be a different story if you entitled it "high performance webserver for less than $1000," but you didn't.

      Personally I think the new trend on Slashdot of "hey, I saw this article about ____, it's really insightful and just great!" being submitted by the author of that article is sort of shitty. If anybody knows about building a high traffic webserver, it would be Slashdot, so you'd think they'd be a little pickier about what they post regarding high performance servers.

    9. Re:Server running at near 100% load by Anonymous Coward · · Score: 0

      "Isn't a 200k flash animation the same as a 200k jpeg from the server's point of view?"

      yes, of course. The author is in denial. And that's not an easy life to live, I assure you.

    10. Re:Server running at near 100% load by happystink · · Score: 2

      Yeah, but 1500 clients WHAT? this minute, this second?

      --

      sig:
      See the "..for smart people" banners Wired runs here? Look elsewhere guys.

    11. Re:Server running at near 100% load by NineNine · · Score: 1

      Well, HTTP isn't stateful, so technically it's pretty tough to get "true" concurrent users, but I was talking about a span of about 30 seconds.

    12. Re:Server running at near 100% load by blueroo · · Score: 1

      Only 1500? What are you running, a 90 MHz 486? If a measly 1500 simultaneous connections is burning your server, you have a lot to learn about designing high performance servers and webapps.

    13. Re:Server running at near 100% load by Aquitaine · · Score: 1

      I think the most interesting part of this article was how they chose their ISP. A bunch of young people that are committed to quality service and offer a lot of bandwidth (where "a lot" is obviously quite relative, now).

      And then they admit that, okay, actually they know these guys personally, and they're really not that bad!

      No offense to either HA or their ISP, but, er, let's hear it for unbiased journalism. :)

    14. Re:Server running at near 100% load by Anonymous Coward · · Score: 0

      Give me a break... it is a very SLOWLY loading ASP-driven site. That site took 30+ seconds to load over a broadband connection and it is only a TGP site! Mostly text....

      I am not impressed....

  29. Re:First rule to create a fast server... by Anonymous Coward · · Score: 0

    I know there are faster webservers than Apache, but you can't beat the price/performance ratio...

  30. slashdotted... by Anonymous Coward · · Score: 0

    Warning: Too many connections in /web/admin.hardwareanalysis.com/include/db.php on line 9
    Unable to connect to database. Too many connections

    What does the hardware mean anyway... if the software is not configured right?
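
    One way to avoid that kind of "Too many connections" failure is to cap how many database connections the web tier opens, so that extra requests wait briefly instead of erroring out. A minimal sketch follows; the class name and the default timeout are assumptions, and nothing here reflects how the site in question is actually built.

```python
import threading
from contextlib import contextmanager


class ConnectionLimiter:
    """Cap concurrent database connections below the server's own limit."""

    def __init__(self, max_connections: int):
        self._slots = threading.BoundedSemaphore(max_connections)

    @contextmanager
    def slot(self, timeout: float = 5.0):
        # Wait for a free slot; give up rather than queue forever.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no database connection slot available")
        try:
            yield
        finally:
            self._slots.release()
```

    Wrapping each database call in "with limiter.slot():" keeps the number of simultaneous connections at or below max_connections. That number would be set a little under the database server's configured connection limit.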

  31. it's not quite dead by squarefish · · Score: 1

    Query error: Commands out of sync; You can't run this command now

    --
    Creationists are a lot like zombies. Slow, but powerful and numerous. And they all want to eat our brains.
  32. Quick howto by Klerck · · Score: 1, Insightful

    Here's a quicker howto.

    Get the fastest AthlonXP out there.
    Get a motherboard with onboard SCSI.
    Get 15,000RPM SCSI 160MB/s drives
    Get a NIC
    Install linux
    Install apache
    Install mysql, php, perl, etc.

    And there you have it. Is it really necessary to write a long article when all you're basically saying is "get the fastest hardware out there and slap it into one machine"? Come on folks.

  33. Slow... very slow... by dark-br · · Score: 1

    I guess the ppl running the webserver with the article should have used the info in it, cuz I just can't access it due to high load :)

  34. how to build a high performance/reliable webserver by jacquesm · · Score: 4, Informative

    1) use multiple machines / round robin DNS
    2) use decent speed hardware but stay away from 'top of the line' stuff (fastest processor, fastest drives) because they usually are not more reliable
    3) replicate your databases to all machines so db access is always LOCAL
    4) use a front end cache to make sure you use as little database interaction as you can get away with (say flush the cache once per minute)
    5) use decent switching hardware and routers, no point in having a beast of a server hooked up to a hub now is there...

    that's it ! reasonable price and lots of performance
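
    The front end cache in step 4 could look something like the sketch below. It expires each page 60 seconds after it was built rather than flushing everything on the minute, which is a slight variation on what the list describes. TimedCache and its parameters are hypothetical names.

```python
import time
from typing import Callable, Dict, Tuple


class TimedCache:
    """Cache rendered pages and rebuild each one at most once per ttl seconds."""

    def __init__(self, ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str, build: Callable[[], str]) -> str:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]              # fresh: no database work
        value = build()                  # stale or missing: hit the database
        self._entries[key] = (now, value)
        return value
```

    However many requests arrive for a page, build() (the part that talks to the database) runs at most once per minute for that page, so database load stays flat even when traffic spikes.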

  35. OK so where do I start? by SuperCal · · Score: 2

    I was really excited to see this article, because oddly enough I am seriously considering setting up my own webserver. In fact I am thinking of running Slashcode. So far everyone has been saying that the article generally sucks. So the question remains: where should I start? I was thinking of buying a few of my company's used PCs and building a cluster... that scares me a bit, as I'm not a computer genius, but I can get a great deal on these computers (between 5 and 10 500 MHz Wintel computers)

    OK, I know that was rambling, so to recap simply: is it better to go with an expensive single MP solution like the article, or with a cheaper cluster of slow/cheap computers?

    --
    Business News and Resources: www.usasource.net
    1. Re:OK so where do I start? by ssassen · · Score: 3, Informative
      People are negative because the server has been unreachable for some, but they tend to conveniently forget that we did not design for 2000+ simultaneous clients, just a couple of hundred really. Just thought I'd let you know, as we only have one of these whereas most websites (like Anand and Tom) have a rack full of them. Still we're handling the load pretty well and are serving out the pages to about 1500 clients.

      Have a good weekend,

      Sander Sassen

      Email: ssassen@hardwareanalysis.com
      Visit us at: http://www.hardwareanalysis.com

    2. Re:OK so where do I start? by m0i · · Score: 1

      Just being curious there, where are the CPU cycles currently burned? In other words, what are the top lines of 'top'. I suspect your DB engine to be there. And your setup seems to be a nice testbed for ZendAccelerator :-)

      --
      have you been defaced today?
    3. Re:OK so where do I start? by Anonymous Coward · · Score: 1, Insightful

      The thing is that "a couple of hundred" clients isn't actually High Performance Web Serving. Maybe it is to your target overclocker-fan-boy audience, but to Slash-folk that's nothing...

      The lack of system setup detail isn't good. Too many variables there. Apache2 may have been a better choice for this too...

      BTW, you're possibly disk I/O (requests not bandwidth) limited by your IDE RAID. Make sure atime is turned off - no point recording it for no good reason. Do whatever you can to minimise disk I/O, because your IDE RAID is done in software (and if you use Promise drivers, stiff bikkies when you need to upgrade your kernel...)

      A high "load" isn't much good info-wise either... what does "sar" have to say? Where is the "load" being generated???

    4. Re:OK so where do I start? by ssassen · · Score: 1
      Overclocker-fan-boy audience? Ouch! That's not nice, we're not targeting them, plenty of sites that do, just not us.

      The upgrade to Apache 2 is in the works, but that just wouldn't help us right now, would it? We'll also update the kernel and install Promise's latest drivers. Appreciate the constructive criticism, but a little more detail please; after all, we're all here to learn, right?

      Kind regards,

      Sander Sassen

      Email: ssassen@hardwareanalysis.com
      Visit us at: http://www.hardwareanalysis.com

    5. Re:OK so where do I start? by drouse · · Score: 2, Insightful

      I wouldn't worry too much.

      Probably 90% of all non-profit websites could be run off a single 500 MHz computer and most could be run from a sub 100 MHz CPU -- especially if you didn't go crazy with dynamic content.

      A big bottleneck can be your connection to the Internet. The company I work for once was "slashdotted" (not by slashdot) for *days*. What happened was our Frame Relay connection ran at 100%, while our web server -- a 300 MHz machine (running Mac OS 8.1 at the time) had plenty of capacity left over.

      --
      -- I browse at +5 with stripped sigs ... Ha! Ha!
    6. Re:OK so where do I start? by Utopia · · Score: 1

      You are concentrating heavily on hardware performance. The hardware matters little if you write and configure your software properly.

      My website runs on Win2K Adv server with SQL Server on the same machine. It's a 4 proc 700 MHz PC with a SCSI drive. When it was linked on slashdot, it never once went down or stopped serving pages.

    7. Re:OK so where do I start? by cymen · · Score: 2

      Well where do you plan on putting all these boxes? Are you going to serve your pages over a DSL connection? Or colocate? If you are planning on colocating, you'll be investigating smaller sized servers, like 1U or 2U size, unless you have money to blow. To be honest, you should just set up one server and get some page hits. Then think about how you'll survive the hordes of people that may come in the future. Unless you're serving porn. I would imagine the loads are always fairly high on porn servers. Someone here can surely offer suggestions if porn is involved.

    8. Re:OK so where do I start? by SuperCal · · Score: 2

      Actually, I have been investigating forms of higher speed connections. My plan is to actually set up the hardware and get it running on a simple DSL connection so I can work on content. After everything is up and running and when I start getting enough traffic that DSL becomes the bottleneck then I'll upgrade. At the moment I know the system I want is overkill, but I would rather do it right now so I can put off a hardware upgrade in the near future. For the moment my server is going to sit in my dining room, but a friend has offered me space in his office (A big unused closet) when I need to move (the business ultra broadband providers here are too expensive, it's much cheaper in the City).

      --
      Business News and Resources: www.usasource.net
  36. Apache 1.3x? by djupedal · · Score: 2

    What kind of 'high performance' web server uses back-leveled software? Apache 2.x may not be totally API compliant, but it certainly provides more than 1.3x in terms of performance.

    I am glad they used an IDE RAID, however. The SCSI myth can now go on the shelf.

    1. Re:Apache 1.3x? by Pizza · · Score: 2, Informative

      Actually, their disk tests are fundamentally flawed. RAID0 is only good for boosting raw sustained throughput; it has pretty much no effect on access time. If you want a boost in access time, go for RAID1, as you can load-balance reads across two drives.

      Furthermore, RAID0+1 is also not really worth it, as it still only gives you the ability to fail one drive, and instead of two logical spindles you only have one to do all of the work. But I suppose if your software is inflexible enough to only be able to operate on one partition, so be it.

      I'd like to see some numbers for their boxes loaded up with RAM and high numbers of random I/O operations, which is where the high rotational speed of modern SCSI drives really shines. And this is the access pattern of a dynamic database-driven web site.

      And as others have said, it's not the hardware that makes the most difference in these circumstances, it's how the software is set up, and how the site/database is coded.

      Hell, I've completely saturated a 100 Mbps network serving dynamic content via pure Java Servlets, and this was only a dual P3-650 with a RAID5 array of 50G 7200RPM SCSI drives, hardly cutting edge even at the time. Dropping in a RAID1 array of WD120 IDE drives couldn't come anywhere close. But once the working set of data was loaded into RAM, they both performed about the same.

      Their IDE raid setup is certainly considerably cheaper though, and that's a tradeoff that most people can easily make.

      --
      -- I ain't broke, but I'm badly bent.
    2. Re:Apache 1.3x? by GoRK · · Score: 4, Insightful

      Their IDE-RAID is actually software RAID. The SCSI myth can go off the shelf, sure, but don't take the RAID myth down.

      The Promise FastTrak and Highpoint and a few others are not actually hardware RAID controllers. They are regular controllers with enough firmware to allow BIOS calls to do drive access via software RAID (located in the firmware of the controller), and OS drivers that implement the company's own software RAID implementation at the driver level, thereby doing things like making only one device appear to the OS. Some of the chips have some performance improvements over a purely software RAID solution, such as the ability to do data comparisons between two drives in a mirror during reads, but that's about it. If you ever boot them into a new install of Windows without preloading their "drivers", guess what? Your "RAID" of 4 drives is just 4 drives. The hardware recovery options they have are also pretty damned worthless when it comes to a comparison with real RAID controllers - be they IDE or SCSI.

      A good solution to the IDE RAID debacle is the controllers by 3Ware (very fine) or the Adaptec AAA series controllers (also pretty fine). These are real hardware controllers with onboard cache, hardware XOR acceleration for RAID 5 and the whole bit.

      Anyway, I'm not really all that taken aback that this webserver is floundering a bit, but seems really responsive when the page request "gets through," so to speak. If it's not running low on physical RAM, it's probably got a lot of processes stuck in D state due to the shit Promise controller. A nice RAID controller would probably have everything the disks are thrashing on in a RAM cache at this point.

      ~GoRK

  37. More Advice from the site by HappyPhunBall · · Score: 4, Funny
    Once you have the hardware setup and the software configured, it is time to design your site to perform. The following tips will help you create a site that is just as scalable as ours. Enjoy.
    1. Use lots, and I mean lots of graphics. Cute ones, animated ones, you name it and people expect to see them. Skimping here will hurt your image.
    2. CSS style sheets may be the way of the future, but just for now make sure you include dozens or even hundreds of font tags, color tags, and tables in your site. Trust us. This has the added benefit of increasing your page file size by at least 30%. You do want a robust site right?
    3. Make sure you are serving plenty of third party ads! Their bandwidth matters also, and you know the way to make money on the web is by serving lots of "fun" animated ads. This will not slow down the user experience of your site one bit! Those ad people are slick, they know that you are building a high bandwidth / high performance site and will be expecting the traffic.
    4. A site is not a high performance site until it has withstood the infamous Slashdot effect. You will want to post a link to your site on /. post haste to begin testing.
    That should be enough to get you started. Now you too can build a rocking 200K per page site, and having read our hardware guidelines, you can expect it to perform just as well as ours did. One more free tip: Placing a cool dynamic hit counter or traffic meter on your site in a prominent position will encourage casual visitors to hit the reload button again and again, driving the performance of your site through the roof.
    1. Re:More Advice from the site by Anonymous Coward · · Score: 0

      But, do use http output compression so a 100k html page becomes 5k and takes 0.0123 seconds transfer :)
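
      How much output compression saves depends heavily on the markup. A rough sketch using Python's standard gzip module on an artificially repetitive page (the row markup is invented for the demo, so real pages will compress less dramatically than this one):

```python
import gzip

# Invented, deliberately repetitive markup: lots of font tags and table rows.
row = ('<tr><td class="comment"><font color="#000000">'
       'Re: high-performance web server</font></td></tr>\n')
html = ("<html><body><table>\n" + row * 1000 + "</table></body></html>\n").encode("utf-8")

compressed = gzip.compress(html)

# The browser gets the original bytes back after decompressing.
assert gzip.decompress(compressed) == html

print(f"original:   {len(html)} bytes")
print(f"compressed: {len(compressed)} bytes")
```

      In practice the web server does this per response when the client sends an Accept-Encoding header that includes gzip; the script only shows the size difference.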

  38. Redefinition of irony by jazmataz23 · · Score: 1

    Guy who didn't read the article makes an uninformed M$ bash and gets modded to four...
    (they're running linux, and there must have been some other problem because it's usable now)

    It's a shame I'm banned from moderation for my failure to jump on the Linux bandwagon; vast numbers of readers of this site are using IE6.0, and that doesn't come in any linux distro I know of. I'm just honest about my use of software from the beast of Redmond.

    --
    Death to Argument by Slogan!! (This post twice-encrypted with ROT-13. Replies not using same will be ignored)
    1. Re:Redefinition of irony by jazmataz23 · · Score: 1

      went back down apparently as I was posting this...

      --
      Death to Argument by Slogan!! (This post twice-encrypted with ROT-13. Replies not using same will be ignored)
    2. Re:Redefinition of irony by elemental23 · · Score: 2

      Guy who didn't read the article makes an uninformed M$ bash and gets modded to four...

      The Microsoft line was the poster's sig. Check your Slashdot preferences, there's an option to include a "--" between post content and sig. I don't know why this isn't on by default, it eliminates mistakes like this.

      (I added the "--" to my sig myself because it seems a lot of people don't have this enabled)

      --
      I like my women like my coffee... pale and bitter.
    3. Re:Redefinition of irony by jazmataz23 · · Score: 1
      hey thanks. I'll do that. I kind of figured that was a sig, but wasn't sure.

      I certainly appreciate the polite and helpful correction in response to an honest mistake instead of flaming me as a monkey faced, morris dancing waste of DNA.
      cheers to ya,
      jaz

      --
      Death to Argument by Slogan!! (This post twice-encrypted with ROT-13. Replies not using same will be ignored)
  39. Re:how to build a high performance/reliable webser by SuperCal · · Score: 2

    Thanks, I wish I hadn't posted early in this article so I could use my mod points. Now, my only question is how fast is decent speed? I'm about to build my own server (actually I'm going to have some help, but I want to at least sound like I know what I'm doing), nothing fancy. I don't expect a huge hit count or anything, so would using older (500-750 MHz) second hand computers, with properly upgraded memory and storage, work? Also would you recommend replacing the power supply? One of the guys who's helping me swears that will save me money in the long run on energy costs, but I don't know if it's worth the cost.

    --
    Business News and Resources: www.usasource.net
  40. How not to get slashdotted? by Bahamuto · · Score: 2, Funny

    Does building this high performance web server prevent you from being slashdotted?

  41. I think it's just gone down by Anonymous Coward · · Score: 0

    at about 2:05 gmt

  42. anyone else find it funny... by sudog · · Score: 1

    ...that a simple slashdotting took down this "monster" server?

  43. Their "high performance server" seems to be fixed by NineNine · · Score: 1

    Their "high performance server" seems to be fixed now... I'm getting a 500 error almost instantly! Good work, guys!

  44. Software Design by wolfc · · Score: 1
    I don't think hardware can resolve a serious performance problem. It almost always comes down to an algorithm problem.

    Check out the SEDA architecture with Haboob as web-server. It seems to outperform Apache.

    Haven't got the traffic myself to test it though. :-)

  45. Re:Their "high performance server" seems to be fix by Anonymous Coward · · Score: 0

    Could not connect to server..... 14:18 GMT

  46. Step 1) Hardware; Step 1.5) Software by ammulder · · Score: 1
    Okay, so let's assume, for sake of argument, that it really is high-performance hardware. Congratulations, you've solved 1/2 the problem! But why, when serving a totally static article, do we get results like this:
    Warning: Too many connections in /web/admin.hardwareanalysis.com/include/db.php on line 9 Unable to connect to database. Too many connections
    Next article, I hope to see how they've rewritten their site to only use the database when it's really necessary.
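
    One common way to stay under a database's connection limit is to give the web tier a small fixed pool of connections, so extra requests wait for a free one instead of opening more. The sketch below is an assumption-laden illustration, not how the site in question works: ConnectionPool and open_connection() are hypothetical names, and an in-memory SQLite database stands in for the real MySQL server.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Hand out at most `size` connections; extra callers wait their turn."""

    def __init__(self, size, connect):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())   # open every connection up front

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks until a connection is free; raises queue.Empty after
        # `timeout` seconds instead of opening one more connection.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)        # always hand the connection back


def open_connection():
    """Hypothetical stand-in for connecting to the real database server."""
    return sqlite3.connect(":memory:", check_same_thread=False)


pool = ConnectionPool(size=2, connect=open_connection)
with pool.connection() as conn:
    answer = conn.execute("SELECT 1 + 1").fetchone()[0]
print(answer)  # prints 2
```

    A pool only caps the damage. Serving a static article without touching the database at all, as the comment suggests, removes the problem for those pages.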
  47. Fast except didnt set DB correctly by WillRobinson · · Score: 1

    This looks like either a PostNuke or PHPNuke web site. And while I was visiting it was serving up at a rate of around 500k before it ran out of DB connections. Guess they should have done some research on expanding the DB connections to MySQL from PHP. I'm sure the slashdotting will give them some insight into that. I'm sure they will also come and read all the constructive comments here on /. so give em some good ones.

    1. Re:Fast except didnt set DB correctly by ssassen · · Score: 1
      Hi Will,

      As you probably understood from reading the article, we're not a large corporation but just a small startup company with young and enthusiastic people fresh out of university and aspiring to be all that we can be. We therefore are taking this opportunity to learn from this experience, as the amount and diversity of traffic we've gotten over the past few days is beyond anything we simulated. It is an opportunity for us to learn from any mistakes we've made and track down the bottlenecks.

      We've been going over the Slashdot comments and your response was one we certainly would like to follow up on. So if you're interested we'd love to hear more and welcome any advice you're able to give us. We'll be adding one or more pages to the article with, for example, software tricks for Apache and Linux and other tips and tricks that will help us, and others, to fine-tune their web server.

      Looking forward to your reply.

      Kind regards,

      Sander Sassen

      Email: ssassen@hardwareanalysis.com
      Visit us at: http://www.hardwareanalysis.com

  48. how nice of them by twitter · · Score: 2
    Current bandwidth usage: 214.98 kbit/s

    Draw your own conclusions.

    How nice of them to share that information.

    The obvious conclusion is that my cable modem could take a minor slashdotting if Cox did not crimp the upload and block ports. Information could be free but thanks to the local Bell's efforts to kill DSL things will get worse until someone fixes the last mile problem.

    The bit about IDE being faster than SCSI was a shocker. You would think that some lower RPM SCSIs set to stripe would have greater speed and equivalent heating. The good IDE performance is good news.

    --

    Friends don't help friends install M$ junk.

    1. Re:how nice of them by strobert · · Score: 2

      Not sure if you noticed but they tried using the AMI megaraid controllers. They should have tried a Mylex. In spite of what Dell tech support will tell you (the PERC in the Dell's is a branded MegaRaid) that i960 based boards just have the performance issue, the Mylex DAC960 is i960 based and hums along just fine. I have seen 2-5x write performance increases going between the PERC and the Mylex -- and yes just proved this to management recently.

  49. Re:how to build a high performance/reliable webser by Electrum · · Score: 3, Interesting

    3) replicate your databases to all machines so
    db access is always LOCAL


    This is probably a bad idea. Accessing the database over a socket is going to be much less resource intensive than accessing it locally. With the database locally, the database server uses up CPU time and disk I/O time. Disk I/O on a web server is very important. If the entire database isn't cached in memory, then it is going to be hitting the disk. The memory used up caching the database cannot be used by the OS to cache web content. A separate database server with a lot of RAM will almost always work better than a local one with less RAM.

    This Apache nonsense of cramming everything into the webserver is very bad engineering practice. A web server should serve web content. A web application should generate web content. A database server should serve data. These are all separate processes that should not be combined.

  50. Re:how to build a high performance/reliable webser by jcrowe · · Score: 2, Informative

    The company I work for successfully runs our webserver(php & MySQL) on an old pentium 166. We have several thousand visitors every month & use it for an ftp site for suppliers, a router, firewall, gateway & squid server.

    I think that your 700mhz machine would work fine for just web pages. :)

  51. practice what you preach . by Anonymous Coward · · Score: 0

    http://www.hardwareanalysis.com/ slashdotted at : UTC Sun Oct 20 00:46:49 2002 -0.796378 seconds

    instead of pointing us to these hypocrites why don't the slashdot server admins themselves write some good stuff and put it up for us to see? if you people are too busy then ask the google geeks.

  52. This is wrong on soooo many levels. by (H)elix1 · · Score: 5, Interesting
    (include standard joke about high performance web serving getting /.)

    I'd post sooner, but it took forever to get to the article.. here are my thoughts...

    First off SCSI.

    IDE drives are fast in a single user/workstation environment. As a file server for thousands of people sharing an array of drives? I'm sure the output was solid for a single user when they benched it... looks like /. is letting them know what multiple users do to IDE. 'Overhead of SCSI controller'... Methinks they do not know how SCSI works. The folks who share this box will suffer.

    Heat issues with SCSI. This is why you put the hardware in a nice climate controlled room that is sound proof. Yes, this stuff runs a bit hot. I swear some vendors are dumping 8K RPM fans with ducting engineered to get heat out of the box and into the air conditioned 8'x19" chassis that holds the other 5-30 machines as well.

    I liked the note about reliability too... it ran, it ran cool, it ran stable for 2 weeks. I've got 7x9G Cheetahs that were placed into a production video editing system and ran HARD for the last 5+ years. Mind you, they ran about $1,200 each new... but the downtime costs are measured in minutes... Mission critical, failure is not an option.

    OS

    Let's assume the Windows 2000 Pro was service packed to at least SP2... If that is the case, the TCP/IP stack is neutered. Microsoft wanted to push people to Server and Advanced Server... I noticed the problem when I patched my Counter-Strike server and performance dogged on w2kpro w/sp2 - you can find more info in Microsoft's KB... (The box was used for other things too, so be gentle) Nuking the TCP/IP stack was the straw that cracked my back, so I just ported another box to Linux to run it there.

    Red Hat does make it easy to get a Linux box up and running, but if this thing is going outside the firewall, 7.3 was a lot of work to strip out all the stuff that is bundled with a "server" install. I don't like running any program I did not actually install myself. For personal boxes living at my ISP, I use Slackware (might be moving to gentoo however). Not to say I'm digging through the code or checking MD5 hashes as often as I could, but the box won't even need an xserver, mozilla, tux racer, or anything other than what it needs to deliver content and get new stuff up to the server.

    CPU's (really a chassis problem):

    I've owned AMD's MP and Intel's Xeon dually boards. These things do crank out some heat. Since web serving is usually not processor bound, it does not really matter. Pointing back to the over heating issues with the hard drives, these guys must have a $75 rack mount 19" chassis. Who needs a floppy or CD-ROM in a web server? Where are the fans? Look at the cable mess! For god's sake, at least spend $20 and get rounded cables so you have better airflow.

    1. Re:This is wrong on soooo many levels. by Al-Hala · · Score: 1

      In the case of rounded cables (IDE in this case), the marketers have won over the engineers.

      By rounding the cables, the designed protection against cross talk is lost. This could be a BAD thing.

      IDE cables were designed with alternating ground and signal lines, so that each signal line would see a ground plane next to it.

      When the cables are cut up, or custom wired, the signal wires are now jumbled in with the rest and the whole idea behind the design is lost.

      Now, given the fact most people don't have full bore constant-pinned datarates across the maximum allowed standard length, nothing bad happens, but I leave the rounded cables and the other quirky things to the "mod-squad". I've enough headaches without finding out my "improvement" has actually caused my intermittent valuable time burning problem:)

    2. Re:This is wrong on soooo many levels. by Anonymous Coward · · Score: 0

      Both the TCP/IP stack and the IIS server in W2K Pro are neutered by design for licencing reasons only (and this is not some SP thing - it's been that way since NT 3.5). Microsoft wants you to install the otherwise identical, but more expensive, W2K Server on your "server", end of story.

      You are excused with your counterstrike server, but any dipshit putting up an article about building a webserver should know better.

      As for the drives, it really depends on the use pattern. Google for instance uses cheapo IDE drives because they are basically used for boot only, and all the action is happening in an in-memory DB. Many webservers (like slashdot.org) are just a small number of scripts and don't require much or any disk i/o.

      I agree that for even the slightest disk usage, you really really want SCSI on a server box (especially now that the IDE guys have basically told you the thing will die after 1 year!)

    3. Re:This is wrong on soooo many levels. by seanadams.com · · Score: 3, Interesting

      IDE drives are fast in a single user/workstation environment. As a file server for thousands of people sharing an array of drives? I'm sure the output was solid for a single user when they benched it... looks like /. is letting them know what multiple users do to IDE. 'Overhead of SCSI controller'... Methinks they do not know how SCSI works. The folks who share this box will suffer.

      Methinks it's been a LONG time since you've read up on IDE vs SCSI, and me also thinks you don't have the first clue about how a filesystem works. Yes, there was a time when IDE drives were way slower, mainly because the bus could only have one outstanding request at a time. IDE has since advanced to support tagged command queuing and faster data rates, closing the gap with all but the most horrendously expensive flavors of SCSI. Really, the bottleneck is spindle and seek speed - both IDE and SCSI are plenty fast now.

      The only thing SCSI really has going for it is daisy-chainability and support for lots of drives on one port. HOWEVER there are some really killer things you can do with IDE now. In my web server I'm using the Promise RM8000 subsystem: a terabyte of RAID5 storage for about $3500 including the drives IIRC. Try doing that with SCSI drives!

      Anyway.... you suggest that this server is slashdotted because it's disk-bound. Serving the exact same page over and over again. Uh huh. Go read up on any modern file system, then figure out how long it takes to send a 100KB web page to 250,000 people over a DSL line, and then tell me where you think the problem lies.
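
      The arithmetic behind that challenge is easy to run. The page size and reader count come from the comment above; the 1 Mbit/s uplink is purely an assumption (the comment gives no figure), as is treating 100KB as 100,000 bytes and ignoring protocol overhead:

```python
def transfer_seconds(page_bytes, readers, uplink_bits_per_second):
    """Time to push one copy of the page to every reader over a single uplink."""
    total_bits = page_bytes * 8 * readers
    return total_bits / uplink_bits_per_second


PAGE_BYTES = 100 * 1000     # 100 KB page, from the comment
READERS = 250_000           # from the comment
UPLINK_BPS = 1_000_000      # ASSUMPTION: 1 Mbit/s DSL uplink

seconds = transfer_seconds(PAGE_BYTES, READERS, UPLINK_BPS)
print(f"{seconds / 3600:.1f} hours")  # prints "55.6 hours"
```

      Under those assumptions the line is saturated for more than two days no matter how fast the disks are, which is the point being made.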

    4. Re:This is wrong on soooo many levels. by (H)elix1 · · Score: 2

      Many of the 'good ideas' for CPU design, HDD, etc seem to merge together. IDE drives are phenomenally better than they used to be. The last audio workstation used RAID 0/1 IDE drives because it was fast and solid enough. Heck, even the box I built for my wife to do photoshop work was only RAID 0 with a pair of 80G IDE drives.

      IDE has since advanced to support tagged command queuing and faster data rates

      Is this part of the controller, or is the RAID card doing the work? Great news if it is. (Then my old KT7A-RAID can be put to better use than it is). I'm all for right tool, right job... but when I hear heavy beating on a web server, I would not use a low end sun box either. Personal or hobbyist grade is one thing... but I'm pounding code for one of the major dot coms this weekend (my life sucks) that expects to handle millions of requests. This box is closer to what I would put out there for a game server - counter strike size, not everquest....

      you suggest that this server is slashdotted because it's disk-bound
      Nope - If I was to put money on it, it looks like bad code is the problem here. I suspect someone went nuts with the server side code generation.

      My biggest complaint was they could not deal with the heat. Such an easy problem to fix...

    5. Re:This is wrong on soooo many levels. by khuber · · Score: 1
      The only thing SCSI really has going for it is daisy-chainability and support for lots of drives on one port

      hot swapping, designed to run 24/7 reliably, better vendor support, fibre channel / SAN, Ultra320, forward and backward compatibility, wider range of RAID hardware/controllers, internal and external connections, track record in servers

      -Kevin

    6. Re:This is wrong on soooo many levels. by khuber · · Score: 1
      availability of 10k+ drives

      -Kevin

  53. Are you high? by Anonymous Coward · · Score: 0

    A Tualatin P3-S 1.26GHz does not even need a heat sink, just a low RPM fan blowing over it.

    A P4, even worse an Athlon XP/MP, produce much more heat than a P3-S, requiring heat sinks and very loud fans. Want a 1U solution?

    And a faster processor is not going to give you better performance in a web server.

    Wonder why the P3-S did not make it above 1.4GHz? Because it was outperforming P4s 1.7GHz.

  54. WTF? by Anonymous Coward · · Score: 0

    Remember this article:
    http://slashdot.org/article.pl?sid=02/10/01/1637253&mode=thread&tid=137

    The owner of Hardware Analysis is Sander Sassen. He apparently has two usernames and is posting articles to his own site. Does anybody see anything wrong with that?

    1. Re:WTF? by matto14 · · Score: 0

      yeah i do I wonder how much /. makes on testing load balancing and overall buttfucking a system. But if it means not seeing a bunch of ads i guess im down.

      --
      SCREW FLANDERS
  55. Funny by Anonymous Coward · · Score: 0

    It's funny that an article about setting up a high performance web server is on a server that can't even handle the slashdot effect.

  56. They just wanted us to test it out for them! by caluml · · Score: 1

    No better way than to get the Slashdot crowd to do a quick bandwidth, hardware and security test!
    All that free NMAPing, clicking, and trying ../../ paths :)

  57. Re:how to build a high performance/reliable webser by Anonymous Coward · · Score: 2, Interesting

    Not so...
    You can cache with technologies like Sleepycat's DBM (db3).

    We have a PHP application that caches lookup tables on each local server. If it can't find the data in the local cache, then it hits our PostgreSQL database. The local DBM cache gets refreshed every hour.

    Typical comparison
    -------------------
    DB access time for query: .02 secs
    Local cache (db3) time: .00003 secs

    Web server load dropped from a typical 0.7 to an acceptable 0.2, and the load on the DB server dropped like a rock! This is with over a million requests (no graphics, just GETs to the PHP script) every day.
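
    The pattern described above (look in a local DBM file first, fall back to the central database, refresh hourly) might look roughly like this. It is a sketch under assumptions: it uses Python's standard dbm module rather than the poster's PHP and db3 setup, query_database() is a hypothetical stand-in for the real query, and entries expire individually after an hour instead of the whole cache being rebuilt on a schedule.

```python
import dbm
import os
import tempfile
import time

REFRESH_SECONDS = 3600  # "refreshed every hour", from the comment


def query_database(key):
    """Hypothetical stand-in for the slow query against the central DB."""
    return f"value-for-{key}"


class LocalLookupCache:
    """On-disk key/value cache that falls back to a loader on miss or expiry."""

    def __init__(self, path, loader, max_age=REFRESH_SECONDS, clock=time.time):
        self._db = dbm.open(path, "c")   # create the file if it is missing
        self._loader = loader
        self._max_age = max_age
        self._clock = clock

    def get(self, key):
        value_key = b"v:" + key.encode("utf-8")
        stamp_key = b"t:" + key.encode("utf-8")
        now = self._clock()
        if value_key in self._db and stamp_key in self._db:
            stored_at = float(self._db[stamp_key].decode("ascii"))
            if now - stored_at < self._max_age:
                return self._db[value_key].decode("utf-8")  # local hit
        value = self._loader(key)                            # go to the DB
        self._db[value_key] = value.encode("utf-8")
        self._db[stamp_key] = repr(now).encode("ascii")
        return value

    def close(self):
        self._db.close()


with tempfile.TemporaryDirectory() as tmp:
    cache = LocalLookupCache(os.path.join(tmp, "lookup"), query_database)
    first = cache.get("country:NL")    # miss: loaded from the "database"
    second = cache.get("country:NL")   # hit: read from the local file
    cache.close()
print(first == second)  # prints True
```

    Because the cache lives in a file, every web server process on the machine can share it, which is presumably part of why the load on the DB server fell so sharply.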

    We also tuned the heck out of Apache (Keepalive, # of children, life of children etc).

    Some other things we realized after extensive testing:
    1. Apache 2.0 sucks big time! Until modules like PHP and mod_perl are properly optimized, there's not much point in moving there.
    2. AolServer is great for Tcl, but not for PHP or other plugin technologies

    Because of all these changes, we were able to switch from a backhand cluster of 4 machines, back down to a single dual processor machine, with another machine available on hot standby. Beat that!

  58. Slashdotted by csnydermvpsoft · · Score: 1

    They tackle the tough design choices and what hardware to pick and end up with a web server designed to serve daily changing content with lots of images, movies, active forums and millions of page views every month.

    Yeah, but how about millions of page views per day?

    1. Re:Slashdotted by ssassen · · Score: 1
      Uhm, we didn't factor that one in, need a couple more boxes and a dedicated high-bandwidth connection for that as we're suffering from too many clients at the same time and network congestion.

      Sander Sassen

      Email: ssassen@hardwareanalysis.com

      Visit us at: http://www.hardwareanalysis.com

    2. Re:Slashdotted by blueroo · · Score: 1

      Yes, because as we all know "Error 500" and "Too many connections to the database" are caused by Network Congestion.

  59. Not to flame, but the article is bad for newbies by Anonymous Coward · · Score: 2, Insightful

    I'll just mention a couple of items:

    1) For a high performance web server one *needs* SCSI. SCSI can handle multiple requests at one time and performs some disk related processing itself, whereas IDE can only handle requests for data in single file and uses the CPU for disk related processing a lot more than SCSI does.

    SCSI disks also have higher mean times to failure than IDE. The folks writing this article may have gotten benchmark results showing their RAID 0+1 array matched the SCSI setup *they* used for comparison, but most of the reasons for choosing SCSI are what I mention above -- not the comparative benchmark results.

    2) For a high performance webserver, FreeBSD would be a *much* better choice than Redhat Linux. If they wanted to use Linux, Slackware or Debian would have been a better choice than Redhat Linux for a webserver. Ask folks in the trenches, and lots will concur with what I've written on this point due to maintenance, upgrading, and security concerns over time on a production webserver.

    3) Since their audience is US based, it would make sense to co-lo their server in the USA, both from the standpoint of how many hops packets take from their server to their audience, and from the logistical issues of hardware support -- from replacing drives to calling the data center if there are problems. Choosing a USA data center over one in Amsterdam *should* be a no-brainer. Guess that's what happens when anybody can publish to the web. Newbies beware!!

  60. Crackin' me up!! by HuvahCraftah · · Score: 1

    What's truly funny is now that they've tuned the ONE page that's linked in the /. article, the rest of the site is unavailable.

    Just try going to their main page or to an old article. Pretty sad really.

  61. Slashdotted by entrylevel · · Score: 3, Funny

    Ooh! Ooh! I really want you guys to teach me how to build a high performance webserver! What's that? You can't, because your webserver is down? Curses!

    (Obligatory disclaimer for humor-impaired: yes I understand that the slashdot effect is generally caused by lack of bandwidth rather than lack of webserver performance.)

    --
    Karma: Incomprehensible (Mostly affected by posting at +5, reading at -1, and metamoderating everything unfair.)
  62. Re:how to build a high performance/reliable webser by Anonymous Coward · · Score: 0

    "Disk I/O on a web server is very important"

    Maybe if you are running a porn site or something that's very static content-heavy. However, I imagine that many sites (think of slashdot for example) are a relatively small number of scripts that fit neatly into memory cache, with all of the disk i/o happening on the db-level.

  63. maybe slashdot should have written the article? by Anonymous Coward · · Score: 0

    ...because the link can't take the slashdot effect :)

  64. new msg by dagooncrn · · Score: 1

    I like the new header on their site: Please register or login. There are *a few* registered and *quite a few* anonymous users currently online. Current bandwidth usage: 350.79 kbit/s

    --
    -- mg
  65. Re:how to build a high performance/reliable webser by jacquesm · · Score: 2, Interesting

    we serve up between 5 and 7 million pageviews daily to up to 100,000 individual IP's

    Decent speed to me is one in which the server is no longer the bottleneck, in other words serving up
    dynamic content you should be able to saturate the pipe that you are connected to.

    I have never replaced the power supply because of energy costs, it simply isn't a factor in the
    overall scheme of things (salaries, bandwidth, amortization of equipment)

    500-700 MHz machines are fine for most medium volume sites, I would only consider a really fast machine to break a bottleneck, and I'd have a second one on standby in case it burns up

  66. Re:Advice from the wise: by Hast · · Score: 2

    Really? There was an earlier discussion on this topic. (Related to 9/11 or some other day with extremely high traffic.)

    From that discussion I got the impression that what happens when you are bumped to the front page is that you have tried to access a story with a non-standard setup. (What you get if you are logged in and change your view preferences.) The system is set up so that some servers only serve static content. (Because that's what most users view.)

    During high load situations a dynamic request is sometimes sent to a static serving server. This is when you are bumped to the front page. (Unfortunately I couldn't find anything about this in the FAQ/About, so I can't verify it.)

  67. "millions of page views every month" not High-Perf by Anonymous Coward · · Score: 3, Insightful

    Too bad "millions of page views every month" is simply not even in the realm that would require "High-Performance Web Server"(s). These guys need to come back and write an article once they've served up 5+ million page views per day. Not hits. Page views.
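    To put the two figures side by side, here is a rough back-of-the-envelope sketch. The 3 million views per month is an assumed stand-in for "millions of page views every month"; the 5 million per day comes from the comment above. These are averages only, and real traffic peaks well above its average.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def avg_views_per_second(views, days):
    """Average page views per second over a period of `days` days."""
    return views / (days * SECONDS_PER_DAY)


# Assumed figure: 3 million page views spread over a 30-day month.
monthly_rate = avg_views_per_second(3_000_000, 30)

# Figure from the comment: 5 million page views in a single day.
daily_rate = avg_views_per_second(5_000_000, 1)

print(f"3M/month is about {monthly_rate:.2f} views/s on average")
print(f"5M/day   is about {daily_rate:.2f} views/s on average")
```

    Under those assumptions the monthly figure works out to roughly one page view per second, against several dozen per second for the daily one.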

  68. Nice by whereiswaldo · · Score: 1


    As of 9:37AM PST, the site seems to be down (connection refused).

  69. Re:how to build a high performance/reliable webser by jacquesm · · Score: 1

    it's an excellent idea ! you have a lot more reliability like that and you can incrementally increase your database capacity.

    Nothing worse than having your one 'monster' database server go down on you...

    Also there usually are limits as to how big that 'monster' server can get in practice, whereas by breaking it up and replicating you can scale as large as you want, and you also avoid trouble by slowing down that one machine when you do your backups.

    (Simply replicate once more and have your tapedrive in a machine that you can take 'offline' without hurting your app). The
    replication mechanism will take care of bringing it back into synch once you are done making your backup.

    If you don't want to have the db and the www server residing on the same box you can always break that up into pairs of machines, but I really have not yet found a need for that (and I
    have done quite a bit of *really* high volume web serving to back that up)

  70. Re:Troll? Informative is more like it. by Anonymous Coward · · Score: 0

    "Unable to connect to database. Can't connect to MySQL server on '217.115.193.148' (111)" ...

  71. in summary by job0 · · Score: 1
    One thing became immediately apparent when making real world comparisons to out-of-the-box web server solutions from Dell, Compaq, Sun and others and that's the fact that we're much faster and able to offer substantially higher output bandwidth than comparable offerings from these OEMs.

    What real world comparisons? The software section only mentions Win2k Pro, but this is patently not a real world comparison. It's designed to run on the desktop, and MS has limited it to 10 concurrent TCP/IP connections so that anyone looking to set up a proper server will need to get the Server or Advanced versions.

    The article goes on to say

    With Linux however you can basically turn it on and walk away, provided you got a system administrator that knows what he's doing and has set up everything correctly.

    maybe their sysadmin didn't/doesn't know what he was doing!!

  72. Re:Advice from the wise: by Anonymous Coward · · Score: 0

    How I understand it to work is that under high load, the database servers melt down and are toast until manually restarted, an event which is apparently a frequent occurrence, going by hints let loose by the editors.

    So yes. Under load, your dynamic requests will get sent to the static server. This is because the load has killed mysql... Every two or three weeks there'll be a time where I can't get a dynamic page for a period of more than three hours. (Who knows how long they last, but I'm rarely webbrowsing for any longer than that)

  73. how 2 test a so-called "high-performance" server by eagleyezx · · Score: 2, Funny

    1. load it full of pr()n
    2. post the link on /.
    3. check back in 30seconds

    if it still works, it's high-performance

  74. Re:my $0.02-slackware. by Anonymous Coward · · Score: 0

    Then use slackware. You can't get more "off" than that. Gives you the control to squeeze all the power out of your hardware.

  75. Building a Better Webserver in the 21st Century by Anonymous Coward · · Score: 0

    Well the MS solution does have one thing in common with that article about planes. Throw enough engine into it and even a brick will fly. Throw enough hardware into the problem and even a MS site will fly. Bang for the buck MS loses.

  76. Am I the only one to note problems with this test? by Dolemite_the_Wiz · · Score: 1

    There is a huge bottleneck in this configuration not to mention the limits of the tests (load tests, scalability). This is probably one of the worst configs for a web server I have ever seen.

    --
    Save the World! Use a Quote!
  77. Western Digital Drives?? by zentec · · Score: 3, Interesting


    The mere fact that they recommended 7200 rpm Western Digital drives for their high performance system gives me the impression they haven't a clue.

    I disagree with the assertion that a 10,000 rpm SCSI drive is more prone to failure than a 7,200 IDE drive because it "moves faster". I've had far more failures with cheap IDE drives than with SCSI drives. Not to mention that IDE drives work great with minor loads, but when you start really cranking on them, the bottlenecks of IDE start to haunt the installation.

  78. Re:how to build a high performance/reliable webser by llin · · Score: 1

    1)/5) For the front end, you might be better off with a weighted load balancer (or LVS on the cheap). Also consider a specialized HTTP multiplexer like NetScaler/Redline (these typically give content encoding, SSL acceleration for free).

    3)This is probably a bad idea

  79. How to build a high-performance webserver by Anonymous Coward · · Score: 0

    Business plan:

    1. Build a beowulf cluster of webservers.
    2. Put "First post!!!" in the index.html file.
    3. Announce it on Slashdot.
    4. Get a free bandwidth usage and server reliability test.
    5. Change hostname and I.P. address to stop Slashdot effect.
    6. Upload real content.
    7. ???
    8. Profit.

    No, seriously, you could look into the possibility of using the webserver built in to the Linux kernel - it is still an experimental feature, and probably not ready for production use yet, but in a few months it could be.

  80. "software to its limits"? by Anonymous Coward · · Score: 0
    ...at what point we saturated the Apache web server with too many requests handled, the hardware however never budged or was near 100% load, we just pushed the software to its limits.[last paragraph on
    Can someone please explain this?
  81. Umm... news flash by Xformer · · Score: 1

    Someone already has. It's called APC, for Alternative PHP Cache. It's an open source PHP bytecode cache. I don't know if it works with PHP running as a CGI program or not, but the website doesn't say that it doesn't, so...

    --
    All I want is a kind word, a warm bed and unlimited power.
  82. They forgot the important bits by ToasterTester · · Score: 2

    This setup doesn't account for HA or scalability. With hardware as cheap as it is today there is no excuse for not using multiple servers to avoid downtime and allow for maintenance without taking the site down. Also, what about backup? Not even mentioned. Last, I don't fully agree with the RAID 0+1. For a large database, yes, but on a small setup like this I wouldn't use it. The article seems to imply the data is more read than write, and RAID 5 has better read performance.

    So article was missing a lot for a professional setup.

  83. The recommended motherboard has problems by Anonymous Coward · · Score: 0

    That motherboard runs *REALLY* slowly with the Redhat 7.3/ 2.4.18 linux kernel in a 4 gig configuration. My company bought about 8 of these machines, and our vendors don't have a solution. I did a small write up about this.

  84. Re:how to (question) by Anonymous Coward · · Score: 0

    Wondering...

    Would the idea of replicating databases to servers only be viable for web sites that have 99%-100% read-only contents?

    Suppose you have a high volume ordering/inventory system. Wouldn't the replicated database raise the possibility that two orders will collide?

  85. What Really Happened! by ssassen · · Score: 1
    I wanted to give some feedback about the server's performance and address some of the concerns mentioned in this thread. First off, we're competent enough I can assure you. I was actually amazed we kept on running as for example Tom's Hardware is nearly unreachable when they're Slashdotted. They got a rack full of servers and quite possibly a load balancer in front. We just got one single box, so do the math.

    But the problems were all software related and Apache took the bulk of it. The problem with Apache is that the per-connection overhead is too high. It's a couple Megs per connection generally, and if you use keepalives (enabled by default), then each connection process will be tied up for as long as 30 seconds (which is the default I think) after the request has been completed.
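    As a rough illustration of why keepalives hurt here, the estimate below applies Little's law (average processes busy = arrival rate x time each connection holds a process). The 2 MB per process and the 30-second keepalive are taken from the paragraph above; the arrival rate and service time are made-up numbers, and the model assumes the worst case where every client idles for the full timeout. In practice Apache caps the process count, so the excess shows up as queued or refused connections rather than as memory.

```python
def busy_processes(new_connections_per_sec, service_sec, keepalive_sec):
    """Little's law: average processes in use = arrival rate * hold time."""
    return new_connections_per_sec * (service_sec + keepalive_sec)


def memory_mb(processes, mb_per_process):
    """Total memory if every busy process costs mb_per_process."""
    return processes * mb_per_process


RATE = 50.0            # assumed: new connections per second during the spike
SERVICE_SEC = 0.2      # assumed: time to actually serve one request
KEEPALIVE_SEC = 30.0   # from the post: idle hold time after the request
MB_PER_PROCESS = 2.0   # from the post: "a couple Megs per connection"

with_keepalive = busy_processes(RATE, SERVICE_SEC, KEEPALIVE_SEC)
without_keepalive = busy_processes(RATE, SERVICE_SEC, 0.0)

print(f"keepalive on : {with_keepalive:.0f} processes, "
      f"{memory_mb(with_keepalive, MB_PER_PROCESS):.0f} MB")
print(f"keepalive off: {without_keepalive:.0f} processes, "
      f"{memory_mb(without_keepalive, MB_PER_PROCESS):.0f} MB")
```

    With these assumed inputs the idle keepalive time, not the request itself, accounts for nearly all of the processes in use.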

    Additionally, since Apache works with a pool of individual processes to handle connections, there is no way to have a global shared resource between all processes. So, in the case of your database connections, you have a 1:1 relationship between db connections and HTTP processes. The result is that you have HTTP processes with open db connections serving images and so forth that don't even need db links. So, you end up using a lot more db connections than you actually need.

    The thing we need to do to be able to handle such loads in the future is change from Apache to something that uses a worker thread model within a single process. Apache 2.0 may be set up to work like this, but I think it uses a hybrid model that still uses processes for dynamic stuff like PHP. Apache 2.0 will definitely help a little, though.
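    A minimal sketch of the idea, not of any particular server: worker threads in one process share a small pool of database connections, and requests for static files never take one. `ConnectionPool`, `handle_request` and the string "connections" are hypothetical stand-ins invented for this example; a real server would hold actual database handles.

```python
import queue
import threading


class ConnectionPool:
    """Fixed set of connections shared by all threads; acquire() blocks when empty."""

    def __init__(self, size):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"db-conn-{i}")  # stand-in for a real DB connection

    def acquire(self):
        return self._free.get()

    def release(self, conn):
        self._free.put(conn)


def handle_request(pool, needs_db, stats, lock):
    if not needs_db:
        return  # static content (images etc.): no DB connection is held
    conn = pool.acquire()
    try:
        with lock:
            stats["in_use"] += 1
            stats["peak"] = max(stats["peak"], stats["in_use"])
            stats["served"] += 1
        # A real handler would run its queries on `conn` here.
    finally:
        with lock:
            stats["in_use"] -= 1
        pool.release(conn)


def run(workers=20, pool_size=4):
    pool = ConnectionPool(pool_size)
    stats = {"in_use": 0, "peak": 0, "served": 0}
    lock = threading.Lock()
    # Every second request is treated as dynamic (needs the database).
    threads = [
        threading.Thread(target=handle_request, args=(pool, i % 2 == 0, stats, lock))
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats


stats = run()
print(stats)
```

    Here 20 workers get by with at most 4 connections, where a one-connection-per-process setup would hold 20 open.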

    But anyway, what's also happening is that MySQL is only able to handle so many requests and then you're getting HTTP processes piling up waiting for it. So if we can cut down on the number of requests per page that will make a pretty significant difference when spread across thousands of users.

    So yes, I think the Apache keep-alives 'did us in' for the most part and the pool of child processes you create becomes unmanageable at some point with many 1000s of connections at the same time. The worst part is that optimizations such as this can't be found in the manual, you'd need to have been in the 'trenches' to know about things like this. Fortunately we have a great team and Vitaliy, our CTO, is really on top of things, and actually had a great time this weekend, or as he put it 'this is better than simulation'.

    Overall I'm more than happy with the performance of the server, it was never designed to handle such loads, and yet it kept on running, it never faltered and it certainly did not turn into a smoking heap of rubble as some suggested. We just were a little slow with serving out those pages and must've been unreachable to some with a slower connection.

    If anybody else has some additional comments or insights I'd be happy to discuss this further, or go into greater detail. After all we're all here to learn right?

    Sander Sassen

    Email: ssassen@hardwareanalysis.com
    Visit us at: http://www.hardwareanalysis.com

  86. Read this by WillRobinson · · Score: 1

    Sorry your site got /.ed ;) everybody's wish...
    Anyway, you can find a lot on Google by searching for mysql server optimization. But here is a good starting point; the part about tuning the server during operation is a bit further down in the article. A big performance gain can also come from the compilation options of mysql. http://atmail.nl/docs/mysqloptimize.html

    So which fork of phpnuke did you use?

    Regards,
    Rod Longhofer

  87. Last Post! by alpg · · Score: 1

    It is practically impossible to teach good programming style to students
    that have had prior exposure to BASIC: as potential programmers they are
    mentally mutilated beyond hope of regeneration.
    -- Edsger W. Dijkstra, SIGPLAN Notices, Volume 17, Number 5

    - this post brought to you by the Automated Last Post Generator...