Slashdot Mirror


Real World Webserver Price vs. Performance Figures?

Borgoth asks: "At my company we just broke 10 million pageviews per day. We use 5 2-processor 1U off-the-shelf Intel boxes running Apache, Linux, mod_perl, and MySQL. This averages out to about 2 million pageviews per day per server (about 20 million hits/server, including images). Most of our pages have some dynamism using mod_include SSIs, and maybe one pageview in five directly results in a db query. We think we should be pretty happy that we're doing so much with so little, but we don't really have any idea how much horsepower other sites are using in their server farms. So, what sort of webfarms do Slashdot readers maintain, and how does their performance compare?"

56 comments

  1. not many comparisons by rumpledstiltskin · · Score: 2, Insightful

    You probably won't find a whole lot of comparable situations, even on slashdot, except maybe slashdot itself.

  2. Well I really dont know.. by override11 · · Score: 4, Funny

    But if you give us the URL of your web site, the kind folks here at /. would be happy to give it a load test for ya. :)

    --
    No I didnt spell check this post...
    1. Re:Well I really dont know.. by Anonymous Coward · · Score: 0

      Um. Not really.

      If they're breaking 10,000,000 page views per day, that's 115 per second every second of the entire day.

      I sincerely doubt the extra load of 10,000 or 20,000 slashdotters is going to dent it.
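
The per-second figure above is easy to check. A minimal sketch, using only the daily totals quoted in the question; the function and variable names are illustrative, and the result is an average that says nothing about peaks:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(per_day: float) -> float:
    """Average requests per second for a given daily total."""
    return per_day / SECONDS_PER_DAY


# Daily totals taken from the original question.
site_pageviews = per_second(10_000_000)     # whole site
server_pageviews = per_second(2_000_000)    # one of the 5 servers
server_hits = per_second(20_000_000)        # hits including images, per server

print(f"site:       {site_pageviews:.1f} pageviews/s")
print(f"per server: {server_pageviews:.1f} pageviews/s")
print(f"per server: {server_hits:.1f} hits/s")
```

Real traffic is not flat across the day, so the busiest hour will sit well above these averages.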

  3. Well CT? by gmhowell · · Score: 1

    It would be nice if the editors read their own site, then maybe you'd get some good answers.

    If they maintained their own servers.

    If they're not already compiling an answer that isn't a flippant troll like this;)

    --
    Jesus was all right but his disciples were thick and ordinary. -John Lennon
    1. Re:Well CT? by krow · · Score: 5, Informative

      I've been looking at this lately. Most sites seem to be able to do a million pages per webhead.

      The answer for slashdot is more complex because we have three groups.

      Article/comment servers can handle 200K page views apiece.
      Index/All can handle 100K.
      Static/XML can take a million per server.

      I have a fix that goes in this week which should up Article/Comment, for index I am looking at a new system for caching the stories that should increase the index servers.

      --
      You can't grep a dead tree.
  4. My friend's site... by rwsorden · · Score: 5, Funny

    My buddy over at Oesterly.com seems to think that a Pentium 100 and 128MB is sufficient.

    1. Re:My friend's site... by Dr.+Photo · · Score: 2, Funny

      Is that an invitation for everyone on Slashdot to simultaneously test that hypothesis? ;-)

    2. Re:My friend's site... by rwsorden · · Score: 3, Funny

      Heh. If he gets too many hits, I think there's a good chance that I will be downgraded from Friend V2.0 to Acquaintance V1.5...

    3. Re:My friend's site... by mike_ggood · · Score: 1

      Go ahead and test it. He won't mind. He loves a good challenge.

    4. Re:My friend's site... by rwsorden · · Score: 2, Funny

      "Well, it's been a few hours and it looks like a Pentium 100 and 128MB of RAM is all anyone would ever need to host a website."

      - rwsorden, May 5, 2003 from Infamous Technology Quotes

    5. Re:My friend's site... by sql*kitten · · Score: 1

      Heh. If he gets too many hits, I think there's a good chance that I will be downgraded from Friend V2.0 to Acquaintance V1.5...

      Serving only static files, on a threaded webserver and a decent OS, even a lowly Pentium can saturate a T1 without breaking a sweat. Hell, I remember saturating 155 Mbit/s ATMs (100x faster than a T1) on 1994-vintage DEC Alphas. I would say that most Slashdottings flatten routers and pipes rather than servers.
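
The bandwidth point can be put in rough numbers. A back-of-the-envelope sketch: the T1 rate of 1.544 Mbit/s is standard, the 155 Mbit/s figure is the one quoted above, and the 10,000-byte average response size is purely an assumption for illustration. Protocol overhead is ignored.

```python
T1_BPS = 1_544_000       # T1 line rate in bits per second
ATM_BPS = 155_000_000    # the "155 Mbit/s" figure quoted above

# Assumed average static response size; not a number from the thread.
AVG_RESPONSE_BYTES = 10_000


def requests_to_saturate(link_bps: int, avg_response_bytes: int) -> float:
    """Responses per second needed to fill a link, ignoring protocol overhead."""
    return link_bps / (avg_response_bytes * 8)


print(f"ATM vs T1: about {ATM_BPS / T1_BPS:.0f}x")
print(f"T1 is full at about "
      f"{requests_to_saturate(T1_BPS, AVG_RESPONSE_BYTES):.1f} responses/s")
```

Under that assumption a T1 fills up at a couple of dozen responses per second, which is why the pipe tends to give out before the CPU does.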

    6. Re:My friend's site... by rwsorden · · Score: 2, Funny

      Who do you think you are? This is Slashdot! How dare you analyze the situation logically and reasonably!

      Seriously, though, I'm glad you brought that up. That'll make him feel a whole lot better. Sometimes the "myth of the almighty Slashdotting" can get even the most level-headed webmaster up in arms. Now if only he'd return my phone call...

  5. Hard to say by linuxwrangler · · Score: 4, Insightful

    It's a bit like saying "we just shipped 5000 thingies last month using 3 vehicles". Um, 5000 beanie babies or 5000 tractor engines?
    Was the vehicle a rowboat or a train?

    Every site is different. I don't really care that the servers are 1U at the expense of telling us things like how large the database is and is it mostly cached reads or read-write activity? How big is the pipe? What is the CPU speed and RAM size? What is the speed and type of disk? How many bytes are transferred?

    Incidentally, a much more important number is peak capacity, i.e., what is your 5-minute peak load? Whatever you can reasonably handle for 5-10 minutes you can probably handle constantly, but a supposedly high-volume site can melt down when the site gets flashed up on the morning news or Slashdot.

    --

    ~~~~~~~
    "You are not remembered for doing what is expected of you." - Atul Chitnis
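
One way to measure the 5-minute peak mentioned above is a sliding window over request timestamps. A minimal sketch, assuming the timestamps (in seconds) have already been pulled out of an access log; `peak_window_count` is a hypothetical helper, not part of any tool named in this thread:

```python
from collections import deque


def peak_window_count(timestamps, window_seconds=300):
    """Largest number of requests seen in any window of `window_seconds`."""
    window = deque()
    peak = 0
    for t in sorted(timestamps):
        window.append(t)
        # Drop requests that are no longer inside the window ending at t.
        while window[0] <= t - window_seconds:
            window.popleft()
        peak = max(peak, len(window))
    return peak


# Made-up timestamps: a quiet start, then a short burst.
sample = [0, 10, 20, 400, 401, 402, 403]
burst = peak_window_count(sample)
print(f"peak: {burst} requests in 5 minutes "
      f"({burst / 300:.3f} requests/s sustained)")
```

Comparing this figure with the daily average shows how much headroom the farm needs for its busiest stretch.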
  6. Blahaq by Anonymous Coward · · Score: 0, Funny

    Imagine a bewolf cluster of these! DOES IT RUN LINUX?! now?! does it?! DOES IT?! *IMPLODES* wtffffffffffffffffffffffffffffffff

  7. You should move all static content by Anonymous Coward · · Score: 2, Interesting

    To a purely static server, like thttpd. Then you can focus the dynamic servers on serving purely dynamic stuff, and optimize accordingly. Also, MySQL 4's query cache is a great thing, so if you're not using it yet, look into it.
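
The idea behind a query cache can be sketched in a few lines. This is not MySQL's implementation: it uses Python's built-in sqlite3 as a stand-in database, the `CachedDB` class is hypothetical, and it drops the whole cache on any write, whereas MySQL's query cache invalidates per table.

```python
import sqlite3


class CachedDB:
    """Read-through cache: SELECT results are reused until the next write."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}
        self.hits = 0

    def query(self, sql, params=()):
        key = (sql, params)  # keyed on the exact query text and parameters
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        rows = self.conn.execute(sql, params).fetchall()
        self.cache[key] = rows
        return rows

    def write(self, sql, params=()):
        self.conn.execute(sql, params)
        self.conn.commit()
        self.cache.clear()  # crude invalidation: any write empties the cache


db = CachedDB(sqlite3.connect(":memory:"))
db.write("CREATE TABLE stories (id INTEGER PRIMARY KEY, title TEXT)")
db.write("INSERT INTO stories (title) VALUES (?)", ("First post",))

select = "SELECT title FROM stories ORDER BY id"
first = db.query(select)    # goes to the database
second = db.query(select)   # served from the cache
db.write("INSERT INTO stories (title) VALUES (?)", ("Second post",))
third = db.query(select)    # cache was cleared, so this re-reads
print(db.hits, first, third)
```

A cache like this pays off on read-heavy pages; a write-heavy table would invalidate it too often to help much.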

    1. Re:You should move all static content by Fweeky · · Score: 1
      To a purely static server, like thttpd [acme.com]. Then you can focus the dynamic servers on serving purely dynamic stuff, and optimize accordingly.

      We use Boa, which is a little faster (apparently).
      Also, MySQL 4's query cache is a great thing, so if you're not using it yet, look into it.

      That is if it doesn't randomly decide to fall over, destroy your indexes, or corrupt your data.

      *grumble*
  8. You're missing the important stuff by Charlton+Heston · · Score: 4, Informative

    Who cares what everyone else does?

    What is your system load? If it's less than 1, you've got processor power to spare. If it's more than one, you could add more processors IF you think that site response is too slow.

    What is the throughput to your disks? Actually benchmark this with vmstat or something like that. If that shows that your disks are constantly maxed you could get more servers to spread the disk activity around, or you could build a faster disk subsystem if you've got a centralized database. Smart architecting helps too. Don't run the database on the same processors that run scripts and serve pages. Use the database load handling features to improve that specific part of the site. See what pages you can generate statically - I doubt that every single page on a site needs to be from the database.

    --
    Get your stinking paws off me you damn dirty ape
    1. Re:You're missing the important stuff by Khazunga · · Score: 2, Interesting
      What is your system load? If it's less than 1, you've got processor power to spare. If it's more than one, you could add more processors IF you think that site response is too slow.
      This is not true. System load is the average number of blocked processes. They may be blocked waiting for processor time, but they may also be blocked waiting for a lot of other stuff. So, the 100% usage system load depends on what you are doing with the server. You can have a system keep a load average of 20 and yet show unallocated CPU time (My IMAP server has this behaviour). You can have a system with load average of 1.5 and have no spare CPU cycles (if you're number crunching, for example).

      My best advice is: use vmstat. vmstat 10 will give you readings on all the stuff you need to know: mem, disk and cpu usage.

      --
      If at first you don't succeed, skydiving is not for you
    2. Re:You're missing the important stuff by the+eric+conspiracy · · Score: 1

      System load is the average number of blocked processes.

      Depends on what tools you are using. Many (uptime on RedHat for example) exclude processes blocked by I/O.

    3. Re:You're missing the important stuff by Khazunga · · Score: 1
      Load average comes from /proc/loadavg, so it's calculated by the kernel. I'd be surprised to see distributions finetuning the kernel to change stuff like this, but since I haven't laid hand on RH systems in a while, I can't refute your statement.

      I've seen the behaviour I've described across SuSE, Debian and Gentoo systems. Servers with many network connections, and lots of disk I/O show high load averages and unallocated CPU time.

      --
      If at first you don't succeed, skydiving is not for you
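
For reference, /proc/loadavg is a single line of five fields: the 1-, 5- and 15-minute load averages, a runnable/total task count, and the most recent PID. On Linux the averages count runnable tasks plus tasks in uninterruptible sleep (typically waiting on disk), which fits the high-load, idle-CPU behaviour described above. A minimal parsing sketch; the sample line is made up, and `parse_loadavg` is an illustrative helper:

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    one_min: float
    five_min: float
    fifteen_min: float
    runnable: int
    total_tasks: int


def parse_loadavg(text: str) -> LoadAvg:
    """Parse the contents of /proc/loadavg into named fields."""
    fields = text.split()
    runnable, total = fields[3].split("/")
    return LoadAvg(float(fields[0]), float(fields[1]), float(fields[2]),
                   int(runnable), int(total))


# Made-up sample: a high load average with only two tasks actually runnable.
sample = "20.15 18.40 17.92 2/512 31337"
load = parse_loadavg(sample)
print(f"1-min load {load.one_min}, runnable now: {load.runnable}")
```

On a Linux box the same function can be fed `open("/proc/loadavg").read()`; pairing it with vmstat output shows whether the load is CPU or I/O.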
  9. My Anecdotal Evidence by Xunker · · Score: 5, Informative

    Here is my anecdotal evidence for the site I run:

    The total outfit is 8 servers, 6 active: 1 DB Server with one hot backup (dual P-III 750, 1.5GB), 4 web servers (~1.1GHz, 1GB), 1 uniproc dedicated image server (1GHz, 1GB) with a hot backup.

    The 4 web servers toss a combined total of about 1.5 million pageloads a day, of which 1.4 mil are dynamically generated using FastCGI/Perl and the others are shtml and stylesheets. A lot of the data that is queried from the DB server can be, and is, cached on the web heads for better performance, so that during peak times the server doesn't have to do much more than 80 queries/sec. The image server using stock Apache 1.3, however, does something like 3m serves a day without much sweat since it's all static content.

    All told that works out to each web server doing something like 325,000 pageviews a day. I don't have a barometer of whether that's good or not, but honestly I worry more about bandwidth than computrons.

    I think you should be pretty happy with what you're doing. I don't know the current figures, but last September Slashdot was doing 2.4m pageviews a day with ~10 web heads (as gleaned from 'Taco's journal). Understand that's not an apples-to-apples comparison since I guess you're serving more static content while Slashdot (and my site) are by and large dynamic.

    --
    Hilary Rosen's speech was about her love of money and her desire to roll around naked in a pile of money.
  10. Try this guy by FreeLinux · · Score: 4, Informative

    I'd say that you should probably talk to JW Smythe. He posted on an article, not too long ago, on bandwidth and porn. From his post he seems like someone who would be able to help you with your question.

    Frankly, I don't think that even Slashdot gets as many page views per day, as you do.

  11. My company by Anonymous Coward · · Score: 3, Informative

    My company works off a dual 933 with 2 gigs of ram and is currently serving out 1 million pageviews per day. Most of the site is cached with PHP/MYSQL

  12. Sorry, no FAQ for that! by jgardn · · Score: 4, Insightful

    You are approaching the point where the information you'll get from others won't apply to you, because you are pioneering new territory with your company's technology.

    You have a website that has its needs. I can't imagine what kind of application you are using, how much memory it needs, whether it is processor intensive or disk intensive, or both. Depending on how your website works, there are a variety of solutions available. One solution to one problem might actually cause more problems for you if applied inappropriately.

    It might make a lot of sense to consolidate the database onto an advanced server -- with 2 procs, RAID SCSI drives, and a fair amount of memory. It might make a lot of sense to get cheaper boxes with more memory and only one processor to run the web servers. Perhaps you can mount them all off of one giant NFS file server, and have the data that the web servers need held in a cache on the web server. It might make a lot of sense to go talk to IBM and Sun and see what they have to offer as well. It might also make a lot of sense to redesign the way your web application works to reduce the load.

    But no one can tell you the right way to do it, because your situation is unique. No one can even give you a good estimate of cost. Your best bet if you are truly lost is to hire someone to analyze your code, your servers, and your needs, and come up with a plan. Those guys cost a bit of money, and finding a good one is near impossible. You're better off at studying up on what your website really needs and experimenting with possible solutions.

    This is where you start to realize why web people can earn up to 6 digits. We don't just design web sites or program applications. We have to make sure they scale as well.

    --
    The radical sect of Islam would either see you dead or "reverted" to Islam.
  13. Using mod_gzip? by Isao · · Score: 4, Informative

    You are using mod_gzip, aren't you? Depending on content, you may be able to reduce your bandwidth usage by 50%, at the expense of some CPU time.

    1. Re:Using mod_gzip? by TwistedKestrel · · Score: 2, Insightful

      Great advice. Nowhere in his post does he mention bandwidth concerns. So if he were to install mod_gzip, he would reduce the capacity of his servers, not increase it. Mod_gzip isn't the answer to everything.

    2. Re:Using mod_gzip? by Anonymous Coward · · Score: 2, Interesting

      The expense of gzipping is easily recovered by the fact that it takes 2x or 3x less time to send the data out. That is the same precious time that httpd spends to send its output, while eating up memory and CPU.

    3. Re:Using mod_gzip? by nyamada · · Score: 2, Informative

      Not that it matters that much, but Safari on Mac OSX doesn't (yet) support gzip encoding. But that's a bug that's been pointed out by lots of people...

    4. Re:Using mod_gzip? by prefect42 · · Score: 1

      That's where you'd be surprised. You might actually increase your throughput. You can find that the all-in time for a connection is lower, even though the CPU usage is higher. The bonus is, since you get to drop the connection sooner, there's a memory bonus, and it's another thread to forget about.

      --

      jh
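
The bandwidth-for-CPU trade discussed in this sub-thread is easy to try offline. A minimal sketch using Python's standard gzip module rather than mod_gzip itself; the sample page is synthetic and highly repetitive, so real pages will usually compress less well:

```python
import gzip


def compression_ratio(payload: bytes, level: int = 6) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(payload, compresslevel=level)) / len(payload)


# Synthetic, repetitive HTML standing in for a dynamically generated page.
rows = "".join(f"<tr><td>item {i}</td><td>{i * 3}</td></tr>\n"
               for i in range(500))
page = f"<html><body><table>\n{rows}</table></body></html>".encode("utf-8")

ratio = compression_ratio(page)
print(f"{len(page)} bytes -> {ratio:.0%} of original size")
```

Whether the CPU spent is worth it depends on which resource runs out first, which is the disagreement in the replies above.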

  14. DB or not DB? by fm6 · · Score: 3, Insightful
    Actually, a lot of Slashdot content is, for all practical purposes, static html. Notice the message that appears when you post?

    You neglected to mention what DBMS you use. Or is it a given nowadays that everybody uses MySQL?

    Which is my cue for my usual anti-MySQL flame. Except that it's old, I'm tired of doing it, you've all heard it. Still, I'd like to see some serious benchmarks comparing MySQL with PostgreSQL, Firebird, and Berkeley DB. With attention to realistic web-style queries, scalability and (except for Berkeley DB, of course) complex queries.

    1. Re:DB or not DB? by Xunker · · Score: 1

      Well, static and "static" are two different things entirely... but yes, I neglected to say I'm on MySQL (I had it in there before, but I edited the comment and removed it in one part anticipating I'd put it somewhere else but didn't. I had the Gettysburg address in there too but I thought I'd take it out for brevity's sake).

      I can flame MySQL with the best of them, but it's still the one I choose because it's the one that sucks the least for what I want it to do. I have yet to find a DB engine that does not blow hard in some major way, and I expect I never will.

      --
      Hilary Rosen's speech was about her love of money and her desire to roll around naked in a pile of money.
    2. Re:DB or not DB? by fm6 · · Score: 1

      So you've looked at other Open Source DBMSs? I'd be very interested in hearing specifics as to their blowhardedness.

    3. Re:DB or not DB? by wmshub · · Score: 1

      You say you'd like to see serious benchmarks comparing MySQL with PostgreSQL? I wish I could help you, but I realized that Postgres sucked before I got far enough. I needed a DB, I got recommendations for Postgres because it was a "real" database and had a fuller feature set than MySQL. So I set up Postgres. Just populating my database (and yes, I used all optimizations recommended, like turning off indexes and transactions while populating) took 10 minutes, twice in the week of work the database was corrupted inexplicably, plus the misery of vacuuming (took 6 minutes!) meant my database would have regular downtimes. I tried MySQL, suddenly populating the database took only 30 seconds and in the 3 years I've been using it, not a single database corruption.

      And the best part? MySQL had been adding the missing features as fast as I've been needing them. So be quiet with your anti-MySQL comments, it's a great database that just plain works. I can never understand why some people flame it so badly...sure, it's not perfect, but from my view it's a great piece of software. Something Postgres most definitely is not.

    4. Re:DB or not DB? by fm6 · · Score: 1
      So be quiet with your anti-MySQL comments,.. I can never understand why some people flame it so badly...
      Uhm, I'm bored with the MySQL flame wars (even though I helped start them), so no noise from me. But I do want to point out that the attitude in your first sentence kind of answers the question implicit in your second sentence!
    5. Re:DB or not DB? by damiam · · Score: 1
      Something Postgres most definitely is not.

      How would you know, if you haven't touched it in three years?

      --
      It's hard to be religious when certain people are never incinerated by bolts of lightning.
    6. Re:DB or not DB? by Sxooter · · Score: 1

      Let's see, 3 years ago, you tried Postgresql. It took 10 whole minutes to load your data, so it must just suck at everything else. Never mind that there's been like 4 major revisions put out (think Mysql 3 to 4 to 5 to 6 to 7).

      If Postgresql was corrupted, it was likely running on a hard drive with bad sectors or in a machine with bad memory. The fact that MySQL doesn't tax your machine as hard makes it more likely that postgresql will show these errors.

      I've been running Postgresql for 4 years, and have NEVER had a table get corrupted, NEVER had it lose data, and NEVER had it crash.

      Your experience with it, i.e. virtually none, disqualifies you from stating it sucks.

      MySQL is NOT A SQL database. It's missing so many features it's hard to make a list and not have it be as long as your arm.

      Here are a few: No constraints (however, it will accept them during table creation, then promptly ignore them!) Allows you to put NULL into not null columns, and just inserts either the default or 0 or '' or 0000-00-00. Foreign keys, now that they finally have them, don't support cascading, which makes them about worthless.

      Let's make it clear, if Postgresql was as shitty as you're saying, the .info and .org domains would not be running on it. How many TLDs run on MySQL by the way? 0.

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
    7. Re:DB or not DB? by Sxooter · · Score: 1

      Oh, and another thing about MySQL is that when it does what would usually be classified as "the wrong thing" it never tells you. Not even a warning.

      Create two tables, one innodb, the other myisam. Run a transaction against both of them. roll it back halfway through.

      MySQL doesn't tell you that you can't run a transaction on a MyISAM table, it doesn't tell you it didn't roll back the MyISAM table, in fact, it happily acts like it got the whole thing right, and rolled it back. Except the changes you made to the myisam table are still there.

      MySQL is a data store with a minimal SQL front end written by people who DO NOT UNDERSTAND the issues of database design in a high load / high transaction environment. To paraphrase the manual, data consistency and what not are not the job of the database, they are the job of your code. God help you if you make a mistake because MySQL isn't going to catch any of them.

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
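
For contrast, this is what a rollback is expected to do on a fully transactional engine. The sketch uses Python's built-in sqlite3 as a stand-in, not MySQL, so it shows the behaviour the comment above says a MyISAM table would silently fail to deliver; the table, names and the simulated failure are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('a', 100), ('b', 0)")
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` between accounts as one all-or-nothing transaction."""
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        if fail_midway:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        conn.commit()
    except RuntimeError:
        conn.rollback()  # undoes the first UPDATE as well


transfer(conn, "a", "b", 40, fail_midway=True)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
```

If the first update survived the rollback, the 40 units would simply vanish, which is the failure mode being complained about.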
    8. Re:DB or not DB? by Sxooter · · Score: 1

      How do you benchmark a ferrari and a freight train?

      MySQL is missing so many key features that benchmarking MySQL against Postgresql is useless.

      How fast is MySQL when it has check constraints on data? No one knows, since it doesn't have them.

      How fast is MySQL when you use a complex sub select? No one knows, it can't do more than a few common cases yet.

      How fast is MySQL when you use stored procedures? Triggers? custom data types? Unions?

      None of those things are there. WHEN MySQL gets around to having most of the features a REAL database has, then you can benchmark it, until then, any benchmark is going to unfairly favor MySQL because MySQL is fast because it is missing most of the features it needs to compete, and it never errors out, even when it should.

      It's a data store, not a database. The SQL it supports is so small a subset of SQL92 as to be useless for any but the most simple content management applications.

      When it meets most of the SQL99 or 03 standards, we'll talk. Til then, you're tilting at windmills when you compare MySQL to any other database. Heck, even Foxpro is a better database than MySQL.

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
    9. Re:DB or not DB? by fm6 · · Score: 1
      How do you benchmark a ferrari and a freight train?
      You compare cargo capacity and speed, of course. This would then help you choose based on the relative importance of these factors for your application. Though I suspect most people would choose some kind of compromise, such as a semi truck.
      MySQL is missing so many key features that benchmarking MySQL against Postgresql is useless.
      What's "key" to you is not "key" to everybody. Thousands of webmasters claim that MySQL has the features they need. Maybe they're wrong, but you're not going to convince them by throwing out generalizations.
      How fast is MySQL when it has check constraints on data? No one knows, since it doesn't have them.

      How fast is MySQL when you use a complex sub select? No one knows, it can't do more than a few common cases yet.

      How fast is MySQL when you use stored procedures? Triggers? custom data types? Unions?

      Come on guy, you're preaching to the choir. I know MySQL is missing key features, or implements them badly. If I were a MySQL partisan (isn't it obvious that I'm not?) I'd respond with two statements: (1) MySQL has enough features for most web applications; (2) Leaving out all those features makes MySQL faster than other engines.

      Now, I happen to suspect that both those statements are false, at least for larger web applications. But I'm not enough of a database expert to make a good argument. And even if I were, throwing CS theories at people doesn't seem to accomplish much. What would accomplish much is a real-world demonstration. Like, suppose you design some simple content management schema, and you want to use it to implement a web site that serves 10,000 pages to 100,000 users per day. (Insert additional assumptions here.) How will your choice of back-end affect performance? Will MySQL really be faster than PostgreSQL or Firebird? And if so, would Berkeley DB be even faster? How about complex queries (a particularly nasty MySQL shortcoming)? Answering these questions would accomplish more than any amount of theorizing and flaming.

      None of those things are there. WHEN MySQL gets around to having most of the features a REAL database has, then you can benchmark it,
      Somehow, I get the impression that you don't like MySQL! ;) Correct me if I'm wrong, but if somebody told you you had to use it for your next project, you'd be just a little unhappy, right? Well then, you need to respond to the fact that MySQL is what everybody uses. You have a vested interest in demonstrating its shortcomings. You can't just ignore it and hope it will go away. Not gonna happen.
    10. Re:DB or not DB? by fm6 · · Score: 1
      MySQL is a data store with a minimal SQL front end written by people who DO NOT UNDERSTAND the issues of database design in a high load / high transaction environment.
      True. Unfortunately, most people who design database applications don't understand them either.
    11. Re:DB or not DB? by Sxooter · · Score: 1

      Actually, I really do like MySQL. I just don't like to benchmark it against Postgresql in the role of "datastore for a web site" without looking elsewhere.

      It may be hard to believe, but you can do more with a database than build a web site :-)

      MySQL is a great backend data store for web sites. However, it is NOT ACID compliant, and shouldn't be compared against an ACID compliant database.

      Postgresql uses Write ahead logging, and you can pull the power plug on the machine in the middle of 1000 concurrent transactions and the database will lose NONE of your data.

      That doesn't come free. MySQL has no approximation of it. If the OS fails to complete a write to a block, MySQL has no write ahead log or what not to fix the data with. You've got a corrupted record with half the data in it, and that's that.

      MySQL doesn't do constraints, so for every check constraint I put into a Postgresql based application, I have to write one in PHP or Perl or what not for MySQL. But you've got a possibility of race conditions for certain constraints, so MySQL cannot approximate Postgresql in that situation.

      There are MANY other issues brought up by MySQL, but it only gets the A and I from ACID. It lacks proper constraints and just inserts a default value when you insert a NULL into a NOT NULL field! That in no way qualifies for use of the C in ACID. It has no write ahead logging, so it can't qualify for the D, since power failure can result in it having incompletely written records that can't be recovered.

      It's not so much MySQL I hate, I really don't. It's the marketing they do. The MySQL people shout out they have ACID when they don't. Adding innodb tables alone isn't enough to qualify as ACID compliant. Heck, even the Postgresql guys didn't claim the D part until just over a year ago, and that required a HUGE chunk of code for write ahead logging.

      While the Postgresql core developers quietly make one of the nicest databases in the world, the MySQL crew basically lies about the capabilities of their database.

      There are plenty of people using Oracle in a situation that probably should be using MySQL. I can think of at least three examples in my own company where I've told the folks doing the work that they'd be better off with MySQL, because they're basically batch processing in Oracle, not using any constraints, foreign keys, or transactions.

      So no, I don't hate it. But I do wish people would stop spouting misinformed garbage about its capabilities. It's a great partner for Postgresql. Postgresql handles the critical data you can't afford to lose or get wrong, MySQL handles content management. But the MySQL crew continue to try and rule the world, and in the long run, that's actually bad for open source, because the worst thing we can do is give someone a first experience with an Open Source database that can't do what it promised to do, and then they reckon all open source databases are like MySQL, overselling themselves.

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
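
The constraint behaviour argued for above looks like this when the database enforces it. A minimal sketch using Python's built-in sqlite3 as a stand-in for a constraint-enforcing engine (not PostgreSQL or MySQL); the `orders` table and `try_insert` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id       INTEGER PRIMARY KEY,
        customer TEXT    NOT NULL,
        quantity INTEGER NOT NULL CHECK (quantity > 0)
    )
""")


def try_insert(customer, quantity):
    """Attempt an insert and report whether the database accepted it."""
    try:
        conn.execute("INSERT INTO orders (customer, quantity) VALUES (?, ?)",
                     (customer, quantity))
        conn.commit()
        return "accepted"
    except sqlite3.IntegrityError as exc:
        conn.rollback()
        return f"rejected: {exc}"


results = [
    try_insert("alice", 3),   # valid row
    try_insert(None, 3),      # NULL into a NOT NULL column
    try_insert("bob", 0),     # violates the CHECK constraint
]
for line in results:
    print(line)
```

The point of doing this in the database rather than in PHP or Perl is that every client gets the same check, with no window for a race between validation and insert.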
    12. Re:DB or not DB? by fm6 · · Score: 1
      My mistake. I assumed somebody who knew so much about MySQL's shortcomings had to be a MySQL hater.

      On the other hand, you seem to be reading a lot into my use of the word "benchmark". Perhaps you're assuming I'm like all those marketroid drones who use bogus benchmarks to "prove" that a particular product is "superior". In real life, benchmarks only prove that a product does one particular thing in one particular circumstance better. Benchmarks have their legitimate uses, but only if you bear in mind their limitations.

      And I still think benchmarking MySQL against other DBMSs makes sense. You simply have to be careful understanding what the benchmarks mean. For example, you seem to be saying that MySQL will always beat a fancier DBMS when you use it as a simple data repository, where you only retrieve data one record at a time, using a single-field unique index. OK, let's take that as a given. That still leaves me with some unanswered questions:

      • If this is all just a matter of feature overhead, wouldn't Berkeley DB be even faster?

      • I know for a fact that MySQL takes a performance hit when you use multi-field indexes or if the indexed field is a string. (That's why Slashdot comments now have unique ID numbers.) Can this cost enough to justify using a "real" DBMS? And are there real-world web applications where this matters?

      • This kind of argument always seems to end up comparing MySQL with Oracle or PostgreSQL. Now, these are big (I won't say "bloated", though some would) systems optimized for high-end applications. Users of Interbase and its open-source branch Firebird claim that these products eliminate a lot of the performance issues associated with other databases, while retaining all the features of a "real" DBMS. Has anybody made a serious effort to compare the performance of MySQL with these engines?
      My parting comment is offtopic, but I have to respond to this:
      There are plenty of people using Oracle in a situation that probably should be using MySQL. I can think of at least three examples in my own company where I've told the folks doing the work that they'd be better off with MySQL, because they're basically batch processing in Oracle, not using any constraints, foreign keys, or transactions.
      There's more at stake here than performance. Having multiple database engines at hand costs you in terms of maintenance. In two different companies where I worked, the IS people and the web monkeys insisted that people not build applications around MySQL. They weren't just being narrow minded. Once the developer handed off the application, somebody had to look after it for years into the future -- and nothing raises your maintenance costs more than unnecessary multiplicity of technologies.

      Then again, both companies sold, and used, high-end servers, so the performance hit was never an issue anyway! People actually just used MySQL because it was easier to install, or because it was what they knew.
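
The simplest case named above, fetching one record at a time through a single-field unique index, is straightforward to time. A rough harness sketch, run against Python's built-in sqlite3 only because it needs no server; the schema and function names are hypothetical, and pointing the same loop at MySQL, PostgreSQL or Firebird through their own drivers is left as an assumption. The numbers it prints mean nothing beyond the machine and workload they came from.

```python
import sqlite3
import time


def build_db(rows):
    """Create an in-memory table keyed on a single integer column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany("INSERT INTO pages VALUES (?, ?)",
                     ((i, f"page {i}") for i in range(rows)))
    conn.commit()
    return conn


def time_lookups(conn, ids):
    """Fetch each id individually; return (rows found, elapsed seconds)."""
    start = time.perf_counter()
    found = 0
    for i in ids:
        if conn.execute("SELECT body FROM pages WHERE id = ?",
                        (i,)).fetchone():
            found += 1
    return found, time.perf_counter() - start


conn = build_db(10_000)
found, elapsed = time_lookups(conn, range(0, 10_000, 10))
print(f"{found} lookups in {elapsed:.4f}s")
```

A fair comparison would also need concurrent clients and a realistic mix of reads and writes, which this single-threaded loop does not attempt.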

    13. Re:DB or not DB? by Sxooter · · Score: 1

      My thing against benchmarking is that MySQL wins hands down on simple insert / delete against almost every serious database out there. It just flies. Unfortunately, I do a lot of

      select enum from table except (select enum from table2) kinda stuff, and mysql just can't do it.

      The part about maintaining two databases is valid, but only so far. The biggest cost of running Oracle are license fees and maintenance. MySQL doesn't really need a whole lot of maintenance, since it's more or less self maintaining, and any one of the Perl or PHP developers can run myisamchk when they need to.

      If you can get rid of half your Oracle licenses, then the usage of MySQL makes sense, if it doesn't help you get rid of licensing / support contracts, then I guess Oracle is still the way to go.

      But really, there are TONs of shops out there running Oracle for content management type stuff that are just tossing money into the toilet.

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
    14. Re:DB or not DB? by fm6 · · Score: 1
      the biggest costs of running Oracle are license fees and maintenance
      Many large companies have site licenses for enterprise software, including whatever DBMS is standard. Maybe they could negotiate smaller fees by saying, "we use MySQL for the small stuff!" but I doubt it!
      I do a lot of "select enum from table except (select enum from table2)" kinda stuff, and mysql just can't do it.
      Except that you can fake it on the client side by doing multiple queries and merging the data. Yeah, that's painfully inefficient. But I suspect that a lot of people use MySQL for this kind of thing because they just don't have the training to design complex queries.
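A rough sketch of what "multiple queries and merging the data" on the client side could look like. Everything named here is an assumption for illustration: the tables `t1` and `t2`, the column `val`, and the helper `except_client_side` are hypothetical, and Python's built-in sqlite3 module stands in for a MySQL connection only so the example is self-contained.

```python
import sqlite3


def except_client_side(conn, left_sql, right_sql):
    """Emulate `left EXCEPT right` with two queries merged in the client."""
    left_rows = conn.execute(left_sql).fetchall()
    right_rows = set(conn.execute(right_sql).fetchall())
    seen = set()
    result = []
    for row in left_rows:
        # EXCEPT also removes duplicates, so skip rows already emitted.
        if row not in right_rows and row not in seen:
            seen.add(row)
            result.append(row)
    return result


def demo():
    # Hypothetical schema and data, only for the demonstration.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t1 (val INTEGER);
        CREATE TABLE t2 (val INTEGER);
        INSERT INTO t1 VALUES (1), (2), (2), (3), (4);
        INSERT INTO t2 VALUES (2), (4);
    """)
    return except_client_side(
        conn,
        "SELECT val FROM t1 ORDER BY val",
        "SELECT val FROM t2",
    )


if __name__ == "__main__":
    print(demo())
```

The inefficiency is visible in the sketch: both full result sets cross the wire and sit in client memory before anything is filtered, which is work the database would otherwise do itself.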
      MySQL doesn't really need a whole lot of maintenance, since it's more or less self maintaining...
      "More or less" self-maintaining? It'd have to be totally, absolutely self-maintaining to satisfy the IS and production web people I've talked to. Suppose you need to upgrade the OS, or move the app to a new server? This might mean that you have to upgrade the DBMS engine too. If the original application designer is not available (busy with something else, left the company) the maintenance-mode owner needs to have technology s/he is familiar with.

      Anyway, I'm willing to accept that Oracle or PostgreSQL can't run as fast as MySQL for very simple queries. But I still want to see some serious comparison of MySQL with Interbase and Berkeley DB!

  15. .com ebay bargains and party pics by jaydho · · Score: 0, Offtopic

    I just got this puppy w/1GB ECC memory and it is doing a fine job even with the high demand for college party pics. You owe it to yourself to check out the elevated horizontal body shot and the wet-fun-fountain photos :-)

    Offtopic -2, Lovely Ladies +5

    1. Re:.com ebay bargains and party pics by jaydho · · Score: 1

      Specific details are about 600 visitors a day, 4000 hits on the aforementioned server and about 40,000 on the photo server which is hosted over at Hurricane Electric. The Penguin server is a really nice system, extremely fast with the dual 600MHz PIIIs and all that memory combined with a snazzy SCSI HD. Compared to the old 233MHz server I used to use there is a huge difference. Even after a bit of Slashdot traffic there isn't any degradation on the new server, and if there were, it would be because of the limited bandwidth more than processing. Ohh yes, the new server is also hosting about 6 other sites so we'll peg the total hits a day at about 7000, but keep in mind every one of these hits is generating a dynamic (ASP based) page...

    2. Re:.com ebay bargains and party pics by Anonymous Coward · · Score: 0

      Hey buddy, learn the difference between / and \. Your site is broken in mozilla because of it. (no thumbnails).

    3. Re:.com ebay bargains and party pics by Anonymous Coward · · Score: 0

      Yup. Broken in Opera 6 too. All the traffic in the world doesn't matter if your site has got evil inside!

  16. Our stats by Graelin · · Score: 3, Informative

    As many others have pointed out the question really should have been "What setup are you running MY site on and how much traffic are you handling?" This is, in no way, apples to apples.

    We are comfortably serving 2.5M dynamically generated pageviews every month across 3 webheads, 1 software load balancer and two large DB servers. This is all mod_perl work here. Last I looked we were doing about 1.5TB/month in bandwidth from these dynamic pages.

    Webhead data (currently 3, adding 2 more soon):
    2x1.67GHz Athlon
    3GB RAM / 18GB SCSI disk (only used for logs, content is read over NFS)

    LB data (we're moving this to a Cisco CSS 11050):
    1x1.4GHz PIII
    2GB RAM / Disk unimportant, it's never touched.
    Software load balancer: Pound, quite an amazing piece of software.

    DB server (one live, one hot-spare)
    4x1.6GHz Xeon (PowerEdge 6650)
    4GB RAM / Big ass disks and a 40GB database

    MySQL currently sees about 500-600 queries per second on the DB. We need to implement more server-side caching though, we are seeing an alarming 54% query cache hit rate (4.0.12).
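For anyone wanting to check their own number, here is a minimal sketch of one common way to derive a query cache hit rate from the `SHOW STATUS` counters. It assumes the usual accounting where a cache hit increments `Qcache_hits` but not `Com_select`, so hits plus `Com_select` approximates all SELECTs seen. The post does not say how the 54% was measured, and the counter values below are made up to land on the same figure.

```python
def query_cache_hit_rate(status):
    """Fraction of SELECTs answered from the query cache.

    `status` maps SHOW STATUS variable names to values (strings or ints).
    """
    hits = int(status["Qcache_hits"])
    # SELECTs that were actually executed, i.e. not served from the cache.
    executed_selects = int(status["Com_select"])
    total = hits + executed_selects
    return hits / total if total else 0.0


# Made-up counters, chosen only to illustrate a 54% hit rate.
sample_status = {"Qcache_hits": "5400", "Com_select": "4600"}

if __name__ == "__main__":
    print(f"{query_cache_hit_rate(sample_status):.0%}")
```

Since the counters are cumulative from server start, sampling them twice and working on the differences gives a better picture of current behaviour than a single reading.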

    One thing I'm looking at is less computation on the forward-facing webservers. Instead, using SOAP to build the page components from a separate cluster of application servers. Preliminary testing is promising.

  17. at a previous job by Anonymous Coward · · Score: 2, Informative

    We got about a million pages a day or so... 95% or higher was highly dynamic database driven (plus there was a very active forums section). 5 1U 2CPU webservers (apache/php/coldfusion), 3 database servers (mysql w/ replication), and were at prob 40 to 50 percent capacity... would have been much better if there was more caching and/or a better indexed database

  18. What are you running? by placiBo · · Score: 3, Interesting

    Too many people seem to concentrate on processing power and hardware while neglecting the software side of things.
    Using a web server which pre-forks (for example, Apache 1.3.x) is probably the best way to dramatically reduce performance and scalability in most situations. The sheer number of processes under high load makes most schedulers crap themselves.
    Multithreading (Apache 2.x is an example) can greatly improve performance and scalability, as can single-process, single-threaded servers built on multiplexed non-blocking I/O, such as thttpd, Boa or Zeus.
    Once one has selected a server which works efficiently for their content and fine-tuned their OS, one can move on to actual processing power and system throughput.
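To make the "single process, single threaded, multiplexing non-blocking IO" model concrete, here is a toy sketch using Python's standard `selectors` module. It is not how thttpd, Boa or Zeus are implemented; the names `make_listener`, `run_event_loop` and `RESPONSE` are hypothetical, and it cuts corners a real server cannot (it treats the first chunk read as a complete request and assumes the tiny response fits in the socket buffer).

```python
import selectors
import socket

RESPONSE = b"HTTP/1.0 200 OK\r\nContent-Length: 3\r\n\r\nok\n"


def make_listener(host="127.0.0.1", port=0):
    """Create a non-blocking listening socket (port 0 lets the OS pick one)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(128)
    sock.setblocking(False)
    return sock


def run_event_loop(listener, max_requests):
    """Answer requests from one thread until max_requests have been served."""
    sel = selectors.DefaultSelector()
    sel.register(listener, selectors.EVENT_READ)
    served = 0
    while served < max_requests:
        # One call waits on every socket at once; no process or thread
        # is parked per connection.
        for key, _events in sel.select(timeout=1.0):
            sock = key.fileobj
            if sock is listener:
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            else:
                data = sock.recv(4096)
                if data:
                    # Simplification: a real server would buffer the reply
                    # and wait for EVENT_WRITE instead of sending directly.
                    sock.send(RESPONSE)
                    served += 1
                sel.unregister(sock)
                sock.close()
    # Close any connections still open when the loop stops.
    for key in list(sel.get_map().values()):
        if key.fileobj is not listener:
            sel.unregister(key.fileobj)
            key.fileobj.close()
    sel.close()
    return served
```

The point of the design is that the number of open connections only grows the selector's watch list, not the number of schedulable processes, which is the scheduler pressure the post is complaining about.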

    1. Re:What are you running? by 1101z · · Score: 2, Informative

      If it is Linux you are wrong in a couple of ways.

      1. Linux maps threads to processes so you get a mass of processes anyway.

      2. If you want to run things that are not thread safe, like PHP, you have to pre-fork. In fact, PHP's web site says not to run PHP with Apache 2.x on a UNIX-like OS on any production web site, because most libs for it are not thread safe. That means mod_perl and mod_* are going to have the same problems. It may work, it may not, and that is not what I want to base my job on.

      3. Single process, single threaded web servers are nice for static content but not for most modern high-volume web sites which have lots of dynamic parts.

      --
      One day people will learn the folly of Winbloze, Linux Rules!
  19. Consider the whole picture.. by elemur · · Score: 3, Insightful

    Think about your network, load balancing, and other sorts of issues.

    For example, I had a site that I ran for a while that was fairly poorly built from an application perspective. However, the client had prepped a flash load (ie: a bursty, concentrated load) for a specific time period.. and I had about a month to prepare. The problem was that we couldn't rewrite the apps part of the site to ease the congestion, nor could we rewrite some apps to be distributed to multiple servers. (They stored state on the server..)

    So, I brought in a Foundry ServerIron, and used the URL switching to map all static files/items to a pair of Ultra 5 workstations. These had a bunch of memory and had iPlanet Enterprise Server configured with very aggressive caching parameters. For the dynamic content, I also increased any caching parameters available.

    (This is high level, but you get the idea. Basically, serve as much out of memory as possible.. other tuning issues.. turn off name resolution obviously.. make sure you aren't I/O bound.. or network bound for that matter.)
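The URL switching idea can be sketched in a few lines. This is only an illustration of the routing decision, not the ServerIron's configuration or rule syntax; the extension list, pool names and the `pick_pool` helper are all hypothetical.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical list of extensions treated as static content.
STATIC_EXTENSIONS = {".gif", ".jpg", ".png", ".css", ".js"}


def pick_pool(url):
    """Return "static" or "dynamic" based on the path's file extension."""
    # Only the path counts; a query string like ?img=x.gif must not match.
    path = urlsplit(url).path
    extension = posixpath.splitext(path)[1].lower()
    return "static" if extension in STATIC_EXTENSIONS else "dynamic"


if __name__ == "__main__":
    for url in ("/images/logo.GIF", "/cgi-bin/cart?item=3.gif", "/"):
        print(url, "->", pick_pool(url))
```

Splitting on something this cheap to evaluate is what lets the static boxes serve almost entirely from memory while the stateful application servers see only the requests they actually have to compute.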

    The day came around and we served 5 or 6 million hits in two hours or so.. the average load on the servers was around 0.1. In fact, even on the servers with the static content getting lots of hits, there was only really disk activity when access logs were flushed to disk (Every 30 seconds)..
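For scale, a quick back-of-the-envelope conversion of those figures into average requests per second, next to the 10 million pageviews per day from the original question. It assumes exactly two hours for the burst (the post says "two hours or so") and a perfectly even spread, so real peaks would be higher.

```python
def per_second(count, seconds):
    """Average rate, assuming the load is spread evenly over the period."""
    return count / seconds


TWO_HOURS = 2 * 3600
ONE_DAY = 24 * 3600

burst_low = per_second(5_000_000, TWO_HOURS)    # 5 million hits in 2 hours
burst_high = per_second(6_000_000, TWO_HOURS)   # 6 million hits in 2 hours
daily_average = per_second(10_000_000, ONE_DAY)  # 10 million pageviews/day

if __name__ == "__main__":
    print(f"burst: {burst_low:.0f}-{burst_high:.0f} hits/s")
    print(f"10M pageviews/day: {daily_average:.0f} pageviews/s")
```

Note that one side counts hits (images included) and the other counts pageviews, so the two numbers are not directly comparable.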

    So, don't just think about servers.. consider all options when trying to balance and handle your load.

  20. CORRECTION by Sxooter · · Score: 1

    I'd like to point out I talked to the author of InnoDB tables, and he assured me that InnoDB tables use a "double write MVCC" mechanism to assure the D part of ACID. So, the only part MySQL is still missing to be ACID is check constraints.

    --

    --- It is not the things we do which we regret the most, but the things which we don't do.
  21. Cant get it but it works... by Anonymous Coward · · Score: 0

    Lotus/IBM Domino Go Webserver 4.6.2.#

    Equipment needed to handle 2 million page views a day: PII 350 with 256MB RAM, plus a PII 400 for MySQL. OS/2 Warp Server for e-Business (latest release is Feb 2003; the releases we've used are 2003, 2001, 1999 and Warp Server Advanced 1996).

    The SQL server can sometimes come close to breaking a sweat, but (1) we have a dual SMP box waiting for us to work up the initiative to make the switchover, and (2) the web server doesn't come close to breaking a sweat - sometimes I wonder if it knows it's even doing anything. Memory is always a good thing to increase though for dynamic pages. Shame Domino Go is kinda tough to come by - it comes with WSeB releases and as an obscure security option (sold with another IBM update package under that package's name) - and that the performance is only there under Warp (and maybe AIX) because of Domino Go's extensively intensive use of threads (up to 4,000 per CPU).

    Oh - and unless you have extremely fast disks and caching controllers, JFS (included) or HPFS386 (add-on) is the better choice over HPFS due to larger (up to the machine's max available RAM) caches and the fact that they are designed to pipeline the data in a better fashion to the httpd or directly to the network card(s). HPFS (either variant) will save you the chore of needing to de-fragment, which is its big advantage over JFS if you are serving lots of files - IBM calls it "fragmentation resistant" though it's near fragmentation proof... I've got machines with thousands of directories and in many of those, thousands of files, some written as long ago as 7 years, and low single digit fragmentation for the drives.

    Sorry to say, nothing yet I've tried comes even remotely close... IBM really blew it... this is what Warp Server Advanced was designed for... on one CPU it still beats NT on 4 CPUs for this type of serving (or any actually), and WSeB 1999 till present is even faster and better optimized. We've worked with some big clients (on the graphical end of web stuff) who have each opted for various other web solutions (Linux, NT, 2K, XP, Be, etc), one big one had MS's help in setting up and installing their network (big cash outlay, big stake in getting the results wanted). They ended up with 6 dual CPU boxes that still hit 15% or more "Server too busy" errors to try to match our traffic.

    Next sad note is that there are numerous IBM 4-way, 8-way (and even a few 16, 32, and occasionally that rare 64-way) boxes out there that are VERY cheap (some 4-ways in the $700 range) - they're used and refurbished (sometimes by IBM themselves) - and Warp Server (Advanced or eBusiness) flies on them... native 64-way per node support. I know it can do clusters, but not sure how many nodes it is designed for... but with up to 64 in a single node, why would anyone want to?

    WSeB (with Domino Go, WebSphere, and the availability from IBM or elsewhere of MySQL, DB/2, Notes and more) is still for sale - as is eComStation and eComStation Pro (SMP version) - though eCS doesn't come with Domino Go AFAIK. Finding WSeB for sale may be a pain though...

    Probably not a viable solution for you, but we and numerous of our customers aren't turning back any time soon. Good luck with what you do choose though...