MySQL Problems Under Heavy Loads?
pgatt asks: "I am running a very high load Web site, averaging 500,000 hits per day. Most pages on the site require a MySQL connection for some simple query. I have a separate Web server and MySQL server, each consisting of dual Pentium III 750s with a gig of RAM. Under peak hours, MySQL reports a "Lost Connection To Server During Query" error, even if I try to connect to the database locally from the MySQL server. This usually seems to happen when the number simultaneous threads get to be around 700-800, but there doesn't seem to be an exact number that it hits when it encounters a problem. I have contacted MySQL support but have not had much luck. My system has plenty of resources left, only running at about 0.40 load average with plenty of RAM. I see a lot of people experiencing the same problem as me in the mysql mailing list but they haven't found a solution. I am looking for recommendations about how to either tweak MySQL to run better or other database solutions. I would like to stick with something open source, but if I need to pay for something to get stability then I will. Any suggestions will be greatly appreciated."
Depending on how your DB is structured and how your front-end interfaces with it, you could give Postgres a shot..
First problem (DB structure) is: are any records larger than 8K? A well-designed DB usually won't have (it would use foreign keys) but since MySQL doesn't support foreign keys, and does support arbitrarily large records, your DB may vary.
Postgres doesn't support records larger than 8K (OK, it does if you modify the source, but I'm told there is a performance hit for doing this) - this isn't unusual, as other "professional" grade RDB's have similar limits (MSSQL has a record limit of 4K)
Second problem, does your front end support/use persistant connections?
The major difference in performance between Postgres and MySQL boils down to the persistant/non-persistant connections. Postgres takes a much longer time to start up the connection than MySQL, so if your front-end doesn't support persistant connections, I wouldn't recommend switching. However if it does, a well-designed DB running on Postgres could solve your problem..
The biggest thing is the learning curve - Postgres supports things that MySQL doesn't, such as row-locking and foreign keys (even though these are coming) - my experience with Postgres is that it just slows down under extreme load, instead of stopping altogether - with a properly designed DB and good understanding of how to optimize your queries, it could provide an solution to your database; the problem is that you'll have to learn a little more SQL.
Or, if you were using Roxen, you could just wrap your code in tags, and let roxen handle the rest...
--
Good:
Bad:
--
/. used to run on a Multia doing a few hundred k hits a day. Forget what the box after that was, I think a 450 or something. It's just when Andover bought /., they threw a LOT of hardware at the problem.
Uh. How about you despam it?
Things to try (Score:3, Informative)
by whyDNA? (whydna@fuckspam.hotmail.com) on Thursday November 09, @03:32PM EST (#17)
(User #9312 Info) http://dcaff.com
His username is WhyDNA?. His name at hotmail is whydna. I think the fuckspam. part is easy enough to figure out.
the company i was working for had similar problems... Where php and mysql would get in this wierd state when using persistant connections. It would cause the DB to get all sorts of confused and end up eating it... The solution (please don't ask why this is the case, it seems absolutely illogical) was the switch between the systems.. We changed it to a hub, BAM, problem solved. After spending about 3-4 days digging through newsgroups, IRC, etc. We found a few people that were discussing this problem. It is unknown if it is a php problem or a mysql problem. -andy
check out the section on file descriptors:
http://www.kegel.com/c10k.html
In my experience, connection pooling makes a very large difference on busy MySQL servers. If you're using Java, there are a lot of easy-to-use pools, like PoolMan and Resin's pooling. Don't know if PHP and Perl/DBI have similar features.
I have no idea about the content of your site, but I see a lot of sites where everything is dynamicly generated, for no obvious reason.
Maybe you could save a few thousand hits on your database by making some of the pages static, or just regenerating them with a perl script once an hour or something.
Rather than just dumping the results of the database query to disk, why not dump the complete, generated HTML into a static file? This doesn't even require modifying a lot of code - just write a simple daemon to send an http request ('lynx -dump' or whatever) to the correct page every X seconds/minutes, and save the results to a file. In the code, just divert requests to known static pages (either with a redirect, or just send the file on disk). You can get the web code to ignore the static file if the timestamp is too old.
/. gets like 1.5 million hits a day, and it seems to work pretty damn good all things considered. You could probably run some sort of tests to figure out where you're having bottlenecks. While it's certainly not impossible for MySQL to be giving you trouble, it really doesn't seem likely, unless you use an older version.
Although I personally have no hands-on experience with PostGres, it has often been declared in this here Ask Slashdot forum that PostGres is a somewhat slower but more robust and complete SQL server that doesn't buckle under such load as quickly as MySQL does. I suggest you try it on a test server, migrating from MySQL shouldn't be difficult.
-Billco, Fnarg.com
You may find it help to write a simply caching function for some of the more intensive pages. Think of Slashdot. Let's pretend that 50% of pageviews are the front page, and, of those, 50% are ACs or simply non-logged-in users. So that's 25% of the traffic seeing identical pages. (That is, it's been in no way customized.)
:)
This is a prime example (I think!) of a very cacheable page. You could write a simple caching function -- dump the results of a series of MySQL queries to a file, and check the timestamp on that file to see if you want to re-query the DB or if you want to just re-use the contents of that file. Totally simple.
If you did this to avoid a single query (unless it was a bummer of a join), it would probably be inefficient. But if you used this in place of a dozen queries, I think you'd find that this made better use of your machine. If you want to get really fly you could store that file in a RAM disk.
-Waldo
Try the following:
- Add more RAM (at LEAST 512), preferably 1 Gig
- Upgrade CPUs
- Upgrade Motherboards
- Upgrade NIC
- Read the "Optimizing MySQL" chapter (11??) from the mysql website.
- Try contacting AbriaSoft who works with MySQL to provide support.
- Try caching as much content as possible. Have you concidered squid. Or, you can easily write your own (this is really easy in PHP... email me for info).
-andyWhat user is MySQL running as? Make sure you're not hitting the per-user process limit. Also, make sure you're not near the MAX_CONNECTIONS setting.
My daytime employer handles millions of database driven page views every day with MySQL on Solaris... We have our connection limit set to around 4,500 right now...
-JF
MrJoy.com -- Because coding is FUN!