Slashdot Mirror


MySQL 4 - Is it Stable?

Shaklee3 asks: "I have been running version 3 of MySQL on the company's website for quite a while now. We recently ran into a problem where we needed the new features of version 4 that uses the UNION clause. We are running FreeBSD 4.6-STABLE and Apache 1.3.26. I know they reccomend not using it in a production environment yet, but from what I hear it is already being used on a few major websites. Does anyone have experience with version 4, and is it stable enough to run on a high traffic site?" If you feel MySQL isn't ready for prime-time, where specifically do you feel it needs improvement?

32 of 453 comments (clear)

  1. Using 4 since it came out at the end of 2001 by Anonymous Coward · · Score: 5, Informative

    Running High traffic webboarding system, online stores, other various web apps running off of it and I have not had one problem related to mysql 4 not being stable.

    I think it was good enough for production the day it came out, but they just wanted to keep it in Alpha mode for a while just in case...

  2. Been running it for a while now... by Phil+John · · Score: 4, Informative

    I've been following the developments very closely and have had it running fine on my development box (P3, Win2K Pro) for over a month without a reboot.

    I've also had a test site live to the world running off my broadband connection, so I could test stuff from work etc. and I haven't found any problems yet.

    However I don't know how it would scale to a large site (the test site was 34 tables and only around 50k records in total).

    I think a lot of the "don't use this on a live site" stuff is just to cover their own asses in case something goes wrong.

    --
    I am NaN
  3. Pretty stable for me.... by wenzi · · Score: 3, Informative

    I have been using it for quite some time on a large active database. I have not had any problems with corruption, and it has been solid as a rock so far.

    Why use mysql instead of Postgres ? Speed....

    --
    -- I doubt, therefore I might be.
  4. Re:Which DB is better!! THe world will never know by Phil+John · · Score: 4, Informative

    Hmm...silly geeks throwing in oracle?

    I worked on a couple of very large sites a few years ago (OK Magazine and Television-X the fantasy channels website - fantasy121.com) and the only databases which would scale to what we wanted with no problems and serious speed problems were MSSQL and Oracle...and not liking M$ stuff (security holes...monopolies yada yada yada) we chose oracle and never looked back.

    For huge sites I would heartily reccommend it...and the cross platform java admin/setup tools were a brilliant touch.

    O.k. it costs a bomb...but to a certain extent you gets what you pays for (although for free (as in beer) MySQL gives you pretty good bang for your buck).

    So what if oracle isn't open source...it's still one of the best.

    --
    I am NaN
  5. Re:postgres by sc00ch · · Score: 2, Informative

    Just as easy to setup? Have you tried installing postgresql on a windows box? Currently, the only solution they offer is by using Cygwin and there is no simple installation program. I eventually got it running but i would certainly not say it was as easy as mysql on a win32 platform. A lot of people i'm sure are turned away by this...

    Remember that many developers like to use windows and dont have the money to set up a seperate box for running a database.

  6. Re:All I want for Christmas... by swingkid · · Score: 2, Informative

    So use a real database: Oracle, Sybase, MSSQL, Postgres, DB2 (etc, etc) all have such features.

  7. MySQL supporters need to learn SQL by rsmah · · Score: 4, Informative
    MySQL's popularity can be traced back to the popularity of mSQL and MySQL's compatability with it. It's widespread use, however, is like the spread of Microsoft Windows or Visual Basic. Some people *believe* that MySQL is somehow "easier" or "faster", but that's just not the case.

    If you are a MySQL user and don't understand why people who know databases don't like it, you simply have to consider MySQL's lack of common SQL DBMS features such as:

    • Transactions
    • Subselects
    • Views
    • Triggers
    • Constraints
    • Foreign Keys
    • Etc, etc, etc.
    You may not understand why such features are important, but that doesn't mean that having them available for use when you do finally learn about them isn't important. A DBMS without such common features is like driving a car without seat belts and air bags. You may get where you want to go, but woe to you if you run into any trouble along the way.

    BTW, yes, I know that Mysql supports foreign keys and transactions if you use innondb tables. But the point of using a SQL database is that you shouldn't have to worry about the underlying data representation.

    Cheers,
    Rob

  8. Re:All I want for Christmas... by kylant · · Score: 5, Informative

    MySQL includes the fabulous InnoDB which offers foreign keys, great performance and row-level locking.

  9. new feature? by sql*kitten · · Score: 3, Informative

    new features of version 4 that uses the UNION clause.

    As far as I know, IBM's System/R had UNION in 1974 and Oracle (back when the company was called Relational Software) had it in 1979...

  10. Re:All I want for Christmas... by Karora · · Score: 5, Informative
    Is Sub-selects and foreign keys. These are probably the two biggest features I've constantly found myself needing / wanting.

    Then you probably should be using PostgreSQL, which has had these features for some time. It also has UNION, functional languages, extensible types, full transaction support, SQL standard syntax, unlimited row length...

    Stick with MySQL when performance is critical and you want to do straight SELECTs from simple tables with few joins and very few updates.

    Go with PostgreSQL if you want an RDBMS that supports transactions, that scales when writers and readers both use the DB at the same time, if you want to support complex queries or if robustness, programmability and configurability are important to you.

    --

    ...heellpppp! I've been captured by little green penguins!
  11. Don't use proprietary SQL features by clever · · Score: 2, Informative

    You can likely change to a different database later as long as you don't use proprietary functions. Make sure you write standard SQL.

    And before you say that what you write is SQL... You might not be. Check a few of your statements with an SQL Validator

  12. Re:All I want for Christmas... by zmooc · · Score: 3, Informative

    Shameless plug: we use SapDB for all of our customers. It's GPL, stable and we're really happy with it.

    --
    0x or or snor perron?!
  13. I (heart) MySQL by defile · · Score: 4, Informative

    MySQL's limitations are not a serious problem for me. Most database accesses in my applications are consolidated in a separate layer. It is very simple to duplicate features which may be missing from MySQL.

    The support behind MySQL is outstanding, both from the mailing list and paid MySQL support. MySQL is extremely simple to set up and can handle quite impressive loads.

    To keep score, MySQL + Innobase supports transactions (w/ row level locks and multi-versioning), foreign keys. 4.0 introduces UNION statements and a supposedly cool query cache. Pretty soon should come subqueries, stored procedures, etc.

    That said, I've not used 4.0 in a production environment. What we have right now (3.23.51-max) seems to be doing the job just fine.

  14. Query Caching by Anonymous Coward · · Score: 1, Informative

    In a recent database benchmark, the Alpha version of MySQL beat DB2, SQL Server and ASE. It kept up speed right along with Oracle (a expensive product to say the least). Check It Out

  15. Performance by ghum · · Score: 5, Informative

    Stick with MySQL when performance is critical and you want to do straight SELECTs from simple tables with few joins and very few updates

    "Postgresql is slow" is a very popular myth and urban legend.

    It even was a true myth - with older versions of Postgresql. Since 7.1.3 big improvements in the query-optimizer gave Postgresql big speed improvements - without stripping any of it's features.

    7.2.1 is lightning fast.


    Postgresql Tips for today: Do check out
    • the COMMENT on table.column command
    • the \d+ extended information display
    • the CASE WHEN ... THEN .. ELSE.. ENDIF construct
    1. Re:Performance by gazbo · · Score: 2, Informative
      Postgres commands:

      Tables in database: \dt
      Fields+types in table: \d [tablename]

      Nuff said.

      PS. wtf if the third line indented by one char?

    2. Re:Performance by Fjord · · Score: 3, Informative

      Other people have mentioned \dt , but I wanted to say you should try out DB visualizer. It's like TOAD without the wizards, but works for any JDBC database (even cloudscape). Very cool tool and best of all, free (as in beer, maybe as in speech too)

      --
      -no broken link
  16. umm... because.... by Ender+Ryan · · Score: 2, Informative
    Because lots of times you have software that uses SQL to talk to it's database backend. Or you may need a database that talks to a number of different servers, etc. Or possibly you may need to migrate to a bigger better database in the future, like DB2 or Oracle.

    --
    Sticking feathers up your butt does not make you a chicken - Tyler Durden
  17. Mysql @ /. by Anonymous Coward · · Score: 1, Informative


    I was at the Open Source Convention in San Diego last week.. I can't remember if it was Yahoo Finance, or Slashdot that uses 4.0 on all their slaves, but still has 3.23 as their master...

  18. Re:How/where should one learn SQl then? by FattMattP · · Score: 3, Informative

    Read SQL for Web Nerds. It's done from the perspective of someone using Oracle, but you should be able to get the idea.

    --
    Prevent email address forgery. Publish SPF records for y
  19. I've tried 4.0.2 on OpenBSD 3.1 - not recommended by the_olo · · Score: 2, Informative

    I've tried compiling MySQL version 4.0.2 on OpenBSD.

    First of all, to even get this thing compiled, you'll probably need to apply patches from the ports. See http://www.openbsd.org/cgi-bin/cvsweb/ports/databa ses/mysql/patches/ for OpenBSD ports patches.

    For example, I've needed patch-innobase_include_univ_i to compile any recent MySQL to compile on OpenBSD 3.1.
    That patch is quite funny:

    #define UNIV_INLINE __inline
    #else
    /* config.h contains the right def for 'inline' for the current compiler */
    -#if (__GNUC__ == 2)
    -#define UNIV_INLINE extern inline
    -#else
    -/* extern inline doesn't work with gcc 3.0.2 */
    +/* mysql people don't understand extern inline */
    #define UNIV_INLINE static inline

    When installing MySQL 4.0.2 I've applied all of those patches, and then configured, compiled and made make install.

    The database new MySQL daemon started up and operated fine (with previous version's data files), but mysql 4 client were unable to connect (I've got an "ERROR:" error message. Tells much, doesn't it?).

    So, to summarize, wait some more time, at least until they release a beta.

  20. Re:we're using it in production by Anonymous Coward · · Score: 1, Informative

    So are many others, as are we - using UNION, too. 1.040.000 logged-in users per day, 15.000+ queries per second. Single CPU sytem w/ 1G RAM. Any questions?

  21. My Experience with MySQL at work... by vandan · · Score: 5, Informative

    We (I) upgraded to MySQL-4.0.1 when it came out (after a little testing at home).
    I upgraded to MySQL-4.0.2 about 2 weeks ago.
    We have about 5 databases with 10 tables in each which have between 1,000 and 100,000 records per table. We have about 30 users connected to the MySQL server from an Access 2002 front-end. I have been moving stuff from SQL Server 7 as it bogs down. MySQL-4 seems to handle multiple connections better than SQL Server - update queries that used to timeout (and crash Access) when in SQL Server now run effortlessly.

    By the way, our little MySQL beast is an AMD K6-2 500 with 256MB, and is also running an IMAP server for about 50 mailboxes. Oh - and don't forget VNC ;)

    The ONLY problems I have had have been with the MyISAM table handler with large tables & multiple users. I was getting locks and time-outs, so I upgraded them to InnoDB, and have had no other problems since.

    I have also started using transactions (which InnoDB supports). Seems to work perfectly for me. Admittedly, I'm not doing anything major, but any ... no problems for me.

    Foreign Keys are also supported by InnoDB. Works well. MySQL-4.0.2 just made foreign key constraints survive an alter table command (4.0.1 used to dump the constraint).

    What else can I say? I've been very happy with MySQL-4.0.x. Certainly no crashes or anything unexpected. And the --log-update startup option gives you a nice running backup anyway...

    I would upgrade. I think the 'alpha' versioning is being too modest.

    1. Re:My Experience with MySQL at work... by retrosteve · · Score: 3, Informative

      If you can make it through all the MySQL-haters rantings, (can somebody mod those guys down, it's off-topic!), here are my current conclusions after using mysql 4.0 in an industrial app (27 million records, about 1 million queries daily).

      1. It works good. No crashes and no more problems than the 3.23 versions.
      2. The replication still isn't there yet. If replication matters to you, wait.
      3. I agree it's not "alpha". It is no more or less buggy than the stable 3.23.51 version.

      I will be very happy when replication is working and tested and works on ALL queries and can recover from errors. Until then it's still a day of fooling around every time replication screws up.

  22. Stable? Maybe. Worth it? Not yet. by MattRog · · Score: 3, Informative

    Let me repeat the same things I say every time:

    Why MySQL is Not Suitable for Enterprise or High-Volume Use
    or
    MySQL.com misleads you about it's capabilities

    Replication in MySQL is a joke for 'mission critical' use. As I understand it, the binary log records SQL modification statements which are executed on a master, not the data which was changed. This is involves significant assumptions beforehand, such that the master and slave(s) must be 100% identical. If I perform an UPDATE on the master, the changes are not replicated, but the query. This is what I would call the 'easy way out'. Who knows what happens to the query once it is replicated out - what if it hangs halfway through? I can't roll back and be in a consistent state, I have half-completed changes which makes my database inconsistent and now I'm forced to dump-and-load. Keep track of which rows are modified, to what from what, and ensure that those transactions are replicated to my slaves. Anything less is simply useless for high availability.

    I would also be willing to bet that a significant number of installations that have transitioned to MySQL replication are doing so due to table-lock induced latency. A suitable system with a capable RDBMS could probably handle all of the load given to it and not need 'many slaves' to handle the extra traffic. They would have a single failover for high availability and that's it.

    Filesystem buffered writes. Transactional support is great - it allows me to roll-back aborted transactions. However, due to the inability to control whether or not my tables are write-buffered means that MySQL may *think* it has performed a write even though it is still in the write-cache. I can then turn off the system and voila -- corruption! Part of the fault lies in the OS who tells MySQL it was written even though it is in the cache, but I have a simple solution. Devise a way to selectively turn off buffered writes for certain tables / databases. This way if I know I have a critical table which has a lot of writes I can turn buffering off and be ASSURED that writes will be performed when asked. I suspect a lot of 1040 and other table corruptions are caused by something like this. Yes, performance will take a hit but I think it is a very acceptable trade-off for data corruption. Obviously all system tables should NOT be buffered.

    Inability to use more than one index on a table in a query -- most enterprise RDBMS' can use more than one index on a table for a query. This can easily save a table scan or the use of a single, less-efficient index. Given an example query - 'SELECT bob FROM sometable WHERE somecol = 45 and somecol2
    Clustered indexes. These basically physically sort the table based on particular columns. This allows you to ORDER BY username ASC without using anything special since the rows are already sorted on username (if you have a users table and cluster the username col). This also greatly speeds up BETWEEN clauses. And yes, to people who know a little bit of SQL but don't know as much about clustered indexes -- you can create an index with a bobcol ASC but clustering the actual data is faster and more efficient if you are grabbing data which is not on the index. For example, SELECT * FROM table ORDER BY username ASC will not be as efficient as the same query clustered on the username. If you had a sorted index on username it will probably read the index sequentially and then visit the table. That extra operation = more disk seeks = more time / cpu to execute (and it really adds up as the table size increases). However, if you are doing something like 'select username, password from user order by username' it would be better to create a sorted index on username ASC, password. That way it will read the index only and not visit the table at all.

    On-line backups. In today's internet world your site has to be 24/7. This means you cannot have significant performance problems (or even offline-ing your dB!) when you make a dump -- Sybase, etc. have done this from as far back as I can remember. Postgres can do this with an add-on which is well worth the money. As far as I know MySQL can only do this with InnoDB tables and is a for-pay feature (since it has a MVC log to use in the meantime).

    Backups to something other than CSV files. MS SQL, Sybase, Oracle, they all dump to a compressed binary file. Saves a TON of space and is MUCH FASTER to dump and load. I can dump a 12GB Sybase DB in under 20 minutes. Loading it all (from scratch) and then bringing the DB online is about the same amount of time. MySQL stupidly logs the CREATE TABLE / INSERT statements. What does this mean? That I have to wait for 4 million INSERTs to be performed when loading my table, and FURTHER I have to wait for the INDEXES to be re-created on the new data. Dump the indexes, too! (Remember that full-text indexing is just another index, so if you use that and have to load from a dump be in store for SIGNIFICANT downtime).

    Ability to specify the number of files to dump to. What happens if you have a dump which is larger than 2GB? Some linux distros cannot handle a single file of 2GB or more without recompiling the kernel. Give users a way to, within the dump statement, split the dump over two files. Not only will that help avoid the 2GB limit, but it can speed up dump/loads since I can dump to a bunch of different disks to improve throughput. Sybase has the 'STRIPE ON' clause (originally to dump to two tape drives at once but works fine on filesystem files as well) to split the dump equally over an unlimited number of files. This also impacts the fact that MySQL tables and indexes are stored in filesystem files that are also subject to a 2GB limit.

    Cleaner way to view query plans of statements. EXPLAIN ... is great and all, but the resulting table is a PAIN to read. What I want to know is simple:
    Query is using XYZ, ABC tables. Table XYZ is using index 123 which is sorted so I do not need to create a temp table to sort ASC.
    Since you have all the columns in your select statement in the index I do not have to visit the actual table - I can pull it all from the index. Because of this, I will read the index from start to finish.
    ABC is using index 23dsf which is not sorted so I must create a temp table to sort that. Also, since it is a join, I do not need to perform an index scan but a positioned search (table scan is to a WHERE clause with no index AS index scan is to an index which is not selective enough or needs to read all columns.)
    Simple, easy and pretty much even a NOVICE can see that their query is a good performer or a bad performer.
    Along with more in-depth EXPLAIN, also provide me with a way to see what the optimizer is doing with the query. In MS SQL and Sybase you have 'trace flags' which you can turn on before your query to see EXACTLY what Sybase is doing - why does it think this index is better than this other one, why is it table scanning when you think it should index sort, etc. Give me an easy way to say 'verbose on; explain xxx;'.

    Ability to delve deeply into performance of the system. If there is one job a DBA must know it's how to tell what the heck is going on when something is slow. Currently MySQL gives you meaningless info like 'slow queries'. Great, I see 200,000 of them. What queries are they? What good is it in a large application which may contain 3000 lines of SQL to tell me the raw number of queries which are slow? I want to know the EXACT SQL of the query(s) which are slow and I want to find the one taking up the most CPU time and blocking all the rest. I want to know how MySQL is managing it's data cache so I can see if I need more ram (e.g. it is swapping lots of data to/from the cache) or if I am I/O bound. Don't tell me to look at 'free' or 'top' - half the time it is wrong because you (MySQL) tell it misleading figures. I want *you* to tell me exactly what you are doing since you would know best! If you've ever seen a sp_sysmon output from Sybase ASE you'd know what I'm talking about.

    MySQL's query optimizer is PISS POOR. If I see another changelog entry like this I'm going to scream:
    Optimized queries of type: SELECT DISTINCT * from table_name ORDER by key_part1 LIMIT #
    So does that mean these queries were NOT AT ALL optimized before? It doesn't read 'FURTHER optimized'.
    "ORDER BY ... DESC can now use keys."
    Does that mean it was table scanning each time? Jebus! Hands down the query optimizer is one of the most important things in the database -- knowing how to use the database statistics and knowing when to use a merge-join vs. a hash-join etc. are CRITIAL to database performance.

    Of course, the usuals: integrated row (or in the least page) locking, full support of subqueries, stored procedures, views, triggers, referential integrity, transactions, etc. etc. etc.'

    PostGRES and virtually 100% of 'for pay' RDBMs have this. There simply is no reason to use MySQL for anything sufficiently non-trivial.

    --

    Thanks,
    --
    Matt
  23. Re:All I want for Christmas... by Anonymous Coward · · Score: 2, Informative

    The current problem with InnoDB is that the size must be fixed and does not grow automatically.

    It doesn't?
    "InnoDB tables can now be set to automatically grow in size (autoextend)."
    MySQL - D.1.2 Changes in release 4.0.2 (01 July 2002)

  24. it definitly depends on the usage by pfote · · Score: 2, Informative

    getting back to the topic (4.x stable or not) ..
    short answer is: if not used with replication, yes

    bout the advocacy: shure mysql has its short comings, but it heavily depends on what you do and need

    i have a mysql server that has a average of 3000-6000 queries a second, with peaks going up to bout 60000 q/s, 800-1600 connections, bout 30GB of data (which doesnt say much), lots of tables with bout 1.000.000 rows that are read only, some smaller that are frequently updated (200-300 updates/inserts/deletes per second on some tables)
    server hardware is a SUN E6500 (18cpu 400MHz, 16GB RAM), table type used is myisam except some tables with sensitive data where i need transactions, there innodb is used)

    wouldnt call it a simple at all, but its a web app

    beside others(interbase and so on) i gave postgres several tries (latest version i tested was 7.1) because of having subselects and views .. it scaled better, but the general performance was to low to even try it in production

    oracle might be a option (beside having a bunch of features more than mysql it suports clustering which i would like to have for scalability reasons) .. but its still out of question because of the price tag .. i can pay a bunch of programmers a long time for the same money, and so we tend to throw some brain on the problem instead of money

    with a different application the situation might change completely .. so your mileage may vary, but for what we do here mysql has been proven to be the best solution

    sorry for the bad english

  25. Re:Stable? Maybe. Worth it? Not yet. by peyote · · Score: 3, Informative

    Wow, a shining example of someone firmly planting their foot in their mouth on Slashdot. What a surprise. It's nice that the post was moderated up to 4 as ``Informative''... too bad most of the information was useless or wrong.

    The overarching problem in your comment was your opening statement:

    'Let me repeat the same things I say every time'

    The problem there, which is obvious to anyone remotely understanding of what active development means, is that MySQL is sort of a moving target. Your statements are erroneous is so many ways, but most of them can be boiled down to this: you are arguing against something that no longer exists. MySQL, as you attack it, is no more, and has been replaced by something far better. So let's just take some of your arguments (unlike you, I refuse to speak on those things that I *DON'T* know, so I'll skip a few with which I'm not familiar) and see how they stand up, shall we?

    Argument: Replication in MySQL is a joke for 'mission critical' use.
    Rebuttal: Somehow that fact doesn't impede Yahoo!'s extensive use of it. (See Jeremy Zawodny's presentations at the recently-held OSCON.) If the query hangs on the slave halfway through, it isn't marked on the slave as having completed. When the slave becomes available again, it notes its pointer in the transaction log and catches up automatically. Oh, and replication (at least in 4.x, possibly also in 3.23.x) is transaction-safe.

    Argument: Filesystem buffered writes.
    Rebuttal: As another poster wrote, leave the OS out of this. If you cannot properly configure your OS to not buffer writes, you probably shouldn't be running a 'mission critical' ANYTHING.

    Argument: On-line backups [are not there]
    Rebuttal: So set up a dedicated slave for backups. Turn off the slave while backups are running, it catches up when backups are done and it is brought back up. A *simple* solution to a *simple* problem. If you really feel the need to do a hot backup of your live server, you can check into using InnoDB's tool at http://www.innodb.com/hotbackup.html .

    Argument: Backups to something other than CSV files.
    Rebuttal: You mean like backing up the raw MyISAM files? Of course, that doesn't work with InnoDB databases, so you can use their hot backup tool for that as well, if this is a REAL (rather than IMAGINED) problem.

    Argument: I have to wait for 4 million INSERTs to be performed...
    Rebuttal: RTFM. No you don't.

    Argument: [No] Ability to specify the number of files to dump to
    Rebuttal: Again, is this a real or imagined problem? It's likely that whatever you are trying to do, there's a better way. Unfamiliarity with a particular tool usually results in this type of problem.

    Argument: Cleaner way to view query plans of statements
    Rebuttal: So because YOU don't like the output of EXPLAIN you're saying MySQL isn't ready for production? WTF are you talking about?! As for a more *in-depth* EXPLAIN, I agree there, and I found PostgreSQL's mechanism kind of cool. Of course, in four years of running MySQL in a production environment, I'm not sure I would have used it more than once or twice; MySQL's EXPLAIN has always been sufficient, if you actually know what you're doing.

    Argument: Ability to delve deeply into performance of the system ... Currently MySQL gives you meaningless info like 'slow queries'
    Rebuttal: RTFM. You are obviously unfamiliar with the slow queries log where MySQL gives you EXACTLY the information you are looking for. As for the data cache and whatnot, I don't know if that is actually available or not.

    Argument: MySQL's query optimizer is PISS POOR
    Rebuttal: And your evidence? Oh wait, you didn't actually provide any. You just brought up a tangential issue...

    Argument: If I see another changelog entry...
    Rebuttal: If you are that concerned, you have some good options here. (a) Pay the developers to hold your hand and explain to you what has happened. (b) Use the source and do your own friggin' diff. This is Unix; stop acting so helpless.

    Argument: [a laundry list of disinformation]
    Rebuttal: Dude, have you not even looked at MySQL since 3.21 or something? Row locking is available in InnoDB, as are transactions. Stored procedures and triggers are planned for 5.x IIRC, but so many applications DON'T need them that the MySQL folks simply haven't cared to add them. Ditto for views (which are also slated for 5.x).

  26. Postgres replaced mysql on my prod system by TheNarrator · · Score: 2, Informative
    Two years age I replaced Mysql with postgres on our production ecommerce system. This was one of the early 7.0 releases. It's much faster, and much more stable than mysql and pretty much hassle free. The one thing I found was the query structure checking in mysql is so sloppy that the port to Postgres forced me to clean a lot of the queries up. I found some ambiguities that were caught by popstgres but would introduce subtle errors in the mysql based system. That, and Postgres has transactions.

    I am convinced that the only reason people like Mysql is it has better marketing. The perfect example is those benchmarks they distribute with the software. Those are single user transactionless benchmarks!!! Single user benchmarks are not realistic. Anyway Postgres is far stabler faster product in my opinion.

  27. Re:Stable? Maybe. Worth it? Not yet. by zak_greant · · Score: 3, Informative

    Hey,

    Thanks for the critical feedback - however, you have some information that is not accurate.

    On the positive side, your criticism has provided some interesting ideas! :)

    REPLICATION
    You are correct that the binary log stores the SQL data modification statements that were applied to the master, rather than the actual
    changes that were made to it. You are also correct that if something goes wrong, the DBA is the one who gets to fix it.

    It does not assume that the master and slave are equal - it does assume that the slave contains an linear subset of the data on the
    master.

    We have users working with replication in high-demand situations and it is performing well. Could we improve it? Of course - I was just
    browsing through the slides from the PostgreSQL Replication talk at OSCON and it looks like some very cool things are going on - we should
    watch and learn. :)

    TABLE LOCK LATENCY
    It is true that table-level locking used by MyISAM performs poorly under heavy concurrent read/write.

    If this is an issue switch to the InnoDB storage engine (which uses low-cost, non-escalating row level locking) or the Berkley DB storage
    engine (which uses page-level locking).

    InnoDB uses a single bit to indicate if a particular page stores rows that are locked. If any rows are locked in the page, then a few more bits are needed for the page to indicate what particular rows are locked.

    FILE SYSTEM BUFFERING
    File system buffering can be a thorny issue. The InnoDB storage engine forces a flush to disk upon the commit of every transaction and
    then writes a checkpoint so that it knows when the last flush was made.

    INDEXING
    This is just plain wrong. Of course MySQL can use more than one index in a query!

    CLUSTERED INDEXES
    MyISAM tables can be optimized so that the order of the rows in the table matches the order of the indexes on the table.

    InnoDB tables already use clustered primary key indexes. Secondary indexes refer to the primary key values.

    ON-LINE BACKUPS
    Replicate the database out to another server (even one on the same machine), then stop the slave to take a backup.
    This has the added benefit of being able to ask the slave to take over from the master when you need to maintain the master.

    BACKUP FORMAT
    Use the binary log instead of the mysqldump tool - it uses a compressed binary format that is much more compact.

    Also, if you have to recreate a table from a mysqldump, then disable indexes until you have recreated all the rows - will save a good deal
    of time on bigger tables.

    Dumping the indexes is probably not a bad idea though. :)

    DUMPING TO MULTIPLE FILES
    An option to allow dumping to multiple files would be convenient. :)

    In the past, I just asked mysqldump to dump to stdout and have a perl script handle the segmenting.

    The binary log does get segmented into multiple files automatically. I will ask the developers if it would make sense to make a new dump tool that works with the binary log format so as to get the benefits of that format.

    BETTER TOOLS...
    Duly noted - the output from EXPLAIN is cryptic.

    DELVING DEEPLY...
    Absolutely - we could use more tools to give detailed performance information. The current tools that report the number of active threads,
    the number of questions run, the memory currently used, the max. memory consumed, etc. are not enough. We have some graphical tools like MySQL-Graph (A GPL'd app) to make review of the data easier.

    Also, slow queries is more than a counter. The slow query log stores details on every query that ran over the value of the long_query_time
    setting. Use the mysqldumpslow tool to give a summary of data in the log file. The log can also record queries that did not use an index.

    QUERY OPTIMIZATION
    On what knowledge do you base your assertion that the query optimizer is 'PISS POOR'? Do you understand the code behind it? Have you run
    benchmarks to compare its performance to another optimizer?

    Of course we keep optimizing for specific cases. We want to continue to improve performance whereever possible. We have spent a good deal
    of effort doing broad optimizations - the optimizations that have the greatest benefit for the most queries. Now we are work more on
    tweaking specific cases.

    MISSING FEATURES
    The Berkley DB storage engine uses page level locking.

    The InnoDB storage engine uses row level locking (without resorting to lock escalation :)

    We have excellent transaction support - likely the best of any available database today. InnoDB supports the repeatable read transaction
    isolation level. However, due to how we implemented our multi-versioning support, we don't get phantom reads. This is a higher level of transactional isolation than MS SQL, Sybase, PostgreSQL, Interbase, Ingres, etc. IIRC, only FireBird and Oracle may be the same.

    Sub-selects should be out very soon. We are still working on stored procedures, views, triggers and full support for referential
    integrity. We know that these features are important. However, we are working on doing truly robust implementations - rushing them out
    will not help anyone.

    END NOTES
    Why does no one mention the stuff that MySQL is good at? :)

    We are fast - we have third party confirmation that we are faster than DB2, MS SQL and Sybase. The test even confirmed that we perform about as well as Oracle (a bit slower :) when running under Windows using a JDBC driver.

    We don't need to stop the database to vacuum or do many maintainance tasks.

    We know that we can run in critical environments because we have users like Yahoo! Finance *and* Slashdot :) using us for critical, high-load applications.

    We are a fully-threaded app and can take full advantage of SMP machines.

    We can run natively under a bunch of OSs - including Windows.

    Our ability to use different storage engines gives users great choice in how to manage their data. If someone needs a lightweight format in a non-transactional environment, use MyISAM - it has very little storage overhead and is speedy in situtations where you do not have many concurrent reads/writes.

    If you need really robust transaction support that ensures the integrity of your data, use the InnoDB storage engine. Storage overhead is more than with MyISAM tables, but that is not generally an issue for enterprise level users.

    I can hear people grumbling - well, the transaction support isn't integrated, so it isn't valid... That is complete junk - since when is choice a bug? That is like saying that the Linux is not a modern operating system because its default filesystem does not use journaling. Usually, is it only the proprietary and/or less advanced operating systems do not give you a choice of file system.

    Also, we can easily add in new storage to support specific needs. Look at how quickly InnoDB was integrated - it suddenly took us from having no transaction support to having great transactional support. Without the storage engine concept, we would have had to do a lot more work to get it integrated.

    blah blah blah... :)

  28. Comment removed by account_deleted · · Score: 3, Informative

    Comment removed based on user account deletion

  29. Re:Stable? Maybe. Worth it? Not yet. by peyote · · Score: 5, Informative

    If we're talking about high trafficked sites, I'm pretty sure our sites in my last job would qualify: around 80M-90M page views per month. Granted it isn't ``upwards of 100 million imp/mo.'' but I don't expect to get into a d*cksize war about that. :-) (Also, the great majority of these were PHP pages doing various things against our MySQL database.)

    Right, and the slave goes down. So I have to manually intervene to fix it. That is unacceptable.

    I'm not really sure what you mean here. The only replication problem that would require manual intervention is in a multi-master setting where MySQL gets confused about which query comes first. (And they don't even officially support such a config.) In a typical failover setting, if a slave dies, goes down, whatever, it rebuilds itself when it comes back up, no intervention required.

    As for just quitting and saying 'oops', I think that was a problem with MySQL's replication a number of versions ago, probably one of the first versions it was officially in the source (complete with dire warnings, of course).

    Finally, as for Yahoo!'s total MySQL traffic, we actually outran that by quite a bit at my last job. We were seeing sustained averages (over a number of months) of nearly 180 queries per second, with a maximum of 1000 concurrent. Is that worth writing home about? ;-)

    Correct, knowing what your filesystem does is your problem...

    Maybe I'm misunderstanding, but this REALLY doesn't seem like it's MySQL's job. If you want a database to be O_SYNC'd, `chattr +S the_directory`. (Granted you cannot do that on a per-table level.) Want the whole thing sync? `mount -o sync the_filesystem/`.

    Ok, so I have to now get another boxen set up just to backup my database?

    Yep. Or you can get one box that acts as a slave for all five/ten/whatever of your MySQL pools and back them ALL up from one box, rather than five/ten/whatever.

    ...pausing replication...

    No no no no no. You don't have to pause replication. Really. I promise.

    etc. etc.

    No, there is no etc. etc. You described it just fine in a few simple steps. At least in our application, this eliminates any need whatsoever for hot backups. YMMV, use the right tool if you determine that to be something else. But to say that is doesn't work is simply false. I can demonstrate that it does work, and it works well, even in a production environment with a healthy bit of traffic.

    P.S. If you are still really bent on this hot backup business, check out `perldoc /path/to/mysqlhotcopy`.

    Question: why are you doing lookups on a table that is in the process of being loaded? Do you do lookups on a table while you are restoring it from backup? Aren't you running the risk of getting bad data out in any case? Yes, you do have to wait for the operation. No, that does not mean that you have to do 4 million serialized INSERTs.

    Yes I've read the manual and seen the slow-query-log portion. However the log does not log the currently running SQL...

    Okay, if that's what you want, use SHOW FULL PROCESSLIST from the MySQL monitor. Done.

    As for some of the data cache issues, I still don't purport to know much about that, but ISTR a lot of talk from Monty and the other MySQL team at OSCON about this sort of thing in 4.x and 5.x. We'll see over the coming months...

    As for the optimizer, you're right, I can't see it either. Neither have I ever needed to, since it just doesn't come up. At least in our application, MySQL seems to get it right every time. (No, this isn't a proof. Yes, it's anecdotally helpful to the OP, I think.)

    We don't use MySQL in a production environment, so I'm certainly not going to pay or waste my time coding. (regarding vague Changelog entries)

    Okay, maybe the Changelog could be more verbose. (Personally, I'm happy with the amount of information in the Changelog.) But those who write and manage large applications, if they are truly concerned about this sort of thing, should have the expertise (either in-house or via a MySQL support contract) to squeeze every last ounce of performance out of the database. In the same way, they should be able to track changes in a more specific way. Maybe not you, since you don't run MySQL in production, but those who *do*, *can*.

    InnoDB does provide row locking, but at a cost of performance and memory.

    I'd be interested to see the application that provides row locking for free. :-)

    If you *NEED* those features (note that need does not just depend on the size of an application), use them, either in another package, or when MySQL gets them. I've seen plenty of non-trivial applications that run just fine on MySQL in production and high-trafficked environments. Size and volume do not constitute a feature need.