Posted by
Hemos
on from the lock-yer-rows-gentlemen dept.
egerlach writes "All you DB admins out there might be interested to know that MySQL 4.0 has finally been released! It's only 4.0.0 alpha, but you can download it here. You can also check out a full list of changes."
Re:In other news
by
Kattare
·
· Score: 3, Insightful
Sometimes speed is everything...
Re:is this news?
by
guusbosman
·
· Score: 1, Insightful
Of course it is news! MySQL is cool, and for many many Open Source developers it's THE database to use, and 4.0.0 is supposed to bring quite nice new features.
"too much time on their hands" is a comment which has been made for the whole Open Source movement a lot of times. But it's nice to develop software, it's nice to make Open Source and it's nice to read on Slashdot about new versions of tools we use.
Re:In other news
by
tzanger
·
· Score: 5, Insightful
Sometimes speed is everything...
Which MySQL hasn't got, no matter how you look at it.
Postgres has shown time and time again that it blows the shit out of MySQL for any kind of select statement where your 'where' clause is even just slightly more complicated than 'x = y'. Even for simple selects, Postgres performance scales waaaaay better than MySQL. We're not talking hundreds of clients, either; we're talking dozens. MySQL falls flat on its face under heavy load.
And if your application just requires the simplest of simple selects, a hashed flat file is faster than MySQL because you don't actually have to parse up the SQL statement and return the result in a formatted fashion!
Face it: MySQL is a neat toy but that's about it. All this pandering about what 4.0's got now and how they think it's mission-critical-ready is bullshit, plain and simple. We've got an open-source, free ACID-compliant database already. MySQL has lost, they just don't know it yet.
The real reason for this release
by
joshv
·
· Score: 5, Insightful
From the release notes:
"Removed all Gemini hooks"
See:
http://www.mysql.com/news/article-75.html
And you will understand. There probably are not enough changes to warrant a +1 increment in the release number, but this gives them a new version that does not support NuSphere's attempt to usurp MySQL's copyright and trademark rights.
-josh
How about a competition? (was Re:Proof, please)
by
King+Babar
·
· Score: 4, Insightful
Postgres has shown time and time again that it blows the shit out of MySQL for any kind of select statement where your 'where' clause is even just slightly more complicated than 'x = y'.
I'm using MySQL to perform selects that join five tables, including one with over 800,000 rows and one with over 5 million rows. It can perform these selects in a fraction of a second, more than adequate performance for my application.
I might be missing something, but I don't see
anything contradictory in these two statements.
If your "where" clause is sufficiently easy and you can take advantage of indexing, you would expect any decent system to
perform reasonably well when it is returning a
handful of rows. Five table joins would only
be scary when you can't toss out the vast majority of your 5 million rows at step one.
MySQL has also been very reliable for me.
One thing that has just occurred to me in this,
round 5,408 of the MySQL-versus-PostgreSQL flame
fest is to ask if anybody has tried to replace
MySQL with PostgreSQL or vice versa in any project that both sides would consider "worthy" in some sense. One problem here, of course, is that if you really wanted an ACID solution, you weren't (aren't?) likely to choose MySQL in the first place. But what if you wanted to use PostgreSQL with the Slashcode; would this port be doable in anything like a reasonable amount of time? If done, could it then be shown that the result was
any different (for better or for worse) than what
you get with the current system?
Another idea would be to borrow an idea from the
functional programming community and do a "database programming contest" like the the annual
ICFP Programming Contest. You could try to find a sponsor and give out cash prizes, or you could get *really* serious (like the ICFP people do) and award year-long bragging rights.
I mean, both PostgreSQL and MySQL fans would probably go to great lengths to avoid having to
refer to the other product as, for example, "the relational database system of choice for for discriminating hackers" in public. Not to mention the possibility that both
of them would have to acknowledge the supremacy
of DB2 or Oracle or something.:-)
--
Babar
Re:Any support for Foreign keys yet?
by
teg
·
· Score: 3, Insightful
Does this new release of MySQL support the proper use of foreign keys internally? (found no info in release notes). Or has MySQL a different way of implementing a "One-Many"-relation?
You don't absolutely need foreign keys to implement one-to-many relationsships - you can just do it the standard way and drop the foreign key constraint. Of course, your app needs to make sure that it updates and deletes entries when necesarry (and when did NTH^WNTNU stop using Oracle? That's what we used when I had the database course there, and MySQL is a poor choice for a course in databases If there's one place you'd like to really have foreign keys, subselects, transactions, views etc, a university course on databases is it - and PostgreSQL clearly has better SQL support than MySQL)
Re:This version doesn't compile
by
Peter+H.S.
·
· Score: 3, Insightful
GCC is *not* the culprit. Code that doesn't compile on GCC 3 is broken code, that violates the basics of the C language.
True, and I knew that you knew that too, so I did not write it. But a lot of code still doesn't work with gcc 3.x. So as another pointed out, a bug report to the mysql crew is in order. (if you have the time).
If an automobile vendor sold cars without
windshields, seat belts, rediator, air-conditioner, brakes, and suspension, would you encourage someone who will be relying on his or her car in demanding situations to buy a car from this vendor?
Of course not. Don't be silly.
It wouldn't help that you can buy the missing parts from Jimmy-Bob either if you know the car was never even designed to HAVE windshields, seat belts, radiator etc. in the first place.
Actually, having transactions, foreign key constraints and better locking as extras is more like selling a car without an engine or a gear box and driving it Fred-Flintstone-style...
If I have complete control over every application that accesses the database then I can ensure that garbage data is not put in.
That is exactly the point. In the real world, you cannot ever have this 100% complete control. Even if you have somehow been given supreme executive database authority in your company, one day you will quit or get run over by a bus. Your successor will be handed a requirement to implement this new little tool that the sales force desperately needs that just happens to require access to this "protected" database. If you had done the job right, almost all the new developer would need to know would be completely described in the database itself. With referential integrity relegated to clients, the developer first has to completely understand the database and all of the client GUI code everywhere.
Let's say he's real good and figures out all of the GUI code. Then the next day, the marketing guys come up with a new requirement for a little app that they need that also accesses the same database. Now you have to completely understand 2 other applications before coding this new one. And it just keeps getting worse and worse as time goes on.
There are tradeoffs involved. Performance problems can often (but not always) be handled by throwing more/better/faster hardware at it. The tradeoff I most often see is that it's usually easier and much quicker to slap client code together than it is to do a good solid system design.
Re:In other news
by
shayne321
·
· Score: 2, Insightful
and frankly I do't understand why people would continue to use a product so limited as MySQL
I'm not arguing FOR MySQL here, but I just wanted to point out I think the main reason people use it is accessibility. I started out programming CGI apps with perl using MySQL databases about 5 years ago with no prior database or programming experience. Over those years I've written tons of code and designed hundreds of databases. All of my current knowledge has come from books, MySQL documentation, and real world experience. Coming from this background, I started with MySQL because it was highly recommended as an easy to use database, and at the time was considered to be much faster that Postgres.
MySQL is very easy for the newbie to get up and running. You don't have to understand views, foreign keys, transactions, etc. All you have to understand is "put data in with insert, pull data out with select"... Well, maybe more than that but you get the idea.
That said, I've recently switched one of my major production databases over to postgres. I finally "graduated" to the point where I needed views, row-level locking, and some other things that I couldn't get with MySQL. I'm finally comfortable with postgres but it took some time. Postgres requires a bit more of it's users. For example, queries are case sensitive. If you have a web form that says "enter name of user to lookup" and feed that form to a query, "where user = 'bob'", you have to have extra code to convert 'bob' to 'Bob', 'BOB', or whatever. Either that or train your users better (yeah, right). You could argue that this is the "correct" behavior and I'd agree with you, but still, it's a PITA to rework already working programs to account for this.
All of that said, I still use MySQL for some things.. I have several databases that get updated once daily via a cron job from a production database and are searched hundreds of times per minute by a mod_perl-enabled web site. Why bother with the complexities of having to explicitly set up keys and indexes, worry about case-sensitive matching, vacuum analyzing, etc, for such a simple database? In my mind this only requires more work and introduces more chances for errors.
It all boils down to the right tool for the job. Writing a complete accounting package? Use postgres (or something commercial). Writing a simple interface to query a list of students in a classrom? Use MySQL.
Shayne
-- Today I didn't even have to use my AK; I got to say it was a good day -- Icecube
Still on the wrong track!
by
MattRog
·
· Score: 5, Insightful
If you're listening MySQL -- who really cares about SSL and unicode. Granted I've had them in my RDBMS that I use but still you're missing the big picture.
The people who post funny things like 'MySQL Mad libs' are on to something - Slashdot is full of 'arm chair' DBAs (and programmers and network engineers and etc. etc. etc.) who know just enough to think they know exactly what is wrong and how to fix it. It's kind of like the people who have used a computer for a while and start trying to troubleshoot their friend's computers - 'You may have run out of Megahertz' or 'You probably need more memory to download files faster'. Yes, subqueries, locking issues, etc. are very important. However, these are 'surface' level flaws in MySQL. There are some very, very large problems in the underlying architecture which need to be addressed. Since they're not as glamorous (or as understood) as subqueries and pretty much ignored by 90% of the users they never see the light of day - but they impact almost 100% of the users in one way or another.
Fix things like this:
Filesystem buffered writes. Transactional support is great - it allows me to roll-back aborted transactions. However, due to the inability to control whether or not my tables are write-buffered means that MySQL may *think* it has performed a write even though it is still in the write-cache. I can then turn off the system and voila -- corruption! Part of the fault lies in the OS who tells MySQL it was written even though it is in the cache, but I have a simple solution. Devise a way to selectively turn off buffered writes for certain tables / databases. This way if I know I have a critical table which has a lot of writes I can turn buffering off and be ASSURED that writes will be performed when asked. I suspect a lot of 1040 and other table corruptions are caused by something like this. Yes, performance will take a hit but I think it is a very acceptable trade-off for data corruption. Obviously all system tables should NOT be buffered.
Inability to use more than one index on a table in a query -- most enterprise RDBMS' can use more than one index on a table for a query. This can easily save a table scan or the use of a single, less-efficient index. Given an example query - 'SELECT bob FROM sometable WHERE somecol = 45 and somecol2 = 44' - if I have two indexes on sometable (somecol and somecol2) it can join the two indexes together and
Clustered indexes. These basically physically sort the table based on particular columns. This allows you to ORDER BY username ASC without using anything special since the rows are already sorted on username (if you have a users table and cluster the username col). This also greatly speeds up BETWEEN clauses. And yes, to people who know a little bit of SQL but don't know as much about clustered indexes -- you can create an index with a bobcol ASC but clustering the actual data is faster and more efficient if you are grabbing data which is not on the index. For example, SELECT * FROM table ORDER BY username ASC will not be as efficient as the same query clustered on the username. If you had a sorted index on username it will probably read the index sequentially and then visit the table. That extra operation = more disk seeks = more time / cpu to execute (and it really adds up as the table size increases). However, if you are doing something like 'select username, password from user order by username' it would be better to create a sorted index on username ASC, password. That way it will read the index only and not visit the table at all.
on-line backups. In today's internet world your site has to be 24/7. This means you cannot have significant performance problems (or even offline-ing your dB!) when you make a dump -- Sybase, etc. have done this from as far back as I can remember.
backups to something other than CSV files. MS SQL, Sybase, Oracle, they all dump to a compressed binary file. Saves a TON of space and is MUCH FASTER to dump and load. I can dump a 6GB Sybase DB in under 5 minutes. Loading it all (from scratch) and then bringing the DB online is about the same amount of time.
Ability to specify the number of files to dump to. What happens if you have a dump which is larger than 2GB? Most linux distros cannot handle a single file of 2GB or more. Give users a way to, within the dump statement, split the dump over two files. Sybase has the 'STRIPE ON' clause (originally to dump to two tape drives at once but works fine on filesystem files as well) to split the dump equally over an unlimited number of files.
cleaner way to view query plans of statements. EXPLAIN... is great and all, but the resulting table is a PAIN to read. What I want to know is simple:
Query is using XYZ, ABC tables. Table XYZ is using index 123 which is sorted so I do not need to create a temp table to sort ASC.
Since you have all the columns in your select statement in the index I do not have to visit the actual table - I can pull it all from the index. Because of this, I will read the index from start to finish.
ABC is using index 23dsf which is not sorted so I must create a temp table to sort that. Also, since it is a join, I do not need to perform an index scan but a positioned search (table scan is to a WHERE clause with no index AS index scan is to an index which is not selective enough or needs to read all columns.)
Simple, easy and pretty much even a NOVICE can see that their query is a good performer or a bad performer.
Along with more in-depth EXPLAIN, also provide me with a way to see what the optimizer is doing with the query. In MS SQL and Sybase you have 'trace flags' which you can turn on before your query to see EXACTLY what Sybase is doing - why does it think this index is better than this other one, why is it table scanning when you think it should index sort, etc. Give me an easy way to say 'verbose on; explain xxx;'.
Ability to delve deeply into performance of the system. If there is one job a DBA must know it's how to tell what the heck is going on when something is slow. Currently MySQL gives you meaningless info like 'slow queries'. Great, I see 200,000 of them. What queries are they? What good is it in a large application which may contain 3000 lines of SQL to tell me the raw number of queries which are slow? I want to know the EXACT SQL of the query(s) which are slow and I want to find the one taking up the most CPU time and blocking all the rest. I want to know how MySQL is managing it's data cache so I can see if I need more ram (e.g. it is swapping lots of data to/from the cache) or if I am I/O bound. Don't tell me to look at 'free' or 'top' - half the time it is wrong because you (MySQL) tell it misleading figures. I want *you* to tell me exactly what you are doing since you would know best! If you've ever seen a sp_sysmon output from Sybase ASE you'd know what I'm talking about.
Hire people who are accomplished relational algebra freaks. Pay them a lot of money - it is well worth it!!! If I see another changelog entry like this I'm going to scream:
Optimized queries of type: SELECT DISTINCT * from table_name ORDER by key_part1 LIMIT #
So does that mean these queries were NOT AT ALL optimized before? It doesn't read 'FURTHER optimized'.
"ORDER BY... DESC can now use keys."
Does that mean it was table scanning each time? Jebus! Get these wiz-bangs to overhaul your query optimizer. Hands down this is one of the most important things in the database -- knowing how to use the database statistics and knowing when to use a merge-join vs. a hash-join etc. are CRITIAL to database performance. Trust me when I say that this is a MAJOR priority - especially over SSH or Unicode since it impacts EACH AND EVERY statement which MySQL parses.
Small things which can really raise the respect of MySQL in the eyes of people who matter - the CTOs and CIOs of the world who might be considering something like Postgres or MySQL but opt for MS SQL, Sybase, or Oracle because of the more advanced (but less flashy) features.
Of course, integrated row (or in the least page) locking and full support of subqueries and the like are also VERY HIGH priorities.
I think addressing these issues will vastly improve MySQL -- they're not flames really, just gripes:D
Sometimes speed is everything...
Of course it is news! MySQL is cool, and for many many Open Source developers it's THE database to use, and 4.0.0 is supposed to bring quite nice new features.
"too much time on their hands" is a comment which has been made for the whole Open Source movement a lot of times. But it's nice to develop software, it's nice to make Open Source and it's nice to read on Slashdot about new versions of tools we use.
Sometimes speed is everything...
Which MySQL hasn't got, no matter how you look at it.
Postgres has shown time and time again that it blows the shit out of MySQL for any kind of select statement where your 'where' clause is even just slightly more complicated than 'x = y'. Even for simple selects, Postgres performance scales waaaaay better than MySQL. We're not talking hundreds of clients, either; we're talking dozens. MySQL falls flat on its face under heavy load.
And if your application just requires the simplest of simple selects, a hashed flat file is faster than MySQL because you don't actually have to parse up the SQL statement and return the result in a formatted fashion!
Face it: MySQL is a neat toy but that's about it. All this pandering about what 4.0's got now and how they think it's mission-critical-ready is bullshit, plain and simple. We've got an open-source, free ACID-compliant database already. MySQL has lost, they just don't know it yet.
From the release notes:
"Removed all Gemini hooks"
See:
http://www.mysql.com/news/article-75.html
And you will understand. There probably are not enough changes to warrant a +1 increment in the release number, but this gives them a new version that does not support NuSphere's attempt to usurp MySQL's copyright and trademark rights.
-josh
I might be missing something, but I don't see anything contradictory in these two statements. If your "where" clause is sufficiently easy and you can take advantage of indexing, you would expect any decent system to perform reasonably well when it is returning a handful of rows. Five table joins would only be scary when you can't toss out the vast majority of your 5 million rows at step one.
One thing that has just occurred to me in this, round 5,408 of the MySQL-versus-PostgreSQL flame fest is to ask if anybody has tried to replace MySQL with PostgreSQL or vice versa in any project that both sides would consider "worthy" in some sense. One problem here, of course, is that if you really wanted an ACID solution, you weren't (aren't?) likely to choose MySQL in the first place. But what if you wanted to use PostgreSQL with the Slashcode; would this port be doable in anything like a reasonable amount of time? If done, could it then be shown that the result was any different (for better or for worse) than what you get with the current system?
Another idea would be to borrow an idea from the functional programming community and do a "database programming contest" like the the annual ICFP Programming Contest. You could try to find a sponsor and give out cash prizes, or you could get *really* serious (like the ICFP people do) and award year-long bragging rights.
I mean, both PostgreSQL and MySQL fans would probably go to great lengths to avoid having to refer to the other product as, for example, "the relational database system of choice for for discriminating hackers" in public. Not to mention the possibility that both of them would have to acknowledge the supremacy of DB2 or Oracle or something. :-)
Babar
Does this new release of MySQL support the proper use of foreign keys internally? (found no info in release notes). Or has MySQL a different way of implementing a "One-Many"-relation?
You don't absolutely need foreign keys to implement one-to-many relationsships - you can just do it the standard way and drop the foreign key constraint. Of course, your app needs to make sure that it updates and deletes entries when necesarry (and when did NTH^WNTNU stop using Oracle? That's what we used when I had the database course there, and MySQL is a poor choice for a course in databases If there's one place you'd like to really have foreign keys, subselects, transactions, views etc, a university course on databases is it - and PostgreSQL clearly has better SQL support than MySQL)
GCC is *not* the culprit. Code that doesn't compile on GCC 3 is broken code, that violates the basics of the C language.
True, and I knew that you knew that too, so I did not write it. But a lot of code still doesn't work with gcc 3.x. So as another pointed out, a bug report to the mysql crew is in order. (if you have the time).
windshields, seat belts, rediator, air-conditioner, brakes, and suspension, would you encourage someone who will be relying on his or her car in demanding situations to buy a car from this vendor?
Of course not. Don't be silly.
It wouldn't help that you can buy the missing parts from Jimmy-Bob either if you know the car was never even designed to HAVE windshields, seat belts, radiator etc. in the first place.
Actually, having transactions, foreign key constraints and better locking as extras is more like selling a car without an engine or a gear box and driving it Fred-Flintstone-style...
If I have complete control over every application that accesses the database then I can ensure that garbage data is not put in.
That is exactly the point. In the real world, you cannot ever have this 100% complete control. Even if you have somehow been given supreme executive database authority in your company, one day you will quit or get run over by a bus. Your successor will be handed a requirement to implement this new little tool that the sales force desperately needs that just happens to require access to this "protected" database. If you had done the job right, almost all the new developer would need to know would be completely described in the database itself. With referential integrity relegated to clients, the developer first has to completely understand the database and all of the client GUI code everywhere.
Let's say he's real good and figures out all of the GUI code. Then the next day, the marketing guys come up with a new requirement for a little app that they need that also accesses the same database. Now you have to completely understand 2 other applications before coding this new one. And it just keeps getting worse and worse as time goes on.
There are tradeoffs involved. Performance problems can often (but not always) be handled by throwing more/better/faster hardware at it. The tradeoff I most often see is that it's usually easier and much quicker to slap client code together than it is to do a good solid system design.
I'm not arguing FOR MySQL here, but I just wanted to point out I think the main reason people use it is accessibility. I started out programming CGI apps with perl using MySQL databases about 5 years ago with no prior database or programming experience. Over those years I've written tons of code and designed hundreds of databases. All of my current knowledge has come from books, MySQL documentation, and real world experience. Coming from this background, I started with MySQL because it was highly recommended as an easy to use database, and at the time was considered to be much faster that Postgres.
MySQL is very easy for the newbie to get up and running. You don't have to understand views, foreign keys, transactions, etc. All you have to understand is "put data in with insert, pull data out with select"... Well, maybe more than that but you get the idea.
That said, I've recently switched one of my major production databases over to postgres. I finally "graduated" to the point where I needed views, row-level locking, and some other things that I couldn't get with MySQL. I'm finally comfortable with postgres but it took some time. Postgres requires a bit more of it's users. For example, queries are case sensitive. If you have a web form that says "enter name of user to lookup" and feed that form to a query, "where user = 'bob'", you have to have extra code to convert 'bob' to 'Bob', 'BOB', or whatever. Either that or train your users better (yeah, right). You could argue that this is the "correct" behavior and I'd agree with you, but still, it's a PITA to rework already working programs to account for this.
All of that said, I still use MySQL for some things.. I have several databases that get updated once daily via a cron job from a production database and are searched hundreds of times per minute by a mod_perl-enabled web site. Why bother with the complexities of having to explicitly set up keys and indexes, worry about case-sensitive matching, vacuum analyzing, etc, for such a simple database? In my mind this only requires more work and introduces more chances for errors.
It all boils down to the right tool for the job. Writing a complete accounting package? Use postgres (or something commercial). Writing a simple interface to query a list of students in a classrom? Use MySQL.
Shayne
Today I didn't even have to use my AK; I got to say it was a good day -- Icecube
If you're listening MySQL -- who really cares about SSL and unicode. Granted I've had them in my RDBMS that I use but still you're missing the big picture.
... is great and all, but the resulting table is a PAIN to read. What I want to know is simple:
... DESC can now use keys."
:D
The people who post funny things like 'MySQL Mad libs' are on to something - Slashdot is full of 'arm chair' DBAs (and programmers and network engineers and etc. etc. etc.) who know just enough to think they know exactly what is wrong and how to fix it. It's kind of like the people who have used a computer for a while and start trying to troubleshoot their friend's computers - 'You may have run out of Megahertz' or 'You probably need more memory to download files faster'. Yes, subqueries, locking issues, etc. are very important. However, these are 'surface' level flaws in MySQL. There are some very, very large problems in the underlying architecture which need to be addressed. Since they're not as glamorous (or as understood) as subqueries and pretty much ignored by 90% of the users they never see the light of day - but they impact almost 100% of the users in one way or another.
Fix things like this:
Filesystem buffered writes. Transactional support is great - it allows me to roll-back aborted transactions. However, due to the inability to control whether or not my tables are write-buffered means that MySQL may *think* it has performed a write even though it is still in the write-cache. I can then turn off the system and voila -- corruption! Part of the fault lies in the OS who tells MySQL it was written even though it is in the cache, but I have a simple solution. Devise a way to selectively turn off buffered writes for certain tables / databases. This way if I know I have a critical table which has a lot of writes I can turn buffering off and be ASSURED that writes will be performed when asked. I suspect a lot of 1040 and other table corruptions are caused by something like this. Yes, performance will take a hit but I think it is a very acceptable trade-off for data corruption. Obviously all system tables should NOT be buffered.
Inability to use more than one index on a table in a query -- most enterprise RDBMS' can use more than one index on a table for a query. This can easily save a table scan or the use of a single, less-efficient index. Given an example query - 'SELECT bob FROM sometable WHERE somecol = 45 and somecol2 = 44' - if I have two indexes on sometable (somecol and somecol2) it can join the two indexes together and
Clustered indexes. These basically physically sort the table based on particular columns. This allows you to ORDER BY username ASC without using anything special since the rows are already sorted on username (if you have a users table and cluster the username col). This also greatly speeds up BETWEEN clauses. And yes, to people who know a little bit of SQL but don't know as much about clustered indexes -- you can create an index with a bobcol ASC but clustering the actual data is faster and more efficient if you are grabbing data which is not on the index. For example, SELECT * FROM table ORDER BY username ASC will not be as efficient as the same query clustered on the username. If you had a sorted index on username it will probably read the index sequentially and then visit the table. That extra operation = more disk seeks = more time / cpu to execute (and it really adds up as the table size increases). However, if you are doing something like 'select username, password from user order by username' it would be better to create a sorted index on username ASC, password. That way it will read the index only and not visit the table at all.
on-line backups. In today's internet world your site has to be 24/7. This means you cannot have significant performance problems (or even offline-ing your dB!) when you make a dump -- Sybase, etc. have done this from as far back as I can remember.
backups to something other than CSV files. MS SQL, Sybase, Oracle, they all dump to a compressed binary file. Saves a TON of space and is MUCH FASTER to dump and load. I can dump a 6GB Sybase DB in under 5 minutes. Loading it all (from scratch) and then bringing the DB online is about the same amount of time.
Ability to specify the number of files to dump to. What happens if you have a dump which is larger than 2GB? Most linux distros cannot handle a single file of 2GB or more. Give users a way to, within the dump statement, split the dump over two files. Sybase has the 'STRIPE ON' clause (originally to dump to two tape drives at once but works fine on filesystem files as well) to split the dump equally over an unlimited number of files.
cleaner way to view query plans of statements. EXPLAIN
Query is using XYZ, ABC tables. Table XYZ is using index 123 which is sorted so I do not need to create a temp table to sort ASC.
Since you have all the columns in your select statement in the index I do not have to visit the actual table - I can pull it all from the index. Because of this, I will read the index from start to finish.
ABC is using index 23dsf which is not sorted so I must create a temp table to sort that. Also, since it is a join, I do not need to perform an index scan but a positioned search (table scan is to a WHERE clause with no index AS index scan is to an index which is not selective enough or needs to read all columns.)
Simple, easy and pretty much even a NOVICE can see that their query is a good performer or a bad performer.
Along with more in-depth EXPLAIN, also provide me with a way to see what the optimizer is doing with the query. In MS SQL and Sybase you have 'trace flags' which you can turn on before your query to see EXACTLY what Sybase is doing - why does it think this index is better than this other one, why is it table scanning when you think it should index sort, etc. Give me an easy way to say 'verbose on; explain xxx;'.
Ability to delve deeply into performance of the system. If there is one job a DBA must know it's how to tell what the heck is going on when something is slow. Currently MySQL gives you meaningless info like 'slow queries'. Great, I see 200,000 of them. What queries are they? What good is it in a large application which may contain 3000 lines of SQL to tell me the raw number of queries which are slow? I want to know the EXACT SQL of the query(s) which are slow and I want to find the one taking up the most CPU time and blocking all the rest. I want to know how MySQL is managing it's data cache so I can see if I need more ram (e.g. it is swapping lots of data to/from the cache) or if I am I/O bound. Don't tell me to look at 'free' or 'top' - half the time it is wrong because you (MySQL) tell it misleading figures. I want *you* to tell me exactly what you are doing since you would know best! If you've ever seen a sp_sysmon output from Sybase ASE you'd know what I'm talking about.
Hire people who are accomplished relational algebra freaks. Pay them a lot of money - it is well worth it!!! If I see another changelog entry like this I'm going to scream:
Optimized queries of type: SELECT DISTINCT * from table_name ORDER by key_part1 LIMIT #
So does that mean these queries were NOT AT ALL optimized before? It doesn't read 'FURTHER optimized'.
"ORDER BY
Does that mean it was table scanning each time? Jebus! Get these wiz-bangs to overhaul your query optimizer. Hands down this is one of the most important things in the database -- knowing how to use the database statistics and knowing when to use a merge-join vs. a hash-join etc. are CRITIAL to database performance. Trust me when I say that this is a MAJOR priority - especially over SSH or Unicode since it impacts EACH AND EVERY statement which MySQL parses.
Small things which can really raise the respect of MySQL in the eyes of people who matter - the CTOs and CIOs of the world who might be considering something like Postgres or MySQL but opt for MS SQL, Sybase, or Oracle because of the more advanced (but less flashy) features.
Of course, integrated row (or in the least page) locking and full support of subqueries and the like are also VERY HIGH priorities.
I think addressing these issues will vastly improve MySQL -- they're not flames really, just gripes
Thanks,
--
Matt