MySQL Moves to Prime Time
MagLev writes "MySQL, especially version 5.0, is popping up on the radar screens of database gurus who built their reputations and book sales using other SQL databases. Ken North, who did those ODBC performance benchmarks for Oracle, Sybase, and DB2, wrote a recent article about MySQL 5.0. The article profiles mission critical database software and discusses how well MySQL 5.0 fits the profile. It gives good marks to MySQL, except for Java and XML integration."
I share your dislike of mySQL, but database lock-in is nearly impossible to avoid.
It can be done if you do something simple, like say, a forum, but for anything large, advanced features are incredibly useful, and will make sure you're dependent on that specific database.
For instance, it's very desirable to have everything in a stored procedure. Not only this gives a performance benefit, but it also increases security. Building your database this way you can ensure that nobody will ever be able to insert bad data. But the price of that is that converting to a different DB will be really difficult.
It was a very good decision.
Triggers are queries that are run automatically when the associated table or view is changed. It's generally considered to be a bad practice to overuse them, but they have a few very nice uses.
One is implementing complicated checks. You can make an update query fail if any fields don't meet arbitrary requirements. That's good because you can use that to ensure that everything inserted in the database makes sense.
Another is logging. You can easily make a trigger that inserts the previous content of a row into a different table. Can come really handy for debugging, or when you simply want to easily make sure any changes to some data can be tracked. Also handy if you not only want to deny access to something, but also record somewhere that somebody tried to touch it.
Another use can be gradual redesign. If you're phasing out a column, you can do it in steps: Removing its usage from the latest version of your DB interface, and adding a trigger that ensures it has some value that doesn't confuse older versions. This can be used to provide smoother upgrades.
Locking is a major problem. In a database locks must be placed on data to maintain the consistency. You definitely don't want a database that locks the whole table without a really good reason, because as soon as table locks start happening, your performance goes to hell, as everything else will have to wait.
You're absolutely right. The database should fail to accept bad data. Integrity doesn't mean that the database tries to make an educated guess about what you meant (MySQL does this, and its one of the many reasons why MySQL sucks). It means that the database isn't going to let you insert data if it's not correct. Take the simple case of a foreign key. Your application may be buggy and try to insert a value into a column with a foreign key constraint that doesn't exist in the referenced table. The database should fail your insert, not try to find the closest match to the value you were asking for, or let you insert the value anyway even though it violates the constraint.
The people that defend MySQL's lack of features by saying that you can fix all of the problems in application code obviously do not know what they're doing when it comes to databases. They may truly be smart people, but as soon as they spout such nonsense you can be sure that any DBA that hears it will never take them seriously again.
It's not that it sucks, it's that it just doesn't stack up.
The added features of MySQL 5, if put into the context of the auto industry, would be like a car manufacturer announcing that some of their 2005 models would now come with airbags and anti-lock breaks. Yes, it shows improvement, and yes, it may plug some longstanding criticisms, but in the larger picture it still means that company is years behind everyone else.
MySQL 5 would have been a great advancement to put it in serious technological competition with other databases...if it had been released in 1999 or 2000. The reality is that Postgre is in version 8 with serious Windows support, Oracle is at 10g with gobs of new features 1% of DBAs will use, and Microsoft is in the process of unleashing a major new version of SQL Server onto a world that has done it no wrong. MySQL has only managed to catchup to where the industry was 5 years ago. Everyone else has kept moving.
Real DBA's don't like MySQL for the same reason real web developers don't like IE. They're both behind the times, fail to live up to standards (CSS/ACID), and only got to where they are because of aggressive bundling. IE is "popular" because it's preinstalled and thus used by the average joe who doesn't know any other "internet". MySQL has made sure it is sitting on every free and cheap LAxP host out there, resulting in droves of kiddie web developers whose experience involves a few web tutorial on PHP and MySQL being locked into its heavily proprietary interfaces and dodgy "optimizations".
I've read the article, am familiar with the product, and am also familiar with data warehousing. This is just propaganda.
> MySQL has a demonstrated capacity for managing very large databases. Mytrix,
> Inc. maintains an extensive collection of Internet statistics in a one
> terabyte (1 TB) data warehouse that contains 20 billion rows of data.
MySQL is a horrible product for warehousing: no query parallelism, no partitioning, primitive memory management, primitive optimizer, etc, etc. 20 Billion rows in a single table? What would mysql do with that? Any typical warehousing queries would take hours to come back. More likely either the database owners are just logging data someplace cheap, or they are creating hundreds of much smaller tables.
> It replicates 10-60 gigabytes per day from its master database to a MySQL server farm.
Ah, a 'server farm'. So, the 20 billion rows are probably spread across dozens of mysql servers. This explains it all. Even SQL Server can do better than that.
> The MySQL databases are used to support a shopping application that can accommodate a million fare changes per day.
Yeah, I think db2's most recent benchmark was for something like 3 million transactions a minute.
This is little more than an advertisement for mysql. A little poking around would probably show that he's on the mysql payroll.
Wait a second. If that's a warning, it means that it went ahead and did it and warned you that it did so. If a connector translates that into an exception, how do you roll back what MySQL already did? Besides, since when is it the domain of the connection software to handle referential integrity constraints?
What's the default, and why would I trust them? MySQL has already proven they don't care what you (the developer) want by doing stupid stuff like trying to set defaults when you don't want them, silently ignoring commands (like when you try to create an InnoDB table on a deployment of MySQL without InnoDB), etc. Given that, my assumption would be that the default of MySQL is not the stricter modes, and even if I did choose them they wouldn't be as strict as you would expect.
I'd call it a generous understatement, personally.
Data and referential integrity is not a "feature", but a requirement (if it's not a requirement for you, you don't need a RDBMS). Other features may not be that important, but they're still nice to have for the one or two times you need them. PostgreSQL is free if price is a concern, and there are plenty of other free DBMSs with varying levels of functionality out there.
Really? Postgres -- fully functional, powerful RDBMS that routinely competes with the Big Boys in terms of speed and features, but has a funky maintenance system (vacuum) and doesn't run natively on Windows. MySQL -- toy database propelled to stardom by open source fanaticism, notoriously unstable, its vaunted speed only applies to relatively small, simple datasets (let's hope you're not doing something crazy like a join!), with arrogant developers who tell their users that they don't need certain features ... right up until the point that they implement the feature and then pretend they never bad mouthed it (yes, that's right, the developers did say that you didn't need row-level locking and transactions because you could do everything you need with a table lock and some programmer "smarts"). Sounds like a wash to me.
In comparison to Oracle, Postgres is trivially easy to configure. It also has a very large (but not as vocal as MySQL's) community, and it's not very heavyweight. If anything, MySQL is slowly converging towards Postgres, but it's doing so in a distinctly MySQL sort of way -- deny, deny, deny right up to the point you implement the feature. Because foreign keys will make you slow!