MySQL Moves to Prime Time
MagLev writes "MySQL, especially version 5.0, is popping up on the radar screens of database gurus who built their reputations and book sales using other SQL databases. Ken North, who did those ODBC performance benchmarks for Oracle, Sybase, and DB2, wrote a recent article about MySQL 5.0. The article profiles mission critical database software and discusses how well MySQL 5.0 fits the profile. It gives good marks to MySQL, except for Java and XML integration."
What are the biggest areas you see these problems?
-- Have you ever imagined a world with no hypothetical situations?
I share your dislike of mySQL, but database lock-in is nearly impossible to avoid.
It can be done if you do something simple, like say, a forum, but for anything large, advanced features are incredibly useful, and will make sure you're dependent on that specific database.
For instance, it's very desirable to have everything in a stored procedure. Not only this gives a performance benefit, but it also increases security. Building your database this way you can ensure that nobody will ever be able to insert bad data. But the price of that is that converting to a different DB will be really difficult.
Notably absent from that list is "SQL"...
no, the database should maintain the integrity of it's data reguardless of buggy code. thats the problem with mysql, it doesn't protect your data.
If you mod me down, I will become more powerful than you can imagine....
Placing extensive amounts of business logic in SPs can lead to fractured code bases and schizophrenic systems, vendor lock-in, reduced scalability, extremely difficult debugging, and porting/cross-platform problems.
If performance is so critical that SPs can make or break a system, then you should probably consider changing the architecture by adding application tiers.
Any sect, cult, or religion will legislate its creed into law if it acquires the political power to do so.
It was a very good decision.
Triggers are queries that are run automatically when the associated table or view is changed. It's generally considered to be a bad practice to overuse them, but they have a few very nice uses.
One is implementing complicated checks. You can make an update query fail if any fields don't meet arbitrary requirements. That's good because you can use that to ensure that everything inserted in the database makes sense.
Another is logging. You can easily make a trigger that inserts the previous content of a row into a different table. Can come really handy for debugging, or when you simply want to easily make sure any changes to some data can be tracked. Also handy if you not only want to deny access to something, but also record somewhere that somebody tried to touch it.
Another use can be gradual redesign. If you're phasing out a column, you can do it in steps: Removing its usage from the latest version of your DB interface, and adding a trigger that ensures it has some value that doesn't confuse older versions. This can be used to provide smoother upgrades.
Locking is a major problem. In a database locks must be placed on data to maintain the consistency. You definitely don't want a database that locks the whole table without a really good reason, because as soon as table locks start happening, your performance goes to hell, as everything else will have to wait.
so basically your decision was a religious one and absolutely not an engineering one. nice.
I suspect that what may be pushing Sun to postgresql is the fact that they can, in essence, 'own' their derivative work without licensing it as they would have to do with mysql as opposed to any technological reasons. Pay no mind to this post though, that's just the conspiracy theorist in me talking.
http://www.watacrackaz.com
FULL OUTER JOIN
You're absolutely right. The database should fail to accept bad data. Integrity doesn't mean that the database tries to make an educated guess about what you meant (MySQL does this, and its one of the many reasons why MySQL sucks). It means that the database isn't going to let you insert data if it's not correct. Take the simple case of a foreign key. Your application may be buggy and try to insert a value into a column with a foreign key constraint that doesn't exist in the referenced table. The database should fail your insert, not try to find the closest match to the value you were asking for, or let you insert the value anyway even though it violates the constraint.
The people that defend MySQL's lack of features by saying that you can fix all of the problems in application code obviously do not know what they're doing when it comes to databases. They may truly be smart people, but as soon as they spout such nonsense you can be sure that any DBA that hears it will never take them seriously again.
It's not that it sucks, it's that it just doesn't stack up.
The added features of MySQL 5, if put into the context of the auto industry, would be like a car manufacturer announcing that some of their 2005 models would now come with airbags and anti-lock breaks. Yes, it shows improvement, and yes, it may plug some longstanding criticisms, but in the larger picture it still means that company is years behind everyone else.
MySQL 5 would have been a great advancement to put it in serious technological competition with other databases...if it had been released in 1999 or 2000. The reality is that Postgre is in version 8 with serious Windows support, Oracle is at 10g with gobs of new features 1% of DBAs will use, and Microsoft is in the process of unleashing a major new version of SQL Server onto a world that has done it no wrong. MySQL has only managed to catchup to where the industry was 5 years ago. Everyone else has kept moving.
Real DBA's don't like MySQL for the same reason real web developers don't like IE. They're both behind the times, fail to live up to standards (CSS/ACID), and only got to where they are because of aggressive bundling. IE is "popular" because it's preinstalled and thus used by the average joe who doesn't know any other "internet". MySQL has made sure it is sitting on every free and cheap LAxP host out there, resulting in droves of kiddie web developers whose experience involves a few web tutorial on PHP and MySQL being locked into its heavily proprietary interfaces and dodgy "optimizations".
Oh well, freeloaders get bitter when the free ride ends - or when they think it might end? or maybe they don't like the license for some reason? (I guess I'm not sure what the issue is, but I digress)
I use mysql and love it. Not only are the availability and price unbeatable (it ships with my OS, how much easier could it get?) but the performance is truly outstanding. Predictably though, the mysql-bashers who are stuck in 1998 will arise to tell us how mysql can't do subqueries, or has no transactional integrity, or whatever other fairy tales they might dig up at random
Why can't we just admit that mysql is a useful tool in many situations, and move on?
Of course... because the risks associated with licensing and the law don't have any bearing on building marketable products or services, right?
There is a decent sized market out there where organizations don't need a complicated schema or fancy features. I have seen places where MySQL is heavily preferred due to speed and liteness.
They just want to do your average query on a fairly large db, but do it fast, hella fast. They'd rather put MySQL on a fast proprietory filesystem. Stripe and load balance off some fast storage arrays. And just blast away.
As somebody who maintains the database abstraction layer for a very complex enterprise application platform that runs in Oracle, SQL Server, and DB2 (the latter of which I was the primary porter), I don't think you're right at all on this point. It is possible to go really, really far by using nothing but SQL standard features, and factoring common features with different syntax into a pretty thin database abstraction layer.
First of all, you're mislabeling what you describe as "security" (which properly refers to controlling access to information); the term you have in mind is data integrity (which refers to protecting data from invalid inserts from authorized parties).
Second, constraints != stored procedures. You can add constraints to a schema without having to write a single stored procedure.
The trick to maintaining cross-database compatibility is to avoid requiring behaviors that are only implemented in one database (you may only use such behaviors optionally), and never ever hardcoding database-specific SQL syntax into any part of your application; instead of the latter, you delegate construction of database-specific SQL statements to a software component that knows how to translate your request to the database you're connected to.
I've done a lot of performance testing and taking batch x by submitting it raw vs. putting it in a procedure have only very small differences in performance. The more processing is done, the smaller the difference. I've even seen cases where preparing and executing ad-hoc SQL is slightly faster than using an existing SP.
.Net Remoting, CORBA, J2EE, etc.
The only time SPs are significantly faster than ad-hoc SQL are when the two are different. Every database worth a crap today implements ad-hoc batch caching. That means that SQL blocks from applications execute exactly as if they were pre-compliled SPs.
As for sharing code, that causes as many problems as it fixes. When code is implemented as SPs and called from multiple apps, then a change to an SP can potentially have compatibility problems with any of the applications. Nothing is guaranteed to work. If the business code were in a seperate business layer, then maybe the change wouldn't be instantly effective in all applications, BUT, all application would be guaranteed not to break until you visit them. Test each app and patch. If you haven't figured out how to effectively deploy a middle-tier fix, that's your own fault.
I fight this battle every day at work. We implement WAY too much code in SPs. It takes far longer to build logic into an SP han to put in in compiled code in a good IDE. Almost every developer is mores likely to make an error in SQL than in a procedural language. Also, SQL debugging is never as easy as native debugging.
Architecturally, a good business layer built as Web Services is far superior to a business layer built as SPs. Better passing semantics, better language choices, more portability. If you don't like disconnected processing, pick your favorite three tier technology -- COM+,
In 1996, SPs were invaluable. Today, they are a great tool, but not necessary for either good performance or good security.
I've read the article, am familiar with the product, and am also familiar with data warehousing. This is just propaganda.
> MySQL has a demonstrated capacity for managing very large databases. Mytrix,
> Inc. maintains an extensive collection of Internet statistics in a one
> terabyte (1 TB) data warehouse that contains 20 billion rows of data.
MySQL is a horrible product for warehousing: no query parallelism, no partitioning, primitive memory management, primitive optimizer, etc, etc. 20 Billion rows in a single table? What would mysql do with that? Any typical warehousing queries would take hours to come back. More likely either the database owners are just logging data someplace cheap, or they are creating hundreds of much smaller tables.
> It replicates 10-60 gigabytes per day from its master database to a MySQL server farm.
Ah, a 'server farm'. So, the 20 billion rows are probably spread across dozens of mysql servers. This explains it all. Even SQL Server can do better than that.
> The MySQL databases are used to support a shopping application that can accommodate a million fare changes per day.
Yeah, I think db2's most recent benchmark was for something like 3 million transactions a minute.
This is little more than an advertisement for mysql. A little poking around would probably show that he's on the mysql payroll.
Wait a second. If that's a warning, it means that it went ahead and did it and warned you that it did so. If a connector translates that into an exception, how do you roll back what MySQL already did? Besides, since when is it the domain of the connection software to handle referential integrity constraints?
What's the default, and why would I trust them? MySQL has already proven they don't care what you (the developer) want by doing stupid stuff like trying to set defaults when you don't want them, silently ignoring commands (like when you try to create an InnoDB table on a deployment of MySQL without InnoDB), etc. Given that, my assumption would be that the default of MySQL is not the stricter modes, and even if I did choose them they wouldn't be as strict as you would expect.
I'd call it a generous understatement, personally.
Data and referential integrity is not a "feature", but a requirement (if it's not a requirement for you, you don't need a RDBMS). Other features may not be that important, but they're still nice to have for the one or two times you need them. PostgreSQL is free if price is a concern, and there are plenty of other free DBMSs with varying levels of functionality out there.
Really? Postgres -- fully functional, powerful RDBMS that routinely competes with the Big Boys in terms of speed and features, but has a funky maintenance system (vacuum) and doesn't run natively on Windows. MySQL -- toy database propelled to stardom by open source fanaticism, notoriously unstable, its vaunted speed only applies to relatively small, simple datasets (let's hope you're not doing something crazy like a join!), with arrogant developers who tell their users that they don't need certain features ... right up until the point that they implement the feature and then pretend they never bad mouthed it (yes, that's right, the developers did say that you didn't need row-level locking and transactions because you could do everything you need with a table lock and some programmer "smarts"). Sounds like a wash to me.
In comparison to Oracle, Postgres is trivially easy to configure. It also has a very large (but not as vocal as MySQL's) community, and it's not very heavyweight. If anything, MySQL is slowly converging towards Postgres, but it's doing so in a distinctly MySQL sort of way -- deny, deny, deny right up to the point you implement the feature. Because foreign keys will make you slow!
Odd, I've never considered sqlite to be an alternative to PostgreSQL. At least not where there is a lot of concurrency. (PostgreSQL, like MySQL, is client/server. sqlite is a library.)
Yes I did. SQLite has an even less restrictive license than postgresql. SQLite has no advertising clause. postgresql does.
but i was simply responding to OP's chest thumping about how php "ditched" mysql because the license was so evil. and his smug "but we all know the REAL reason they switched, wink wink nudge nudge" comment.
to OP, PHP moving to SQLite can't possibly be due to technical reasons -- it's due to ideological ones. so he's using it to justify his ideological switch to postgresql.
as you pointed out, his reasoning is bogus -- PHP moved to sqlite for technical reasons. the zero-config of SQLite being a major one. not even 5up4r-1337 postgresql can boast that.
Ahh and there still is the issue, that the MySQL devs refused to add transactions for years ditto for subselects because you do not need em :-)
thing of the past, but that attitude alone was enough not to touch this system.
MySQL is owned by a company, MySQL AB. PostgreSQL is an all-volunteer organization, sort of like Debian is to Linux. Sure, there are a few companies that work with PostgreSQL, such as Command Prompt, but they don't control the direction of PostgreSQL. MySQL AB controls all aspects of the MySQL database. Plus, being a company, they have the money to promote it.
This is one of the reasons why it is so much more popular than PostgreSQL. Another reason is that around the time of PHP taking off, 1998 or 1999, MySQL was emerging, while PostgreSQL was still in version 6.x. PostgreSQL was going through a complete code cleanup and rewrite so it was optimized at all. Therefore it performed much slower than MySQL. It has since closed that gap, while being a more robust database. PostgreSQL and MySQL actually took different development routes. PostgreSQL wanted to add the features to make it a world class database and optimize it and add speed later, while MySQL went for speed first and is now trying to add the features.
MySQL has always reminded me of how Microsoft works. Make it just good enough for the masses and then try to add enough to it to please the experts in the field.
Amen! You wouldn't believe how many times I've had to actually pull a team's head out of the clouds and point out that they really don't need a generic database abstraction layer, because the product is scheduled to go into production with the company's standard database product (which is almost always Oracle). Ironically, my first Java gig required such a layer, as we were writing a product that had to support both SQL Server and Oracle.
If the dev team wants to develop using a different DB than production (since Oracle DBAs tend to be pretty tight-assed and don't like developers creating their own schemas), then I'd be OK with a generic DB access layer, but I've never been on a team that's tried that.
Just junk food for thought...