According to the MySQL 5.0 Manual, mysqlhotcopy is a Perl script that does LOCK TABLES, FLUSH TABLES and then cp or scp. So you go read only with your 500GB production database while the darn thing is shoveling it over the wire to the other server? I wouldn't call that "flawless", but then again, my definition of 24x7 seems to be quite different from yours.
I have always admitted that the design goals of the Slony-I replication system introduced some rather ugly implementation flaws. But I would never have made the compromise to let it interfere that severely with a production database by design. mysqlhotcopy is lame and plain useless in a 24x7 environment.
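To put a rough number on that read-only window, here is a back-of-envelope sketch. It assumes the copy over the wire is the bottleneck and that the tables stay locked for the whole copy; the two throughput figures are illustrative assumptions, not measurements of any real setup.

```python
# Back-of-envelope: how long are writes blocked while the copy runs?
# The throughput numbers are assumptions, not measurements.

def lock_window_seconds(db_bytes: float, bytes_per_second: float) -> float:
    """Time the tables stay read-locked if the copy is the bottleneck."""
    return db_bytes / bytes_per_second

DB_SIZE = 500 * 10**9  # the 500GB database from the comment above

# Hypothetical sustained copy rates.
ASSUMED_RATES = {
    "100 Mbit/s link (~10 MB/s)": 10 * 10**6,
    "1 Gbit/s link (~100 MB/s)": 100 * 10**6,
}

for label, rate in ASSUMED_RATES.items():
    hours = lock_window_seconds(DB_SIZE, rate) / 3600
    print(f"{label}: about {hours:.1f} hours with writes blocked")
```

Under these assumptions the window is somewhere between roughly an hour and a half and most of a day, which is the point being made about 24x7 operation.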
It doesn't matter if any of those companies actually ship stuff or just use it internally and buy the license out of fear. With MySQL AB being the copyright holder of the majority of the code base and the fork being GPL based, TheirOwnSQL AB cannot sell them any commercial license that allows closed source usage. They can only hand them the whole thing under the GPL. So there is no license and no fear factor whatsoever that would make anyone pay anything to TheirOwnSQL AB. MySQL AB's business model is based on dual licensing... what is TheirOwnSQL AB's business model?
My guess is that far too many people have repeated over and over that "MySQL is open source, MySQL is GPL, if... yadda... fork...". There is no distributed developer community behind MySQL, there is a company with a business model behind it. If MySQL AB goes down the hard way, the people who maintain the code today will have to look for new jobs. They will be allowed to continue working on the GPL version in their spare time, but they will not be able to make a living out of that unless that is entirely based on a support/service model. And nobody can spin off any license based distribution or package business from that GPL community fork.
Yes, it is open source under the GPL, and that legally allows a fork. I have yet to see who is going to do that and why.
Care to compare? Which replication solution for MySQL allows building a slave without interruption? Which of them allows rebuilding a failed master and fail-back without significant downtime? Don't I have to shut down MySQL for the duration of creating the initial copy? Now that doesn't play nice in a 24x7 environment. Is that what you call a legitimate replication solution or is that the tradeoff I have when choosing MySQL?
Ease of replication...MySQL is real easy to set up when it comes to replication, not only is it easy, but it's full featured...
So MySQL replication does support a master to multiple slave setup where you can failover to one slave and the new master inherits all other slaves without the requirement of resynchronization. And when you later repair the original master, you can fail-back without significant downtime, right?
I might not have looked at it for a long time, but the last time I looked it only allowed promoting a slave to a single standalone database... that's not a master in my book (it has no slaves). Also, with MySQL's replication still being statement based, some of the glorious new enterprise features like stored procedures and triggers simply screw up your full featured replication.
I do admit, the Slony-I replication system has a lot of shortcomings, most of which are due to the original design goal of "being able to install on an existing, old Postgres version and use it to upgrade to newer ones". But that mostly affected the implementation, not the initial design of features.
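As a toy illustration of why replaying statements can diverge from shipping rows, the sketch below uses three in-memory SQLite databases. It is not MySQL's replication and proves nothing about it; `server_local()` is a made-up function standing in for any server-local state that a trigger or stored procedure might read.

```python
import sqlite3

def make_server(name: str) -> sqlite3.Connection:
    """An in-memory database standing in for one server (hypothetical setup)."""
    conn = sqlite3.connect(":memory:")
    # server_local() is invented for this sketch: its result depends on
    # which server evaluates it.
    conn.create_function("server_local", 0, lambda: name)
    conn.execute("CREATE TABLE audit (origin TEXT)")
    return conn

def contents(conn: sqlite3.Connection) -> list:
    return conn.execute("SELECT origin FROM audit").fetchall()

master = make_server("master")
stmt_replica = make_server("replica")  # replays the SQL text
row_replica = make_server("replica")   # receives the resulting rows

statement = "INSERT INTO audit (origin) VALUES (server_local())"
master.execute(statement)

# Statement-based: run the same text again on the replica.
stmt_replica.execute(statement)

# Row-based: ship the rows the master actually wrote.
rows = contents(master)
row_replica.executemany("INSERT INTO audit (origin) VALUES (?)", rows)

print("master:           ", contents(master))
print("statement replica:", contents(stmt_replica))
print("row replica:      ", contents(row_replica))
```

The statement replica ends up with different data than the master because the statement was re-evaluated locally, while the row replica matches.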
Though I also happen to know a few of the people who are involved in the development of MySQL, and I'm pretty sure they would quit and start their own company if MySQL's management decided to go that route.
And that new company is selling what exactly?
Note that MySQL AB holds the copyright to all of the server code as well as the trademark MySQL. Because of the trademark, let's call the fork TheirOwnSQL. Since TheirOwnSQL will be a fork of the GPL release, TheirOwnSQL AB will not be able to do the dual licensing again. That means that TheirOwnSQL AB will only be able to publish new GPL versions. The very reason why closed source software shops pay MySQL AB license fees today will prevent exactly those paying customers from using TheirOwnSQL. I guess they will continue to pay MySQL AB for a now closed source product.
Without paying customers and a single GPL licensed product, which is polluted with foreign copyright, how long will that new company exist?
If all replication options for Postgres are "niche afterthought hacks", what exactly do you call the third party table handlers that add transactional capabilities to MySQL?
Jan Wieck
Postgres core member
Author of the Slony-I replication system
Of course I'm not a SQL purist. I think that DBs exist as part of a larger system and play a specific role. I prefer my app-level logic and data where it belongs - in the application driving the db. Call me old school, but I can live in a world without views and stored procs just fine.
Where to implement your business logic (app vs. stored proc) is a preference question, granted. But having transactions and referential integrity constraints is not.
Transactions and constraints are your last line of defense against data inconsistency caused by application bugs. Think of them as a sort of assert checking. Your application code has the responsibility to check that the customer placing this new order actually exists. But if you forgot to SELECT FOR UPDATE in only one place, you created a bug that will probably not be exposed by any software testing. It is waiting for a race condition in which the cleanup procedure that removes inactive accounts once a month collides with the customer returning after 2 years.
If you define the constraints that represent the relationships in your data, your database will ensure that the above bug does not create an inconsistent state. One of the two transactions will fail and it might cause the application to do some ugly, ungraceful stuff. But trust me, you really want it to do that in this case.
Call me old school, but for some data I prefer belt & suspenders.
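The race described above can be sketched with SQLite. This is a minimal illustration: the table and column names are invented, and the two concurrent transactions are serialized on a single connection just to show the interleaving. The foreign key is what stops the orphaned order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id))"
)
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'returning customer')")
conn.commit()

# Step 1: the application checks that the customer exists, without locking the row.
exists = conn.execute("SELECT 1 FROM customer WHERE id = 1").fetchone() is not None

# Step 2: the monthly cleanup removes the inactive account in between.
conn.execute("DELETE FROM customer WHERE id = 1")
conn.commit()

# Step 3: the application, trusting its stale check, places the order.
try:
    conn.execute("INSERT INTO orders (customer_id) VALUES (1)")
    conn.commit()
    outcome = "order stored for a customer that no longer exists"
except sqlite3.IntegrityError as err:
    conn.rollback()
    outcome = f"rejected by the database: {err}"

order_count = conn.execute("SELECT count(*) FROM orders").fetchone()[0]
print(exists, outcome, order_count)
```

Without the REFERENCES clause the insert in step 3 would succeed silently, and the inconsistency would only surface much later.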
Your opinion on those gotchas doesn't matter. How to treat NULL, division by zero, variable overflows and so forth is defined in the SQL standard, and just because you're an Anonymous MySQL fanboy doesn't make it right to violate or ignore the standard. Get it, this is SQL, not K&R C.
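As a small reminder of what the NULL rules look like in practice, here is a sketch using SQLite as a convenient stand-in. It covers only NULL comparison (three-valued logic); division by zero and overflow are left out because engines differ there, and the table `t` is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (None,)])

def count_where(predicate: str) -> int:
    # predicate is a fixed literal in this sketch, never user input
    return conn.execute(f"SELECT count(*) FROM t WHERE {predicate}").fetchone()[0]

# Comparing NULL with anything, even NULL, yields unknown (NULL), not true.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

print("NULL = NULL     ->", null_equals_null)
print("WHERE x = NULL  ->", count_where("x = NULL"), "rows")
print("WHERE x IS NULL ->", count_where("x IS NULL"), "rows")
print("WHERE x <> 1    ->", count_where("x <> 1"), "rows")
```

The last query is the one that tends to surprise people: the NULL row is neither equal nor unequal to 1, so it is not returned.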
MySQL AB needed 10 years to add all those features. Being as popular as MySQL is means (among other things) that a huge horde of self-taught users had 10 years to learn how to live without them. Don't expect all of those simple PHP scripters to unlearn and improve overnight.
Congratulations MySQL AB. The 5.0 release is a very significant milestone and I am sure serious database users appreciate the new features.
There are other reasons to release the code under the GPL than to get code contributions. And unless there has been a significant mind shift in the company again, I doubt that getting contributions from a GPL based community is the main reason in this case. See the NuSphere fiasco for what I mean by the last mind shift.
One possible reason I can think of is free ALPHA and BETA testers.
Another reason could be to become the most popular Open Source Database, known by the students today who will be the decision makers of tomorrow.
Whatever it really is, the fact is that all code today is Copyright MySQL AB. Probably they only suggested that and all people who made little code donations agreed that would be a good idea.
InnoDB is as "tightly integrated" into MySQL as the next table handler NDB Cluster. That tight integration must be the reason why the transactions of both won't be ACID together, right?
Yes, someone could fork. There are just 2 little problems.
The fork is strictly GPL from there on, no more dual licensing. No big deal if the new team would come up fast enough with a new LGPL client library. Code from the current one can't be reused so the fork would right now lose prepared statements and such - now that was a smart license move from MySQL AB, wasn't it?
Problem number 2 is way more of an issue. Who in this world right now a) could do any significant work in the server code and b) is not on MySQL AB's payroll? All the really interesting MySQL (tm) users today are desperately waiting for 4.1 or even 5.0. The same features promised for those releases would have to be pushed back for another 2-4 years before any pure open source GPL team could do them (I speak from experience here, I have been hacking the Postgres backend code for about 10 years now and have implemented quite a few features for it). So before the forked... let's call it "OurSQL" because MySQL is (tm) MySQL AB and that fork would probably not be allowed to use that trademark... so before OurSQL could satisfy the current user demands, those users would have to either stick with MySQL (tm) for a few years and pay the license fees, or switch to any other database in the meantime.
I have heard of a lot of successful MySQL (tm) to PostgreSQL migrations. I have not heard of successful PostgreSQL to MySQL (tm) migrations. This could be because I am biased, because nobody tells me anyway, because the various PostgreSQL specific gotchas work as a perfect vendor lock-in or because marketing pumps more money into success stories than into failure stories. But I am sure a few people who stick to MySQL (tm) like chewing gum to a shoe (and in about the same comfortable position) can tell us some of them.
Using a TPC-W style benchmark suite implemented with Apache, PHP4 and either MySQL 4.1.1 or PostgreSQL 7.4.2, I get more or less the same performance. Because of the transactional requirements and the update concurrency, all tables are InnoDB, of course. Based on that I cannot but contradict your claims about MySQL's scalability (and I am a PostgreSQL CORE developer). It keeps up well and is stable even under heavy load. Where the test uses a stored procedure in PostgreSQL, it must use a bunch of PHP code and separate query calls in the MySQL case, but that is exactly what developers do today and since the Apache server is part of the benchmarked system, this is as fair as possible.
That said, Apache+PHP+DB is the environment most people are talking about when they speak about simple to moderately complex Web applications. With the scalability and performance being head to head, why would someone voluntarily miss stored procedures, views, triggers and all the other features MySQL has yet to implement? And while the (new in 4.1) subselect support makes it possible to get all of the TPC-W functionality implemented at all, to get it running fast enough in MySQL one has to rewrite some queries in a manner that I would call unmaintainable code. These complex features are not something where you can say "Transactions, checkmark". You have to look at how complete the implementation is and how well the query optimizer can deal with queries that use that feature.
So looking at the two right now, with the performance advantage gone, and the Win32 support knocking at the door, replication available and tons of well settled features in the HISTORY that are still on MySQL's ROADMAP, PostgreSQL is not just the better choice in some cases. It is ahead... except for MySQL's outstanding marketing.
Looking with a narrow view at situations that need 100% guaranteed zero transaction loss on failover, you are right. Many businesses however can live with a little (few seconds) lag and the risk of losing the last couple of transactions, given that there is a mechanism to later analyze the failed server (after recovery) and find out what had been lost, to solve these cases manually or inform users/customers.
The true failover functionality you are talking about will be the goal of my follow-up project Slony-II, which will implement synchronous multi master replication for PostgreSQL. The design phase will start in about 3-5 months.
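The "find out what had been lost" step can be sketched as a simple comparison between the recovered master's log and the position the replica had reached. This is only an illustration of the idea and not how Slony-I works; the log format, sequence numbers and entries below are all made up.

```python
from typing import List, Tuple

Txn = Tuple[int, str]  # (sequence number, description); hypothetical log format

def lost_on_failover(master_log: List[Txn], last_applied_on_replica: int) -> List[Txn]:
    """Transactions committed on the failed master but never applied on the replica."""
    return [txn for txn in master_log if txn[0] > last_applied_on_replica]

# Invented data: what the recovered master says it committed...
recovered_master_log: List[Txn] = [
    (101, "order 5001"),
    (102, "payment 5001"),
    (103, "order 5002"),
    (104, "address change 77"),
]
# ...and how far the replica had gotten when it was promoted.
replica_position = 102

lost = lost_on_failover(recovered_master_log, replica_position)
for seq, what in lost:
    print(f"needs manual follow-up: txn {seq} ({what})")
```

With a few seconds of lag the list is short, which is what makes resolving these cases by hand or by notifying the affected customers a workable tradeoff for many businesses.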
Instead, the controller sends commands and both the "master" and "slave" respond to those commands. If you must apply that terminology, then the controller is the master and both drives are the slaves.
So you mean I have a black and a white slave working together for the same "controller"? So the real offending word now should be "controller"... they must be removed immediately from LAX!
Postgres, the only other threat on the first point, was nullified with Oracle's acquisition of the only backend to it with atomic commits
What Postgres backend did Oracle acquire?
Jan
My top demotivator for the change is the inherent weird feel of using PostgreSQL. Call me flamebait, but the problem is that it is just not MySQL.
I think you meant "but the problem is that MySQL is so far from any SQL standard that I will not like any other DB".
Jan
one can talk to god as long as he wants, when he thinks god does answer is where the problems start ...
Can you point to an asynchronous replication system for ANY database that addresses that problem better than Slony-I does?
Sincerely, Jan
Come on guys, the troll has a point. The encoding issues with bytea need to be added to the PostgreSQL-gotchas page ;-)
You really need to update your "known facts".