MySQL Clustering Software Launched
lawrencekhoo writes "MySQL AB announced yesterday that software for building a
MySQL Cluster
will be available for download by the end of April. Articles available from
Computerworld,
Internetnews,
Linux Electrons,
and PHP Architect.
Great! Now my website can finally have 99.99% availability ..."
Here are some direct links to more information:
Oh, and they say availability is 99.999%, not just 99.99%
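For a sense of what that extra nine buys, here is a quick back-of-the-envelope sketch. The function name is made up for illustration, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent):
    """Allowed downtime per year for a given availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.99, 99.999):
    print(f"{pct}% -> {downtime_minutes_per_year(pct):.1f} minutes/year")
```

That works out to roughly 52.6 minutes a year at 99.99% versus about 5.3 minutes a year at 99.999%.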
MySQL Cluster combines the world's most popular open source database with a fault tolerant database
It's nice to start out a press release with a lie, isn't it? As far as I know, the title of the world's most popular open source database (meaning it has the most installs around the world) belongs to Berkeley DB.
I remember someone developing a rather advanced multi-master replication and clustering system for PostgreSQL. Does anyone know how far along that project is? Has it entered the testing phase yet?
From what I've read it looked very, very promising, but it doesn't do much good if it's on paper only...
Apples to oranges. The press release should have been more specific than just "database", but still... Berkeley DB is not a "database" as most developers think of the term (relational, accessible using SQL, etc.).
Berkeley DB is code that manages a data store, and you access the data using method calls within your app (you compile their code with your project), NOT using SQL, and NOT by connecting to an independent application. Remote access n/a, no ODBC or JDBC, etc. Great product, but a completely different animal from MySQL and other relational databases.
In fact, MySQL used to offer Berkeley DB (as opposed to InnoDB, etc.) as a data storage option WITHIN the MySQL product.
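To make the difference in access style concrete, here is a rough sketch. It does not use Berkeley DB itself: Python's standard dbm.dumb module stands in for an embedded key/value store, and sqlite3 stands in for "a database you talk to in SQL" (sqlite3 is itself an embedded library, so only the query style is being illustrated, not the client/server part). The keys, table, and function names are all made up.

```python
import dbm.dumb
import os
import sqlite3
import tempfile


def embedded_lookup(path):
    # Key/value style: the store is library code inside your own process,
    # and you read and write with method calls, not SQL.
    with dbm.dumb.open(path, "c") as store:
        store[b"user:1"] = b"lawrencekhoo"
        return store[b"user:1"]


def relational_lookup():
    # Relational style: you describe what you want in SQL and the engine
    # works out how to fetch it.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("INSERT INTO users VALUES (?, ?)", (1, "lawrencekhoo"))
        row = conn.execute("SELECT name FROM users WHERE id = ?", (1,)).fetchone()
        return row[0]
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    kv_value = embedded_lookup(os.path.join(tmp, "demo"))
sql_value = relational_lookup()
```

Same data either way, but in the first case your app owns the storage calls, and in the second the database engine interprets a query.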
There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.
How much can databases improve over time, and how much improvement is left to achieve? Sooner or later MySQL will be enterprise caliber and Oracle will have bigger things to worry about than PeopleSoft.
If this is the required deployment, then for people like us whose DB size is over 20GB (and yes, the big blobs are already stored compressed), this would not be economically practical to use. Factoring in OS and caching, I need to get 22GB of memory for each node? Last I checked, 2GB chips are still nasty expensive.
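Spelling out that arithmetic as a sketch: it assumes, as the post does, that every node has to hold a full in-memory copy of the database. The 2GB of OS/cache overhead is simply back-solved from the 22GB figure, and the function names are made up.

```python
def ram_per_node_gb(db_size_gb, overhead_gb):
    # Assumption: each node keeps the whole database in RAM,
    # plus some headroom for the OS and caching.
    return db_size_gb + overhead_gb


def modules_needed(total_gb, module_gb):
    # Ceiling division: a partly used module still has to be bought.
    return -(-total_gb // module_gb)


per_node = ram_per_node_gb(db_size_gb=20, overhead_gb=2)
modules = modules_needed(per_node, module_gb=2)
print(f"{per_node} GB per node = {modules} x 2GB modules")
```

Under those assumptions that is eleven 2GB modules per node, multiplied by however many nodes the cluster has.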
You know you're a Database geek when you see the headline and immediately think: "Ah hah! Clustered indexes! That'll save some time during joins! Oh. Wait. They're talking about boxes. Drat."
Nowhere did they mention battery-backed RAM modules as a recommended config, so I believe you're correct to assume that disk not only has to be used, but MUST be used.
Without RamSan-style battery-backed RAM, there is no way any enterprise would trust clusters of any kind to RAM-only storage for write commits.
Looks like each write transaction will be synchronized across all nodes, which would explain the gigabit and lower-latency interconnects. Still, this is crazy complex to make fast and reliable.
So to make it truly synchronized, they have to write to disk, for backup/log, before committing the data to RAM. So regardless, writes are slow, and I'm waiting to see how they bypass this disk write commit latency. Add on that they have to do this for all nodes before responding to the app, and writes are crazy slow, relatively, since they can influence indices, force flushes of cached/in-RAM data, etc. Would be interesting to see how they handle this.
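A toy model of that concern, not a description of how MySQL Cluster actually commits: assume each node has to finish a network round trip plus a log write to disk before the app gets its answer. All the millisecond figures and names below are invented for illustration.

```python
def sync_commit_latency_ms(nodes, parallel=True):
    """Estimate commit latency when every node must acknowledge a write.

    nodes: list of (network_round_trip_ms, disk_log_write_ms), one per node.
    parallel=True means the coordinator contacts all nodes at once, so the
    slowest node sets the pace; parallel=False contacts them one by one.
    """
    per_node = [rtt + disk for rtt, disk in nodes]
    return max(per_node) if parallel else sum(per_node)


# Hypothetical three-node cluster on gigabit Ethernet with ordinary disks.
cluster = [(0.2, 8.0), (0.2, 8.0), (0.3, 12.0)]
fanout_ms = sync_commit_latency_ms(cluster, parallel=True)
chained_ms = sync_commit_latency_ms(cluster, parallel=False)
```

Even in the best case the commit can be no faster than the slowest node's disk write, which is why the interconnect alone doesn't fix the problem.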
Also, I'm interested to see what type of check code/algorithm they use to tell which NODE is healthy and which ones are corrupt (not dead, since dead servers are the easiest to detect). From their diagrams, it looks like N-way replication, so each node is an exact synchronized duplicate of all the others. But how do you know for sure which one is the "safe" one when corruption happens?
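One generic way to approach that question (purely a sketch, with no claim that MySQL Cluster does anything like it) is to checksum each node's copy and trust the majority. The node names and data below are made up.

```python
import hashlib
from collections import Counter


def find_suspect_nodes(replicas):
    """Return the nodes whose data disagrees with the majority copy.

    replicas: dict mapping node name -> bytes of the data that node holds.
    Raises ValueError if no checksum is held by more than half the nodes.
    """
    digests = {name: hashlib.sha256(data).hexdigest()
               for name, data in replicas.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(replicas) / 2:
        raise ValueError("no majority; cannot tell which copy is good")
    return sorted(name for name, digest in digests.items() if digest != winner)


copies = {"node1": b"row-data", "node2": b"row-data", "node3": b"row-dat4"}
suspects = find_suspect_nodes(copies)
```

Note the catch: with only two replicas that disagree there is no majority, so voting alone cannot say which one is the "safe" copy.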
Also, I wonder how they tackle gigantic inserts/updates like "replace into table2 select * from gigantic_table1". They can't assume or dictate that we only stick to small write transactions, right?
Cheap N-way synchronized replication is my holy grail, and probably most DBMS managers', so I'm crossing my fingers for MySQL to get this right.
The standard requirements for the node surprised me.
It states that you need 16GB of RAM!! Why do they say that? Doesn't the amount of RAM depend on the size of your database? If my InnoDB database file is only 3GB, why would I need more than 4GB of RAM?
Also, why the hell would you need SCSI drives for an in-memory database?
I mean, this is an enterprise-scale storage engine from the same engineering team that used to deride ACID transaction isolation and rollback as unimportant, and whose parser still silently ignores any attempt to use integrity constraints that aren't supported. Are these the right people to achieve the robustness that needs to accompany "five nines"?
Good to see MySQL develops so fast and its press releases are already hyped enough.
If the MySQL development team did clustering, perhaps now they could consider implementing stuff from numerous wishlists, mainly from here and here...
Also, what do they mean by "share processing"? Do they mean all databases are mirrors of each other, so a read can be served up by any node in the cluster? If that's the case, it also means any transaction can be served up by any node.
Last year there was a post about CJDBC, which allows you to create a cluster using a clustered JDBC driver. It's good that MySQL is getting some more advanced features. There's still a long way to go, but it's a step in the right direction.
For the lazy among you (and lazy you have to be to find the task of entering a few fields in a form exhilarating), I have uploaded a mirror of the MySQL Cluster white paper to another FTP site, which you may access there: mysql-cluster-whitepaper.pdf (the document is a PDF file, so fear the Adobe Acrobat Reader loading time).
"Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect" -- Linus Torvalds
Thank you for pointing out this new white paper, which I have once again mirrored and which you may find there: mysql-cluster-technical-whitepaper.pdf (181 KB in size).