MySQL Clustering Software Launched
lawrencekhoo writes "MySQL AB announced yesterday that software for building a
MySQL Cluster
will be available for download by the end of April. Articles available from
Computerworld,
Internetnews,
Linux Electrons,
and PHP Architect.
Great! Now my website can finally have 99.99% availability ..."
Here are some direct links to more information:
Oh, and they say availability is 99.999%, not just 99.99%
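To put those two figures side by side, here is a quick back-of-the-envelope sketch of the yearly downtime each one allows. It assumes a 365-day year and reads "availability" as plain uptime percentage; the function name is made up for illustration.

```python
# Downtime budget implied by an availability percentage, per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes of downtime per year allowed at this availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.99, 99.999):
    print(f"{pct}% -> {downtime_minutes_per_year(pct):.2f} minutes/year")
```

Roughly 53 minutes a year at four nines versus a little over 5 minutes a year at five nines.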
It's PR. Remember, The SCO Group is "a leading provider of UNIX-based solutions", per many of their press releases. It doesn't make it any more acceptable, it's just a tactic. Chill.
This sig no verb.
It really depends on what the meaning of is is. Does popularity mean that it is the most used, or the most liked? I would think that popularity and usage are different metrics.
I remember someone developing a rather advanced multi-master replication and clustering system for PostgreSQL. Does anyone know how far along that project is? Has it entered the testing phase yet?
From what I've read it looked very, very promising, but it doesn't do much good if it's on paper only...
Apples to oranges. The press release should have been more specific than just "database", but still... Berkeley DB is not a "database" as most developers think of the term (relational, accessible using SQL, etc.).
Berkeley DB is code that manages a data store, and you access the data using method calls within your app (you compile their code with your project), NOT using SQL, and NOT by connecting to an independent application. Remote access n/a, no ODBC or JDBC, etc. Great product, but a completely different animal from MySQL and other relational databases.
In fact, MySQL used to offer Berkeley DB (as opposed to InnoDB, etc.) as a data storage option WITHIN the MySQL product.
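To make the difference concrete, here is a small sketch of the two access styles. It uses Python's standard-library `dbm.dumb` as a stand-in for a Berkeley DB-style embedded key-value store and `sqlite3` as a stand-in for the SQL interface; neither is Berkeley DB or MySQL. Note that SQLite also runs in-process, so it only illustrates the "you talk SQL" half of the contrast, not the separate-server half.

```python
import dbm.dumb
import os
import sqlite3
import tempfile

# Embedded key-value store: the app calls library methods directly on keys.
# (dbm.dumb is only a stand-in for a Berkeley DB-style library.)
with tempfile.TemporaryDirectory() as tmp:
    with dbm.dumb.open(os.path.join(tmp, "store"), "c") as kv:
        kv[b"user:1"] = b"alice"
        kv_value = kv[b"user:1"]

# Relational store: the app hands SQL text to an engine that parses and runs it.
# (sqlite3 is only a stand-in for the SQL interface of an RDBMS.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "alice"))
sql_value = conn.execute(
    "SELECT name FROM users WHERE id = ?", (1,)
).fetchone()[0]
conn.close()

print(kv_value, sql_value)
```

In the first half there is no query language at all, just get and put by key; in the second, everything goes through SQL statements.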
There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.
If this is the required deployment, then for people like us, whose database is over 20GB (and yes, the big blobs are already stored compressed), this would not be economically practical to use. Factoring in the OS and caching, I need to get 22GB of memory for each node? Last I checked, 2GB chips are still nasty expensive.
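The arithmetic behind a figure like that can be sketched as follows. It assumes every node keeps a full copy of the data in memory and that the OS and caching add about 10% on top; both assumptions, and the function names, are illustrative and not taken from MySQL's documentation.

```python
import math


def ram_per_node_gb(db_size_gb: float, overhead_fraction: float = 0.10) -> float:
    """RAM per node if the whole database sits in memory plus some overhead."""
    return db_size_gb * (1 + overhead_fraction)


def modules_needed(db_size_gb: float, module_gb: float = 2.0) -> int:
    """Number of memory modules of the given size needed per node."""
    # Round first so floating-point noise does not push us up a module.
    return math.ceil(round(ram_per_node_gb(db_size_gb), 6) / module_gb)


print(f"{ram_per_node_gb(20):.1f} GB per node, {modules_needed(20)} x 2GB modules")
```

Under those assumptions a 20GB database works out to 22GB, or eleven 2GB modules, in every single node.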
You know you're a Database geek when you see the headline and immediately think: "Ah hah! Clustered indexes! That'll save some time during joins! Oh. Wait. They're talking about boxes. Drat."
Nowhere did they mention battery-backed RAM modules as a recommended config, so I believe you're correct to assume that disk not only has to be used, but MUST be used.
Without RamSan-style battery-backed RAM, there is no way any enterprise would trust clusters of any kind to RAM-only storage for write commits.
Looks like each write transaction will be synchronized across all nodes, which would explain the gigabit and lower-latency interconnects. Still, this is crazy complex to make fast and reliable.
So to make it truly synchronized, they have to write to disk, for backup/log, before committing the data to RAM. So regardless, writes are slow, and I'm waiting to see how they bypass this disk write commit latency. Add to that the fact that they have to do this for all nodes before responding to the app, and writes are crazy slow, relatively speaking, since they can influence indices, force flushes of cached/RAM-resident data, etc. It would be interesting to see how they handle this.
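A toy model of that worry, assuming the coordinator sends the write to all nodes in parallel and cannot answer the app until the slowest node has flushed its log to disk. This is only a sketch of the reasoning above, not MySQL Cluster's actual commit protocol, and the timings are invented.

```python
def sync_commit_latency_ms(nodes):
    """Latency of a fully synchronous commit.

    nodes: list of (network_round_trip_ms, disk_log_flush_ms), one per node.
    The commit completes only when the slowest node has acknowledged.
    """
    return max(rtt + flush for rtt, flush in nodes)


def local_commit_latency_ms(disk_log_flush_ms):
    """Latency of a commit that only waits for the local disk flush."""
    return disk_log_flush_ms


# Hypothetical three-node cluster; all numbers are made up.
cluster = [(0.2, 5.0), (0.3, 8.0), (0.2, 6.5)]
print("synchronous:", sync_commit_latency_ms(cluster), "ms")
print("local only: ", local_commit_latency_ms(5.0), "ms")
```

Every write pays for the worst node, which is why the interconnect and the disk flush both matter so much.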
Also, I'm interested to see what type of check code/algorithm they use to tell which node is healthy and which ones are corrupt (not dead, since dead servers are the easiest to detect). From their diagrams, it looks like N-type replication, so each node is an exact synchronized duplicate of all the others. But how do you know for sure which one is the "safe" one when corruption happens?
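One generic way to approach that question is to checksum each replica and trust the majority. The sketch below assumes every node holds byte-identical data, as in the diagrams described above; it is a textbook majority vote, not anything MySQL has said it does, and the function name is made up.

```python
import hashlib
from collections import Counter


def find_suspect_replicas(replicas: dict) -> list:
    """Return the names of replicas whose checksum differs from the majority.

    replicas maps a node name to the bytes that node holds.
    Raises ValueError if no checksum is shared by more than half the nodes.
    """
    digests = {
        name: hashlib.sha256(data).hexdigest() for name, data in replicas.items()
    }
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(replicas) / 2:
        raise ValueError("no majority; cannot tell which replica is correct")
    return sorted(name for name, d in digests.items() if d != majority_digest)


print(find_suspect_replicas({"node1": b"rows", "node2": b"rows", "node3": b"r0ws"}))
```

With only two nodes that disagree there is no majority, so the vote cannot say which one is the "safe" one, which is exactly the hard case.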
Also, I wonder how they tackle gigantic inserts/updates like "replace into table2 select * from gigantic_table1". They can't assume or dictate that we only stick to small write transactions, right?
Cheap N-way synchronized replication is my holy grail, and probably most DBMS managers', so I'm crossing my fingers for MySQL to get this right.
I think they're using "database" here to mean RDBMS. Technically a database is
just anything that organises data, so a filesystem would count, but that's not
how the term is generally used. Usually these days when people say database
they mean RDBMS.
The other thing is, most installs is not the only reasonable measure of
popularity. I'm pretty sure more people have daily interaction with MySQL
than with Berkeley DB directly. Berkeley DB is installed so widely because
it's been around longer and because certain key pieces of software depend
on or use it for historical reasons, not because people like it better.
Note that I'm not trying to say Berkeley DB is bad or anything, or that MySQL
should replace it; they're really quite different things, and they exist for
different purposes and fill different niches. I wouldn't consider them to be
direct competition really -- well, not mostly. MySQL is in competition with
PostgreSQL mainly, and to a lesser extent the major commercial database
offerings (Oracle, MS SQL Server) and various lesser-known projects (e.g.
Firebird SQL). Berkeley DB competes with, I think, certain GNU libraries and
maybe some other things I'm even less aware of. Not that MySQL and Berkeley
DB are in _completely_ different worlds; they both might reasonably be said
to compete on some level with SQLite for example, so there is some overlap
between their areas of application. But still, they're mostly not really in
the same category.
Sure, they're both databases. But to say one is more popular than the other
is like arguing whether traceroute is more popular than Mozilla. They are,
after all, both internet software.
Cut that out, or I will ship you to Norilsk in a box.
The standard requirements for the node surprised me.
It states that you need 16GB of RAM!! Why do they say that? Doesn't the amount of RAM depend on the size of your database? If my InnoDB database file is only 3GB, why would I need more than 4GB of RAM?
Also, why the hell would you need SCSI drives for an in-memory database?
I mean, this is an enterprise-scale storage engine from the same engineering team that used to deride ACID transaction isolation and rollback as unimportant, and whose parser still silently ignores any attempt to use integrity constraints that aren't supported. Are these the right people to achieve the robustness that needs to accompany "five nines"?
For the lazy among you (and lazy you have to be to find the task of entering a few fields in a form exhilarating), I have uploaded the MySQL Cluster white paper to another FTP site. You may access the mirror of the file there: mysql-cluster-whitepaper.pdf (the document is a PDF file, so fear the Adobe Acrobat Reader loading time).
"Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect" -- Linus Torvalds