How To Set Up A Load-Balanced MySQL Cluster
hausmasta writes "This tutorial shows how to configure a MySQL 5 cluster with three nodes: two storage nodes and one management node. This cluster is load-balanced by a high-availability load balancer that in fact has two nodes that use the Ultra Monkey package which provides heartbeat (for checking if the other node is still alive) and ldirectord (to split up the requests to the nodes of the MySQL cluster)."
from the article:
"All data is stored in RAM! Therefore you need lots of RAM on your cluster nodes. The formula how much RAM you need on ech node goes like this:
(SizeofDatabase × NumberOfReplicas × 1.1 ) / NumberOfDataNodes
So if you have a database that is 1 GB of size, you would need 1.1 GB RAM on each node!
- The cluster management node listens on port 1186, and anyone can connect. So that's definitely not secure"
One of the major problems with a number of useful open source projects is that their names are terrible for the corporate environment. This is a perfect example of such a project. I sure wouldn't want to pitch the use of this software to a CTO or CIO, even if it was perfect for our database cluster, just because it is called 'Ultra Monkey'. That's just not a professional name, and this product will likely not be taken seriously.
... and they just stare at you and ask incredulously, "You had to do WHAT with the seat?"
This is for the MySQL Cluster storage engine, describing how to set up load balancing across two MYSQL servers that serve as front ends to it.
If you want code to set up a MYSQL cluster (cluster of MYSQL servers NOT using the Cluster storage engine) you might do something like looking at the PHP code in MediaWiki, which does application-level load balancing for reads (not writes) and has so far been tested to 25,000 or so queries per second. Or you could use the same director tools and skip using them for writes, implementing failover for the master server getting the writes.
The MySQL Protocol is proprietary.
o tice.html
The MySQL Protocol is part of the MySQL Database Management System. As such, it falls under the provisions of the GNU Public License (GPL). A copy of the GNU Public License is available on MySQL's web site, and in the product download.
Because this is a GPL protocol, any product which uses it to connect to a MySQL server, or to emulate a MySQL server, or to interpose between any client and server which uses the protocol, or for any similar purpose, is also bound by the GPL. Therefore if you use this description to write a program, you must release your program as GPL. Contact MySQL AB if you need clarification of these terms or if you need to ask about alternative arrangements.
http://dev.mysql.com/doc/internals/en/licensing-n
No MySQL sessions are carried over from failed node, so all MySQL connections will be dropped and need to re-establish again. For sessionless Apache or other fast and short connections may make this very useful, but for application service with shared connection pool, a DNS RR with node check makes better choice rather than elaborate setup using lb nodes in front of MySQL cluster due to +2 lb nodes administration overhead. And also because of arp problem due to MAC on loopback broadcasting, the article forgets to mention;
/etc/sysconfig/network-scripts/ifcfg-eth0 and ifcfg-lo:x
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1
Should take care of all interface arp announce/ignore. Doesn't need to set net.ipv4.conf.eth0.arp_ seperately as the article provides. Or easier way to control it is to set ARP=no on
Personally I rather have Active/Active lb nodes in front and carry over sessions from failed node to active node transparently by using shared memory space to replicate all session info. OpenMosix comes to mind.
"Don't let fools fool you. They are the clever ones."
Of course that leaves out the very little loophole that you're only bound by the GPL when you release a shipping version of your software. Other than that you're under absolutely no obligation to release your code if it's only used internally.
Anyone have a guide for doing this kind of thing with postgresql?
I am trolling
Would this work for my MS SQL server? I don't like MySQL now that they're associated with SCO.
So, by your definition:
Microsoft = good, MySQL = bad
You must be Ballmer's son/daughter. Would you like to win a free iPod? I promiss not to tell your dad about it.....
For the perfect anti-Unix, write an OS that thinks it knows what you're doing better than you do and let it be wrong.
I know this is an unrelated question, but does anyone know if it's even possible to use SSL with MySQL (with the windows binaries). I ask cause i haven't found any documentation anywhere on this subject.
MABASPLOOM!
PostgreSQL supports these capabilities natively, without needing to resort to third-party packages. The best place to learn about setting it all up is from the PostgreSQL docs. Between the FAQ and the manuals, you should have more than enough to get you started.
Just because they say it is? I'm not trolling, I really want to know.
Because this is a GPL protocol, any product which uses it to connect to a MySQL server, or to emulate a MySQL server, or to interpose between any client and server which uses the protocol, or for any similar purpose, is also bound by the GPL.
I think this is the heart of my confusion about MySQL. That you can GPL a protocol.
I thought, IANAL, that you could legally reverse-engineer or emulate an interface for interoperability purposes. For example, I thought that's why Samba hasn't been sued by Microsoft for emulating their SMB protocol. So how can MySQL claim you can only use their protocol under the GPL. Of course, they're the ones telling you that, I haven't heard a real attorney's opinion.
I mean kudos to MySQL for developing and releasing their database under the GPL and kudos to RMS for coming-up with the GPL in the first place, but is this a legally binding application of it? I wonder.
Properietary AND GPL? Care to explain how can it be both?
Microsoft deals with SCO as well. Now what are you going to do?
Don't judge a company strictly on their business relations. I agree SCO is 'bad' but that doesn't mean we need to lump their current customer base in with them. Hence other companies still work with SCO to help support the customers they have in common.
Anyone have an equivalent HowTo for Postgres clusters?
And while I'm getting everything in life I casually ask for, where's the SW that automatically swaps out a MySQL install and replaces it with Postgres, including revising source code that calls/queries the DB, or just uses a MySQL installation as a proxy replica for Postgres nodes in a mixed cluster?
--
make install -not war
I use a Monkey Management Technique for my mySQL servers. If one server goes down, I have a monkey that jumps up and down at the datacenter and gets real mad until someone fixes it.
.. Works like a charm!
But anyway, how can MySQL's association with SCO put you off, if MS SQL's rather closer association with Microsoft doesn't? Beyond that, I concede your points: I know little beyond the four basic functions and phpMyAdmin, and I'm content in my ignorance.
Just because they say it is? I'm not trolling, I really want to know.
You can copyright source code, but not the rules of a protocol. You'd have to use a patent for that. Since the GPL is all about copyright and not patents (yet), what they say is probably nonsense.
The only construction that I can think of that would make the protocol governed by the GPL (though not for the reasons MySQL AB states) is if the protocol requires the transmission of some text that is itself copyrighted under the GPL *. I'm not familiar with the MySQL protocol, so I couldn't say if it does any such thing. I'd bet against MySQL AB on this issue, though.
*) As an aside, some company used a similar construction for an anti-spam solution, whereby a copyrighted haiku had to be sent in the mail header in order to circumvent the filter. This haiku wasn't licensed to spammers, automatically making them copyright infringers if they included it.
this may come in handy
I heard that you need to have enough RAM on the SQL nodes to hold your entire database on each cluster machine so that NDB will work. I did not see this info in the front page. I think it's critical to know.
mysql 5 has stored procedures, idiot!
My understanding was that your obligation to release under GPL was based on your intended usage of said modifcation. For example, if I develop a hack to this, but intend only for internal/private use, then I'm under no obligation to hand this out publicly.
The only time I would be obligated to submit this under public GPL would be if I intended to redistribute it.
Is that not correct?
postgres has no clustering support. It has limited replication support from a project called slony (http://gborg.postgresql.org/project/slony1/projdi splay.php).
The mysql clustering is pretty much bunk anyways. All of the DBs need to live in RAM. What happens if you have a catastrophic UPS failure?
Clustered databases using JDBC
How in hell can this be moderated "+ Interesting". It is either "- absolute ignorant" or "- blatantly trolling".
1/ Using "GPL" and "proprietary" in the same sentence is blatant trolling.
2/ Talking about using a license on a protocol is absolute ignorance.
Now for the facts:
If you are not linking to or otherwise using GPL'ed code within your sources, you can license however you want to. The GPL has nothing to say about it (it is not an opinion; you just read the f* license).
"Because this is a GPL protocol"
There ain't anything like a "GPL protocol". Therefore the rest of the sentence if pure nonsensical.
Since MySQL provided client access libraries *are* GPL, you only can use them (ie: linking to them) from GPLed programs, that's true. But the MySQL access protocol is publicly known. Nothing refrains you to write a different client access implementation... with whatever license you like it. I challenge you to probe otherwise.
The text I posted is a direct copy from the official MySQL site. If you click the underlined text it will take you to their site.
It isn't my fault that MySQL believe it.
So, following that logic, how is it possible to "redistrubute" your "hacks" to the protocol itself? I think it isn't. I think MySQL is full of baloney.
SCO has an operating system. MySQL has a database product. MySQL needs operating systems to run it's databases. Just because MySQL has certified that it runs on SCO OpenServer and that SCO says MySQL runs fine, doesn't necessarily one supports all the actions of the other.
I find it funny that you don't want to use MySQL because of their business practices in partnership with SCO, Yet you inquire about MSFT which could have even shadier business practices with SCO, and even excluding SCO, has a track record of being less then friendly to the open source community.
> The mysql clustering is pretty much bunk anyways. All of the DBs need to live in RAM. What happens if you have a catastrophic UPS failure?
... unfortunately no one's seriously built up that infrastructure.
It was designed for read-only telecom use, with real time writes from "upstream" propogated down. Basically, a cache. Local writes aren't super-critical in this case, because whatever causes a failure to write probably takes out the rest of the node anyway.
It's not really a general purpose SQL database, it just happens to let mysql provide some front-end services. Personally I consider it a drag on an otherwise decent distributed cache engine (mysql isn't exactly synonymous with bulletproof) but I suppose it does gain some exposure.
Postgres provides a MUCH more flexible infrastructure than MySQL for doing real clustering and replication
How about CARP from the BSD project? Pretty darn simple compared to Ultra Monkey.
Plus then you can say "Our machines handle failover at the IP level through Common Address Redundancy Protocol".
It looks like whoever actually wrote that note on the MySQL site does not know how GPL licenses work. It's a violation of the GPL to emulate the protocol? Nonsense, unless there are patents on it.
Mysql may believe this (of course, they believed foreign keys were unnecessary up until pretty recently) but they aren't right, at least, not under the terms of US copyright law.
You can't copyright a protocol. You can copyright the literal words of their description of the protocol, but that in now way binds you in terms of the license. You can't copy their description into a book without permission, but I doubt that the description is licensed under the Gnu FDL.
If you lift their client app and use it to talk to a mysql database then you have a problem; but if you reverse engineer the protocol without doing a literal cut and paste, then they have no leg to stand on.
IANAL, but apparently neither is the hoof-handed choom at Mysql that wrote that whopper.
What a strange bird is the pelican, his beak can hold more than his belly can.
Can you please provide a link to the open, published specification? I'd like to see it so that I know that anyone can implement that spec under whatever license they feel most comfortable with.
/ 12572/OpenDocument-v1.0-os.pdfl .html
c ol.html
An example of a truly open standard is:
http://www.oasis-open.org/committees/download.php
Or also, the PostgreSQL wire protocol:
http://www.postgresql.org/docs/8.1/static/protoco
The MySQL wire protocol is proprietary. You would have to reverse engineer the product to create code under a different license. They did that on purpose.
Note, there are documents in existence, such as:
http://www.redferni.uklinux.net/mysql/MySQL-Proto
That is someone who read the source files and produced a spec for a hypothetical re-implementation. The problem with that is that MySQL has made no promises to adhere to that spec, so it's quite possible that MySQL could manipulate the protocol to break 3rd party implementations without breaking their own. And nobody would have a standard document that they could use to file a bug report against. That's the definition of a proprietary protocol.
Social scientists are inspired by theories; scientists are humbled by facts.
It's proprietary, and you can reverse engineer it.
You could use the reverse engineering argument for MS Word documents as well. However, Microsoft can make subtle changes to the format designed to break 3rd party implementations, or even break their own old versions. MySQL can do the same, because since there is no standard document for the protocol, nobody can file a bug report regarding a 3rd party implementation.
Social scientists are inspired by theories; scientists are humbled by facts.
If you have a catastrophic UPS failure you wait for the power to come back up and reload from the disk copy. MySQL Cluster regularly saves the data to disk. Or you use two places that are on different power setups and UPS systems. Or you shut down while you stil have UPS power. Or you switch to your alternate data center and its copy of the cluster.
It was designed for fast distributed writes as well For example, for the call records of phone calls and similar session information. It's being used for high rate data capture.
You're right that the Cluster storage engine isn't a database. It's a storage engine and that's what it's supposed to be.
Holy shit!. I just laughed so hard I scared the dogs. Thanks. BTW, the garbage disposal seems like the next step.
WARNING: Smoking this sig may cause lowered IQ, insanity or short term memory loss. It is also really bad for your monit
"Postgres provides a MUCH more flexible infrastructure than MySQL for doing real clustering and replication ... unfortunately no one's seriously built up that infrastructure."
Yea, I would love multi master replication or even replication over unreliable links. Too bad I can't have either with postgres. Master slave over a local LAN is all you get really. Good for failover but not much else. Having said that failover is better then nothing!
evil is as evil does
This seems very risky.
A very affordable -- and far safer -- alternative may be to consider Sybase's "Adaptive Server Anywhere" database engine. (There's a demo version, which is available for free for development and educational use; or, a single license costs $400, less if ordered in bulk.) ASA is a high-performacne, SQL99, ACID-compliant DB that runs on {Windows,Linux,MacOS,etc.}, and can also run on handheld devices, like Blackberries (through a technology Sybase calls "Ultralite").
It has remote clustering ability which is far more robustly-designed than the tacked-on support offered by MySQL.
That's the stupidest fucking thing I've ever read.
How in hell can this be moderated "+ Interesting". It is either "- absolute ignorant" or "- blatantly trolling".
1/ Using "GPL" and "proprietary" in the same sentence is blatant trolling.
2/ Talking about using a license on a protocol is absolute ignorance.
You do realize that it is MySQL making the claim to which you are objecting, right?
Since MySQL provided client access libraries *are* GPL, you only can use them (ie: linking to them) from GPLed programs, that's true. But the MySQL access protocol is publicly known. Nothing refrains you to write a different client access implementation... with whatever license you like it. I challenge you to probe otherwise.
You do realize that it is MySQL making the claim that you can't, right?
You must be new here
How many beans make five, anyhow ?
If it's GPLd, it's Free and definitely _not_ proprietary.
Oolite: Elite-like game. For Mac, Linux and Windows
We are using Continuent's m/cluster. Costs a little bit (still loads cheaper than Oracle) but so far delivers the goods with little setup or maintenance. It is for a web application so read performance is where it's at. It distributes the writes across all the nodes, so write performance actually takes a slight hit, but read performance is load balanced. We are running three dual processor nodes and the performance is fantastic. It's shared nothing, so all data is on disk on all three machines. Supposedly we can add up to 12 more nodes (just buy the licenses) to dramatically increase our read performance. We looked at all the different open source clustering solutions and they were still a little too "roll your own" for our production environment.
1) Data is logged and checkpointed.
So even in case of "a catastrophic UPS failure" data will be there once your power is back.
2) In 5.1 the all data in RAM limitation is removed.
(Except for indexes which still has to reside in RAM)