High Availability Solutions for Databases?
An anonymous reader asks: "What would be the best high availability solution for databases? I don't have enough money to afford Oracle RAC or any architecture that require an expensive SAN. What about open source solutions? MySQL cluster seems to be more master/slave and you can lose data when the master dies. What about this Sequoia project that seems good for PostgreSQL and other databases? Has anyone tried it? What HA solution do you use for your database?"
While MySQL supports master/slave replication, MySQL Cluster specifically avoids that entire model. It's an entirely synchronous database storage engine. If you want master/slave, use postgres. If you want high availability and can handle the lack of a small number of features, MySQL Cluster is the way to go. The only real downside to the architecture required for CLuster is that all of the data is stored in RAM based tables. transactions are logged to disk every (configurable) time interval. If you're going to try for HA, you might want to RTFM on the available options before you settle on one.
The submitter of this question seems to have confused the two, Cluster and the older replication. Cluster does not in any way rely on a master/slave setup. Think of Cluster as RAID for databases, where you can lose a node (or more, depending on your configuration) before you lose your db. The current drawbacks of cluster are that it is in-memory and doesn't support certain features, such as fulltext indexing. Replication isn't going to cause you to lose data either if your application is designed to handle a situation where the master server (which you kick your writes to) hits the bricks. Have the app go into a read only mode from your slave.
Neither option is really "beautiful", though Cluster has a lot of promise for the future, especially in 5.1.
Have you looked into DRBD? It works kinda like RAID1 over a network. It uses 2 computers to store the database. Another computer acts as a heartbeat server. You'll need 3 NICs in the database servers; one for the connection to the network, one (gig-e preferably) for the connection between servers, and one for the connection to the heartbeat server.
http://www.drbd.org/
If you are smart, you'll play around with this on a test network or VMWare first. Get it all tweaked out and actually test it by killing a server while in mid-transaction to see if it works for you.
I'd rather you do it wrong, than for me to have to do it at all.
For commercial postgresql options, Afilias (who wrote Slony and uses it to power the
Hey, first, full disclosure: I work at Avokia. But we do have an availability solution that is cheaper than RAC (doesn't require a SAN) and combines the value props of both RAC and DataGuard. We virtualize the data layer enabling many identical databases to be in production serving your users. And you can put these databases into geographically distant datacenters. So you get a live-live-live... set-up without the need for manual conflict resolution that others require. Check it out at: www.avokia.com Wojtek (I'm the product manager).
http://www.radiantdata.com/
Radiant Data has a product called PeerFS which is a replicated filesystem (rw/rw, active/active) which allows you to also hold MySQL databases on it. You run 2 seperate MySQL servers pointing to the same data folder, and have it use POSIX locks for writes. The data is physicaly held on each server, and synced across the network.
I am testing it at work ( http://www.concord.org/ ) now for our websites. VERY easy to setup, but it supports MyISAM tables, and NOT InnoDB (or the other way around).
This does not solve IP failover, but there are other solutions out there for that. As a bonus we're holding our php webpages and MySQL databases files on the replicated PeerFS Filesystem.
-Ben
-=Down Syndrome in Maine