High Availability Solutions for Databases?
An anonymous reader asks: "What would be the best high availability solution for databases? I don't have enough money to afford Oracle RAC or any architecture that require an expensive SAN. What about open source solutions? MySQL cluster seems to be more master/slave and you can lose data when the master dies. What about this Sequoia project that seems good for PostgreSQL and other databases? Has anyone tried it? What HA solution do you use for your database?"
Don't buy the RAC hype. I've seen too many misperformant RAC clusters that Oracle couldn't fix to save their life (and no, they weren't all bad vendor configuratins either).
While MySQL supports master/slave replication, MySQL Cluster specifically avoids that entire model. It's an entirely synchronous database storage engine. If you want master/slave, use postgres. If you want high availability and can handle the lack of a small number of features, MySQL Cluster is the way to go. The only real downside to the architecture required for CLuster is that all of the data is stored in RAM based tables. transactions are logged to disk every (configurable) time interval. If you're going to try for HA, you might want to RTFM on the available options before you settle on one.
We chose to go with a Master / Slave option which basically gave us failover within 3 seconds. Any more fine grained monitoring and the CPU performance on the slave gets pretty high. Not ideal, but probably the best option that uses MySQL when you don't want to be tied so much to one platform.
1;
Just use .NET's XML serialisation on a DataSet to use an offline file a high availability server which has a RAID-5 array. Ok... so that's not really high availability, simple or really sensible but it would be a bit of fun.
.mdb files (created using Jet, it's free) updated asynchronously using a few COM objects in Transaction Server? Well, it wouldn't work but the idea is pretty cool. Well actually it isn't really but you know.
How about you have a few Access
Or just write down all of your data, photocopy it and give the copy to your Mum (that's Mom, to all you yanks.. (why are you called "yanks"?)). And let's face it, your Mum, oops sorry Mom, is always itching for your phone call. Maybe you should call her more often?
It's ok, I think I'm hilarious but know I'm not.
If you did go the Oracle route, you might take a look at Data Guard. It may be a cheaper option than RAC. It lets your site run in an active/standby mode. I'm not a DBA, but I think you can configure Data Guard with different levels of reliability. Guaranteed synchronization is one of those levels. I'm thinking that it would let you automatically fail over to the standby site if the active site goes down, but you'd have to look at the docs for sure. The nice thing about this route is that you could have a remote disaster recovery site. BTW, "master/slave" is not "politically correct." I have actually heard of people having to reword documentation becuase of some policy that said those terms couldn't be used.
You can actually feel it gripping!
It's called a filing cabinet. It's got full text search and an easy to use index. Although it has 99.9999% availability (it's blocked by crap stacked in front of it the other 0.00001%), it's a bit difficult to make backups without access to the office copier, two toner cartriges and 20 boxes of paper.
It's odd that all these people are answering without hearing a thing about your application. How big is the db? How often is it written? How often is it read?
For example, we run a site with data from a thousand odd different data sources, with each source getting updated every hour or so. We do it by parsing the data into static pages. We we receive a datum, we rebuild the pages that depend on it.
We have another site that runs off an Oracle db. the static page site runs about 90x faster, and is basically in memory (disk access is nil.) Now take into account that we can (and do) replicate the static page solution with zero load, we get to a solution that is literally 900x faster.
Now folks are thinking 'oh, the horror!' well... tough! There is no substitute for thinking about your data, and how it flows. A DB is not a given, but a (potentially wrong) answer to a question after you have done some analysis.
Hello, anonymous Sequoia promoter seeking free advertising. (BTW, You might try picking a product name that normal people can spell without thinking about it).
Your solution is not database clustering, and should not be advertised as such. It's more a long the lines of a database connection proxy which supports multiple simultaneous backends and operating on them in parallel, with some added features to make HA-like solutions relatively easy.
The downside of this style of approach, as opposed to an architecture of the likes of Oracle "RAC", is that it doesn't scale up as you add backend nodes (at least not for writes, but in any case for read-only scaling there are simpler solutions for all of the vendors, even the free ones), and it must have limits on how many transactions it can backlog and replay to a temporarily-unreachable/down server before that server has to be re-synced from scratch in order to catch back up (and I have to wonder if there's really any real-world scenario under real transaction load in which the practical net effect wouldn't be a complete resync of a backend server anytime something goes wrong with it, in which case one could throw out any attempt to backlog transactions for a single failed server and just keep things simple - you fall out of sync, you resync).
The open source world really needs a RAC-like solution for PostgreSQL and MySQL (I'm a fan of friendly open-source competition, so while my personal preference is PostgreSQL, I hope both projects stay current and popular for many years to come). Unfortunately there is unlikely to be a generic way to do this, it will probably have to be re-invented for each database project.
I took a brief look around PostgreSQL's guts a while back, and it actually seemed like the architecture they use isn't far off from something RAC-capable to begin with, just nobody's quite buttoned together a few peices here and there to make it happen. Basically on SMP multiple co-operating backends already serve parallel requests and synchronized on a shared memory cache. There's patches out there for the linux kernel to support network-synchronized distributed shared memory. Put two and two together, and what do you get? Something not far off from a first-pass hack at a RAC-like network-distributed database caching system. Most of the other details are easy to solve (start/shutdown, join/leave cluster, tracking of processes across the cluster, etc), or belong in another problem domain (implementing shared storage filesystems (hey, we have GFS, Oracle OCFS, etc available...)). One of the biggest issues would be multiple nodes all having pg "Writer" processes. The first step would probably be to put the writer on one node and failover the writer functionality when that node dies, to be quickly replaced by a scheme whereby multiple writers can work by synchronizing through a distributed lock manager (there are already dlm modules available for linux). Then there's the issue of making the current distributed shared-memory patches do the right thing performance-wise for this kind of usage, and so on. It's not easy, but it's not outside the realm of possibility.
11*43+456^2
There are several options on 'master/slave' that can be done. The easiest invovles shared storage (2 boxes tied to 1 disk controler... box A goes down, box B notices, imports the disks, mounts them, starts the DB, and you're back. You only loose any transaction that was 'in progress' at failure time.
Any time you add HA to something, you're adding complexity, and usually a fair bit of it. That's a trade off you need to consider (as is the extra price for software/hardware and support for the solution).
Zapman
The submitter of this question seems to have confused the two, Cluster and the older replication. Cluster does not in any way rely on a master/slave setup. Think of Cluster as RAID for databases, where you can lose a node (or more, depending on your configuration) before you lose your db. The current drawbacks of cluster are that it is in-memory and doesn't support certain features, such as fulltext indexing. Replication isn't going to cause you to lose data either if your application is designed to handle a situation where the master server (which you kick your writes to) hits the bricks. Have the app go into a read only mode from your slave.
Neither option is really "beautiful", though Cluster has a lot of promise for the future, especially in 5.1.
Have you looked into DRBD? It works kinda like RAID1 over a network. It uses 2 computers to store the database. Another computer acts as a heartbeat server. You'll need 3 NICs in the database servers; one for the connection to the network, one (gig-e preferably) for the connection between servers, and one for the connection to the heartbeat server.
http://www.drbd.org/
If you are smart, you'll play around with this on a test network or VMWare first. Get it all tweaked out and actually test it by killing a server while in mid-transaction to see if it works for you.
I'd rather you do it wrong, than for me to have to do it at all.
High availability is NEVER as highly available as on paper...
*sob*
"I used to have that really cool,funny sig
For commercial postgresql options, Afilias (who wrote Slony and uses it to power the
A typical transaction: ..
Phone: Ring Ring
Son: BEGIN TRANSACTION!
Mum: Oh, hi dear. How have you been? Did you know that
Son: *cough* Work, remember?
Mum: Yes, dear.
Son: WRITE_DOWN into notepad ADDRESSES. fname='John'. sname='Smith'
Mum: Wait, i have to look up the next identity number thingie. Oooh, that's a big number, I can't even recite it.
Son: Hush! address='12 Rover Roa..
Mum: Oh, dear. My pencil broke.
Son: Ok, Roll back!
Mum: That's fine for you to say. I don't have any erasers.
You want the "best" HA solution but not something too expensive. How about you give us something more to go on, like how much are you willing to spend, how much downtime you can tolerate in the event of failure, are there space/power constraints, etc. Then people can give you a real recomendation instead of the standard MySQL sucks/is great.
MySQL is the ONLY available 'shared nothing' (tm) solution available
Not quite. Check out the shared nothing architecture of DB2 Universal DB. You can get DB2 for AIX, Windows, Linux, and other platforms. UDB offers a High Availability Disaster Recovery solution.
The Big News Page
Hey, first, full disclosure: I work at Avokia. But we do have an availability solution that is cheaper than RAC (doesn't require a SAN) and combines the value props of both RAC and DataGuard. We virtualize the data layer enabling many identical databases to be in production serving your users. And you can put these databases into geographically distant datacenters. So you get a live-live-live... set-up without the need for manual conflict resolution that others require. Check it out at: www.avokia.com Wojtek (I'm the product manager).
http://www.radiantdata.com/
Radiant Data has a product called PeerFS which is a replicated filesystem (rw/rw, active/active) which allows you to also hold MySQL databases on it. You run 2 seperate MySQL servers pointing to the same data folder, and have it use POSIX locks for writes. The data is physicaly held on each server, and synced across the network.
I am testing it at work ( http://www.concord.org/ ) now for our websites. VERY easy to setup, but it supports MyISAM tables, and NOT InnoDB (or the other way around).
This does not solve IP failover, but there are other solutions out there for that. As a bonus we're holding our php webpages and MySQL databases files on the replicated PeerFS Filesystem.
-Ben
-=Down Syndrome in Maine
Redundant Array of Independent Databases; dunno if it went anywhere, but it was a very cool idea.
You don't need a SAN to run Oracle RAC - it will work using a shared firewire drive apparently.
RAC is an absolute ******* to install however. The heavily bugged Oracle Installer being the cause of most of these issues.
Incidentally for people experiencing blue screens when building a cluster with Windows Server 2003 Enterprise on Dell 2850 servers with AX100 SAN with Oracle's ASM with QLA200 fibre channel cards - flash the QLA card firmware.
And before anyone asks, I was going to use Linux, but it was actually cheaper to use Windows rather than fork out for RHEL / SUSE licenses. Yes really.
m/Cluster is a fine solution for MySQL clustering, and it's almost reasonably priced. Be preapred not to use it on RedHat EL4 for a bit while they work out rpoblems with RH's kernel tweaks, but it works really well for our large LAMP and e-mail system.
- Dan
High availability is another one of those marketing buzz words that really doesn't have a good, nailed down definition.
t ions/databases/postgresql.htm
You can acheive this in three basic ways. Each has their own pros and cons. I recommend that you weigh them out and come to a decision you think you can live with.
Clustering - You have a group of servers (physical hardware) each running the same software and working to stay synched up with each other. Now clustering comes in two flavors active/active and active/passive. The active/active clusters share the requests between them. The active/passive clusters wait for a node to fail and then another node in cluster assumes the active role.
Master/Slave - This is similiar to clustering in that you have a group of servers (physical pieces of hardware) each running copies of the software. The master does not service requests directly and operates only as a central repository replicating all the data to the slaves. The slaves synch with the master. If one slave fails, there is normally some means for another slave to take over. Note: While this is old and largely deprecated, it is still quite functional and cost-effective under certain circumstances.
Load-Balancing - There are several physical pieces of hardware who each process transactions all the time. All updates/writes/commits are sent to all the servers, but reads are serviced only by one. Note: Again, this is deprecated, but can still be quite functional in specific circumstances.
My experience has been that typically all that is needed for clustering to work is some shared drive space for the various instances in the cluster to manage their own internal bookkeeping. In all honesty, there are several factors here that would make a big difference in what I would recommend for you. One - how much data are you talking about storing? Two - What are you using it for? If this is driving your web site, a second or two of latency while something fails over won't be noticable to the end user. If you're transmitting medical records to an EMT team, it might be fatal. Three - Why are you looking at HA? What need are you trying to fill?
2 cents,
Queen B.
Links for you to consider -
http://www.linuxlabs.com/clusgres.html
http://www.openminds.co.uk/high_availability_solu
PS: Yes, I like Postgres. It supports foreign key relationships out the box (ummm...*R*DMBS, anyone?) It also doesn't force me to put in unnecessary indexes to use fk's. I will say that MySQL has managed to address some of my previous complaints about not supporting views, stored procedures or triggers. I haven't had time to test their version 5 for now, so I'll be silent as I have no opinion on their implementation.
HDGary secures my bank
Cheaper than RHEL or SuSE or Win2k3 Server and they have had practice with some massive systems.
Got time? Spend some of it coding or testing
A couple of points missing from the above setup:
- each and every device must have double power sources, linked to 2 separate power grids, possibly by different energy vendors (we just had the power failing at airport XXX server room last week, right after the cable guys had spent 6 months convincing us to migrate all servers to the 'new' power lines)
- the entire network path from server to end users should be duplicate. What about the super-solid cluster lan being connected to end-user lan via a single pof (e.g. firewall)?
Jokes aside, real HA is real hard. There's always a little single point of failure hidden somewhere that you forgot. And gettim them all duplicate costs $$$$ and then $$$$ more
try this http://www.intersystems.com/
Extended System's has a pretty good client-server high performance database server called Advantage. I guess they just released a new 8.0 version. Its not open source and but its affordable, gives you whole lot of replication and backup featurs..and support for various clients. herez is their url : www.advantagedatabase.com
If you are dealing with a small db or relatively light transactions you could setup real time replication or some other type of of rapid change transfer system depending upon DB vendor. But if the change is rapid to the database you will need some sort of shared storage. There are storage solutions for under 10000 that you can purcase and connect multiple servers to. If your data isn't worth that amount to spend you don't need a cluster.
I use two 16 way solaris boxes in different data centers, as a veritas cluster. The data centers are vlan'd so that they appear on the same subnet. Out of the bak of each sun are triple fiber channel routes to an EMC symmetrix disk array running SRDF. Oracle (9i) is active on one box, when the box fails, the cluster starts it up on the other box, complete to the last committed transaction.
failover has happened one or twice, and we didn't notice for a couple of days that it had happened.
HA could also mean hardware redundancy.
Most homebuilt computers now have motherboards that supports
built in RAID-5 support. It's an easy way to provide not just
database HA, but system stability as well.
Just string a couple of SATA-II drives together and activate
RAID support on a motherboard like LanParty from DFI.
RAID-5 is more robust and has higher survival rate in case of
hard-drive failures.
most ppl here tend to think in smale scale desasters. Master/Slave is unuseable for any serios HA database because of latencys and the single-point-of-failure. Several other solutions like GFS suffer the same problem. Our System is distributed in 6 centers over the whole country. the big trick is *not* to use fancy solutions. they drive you mad. Split the write as early as possible so data get early to the backup db.
Identify your need; alot of money and sweat is wasted in accuracy that nobody needs.
We simply compare the databases of all 6 centers regulary to make sure no write is lost. So the databeses
are actualy independent and no not waste important time with cecking up each other.
we redesigned the systems some years ago and could reduce spending by more than 90%. since simpler Software is easier to maintain we could also strongly reduce the number of 'strange incidence' (=nobody had an idea what happened).
in short: KISS, write early, understand your REAL problem, avoid SPOF at any circumstances
and forget the advertising from DB people