High Availability Solutions for Databases?

← Back to Stories (view on slashdot.org)

High Availability Solutions for Databases?

Posted by Cliff on Monday November 14, 2005 @02:57PM from the in-search-of-redundancy dept.

An anonymous reader asks: "What would be the best high availability solution for databases? I don't have enough money to afford Oracle RAC or any architecture that require an expensive SAN. What about open source solutions? MySQL cluster seems to be more master/slave and you can lose data when the master dies. What about this Sequoia project that seems good for PostgreSQL and other databases? Has anyone tried it? What HA solution do you use for your database?"

13 of 83 comments (clear)

Min score:

Reason:

Sort:

MySQL Cluster != master/slave by cravey · 2005-11-14 15:19 · Score: 5, Informative

While MySQL supports master/slave replication, MySQL Cluster specifically avoids that entire model. It's an entirely synchronous database storage engine. If you want master/slave, use postgres. If you want high availability and can handle the lack of a small number of features, MySQL Cluster is the way to go. The only real downside to the architecture required for CLuster is that all of the data is stored in RAM based tables. transactions are logged to disk every (configurable) time interval. If you're going to try for HA, you might want to RTFM on the available options before you settle on one.
1. Re:MySQL Cluster != master/slave by afabbro · 2005-11-14 15:31 · Score: 4, Interesting
  
  The only real downside to the architecture required for CLuster is that all of the data is stored in RAM based tables
  ...for an "only real downside" that's a pretty big one. I mean really, what sort of database is this - 256MB? 500MB? 1GB? Fine for small websites, not fine for large apps. I don't mean to be a "big shop" snob, but this is a ridiculous limitation.
  Unfortunately, open source hasn't caught up to the big guys yet in the area of replication.
  
  --
  Advice: on VPS providers
2. Re:MySQL Cluster != master/slave by cravey · 2005-11-14 16:01 · Score: 3, Interesting
  
  Name a DB from a 'big guy' that doesn't require a shared resource. Oracle? Nope. RAC requires a shared storage solution. Lose your single storage device device and you lose access to your db. MS-SQL? Don't make me laugh. SAP? No. Postgres? No. Firebird? No. Who's left? MySQL is the ONLY available 'shared nothing' (tm) solution available.
  
  Yes, the RAM only tables suck for large DBs. On the other hand, they're REALLY fast and they can be easily scaled up on commodity hardware rather than requiring faster and bigger servers and storage. I'm in medium shop that only requires a few gigs of table space. I'm n+2 on webservers/sql servers and data storage nodes for under $15k. If I wanted to use Redhat on amd64 with gobs of ram/machine, I could probably be n+2 to 64G for under $50k. The only time I will EVER have to take my cluster down is to add more storage space or to replace my switch. That will result in about 10 minutes of downtime. Based on growth, I won't have to do that for another 18 months. If I wanted to spend another $10k, I could avoid having to take down the system in the event of switch failure. I've never had a carrier grade switch failure on any of my networks, so I'm feeling pretty good about availability.
  
  Yes it's expensive to get up into the terabyte range with ram based tables, but there's also no other way to do it without either spending ridiculous amounts of money or being dependent on a single piece of hardware.
3. Re:MySQL Cluster != master/slave by Joff_NZ · 2005-11-14 18:13 · Score: 4, Informative
  
  The other thing to note with MySQL Cluster, is that, even in 'stable' releases of MySQL, is horribly unstable, and prone to massive data loss..
  
  We deployed it ourselves, and it worked ok for a while, but things went very very wrong when we tried updating one of the configuration parameters, causing us to to inexplicably lose quite a bit of data.
  
  Avoid it. At least for another generation or two, or three.
  
  --
  The revolution will not be televised. It won't be on a friggin blog either
4. Re:MySQL Cluster != master/slave by jorela · 2005-11-15 00:21 · Score: 4, Informative
  
  Actually we have implemented disk storage.
  Hopefully it will make into version 5.1.
  
  MySQL Cluster 4.1/5.0 supports:
  * transactions
  * transparent data partitioning and transparent data distribution
  * recovery using logging
  * (multi) node failure and automatic non blocking hot sync for crashed/stopped nodes
  * hot backup
  
  MySQL Cluster 5.1 will support:
  * user defined partitioning
  * cluster to cluster async. replication (like "ordinary" mysql replication)
  
  The disk impl. supports
  * support putting column in memory or on disk
  (currently index columns has to be in memory)
  * all HA features mentioned above.
5. Re:MySQL Cluster != master/slave by delta407 · 2005-11-15 02:49 · Score: 4, Informative
  
  Name a DB from a 'big guy' that doesn't require a shared resource. Oracle? Nope. RAC requires a shared storage solution. Lose your single storage device device and you lose access to your db.
  ...
  The only time I will EVER have to take my cluster down is to add more storage space or to replace my switch.
  Sorry, what? You're bashing Oracle for having a system with a single point of failure, then add two of your own?
  
  "Shared storage" doesn't mean "not highly available". If you're serious about building a database that cannot go down, you get multiple servers, each with two Ethernet interfaces and two Fibre Channel cards. Get two Fibre Channel switches, and two Ethernet switches. Get two Fibre Channel disk arrays. Hook together as follows.
  
  Both disk arrays have one or more uplinks to both FC switches. Each server has one HBA connected to each switch. Then, use Linux multipathing to provide automatic failover in case either switch dies or either HBA dies, and use Oracle ASM or Linux MD to mirror the data so you're good even if you unplug the shared storage.
  
  Set up the servers to use 802.1q VLAN tagging. (Remember, it's a good idea to keep your inter-node communication separate from client-to-server communication.) Create two Ethernet bridges, using the Linux bridging driver, binding eth0.2/eth1.2 to one and eth0.3/eth1.3 to the other. Take the two switches, set up 802.1q on the ports going to your database servers, and connect them together (preferably using high-speed uplinks, but channel bonding works fine). Enable spanning tree on the switches and on the bridges. Connect both switches to the rest of your network.
  
  Now, if a switch starts on fire, spanning tree on the servers will fail over to the other one automatically. If a network cable gets cut or a network card goes out on one node, that one node will fail over all traffic to the remaining interface, and the inter-switch trunk will make everything keep working. Suddenly, you've got a robust network.
  
  So, we've got network and storage covered. What about the database software? Neither MySQL nor Postgres can use this sort of configuration, but Oracle RAC can. Add a dash of TAF, and suddenly, any component -- network switch, database server, SAN switch, disk array -- can fail in the middle of a SELECT and the application won't notice. That is highly available.
  
  Yes, this solution costs more. Our data -- more accurately, the cost of it not being accessible -- justifies the expense. But don't tell me that shared storage is a weakness.
MySQL - Master / Slave is the only option by digerata · 2005-11-14 15:25 · Score: 3, Interesting

You can use MySQL's in-memory cluster replication, which is pretty cool. You can have quite a few nodes, each serving requests against the same database. However, your database size is limited to the amount of RAM a single node can support. That really limits long term scalability of the database. What we just hit 8 GB? What do we do? Sorry, boss, I need another ten grand for an unplanned DB upgrade. Also, if you are used to the atomic transactions InnoDB provides, forget that. The cluster storage system NDB does not support all of the features that InnoDB does.
We chose to go with a Master / Slave option which basically gave us failover within 3 seconds. Any more fine grained monitoring and the CPU performance on the slave gets pretty high. Not ideal, but probably the best option that uses MySQL when you don't want to be tied so much to one platform.

--

1;
1. Re:MySQL - Master / Slave is the only option by cravey · 2005-11-14 16:07 · Score: 3, Interesting
  
  Your database size is NOT limited to the amount of RAM a single node can support unless you're only running 2 nodes. It's possible that your tables size may be limited, but I don't believe that that is the case. No, the cluster storage system does not support all the INNO DB features, but INNODB doesn't support all of the MyISAM features. Does that make MyISAM better?
What are you doing? It's important. by anon+mouse-cow-aard · 2005-11-14 16:02 · Score: 5, Insightful

It's odd that all these people are answering without hearing a thing about your application. How big is the db? How often is it written? How often is it read?

For example, we run a site with data from a thousand odd different data sources, with each source getting updated every hour or so. We do it by parsing the data into static pages. We we receive a datum, we rebuild the pages that depend on it.

We have another site that runs off an Oracle db. the static page site runs about 90x faster, and is basically in memory (disk access is nil.) Now take into account that we can (and do) replicate the static page solution with zero load, we get to a solution that is literally 900x faster.

Now folks are thinking 'oh, the horror!' well... tough! There is no substitute for thinking about your data, and how it flows. A DB is not a given, but a (potentially wrong) answer to a question after you have done some analysis.
Nice "question" by photon317 · 2005-11-14 16:06 · Score: 4, Interesting

Hello, anonymous Sequoia promoter seeking free advertising. (BTW, You might try picking a product name that normal people can spell without thinking about it).

Your solution is not database clustering, and should not be advertised as such. It's more a long the lines of a database connection proxy which supports multiple simultaneous backends and operating on them in parallel, with some added features to make HA-like solutions relatively easy.

The downside of this style of approach, as opposed to an architecture of the likes of Oracle "RAC", is that it doesn't scale up as you add backend nodes (at least not for writes, but in any case for read-only scaling there are simpler solutions for all of the vendors, even the free ones), and it must have limits on how many transactions it can backlog and replay to a temporarily-unreachable/down server before that server has to be re-synced from scratch in order to catch back up (and I have to wonder if there's really any real-world scenario under real transaction load in which the practical net effect wouldn't be a complete resync of a backend server anytime something goes wrong with it, in which case one could throw out any attempt to backlog transactions for a single failed server and just keep things simple - you fall out of sync, you resync).

The open source world really needs a RAC-like solution for PostgreSQL and MySQL (I'm a fan of friendly open-source competition, so while my personal preference is PostgreSQL, I hope both projects stay current and popular for many years to come). Unfortunately there is unlikely to be a generic way to do this, it will probably have to be re-invented for each database project.

I took a brief look around PostgreSQL's guts a while back, and it actually seemed like the architecture they use isn't far off from something RAC-capable to begin with, just nobody's quite buttoned together a few peices here and there to make it happen. Basically on SMP multiple co-operating backends already serve parallel requests and synchronized on a shared memory cache. There's patches out there for the linux kernel to support network-synchronized distributed shared memory. Put two and two together, and what do you get? Something not far off from a first-pass hack at a RAC-like network-distributed database caching system. Most of the other details are easy to solve (start/shutdown, join/leave cluster, tracking of processes across the cluster, etc), or belong in another problem domain (implementing shared storage filesystems (hey, we have GFS, Oracle OCFS, etc available...)). One of the biggest issues would be multiple nodes all having pg "Writer" processes. The first step would probably be to put the writer on one node and failover the writer functionality when that node dies, to be quickly replaced by a scheme whereby multiple writers can work by synchronizing through a distributed lock manager (there are already dlm modules available for linux). Then there's the issue of making the current distributed shared-memory patches do the right thing performance-wise for this kind of usage, and so on. It's not easy, but it's not outside the realm of possibility.

--
11*43+456^2
MySQL cluster and replication are not the same by Scott · 2005-11-14 16:08 · Score: 4, Informative

The submitter of this question seems to have confused the two, Cluster and the older replication. Cluster does not in any way rely on a master/slave setup. Think of Cluster as RAID for databases, where you can lose a node (or more, depending on your configuration) before you lose your db. The current drawbacks of cluster are that it is in-memory and doesn't support certain features, such as fulltext indexing. Replication isn't going to cause you to lose data either if your application is designed to handle a situation where the master server (which you kick your writes to) hits the bricks. Have the app go into a read only mode from your slave.

Neither option is really "beautiful", though Cluster has a lot of promise for the future, especially in 5.1.
DRBD by Bios_Hakr · 2005-11-14 17:30 · Score: 3, Informative

Have you looked into DRBD? It works kinda like RAID1 over a network. It uses 2 computers to store the database. Another computer acts as a heartbeat server. You'll need 3 NICs in the database servers; one for the connection to the network, one (gig-e preferably) for the connection between servers, and one for the connection to the heartbeat server.

http://www.drbd.org/

If you are smart, you'll play around with this on a test network or VMWare first. Get it all tweaked out and actually test it by killing a server while in mid-transaction to see if it works for you.

--
I'd rather you do it wrong, than for me to have to do it at all.
MOM - Motherly Object Model by ACORN_USER · 2005-11-14 21:57 · Score: 3, Funny

A typical transaction:
Phone: Ring Ring
Son: BEGIN TRANSACTION!
Mum: Oh, hi dear. How have you been? Did you know that ..
Son: *cough* Work, remember?
Mum: Yes, dear.
Son: WRITE_DOWN into notepad ADDRESSES. fname='John'. sname='Smith'
Mum: Wait, i have to look up the next identity number thingie. Oooh, that's a big number, I can't even recite it.
Son: Hush! address='12 Rover Roa..
Mum: Oh, dear. My pencil broke.
Son: Ok, Roll back!
Mum: That's fine for you to say. I don't have any erasers.