Is there An Enterprise-Level Open Source RDBMS?
Colm@TCD asks: "This is something I've run up against, and I'm sure that anyone else who's trying to bring Linux and Open Source software into 'enterprise' systems has encountered the same problem: Are there any OSS 'enterprise-quality' RDBMSs out there? Many (most) business setups require a fast, solid database engine to run back-end stuff. Fast SQL support isn't enough, though -- these systems need to be able to fail-over automatically to a backup machine in the event of the main system falling over. What are the best of the available systems? Are there ways of taking the existing systems like PostgreSQL and making them failover-able?"
First off, databases look for connections to a port, just like any other service like httpd, smtp or ssh. Doing things so that you have failover and/or load-balancing on *that* is easy. There are a large number of solutions, from application level to library level to dedicated hardware to general purpose software (LVS).
That said, database failover is *not* easy.
The basic problem is that you don't just read from a database, you read and write to the database. Database replication is a very non-trivial thing to do. You have to deal with the possibility of data being inserted, updated or removed from any of the servers that are involved. This requires that you have things like global locks, timestamps on everything, primary keys that are generated in a way unique to each replicated version of the database, etc.
Most of the work of replication is handled very nicely by the big commercial databases (but you will find that they want you to pay for it.), but even with those you might have to make minor changes in your application (or maybe just the database structure) in order to handle replication properly...
AFAIK, there's nothing you can get for free that will be able to handle any kind of real-time n to n (or 1 to n and n to 1 simultaneous, even) replication. You can do stuff like have mysql dump a log of everything that's done that changes the database, and then import those changes into another mysql instance someplace else; but that's nowhere near what the big commercial databases can do when it comes to replication.
Given that replication is gonna cost real money, just get your free database running on a good-quality machine with good backups (that you make sure include the database) and maybe things like redundant power supplies. You can make a really robust box for much less than Oracle will charge you to license 2 instances with replication.
(Or hire a team of 10 database programmers to spend a year or 3 adding replication to postgresql. Well... maybe they could do it wrong in a month -- I'm trying to give a wild estimate for doing it vaguely close to right.)
Sorry.. I forgot to talk about failover.
There is a third-party, non-open source replication tool available fro Interbase 5 and 6 that will handle that data replication stuff you need for failover. The actual server switching I think you might have to write yourself... I suspect you might find there are a lot of people willing to help you.
http://www.freshmeat.net/appindex/2000/01/15/9479
It's a toolkit for clustering MySQL, which may or may not provide the failover facilities that you need. It looks beta, but may work... I haven't played with it yet. And of course, this is only if you're willing to use MySQL. Looks fun, though.
Otherwise, if you're doing serious enough work to need serious RDBMS backend, I hope you're making enough serious dough to afford a serious RDBMS. But before I invested big bucks into a failover setup using a commercial RDBMS, I'd cover my other bases first. Most DB failures I've seen were not a fault of the RDBMS itself, but hardware, power, diskspace, OS, human error, etc. The only time I've seen an RDBMS crap the bed at work, it was Micros~1 SQL Server and you should know better than to buy that. With our Sybase machines, it's always NT that takes down the box, not Sybase. Get these bases covered (hi-quality hardware, reliable power with a big UPS or generator, non-Micros~1 OS, and a DBA that knows what they're doing) and you most likely won't need a failover system.
Operating System design has been a solved problem for about a quarter of a century. 18-year-old kids write OSes as freshmen projects now. There's an astounding amount of freely-available material out there, including the source to several OSes, to get you started on doing such a thing, and there has been for a long time.
The graphical desktop environments you point out are still in the infancy stage, after 3-4 years of development. Although they're very cool, (as is Linux), both are basically aping existing art, and not very well in some cases. (Go on, moderate me down as flamebait and argue with me; this is not my primary point.)
RDBMS design is relatively new in the long view of the art of Computer Science. The commercial vendors that pay thousands of people fulltime to develop them still don't have it down correctly. It's certainly not progressed to the point where it's textbook knowledge, where any fresh-faced CS grad can get in there and build one.
I think the most telling evidence that this is beyond the current state of the art of Open Source, though, is the fact that it doesn't exist yet. It's not like this Ask Slashdot is the first time anyone's ever thought about free alternatives to the big boys of databases. It's a known need.
And the "Cathedral and the Bazaar" theory of Open Source as 'developers scratching their own itch' would seem to mean that if it COULD be done, it WOULD be done. Intrepid hackers would have taken Postgres or the GPL'ed old MySQL or the like and had a project underway years ago if it were just that easy. Apparently, it's not.
Open Source projects do not _innately_ scale well, merely from the fact of them being open. There are a lot of preconditions for a project to work. At least these three need to exist:
1) A critical mass of developers see the need.
2) That mass of developers (or a subset thereof) can be organized into a team to do the work.
3) That mass of developers has the knowledge and skills to do the required work.
As an example, there's a large set of people that would like an open-source engine to crack DES keys in real-time on commodity hardware. That takes care of 1. Getting a set of crypto hackers together to do this work would be a snap, cf OpenSSL et al. There's 2. But, the art isn't there. The general public, the mass of developers out there, simply don't have the knowledge or experience to do that. Maybe it can be done, maybe it can't, that's not the point. The point is that just saying "I'm starting an Open Source project to do [something really cool]" doesn't mean it's something you actually CAN do.
So, back to the topic at hand, yes, RDBMS is something that's outside of the Open Source world at this point. The skills required to do such a thing are reporting for work at Redwood Shores and Emeryville and the like, and not willing or able to share their really-quite-esoteric knowledge with the free software universe.
And pointing to a handful of other projects that are working doesn't prove that this project, or any other, will. Look around; the code's not coming, mostly because, no, there aren't many people out there who are up to the challenge. Or else they'd be doing it.
The proof, as they say, is in the pudding.
--
This is, perhaps, one of those situations where you are rather unlikely to find an OSS solution. An enterprise-level RDBMS is not exactly an overnight hack. It would take an enormous kind of time and effort to put an enterprise-level RDBMS.
I'm not even sure that an enterprise-level OSS RDBMS makes sense. If you're talking about an enterprise-level product, you probably have the resources to afford to purchase a commercial RDBMS. Both Sybase and Oracle have decades of experience building robust, rock hard, RDBMSes that you can bet your company jewels on, and both companies have Linux versions of their RDBMSes. I'm not sure about Informix, they may have a Linux version of their database engine too.
Right now, attempting to deliver an OSS RDBMS that can beat time-tested RDBMSes from Sybase and Oracle would be quite a challenge, to say the least. People will just have to accept the fact that OSS can't solve every problem in the world. It's certainly feasible that one day Sybase or Oracle might decide to release their respective RDBMS engines under an OSS license (which should certainly be quite a shock to many people), that's probably the most likely scenario.
--
The position that any Open Source, and most importantly "free", RDBMS could NOT hold a candle to Oracle (amongst others) where the needs of "enterprise" as concerned is reasonable. The development of Linux is (was), in many ways, the antithesis of what it takes to develop an equivalent to Oracle. Think about it:
Firstly, kernel's are not all that complicated.
Secondly, Linux's kernel is about as simple you can get.
Thirdly, Linux, thus far, has had the advantage of being behind the curve in development. It has had the benefit of essentially just being able to copy features, design, and even some code. While the code may be from scratch, virtually nothing else is. In other words, Linux's code was built with 20/20 hindsight. Furthermore, it's done nothing particularly spectacular with that opportunity.
Fourthly, the motive and feedback in developing an enterprise class RDBMS are vastly different. In developing Linux, there is something quite tangible, such that most "geeks" can get relatively instant gratification. Do geeks really personally need such a database (of course not, not even their hardware approaches this)? Contrast this with an RDBMS, if "they" make the system %.001 more reliable, by whatever measure, under heavy loads, how can they appreciate that? (never mind testing it) These things are about numbers. A company that does 1 million a day in sales with a broken database can't afford that %.001, but a geek screwing around developing a database in his spare time for his uses is not likely to experience such a problem. And even if he is, does that %.001 really mean anything to him? He can just reboot. Sure, you can say "given enough eyes all bugs become shallow", but ask yourself: Whose eyes, and how do these bugs become apparent? A minor bug in Linux, and a couple geeks have to reboot their PC (which presumably prompts them to patch it). Which company is going to volunteer to expose itself to such a bug---possibly millions of dollars in lost revenues (amongst other concerns). The scale of the two uses changes the nature greatly (likewise, empirically, Linux demonstrates such leanings--towards areas which most geeks appreciate, but not necessarily enterprise)
Fifthly, the development process itself is vastly different. Because Unix, and particularly Linux, is quite modularized, it can be developed is a ragtag fashion. This is not true with an RDBMS. They're are very complex and interconnected.
Sixthly, support is a major issue. Despite all that hype amongst the linux community about "support" it is absolutely undemonstrated. And when you consider issues such as proper documentation....
Seventh, given some of the risks i've mentioned, what company is going to prefer to try to save 10k in software fees for the increased risk. The expected value in this case could very well be negative.
...I can think of more, but that is plenty of doubt right there. Think critically, not dogmatically.
I don't think you will find database failover in an open source product -- it's too specialized an area to have gathered interest from the necessary critical mass of developers.
I cover databases for PC Week Labs, and in my view (and in my tests), the most sophisticated OSS relational database is PostgreSQL. In three months, though, that title will be held by InterBase. It has a more complete SQL implementation, has a (third-party) replication option and is generally more mature. However, it doesn't have any failover.
The products that do provide failover are (non exhaustive list, but these are the main ones): Oracle Parallel Server (or Oracle with the Failsafe option if you want just 2 node failover), Sybase ASE, IBM DB2 UDB, Informix, Tandem NonStop SQL and Microsoft SQL Server. The cheapest option in terms of purchase price will be Microsoft SQL Server with Microsoft Cluster Server, but it only runs on NT and only supports 2 node failover, and does not cluster.
Note that when you talk about failover (particularly with stateful connections), you need to also get failover-capable hardware systems. The databases mentioned above all need special, tightly spec'ed high availability hardware configurations from Sun, HP, etc. In general, a shared SCSI bus is required.
You also need to decide what level of failover you need: (from simplest and cheapest to most complex, expensive and best):
* cold standby (replicate to a standby server) -- no shared storage, mostly up-to-date, cheap, manual intervention required for failover
* warm standby (shared storage or transactional replication to a standby server) -- ensures no data loss for all committed transactions, requires fault-tolerant hardware and a fault-tolerant dbms, failover is automatic but may take several minutes to roll the log forward, warm up the cache and reconnect users
* hot standby (shared storage) -- no data loss for committed transactions, fast failover (30 seconds or less) -- this is a very specialized area, and you should have your dbms vendor work with you on setting up these kinds of HA setups. This is an ongoing area of research; Sybase 12 and Oracle8i 2.0 introduced new features specifically in this area to do things like pre-connect users and pre-warm the cache to speed failover times.
If you want to have an OSS solution, I'd advise using PostgreSQL or InterBase and writing stored procedures in C to replicate inserts, updates and deletes to a second server, then coding reconnect and heartbeat logic into your front-end apps. This will be easiest to do with an app server since you have a single point of data access. This will not provide atomicity or durability though (unless the updates use 2PC) -- you'll need to run a consistency check on the db on failover.
This is hairy, though. Overall, I advise paying for something that someone else has gotten working.
Regards,
Tim Dyck
Senior Analyst
PC Week Labs
Check out interbase.com or interbase.org. Interbase has been in use in enterprise environments for something like 15 years. It has very advanced crash recovery features and replication throught a commercial plug-in. Version 6.0 is under the MPL and in beta right now for Solaris, Windows, and Linux. No, I don't work for them, I just sound like that.
--JRZ
Interbase v6 is in (free beer) beta right now, and will be released under the MPL (Mozilla Public Licence) by Mid-June
You can download the beta now for Linux, Windows or Solaris from Interbase.com or The Interbase Developers Inititive
Supports databases up to 32 TB (!!!) spread over 2GB files and is fast and reliable. (It's been around for 15 years.) The biggest known Interbase DB is over 200GB.
It's is ANSI 92 SQL compliant (well.. as compliant as any DB I've ever seen - better than Oracle for instance), and the support is Amazing
Join the email lists at www.mers.com, and you'll be able to get answers from Ann Harrison (the president of Interbase.com), or from a lot of other people - it's the best support I've even seen.
As for enterpise features - well, apart from large DB support, it has row level locking, transactions, referential integrity, blobs, Multi-Generational commits, stored procedures... ummm... I can't think of what else to say.
Basically, if your Databases are less than 50GB then Interbase is the number 1 choice - above that maybe Oracle would be better, but that isn't exacly open source.
Here's a good way to do it in oracle, that'll probably genralize to almost any RDBMS:
The simplest way is to have two unix boxes, each physically attached to a raid unit (dont use raid 5, use mirroring for performance). Don't use clustering software, just mount the filesystems on only one machine at a time. If the primary system fails, then mount the filesystems on the secondary , startup the database, and you're on your way with minimum downtime. Script it and tie it to some ping or other fail detection strategy if you feel lucky, but in my opinion, keep a human in the loop to actually execute the switchover.
Another great solution is a so called "standby database". You make a copy of the primary database to your standby machine, using your backup tapes. Then you start to "roll forward" the standby database by applyin all offline redo logs of the primary database, as it is generated. This method should work for any database that logs transactions including Sybase, Informix and probably MySql. The equivalent of a redo log in MySql appears to be the "update Log". but I have no experience with MySql. One big drawback of this solution though, is that even though a transaction has been committd to disk on the primary, the event may not make it to the offline redo logs before the system crash. So a standby database can only be within some delta-T of the primary database, the last fiew transactions before the crash are lost. Sometimes this is acceptable, but if you billed the customer then lost the order, maybe not. You tune the redo logs buffering parameters to trade of performance for small "lag time" of getting redo logs across to the standby database.
The truth is that it takes 10x effort to get failover and cluster software correct, and 9 times out of 10, the automatic failover either triggers accidentally (bad news) or triggers correctly but fails to come up on the secondary. This is because you have to be incredibly scrupulous about keeping non-shared resourses in sync on the two machines and you can't test the failover (politically that is) until it actually fails. Heck, I just took a call this past saturday for this very problem: Site paid $$ for a bigger consulting firm to implement cluster, firm cant be reached on saturday when cluster trips but does not failover successfully. Customer calls me. I am to gracious to say I told you so.
My free advice: Make two machines, not a cluster. Keep the two machines in sync, using the "standby technique", or make the raid unit accessable by both machines. Keep a human in the failover loop.