Slashdot Mirror


Is there An Enterprise-Level Open Source RDBMS?

Colm@TCD asks: "This is something I've run up against, and I'm sure that anyone else who's trying to bring Linux and Open Source software into 'enterprise' systems has encountered the same problem: Are there any OSS 'enterprise-quality' RDBMSs out there? Many (most) business setups require a fast, solid database engine to run back-end stuff. Fast SQL support isn't enough, though -- these systems need to be able to fail-over automatically to a backup machine in the event of the main system falling over. What are the best of the available systems? Are there ways of taking the existing systems like PostgreSQL and making them failover-able?"

67 of 177 comments (clear)

  1. 1stPostDB by Anonymous Coward · · Score: 2

    With 1stPostDB running on slashdot you no longer need to worry about failover. Just post all your information to slashdot using the 1stPostDB and let the andover.net guys take care of the rest.

  2. Work on Enterprise Linux RDBMS by Anonymous Coward · · Score: 2

    Actually, I've been doing a lot of work recently on trying to get linux db's to scale. The simplest solution, if you have a static db, is to just mirror the contents across several machines, running (depending on your needs) MySql or postgresql. Then, the linux virtual server works great to load balance across your new semi-parallel db. If you have changing data, such as syncing several machines, things get harder. LSDProject.sourceforge.net has one solution for syncing multiple DB's, by executing SQL statements in parallel, and logging failures, but more work is needed in this area (and if you want to help that project, it'd be appreciated). There is no solution in the class of oracle parallel server, etc, but if your needs are more on the reliability side with simple queries, the linux virtual server and mysql will outperform oracle, especially taking price into account. -Sean Ward (forgot my slashdot login ;)

  3. Re:MySQL performance by Anonymous Coward · · Score: 2

    Since I saw that many people are talking about transactions, I wanted to give a short view of what actually a transactions. People who are really interested should refer to the before mentioned book of Gray/Reuter "Transaction Processing" or even the original paper by Härder/Reuter (1981) which introduced ACID transactions. Since when you'r into computer science and databases and are talking about transactions you probably mean ACID transactions. ACID stands for: ATOMICITY: trannsactions either succeed (commit) or fail completely CONSISTENCY: a transaction transforms the DB from one consistent state into another (since transactions are atomic, the DB is logically always in a consistent state) ISOLATION: transactions are processed without the context of other transactions (logical one user system). Thus they are not interfering with each other DURABILITY: changes caused by transactions are durable (that's where recovery, logging and so on comes in) In order to ensure those properties a lot of research has been done and one can say that all basic things have been implemented. The hard part is of course efficiently implementing it, e.g. using B-Trees (or the new UB-Tree) for accessing and storing data efficiently. Logging and recovery mechanisms for ensuring durability. Locking mechanisms for isolation and also atomicity..... A database system actually offers almost the same depth as an OS, since it needs physical storage management, indexing, caching, query processing & optimization, parallel processing, distributed physical storage, distributed query processing, logging, fail saveness... So you might imagine that implmenting a good DBMS is probably the work of hundred of work hours. An effort which is almost comparable with creating Linux itself...

  4. failover postgres by Anonymous Coward · · Score: 2

    I faced a similar problem about one year ago.

    I was annoyed with my proffesional collegues using Oracle 8i etc (excellent products) but not setting them up to facilitate their features. The 'use Oracle and don't worry about our databases fullstop' attitude is everywhere.

    Anyway, to the point:

    I used PostgreSQL with an additional layer underneath my application which done all selects on the same node as the process and all other operations on all nodes. There are a few tricks to getting this done, such as counters etc need to be setup to be multiples of, say, 100, with the cluster number added to that number. (this means if things screw up, nodes can operate independantly as much as possible then sync up when possible). I hope this makes sense, it's late. Other hacks, such as code to sync/check integrity etc databases after failing are relatively straight forward.

    So, the bottom line:
    Performance over n nodes only as good as one node (_slightly_ worse).... but then, hardware's cheap...

    Reliability: We have had all sorts of things go wrong, but at the end of the day, when the macines are live again, we have never had a problem.

    Im no expert by any stretch of the imagination, but with continuous incremental backups etc and a little common sence we are using PostgreSQL with a $$$ (8 digit) online business, and I can sleep at night, even though im personally responsible for the systems.

    PS: Total expenditure on software for the system: $0 (we have/had the $ to purchase expensive software; OS solutions just came out on top)

  5. Re:Postgres / Illustra by miniver · · Score: 2
    It has been my experience that postgres is just hideously slow. After removing "time travel" from the database engine, things improved greatly but postgres is still one of the (if not the) slowest SQL engines available. However, postgres is one of the most complete SQL implementations you can find (free and in source form.) It's not a speed demon, nor was it designed to be -- make it work, then make it fast.

    Let me follow up to that.

    Once upon a time I spent a year and a half developing applications for Illustra, the commercial follow-on to Postgres. Illustra had the same features as Postgres, plus SQL-compatibility, plus all of the standard SQL datatypes, while still including all of the ORDBMS features of Postgres. They even had support.

    What Illustra didn't have, couldn't have, and what Postgres will never have is speed. The Object-Relational data model is not as efficient at doing plain Relational data work -- it can't be. Using Illustra/Postgres for normal relational databases is like trying to fight Mike Tyson with one arm tied behind your back -- you're playing the other guy's (in this case Oracle/DB2/Sybase) game, and giving him points to start out. Why is this important? Because all of the database APIs (SQL, ODBC, JDBC) only understand rows in tables with cursors, and all of the back-end reporting engines only cope with relational tools.

    Illustra and Postgres are much more interesting for storing object data ... but the only way to take advantage of their capabilities is to use their special APIs. At which point you've thrown away portability and standards, and you've bought yourself a proprietary (even if it's open source) solution. It may be the perfect solution to your problem, but like a black hole, it will warp everything around it. Take replication -- replication is hard for ordinary RDBMSes to implement reliably; when you add object IDs to every row of every table in every database, you multiply the problem, since you can't just copy the rows directly from one database/table to another, for the same reason that you can't easily serialize a C structure containing memory pointers -- all of the pointers (object IDs) will be wrong.

    I should really be talking about Illustra in the past tense -- the company got bought out by Informix several years ago, and Informix folded most of Illustra's technology into their Universal Server database. Dr. Stonebreaker (the creator of Ingres, Postgres, and Illustra) went on to become the VP of Technology for Informix.

    --
    We call it art because we have names for the things we understand.
  6. Re:OSS is not a solution for every problem. by logicTrAp · · Score: 2

    Hey now, I'd never work *with* databases, but working *in* a database can be quite neat - in many ways it's somewhat like working in an operating system since you handle so many things (paging, data storage etc) at a low level.

  7. Maybe, but things look much better now. by Bruce+Perens · · Score: 2
    Actually, they are spinning off an entire company around it and I would not be surprised if they go for more money soon. There could have been some problems before they came to that decision, but I've emailed with their president and they seem to have a clear set of goals in sight now.

    I doubt it was K&R style code - it was probably written for the Borland C compiler and it's been on Windows for quite a while. It is certainly possible that it might have some ugly code. We can fix that.

    Thanks

    Bruce

  8. Overextension of the Open Source concept by drix · · Score: 2

    This post is, albeit to a lesser extent, in the same pointless vein as the one about buying Iridium and "open sourcing" it. Things that are boring, don't impress your friends, or don't scratch an itch typically will not exist open source. Programming RDBMS is (I imagine) highly theoretical, scientific work, and while a few CS wizards out there might get a kick out of it, such a project would never reach the critical mass needed to sustain development. It just doesn't meet any of those criteria.

    At times people seem to get carried away with the Open Source phenomenon. We need to remember to take a step back and realize what an incredibly miniscule blip on the radar of innovation that OSS is, and that for a good three decades conventional development models produced some great things. Relational database code is one of them.

    --

    --

    I think there is a world market for maybe five personal web logs.
  9. Re:Open Source Databases by Nicolas+MONNET · · Score: 2

    MySql is neither Free nor is it (yet) an RDBMS

    Actually older versions are GPLed. Newer version are free enough IMO: you can modify, redistribute and use at will, you're just not allowed to sell them for a profit (as in selling a product that would only work with it). So for me it's Free, if not in the general sense. Then, Monty in an interview hinted that he was trying to have it fully FREEed (Open Source'd).

  10. Looks like I'm not the only one by trog · · Score: 2

    I am designing a data-mining system at work, and I believe that this far surpasses Linux's ability to scale. The problem is, Linux only supports files up to 2GB in size(4GB with reserf), and MySQL only supports 4GB tables. I estimate that several tables will exceed 20GB in size, and breaking them up into smaller tables will be highly inefficient. We've just pushed Linux as far as it can currently go.

    So A big Sun box and Oracle are going to be the back end. Even Oracle on Linux runs into the filesize limit, because of the lack of raw device access in the kernel.

    Sometimes, free software isn't the way to go. You have to pick the right tool for the job.

  11. Re:OSS is not a solution for every problem. by glyph · · Score: 2

    we know it will stand up, and if it doesn't, there are people to sue.

    Please re-read your licensing agreements with these respective vendors. You can't sue them, you can't even think about suing them. With oracle, I think they can actually have all of your children killed if you think about suing them and they find out.

    It's amazing how long elementary FUD like this can abound, especially after it's been refuted countless times in OSS diatribes. Who do you know who has ever sued Microsoft for lost data and won? Have you ever heard of Sun getting busted up over a hardware failure? It is EXTREMELY difficult to prove that the system failed in the first place (who's to say you configured it correctly? not the experts from Oracle, that's for sure), and even if you can, you agree to a LOT of things when you put that software on your server.

    I'd trust Postgres against Linux any day (and YES, I do work at the "enterprise level") simply because at least they're straightforward about there being no-one to sue. If I really needed to get hard-core (not too many machines, lots of RAM and processor, zero tolerance for failure) maybe FreeBSD would fit the bill.

    Most likely though, I would be averse to anything that involved x86 hardware for real reliability. So it would probably be NetBSD.

    --
    Glyph Lefkowitz - Project leader, Twisted Matrix Labs
    Writer, Programmer - Not a member of the TSU
  12. Fun conspiracy theory here :-) by A+nonymous+Coward · · Score: 2

    Now way back in the glory days of that last great war, way back in 19 and forty something, there were two developments which have been entangled down to this very day, causing no end of grief while each tries to ensnare and evade the other. I am referring, of course, to the digital computer and the Office of Strategic Services (OSS), which has morphed into the current CIA and NSA.

    I put it to you: is it just a coincidence that we have here a claim that Micro$oft coined the term? Or is there some deeper conspiracy that has been in progress these many years?

    Or is it just another example of Micro$oft falsely claiming innovation?

    --

  13. What you're looking for is InterBase by kronos · · Score: 2

    You're looking for InterBase. While it's not yet Open Source, IB 6.0 will be released under MPL 1.1. While they sanitize their code, you can get free (beer) beta releases of InterBase 6.0 for Linux, Windows, or Solaris from http://www.interbase.com at http:// www.interbase. com/open/downloads/60beta_downloads.html. There are RPMs and Tarballs there, and Debian packages are forthcoming.

    You can also check out the community springing up around InterBase at http://www.interbase2000.org.

    From my own personal experience with InterBase, I can tell you that it is most certainly an enterprise-grade database, with amazing speeds and all the features you could ask of a modern DB.

  14. Postgres by Phexro · · Score: 2
    I have to agree with the previous poster that Postgres is the best open-source RDBMS. I switched from MySQL to Postgres some time ago, MySQL just didn't have the feature set that Postgres does.

    Postgres seems quite robust, but slower than MySQL. If you can live with the slight speed hit for the cool features (views, and stored procedures come to mind) it's well worth it.

    Postgres works great with PHP, btw. :)

    --

    1. Re:Postgres by Cramer · · Score: 2

      It has been my experience that postgres is just hideously slow. After removing "time travel" from the database engine, things improved greatly but postgres is still one of the (if not the) slowest SQL engines available. However, postgres is one of the most complete SQL implementations you can find (free and in source form.) It's not a speed demon, nor was it designed to be -- make it work, then make it fast.

      MySQL is certainly fast as heck, but it's still missing alot of functionality one takes for granted these days. (Transactions being one of them.) Alot of people (myself included) use MySQL for business class stuff.

      As the original question was one of "it cannot fail, EVER"... you're going to have to buy that level of security. There are several commercial databases for Linux that will do what you need -- they are not free (save Sybase.) But I suspect, you aren't concerned about "free". (security, speed of bug resolution, etc.)

  15. My english skills are fine... by FallLine · · Score: 2
    firstly (fûrstl)
    adv.

    In the first place; to begin with.
    Usage Note: There is ample reputable precedent for using both first and firstly to begin an enumeration: Our objectives are, first (or firstly), to recover from last year's slump. Whichever is chosen, however, consistency is best served if any succeeding items are introduced by a parallel form, as in first . . . second . . . third or firstly . . . secondly . . . thirdly.


    a) My usage is far from incorrect, though in more formal writings i'll frequently use other forms.
    b) Slashdot is a casual discussion forum; colloquial language is generally fine, so long as the writing remains coherant.

    Nonetheless, I generally agree that the bulk of slashdot writing is piss-poor, even for this forum.

    ...gotta run
  16. Re:OSS is a solution for some problem. by FallLine · · Score: 2
    Irrelevent question. Since it's near impossable to know ALL the hackers (# can change at any time), and their respective itches. It would be an impossible question to answer. Pro or Con.

    It might be impossible to answer with absolute certainty, but nothing in this world is absolute. Would you say it's similarly impossible to know with any real certainty that a group of, say, medschool grads from Harvard are going to engineer a chip that can compete head to head against the latest offerings from Intel (et. al)? This of course, is just a random example, but you should get my point. You can accept perhaps one or two of them having the desire, but the odds of a whole community of them getting together to create something quite complex, and outside of their range of experience, for a long duration, are very slim.

    That being said, I agree with your later point.
  17. It's not (all) about the software. by jetson123 · · Score: 2
    You can put together systems that are as efficient, reliable, full-featured, and robust as Oracle or DB2 out of open source software. But it's a bit of work, it requires that you understand exactly what you are doing, and you are going to end up with a custom solution that you have to document and maintain yourself.

    What sets Oracle or IBM DB2 apart is that they have large user communities, and that means a lot of people trained in using them, a lot of third party add-ons, a lot of "how-to" books, etc. The real selling point of the commercial enterprise level RDBMS systems is not software, it's market position and widespread use.

    Whether that kind of support is something that matters to you enough to pay the steep prices that those systems cost, you have to decide for yourself.

    The only system that comes close to Oracle or DB2 in terms of popularity is MySQL. If you can live with its limited feature set in your application, consider using it.

    In the long run, hopefully some open source database will become as popular as the big commercial guys. The most likely path to that is no an Oracle clone, but the development of a new kind of database paradigm and popularizes it with an open source implementation.

  18. Re:MySQL performance by rjsquire · · Score: 2

    What you want is atomicity. However, rollback sort of goes hand in hand with it. If you try to group together a select and a delete from one table with an insert into another table and your insert should fail for any reason, the dbms must roll back the select and the delete. While I agree that transactions (atomicity) would be nice the developers will need to implement at least an internal rollback mechanism in order for it to work. I'm sure it's been mentioned here already but do check out PHOTO.NET. Philip explains it all and he's funny too.

  19. Re:Hypersonic SQL by phred · · Score: 2

    Hmm, an SQL database without GROUP BY.

    bzzzzt. . . . next . . .

    -------

    --
    Bill Gates Is My Evil Twin.
  20. Re:Failover info... by warpeightbot · · Score: 2
    Red Hat 6.2 (so bloody new you can't even get it in the store yet, I'm probably going to suck mine down the bitpipe tonight or tomorrow) has Piranha, which supports 2-node failover functionality. It's not Qualix, but it's a start.... and Red Hat is supported.

    So, we have an enterprise-class RDBMS on a world-class operating system with RAID and failover capability, and several world-class organizations supporting it. Sounds like a Real World Solution to me.... So it doesn't scale into the terabyte range. How many folks really need that? and how many folks are employed by small businesses?

    --
    "We came, we saw, we kicked its ass!"
    -- Bill Murray, "Ghostbusters"

  21. Re:OSS is not a solution for every problem. by Black+Parrot · · Score: 2

    > Slashdot is filled with different ppl very opinionated

    Yeah, I noticed.

    > Linux is not where your major enteprises turn when they need 24/7 huge scalable solutions.

    Some do, though I suppose you could define "major interprises" to filter them out.

    > this according to Gartner Group

    O Bastion of Bullshit. And PHB Fertilizer.

    You're right in that some of the commercial solutions are still better. How long that will last, anyone can tell. But what makes you think OSS can't provide a RDBMS that is at least useful, and useful for lots of people?

    > It takes more than opinions to form valid statements but it takes less than an minute to spout something on Slash.

    [This tart retort intentionally left blank.]

    --

    --
    Sheesh, evil *and* a jerk. -- Jade
  22. Re:Question ... by rm+-rf+/etc/* · · Score: 2


    Apple's Enterprise Object Framework is the best way I've seen. There's also the free GNUStep alternative GSDB. The downside with the GNUStep version is that there aren't a lot of drivers (postgress, sybase, and MSSQL at this point). EOF/GSDB completely abstracts everything and presents a true OO view of your data with no queries, transactions, or anything to worry about, the framework handles fetching and resource management.

    Excluding that, your best bet is to keep operations on your data source sperate. Thing's like Perl's DBI are just a API abstraction and not a Database abstraction. The way around this is to seperate out all your data fetching functions so that all your app does is call things like fetchCustomerList(start, end). That way to switch DB's you just create a new DB module for the new DB and your application could care less.

  23. Hypersonic SQL by chocolateboy · · Score: 2

    I gave up dB work a year or two ago. Maybe it was starting with MS Access and enjoying its compact, RAD, toy quality. Unscaleable and inapproriate for Enterprise stuff of course (and making no pretence to such garlands), but fast, small (well, the .mds were small if you compacted them) and fun.

    The problem was the inevitable upgrade lead me to MS SQL Server 6.5 (I was stuck on NT at the time). SQL Server had a lot of stuff I was missing in Access - triggers, stored procedures, scaleability - but it also brought a lot of frustrations. The domain aggregate functions were poorer than those offered by Access, which was a pain as I was trying to roll my own OLAP before it all got proprietarized into a Babel of different buyouts and skill-subsets. Its big-iron feel didn't stop it having a ludicrous 255 byte limit on varchar fields. For bigger you had to futz about wastefully amalgamating BLOBs of text with READTEXT/WRITETEXT. Plus it was grotesquely high-maintenance. I didn't want to become a DBA. I just wanted to hack SQL. And wasteful. The 'devices' ('Honey, I bloated the database') were huge and couldn't be shrunk, no matter how svelte the actual data.

    By the time I escaped the NT shop I was naively looking to Oracle to save me from these frustrations. Unfortunately, I'll never know the joys or horrors of that particular 'platform', because at 600 Mb for the Linux installation I just bailed out and cried 'Enough of this grotesquely bloated crap!' and pursued XML or BerkeleyDB solutions to anything remoteley persistence-flavoured thereafter. I knew that fast and small were synonyms, but the vendors were growing fat on the antonym line and there was nothing I could do about it. Even MySQL and PostgreSQL were part of the problem. They're all emacs. None of them are vi.

    Recently I ran my periodic, wishful, wistful Google-grep for 'fast', 'small' and 'rdbms' and found myself, after rejecting Brian Jepson's TinySQL as ridiculously small and cute but strictly pedagogical, finally discovering The One.

    Hypersonic SQL, a tiny Open Source Java database weighing in at less than 100K, supports correlated subqueries, transactions, referential integrity, indexes, stored procedures and JDBC - everything basically, but GROUP BY, cursors and triggers. I never used cursors myself. I'd rather iterate in Perl. The other two, admittedly are fondly missed, but not life-threatening. It doesn't support failover. But with such a small, developer-friendly codebase anything's possible.

    Did I say 'Perl'? 'But it's a Java database', I hear you cry. How can this beast talk Perl? Well, it can't, which is why I'm working on a Perl DBI interface to it talking to a native driver over TCP/IP. If anyone wants to contribute (I wouldn't need to hack it if some brave soul wants to polish up the languishing JNI module in JPL to support embedding Java in Perl on Linux (currently it only works for Win32)), please get in touch. I'm almost certainly way out of my depth and entering a world of pain.

    Oh, did I mention? It's 7 times faster than SQL Server.

  24. Re:Replication server for postGreSQL? by churchr · · Score: 2

    You need DBD::Multiplex, which is part of Perl's
    DBI. It allows you to do round-robin connections,
    among other things.

  25. Three months time???? by chris.bitmead · · Score: 2

    I don't think you're comparing apples with apples Tim. Postgresql will have V7.0 released way before 3 months time at which point the SQL should be equal or superior in standardisation to Interbase. OTOH, who actually knows where Interbase will be in 3 months? It might be nowhere fast.

    And in terms of "advanced", Interbase will probably never catch PostgreSQL. PostgreSQL is the king of features in RDBMS with its ORDBMS features. Interbase may one day be faster, but more featureful or advanced, I doubt.

  26. Frontbase & Standards Compliance by Dacta · · Score: 2

    Frontbase looks pretty nice - I'd never heard of it, though. Although... it does look a lot like some product the OpenLink ODBC people had. Is it related in any way?

    I'm not too sure about the benefits of full ANSI 92 compliance.

    For sure, standards are a good thing, but if no one supports them, what's the use?

    For instance, take Sybase & Oracle & SQL Server. For simple select, insert & update queries the SQL is farily portable. Once they get a bit complicated you'll probably need to do some fixing to get it to work.

    But that isn't the major problem with porting between databases. The real "problem" is the way different DBs have different strengths.

    For instance, on Oracle you use cursors all the time, and they work really well. They are fast, efficent and a great way to do some thing.

    Try porting them directly to SQL Server or Sybase, and your server will die really quickly - they just don't support the same number of similtanious cursors that Oracle does. Even if they were all ANSI SQL 92 compliant, it won't fix problems like that, which are much more difficult to fix than a few syntax differences.

  27. Question by Dacta · · Score: 2

    What is the difference between cold standby & a database server with replication facilities?

    I'm thinking of Interbase & Replication, or even Interbase and the database mirroring it has built in. Is there anything else needed for it to be classed as cold-standby at least?

    I'd think it would be pretty close to warm standby if you used it with some kind of heartbeat monitor, wouldn't it?

    I agree there is a fair way to go before Interbase supports hot standby, though.

  28. Not quite by Dacta · · Score: 2

    Interbase has been available on many non-windows platforms for a long time (15 years+) now. On Win32, they use MS VC++ to compile it.

    The short answer for why MSVC was used for InterBase is that we wanted to keep options open to compile the same source on NT for Intel, NT for Alpha (if market demand warranted) and NT for PowerPC (when that operating system existed). Borland compilers only support Intel. Adding compiler-specific #ifdef differences for different flavors of a single operating system seemed inelegant and needlessly complex.

    Bill Karwin

    The code was never a mess like people have been saying, but it (apparently) has a fair amount of conditional defines for various platforms. That's the stuff they are fixing now, adn why it is taking so long.

    This is an email Ann Harrison (president of Interbase) posted on a list back in January:

    >Subject: Delivering sources > > >As the opening of the source gets closer, I find myself >wondering more and more about the details. > >Specifically, where to make the tradeoff >between clean code and code now? > >At the moment, there's a single source for InterBase >with magic in the source control tool to deal with >differences between the various environments. The >most significant of those is the use of $ in names. > >Should NewCo make a pass through the code once to >so that the engine is actually, rather than almost, >one code base? Should they go a step further and >use GCC for all platforms? > >At one point, some of the compilers used to build >InterBase did not accept ANSI prototypes. As a >result, every routine has conditionalized prototypes. >They're really ugly and completely useless. Should >NewCo remove them before releasing the code? > >At various times, InterBase has been ported to >platforms that are no longer strategic - like >HP MPE/XL. Should NewCo remove the conditional >code for those platforms? Just the really ugly >ones? > >High level internals documentation is almost >non-existent. Complete documentation would >be (much) larger than the code, so that's not >likely to appear. Should NewCo take the time >to write module by module documentation? A >~30 page overview? > >As you think about these questions, please remember >that you (the knowledgeable InterBase developer) are >not the only user of the source. It's important >that the code that's released be buildable by humans >on all the platforms it runs on. > >Appreciate your thoughts, > >Ann

    Sorry the formatting is so screwed up! I hope it clears up some misconceptions, though.

    BTW, it's good to see you're not so cynical about Inprise any more, Bruce. I was their sole defender on Technocrat when you posted that story on them a couple of months back.

  29. I agree by Dacta · · Score: 2

    I've done a fair bit of stuff on SQL Server, and while it is limited in that it can only run on NT, it is so much better than Sybase (for instance) that it isn't funny.

    For instance, I'm using Sybase Adaptive Server 11.92, and guess what? No row locks, non-ANSI join syntax, and you can't seem to have outer joins where you filter the joined table by another (non-join) field.

    SQL Server has wonderful development tools, too - like Query Analyzer. After spending the first year and a half of my working life trying to work out Oracle query plans, MS Query Analyzer was such an amazing revalation.

    You are right that there a lot of (surprisingly) big enterprises relying on SQL Server. I think version 7 was a big step forward in this area.

    SQL Server still can't match Oracle on joins of more than four table with lots of data, though - of course, not much can really.

  30. Re:looking forward to it... by Dacta · · Score: 2

    Yeah... I know what you mean.

    They do seem to be very dedicated, though.. they listened to the community on the licence and everything.

    I doubt there will be any problems, but it will be nice when the source is out.

  31. Interbase by Dacta · · Score: 2

    Interbase is a RDBMS, and it is free (beer) now and will be Open Source (MPL) soon (June).

    1. Re:Interbase by Anonymous Coward · · Score: 3

      Interbase has great features such as database shadowing (to help with failover safety), online backup, is self "cleaning", is hands of (administration wise), is an absolute cakewalk to install, is fast as a bitch, has stored procs, views, triggers, decent security *and* supports UDF's that can be written in C/C++ or Delphi (so probably Kylix) and most of all is currently available for Linux in non beta, v 5.x, and in beta 6.0.

      It also is ANSI SQL-92 compliant, as far as I'm aware, as well as adding a couple of nifty extensions to the language that negate the need for looping through cursors (for those who have used Sybase/MSSQL).

      I have used Interbase on several Windows projects and it scales brilliantly from the single user desktop version to the server.

  32. Nice, but the benchmark is crap by Dacta · · Score: 2

    I'm pretty impressed! It seems a pretty nice bit of work - joins seem to work okay (I didn't try any out joins, and there is no ANSI 92 syntax, though)

    I think the benchmark is crap, though - simply testing how fast you can create a table and an index, and then do an insert might be fair enough, but testing select speed by doing a select * from that table is a bit of a joke. I'd love to see how it performed with a 4-way join and lots of data.

    That shouldn't be seen as a flame, though. I'm really, really impressed with this as a whole.

  33. Re:Yes, but Open Source is the solution for THIS o by maw · · Score: 2
    By the way, "OSS" is an acronym coined by Microsoft, and one I've never liked. Can we please call it "Open Source"?

    If we don't have an obnoxious Corporate eXecutive iMission Critic@l eAcronym (oCXiCeA) for OSS, then OSS is not suitable for obnoxious Corporate eXecutive iMission Critical eTasks (oCXiCeTs). Several Corporate eXecutive iMission Critic@l eConsultants (CXiCeCs) have demonstrated that OSS should be deployed in the eCorporation within several quarters, or iTimeUnits as they are now called. This will enhance our VC-funded eCorporation's rapid execution of oCXiCeTs.

    Furthermore, numerous CXiCeCs have given me an executive summary about OSS, from which I have learned that because OSS is all capital letters, it will fail to succeed in eBusiness, E-commerce, and/or iWare deployment. One eConsultant suggested a paradigm shift to oSS, while another one suggested O-ss. But the latter was actually an E-consultant and not an eConsultant, and he hardly ever uses the IT acronym, so I called an eXecutive iMeeting where we all met in a conference room where we reached a consensus that the so called "E-consultant"'s credibility is lower.



    Seriously, I agree with Bruce. "OSS" is highly obnoxious. Don't use it.

    So is eAnything, E-anything, iAnything, and many more. Terms like those reek of marketing, rich presentation (read: content free tripe), and all those other things which make the web, computing, and many other things less fun. (Maybe Bruce agrees with that point too, but I wouldn't want to speak for him. He's been misrepresented enough as it is.) Whenever I see eFoo or E-Bar, I turn the other way as fast as I can. You should too.

    --
    You're a suburbanite.
  34. Re:Replication server for postGreSQL? by Telamon · · Score: 2

    There is replication code available for Postgres at ftp://ftp.sdc.com.au/pub/repl/ I've never used it myself, but they mentioned it on the general list a while ago.

  35. Postgres speed by FutileRedemption · · Score: 2

    according to numbers I saw on the web Postgres is about 20% slower than Oracle for TPC tests.

    Not very far off, especially when considering that Oracle probably optimizes their DB specifically concerning benchmarks.

    Have no link, sorry.

    Oracle et al however have some edge or two regarding reliability features (and features in general).

    1. Re:Postgres speed by Eric+Green · · Score: 3
      PostgreSQL's query speed is reasonable, within 10-20% of Oracle's and "close enough" to MySQL's. Postgres's insert speed is hideous. I benchmark it as less than 1/3rd of MySQL's speeds on inserts.

      Much of Postgres's reputation for slow speed stems from attempts to use it as a CGI back end database. PostgreSQL, like Oracle, takes a long time to open and close connections. This is because PostgreSQL spawns off an entirely new process for each connection, complete with checking the user permissions etc., while MySQL spawns a thread, which is considerably faster to do but potentially less reliable if a thread dies due to a bug in the RDBMS (ask Microsoft what happens when a thread dies without releasing any locks that it has! Ever wonder why Windows dies so often?!).

      In short, if you want to use PostGreSQL as a web back end, use a persistent connection -- do NOT open and close the database for each CGI transaction!

      -E

      --
      Send mail here if you want to reach me.
  36. Re:MySQL performance - a decent hack by hardcorejon · · Score: 2

    I have a similar solution. I'm using mysql to develop a large-scale commecial website with gnujsp and apache jserv.

    My problem is this:
    This is a database of commercial real estate data. There is really only one main table, but it has lookup fields that reference as many as 7 other tables, most with funky outer joins. Performing even simple queries on it is horrific.

    My hack is this:
    Every night, at 3am, do a GIGANTIC 7-way merge of all the relevant tables, and insert the results into a MAMMOTH SINGLE TABLE, the cache table. Lookups are lightning fast, and the data is at most 24 hours old (this is reasonable in my case). Building the cache actually only takes 15 seconds or so --- abominable if you're waiting for a web page to load, but no so bad at 3am :)

    This technique may or may not be available to you, depending on your circumstances. I find it to work quite well with mine.

    - jonathan.

  37. Re:Transactional DSS/OLAP? by costas · · Score: 2

    Not all DSS environments are/should be read-only. My employer deploys one of the largest (if not the largest) read-write DSS client-server application.

    Transactions/Rollback are not critical (we have our own rollback solution over Acumate, made by Kenan) but in a read-write environment, they sure are good to have around.

    OTOH, you're right; in a DSS environments reads are always way more common than writes, so a rollback layer could actually be preferrable to a built-in transactional functionality.

    So, forget about transactional control, it's not even essential; are there *any* OSS OLAP apps out there? they sure would make a nice toy ;-)...

    engineers never lie; we just approximate the truth.

  38. Re:OSS is not a solution for every problem. by costas · · Score: 2

    While I agree with almost everything you said (including the part about Linux being an excellent imitator and a rare innovator), I am tired of seeing the same premise being used again and again wrt OSS: you don't necessarily need a buncha 'lone hackers' hacking away at a problem.

    Let's suppose that a fairly good-sized company came out with an open-sourced RDBMS/OLAP framework, that has all the groundwork to be fast and extensible. I am sure there are a few companies out there that would benefit to contribute to such a project for their own uses... Why can't OSS act as a framework and collaboration scheme, instead of just a toy for hobbyists?


    engineers never lie; we just approximate the truth.

  39. Re:RedHat 6.2 and some custom code? by MemRaven · · Score: 2
    This would have to be VERY specific to the application in question, if they're doing trivial failover. Database connections of all kinds keep a WHOLE lot of state around. If you want transactional failover, you're going to be hard pressed to do it without a full-blown 2PC implementation.

    Imagine the following scenario: You, client C, initiate a service/transaction with server A. Server A crashes. Your service/transaction fails over to server B. However, server B knows nothing about what's been going on in your stateful connection, and is thus unable to help you.

    The only solution that you can have within a database for failover is in the case when you have a stateless connection, and/or full state replication across the cluster of server machines. However, this alone is going to almost certainly bog down any of your transactions to the point where you don't want to continue anymore!

    A company which has done this extremely well is the old Tandem, with their NonStopOS and NonStopSQL. Their mastery of failover in transactions was superb, and quite a bit of wall street is still running their stuff, even though Compaq has been quite unable to successfully market their services now (NonStopNT.... that's a complete oxymoron!).

    I suppose you COULD implement the type of service you're talking about with failover stateless connections, but if you're going to do that you're much better going with an app server in the middle, which handles failover of connections in an EJB type model, and use the features of the app server to handle this.

    So to sum it all up: For simple services like RPC and HTTP and FTP and whatnot, which are stateless (or very nearly so) by their nature, simple failover is fine. For extremely stateful services (like a database system), you've got to use much more complicated measures.

  40. Open-Source OODBMS? by 1010011010 · · Score: 2

    How about an Open-Source Object Database? I've been fooling around with Cache' on Linux for a week or so, and it seems nice. I think OODBMSes are the way to go for web applications, especially ones that make any use of XML. Any Open Source ones?

    --
    Napster-to-go says "Fill and refill your compatible MP3 player", which is a lie. It's not MP3. It's WMA with DRM.
  41. RedHat 6.2 and some custom code? by crankyspice · · Score: 2
    Question... RedHat 6.2 claims "Generic service failover support. High availability support is no longer available only for web and ftp servers. Now you can have virtually guaranteed service from any of your servers including mail, pcp, and dns."



    While they don't mention database servers specifically, "Generic service" indicates that they've figured out some way (probably similar to watchdog coupled with some form of ARP spoofing) to fail a down server over gracefully. Which leaves data replication.<P>

    Would this work in conjunction with a simple mirror program? What I'm thinking of is... <P>

    A small daemon that runs (and listens) on the 'master' box. When a database connection is initiated it reads the commands sent by the client and sends them on to the 'master' database server, and again to a secondary server.<P>

    Both the master and secondary server have current copies of the data, and then leave the actual failover to whatever RedHat has concocted (still downloading the .iso, so this may or may not make sense).

    --
    geek. lawyer.
  42. Re:if you're not a religious fanatic... by Mr.+T · · Score: 2

    Umm, I'm an Oracle DBA, is that considered "enterprise level"? I read /. every day. Oracle is an excellent product, but I do think that it would be great if there were an open source alternative. At work, we are currently sitting around waiting for Oracle to release some bugfixes that impact our application. If it were open source, it would probably be fixed by now. Plus, imagine all the neat features you could add!!! I would definitely get involved with that project, if it ever came about.

  43. Re:OSS is not a solution for every problem. by JTB · · Score: 2
    Let me get this straight. So you're saying that anything that requires an "enormous amount of time and effort" and is "quite a challenge" can't be done by traditional open-source methods? You mean, for instance, an operating system? [snipped open source zealotry]

    Enterprise database applications are subject to FUD that runs far deeper than what we see in the desktop OS competition. I implement enterprise datawarehousing solutions for a living, and I see that most clients will ONLY accept an implementation that's built on the RDBMS they already use in house.

    You can open source the operating system, or the desktop environment, because your test bed starts out as hobbyists (the early adopters). Good luck finding the first company that will take the plunge and bet their entire transactional system on an untested platform, when they can pay for the insurance of an Oracle or a SQLServer.

    -JTB

  44. Re:OSS is not a solution for every problem. by IO+ERROR · · Score: 2
    Let me get this straight. So you're saying that anything that requires an "enormous amount of time and effort" and is "quite a challenge" can't be done by traditional open-source methods?

    You mean, for instance, an operating system?

    Or maybe a graphical desktop environment or two?

    How terribly shortsighted. I'm sure there are many people out there who are up to the challenge.

    Besides, what you're basically saying is that open source projects don't scale very well, and all of the above examples disprove that.
    ---

    --
    How am I supposed to fit a pithy, relevant quote into 120 characters?
  45. If not relational, then what? by Tassach · · Score: 2
    In fact, I'm not 100% convinced about the relational model (with it's SQL) being the best one for using at enterprise level, there should be better solutions for the 100.000% availability problem.
    If not relational, then what? Object-relational? Flat-file? What technology exists that can perform the same range of tasks as a rdbms with the same level of efficiency? O-R is still in it's infancy, and is very much a special-purpose tool. Flat-file does not scale well except for some specialized cases.

    RDBMS is a very mature, flexable technology. SQL (& RDBMS) does have some shortcomings, but then every tool does. For many (most?) typical business applications, RDBMS is the tool of choice.

    Because modern RDBMS's are so pervasive and accessable, many people forget that designing databases and writing SQL is a specialized and difficult art. A well-designed database can outperform a poorly-designed one by an order of magnitude or more. When people talk about RDBMS's having poor performance my first instinct is to question the db design and not the engine. Sadly, in my experience, most programmers treat the database almost as an afterthought. (I design databases for a living, in case you havn't guessed)


    "The axiom 'An honest man has nothing to fear from the police'

    --
    Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?
  46. Add DB2 to the list of "enterprise-ready" RDBMS by swamp+boy · · Score: 2

    DB2 should definitely be added to the list of "bet the company's jewels" RDBMS products. This is the database of choice for many banks and financial institutions. BTW, the company that provides data processing for roughly 40% of *ALL* mutual funds stores *your* data in DB2 (mainframe flavor) -- I used to work for them.

  47. erlang / mnesia / rdbms by Bob.Smart · · Score: 2

    Erlang (www.erlang.org) is open-sourced sw that Ericsson use internally. This has a "production quality" DB called felicitously mnesia. It has failover, replication, very easy to use transactions, ... It's not relational but the programmer's interface is very easy. There is an rdbms front-end advertised in the erlang contrib area - don't know how good that is.
    Erlang deserves to be better known: it is one of the biggest and best open source projects.

  48. Re:MySQL Clustering by ZeroDiV · · Score: 2

    It works, it's not beta i've used it on a cluster of 6 machines. Course it's going to work for me since i wrote it. The mysql guys are working on replication support themselves.

  49. Just bite the bullet and buy Oracle or DB2 by Anonymous Coward · · Score: 3
    You're not going to find open solutions that come close to these products, at least not yet.

    Added to which, you can get excellent support (at a price) for these products, which may ultimately be worth more to you than anything else.

  50. Failover vs. Replication by Eric+Green · · Score: 3
    Failover generally relies upon transaction logs and when machine A fails, uses machine A's transaction logs to bring machine B up to date. The result is that machine B is current as of the data that machine A failed.

    Replication generally does not take place on as current a basis. Replication can be done with any currently-existing RDBMS simply by adding a "last changed" timestamp to each record, and then occasionally querying out all records before a certain timestamp and sending those records to the remote server. Replication thus ensures that you have a snapshot of the data at some point in time, but not necessarily an up-to-date version of that data.

    Those differences are why replication can be done by third-party software, while "real" standby requires specific support at the RDBMS level. I have done replication with technology as crude as dBase II by explicitly programming it into my applications. Standby, on the other hand, is a different issue.

    Finally: For small businesses, none of these particularly matter. Small businesses don't have the money for big replication servers and such. They rely on an older version of replication: they have a paper copy of everything that goes into the computer, and make nightly backups (small businesses, unlike large businesses, don't run 24/7 and thus the RDBMS is stable at that time of night, they don't need the ability to do "hot" backups). If the database crashes and eats their data, they restore it from tape, then manually type in any transactions that weren't on the tape (using their paper copies). PostGreSQL is already "there" as far as small businesses are concerned, from a price/feature standpoint. PostGreSQL's only real issue is performance, especially performance when adding records, which is pathetically slow (using transactions helps, in that at least you don't get a sync for each "insert" but rather a single set of syncs at the "end transaction" time, but it's still less than 1/3rd the speed of MySQL even if you turned off the syncs altogether and ran it in the dangerous "async" mode). However, most small businesses don't need to add 200,000 records in a short period of time... I can add 200,000 records in little over 3 minutes with PostGreSQL on modest hardware, which is pretty pathetic, but far beyond the needs of a small business.

    -E

    --
    Send mail here if you want to reach me.
  51. Yes, but Open Source is the solution for THIS one. by Bruce+Perens · · Score: 3
    Inprise decided that the way for their database to remain competitive was for it to be Open Sourced. The thousands of man-hours have already been invested, but they think the community participation is important enough that they are opening it now, after that work has been done. Interbase is a rather powerful enterprise database, maybe not as powerful as Oracle yet but I think the Open Source folks might be able to help that.

    By the way, "OSS" is an acronym coined by Microsoft, and one I've never liked. Can we please call it "Open Source"?

    Thanks

    Bruce

  52. Re:list of open source RDBMS's by Nicolas+MONNET · · Score: 3

    Hey I'm developping a B2B exchange portal as well. I'm using MySQL. But damn, the thing sucks for everything more than simple selects. Plus selects are sometimes NOT optimised at all. Computing some basic stats on the content of the DB brings it down to its knees. I had to add another server duplicating the content in a very hackish way so that users would'nt be slowed down to a crawl every time we had to compute the stats (often). Duh.

  53. Postgres-- what about DSS/OLAP? by costas · · Score: 3

    I've done some research on this topic, and by far the most open/tested OSS RDBMS out there is PostgresSQL, which is actually based on the same tree from which a commercial RDBMS, Ingres, was spawned. However, I do not believe that Postgres is solid/fast enough for enterprise level applications yet (mind you, this is from reading a lot of Usenet and Web opinions on it, not personal experience, so YMMV).

    Since I am a DSS ("Data Mining") guy though, I am more interested in an extension to your question? is there any decent OSS OLAP/DSS DB system out there that can handle multiple dimensions, transactions, partitioning, ? If not, how many people would be interested in one?

    engineers never lie; we just approximate the truth.

  54. Chris Browne has a good site for this. by Brand+X · · Score: 3

    Christopher Browne has a good reference page for this. I stumbled across it while researching possible database solutions for one of my projects, and it seems to be well maintained and updated. The page can be found at: http://www.ntlug.org/~cbbrowne/rdbms.html

    --
    -- Still waiting for the Nike endorsement
  55. Replication by Falsch+Freiheit · · Score: 4

    First off, databases look for connections to a port, just like any other service like httpd, smtp or ssh. Doing things so that you have failover and/or load-balancing on *that* is easy. There are a large number of solutions, from application level to library level to dedicated hardware to general purpose software (LVS).

    That said, database failover is *not* easy.

    The basic problem is that you don't just read from a database, you read and write to the database. Database replication is a very non-trivial thing to do. You have to deal with the possibility of data being inserted, updated or removed from any of the servers that are involved. This requires that you have things like global locks, timestamps on everything, primary keys that are generated in a way unique to each replicated version of the database, etc.

    Most of the work of replication is handled very nicely by the big commercial databases (but you will find that they want you to pay for it.), but even with those you might have to make minor changes in your application (or maybe just the database structure) in order to handle replication properly...

    AFAIK, there's nothing you can get for free that will be able to handle any kind of real-time n to n (or 1 to n and n to 1 simultaneous, even) replication. You can do stuff like have mysql dump a log of everything that's done that changes the database, and then import those changes into another mysql instance someplace else; but that's nowhere near what the big commercial databases can do when it comes to replication.

    Given that replication is gonna cost real money, just get your free database running on a good-quality machine with good backups (that you make sure include the database) and maybe things like redundant power supplies. You can make a really robust box for much less than Oracle will charge you to license 2 instances with replication.

    (Or hire a team of 10 database programmers to spend a year or 3 adding replication to postgresql. Well... maybe they could do it wrong in a month -- I'm trying to give a wild estimate for doing it vaguely close to right.)

  56. Failover info... by Dacta · · Score: 4

    Sorry.. I forgot to talk about failover.

    There is a third-party, non-open source replication tool available fro Interbase 5 and 6 that will handle that data replication stuff you need for failover. The actual server switching I think you might have to write yourself... I suspect you might find there are a lot of people willing to help you.

  57. MySQL Clustering by starlingX · · Score: 4
    Check out:

    http://www.freshmeat.net/appindex/2000/01/15/94795 2558.html

    It's a toolkit for clustering MySQL, which may or may not provide the failover facilities that you need. It looks beta, but may work... I haven't played with it yet. And of course, this is only if you're willing to use MySQL. Looks fun, though.

    Otherwise, if you're doing serious enough work to need serious RDBMS backend, I hope you're making enough serious dough to afford a serious RDBMS. But before I invested big bucks into a failover setup using a commercial RDBMS, I'd cover my other bases first. Most DB failures I've seen were not a fault of the RDBMS itself, but hardware, power, diskspace, OS, human error, etc. The only time I've seen an RDBMS crap the bed at work, it was Micros~1 SQL Server and you should know better than to buy that. With our Sybase machines, it's always NT that takes down the box, not Sybase. Get these bases covered (hi-quality hardware, reliable power with a big UPS or generator, non-Micros~1 OS, and a DBA that knows what they're doing) and you most likely won't need a failover system.

  58. Re:OSS is not a solution for every problem. by emerson · · Score: 5

    Operating System design has been a solved problem for about a quarter of a century. 18-year-old kids write OSes as freshmen projects now. There's an astounding amount of freely-available material out there, including the source to several OSes, to get you started on doing such a thing, and there has been for a long time.

    The graphical desktop environments you point out are still in the infancy stage, after 3-4 years of development. Although they're very cool, (as is Linux), both are basically aping existing art, and not very well in some cases. (Go on, moderate me down as flamebait and argue with me; this is not my primary point.)

    RDBMS design is relatively new in the long view of the art of Computer Science. The commercial vendors that pay thousands of people fulltime to develop them still don't have it down correctly. It's certainly not progressed to the point where it's textbook knowledge, where any fresh-faced CS grad can get in there and build one.

    I think the most telling evidence that this is beyond the current state of the art of Open Source, though, is the fact that it doesn't exist yet. It's not like this Ask Slashdot is the first time anyone's ever thought about free alternatives to the big boys of databases. It's a known need.

    And the "Cathedral and the Bazaar" theory of Open Source as 'developers scratching their own itch' would seem to mean that if it COULD be done, it WOULD be done. Intrepid hackers would have taken Postgres or the GPL'ed old MySQL or the like and had a project underway years ago if it were just that easy. Apparently, it's not.

    Open Source projects do not _innately_ scale well, merely from the fact of them being open. There are a lot of preconditions for a project to work. At least these three need to exist:

    1) A critical mass of developers see the need.
    2) That mass of developers (or a subset thereof) can be organized into a team to do the work.
    3) That mass of developers has the knowledge and skills to do the required work.

    As an example, there's a large set of people that would like an open-source engine to crack DES keys in real-time on commodity hardware. That takes care of 1. Getting a set of crypto hackers together to do this work would be a snap, cf OpenSSL et al. There's 2. But, the art isn't there. The general public, the mass of developers out there, simply don't have the knowledge or experience to do that. Maybe it can be done, maybe it can't, that's not the point. The point is that just saying "I'm starting an Open Source project to do [something really cool]" doesn't mean it's something you actually CAN do.

    So, back to the topic at hand, yes, RDBMS is something that's outside of the Open Source world at this point. The skills required to do such a thing are reporting for work at Redwood Shores and Emeryville and the like, and not willing or able to share their really-quite-esoteric knowledge with the free software universe.

    And pointing to a handful of other projects that are working doesn't prove that this project, or any other, will. Look around; the code's not coming, mostly because, no, there aren't many people out there who are up to the challenge. Or else they'd be doing it.

    The proof, as they say, is in the pudding.

    --

  59. OSS is not a solution for every problem. by mrsam · · Score: 5

    This is, perhaps, one of those situations where you are rather unlikely to find an OSS solution. An enterprise-level RDBMS is not exactly an overnight hack. It would take an enormous kind of time and effort to put an enterprise-level RDBMS.

    I'm not even sure that an enterprise-level OSS RDBMS makes sense. If you're talking about an enterprise-level product, you probably have the resources to afford to purchase a commercial RDBMS. Both Sybase and Oracle have decades of experience building robust, rock hard, RDBMSes that you can bet your company jewels on, and both companies have Linux versions of their RDBMSes. I'm not sure about Informix, they may have a Linux version of their database engine too.

    Right now, attempting to deliver an OSS RDBMS that can beat time-tested RDBMSes from Sybase and Oracle would be quite a challenge, to say the least. People will just have to accept the fact that OSS can't solve every problem in the world. It's certainly feasible that one day Sybase or Oracle might decide to release their respective RDBMS engines under an OSS license (which should certainly be quite a shock to many people), that's probably the most likely scenario.
    --

  60. Not at all unreasonable! by FallLine · · Score: 5

    The position that any Open Source, and most importantly "free", RDBMS could NOT hold a candle to Oracle (amongst others) where the needs of "enterprise" as concerned is reasonable. The development of Linux is (was), in many ways, the antithesis of what it takes to develop an equivalent to Oracle. Think about it:

    Firstly, kernel's are not all that complicated.

    Secondly, Linux's kernel is about as simple you can get.

    Thirdly, Linux, thus far, has had the advantage of being behind the curve in development. It has had the benefit of essentially just being able to copy features, design, and even some code. While the code may be from scratch, virtually nothing else is. In other words, Linux's code was built with 20/20 hindsight. Furthermore, it's done nothing particularly spectacular with that opportunity.

    Fourthly, the motive and feedback in developing an enterprise class RDBMS are vastly different. In developing Linux, there is something quite tangible, such that most "geeks" can get relatively instant gratification. Do geeks really personally need such a database (of course not, not even their hardware approaches this)? Contrast this with an RDBMS, if "they" make the system %.001 more reliable, by whatever measure, under heavy loads, how can they appreciate that? (never mind testing it) These things are about numbers. A company that does 1 million a day in sales with a broken database can't afford that %.001, but a geek screwing around developing a database in his spare time for his uses is not likely to experience such a problem. And even if he is, does that %.001 really mean anything to him? He can just reboot. Sure, you can say "given enough eyes all bugs become shallow", but ask yourself: Whose eyes, and how do these bugs become apparent? A minor bug in Linux, and a couple geeks have to reboot their PC (which presumably prompts them to patch it). Which company is going to volunteer to expose itself to such a bug---possibly millions of dollars in lost revenues (amongst other concerns). The scale of the two uses changes the nature greatly (likewise, empirically, Linux demonstrates such leanings--towards areas which most geeks appreciate, but not necessarily enterprise)

    Fifthly, the development process itself is vastly different. Because Unix, and particularly Linux, is quite modularized, it can be developed is a ragtag fashion. This is not true with an RDBMS. They're are very complex and interconnected.

    Sixthly, support is a major issue. Despite all that hype amongst the linux community about "support" it is absolutely undemonstrated. And when you consider issues such as proper documentation....

    Seventh, given some of the risks i've mentioned, what company is going to prefer to try to save 10k in software fees for the increased risk. The expected value in this case could very well be negative.

    ...I can think of more, but that is plenty of doubt right there. Think critically, not dogmatically.

  61. Database failover is only available commercially by Timothy+Dyck · · Score: 5

    I don't think you will find database failover in an open source product -- it's too specialized an area to have gathered interest from the necessary critical mass of developers.

    I cover databases for PC Week Labs, and in my view (and in my tests), the most sophisticated OSS relational database is PostgreSQL. In three months, though, that title will be held by InterBase. It has a more complete SQL implementation, has a (third-party) replication option and is generally more mature. However, it doesn't have any failover.

    The products that do provide failover are (non exhaustive list, but these are the main ones): Oracle Parallel Server (or Oracle with the Failsafe option if you want just 2 node failover), Sybase ASE, IBM DB2 UDB, Informix, Tandem NonStop SQL and Microsoft SQL Server. The cheapest option in terms of purchase price will be Microsoft SQL Server with Microsoft Cluster Server, but it only runs on NT and only supports 2 node failover, and does not cluster.

    Note that when you talk about failover (particularly with stateful connections), you need to also get failover-capable hardware systems. The databases mentioned above all need special, tightly spec'ed high availability hardware configurations from Sun, HP, etc. In general, a shared SCSI bus is required.

    You also need to decide what level of failover you need: (from simplest and cheapest to most complex, expensive and best):

    * cold standby (replicate to a standby server) -- no shared storage, mostly up-to-date, cheap, manual intervention required for failover

    * warm standby (shared storage or transactional replication to a standby server) -- ensures no data loss for all committed transactions, requires fault-tolerant hardware and a fault-tolerant dbms, failover is automatic but may take several minutes to roll the log forward, warm up the cache and reconnect users

    * hot standby (shared storage) -- no data loss for committed transactions, fast failover (30 seconds or less) -- this is a very specialized area, and you should have your dbms vendor work with you on setting up these kinds of HA setups. This is an ongoing area of research; Sybase 12 and Oracle8i 2.0 introduced new features specifically in this area to do things like pre-connect users and pre-warm the cache to speed failover times.

    If you want to have an OSS solution, I'd advise using PostgreSQL or InterBase and writing stored procedures in C to replicate inserts, updates and deletes to a second server, then coding reconnect and heartbeat logic into your front-end apps. This will be easiest to do with an app server since you have a single point of data access. This will not provide atomicity or durability though (unless the updates use 2PC) -- you'll need to run a consistency check on the db on failover.

    This is hairy, though. Overall, I advise paying for something that someone else has gotten working.

    Regards,
    Tim Dyck
    Senior Analyst
    PC Week Labs

  62. Interbase by JohnZed · · Score: 5

    Check out interbase.com or interbase.org. Interbase has been in use in enterprise environments for something like 15 years. It has very advanced crash recovery features and replication throught a commercial plug-in. Version 6.0 is under the MPL and in beta right now for Solaris, Windows, and Linux. No, I don't work for them, I just sound like that.
    --JRZ

  63. Interbase, Interbase, Interbase!!! by Dacta · · Score: 5

    Interbase v6 is in (free beer) beta right now, and will be released under the MPL (Mozilla Public Licence) by Mid-June

    You can download the beta now for Linux, Windows or Solaris from Interbase.com or The Interbase Developers Inititive

    Supports databases up to 32 TB (!!!) spread over 2GB files and is fast and reliable. (It's been around for 15 years.) The biggest known Interbase DB is over 200GB.

    It's is ANSI 92 SQL compliant (well.. as compliant as any DB I've ever seen - better than Oracle for instance), and the support is Amazing

    Join the email lists at www.mers.com, and you'll be able to get answers from Ann Harrison (the president of Interbase.com), or from a lot of other people - it's the best support I've even seen.

    As for enterpise features - well, apart from large DB support, it has row level locking, transactions, referential integrity, blobs, Multi-Generational commits, stored procedures... ummm... I can't think of what else to say.

    Basically, if your Databases are less than 50GB then Interbase is the number 1 choice - above that maybe Oracle would be better, but that isn't exacly open source.

  64. Oracle strategies generalized to other RDBMS by grantsucceeded · · Score: 5
    I am an Oracle DBA, working in Silicon valley, I've had a lot of experience with various flavors of failover. You can have all sorts of degrees of automation and rapid failover, but most of the really high end solutions are very complex, and the added complexity buys almost nothing in terms of uptime. The "sweet spot", IMHO, for Oracle is a "standby" databse, or simply a raid box that is physically attached to two unix boxes, but only one mounts it at a given time. Not a clustered or Parallel Server implementation. There are some places that will pay the complexity and $$ price to have parallel server, but it really only gets your downtime down from, say 30 minutes to under 5 minuts *IF* it works right, automatically. Even parallel server, where both systems are up at the same time, you'll still see half of your connections aborted when one side of the cluster goes down, and you'll have to redirect that half to the surviving node. All the while dealing with complexity of these things *every day*.

    Here's a good way to do it in oracle, that'll probably genralize to almost any RDBMS:

    The simplest way is to have two unix boxes, each physically attached to a raid unit (dont use raid 5, use mirroring for performance). Don't use clustering software, just mount the filesystems on only one machine at a time. If the primary system fails, then mount the filesystems on the secondary , startup the database, and you're on your way with minimum downtime. Script it and tie it to some ping or other fail detection strategy if you feel lucky, but in my opinion, keep a human in the loop to actually execute the switchover.

    Another great solution is a so called "standby database". You make a copy of the primary database to your standby machine, using your backup tapes. Then you start to "roll forward" the standby database by applyin all offline redo logs of the primary database, as it is generated. This method should work for any database that logs transactions including Sybase, Informix and probably MySql. The equivalent of a redo log in MySql appears to be the "update Log". but I have no experience with MySql. One big drawback of this solution though, is that even though a transaction has been committd to disk on the primary, the event may not make it to the offline redo logs before the system crash. So a standby database can only be within some delta-T of the primary database, the last fiew transactions before the crash are lost. Sometimes this is acceptable, but if you billed the customer then lost the order, maybe not. You tune the redo logs buffering parameters to trade of performance for small "lag time" of getting redo logs across to the standby database.

    The truth is that it takes 10x effort to get failover and cluster software correct, and 9 times out of 10, the automatic failover either triggers accidentally (bad news) or triggers correctly but fails to come up on the secondary. This is because you have to be incredibly scrupulous about keeping non-shared resourses in sync on the two machines and you can't test the failover (politically that is) until it actually fails. Heck, I just took a call this past saturday for this very problem: Site paid $$ for a bigger consulting firm to implement cluster, firm cant be reached on saturday when cluster trips but does not failover successfully. Customer calls me. I am to gracious to say I told you so.

    My free advice: Make two machines, not a cluster. Keep the two machines in sync, using the "standby technique", or make the raid unit accessable by both machines. Keep a human in the failover loop.