Slashdot Mirror


Beyond Relational Databases

CowboyRobot writes "Relational databases were developed in the 1970s as a way of improving the efficiency of complex systems. But modern warehousing of data results in terabytes of information that needs to be organized, and the growing prevalence of mobile devices points to the increasing need for intelligent caching on the local hardware. According to the ACM, the future of database architecture must include more modularity and configuration. Although no concrete solutions are included, the article is a good overview of the problems with modern data systems."

12 of 360 comments (clear)

  1. KISS by mcrbids · · Score: 5, Insightful

    Some of the biggest problems that "new" database designs have:

    1) Overly complex

    2) Don't scale

    3) Tied to a single platform/implementation

    4) Poor performance

    It's typical to see all four in a single try!

    SQL, on the other hand:

    1) Reasonably simple API

    2) Scales to very large databsaes

    3) Cross-platform/architecture

    4) Performs very well.

    Given the insane amount of inertia SQL has, it will extend into an object model, rather than be replaced by one. (EG: C/C++)

    --
    I have no problem with your religion until you decide it's reason to deprive others of the truth.
    1. Re:KISS by NoOneInParticular · · Score: 3, Insightful
      Google's web index and desktop search facility is a database. I don't know about point 1, but Google definitely blows any relational database out of the water on point 2 to 4.

      As for SQL I do not agree with point 4: SQL does not perform very well. We're in the age of Ghz processors, fast disk drives and it *still* is an performance issue to add a few million records to a database? What SQL sorely lacks is a recognition that the 4th generation of software languages was a bad mistake, and get back to a third generation language: explicit indices, explicit loops over these indices and fast (compiled) execution of said loops. Just freaking program the database instead of waving chicken bones at it.

    2. Re:KISS by MSBob · · Score: 3, Insightful
      Relational algebra is a very nice and tidy concept but SQL is a piss poor (and limited) implementation thereof.

      In other words, relational databases are very nice and elegant but their interface (SQL) is bad and should be replaced.

      Also relational databases by themselves can't supply the needs of a typical enterprise and that's where technologies such as OLAP are built on top of RDBMS to make certain data manipulations efficient.

      --
      Your pizza just the way you ought to have it.
    3. Re:KISS by WaterBreath · · Score: 3, Insightful
      Usually a piss-poor database reflects on piss-poor developers.

      This is a BIG IMPORTANT POINT. But, unfortunately, the people that are architecting the databases in these cases usually don't realize that they're producing bad designs. All they see is "The database is slow. Why the *!?$ is it so slow!?" There's usually one or two answers: 1) Your schema is organized poorly. 2) You're just plain juggling too much data at once. And let's not forget the deadly combination of both 1 and 2. 100M rows, for example, is a lot of data, any way you look at it. If you've got 100 bytes per row (remarkably small in many cases), that's 10G of data. If you're shuffling through all that data every time, yeah, it's gonna be slow. Ya might want to look into reorganizing, or archiving.

    4. Re:KISS by Brian_Ellenberger · · Score: 3, Insightful

      Google's web index and desktop search facility is a database. I don't know about point 1, but Google definitely blows any relational database out of the water on point 2 to 4.



      Google is a very unique case. There are two things in Google's advantage that most RDBMS system have to take into account:

      1. Google does not have to update in realtime. If I add a page to my website it is not immediately available on Google. Contrast this to a normal RDBMS where if I add a record it must be available immediately. Google is much more similar to a Data Warehouse than a RDBMS.

      2. Websites indexes are more easily parallelized because complex joining of data is not needed. Naively, Google can store all websites starting with 'A' on a server, 'B' on a server, etc. You can store the User table and the Address table on separate database servers and expect to query on users and their addresses with any sort of performance.

      3. Going along with 3, the queries expected out of most RDBMS systems are much much more complex than any queries expected out of Google right now. I'm assuming you haven't seen any complex financial reports or statistics that have been generated from a RDBMS. Databases do alot more than "select name from user where id=1234".



      Brian
  2. Just because something is "old" by Anonymous Coward · · Score: 4, Insightful

    Doesn't make it obsolete. "Databases are old and kludgey. Teh suXX0rs for R0xxng H4XX0rs liek me.

    Just because people are too stupid to take the time to read and understand the theory and learn the application doesn't mean the technology is no longer relevant.

    Of course no solutions are proposed. There are none because relational theory is correct, and appropriate for real database driven applications. Little crap bulletin boards can use MySQL.

    Netcraft confirms relational databases are dead!

  3. Wake me up when it's ready by Vile+Slime · · Score: 3, Insightful

    People,

    Have been crying for the need to replace relational databases since the early nineties at least.

    We can all see where that got them.

    --
    ---- Go ahead, mod me down, I'll just post it again and you lose your mod points.
  4. . . . no concrete solutions are included. . . by kfg · · Score: 3, Insightful

    Funny how they never are, eh?

    KFG

  5. Re:I did not RTFA by AKAImBatman · · Score: 5, Insightful

    I didn't RTFA but for my needs

    Or the summary

    mySQL suits me quite well.

    That's nice. It won't handle a multi-terabyte database, though. That's the domain of Terabase, Oracle, and (blech) DB2. It's also what the article is about.

    The power of PHP and mySQL is all I need.

    And a moped is all you need to get to work. If you want to haul 300 metric tons of rock from point A to point B, you need a dump truck. Again, that's what this article is about.

    Back on topic, this entire article is mostly speculative for the moment. A lot of excellent work has been done in OODB and XMLDB designs, but no singular design has yet emerged to solve all our woes. For example, I love the Prevayler concept. It solves a lot of problems, lowers data access times, and provides for complete data security. It also isn't usable or scalable without a lot more design work.

    The future will hold some very interesting things, but for now we'll have to keep inventing until we come up with a consolidated solution.

  6. Re:I did not RTFA by techwolf · · Score: 5, Insightful

    Quite true. MySQL does very well into the gigabytes. I haven't seen any good evidence of its abilities in handling terabytes of data. Don't get me wrong, I'm a huge fan of the MySQL, but I'm a bigger fan of using the right tool for the job. For your web message board, MySQL works fine. For holding product, sales, distribution, etc. information for, say Levis, it would not.

    --
    I don't do this for karma, I do it for cash. It's much better.
  7. Why 'Beyond'? by Anonymous Coward · · Score: 5, Insightful

    Designed in the 1970s, the RDBMS has nevertheless proven to be the cornerstone of Web development three decades later. Thanks to systems like MySQL deployments are surely at record levels.

    Essayist Clay Shirky has gone to far as to suggest that MySQL is at the center of a whole new software movement.

    In my experience with Web applicaions the chief problem with the RDBMS seems to be that it does not do text indexing and search very well, so I have to keep a second store of data in something like Lucene.

    The other major problem is the level of skill required to tune the database to achieve high-performance SQL queries, so hopefully the RDBMS will evolve with more self-configuration capability.

    The article, which I only skimmed, actually addresses these two concerns but seems to pooh-pooh the notion of simply refining the existing RDBMS systems. Instead it says " Old-style database systems solve old-style problems; we need new-style databases to solve new-style problems. "

    The paper seems awfully squishy on what this means. The clearest I found was a call to "produce a storage engine that is more configurable so that it can be tuned to the requirements of individual applications."

    But this call for new highly modular/configurable storage "engines" seems to me to require at least as much fussy care and feeding as a traditional RDBMS. You're just replacing one DBA with another. And throwing out decades of refinement in the process.

    The raison d'etre of the RDBMS is to allow the programmer to treat storage as a black box while gaining nifty ACID features. Extending this to text indexing seems logical.

  8. Re:A thought on XML documents by The+Slashdolt · · Score: 4, Insightful

    Here is the problem with your idea. Unlike the relational model, XML does not link facts. XML documents can be joined in any way, either valid or invalid, without you knowing one way or another. The relationships between documents are weak. There is no referential integrity. Within a proper relational model you are stating facts and factual relationships. Joins of those facts generate derived facts that are as true and accurate as your original model. Why add the overhead and complexity of xml? Why not just use a proper relational model?

    --
    mp3's are only for those with bad memories