Slashdot Mirror


Is the Relational Database Doomed?

DB Guy writes "There's an article over on Read Write Web about what the future of relational databases looks like when faced with new challenges to its dominance from key/value stores, such as SimpleDB, CouchDB, Project Voldemort and BigTable. The conclusion suggests that relational databases and key value stores aren't really mutually exclusive and instead are different tools for different requirements."

38 of 344 comments

  1. new record by hguorbray · · Score: 5, Interesting

    that's efficient -a summary that refutes the inflammatory headline

    I'm just sayin'

    1. Re:new record by Jah-Wren Ryel · · Score: 4, Funny

      Yeaah. Only if you did not know the meaning of the '?' symbol.

      --
      When information is power, privacy is freedom.
    2. Re:new record by bFusion · · Score: 4, Insightful

      Well the '?' means that there's a question. The summary gave the conclusion to that question.

    3. Re:new record by julesh · · Score: 5, Funny

      that's efficient -a summary that refutes the inflammatory headline

      I'm just sayin'

      Nah. Efficient would be if the summary were "No."

    4. Re:new record by eln · · Score: 4, Funny

      Next Slashdot article: Is Jah-Wren Ryel a child molester?

      There's no evidence Jah-Wren Ryel has ever molested children, and no reason to suspect he would ever do so. Bandying about accusations like that would likely ruin his life forever.

      However, since child molestation is such a big political issue these days, as a responsible news site I believe we need to have equal representation from both sides of the argument and let our viewers decide.

  2. Uh-oh by benjymouse · · Score: 5, Funny

    Someone forgot to put a where clause on that delete.

    --
    Reading slashdot one-liner: (irm http://rss.slashdot.org/Slashdot/slashdot).rdf.item | fl title,desc*
  3. Top 25 Reasons the Relational Database is Doomed by MillionthMonkey · · Score: 5, Funny

    Someone type this up and submit it to Digg.

  4. Hey! by MightyMartian · · Score: 4, Insightful

    Hey, read my article! Just to make sure you do, I'll pull a Dvorak and put in some incredibly sensational headline about how RDBMs are dewmed!!!!!! BWAHAHA, feed my advertisers!!!!

    (Tune in next week, when I write about how C programming is going to become extinct in the light of fantastic new development tools like C# and Ruby on Rails!!!)

    --
    The world's burning. Moped Jesus spotted on I50. Details at 11.
    1. Re:Hey! by dkleinsc · · Score: 5, Insightful

      Especially when the claim is as ridiculous as this one.

      There's a reason relational databases took over the world of databases: They provide a good combination of flexibility and structure to efficiently represent data. Which is what databases are supposed to do.

      --
      I am officially gone from /. Long live http://www.soylentnews.com/
    2. Re:Hey! by Just Some Guy · · Score: 5, Insightful

      There's a reason relational databases took over the world of databases: They provide a good combination of flexibility and structure to efficiently represent data.

      Especially since so many databases really are inherently relational. The textbook example of 1-customer:n-invoices, 1-invoice:n-items plays out quite a bit in the workplace.
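
      For illustration, a minimal sketch of that textbook shape, assuming Python's built-in sqlite3 module. The table and column names (customer, invoice, item) and the sample rows are made up for the example, not taken from any real schema:

```python
import sqlite3

# In-memory database; every table and column name here is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE invoice  (id INTEGER PRIMARY KEY,
                           customer_id INTEGER NOT NULL REFERENCES customer(id));
    CREATE TABLE item     (id INTEGER PRIMARY KEY,
                           invoice_id INTEGER NOT NULL REFERENCES invoice(id),
                           description TEXT NOT NULL,
                           price_cents INTEGER NOT NULL);
""")
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
conn.executemany("INSERT INTO invoice VALUES (?, ?)", [(10, 1), (11, 1)])
conn.executemany("INSERT INTO item VALUES (?, ?, ?, ?)", [
    (100, 10, "widget", 500),
    (101, 10, "gadget", 250),
    (102, 11, "widget", 500),
])

# One customer -> many invoices -> many items, totalled per invoice.
invoice_totals = conn.execute("""
    SELECT c.name, i.id, SUM(it.price_cents)
    FROM customer c
    JOIN invoice i ON i.customer_id = c.id
    JOIN item it   ON it.invoice_id = i.id
    GROUP BY c.name, i.id
    ORDER BY i.id
""").fetchall()
print(invoice_totals)
```

      The two joins follow the 1:n links directly, which is the part a plain key/value store would leave to application code.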

      --
      Dewey, what part of this looks like authorities should be involved?
  5. Enough with the death of the relational DB by Mr. Underbridge · · Score: 5, Interesting

    This same basic story keeps getting submitted from the same group of people who are generally trying to sell non-relational-DB stuff. This is an ad. Move along.

    1. Re:Enough with the death of the relational DB by Penguinshit · · Score: 5, Funny

      Don't online dating sites use relational databases?

  6. Finally the OODB people will by thammoud · · Score: 5, Insightful

    Leave us RDBMS dinosaurs alone. String Name/Value pairs, that is a great innovation. In other news, Sun will be dropping all types from the Java object system and rely on the VOID type. Idiots.

  7. Re:Voldemort! by youthoftoday · · Score: 4, Funny

    A Harry Potter fan? Voldemort? Surely the name is the one thing that'll *prevent* approval?

    --
    -1 not first post
  8. Re:Voldemort! by fuzzyfuzzyfungus · · Score: 5, Funny

    The name might be cool; but the length of some of the commands will really get to you. How many times do you want to type AVADA_KEDAVRA TABLE?

  9. A great open source implementation by thammoud · · Score: 5, Funny

    Map<String, String> db = new HashMap<>();

    beginTransaction(); // Synchronize on the map
    db.put("key", "value"); // Map has put(), not add()
    commitTransaction(); // Just serialize the fucker to a file. The idiots using this won't know the difference.

  10. Re:Voldemort! by the_B0fh · · Score: 4, Funny

    **SPOILER ALERT**

    In book 8, it turns out that good ol' Voldy is actually Harry's older brother. They had a tearful reunion, and Voldy now works for Harry.

  11. In relation to what? by Penguinshit · · Score: 5, Funny

    I won't believe it until Netcraft confirms it.

  12. This is an old argument which will not fly by bogaboga · · Score: 5, Informative

    It has been suggested before that the life of the relational DB is coming to an end. I must say that while I agree with this statement: -

    Relational databases scale well, but usually only when that scaling happens on a single server node. When the capacity of that single node is reached, you need to scale out and distribute that load across multiple server nodes. This is when the complexity of relational databases starts to rub against their potential to scale.

    I disagree with the following statement: -

    Try scaling to hundreds or thousands of nodes, rather than a few, and the complexities become overwhelming, and the characteristics that make RDBMS so appealing drastically reduce their viability as platforms for large distributed systems.

    I submit that the complexity can be managed and that's why we have jobs.

    I am an IT consultant at a major bank and we keep all kinds of data, including data that many would find useless, spread across 27 [major] nodes. Our biggest table has about 57 million rows and 49 columns. I can tell you that querying data and maintaining integrity are a breeze if the schema design is correct in the first place.

    We are always designing and testing different scenarios. In cases where we have had to change the schema, it has been simple if one knows what to do.

    I must say that Open Source DBs have worked for us though we rely on products from IBM and Oracle.

    Our philosophy is: If it works in PostgreSQL, it will even do wonders on DB2 or Oracle. I do not see how we can do away with the relational DB. Whoever designed it in the beginning did a marvelous job.

  13. ?'s meaning - literal and implied by qbzzt · · Score: 5, Insightful

    In headlines, "?" implies that something is a serious question, whose answer is likely to be yes. One that makes it worth spending the time to read the article.

    Imagine the headline said "Does Obama Smoke Crack?" and the article had a bunch of stuff about the president, with a last paragraph saying: "There is absolutely no reason to think that President Obama has ever smoked crack."

    --
    -- Support a free market in the field of government
    1. Re:?'s meaning - literal and implied by 117 · · Score: 5, Funny

      President Obama smokes crack?!!?!??!!?!

    2. Re:?'s meaning - literal and implied by Cajun Hell · · Score: 4, Interesting
      --
      "Believe me!" -- Donald Trump
    3. Re:?'s meaning - literal and implied by digitig · · Score: 4, Insightful

      In headlines, "?" implies that something is a sensationalized question, whose answer is "almost certainly, no".

      Fixed that for ya.

      --
      Quidnam Latine loqui modo coepi?
    4. Re:?'s meaning - literal and implied by value_added · · Score: 4, Funny

      President Obama smokes crack?!!?!??!!?!

      Dunno. Has he stopped beating his wife?

  14. Re:Voldemort! by jollyreaper · · Score: 5, Funny

    The name might be cool; but the length of some of the commands will really get to you. How many times do you want to type AVADA_KEDAVRA TABLE?

    Better than PokemonDB. Then you have to jump on top of your desk and shout "Customer Table, I select you!" every time you run a damn query.

    --
    Kwisatz Haderach
    Sell the spice to CHOAM
    This Mahdi took Shaddam's Throne
  15. Re:Voldemort! by GreatRedShark · · Score: 5, Funny

    You're right, that is a bit cumbersome. Hopefully, they'll release a friendly GUI wizard to make working with it more efficient.

  16. Stupid people who don't understand data by mlwmohawk · · Score: 4, Informative

    The relational database is not going anywhere and nothing in that article is based on any firm understanding of managing data.

    Is the notion of a "join" obsolete? No, but it is typically impractical in a high volume system. You would probably use denormalization as a strategy.

    Scaling many nodes? OK, you still gotta put your data "in" something.

    key/value indexing? yawn. select val from keyvalue_tab where key = foo;

    The value can be basically anything, and most "relational" databases have good object support as well as XML, JSON, etc.

    So we can establish that a SQL relational database can do *everything* a simpler system can do. Now, think about ALL the things you can do with your data in a real database.

    What is the point of using a limited and less functional system? A good system, like Oracle, DB2, PostgreSQL, etc (!mysql of course) will do what you need AND allow you to do more should you be successful.

    The problem with data is twofold: managing reads/writes/deletes and finding what you are looking for. These problems have been solved. A good database will do this for you. Want to store objects? XML, JSON, binary objects, or a specialized database extension works perfectly.
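
    For illustration, a minimal sketch of the key/value lookup above running on a relational engine, assuming Python's built-in sqlite3 and json modules. The table name keyvalue_tab comes from the query above; the put/get helpers and the sample profile data are hypothetical:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# KEY is an SQL keyword, so the column name is quoted to be safe.
conn.execute('CREATE TABLE keyvalue_tab ("key" TEXT PRIMARY KEY, val TEXT)')

def put(key, value):
    """Store any JSON-serializable value under a string key (hypothetical helper)."""
    conn.execute(
        'INSERT OR REPLACE INTO keyvalue_tab ("key", val) VALUES (?, ?)',
        (key, json.dumps(value)),
    )

def get(key):
    """Return the stored value, or None if the key is absent (hypothetical helper)."""
    row = conn.execute(
        'SELECT val FROM keyvalue_tab WHERE "key" = ?', (key,)
    ).fetchone()
    return json.loads(row[0]) if row else None

put("foo", {"name": "Alice", "tags": ["a", "b"]})
put("foo", {"name": "Alice", "tags": ["a", "b", "c"]})  # overwrite, as a k/v store would
print(get("foo"))
print(get("missing"))
```

    The primary key gives the same single-index lookup a key/value store offers; this sketch says nothing about how either approach behaves once the data is spread over many nodes.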

    1. Re:Stupid people who don't understand data by sl0ppy · · Score: 5, Insightful

      The relational database is not going anywhere and nothing in that article is based on any firm understanding of managing data.

      no, the relational database is not going anywhere, you are correct. but that does not mean that there aren't instances where a non-relational database, with the addition of map/reduce, is extremely useful.

      non-relational databases have been around for decades, and are in use for quite a number of applications involving rapid development and storage of very large records. couple this with map/reduce, and you have the ability to scale quickly with very large datasets.

      scaling quickly is a very difficult problem to solve with an RDBMS - you either need to continue to throw more hardware at the problem, to the point of diminishing returns, or re-architect your data at the cost of possible significant downtime, while still attempting to serve up the data in a timely manner. i've been deep in the bowels of oracle RAC, fighting to get just 5% more speed out of a query over a billion rows and realizing that i have to start over with a new schema, just to squeeze more data out. compare that to simply adding another machine and letting the map functionality run across one more cpu before returning it for the reduce.

      Is the notion of a "join" obsolete? No, but it is typically impractical in a high volume system. You would probably use denormalization as a strategy.

      once again, correct, but having to denormalize to a snowflake or a star isn't always the best solution. you're taking the best parts of the relational database model, and throwing them out - normalization, referential integrity, just to squeeze more out of something that may not be the best tool for the job.

      do you hammer with a wrench? i have before, and i managed to hurt my thumb.

    2. Re:Stupid people who don't understand data by DougWebb · · Score: 4, Informative

      Map/Reduce was developed at Google. It's a bit tough to wrap your head around at first, and once you get it you wonder what the big deal is, until you realize how suitable it is for Google's datacenters.

      Basically, you take a dataset (a bunch of key/value pairs) and a mapping function, and you run the mapping function over every item in the dataset. This gives you an intermediate dataset with different keys and values. You then run that through a reducing function, which produces your final dataset. This can be a single result, or a dataset that can then be processed with a different map/reduce pair of functions.

      The big deal for Google is that many of their problems can be expressed in terms of map and reduce functions that can operate in parallel over their datasets, and that their datacenters can handle absolutely enormous quantities of parallel operations. So, for the mapping operation, they take the original dataset and mapping function, subdivide the dataset over thousands of servers, and let them run the mapping function in parallel. When these servers return their results, it's common for many different servers to return the same or related keys in the intermediate set. These are collated, so that when the intermediate dataset is distributed with the reduce function, all of the values with the same keys go to the same servers. This helps the reduce function to be run in parallel; it's often counting the number of original items that were assigned to the same key in the intermediate set.
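
      For illustration, a single-process sketch of the map, collate and reduce steps described above, in plain Python. It is not Google's implementation: the function names and the word-count task are assumptions chosen for the example, and a real system would run the map and reduce calls on many servers rather than in one loop:

```python
from collections import defaultdict

def map_fn(doc_id, text):
    # Turn one (key, value) input pair into intermediate (word, 1) pairs.
    return [(word, 1) for word in text.split()]

def reduce_fn(word, counts):
    # Collapse all values collected for one intermediate key.
    return word, sum(counts)

def map_reduce(dataset, map_fn, reduce_fn):
    # Map phase: run map_fn over every item in the dataset.
    # Collate phase: group intermediate values by their key.
    grouped = defaultdict(list)
    for key, value in dataset.items():
        for out_key, out_value in map_fn(key, value):
            grouped[out_key].append(out_value)
    # Reduce phase: one reduce_fn call per intermediate key.
    return dict(reduce_fn(k, vs) for k, vs in grouped.items())

docs = {"d1": "to be or not to be", "d2": "to do"}
word_counts = map_reduce(docs, map_fn, reduce_fn)
print(word_counts)
```

      Because each map_fn call looks at one input item and each reduce_fn call looks at one key, both loops could be split across machines without changing the result.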

  17. Not buying it. by reginaldo · · Score: 5, Interesting

    In theory, I agree the most costly actions in a database are joins. It seems like the key/value model is a great solution to this, on the surface. However, what the key/value model does is push the cost to the application layer. Instead of ensuring relational integrity and conformity in the database, suddenly all app code has to do this on the frontend. Also, instead of managing this process in a single place, suddenly this process is distributed among multiple methods. Sure, the DB is more scalable, but suddenly the app is a mess.
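
    For illustration, a minimal sketch of that trade-off, assuming Python's built-in sqlite3 module for the relational side and a plain dict standing in for a key/value store. The customer/invoice tables and the key naming scheme are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign-key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY);
    CREATE TABLE invoice  (id INTEGER PRIMARY KEY,
                           customer_id INTEGER NOT NULL REFERENCES customer(id));
""")

def orphan_rejected_by_db():
    # Customer 999 does not exist, so the database itself refuses the row.
    try:
        conn.execute("INSERT INTO invoice (id, customer_id) VALUES (1, 999)")
    except sqlite3.IntegrityError:
        return True
    return False

db_rejected = orphan_rejected_by_db()

# A dict standing in for a key/value store: it accepts the same orphan silently.
kv_store = {}
kv_store["invoice:1"] = {"customer_id": 999}
kv_accepted_orphan = "invoice:1" in kv_store and "customer:999" not in kv_store

print(db_rejected, kv_accepted_orphan)
```

    With the dict, every code path that writes an invoice has to remember to check for the customer first, which is the duplication described above.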

  18. Re:Yes, but not soon. by Eravnrekaree · · Score: 5, Informative

    Actually I read TFA, and I just couldn't make sense of the benefits offered by the key value thing. You basically should be able to get the same benefits with a relational database system with a query that does a lookup on a single column index. This would involve searching the b-tree for that column, which would yield a row data address of some sort, to either a linked list of cells or a list of addresses of those cells. Once the single b-tree lookup is done it is then very fast to find the other column values in that row. The b-tree or other index lookup also has to be done with the key value pair; the relational is just a collection of multiple key value indexes.

    There is the issue of having a variable number of pieces of data linked to a certain key. But you can do this in relational too. Just create a table with an id column, value type column and value column. In a well-designed relational database, if you do a query on the id column, the b-tree will lead to all of the row data addresses in the database that match the id. Each of those rows will contain a different data type/data payload for the id. This is again pretty much as fast as a simple single index database.
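
    For illustration, a minimal sketch of the id / value type / value table described above, assuming Python's built-in sqlite3 module (SQLite stores its indexes as B-trees). The table name attr, the index name and the sample rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attr (id INTEGER NOT NULL,
                       value_type TEXT NOT NULL,
                       value TEXT);
    -- Single-column index: one lookup finds every row for an id.
    CREATE INDEX attr_id_idx ON attr (id);
""")
conn.executemany("INSERT INTO attr VALUES (?, ?, ?)", [
    (1, "name", "Alice"),
    (1, "email", "alice@example.com"),
    (1, "phone", "555-0100"),
    (2, "name", "Bob"),  # a different id may carry fewer attributes
])

def attributes_for(entity_id):
    """Return every (value_type -> value) pair stored for one id."""
    rows = conn.execute(
        "SELECT value_type, value FROM attr WHERE id = ?", (entity_id,)
    ).fetchall()
    return dict(rows)

print(attributes_for(1))
print(attributes_for(2))
```

    Each id can carry as many or as few rows as it needs, so the variable-length case needs no schema change.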

  19. Re:Yes, but not soon. by photon317 · · Score: 5, Interesting

    Yes, these newer simple key/value databases like BigTable and CouchDB are effectively a subset of RDBMS functionality, so of course the same thing can be implemented relationally by just not using features.

    The reason these projects have taken off is that the relational features being skipped comprise most of the complexity of an RDBMS. Without them, it's relatively trivial to write new database engines from scratch instead of re-using MySQL, PostgreSQL, and so on. These new feature-poor rewrites can take on many challenges that are harder for the big relational guys, like stellar performance on huge datasets, and being truly distributed in nature.

    --
    11*43+456^2
  20. Re:Here's a match.. by DarkOx · · Score: 4, Informative

    Wow, um where to begin really....

    So you realize that the structure you are suggesting can be easily built in a traditional RDB, using a star-schema or cluster design right?

    Next you suggest doing the sorting on the client, and then say that if there is more data than a client can handle the server can be asked to send chunks according to the client's sort order. That means the server has to have all the sort logic the client has and, in all but the most trivial applications, probably do all the sorting anyway... Seems to me a star schema and indexing the fact table on the attributes that are most commonly going to be used for sorting makes much more sense, because as I said the server is going to be sorting anyway.

    Now there are data sets for which non-relational structures do make some more sense, but we have hierarchical and navigational designs for those; yours is not one of them.

    --
    Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
  21. A SQL query walks into a bar... by SystematicPsycho · · Score: 4, Funny

    A SQL query walks into a bar and sees two tables. He walks up to them and says 'Can I join you?'

    From Tom Kyte's blog sql joke

    --
    Analytic & algebraic topology of locally Euclidean meterization of infinitely differentiable Riemmanian manifold
  22. Re:Voldemort! by WuphonsReach · · Score: 4, Funny

    Better than PokemonDB. Then you have to jump on top of your desk and shout "Customer Table, I select you!" every time you run a damn query.

    *polite golf clap*

    --
    Wolde you bothe eate your cake, and have your cake?
  23. Re:SQL is the problem, not RDBMSs by WuphonsReach · · Score: 4, Informative

    Over 15 years ago Paradox's query-by-example was light-years ahead of today's soul-killing SQL crap.

    QBE grids are nothing more than a UI abstraction of the underlying SQL SELECT statement. In fact, in MS-Access (which has a QBE grid), you can flip between looking at the QBE and looking at the raw SQL SELECT statement.

    Sometimes it's faster to do it in raw SQL, sometimes it's faster to set up the query in a QBE grid.

    --
    Wolde you bothe eate your cake, and have your cake?
  24. Re:SQL is the problem, not RDBMSs by Just Some Guy · · Score: 4, Insightful

    Database operations do not need to look like code or algorithms, the only reason they do is to provide jobs for database programmers.

    From Wikipedia:

    Relational database theory uses a different set of mathematical-based terms, which are equivalent, or roughly equivalent, to SQL database terminology.

    SQL looks like SQL because it's based on set theory. As an exercise, invent your own language that's as powerful (read: also based on a strong theoretical basis) but simpler. See you in a couple of decades!

    --
    Dewey, what part of this looks like authorities should be involved?
  25. MapReduce is a bunch of hype by Estanislao Martínez · · Score: 4, Interesting

    The name of the MapReduce framework comes from the functional programming operations "map" and "reduce." Map takes as its input a collection of data, and a function that transforms data elements into other elements; it outputs a collection where each element of the input collection has been replaced by the result of applying that function to it. Reduce takes a collection of elements, an initial value of the same type as the elements, and a two-place, commutative, associative and symmetric operation; it produces as its output the value that results from applying the operation to the initial value and each element of the collection in turn, accumulating the partial results.

    Map and reduce are operations that can be trivially parallelized. To parallelize map, you divide the collection into subcollections (in any arbitrary manner), and map over each of them in parallel. To parallelize reduce, you divide the collection into subcollections, also arbitrarily, reduce each subcollection independently, then apply the reduction operation to the partial results. (That works because the reduction operation is commutative, associative and symmetric.)
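
    For illustration, a minimal sketch of that divide-and-combine scheme, assuming Python's standard concurrent.futures and functools modules. The helper names (chunks, parallel_map, parallel_reduce) are hypothetical, and the initial value is assumed to be an identity for the operation (0 for addition), since it is applied once per subcollection:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator

def chunks(items, n):
    # Split items into at most n contiguous subcollections.
    size = max(1, -(-len(items) // n))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

def parallel_map(fn, items, workers=4):
    # Map each subcollection independently, then concatenate in order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda chunk: [fn(x) for x in chunk],
                         chunks(items, workers))
        return [y for part in parts for y in part]

def parallel_reduce(op, items, initial, workers=4):
    # Reduce each subcollection independently, then reduce the partial results.
    # Assumes op is associative and initial is its identity element.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda chunk: reduce(op, chunk, initial),
                                 chunks(items, workers)))
    return reduce(op, partials, initial)

data = list(range(1, 101))
squares = parallel_map(lambda x: x * x, data)
total = parallel_reduce(operator.add, squares, 0)
print(total)
```

    Threads are used only to show the structure; in CPython they will not speed up this arithmetic, and a real deployment would hand the subcollections to separate processes or machines.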

    Well, guess what: this sort of technique is trivially applicable to relational database queries. A SQL query translates down to a combination of joins (the FROM clause), filters (the WHERE clause) and maps (the SELECT clause). Joins are trivially parallelizable; you give each execution unit a subset of the tuples of the driving relation. Filtering (the WHERE clause) is a kind of reduce operation. SELECT is a kind of map operation. This means that relational queries are not any less amenable to parallel execution than the stuff Google does.

    But the killer thing here is that MapReduce says absolutely nothing about the updates problem. This is one of the big features of RDBMSs: the ability to handle concurrent query and modification. It also says nothing about the data integrity problem, which is also one of the big RDBMS features.

    So, when you get down to it, there is a good argument to be made that many applications could make use of database technologies that support much faster querying, at the expense of very little updating. But there's no convincing argument that that technology isn't best implemented in the context of an RDBMS.