Slashdot Mirror


Is the Relational Database Doomed?

DB Guy writes "There's an article over on Read Write Web about what the future of relational databases looks like when faced with new challenges to its dominance from key/value stores, such as SimpleDB, CouchDB, Project Voldemort and BigTable. The conclusion suggests that relational databases and key value stores aren't really mutually exclusive and instead are different tools for different requirements."

344 comments

  1. new record by hguorbray · · Score: 5, Interesting

    that's efficient -a summary that refutes the inflammatory headline

    I'm just sayin'

    1. Re:new record by Jah-Wren+Ryel · · Score: 4, Funny

      Yeaah. Only if you did not know the meaning of the '?' symbol.

      --
      When information is power, privacy is freedom.
    2. Re:new record by bFusion · · Score: 4, Insightful

      Well the '?' means that there's a question. The summary gave the conclusion to that question.

    3. Re:new record by Anonymous Coward · · Score: 0, Troll

      Next Slashdot article: Is Jah-Wren Ryel a child molester?

    4. Re:new record by julesh · · Score: 5, Funny

      that's efficient -a summary that refutes the inflammatory headline

      I'm just sayin'

      Nah. Efficient would be if the summary were "No."

    5. Re:new record by eln · · Score: 4, Funny

      Next Slashdot article: Is Jah-Wren Ryel a child molester?

      There's no evidence Jah-Wren Ryel has ever molested children, and no reason to suspect he would ever do so. Bandying about accusations like that would likely ruin his life forever.

      However, since child molestation is such a big political issue these days, as a responsible news site I believe we need to have equal representation from both sides of the argument and let our viewers decide.

    6. Re:new record by Anonymous Coward · · Score: 0

      It's a story posted by ScuttleMonkey. Did you actually expect him to do a good job? He's probably one of the worst editors on here...

    7. Re:new record by Anonymous Coward · · Score: 1, Informative

      And now the first search result on Google for Jah-Wren Ryel returns a conversation discussing whether or not he's a child molester.

    8. Re:new record by Anonymous Coward · · Score: 0

      That's fucking hilarious. My anon post has succeeded beyond my wildest dreams.

    9. Re:new record by Pseudonym · · Score: 2, Informative

      This is what linguists refer to as the "tabloid headline question mark". Its use is to say something inflammatory and only tangentially related to the story in order to get readers.

      Examples:

      "Is Jennifer pregnant?"
      "Steve Ballmer: Love child of Satan?"

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    10. Re:new record by Ironica · · Score: 1

      I've never even heard of Jah-Wren Ryel before this conversation, and now I won't ever let him (or her) near my kids!

      --
      Don't you wish your girlfriend was a geek like me?
    11. Re:new record by Ed+Avis · · Score: 1

      The experienced reader, according to Brian Kernighan, will usually know what's wrong.

      --
      -- Ed Avis ed@membled.com
    12. Re:new record by Anonymous Coward · · Score: 0

      Except, as has already been pointed out, it was KEY to the story, it was the entire premise of the story, not "only tangentially related."

    13. Re:new record by horza · · Score: 1

      Whoever he/she is sounds like a complete monster. There's obviously no smoke without fire. I reckon hanging/electric chair/castration is too good for him/her.

      Or maybe this is all just a smear campaign against him/her by the extreme Left/Right? Wonder if freejahwrenryel.com is available? This country needs more people like him/her!

      Phillip.

    14. Re:new record by Anonymous Coward · · Score: 0

      Next Slashdot article: Is Jah-Wren Ryel a child molester?

      Has he stopped molesting children?

    15. Re:new record by Pollardito · · Score: 1

      It's also a way to imply something completely untrue and purposefully inflammatory while protecting yourself from a lawsuit, because you didn't make a statement, you "only asked and answered a question on everyone's minds," e.g. "did Barack just give Michelle a terrorist fist jab?"

  2. Cloud by Anonymous Coward · · Score: 0

    Bla bla bla

  3. WTF? by E+IS+mC(Square) · · Score: 1

    Headline: Is the Relational Database Doomed?

    Summary: "The conclusion being that relational databases and key value stores aren't really mutually exclusive and instead are different tools for different requirements."

    WTF?

    1. Re:WTF? by Lord+Ender · · Score: 2, Insightful

      If "key/value" databases do become more popular, they certainly might eat into relational database mindshare. 90% of web applications use RDBMSs merely as persistent data storage--the fact that they are "relational" doesn't matter at all; the fact that a separate SQL language is needed to get the data (rather than using language-native data structures as an interface) is even a negative for RDBMSs.

      As a web app developer, I'm excited that something other than SQL is getting attention. RDBMSs won't go away because they have properties data miners, for example, need. But they aren't ideal for the simple persistent data stores most apps call for.

      --
      A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
    2. Re:WTF? by kbrasee · · Score: 1

      You must be working on some really simple apps then, because for just about everything I've worked on, a key/value data store would be woefully inadequate.

    3. Re:WTF? by ultranova · · Score: 2, Insightful

      If "key/value" databases do become more popular, they certainly might eat into relational database mindshare.

      A "key/value" database is simply a relational database with a single table and two columns. It doesn't make any sense to build a separate server program for what current database servers can already easily do.

      90% of web applications use RDBMSs merely as persistent data storage--the fact that they are "relational" doesn't matter at all; the fact that a separate SQL language is needed to get the data (rather than using language-native data structures as an interface) is even a negative for RDBMSs.

      I'm a bit uncertain what you're saying here. Surely the fact that the server can do more than what you need doesn't hinder your program? The same goes for SQL language; surely the fact that commands sent to the database are text strings isn't a negative? In any case, you can (and probably should) separate database access into a module of its own, offering whatever API you desire for the rest of the program.

      As a web app developer, I'm excited that something other than SQL is getting attention. RDBMSs won't go away because they have properties data miners, for example, need. But they aren't ideal for the simple persistent data stores most apps call for.

      However, they can handle such data stores in a very simple fashion. A pair of "setvalue(key, value) / getvalue(key)" is trivially easy to implement on top of SQL language. It just doesn't make sense to pour resources into developing a less capable database server.

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
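
A minimal sketch of the "setvalue(key, value) / getvalue(key)" pair described above, built on top of SQL. It assumes SQLite through Python's standard sqlite3 module; the table name `kv`, the `default` parameter, and the sample keys are illustrative assumptions, not anything named in the thread.

```python
import sqlite3

# Hypothetical table "kv"; an in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")


def setvalue(key, value):
    """Insert the pair, or overwrite the value if the key already exists."""
    with conn:  # commits on success, rolls back if the statement raises
        conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
        )


def getvalue(key, default=None):
    """Return the stored value, or `default` when the key is absent."""
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row else default


setvalue("user:42:name", "Ada")
setvalue("user:42:name", "Ada Lovelace")  # the second call overwrites the first
print(getvalue("user:42:name"))
print(getvalue("missing", "n/a"))
```

The rest of a program would only see the two functions, which is the "module of its own" separation suggested in the comment. Whether this performs as well as a dedicated key/value server is a separate question that the sketch does not address.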

    4. Re:WTF? by jadavis · · Score: 1

      they aren't ideal for the simple persistent data stores most apps call for

      Most applications collect some kind of data that is valuable to the decision makers.

      You are thinking about what it takes to get an application "done" without considering what the business really needs. The application might find it most convenient to collect information per-customer (for obvious reasons), but the decision makers might need a more global sense about what's going on.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    5. Re:WTF? by jadavis · · Score: 1

      Well, it would work for the application itself, if you totally ignore any business need to access the collected data in a useful way.

      Key/Value stores are usually a poor attempt to reinvent a persistent virtual memory system. That's all a VM system is: give it a pointer and you get a word of data back.

      So, if you really like hopping from pointer to pointer (or "reference" as it's called in an OOP language), which some people obviously do, a key/value store will suffice to finish the application in the most useless way possible (i.e. all of the data is trapped in a web of pointers).

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    6. Re:WTF? by EastCoastSurfer · · Score: 1

      You sound like the person who doesn't know enough about relational databases and just wants to put everything in one big table.

    7. Re:WTF? by anomalous+cohort · · Score: 1

      I, too, believe that key/value is a sign of sickness. Obviously, systems where key/value stores appear to be a good fit don't have accounting modules to them. In addition to that, whenever I see a system where a key/value store appears to be a good fit, I look deeper and realize that what we have is a system where the designers were too weak to make a stand on how to capture data so they just delegate all decisions to the user.

      That may be a great approach for photo sharing but with almost anything else, the user wants the program to serve as an authoritative source. The choice of fields on a screen should focus the user's attention, like an interview or a debriefing, on making relevant decisions. It shouldn't just be this waxy flexible "anything you want in the moment you can have" style of programming.

    8. Re:WTF? by Anonymous Coward · · Score: 0

      You clearly have not worked with a functional programming language. Trees are your friend, especially when you can iterate over them in lots of neat ways.

      I have found myself in situations where I needed specific data for a program to work as intended. (For example, programmatically computing the structure of a form) In these cases, I have two realistic choices. I can either hard code the data, typically in a set of arrays of trees I iterate over -- as it essentially defines how a form will work -- or I can store it in a database. Somehow, people think it is a "best practice" to do the latter. It really isn't, and needs to die. At best, you're going to be using the same interface to iterate over the result set. At worst, you separate similar concerns in a really obnoxious way.

  4. Uh-oh by benjymouse · · Score: 5, Funny

    Someone forgot to put a where clause on that delete.

    --
    Reading slashdot one-liner: (irm http://rss.slashdot.org/Slashdot/slashdot).rdf.item | fl title,desc*
    1. Re:Uh-oh by frodo+from+middle+ea · · Score: 1

      and were operating in auto-commit mode. gasp..

      --
      for the last time people, I am "frodo from middle eaRTH", not "middle eaST".
    2. Re:Uh-oh by MarkRose · · Score: 2, Funny

      That's okay! I'll just rollback the transaction.... oh shit, that was a MyISAM table...

      --
      Be relentless!
    3. Re:Uh-oh by SpaghettiPattern · · Score: 1

      Someone forgot to put a where clause on that delete.

      If only. Any clause narrows down a result which in this deletion case is undesired.

      --

      I hadn't the slightest objection to his spending his time planning massacres for the bourgeoisie... (P.G. Wodehouse)
    4. Re:Uh-oh by Tablizer · · Score: 1

      Someone forgot to put a where clause on that delete.

      After making that mistake once (hitting the Enter key too soon), I from then on typed it in like this:

          DELETE FROM WHERE {condition}

      I then back-arrow to fill in the table name once I inspect the condition. (Just don't name a table "where" :-)
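
A hedged sketch of a programmatic version of the same habit: run the DELETE inside a transaction, check how many rows it hit, and roll back when the count is larger than expected. It assumes a transactional engine (SQLite via Python's sqlite3 here), so it would not help with the MyISAM case joked about earlier in this subthread. The `guarded_delete` helper, the `orders` table, and the `expected_max` parameter are hypothetical names for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)",
    [("open",), ("cancelled",), ("open",)],
)
conn.commit()


def guarded_delete(conn, table, condition, params, expected_max):
    """Delete rows matching `condition`; undo the delete if too many rows matched.

    `table` and `condition` are interpolated into the SQL text, so they must
    come from trusted code, never from user input.
    """
    cur = conn.execute(f"DELETE FROM {table} WHERE {condition}", params)
    if cur.rowcount > expected_max:
        conn.rollback()  # more rows than expected: put everything back
        return 0
    conn.commit()
    return cur.rowcount


deleted = guarded_delete(conn, "orders", "status = ?", ("cancelled",), expected_max=1)
blocked = guarded_delete(conn, "orders", "1 = 1", (), expected_max=1)
remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(deleted, blocked, remaining)
```

The second call uses a condition that matches every row, which is the effect of forgetting the WHERE clause; the helper rolls it back and the two remaining rows survive.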
                 

  5. Yes, but not soon. by pwnies · · Score: 3, Interesting

    The flexibility offered in key/value databases is simply too good of a feature to pass up. However, do you really think you can get people to give up MSSQL? It'll be nice for smaller projects, but corporations won't even consider it for a number of years.

    1. Re:Yes, but not soon. by SanityInAnarchy · · Score: 3, Interesting

      do you really think you can get people to give up MSSQL?

      In favor of MySQL, PostgreSQL, SQLite, even Oracle, yes, I do.

      corporations won't even consider it for a number of years.

      You must have some specific corporations in mind, because I've known many corporations to use each of the above technologies. In fact, SQLite is one of the most popular databases ever.

      No, the reason it's not soon is because these other ones (CouchDB) aren't mature, and the ones that are (BigTable) aren't available at any price.

      --
      Don't thank God, thank a doctor!
    2. Re:Yes, but not soon. by Anonymous Coward · · Score: 1, Funny

      Sleepycat made a good key/value pair database but that isn't what we're talking about.

    3. Re:Yes, but not soon. by Eravnrekaree · · Score: 5, Informative

      Actually I read TFA, and I just couldn't make sense of the benefits offered by the key/value approach. You should be able to get the same benefits from a relational database system with a query that does a lookup on a single-column index. This would involve searching the b-tree for that column, which would yield a row data address of some sort, pointing to either a linked list of cells or a list of addresses of those cells. Once the single b-tree search is done, it is then very fast to find the other column values in that row. A key/value store also has to do a b-tree or other index lookup; a relational database is just a collection of multiple key/value indexes.

      There is the issue of having a variable number of pieces of data linked to a certain key. But you can do this in a relational database too. Just create a table with an id column, a value type column and a value column. In a well-designed relational database, if you query the id column, the b-tree will lead to all of the row data addresses in the database that match the id. Each of those rows will contain a different data type/data payload for the id. This is again pretty much as fast as a simple single-index database.
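
The id / value type / value table described above can be sketched as follows. This assumes SQLite via Python's sqlite3; the table name `attrs`, the index name `attrs_by_id`, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per (id, value type, value), so each id can carry any number of values.
conn.execute("CREATE TABLE attrs (id INTEGER, value_type TEXT, value TEXT)")
conn.execute("CREATE INDEX attrs_by_id ON attrs (id)")  # b-tree index on the id column
conn.executemany(
    "INSERT INTO attrs VALUES (?, ?, ?)",
    [
        (1, "name", "Alice"),
        (1, "email", "alice@example.com"),
        (2, "name", "Bob"),
    ],
)


def lookup(entity_id):
    """Fetch every value stored under one id as a {value_type: value} dict."""
    rows = conn.execute(
        "SELECT value_type, value FROM attrs WHERE id = ?", (entity_id,)
    )
    return dict(rows.fetchall())


# Ask SQLite how it would run the lookup; the last column describes the access path.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT value_type, value FROM attrs WHERE id = ?", (1,)
).fetchall()

print(lookup(1))
print(plan[0][-1])
```

The query plan line should mention `attrs_by_id`, meaning the lookup goes through the index rather than scanning the table. The exact wording of that line varies between SQLite versions.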

    4. Re:Yes, but not soon. by photon317 · · Score: 5, Interesting

      Yes, these newer simple key/value databases like BigTable and CouchDB are effectively a subset of RDBMS functionality, so of course the same thing can be implemented relationally by just not using features.

      The reason these projects have taken off is that the relational features being skipped comprise most of the complexity of an RDBMS. Without them, it's relatively trivial to write new database engines from scratch instead of re-using MySQL, PostgreSQL, and so-on. These new feature-poor rewrites can take on many challenges that are harder for the big relational guys, like stellar performance on huge datasets, and being truly distributed in nature.

      --
      11*43+456^2
    5. Re:Yes, but not soon. by timmarhy · · Score: 0, Offtopic

      let me guess, you don't like mssql because it's microsoft? what a fucking sheep, mssql is a great database.

      oh and i've used all the others and for you to suggest mysql over mssql tells a lot...

      --
      If you mod me down, I will become more powerful than you can imagine....
    6. Re:Yes, but not soon. by Anonymous Coward · · Score: 0

      Especially to replace Berkeley DB, that amazingly unstable piece of trailer trash, going out, getting drunk, and bringing bad data home to sleep on the couch that would refuse to leave the next morning.

    7. Re:Yes, but not soon. by Anonymous Coward · · Score: 0

      You are going to recommend mysql over mssql? Seriously? Seriously? mysql can't even do a fucking full join. The idiots left Feb 31st as a valid date for YEARS. It loves to not throw errors when it should, silently changing your data from what it should be. It's not on the same level as any of the other databases you mentioned. It's down with ms access. Crap.

    8. Re:Yes, but not soon. by horza · · Score: 3, Insightful

      let me guess, you don't like mssql because it's microsoft? what a fucking sheep, mssql is a great database.
      oh and i've used all the others and for you to suggest mysql over mssql tells a lot...

      MSSQL? Isn't that the only database that isn't cross platform these days? Why would anybody want to use MSSQL outside of .Net developers? On a side note, why is it that only MSSQL appears to get crippled by worms and none of the others?

      Phillip.

    9. Re:Yes, but not soon. by Anonymous Coward · · Score: 0

      Indeed - they offer the same basic functionality as BerkeleyDB which has been around a long time.

      (Online backup. Full transaction support. etc. All this from something that's been around LONGER than MySQL!)

    10. Re:Yes, but not soon. by Estanislao+Martínez · · Score: 3, Insightful

      Yes, these newer simple key/value databases like BigTable and CouchDB are effectively a subset of RDBMS functionality, so of course the same thing can be implemented relationally by just not using features.

      What worries me about these arguments, however, is that they're missing a point that's very similar to yours here: these high-performance key-value databases can be implemented as features in an RDBMS. Basically, if you have a technology that allows some limited type of database to be distributed across tons of nodes and to be queried really fast, well, that's a kind of limited-functionality materialized view with a special engine to access it. So put it in as a subsystem to the full RDBMS, and use your plain old full-featured relational engine as the system of record that solves the concurrent transactional update and data integrity problems, and have it also push out the deltas to the specialized store that supports the high-performance distributed querying.

      Nobody is denying that there are many applications where you don't need all that the relational model provides, and that those applications can be made to perform faster by not providing certain features. What people repeatedly fail to understand is that this is not a refutation of the relational data model, because it is a logical and general data model that's capable of modeling the data in such applications, and does not dictate the implementation.
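
A toy sketch of the arrangement proposed above: the relational engine is the system of record, and each committed change is pushed to a separate fast read store. Everything here is an assumption for illustration: SQLite stands in for the full RDBMS, a plain dict stands in for the distributed key/value store, and the `SystemOfRecord` class, `products` table, and method names are hypothetical.

```python
import sqlite3


class SystemOfRecord:
    """Relational store is authoritative; a dict stands in for the fast read store."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE products ("
            " sku TEXT PRIMARY KEY,"
            " price_cents INTEGER NOT NULL CHECK (price_cents >= 0))"
        )
        self.read_store = {}  # hypothetical stand-in for a distributed key/value store

    def upsert(self, sku, price_cents):
        # The transaction commits fully or raises; the delta is pushed only after commit.
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO products VALUES (?, ?)", (sku, price_cents)
            )
        self.read_store[sku] = price_cents

    def fast_read(self, sku):
        """Serve reads from the key/value side without touching the relational engine."""
        return self.read_store.get(sku)


store = SystemOfRecord()
store.upsert("A1", 500)
try:
    store.upsert("A1", -1)  # violates the CHECK constraint, so nothing is pushed
except sqlite3.IntegrityError:
    pass
print(store.fast_read("A1"))
```

The rejected write never reaches the read store, so the integrity rules live in one place. A real deployment would also need to handle a failure between the commit and the push, which this sketch ignores.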

    11. Re:Yes, but not soon. by encoderer · · Score: 3, Informative

      Suggesting that you could replace a MS-SQL server with SQLite basically forces anybody in the know to ignore every other point you make.

      MySQL is good, unless you need a highly performant query analyzer.

      Postgres is good, unless you need actual replication features.

      SQLite is good, if your datastore is less than 1GB.

      Oracle is no-doubt a valid replacement and improvement upon SQL Server. And I use MySQL more than any other DB. But you need to hire Percona to get the same performance out of MySQL that you get from SQL Server out of the box.

    12. Re:Yes, but not soon. by SanityInAnarchy · · Score: 3, Informative

      Suggesting that you could replace a MS-SQL server with SQLite basically forces anybody in the know to ignore every other point you make.

      You're assuming that the person using MS-SQL Server knows what they're doing. How do you know it's more than just a glorified Access database?

      MySQL is good, unless you need a highly performant query analyzer.

      In other words, the query analyzer is slow? Because the queries work well enough.

      Postgres is good, unless you need actual replication features.

      Like these?

      SQLite is good, if your datastore is less than 1GB.

      Another quick Google, and we find these limits -- by default, the maximum database size is just under 32 terabytes.

      Not that I'm suggesting it's a good choice at that point, especially with multiple processes. But it does make it kind of hard to take you seriously with that kind of imagined limit, unless you're suggesting there's a practical, performance wall after 1 gig.

      --
      Don't thank God, thank a doctor!
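
For anyone who wants to check the size ceiling on their own build rather than take either figure on faith, here is a small sketch. It assumes Python's bundled sqlite3 module; the values it prints depend on the SQLite version and compile-time options, so no particular output is claimed. The reference arithmetic at the end is one combination that lands just under 32 terabytes (2^30 - 1 pages of 32768 bytes); whether those were the defaults meant in the comment above is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Both pragmas report the settings of this connection's database.
page_size = conn.execute("PRAGMA page_size").fetchone()[0]
max_pages = conn.execute("PRAGMA max_page_count").fetchone()[0]

limit_bytes = page_size * max_pages
print(f"{page_size} bytes/page x {max_pages} pages = {limit_bytes / 2**40:.1f} TiB")

# Reference arithmetic (assumed values, see the note above):
# (2**30 - 1) pages of 32768 bytes each is just under 32 * 2**40 bytes.
reference_bytes = (2**30 - 1) * 32768
print(reference_bytes, reference_bytes < 32 * 2**40)
```

This only shows the theoretical file-size ceiling. It says nothing about whether performance holds up at that size, which is the practical question raised in the reply.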
    13. Re:Yes, but not soon. by SanityInAnarchy · · Score: 2, Interesting

      let me guess, you don't like mssql because it's microsoft?

      And because it's proprietary, single-platform, and expensive for what is, at the end of the day, just a database.

      And because I have seen new and interesting things built with MySQL, like NDB. What has MS SQL got on that?

      what a fucking sheep

      Look who's talking.

      More seriously, while I have pretty much no MS SQL experience, I don't particularly want to. The only good experience I've ever had from a Microsoft product was Halo. Bungie was acquired, and has now been sold, making me wonder if Microsoft had the chance to screw them up yet.

      --
      Don't thank God, thank a doctor!
    14. Re:Yes, but not soon. by anothy · · Score: 2, Interesting

      But you need to hire Percona to get the same performance out of MySQL that you get from SQL Server out of the box.

      this has not been my experience. at least with version 8 (two back from current), performance was miserable compared to either mysql or postgresql of comparable vintage. this was my first serious experience using mssql, but with no tuning on either side, both mysql and postgresql outperformed mssql by a factor of about 2.
      while we never got the database on the production system swapped out (development was underway to replace the application it was supporting anyway), and thus i can't speak to mysql or postgresql's reliability in the same use environment, mssql was very unstable. the database would hang indefinitely if either a query or the resulting data was too large, and, as near as we could tell, once every other month or so for no particular reason. the data set was tens of thousands of records a month going back a few years, which is not a trivial sum of data, but shouldn't be considered a lot for a modern database.
      while it's not a direct comparison, i've used mysql in several production projects and have seen less than a half dozen hangs in production total. i've only used postgresql in production on one project, but have seen no production hangs.

      --

      i speak for myself and those who like what i say.
    15. Re:Yes, but not soon. by joib · · Score: 1

      Suggesting that you could replace a MS-SQL server with SQLite basically forces anybody in the know to ignore every other point you make.

      Why? At work we have a MSSQL DB, and TBH, it could easily be replaced with SQLite with no noticeable loss in performance or functionality that we actually use.

      That does of course not mean that every MSSQL deployment could be replaced with SQLite, far from it. But I also believe we're far from the only case where a DB engine is vastly overqualified for the actual needs. But hey, our PHB loves it, so we'll keep using it.

    16. Re:Yes, but not soon. by segedunum · · Score: 1

      let me guess, you don't like mssql because it's microsoft? what a fucking sheep, mssql is a great database.

      I always chuckle when I see this because it's the only tack left - claim that everybody hates Microsoft. SQL Server is about the only database that isn't cross-platform these days, and while it is quite powerful, that power comes at the expense of consuming a terrible amount of system resources when compared with a database system such as Postgres.

    17. Re:Yes, but not soon. by ultranova · · Score: 1

      i've only used postgresql in production on one project, but have seen no production hangs.

      I use PostgreSQL on desktop machine to store two databases with the total size of about 60 gigabytes, and have never seen hangs or other problems. However, vacuuming (a PostgreSQL-specific database maintenance operation) causes some slowdown for the rest of the machine due to heavy IO combined with the fact that I'm typically using over a gigabyte of swap.

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    18. Re:Yes, but not soon. by EastCoastSurfer · · Score: 2, Insightful

      First, all applications have bugs that open them up to security flaws. Picking on MSSQL in that area is a non-starter.

      What you're missing are all of the tools that come with a MSSQL license. SSIS and MSAS are two big ones that are hard to replace with open source tools (Pentaho is interesting). If all you're looking to replace is a pure data store then yeah, Postgres is what I would move to. When you start replacing all functionality offered by MSSQL it gets a little more complicated.

    19. Re:Yes, but not soon. by EastCoastSurfer · · Score: 1

      More seriously, while I have pretty much no MS SQL experience, I don't particularly want to.

      It shows, because MSSQL is a lot more than 'just a database.' SSIS and MSAS come to mind as two very nice tools that come with an MSSQL license.

    20. Re:Yes, but not soon. by Jack9 · · Score: 1

      MySQL is good, unless you need a highly performant query analyzer.

      It's really sad, but true, that the best query analyzers...or any kind of tool...for MySQL are not made by MySQL/Sun.

      See:
      MONYog (versus the EXPENSIVE bundle that gives you the MySQL Enterprise Monitor)
      Maatkit

      --

      Often wrong but never in doubt.
      I am Jack9.
      Everyone knows me.
    21. Re:Yes, but not soon. by Anonymous Coward · · Score: 0

      I think the difference between you and the GP is that the GP has production experience running these things while you think that any feature listed on some open-source project's feature sheet means it will be ready for prime-time.

    22. Re:Yes, but not soon. by Anonymous Coward · · Score: 0

      "effectively a subset of RDBMS functionality"

      Add to that: "and these modern web programmers are just too thick to understand RDBMS theory, so we'll just dumb it down. Again."

      It always cracks me up how these kids get to re-invent the relational model. Once they have painted themselves into a corner with this simple, random-files-is-a-database approach, they always seem to come knocking on my door to help them out and build a REAL database. Sad thing is that they leave after a year or two, and then the next hotshot h@x0r does it all over again.....

      Get a proper education. Study Normalization. Get to grips with a real database engine (I don't care which one), and THEN look again at the "simple, new!!" approach. You'll see it's the same old, same old, with a funky name and a fresh coat of paint.

    23. Re:Yes, but not soon. by tepples · · Score: 1

      Why would anybody want to use MSSQL outside of .Net developers?

      Because it's cheaper for some small businesses to license Stone Edge Order Manager, which can use MS SQL Server Express, than to hire a Python guru to write a custom warehouse management stack.

    24. Re:Yes, but not soon. by tepples · · Score: 1

      vacuuming (a PostgreSQL-specific database maintenance operation)

      PostgreSQL isn't the only SQL DBMS that can defragment a database. SQLite has VACUUM, MySQL has OPTIMIZE TABLE that makes a dummy ALTERation, and Jet has "Compact and Repair".
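
A short sketch of the SQLite case mentioned above, assuming Python's sqlite3 module and a throwaway file in a temporary directory. The table name `blobs` and the row counts are arbitrary. With SQLite's usual default (auto-vacuum off), deleting rows leaves the file at its old size until VACUUM rebuilds it; builds configured differently may behave otherwise, so no exact sizes are claimed.

```python
import os
import sqlite3
import tempfile

# A file-backed database is needed here so the file size can be observed.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE blobs (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO blobs (payload) VALUES (?)", [("x" * 1000,) for _ in range(2000)]
)
conn.commit()

# Deleting the rows frees pages inside the file but normally does not shrink it.
conn.execute("DELETE FROM blobs")
conn.commit()
size_before = os.path.getsize(path)

# VACUUM rebuilds the database file; it must run outside a transaction,
# which is why the DELETE is committed first.
conn.execute("VACUUM")
size_after = os.path.getsize(path)
conn.close()

print(size_before, size_after)
```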

    25. Re:Yes, but not soon. by SanityInAnarchy · · Score: 1

      I cited my sources, he didn't cite his. So no, I won't take his "production expertise" on faith.

      --
      Don't thank God, thank a doctor!
    26. Re:Yes, but not soon. by Tablizer · · Score: 1

      Some related anecdotes here: I've used a fair number of different database products over the years, and some of the strangest bugs I've seen came from MS-SQL-Server. I've seen bugs in the other vendors' products, but MS's were just plain bizarre. I feel more comfortable with semi-predictable bugs over WTF bugs.

    27. Re:Yes, but not soon. by Bill,+Shooter+of+Bul · · Score: 1

      What the heck are they? The problem with most microsoft technologies is that most of the information about them is written by marketing rather than engineering. Are you really saying that Oracle doesn't have something comparable?

      In any case, as this story points out the economics of data storage are moving towards less features for greater scalability.

      --
      Well.. maybe. Or Maybe not. But Definitely not sort of.
  6. No by Azarael · · Score: 1

    It isn't up for debate that tuple stores are a very useful tool. That being said, they aren't a silver bullet for *ALL* data storage situations. For types of data that are inherently tabular, I really doubt that 40 years of RDBMS development will be trumped by a tuple store. When you move to hierarchical data though, things are reversed.

  7. Top 25 Reasons the Relational Database is Doomed by MillionthMonkey · · Score: 5, Funny

    Someone type this up and submit it to Digg.

  8. Hey! by MightyMartian · · Score: 4, Insightful

    Hey, read my article! Just to make sure you do, I'll pull a Dvorak and put in some incredibly sensational headline about how RDBMs are dewmed!!!!!! BWAHAHA, feed my advertisers!!!!

    (Tune in next week, when I write about how C programming is going to become extinct in the light of fantastic new development tools like C# and Ruby on Rails!!!)

    --
    The world's burning. Moped Jesus spotted on I50. Details at 11.
    1. Re:Hey! by dkleinsc · · Score: 5, Insightful

      Especially when the claim is as ridiculous as this one.

      There's a reason relational databases took over the world of databases: They provide a good combination of flexibility and structure to efficiently represent data. Which is what databases are supposed to do.

      --
      I am officially gone from /. Long live http://www.soylentnews.com/
    2. Re:Hey! by Just+Some+Guy · · Score: 5, Insightful

      There's a reason relational databases took over the world of databases: They provide a good combination of flexibility and structure to efficiently represent data.

      Especially since so many databases really are inherently relational. The textbook example of 1-customer:n-invoices, 1-invoice:n-items plays out quite a bit in the workplace.

      --
      Dewey, what part of this looks like authorities should be involved?
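
The textbook 1-customer:n-invoices, 1-invoice:n-items shape can be sketched in a few lines. This assumes SQLite via Python's sqlite3; the table and column names and the sample data are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE invoices  (id INTEGER PRIMARY KEY,
                            customer_id INTEGER NOT NULL REFERENCES customers(id));
    CREATE TABLE items     (id INTEGER PRIMARY KEY,
                            invoice_id INTEGER NOT NULL REFERENCES invoices(id),
                            description TEXT NOT NULL,
                            amount_cents INTEGER NOT NULL);

    INSERT INTO customers VALUES (1, 'Acme');
    INSERT INTO invoices  VALUES (10, 1), (11, 1);
    INSERT INTO items     VALUES (100, 10, 'widget', 500),
                                 (101, 10, 'gadget', 250),
                                 (102, 11, 'widget', 500);
    """
)

# One query walks both one-to-many links: customer -> invoices -> items.
totals = conn.execute(
    """
    SELECT c.name, COUNT(DISTINCT i.id), SUM(it.amount_cents)
    FROM customers c
    JOIN invoices i ON i.customer_id = c.id
    JOIN items it   ON it.invoice_id = i.id
    GROUP BY c.id
    """
).fetchall()
print(totals)  # [('Acme', 2, 1250)]
```

In a key/value store the same report would typically mean fetching each invoice and item by key and adding things up in application code.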
    3. Re:Hey! by iamhigh · · Score: 1

      Outrageous Claim + Car Analogy = Slashdot Front Page Story!

      --
      No comprende? Let me type that a little slower for you...
    4. Re:Hey! by whyloginwhysubscribe · · Score: 1

      When I saw the headline - I thought it could pose some interesting questions about de-normalisation/zation(!)

      I believe that in huge databases it is quite important to de-normalise for efficiency - so it could have made an interesting article...

    5. Re:Hey! by Anonymous Coward · · Score: 0

      Especially since so many databases really are inherently relational. The textbook example of 1-customer:n-invoices, 1-invoice:n-items plays out quite a bit in the workplace.

      That's not what "relational" means.

    6. Re:Hey! by Just+Some+Guy · · Score: 1

      That's not what "relational" means.

      "Customers", "invoices", and "items" are certainly relations.

      --
      Dewey, what part of this looks like authorities should be involved?
    7. Re:Hey! by Anonymous Coward · · Score: 0

      Yes it is.

    8. Re:Hey! by MichaelSmith · · Score: 1

      This whole debate seems to be "why should I have to use a supertanker when I only need a sea kayak?" Of course we still need both, just not in the same quantities.

      I am betting that most of the money in the world is stored in relational databases. I can't see that changing soon, but I don't use Oracle for my address book either.

    9. Re:Hey! by Anonymous Coward · · Score: 0

      > "Customers", "invoices", and "items" are certainly relations.

      What the heck did you mean by "inherently relational," then, if you're giving examples of "one-to-many" relationships? These relationships between tables have nothing to do with the name "relational."

      "Customers" is a relation even if you don't have any invoices or items in your DBMS.

    10. Re:Hey! by Estanislao+Martínez · · Score: 1

      There's a reason relational databases took over the world of databases: They provide a good combination of flexibility and structure to efficiently represent data. Which is what databases are supposed to do.

      And don't forget data integrity at the logical level, like protecting against the duplication of information (normalization), verifying that the data meets all kinds of semantic conditions (constraints), atomicity of complex reads and writes (transactions), etc.
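
      To make the constraint and transaction points concrete, here is a rough sketch with Python's built-in sqlite3 module. The accounts table, the CHECK rule, and the transfer helper are hypothetical examples, not taken from any real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Constraint: the database itself refuses a negative balance.
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically; return True on success."""
    try:
        # The connection context manager commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint fired; the credit to dst was rolled back too.
        return False


ok = transfer(conn, 1, 2, 60)       # leaves 40 and 60
failed = transfer(conn, 1, 2, 60)   # would overdraw account 1
balances = conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall()
```

      The second transfer credits the destination before the debit fails, and the rollback undoes that credit. That is the kind of guarantee the speed-only comparisons tend to leave out.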

      It is very, very important to know that these "death of the RDBMS" articles seldom take notice of all of the many problems that RDBMSs address. They normally just completely ignore at least the data integrity problems, and focus on the query speed problem. Well, guess what, of course you can do better than a plain RDBMS at the query speed issue if you ignore all the other stuff. Unless the RDBMS implements some kind of super-fast read-only materialized view, in which case, well, then it's not clear whether any technique you have can't be duplicated by the RDBMS.

    11. Re:Hey! by anothy · · Score: 1

      part of the issue with the current generation of relational databases is that in the significant majority of cases (i'd say 3/4 in my experience, across multiple industries and applications), the relations are very static and, with only a little bit of intelligence, trivially obvious. in these cases, all the JOIN and FROM mechanics end up being simply unneeded overhead. you end up with extra work for the application developer to understand and represent these and a bunch of extra work on the database server to put them together. you also end up with the structure of the database being embedded in the application, which is a maintenance nightmare.
      to a degree, the "view" feature common to most large RDBs addresses these, but in those cases you often end up with the same maintenance issues (just shifted from the app developer to the DBA) and the view ends up needing to know just as much about the internals of the DB as the application code otherwise would have (although that is still an improvement, since presumably whoever's putting new views on your DB server knows about the DB structure anyway). and unless you build a set of views designed to accommodate any reasonable use of your data up front (rather than doing it as needed for a given application, which is the standard practice), the time for this remains in your application development timeline, and complicates testing. also, even this partial solution is unavailable if there's no direct relationship between whoever presents the data and whoever's writing the app.

      in these cases, an IRDB (I=implicit) system, where the server can construct the "views" dynamically based on a trivial understanding of the database, yields simpler queries, reduced maintenance, and faster application development and testing cycles. in your example, invoice in the first table is always going to join against invoice in the second; you're not going to try and join invoice against items, or join customer to items (which isn't the same as saying you can't query for customer given items, or vice versa).

      --

      i speak for myself and those who like what i say.
    12. Re:Hey! by shutdown+-p+now · · Score: 1

      "Customers", "invoices", and "items" are certainly relations.

      They are certainly objects, too.

  9. Completely off topic by Anonymous Coward · · Score: 0

    Is anyone else experiencing a long delay when loading the Slashdot homepage, like a couple of seconds during which the browser is unresponsive? I'd like to know if there's something I can do about it besides blocking the offending script and reducing Slashdot to an unusable shadow of itself. I don't intend to dive into 400 KBytes (!) of minified Javascript code to find what the hell it is doing.

    1. Re:Completely off topic by Anonymous Coward · · Score: 0

      I find it blocks on network a fair bit before showing the headlines - 'waiting for images.slashdot.org' etc.

  10. Re:Karma Whoring by Anonymous Coward · · Score: 2, Insightful

    This isn't digg. Posting that doesn't guarantee you +5

  11. Voldemort! by GreatRedShark · · Score: 3, Funny

    There's a db called Project Voldemort? That's awesome! I'm switching to that just for the name! I think my manager is a Harry Potter fan so getting approval shouldn't be too hard.

    1. Re:Voldemort! by youthoftoday · · Score: 4, Funny

      A Harry Potter fan? Voldemort? Surely the name is the one thing that'll *prevent* approval?

      --
      -1 not first post
    2. Re:Voldemort! by fuzzyfuzzyfungus · · Score: 5, Funny

      The name might be cool; but the length of some of the commands will really get to you. How many times do you want to type AVADA_KEDAVRA TABLE?

    3. Re:Voldemort! by the_B0fh · · Score: 4, Funny

      **SPOILER ALERT**

      In book 8, it turns out that good ol' Voldy is actually Harry's older brother. They had a tearful reunion, and Voldy now works for Harry.

    4. Re:Voldemort! by hansamurai · · Score: 1

      CREATE_TABLE customer_balance (
      id INTEGER AUTO_INCREMENT,
      balance WINGUARDIUM_LEVIOSA,
      PRIMARY KEY (id)
      );

    5. Re:Voldemort! by MightyMartian · · Score: 1

      Nah, in book 8, Voldemort returns, lops off Harry's hand in a big fight scene, then tells Harry that he's really his father. In book 9, Harry gets mad at Dumbledore's animated picture for deceiving him, to which Dumbledore admits "Tom Riddle was my friend, when I first knew him he was already a great Quidditch player, but I was amazed by how strong the magic was in him."

      --
      The world's burning. Moped Jesus spotted on I50. Details at 11.
    6. Re:Voldemort! by jollyreaper · · Score: 5, Funny

      The name might be cool; but the length of some of the commands will really get to you. How many times do you want to type AVADA_KEDAVRA TABLE?

      Better than PokemonDB. Then you have to jump on top of your desk and shout "Customer Table, I select you!" every time you run a damn query.

      --
      Kwisatz Haderach
      Sell the spice to CHOAM
      This Mahdi took Shaddam's Throne
    7. Re:Voldemort! by rherbert · · Score: 1

      ACCIO firebolt FROM dorm WHERE firebolt_owner = 'hpotter';

    8. Re:Voldemort! by fuzzyfuzzyfungus · · Score: 1

      Be careful, if we keep this up, we'll enjoy the honor and/or terrible burning shame of having defined the official backend database for Lolcode applications.

    9. Re:Voldemort! by GreatRedShark · · Score: 5, Funny

      You're right, that is a bit cumbersome. Hopefully, they'll release a friendly GUI wizard to make working with it more efficient.

    10. Re:Voldemort! by Palshife · · Score: 1

      Well, considering that the database process dies as soon as you type it, you only have to type it once.

      --
      Attention deficit disorder is a complicated issue, spanning several major... HEY LET'S GO RIDE BIKES!
    11. Re:Voldemort! by moderatorrater · · Score: 1

      How many times do you want to type AVADA_KEDAVRA TABLE?

      Just once, but that's all I'll need...

    12. Re:Voldemort! by lastchance_000 · · Score: 1

      Given the name, I suspect you're more likely to get an unfriendly wizard.

    13. Re:Voldemort! by cmdrcoffee · · Score: 1

      You, sir, are the first poster to make me laugh out loud today! Mod Up!

    14. Re:Voldemort! by schamarty · · Score: 1

      jokes apart, you should read the "Barry Trotter" spoofs. Voldemort is called Valuemart, and makes millions selling Barry Trotter merchandise, among other weirdnesses.

    15. Re:Voldemort! by WuphonsReach · · Score: 4, Funny

      Better than PokemonDB. Then you have to jump on top of your desk and shout "Customer Table, I select you!" every time you run a damn query.

      *polite golf clap*

      --
      Wolde you bothe eate your cake, and have your cake?
    16. Re:Voldemort! by Anonymous Coward · · Score: 0

      What makes you think Voldemort was the bad guy? Just because he kept getting defeated by those meddling kids?

      I'm not saying everything he did was right, but have a little sympathy, at least. The guy was constantly being punished by his homosexual headmaster, and it didn't even end when he graduated.

    17. Re:Voldemort! by Anonymous Coward · · Score: 0

      AND, he didn't even once make the obvious joke about a gay guy forgoing promotions to remain in a post with the title, "Head Master."

    18. Re:Voldemort! by daveime · · Score: 3, Funny

      It's already been done ...

      HAI
      CAN HAS DBASE?
      I HAZ A VARIABLE1 IS NOTHING
      IM IN YR DATA ;) "test.mdb"
              CAN I PLZ GET column1 column2 column3
              ALL UP IN table1
              OMG column1 IZ BIGGER THAN 5
              ALL UR BASE R BELONG 2 VARIABLE1
      IM OUTTA YR DATA
      VISIBLE VARIABLE1
      KTHXBYE

    19. Re:Voldemort! by Anonymous Coward · · Score: 0

      Noooo! Voldemort is Harry's dad. He found out just after Harry cut off his arm. Something like that.

    20. Re:Voldemort! by ultranova · · Score: 1

      Better than PokemonDB. Then you have to jump on top of your desk and shout "Customer Table, I select you!" every time you run a damn query.

      "PokemonDB - keeps your operators lean and mean!" ;)

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

  12. Enough with the death of the relational DB by Mr.+Underbridge · · Score: 5, Interesting

    This same basic story keeps getting submitted from the same group of people who are generally trying to sell non-relational-DB stuff. This is an ad. Move along.

    1. Re:Enough with the death of the relational DB by Penguinshit · · Score: 5, Funny

      Don't online dating sites use relational databases?

    2. Re:Enough with the death of the relational DB by Anonymous Coward · · Score: 0

      Yes, this is crap. The same thing was said 10 years ago (if anybody remembers that long ago) about XML. XML was going to replace RDBMSes. Didn't happen.

    3. Re:Enough with the death of the relational DB by Anonymous Coward · · Score: 0

      If it's actually useful and gets some media coverage, then oracle, mssql and others will add support for it, and all the original developers will be SOL.

      In my mind KVP stores seem to be a sop for the modern cheap HashMap-wielding programmer who can't even grok objects and relationship models.

    4. Re:Enough with the death of the relational DB by wonmon · · Score: 1

      Dunno, but I use a post-relational database to keep track of my ex's faults. It's a scalability issue.

    5. Re:Enough with the death of the relational DB by caveratpaul · · Score: 1

      The XML databases and tools are just now becoming mature enough to compete on the same level as relational databases. I would say they won't completely replace them but they have the potential to augment them in some interesting verticals. Anywhere the problem can be thought of as document centric (or hierarchical) an XML DB has the potential to outperform a relational data-store.

      For some examples of XML DBs you can refer to:
      http://exist.sourceforge.net/
      http://www.modis.ispras.ru/sedna/
      http://www.marklogic.com/
      http://www.x-hive.com/

      Also take a look at http://www.saxonica.com/ (Michael Kay's company) for some insight on how XML DB's can be used.
      Remember that it takes years for major process cycles to change. XML DB's may not be in the limelight yet but their time is approaching.

    6. Re:Enough with the death of the relational DB by lgw · · Score: 1

      Yes, but 10 years ago it wasn't XML DBs - it was XML. As in: just store all your data in a big XML document, no query language or anything. There are ways of doing that that aren't as stupid as it sounds at first but ... still pretty dumb.

      Of course, these days I spend a lot of time arguing that "we don't need a DB at all, we'll just store that in XML", so I shouldn't mock. When your entire data set fits in memory, an RDBMS is a bit much.

      --
      Socialism: a lie told by totalitarians and believed by fools.
    7. Re:Enough with the death of the relational DB by Anonymous Coward · · Score: 0

      Maybe if they sold a non-relational-DB on FreeBSD we could kill two birds with one stone.

    8. Re:Enough with the death of the relational DB by Yetihehe · · Score: 3, Funny

      They do. Object databases are only for insensitive clods.

      --
      Extreme Programming - Redundant Array of Inexpensive Developers
    9. Re:Enough with the death of the relational DB by itsdapead · · Score: 1

      Don't online dating sites use relational databases?

      Indeed, I believe that there's a custom build of PostgreSQL used by the more... specialist agencies that includes the new "UNNATURAL JOIN" keyword, permits one-to-many "CROSS JOIN" queries and performs an automatic rollback before any self-join.

      --
      In a survey of 100 programmers, 111111 thought that duck-typing was a good idea.
  13. 99.9% of databases... by Ckwop · · Score: 3, Interesting

    99.9% of databases claim to follow the relational model.

    The rest have scalability problems that 99.9% of developers will never see throughout their entire careers.

    So the answer is a simple, emphatic, no.

    1. Re:99.9% of databases... by Seth+Kriticos · · Score: 1

      Well, I'm playing with ZODB, which is a Python object database. Nice thing, can also be run as clustered version, is really scalable and reliable. The thing is, only very few know it, even less want to know about it. If people talk about databases, they think about RDBMSs. Developers hate new stuff, that's why relational databases will stay for a very long time, even in situations where they might not be the optimal choice. That's life.

    2. Re:99.9% of databases... by arevos · · Score: 2, Interesting

      99.9% of databases claim to follow the relational model.

      The rest have scalability problems that 99.9% of developers will never see throughout their entire careers.

      Uh, actually, relational databases are pretty damn hard to scale. That's basically the main problem with them. Why do you think relational databases are so often paired with a cache made from a hashtable-based database?

    3. Re:99.9% of databases... by Just+Some+Guy · · Score: 1

      Developers hate new stuff, that's why relational databases will stay for a very long time, even in situations where they might not be the optimal choice.

      To play to your example, there are also developers who played with ZODB long enough to run screaming back to PostgreSQL. "Different" is not automatically the same as "better".

      --
      Dewey, what part of this looks like authorities should be involved?
    4. Re:99.9% of databases... by TapeCutter · · Score: 1

      "Uh, actually, relational databases are pretty damn hard to scale. That's basically the main problem with them."

      Do you have a better idea?

      "Why do you think relational databases are so often paired with a cache made from a hashtable-based database?"

      You mean an index? - the kind of thing you would use to efficiently access large amounts of stored data?

      --
      And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
    5. Re:99.9% of databases... by arevos · · Score: 1

      Do you have a better idea?

      A better idea for what? Different problems require different solutions. Relational databases are useful tools to solve a wide range of problems, but they're not particularly easy to scale. That's one of their main weaknesses.

      You mean an index? - the kind of thing you would use to efficiently access large amounts of stored data?

      No, I was more thinking of a distributed hash table database like Memcached. Hashtable databases are less useful than a full relational databases, but as they can be trivially distributed over any number of machines, they make scaling extremely easy.

      So if you look at any large website, there will be typically two database layers. The relational database is used as the master, and the more scalable hashtable database is used as a read-only cache.
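
      A minimal sketch of that two-layer arrangement, assuming a plain dict as a stand-in for a Memcached-style store and an in-memory SQLite table as the relational master. The function names and the "user:<id>" key format are invented for the example; a real deployment would talk to a cache server through a client library.

```python
import sqlite3

# Relational master (hypothetical users table).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'alice')")
db.commit()

cache = {}                # stand-in for the hashtable cache layer
stats = {"db_reads": 0}   # counts how often we fall through to the master


def get_user_name(user_id):
    """Read through the cache; hit the relational master only on a miss."""
    key = f"user:{user_id}"
    if key in cache:
        return cache[key]
    stats["db_reads"] += 1
    row = db.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    name = row[0] if row else None
    cache[key] = name
    return name


def rename_user(user_id, new_name):
    """Write to the master, then invalidate the cached copy."""
    with db:
        db.execute(
            "UPDATE users SET name = ? WHERE id = ?", (new_name, user_id)
        )
    cache.pop(f"user:{user_id}", None)
```

      Writes only ever go to the master; the cache just gets invalidated and refilled on the next read, which is why it can stay so simple.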

    6. Re:99.9% of databases... by TapeCutter · · Score: 1

      I think you are conflating application with theory (or I have misunderstood you). What you call a "hash table database" others might call an "indexed cursor". It is true that setting up a distributed RDBMS is probably more difficult than setting up a distributed "hash table database" but this is not the same thing as saying scalability is their main weakness.

      "A better idea for what?"

      I was asking for an example of a data storage technique that scales better than RDB. Your last sentence makes a lot of sense but Memcached isn't really a database in the traditional sense, it's more akin to a fancy cursor that could just as easily be part of the RDBMS toolkit.

      --
      And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
    7. Re:99.9% of databases... by arevos · · Score: 2, Informative

      What you call a "hash table database" others might call an "indexed cursor".

      Others would be wrong ;)

      An indexed cursor only contains a reference to the original data. Memcached contains a duplicate of the original data, so I'd argue it was a database in its own right.

      However, even if Memcached doesn't meet the criteria of a database, DBM-based databases certainly do. They operate on a similar principle; a unique key points to a specific piece of data. Unlike Memcached, they are persistent, but like Memcached they are very fast and easily scalable.

      I was asking for an example of a data storage technique that scales better than RDB.

      Well, consider a modern DBM-based database like Tokyo Cabinet. Let's say we want to distribute it evenly across 16 machines, labelled 0 to F. When a request for data comes in, we MD5 the key and use the first 4 bits to determine the machine to use. This gives us an even and consistent spread of data between machines.

      Relational databases can't easily use the same trick, because table joins are very costly to perform if the table data is distributed across several machines. In a nutshell, the flexibility of relational databases reduces their speed and scalability compared to databases with a more limited scope.
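
      The 16-machine scheme described above can be sketched roughly like this. It uses Python's hashlib for the MD5 step and plain dicts standing in for the individual machines; the ShardedStore class and its method names are hypothetical and have nothing to do with Tokyo Cabinet's actual API.

```python
import hashlib

NUM_SHARDS = 16  # one shard per hex digit of the first nibble


def shard_for(key: str) -> int:
    """Pick a shard from the first 4 bits of the key's MD5 digest."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return digest[0] >> 4  # top nibble of the first byte, 0..15


class ShardedStore:
    """Toy key/value store; each dict stands in for one machine."""

    def __init__(self):
        self.shards = [dict() for _ in range(NUM_SHARDS)]

    def put(self, key, value):
        self.shards[shard_for(key)][key] = value

    def get(self, key, default=None):
        return self.shards[shard_for(key)].get(key, default)


store = ShardedStore()
store.put("abc", "some value")
```

      Every lookup touches exactly one shard, with no coordination between machines. Note that this fixed-prefix scheme moves a lot of keys around if the machine count changes, so treat it as an illustration of the idea rather than a deployment recipe.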

  14. Finally the OODB people will by thammoud · · Score: 5, Insightful

    Leave us RDBMS dinosaurs alone. String Name/Value pairs, that is a great innovation. In other news, Sun will be dropping all types from the Java object system and rely on the VOID type. Idiots.

    1. Re:Finally the OODB people will by whyloginwhysubscribe · · Score: 1

      Isn't a relational database essentially a rationalised array of name/value pairs anyway?

    2. Re:Finally the OODB people will by TheTurtlesMoves · · Score: 1

      You say this as a joke. Yet some C code i was forced to work on once. Well there was void, void*, void**, void*** and (void*)(void*). Or some other such nonsense. I don't recall exactly. But i had to check if it was valid C rather than some "compiler feature".

      --
      The Grey Goo disaster happened 3 billion years ago. This rock is covered in self replicating machines!
  15. Yeah by Spazmania · · Score: 1

    The conclusion being that relational databases and key value stores aren't really mutually exclusive and instead are different tools for different requirements.

    In related news, black is not white.

    --
    Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
    1. Re:Yeah by idontgno · · Score: 3, Funny

      Is white the new black?

      No, it isn't, black is the new black, and white and black are not really mutually exclusive. And.... I made you look. Thanks for the pageviews, suckers!

      --
      Welcome to the Panopticon. Used to be a prison, now it's your home.
    2. Re:Yeah by Anonymous Coward · · Score: 0

      White? Black? Pffft. It's all albedo.

  16. A great open source implementation by thammoud · · Score: 5, Funny

    Map db = new HashMap();

    beginTransaction(); // Synchronize on the map
    db.put("key", "value");
    commitTransaction(); // Just serialize the fucker to a file. The idiots using this won't know the difference.

    1. Re:A great open source implementation by julesh · · Score: 1

      commitTransaction(); // Just serialize the fucker to a file. The idiots using this won't know the difference.

      You've come across prevayler then?

    2. Re:A great open source implementation by ADRA · · Score: 1

      It's better than the Image viewer that I once played with. It created all metadata and image thumbnails into a purely memory database.

      It then stored the database to disk as a large set of insert statements instead of using some sort of disk based storage container. Needless to say, said application ran like a frigging slug. With larger sets of thumbnails, it'd simply run out of memory and die horribly.

      --
      Bye!
    3. Re:A great open source implementation by Anonymous Coward · · Score: 0

      Map db = new HashMap();

      beginTransaction(); // Synchronize on the map
      db.put("key", "value");
      commitTransaction(); // Just serialize the fucker to a file. The idiots using this won't know the difference.

      I think you just re-invented MySQL.

  17. It's like facebook...only slashdottier by Anonymous Coward · · Score: 1, Interesting

    Ugh, yet another superficial blog post pimped out on slashdot. The guy doesn't have a solid technical grasp of data systems and what really constitutes the difference between a system like BigTable or SimpleDB versus an RDBMS. Instead of talking about the differences in transaction management, consistency guarantees, etc. he comes up with brilliant ideas like RDBMSes are slower because they are more consistent.

    Enough with the bad blog posts already, it's like facebook, only less interesting.

    1. Re:It's like facebook...only slashdottier by MightyMartian · · Score: 1

      Enough with the bad blog posts already, it's like facebook, only less interesting.

      No, it's more like porn, except it doesn't have naked people doing things to themselves and each other.

      --
      The world's burning. Moped Jesus spotted on I50. Details at 11.
    2. Re:It's like facebook...only slashdottier by poot_rootbeer · · Score: 1

      "Enough with the bad blog posts already, it's like facebook, only less interesting."

      No, it's more like porn, except it doesn't have naked people doing things to themselves and each other.

      It's kind of like a car, if the car didn't have a fuel gauge or any springs inside the passenger seat...

  18. Free Traffic by nurb432 · · Score: 1

    And people complain that i don't go read the articles and rely on summaries.

    This is one of the reasons.

    --
    ---- Booth was a patriot ----
  19. ah, stupid. by tjstork · · Score: 2

    The big dumb thing about key store values is that they are actually just a subset of relational algebra in theory and are thus readily implementable in a relational database in fact. If you really wanted to have a database just do key / store values, you could quite easily do that in any rdms.
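
    As a rough illustration, here is a key/value interface layered over one two-column table, using Python's built-in sqlite3 module. The SqlKeyValueStore class, the kv table, and the choice to store every value as text are assumptions made for the sketch.

```python
import sqlite3


class SqlKeyValueStore:
    """Hypothetical key/value wrapper over a single relational table."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv ("
            " key TEXT PRIMARY KEY,"
            " value TEXT NOT NULL)"
        )

    def put(self, key, value):
        # INSERT OR REPLACE overwrites any existing row with the same key.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                (key, value),
            )

    def get(self, key, default=None):
        row = self.conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else default

    def delete(self, key):
        with self.conn:
            self.conn.execute("DELETE FROM kv WHERE key = ?", (key,))


store = SqlKeyValueStore()
store.put("color", "red")
store.put("color", "blue")  # same key, value replaced
```

    This shows only that the interface is easy to express relationally; it says nothing about raw speed or how well it distributes across machines.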

    --
    This is my sig.
    1. Re:ah, stupid. by poot_rootbeer · · Score: 3, Insightful

      If you really wanted to have a database just do key / store values, you could quite easily do that in any rdms.

      Sure, but it's not likely that a key/value store implemented within a general-purpose RDBMS can achieve the same raw performance as a system designed to do nothing but implement a key/value store -- nor the distributability, for that matter.

    2. Re:ah, stupid. by Anonymous Coward · · Score: 0

      as long as you don't mind all values being the same type, or serialized/unserialized to a string.

    3. Re:ah, stupid. by DragonWriter · · Score: 1

      The big dumb thing about key store values is that they are actually just a subset of relational algebra in theory and are thus readily implementable in a relational database in fact.

      Yeah, sure you can implement a key/value store in any RDBMS, but then, to make it into a distributed database, you'd have all the problems of doing that with an RDBMS, whereas purpose-built key/value stores and similar distribute much better than traditional RDBMS's.

      If you really wanted to have a database just do key / store values, you could quite easily do that in any rdms.

      OTOH, if you wanted to do seamless scaling, you can't do that easily in most RDBMS's. But you can do that with a lot of the purpose-built key-value stores. OTOH, most distributed key/value stores don't provide ACID guarantees, though Scalaris seems to.

    4. Re:ah, stupid. by anothy · · Score: 1

      no reason to assume that from the text. comparing disk-based to disk-based, something designed to just do key/value pairs (or tuples, even) is going to perform better than something designed to provide more flexibility.

      --

      i speak for myself and those who like what i say.
    5. Re:ah, stupid. by Anonymous Coward · · Score: 0

      If you end up using the same data structures and the somewhat larger memory footprint through views after the optimization (which, granted, takes time if not done automatically), the speed comparison centers on the query systems, system-specific tuning and any advantage that maturity of implementation, like bypassing the operating system, can bring. Like the article states indirectly, use whatever rocks your application + environment's boat, unless the boss tells otherwise. I'll take my coat now, thank you very much.

    6. Re:ah, stupid. by smellotron · · Score: 1

      It is fair to say, "Any key/value database will be at least as fast as any relational database", since one degenerates to the other. However, I find it quite easy to believe that there are a good number of optimizations that can be applied to a key/value database that don't apply to relational systems with foreign key integrity. There are more constraints, and more constraints usually lead to more efficient implementation. For an example, check out memcached.

    7. Re:ah, stupid. by convolvatron · · Score: 1

      i hardly want to defend sql, it's a terribly conceived language.

      however, if by 'distributability' you mean execution across machines connected by unreliable networks, then it's just as straightforward to implement relations having a single key and a single value as multiple keys and multiple values. all the additional complexity is in the language front end, and it's not that bad.

      with both query semantics the primary limit in scaling, or concurrency, is the cost of providing isolation, if required, and the workload dependent serialization conflicts.

  20. Summing up by godrik · · Score: 0, Redundant

    I do not believe someone dared to write this down: "If you don't need a relational database then relational databases are inefficient". Wow, this is rocket science, guy! You should apply for the Fields medal.

    It is obvious that if you do not need a structured and coherent base but a big hash table with properties, then you do not need a relational database.

  21. In relation to what? by Penguinshit · · Score: 5, Funny

    I won't believe it until Netcraft confirms it.

    1. Re:In relation to what? by Netcraft+Confirms+It · · Score: 1

      Nope, sorry.

  22. I see the problem! by dexmix · · Score: 0, Offtopic

    they think Nissan makes the Civic!

    1. Re:I see the problem! by poot_rootbeer · · Score: 3, Insightful

      they think Nissan makes the Civic!

      This lack of data integrity could have been prevented if they had used a relational database...

    2. Re:I see the problem! by Tablizer · · Score: 1

      [they think Nissan makes the Civic!] This lack of data integrity could have been prevented if they had used a relational database...

      If early life had too many integrity checks, we'd still be slime mold. Viva mutations! (But let somebody else drive the Nissan Civic, please.)
           

  23. This is an old argument which will not fly by bogaboga · · Score: 5, Informative

    It has been suggested before that the life of the relational DB is coming to an end. I must say that while I agree with this statement: -

    Relational databases scale well, but usually only when that scaling happens on a single server node. When the capacity of that single node is reached, you need to scale out and distribute that load across multiple server nodes. This is when the complexity of relational databases starts to rub against their potential to scale.

    I disagree with the following statement: -

    Try scaling to hundreds or thousands of nodes, rather than a few, and the complexities become overwhelming, and the characteristics that make RDBMS so appealing drastically reduce their viability as platforms for large distributed systems.

    I submit that the complexity can be managed and that's why we have jobs.

    I am an IT consultant at a major bank and we keep all kinds of data. Data that many find useless and is spread across 27 [major] nodes. Total records in our biggest table number about 57 million with 49 rows. I can tell you that data querying and integrity maintaining are a breeze if the schematic design is correct in the first place.

    We are always designing and testing different scenarios. In cases where we have had to change the schema, it has been simple if one knows what to do.

    I must say that Open Source DBs have worked for us though we rely on products from IBM and Oracle.

    Our philosophy is: If it works in PostgreSQL, it will even do wonders on DB2 or Oracle. I do not see how we can do away with the relational DB. Whoever designed it in the beginning did a marvelous job.

    1. Re:This is an old argument which will not fly by cat_jesus · · Score: 2, Informative

      Total records in our biggest table number about 57 million with 49 rows.

      I think you mean columns.

    2. Re:This is an old argument which will not fly by Anonymous Coward · · Score: 0

      Total records in our biggest table number about 57 million with 49 rows.

      I think you mean columns.

      He's an IT Consultant. He'll charge you $350/hr to talk out his ass. Don't get in his way; he's a PROFESSIONAL.

    3. Re:This is an old argument which will not fly by bogaboga · · Score: 1

      Yes! Thank you!

    4. Re:This is an old argument which will not fly by Just+Some+Guy · · Score: 1

      Total records in our biggest table number about 57 million with 49 rows.

      First, while I agree with your conclusion, 57 million rows isn't all that big. Second, is that 49-column table actually normalized? I mean, there's no theoretical limit on the size of a tuple, but in practice they don't tend to be anywhere near that big.

      --
      Dewey, what part of this looks like authorities should be involved?
    5. Re:This is an old argument which will not fly by Anonymous Coward · · Score: 2, Informative

      E F Codd, an IBM mathematician. And I won't even look at a technology that claims to replace the RDB until I've seen a fully developed mathematical treatment that at least approaches the sophistication of Codd's work.

    6. Re:This is an old argument which will not fly by Anonymous Coward · · Score: 0

      I do not see how we can do away with the relational DB. Whoever designed it in the beginning did a marvelous job.

      You're welcome.

      -Ted Codd (deceased)

    7. Re:This is an old argument which will not fly by feronti · · Score: 1

      It saddens me that someone who apparently works with relational databases every day has no idea who invented them in the first place. What are they teaching kids these days, anyway?

    8. Re:This is an old argument which will not fly by GWBasic · · Score: 1

      Our philosophy is: If it works in PostgreSQL, it will even do wonders on DB2 or Oracle. I do not see how we can do away with the relational DB. Whoever designed it in the beginning did a marvelous job.

      Let's face it, programming a relational database is hard for the run-of-the-mill, copy & paste coder! I find the move away from relational databases to be a naive path taken by those who don't understand the value of what they do... It's like trying to turn lead into gold.

  24. ?'s meaning - literal and implied by qbzzt · · Score: 5, Insightful

    In headlines, "?" implies that something is a serious question, whose answer is likely to be yes. One that makes it worth spending the time to read the article.

    Imagine the headline said "Does Obama Smoke Crack?" and the article had a bunch of stuff about the president, with a last paragraph saying: "There is absolutely no reason to think that President Obama has ever smoked crack."

    --
    -- Support a free market in the field of government
    1. Re:?'s meaning - literal and implied by 117 · · Score: 5, Funny

      President Obama smokes crack?!!?!??!!?!

    2. Re:?'s meaning - literal and implied by Anonymous Coward · · Score: 0

      In headlines, "?" implies that something is a serious question, whose answer is likely to be yes. One that makes it worth spending the time to read the article.

      You seem to be implying that all those things are false about the article. Yet after reading the article it is quite apparent that your implication is incorrect.

    3. Re:?'s meaning - literal and implied by Cajun+Hell · · Score: 4, Interesting
      --
      "Believe me!" -- Donald Trump
    4. Re:?'s meaning - literal and implied by qbzzt · · Score: 1

      You're right - I should have used "summary" instead of "article", since the headline was the headline for the slashdot summary.

      --
      -- Support a free market in the field of government
    5. Re:?'s meaning - literal and implied by digitig · · Score: 4, Insightful

      In headlines, "?" implies that something is a sensationalized question, whose answer is "almost certainly, no".

      Fixed that for ya.

      --
      Quidnam Latine loqui modo coepi?
    6. Re:?'s meaning - literal and implied by rthille · · Score: 0, Offtopic

      Welcome to the post FauxNews world, where questions like:
      Obama a Muslim?
      Obama a Socialist?
      George W. Bush Great President or Greatest President?

      Are the height of journalistic integrity!

      --
      Awesome furniture, accessories and cabinetry in Santa Rosa, CA: http://humanity-home.com/
    7. Re:?'s meaning - literal and implied by Dragonslicer · · Score: 1

      I've been pushing for the 'punctuationpunditry' tag for crappy headlines like this.

    8. Re:?'s meaning - literal and implied by Anonymous Coward · · Score: 0

      Welcome to the post FauxNews world, where questions like:
      Obama a Muslim?
      Obama a Socialist?
      George W. Bush Great President or Greatest President?

      Are the height of journalistic integrity!

      Yeah, Fox News is the worst.

      What? Does Fox News not having Obama-fanboy "news" anchors comment on how Obama "sends a tingle" up their leg annoy you?

      Are you pissed off because Fox News has higher standards than "fake but accurate"?

      Get this, bright boy: if Obama is the next FDR, he'll be out on his ass in 2012.

      Don't think so?

      Take a look at what the unemployment rate was in the US in 1936 - four years into FDR's Presidency.

      Yeah.

      FDR's economic policies were a demonstrable and complete failure.

    9. Re:?'s meaning - literal and implied by hondo77 · · Score: 0, Offtopic

      Last I checked FDR was our only Unconstitutionally long office holder in the presidency....

      Last I checked, FDR being elected to four terms was perfectly constitutional at the time.

      --
      I live ze unknown. I love ze unknown. I am ze unknown.
    10. Re:?'s meaning - literal and implied by NMEismyNME · · Score: 0, Offtopic

      The 22nd amendment was ratified during Truman's presidency, which if I'm not mistaken was after FDR, which if I'm again not mistaken means that it was completely constitutional for FDR to serve more than two terms, it was just against the convention.

    11. Re:?'s meaning - literal and implied by zenlunatics · · Score: 2, Insightful

      so your hate for Obama is strong enough to wish that the entire country has a bad 4 years? gee, thanks.

    12. Re:?'s meaning - literal and implied by sammy+baby · · Score: 0, Offtopic

      Shh. Obama Derangement Syndrome does strange things to a person. Best not to antagonize them.

    13. Re:?'s meaning - literal and implied by value_added · · Score: 4, Funny

      President Obama smokes crack?!!?!??!!?!

      Dunno. Has he stopped beating his wife?

    14. Re:?'s meaning - literal and implied by Z-MaxX · · Score: 1

      Dunno. Has he stopped beating his wife?

      Just because he hasn't stopped doesn't mean he ever started.

      --
      Dr Superlove 300ml. I use my powers for awesome
    15. Re:?'s meaning - literal and implied by sharkey · · Score: 1, Funny

      OK, but, what about the fisting?

      --
      "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
    16. Re:?'s meaning - literal and implied by Anonymous Coward · · Score: 0

      Why are you looking to Slashdot as a zenith of anything, let alone Journalistic Quality?

    17. Re:?'s meaning - literal and implied by Latinhypercube · · Score: 0

      The above method is extensively used on FOX news.

    18. Re:?'s meaning - literal and implied by cayenne8 · · Score: 0
      Obama a Muslim? - Debatable, he was raised in that environment and went to Muslim schools early in life

      Obama a Socialist? - Well, so far his track record is definitely pointing in that direction.

      George W. Bush Great President or Greatest President? - Aside from handling post 9/11 fairly well, no...not a good president at all. He gave conservatism, TRUE conservatism a bad name. I'm glad he's gone, but, I wasn't happy with either choice presented to us to succeed him.

      --
      Light travels faster than sound. This is why some people appear bright until you hear them speak.........
    19. Re:?'s meaning - literal and implied by theshowmecanuck · · Score: 1

      why is this more offtopic than its parent posts? the 'fisting' comment by that airhead woman was hysterical. an unintended double entendre.

      --
      -- I ignore anonymous replies to my comments and postings.
    20. Re:?'s meaning - literal and implied by Anonymous Coward · · Score: 0

      I am certain that I have has much disdain for Obama as anyone else you might find here. Nothing would make me happier then seening (sic) O leave office in 2012 head hung in disgrace.

      Here I was thinking this was Slashdot, not realizing it's actually the Free fucking Republic. It takes an average (by redneck standards) kind of imbecile to think that McCain/Palin would've been a better option, more of the same neocon garbage that got us into this mess in the first place.

      You want Obama to fail? Then you probably want to be chronically unemployed within a year's time, I guess. Unless you're an obese, Oxycontin poppin' ideologue who fails in every personal relationship and has to go on Caribbean sex tours to get any action between your loins, in which case your inflated salary would be assured. What was it that Bill Hicks called it back then? A scat muncher.

    21. Re:?'s meaning - literal and implied by Anonymous Coward · · Score: 0

      Rush Limbaugh, Caribbean sex tours, scat muncher.

      Good Lord, does anybody else picture Baron Vladimir Harkonnen with this description?

    22. Re:?'s meaning - literal and implied by Anonymous Coward · · Score: 0

      Aside from handling post 9/11 fairly well...

      Never lose sight of the fact that 9/11 happened on the jug-eared goon's watch, that he and his cabinet ignored the warnings. Even if we ignore a thousand other facts, this qualifies the Bush presidency as ineptly catastrophic.

      Presidential memo: "Bin Laden determined to attack the USA with airplanes".
      Bush response: "OK, you've covered your ass now, you can go".
      Condoleezza Rice: "Nobody could have anticipated an attack on US soil with airplanes".

      9/11 was THE epic fail; why people keep ignoring (as in ignorance) or glossing over this fact will never fail to thoroughly baffle me. However, the White House squatters (non-elected) expertly politicized and propagandized it to fit their salivating, warmongering, regressive ideological agenda after the fact. If invading the wrong country with disastrous results can be referred to as handling it pretty well, then the jug-eared squatter holding hands in Texas with a House Of Saud prince must be fap juice.

      Hurricane Katrina was the straw that finally broke the camel's back, nobody can perpetually disguise incompetence, yet even that was an udder they blatantly milked dry with obscure no-bid contracts for KBR (Halliburton).

    23. Re:?'s meaning - literal and implied by laejoh · · Score: 1

      Mu!

    24. Re:?'s meaning - literal and implied by cromar · · Score: 1

      Concur!!!!-itude

  25. Ridiculous by Eravnrekaree · · Score: 3, Insightful

    Really, relational is the best way to take a data set and be able to access it in various ways. Many of the other concepts are indeed regressions and reintroduce problems a relational database solves. Relational allows you to display and view data in various different ways and apply the dataset in new ways, ways that may not have been a part of the original design of the application. Every so often we hear someone harp about some new database technology that reintroduces all of the problems of the past, but relational is still the best and most versatile way to store your data in a way that allows for query flexibility.

    1. Re:Ridiculous by Grapedrink · · Score: 2, Insightful

      I agree with you on a lot of points, particularly people coming up with stupid solutions and creating new problems, but how is the rest of this insightful? Sure, relational is a good general fit for databases, but it sounds like you are saying the fact that you can query and modify it using something like SQL in most implementations makes it great?

      Exactly how is that easier than some other ways, such as building an object database? Can't you just write a few lines of code that are far more expressive than any SQL ever could be in a language like Common Lisp, Smalltalk, Python, Ruby, etc? Isn't that more accommodating than a relational model which limits your options due to performance vs. flexibility vs. integrity vs. extensibility vs. scalability? How does SQL give you more ways to manipulate things than a map, collect, slice, reduce, anonymous function/lambda, etc?

      I use both relational and object databases (preference to object dbs in all honesty). For an object database, my process both in use and development is to write and modify like it sounds, objects. Instances of objects in those classes are automatically stored for me and even in most implementations, class level data as well. I simply write my code and trudge along and do not worry about some ridiculous ORM. If I need transactions, I have them at the object level which I would want anyway even with a relational DB.

      If I need a query, it is done in a well-known language that I used to write the application. I can of course see if there was no application, it might be annoying to do this and relational can make some of that easier, but that is rarely the case. Further, I don't hit as many bumps where I need to denormalize my data to do reporting or data warehousing. I simply once again write code as normal to get what I want.

      A great example is try storing an organizational hierarchy in a database. Query it for basic info such as a list of a manager and all subordinates and superiors. Now try to ask it for the full path between employees. Keep asking it questions about the hierarchy. In just about every relational db it is a fail. Oracle for instance even realized things like this and added "Connect By." Storing the data itself is a nightmare and you end up needing something like nested sets, self joining queries, cursors (never), handing it off to an application (aka relational failure), or materialized path.

      You run into other similar problems where you see hackish solutions in the relational world like table inheritance. Why have it if a relational database is so good? It is there because relational completely fails here, just like object databases fail elsewhere. There is no ideal solution, and for general cases both work great in my experience, even giving an edge for web applications to object dbs.

      There are so many areas where either the relational model itself, or SQL fails. If you have not hit them, then you have not used relational databases as much more than a glorified spreadsheet. The amount of time I spend tuning my queries in a relational db is ridiculous, even for relatively simple data. Hints, denormalization, columns as rows, cursors, triggers, user defined functions, and other such devices are all crutches for relational dbs. Of course some of those are also caused by bad devs, but it need not be that hard in the first place.

      Anyway, I am not trying to slam the relational model. Rather, I think you are wrong to say it's the best and most flexible. Like all things, it depends what you are doing, and in my own experience object databases have been far easier to work with and maintain. I must save months of work every time I use one, but general ignorance often forces me to use either object or relational. If people better understood the strengths of each and paid more attention to each specific task rather than marketing, we would all be happier. It's sad that complaints about tools for example are even valid points. If you market the hell out of something and it just becomes the standard for whatever reason, then of course it is going to win in areas like that. You would think with all the anti-Microsoft rhetoric around here, people would get it.

      For now, I'll continue to use both and enjoy them for different reasons.

    2. Re:Ridiculous by FxChiP · · Score: 1

      A great example is try storing an organizational hierarchy in a database. Query it for basic info such as a list of a manager and all subordinates and superiors. Now try to ask it for the full path between employees. Keep asking it questions about the hierarchy. In just about every relational db it is a fail. Oracle for instance even realized things like this and added "Connect By." Storing the data itself is a nightmare and you end up needing something like nested sets, self joining queries, cursors (never), handing it off to an application (aka relational failure), or materialized path.

      I am probably running into the exact problems with relational that you are describing here, but if you wanted all subordinates *AND* superiors to a given person, could you not do:

      SELECT name, subord, CASE WHEN name = 'personsname' THEN 'Supervisor' ELSE 'Subordinate' END AS relation FROM employees WHERE name = 'personsname' OR subord = 'personsname';

      Or something related? (note: this is pseudo-SQL)

      But I do agree that SQL, at least, is way too "short-sighted" to (easily?) descend or walk through an entire, specific hierarchy. You might get *some* level of recursion in there, but not easily in one SELECT statement.

    3. Re:Ridiculous by xelah · · Score: 2, Interesting
      Hierarchical queries are a historical weakness of SQL (but not the relational model) - that's why he chose it as an example. You'd actually do something like this (but you'll need a very recent database):

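      WITH RECURSIVE hierarchy AS (
      -- anchor row: the person we start from
      SELECT * FROM employees WHERE name = 'personsname'
      UNION ALL
      -- recursive step: join back to hierarchy itself, not to employees
      SELECT sub.* FROM employees AS sub, hierarchy AS super
      WHERE super.id = sub.parent_department
      )
      SELECT * FROM hierarchy;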

    4. Re:Ridiculous by xelah · · Score: 2, Insightful

      Sure, relational is a good general fit for databases, but it sounds like you are saying the fact that you can query and modify it using something like SQL in most implementations makes it great?

      If you're a DBA, system administrator or tester - or if you simply have to do something ad-hoc and dodgy as a quick fix on a live system - then this makes it not so much great as absolutely fantastic. You can do things like:

      • Look at the most time consuming queries and analyze them, optimize them, add/remove indexes or move tables or indexes between different sets of disks. And when you do, the query plans will change because they have not been frozen into the application code.
      • Make ad-hoc changes, or generate ad-hoc reports (or run a query from cron, say) without having to write a little program every time.
      • Examine the data following your software screwing up, and fix it.
      • Run the queries your software has generated and check the results. Correct the query and try again.
      • Fetch a list of currently held locks, or examine the queries which have resulted in deadlocks being reported to the log.
      • Add columns to support admin or reporting functions (or a second application) without worrying about the effect on the (still running) original application.
      • Write a reporting system which programmatically generates queries and has the DBMS do the difficult bit of working out query plans.

      These aren't specific to relational databases or SQL, of course. However, having a query language is amazingly useful.
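
      The first point, plans changing when you add an index while the query text stays the same, can be seen with SQLite's EXPLAIN QUERY PLAN. A rough sketch only: the seats table, the seats_block index and the plan() helper are invented for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3


def plan(conn, sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row is the human-readable detail string.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seats (id INTEGER PRIMARY KEY, block TEXT, status TEXT)")

query = "SELECT id FROM seats WHERE block = 'A'"

before = plan(conn, query)  # no index yet, so expect a full table scan
conn.execute("CREATE INDEX seats_block ON seats(block)")
after = plan(conn, query)   # same query text, now planned against the index

print("before:", before)
print("after: ", after)
```

      The application's query string never changes; only the DBMS's choice of plan does.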

      I'm surprised you're complaining about having to tune your queries. A lot of databases and SQL have shortcomings, but it's really not that hard if you know your database well (and haven't chosen, say, MySQL). You must still have a query plan with your object databases - it's just implied by your code. (I'm assuming you're not using some sort of alternative query language, because your comment suggests otherwise and you'd only have to tune that instead). It won't adapt to changing data or indexes, and you're going to have a lot of work to do if you want to duplicate some of the more sophisticated techniques a modern database will use. Worse still, you're going to have to change your application, add some sort of profiling and run it in-place or in a test harness to work out why it's taking as long as it does. And when you want to try a different plan you have to rewrite your code.

      It's the ORM layer that's the real pain in the arse (assuming you're using OOD, and assuming you actually want a direct mapping between your object model and relational model). Things like Hibernate and judicious use of code generation make it a lot easier, but you still need to know what's going on and you still need to (and can!) choose between navigating among objects (letting the ORM do the queries) and generating a hand-written query. To some extent an ORM (and the RDBMS vs OODBMS choice) is just a reflection of the different requirements of on-disk vs in-memory representations of objects. On-disk storage is all about efficient and flexible querying, retrieval, (distributed) concurrency, storage and management of huge data-sets, whereas in-memory storage is all about assigning behaviour and navigating relationships between smaller sets of objects whilst carrying out that behaviour.

      In any case, the original article is just silly. How does taking all the formal structure away make any difference to the fundamental scalability restrictions - your application's need for data consistency (across nodes) and concurrency control? I work in ticketing. It's not the relational model that causes scalability problems, it's the fundamental fact that 100k people are competing for access to 10k seat statuses, that when we check per-person ticket limits or assign seats we need 100% up-to-date data, that we regularly need to fetch the status of all the seats in a block for display, etc. I believe that concurrency and scalability concerns

    5. Re:Ridiculous by Anonymous Coward · · Score: 0

      Wow, such a long post, I almost thought it was another one of those John Galt trolls...

    6. Re:Ridiculous by anothy · · Score: 1

      you could make a reasonable argument that RDB systems are the "most versatile" way to store your data, but "best" is a much harder claim.
      for one thing, that versatility comes at a cost, including more complex server code, higher resource costs, and increased development and testing time.
      in a lot of cases, that ability to do arbitrary joins is entirely wasted: your data set is much more rigid than that. in your corporate directory, you're never going to join telephone numbers against department names, or titles against room numbers. here, typical RDB systems offer too little intelligence and too much weight.
      in other cases, those being described in the original article, the costs are simply unbearable. replication and distribution of relational databases is a real problem. even if everything can be stored locally, if your data set is large enough the RDB overhead is still going to hurt. this shows up in a lot of scientific or financial systems.
      RDB systems provide a very good compromise overall, sort of a least common denominator, but are no silver bullet. you characterize things like this as "regressions", which is true in the sense that the basic tech has been around for a long time, but the point is that the engineering tradeoffs change over time. it's not unreasonable to say that what tended to be a very good default tradeoff with the hardware, networking, and data sets common a decade ago might not be the best default tradeoff today.

      --

      i speak for myself and those who like what i say.
    7. Re:Ridiculous by Jimithing+DMB · · Score: 1

      It's the ORM layer that's the real pain in the arse (assuming you're using OOD, and assuming you actually want a direct mapping between your object model and relational model). Things like Hibernate and judicious use of code generation make it a lot easier, but you still need to know what's going on and you still need to (and can!) choose between navigating among objects (letting the ORM do the queries) and generating a hand-written query. To some extent an ORM (and the RDBMS vs OODBMS choice) is just a reflection of the different requirements of on-disk vs in-memory representations of objects. On-disk storage is all about efficient and flexible querying, retrieval, (distributed) concurrency, storage and management of huge data-sets, whereas in-memory storage is all about assigning behaviour and navigating relationships between smaller sets of objects whilst carrying out that behaviour.

      Well there's your problem. Hibernate. Hibernate basically just pulls tuples in to objects but doesn't really do a very good job of managing the object graph. My experience with it has been that users of Hibernate have to actually write the code that pulls in related objects. So your customers table can have a getInvoices() method but at some point you actually have to write how you want to retrieve the invoices given a customer. In the end you also seem to have this situation where you call save on a "root" object and it saves that object and any objects related to that object recursively. But.. that's not relational. That's hierarchical. FAIL.

      A more fully-featured design has you describe the tables and their relationships in some sort of a data file. It could be one giant XML file or a number of XML files (like one for each Entity/Table) or even something simpler like a plist file (just a serialized hierarchical key/value dictionary).

      Probably the best example of this is the Enterprise Objects Framework (EOF) component of WebObjects. Everything you do is done through an "editing context" that is somewhat akin to a database transaction. Typically you pull in your "root" objects by asking the editing context to do a query for you. From then on out you just ask your customer object for invoices (to-many) or for account manager (to-one) or whatever. The editing context records all of the changes you make relative to what was fetched from the database and when it comes time to save them you ask the EC to save and it figures it all out no matter how complex your object graph is.

      More advanced usage allows for the ability to fake attributes and even relationships giving you attributes that are actually sums or averages or whatever of some related data.

      Apple themselves took this concept, pared it down, and brought it back from Java to Objective-C as "Core Data". By pared down I mean it's hardwired to use a SQLite store vs. any old SQL database. Well there is an XML store as well but we won't get into that here.

      Other similar options include Apache Cayenne (Java) and Telerik ORM (.NET). Those I have not explored as fully as WebObjects but at the basic level they seem to be structured in much the same way. The one thing I haven't yet figured out with those is how you would go about doing some of the derived attribute stuff I did in a fairly large WO project. Basically what I had in WO was a mapping between three very different database servers that all stored somewhat related data. But there were some caveats like one database would store a compound key like 'XXXX','YYY' and another would store one column 'XXXX-YYY'. EOF, if you knew what you were doing, would handle this effortlessly and you could actually join across these without problem. For instance, in the table with the compound key you could define an attribute newStyleNumber with a read format of (column1 || '-' || column2) and it would handle it (mind the PostgreSQL string concatenation there). Obviously I'm now breaking the "pureness" of the ORM but the point was that you wouldn't actually then us
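
      The compound-key trick described above, exposing (column1 || '-' || column2) as a derived attribute and joining on it, can be sketched in plain SQL without any ORM. Everything here is an assumption for illustration: the table and column names are made up, a view stands in for EOF's derived attribute, and a single in-memory SQLite database stands in for the separate servers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "Old" store keeps the part number as a compound key ('XXXX', 'YYY').
    CREATE TABLE old_parts (prefix TEXT, suffix TEXT, description TEXT);
    -- "New" store keeps it as one column ('XXXX-YYY').
    CREATE TABLE new_orders (part_number TEXT, quantity INTEGER);

    INSERT INTO old_parts VALUES ('XXXX', 'YYY', 'widget'), ('AAAA', 'BBB', 'gadget');
    INSERT INTO new_orders VALUES ('XXXX-YYY', 3), ('AAAA-BBB', 7), ('ZZZZ-QQQ', 1);

    -- The view plays the role of the derived "newStyleNumber" attribute.
    CREATE VIEW old_parts_v AS
        SELECT prefix || '-' || suffix AS new_style_number, description
        FROM old_parts;
""")

rows = conn.execute("""
    SELECT p.description, o.quantity
    FROM old_parts_v AS p
    JOIN new_orders AS o ON o.part_number = p.new_style_number
    ORDER BY p.description
""").fetchall()

print(rows)
```

      The order for 'ZZZZ-QQQ' has no matching compound key, so the inner join simply leaves it out.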

    8. Re:Ridiculous by xelah · · Score: 1

      Well there's your problem. Hibernate. Hibernate basically just pulls tuples in to objects but doesn't really do a very good job of managing the object graph. My experience with it has been that users of Hibernate have to actually write the code that pulls in related objects. So your customers table can have a getInvoices() method but at some point you actually have to write how you want to retrieve the invoices given a customer.

      Hibernate will do that for you. You have to define the relationship, of course (or generate a definition), but Hibernate will give you a set object that gets initialized when you access it. You need to keep one eye on performance, of course - creating many such sets for every customer can be expensive (object creation is expensive in Java) and if you need only a subset of invoices you may be better generating a query or altering your definition.

      In the end you also seem to have this situation where you call save on a "root" object and it saves that object and any objects related to that object recursively. But.. that's not relational. That's hierarchical. FAIL.

      No, it doesn't, unless you've enabled some feature I'm not aware of. If you create a graph of newly constructed objects then you must call save() on all of them. If they aren't newly constructed you needn't call save at all, of course. And navigating your objects as a graph doesn't mean your data is stored in a hierarchical form rather than a relational one.

      A more fully-featured design has you describe the tables and their relationships in some sort of a data file. It could be one giant XML file or a number of XML files (like one for each Entity/Table) or even something simpler like a plist file (just a serialized hierarchical key/value dictionary).

      That's exactly what you do do with Hibernate - though, if you can be bothered writing a generator, it's also possible to generate these descriptions from your database's system catalogues instead of writing them. You're likely to need to tweak them, though - your relational and OO models are likely to be subtly different.

      Probably the best example of this is the Enterprise Objects Framework (EOF) component of WebObjects. Everything you do is done through an "editing context" that is somewhat akin to a database transaction. Typically you pull in your "root" objects by asking the editing context to do a query for you. From then on out you just ask your customer object for invoices (to-many) or for account manager (to-one) or whatever. The editing context records all of the changes you make relative to what was fetched from the database and when it comes time to save them you ask the EC to save and it figures it all out no matter how complex your object graph is.

      That's exactly what Hibernate does, too (but an 'editing context' is called a 'session').

      Sometimes you simply cannot change the underlying data you have and being able to actually use little SQL snippets while still letting the framework do the heavy lifting is a huge win.

      Using SQL snippets - especially big and complicated SQL snippets that do complex data manipulation - is an absolutely essential thing for your application to be able to do. You probably shouldn't be navigating object graphs for everything - especially reporting. There are also some occasions when you absolutely must know what's going to the DB (particularly for concurrency control) - you probably don't want to do an equivalent to UPDATE BankAccount SET balance=balance-200 WHERE balance > 200 [then check the number of rows modified] by initializing a BankAccount object, checking there is sufficient balance, updating the balance and then committing the session. It just won't behave properly when there's concurrent access. (You can explicitly lock the row FOR UPDATE first instead, of course, but you're going to significantly hurt your performance if you keep doing th
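
      The conditional UPDATE followed by a check of the number of rows modified can be sketched like this with Python's sqlite3 module. The bank_account table and the withdraw() helper are hypothetical, and the sketch uses >= rather than the > in the statement quoted above so that an exact-balance withdrawal is allowed.

```python
import sqlite3


def withdraw(conn, account_id, amount):
    """Debit `amount` only if the balance covers it; return True on success."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE bank_account SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",
            (amount, account_id, amount),
        )
    # The check and the debit happen in one statement, so the database
    # decides atomically; rowcount tells us whether the row qualified.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bank_account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.execute("INSERT INTO bank_account VALUES (1, 300)")
conn.commit()

first = withdraw(conn, 1, 200)   # 300 covers 200
second = withdraw(conn, 1, 200)  # only 100 left, so no row is updated
balance = conn.execute(
    "SELECT balance FROM bank_account WHERE id = 1"
).fetchone()[0]
print(first, second, balance)
```

      Loading the row into an object, checking the balance in application code and writing it back would need explicit locking to give the same guarantee.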

    9. Re:Ridiculous by shutdown+-p+now · · Score: 1

      Exactly how is that easier than some other ways, such as building an object database? Can't you just write a few lines of code that are far more expressive than any SQL ever could be in a language like Common Lisp, Smalltalk, Python, Ruby, etc?

      The problem I see with OODBMS is that there's no good commonly agreed definition of object between various OO paradigms - and OODB has to understand a lot of the object model to handle the more powerful queries. For example, should it support MI? If you mention Common Lisp and Python, then it probably should; the question, then, is how to map this to Ruby and Java. And so on.

  26. Is the automobile doomed? by Renegade+Iconoclast · · Score: 3, Insightful

    Turns out, there's something called a "skateboard." You can use it to travel as far as the Quickie Mart, with nothing but your feet to propel it.

    In conclusion, skateboards and automobiles aren't the same thing, so probably not.

  27. Re:Karma Whoring by Anonymous Coward · · Score: 0

    This is.

    What you need here is a quote from Monty Python. It always gets the poster a +5 here. (Yes, it's not Karma whoring due to funny mods not contributing but I would assume funny was what GP was after too). And XKCD strips are very common ways to get +5 too. As are bash.org quotes. And those are just +5 funnies. I could count a lot of ways to get nearly certain insightful mods on any Linux related article, etc...

    So get rid of your elitism. There are a lot of stupid mods here, too.

  28. Always doomed and never dead by MyMistake · · Score: 1

    I keep hoping that someday I'll read a Slashdot article headlined, "Is xxx doomed?" and the answer is... Yes!
    So far... no.
    Maybe we could see something like, "Is this the year that the Linux desktop is doomed?"

  29. Project Voldemort? by Talderas · · Score: 1

    Seriously? Get the Harry Potter out. Out now!

    --
    "Lack of speed can be overcome. In the worst case by patience." --Znork
    1. Re:Project Voldemort? by geekoid · · Score: 1

      Lavate las manos!

      --
      The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
    2. Re:Project Voldemort? by Anonymous Coward · · Score: 0

      And here I thought Project Voldemort was the new garbage collector for C++.
      Objects would create horcruxes and as long as there is at least one horcrux left the object wouldn't be collected.
      I would keep talking but I have a Sourceforge account to create.

  30. Is this article pointless? by Giant+Electronic+Bra · · Score: 0, Offtopic

    Uh, no, it is so pointless I didn't even have to read it to know it's pointless!

    Is the Earth really flat?

    LOLOLOL.

    --
    "Malo periculosam, libertatem quam quietam servitutem." -- Jefferson
  31. Is the flat file doomed? by Anonymous Coward · · Score: 1, Funny

    You should have seen how quickly flat file usage flatlined when relational databases came out. I mean *nobody* uses plain text files anymore. Can you imagine not having your crontabs in a SQL DB?

  32. Supid people who don't understand data by mlwmohawk · · Score: 4, Informative

    The relational database is not going anywhere and nothing in that article is based on any firm understanding of managing data.

    Is the notion of a "join" obsolete? No, but it is typically impractical in a high volume system. You would probably use denormalization as a strategy.

    Scaling many nodes? OK, you still gotta put your data "in" something.

    key/value indexing? yawn. select val from keyvalue_tab where key = foo;

    The value can be basically anything, and most "relational" databases have good object support as well as XML, JSON, etc.

    So we can establish that a SQL relational database can do *everything* a simpler system can do. Now, think about ALL the things you can do with your data in a real database.

    What is the point of using a limited and less functional system? A good system, like Oracle, DB2, PostgreSQL, etc (!mysql of course) will do what you need AND allow you to do more should you be successful.

    The problem with data is twofold: managing read/write/deletes and finding what you are looking for. These problems have been solved. A good database will do this for you. Want to store objects? XML, JSON, binary objects, or a specialized database extension works perfectly.

    1. Re:Supid people who don't understand data by sl0ppy · · Score: 5, Insightful

      The relational database is not going anywhere and nothing in that article is based on any firm understanding of managing data.

      no, the relational database is not going anywhere, you are correct. but, that does not mean that there aren't instances where a non-relational database, with the addition of map/reduce, is extremely useful.

      non-relational databases have been around for decades, and are in use for quite a number of applications involving rapid development and storage of very large records. couple this with map/reduce, and you have the ability to scale quickly with very large datasets.

      scaling quickly is a very difficult problem to solve with an RDBMS - you either need to continue to throw more hardware at the problem, to the point of diminishing returns, or re-architect your data at the cost of possible significant downtime, while still attempting to serve up the data in a timely manner. i've been deep in the bowels of oracle RAC, fighting to get just 5% more speed out of a query over a billion rows and realizing that i have to start over with a new schema, just to squeeze more data out. compare that to simply adding another machine and letting the map functionality run across one more cpu before returning it for the reduce.

      Is the notion of a "join" obsolete? No, but it is typically impractical in a high volume system. You would probably use denormalization as a strategy.

      once again, correct, but having to denormalize to a snowflake or a star isn't always the best solution. you're taking the best parts of the relational database model, and throwing them out - normalization, referential integrity, just to squeeze more out of something that may not be the best tool for the job.

      do you hammer with a wrench? i have before, and i managed to hurt my thumb.

    2. Re:Supid people who don't understand data by DragonWriter · · Score: 2, Insightful

      So we can establish that a SQL relational database can do *everything* a simpler system can do.

      In terms of expressive power, sure, but no one is arguing that distributed key/value stores are going to gain against RDBMS's because they have superior expressive power. What is being argued is that they will do so because they have superior scalability and distribution properties, and that in many real-world applications those are more important than having the full expressive power of relational algebra. Particularly as you get ones that can provide ACID guarantees, that becomes a compelling selling point in many applications where RDBMS's would otherwise be used simply because they are the only available tool, but where distributed key/value stores are a better tool.

    3. Re:Supid people who don't understand data by mlwmohawk · · Score: 1

      What is being argued is that they will do so because they have superior scalability and distribution properties, and that in many real-world applications those are more important than having the full expressive power of relational algebra.

      Please cite a valid argument that suggests that SQL lacks scalability or distribution. If you are willing to get rid of ACID like the other solutions, there are no limitations.

      Particularly as you get ones that can provide ACID guarantees, that becomes a compelling selling point in many applications where RDBMS's would otherwise be used simply because they are the only available tool, but where distributed key/value stores are a better tool.

      Please cite one example, just one, where a simple key/value data system is the "better" solution for a high volume site and a more powerful database like PostgreSQL wouldn't do a better job.

    4. Re:Supid people who don't understand data by mlwmohawk · · Score: 1, Interesting

      no, the relational database is not going anywhere, you are correct. but, that does not mean that there aren't instances where a non-relational database, with the addition of map/reduce, is extremely useful.

      A relational database is a lot like C++. It provides the tools for relationships, but that does not mean you have to use it that way or that it is sub-optimal if you don't.

      non-relational databases have been around for decades, and are in use for quite a number of applications involving rapid development and storage of very large records. couple this with map/reduce, and you have the ability to scale quickly with very large datasets.

      What is a large data set? I have a few PostgreSQL databases with some pretty HUGE data sets, and it blows the doors off anything else I could use.

      scaling quickly is a very difficult problem to solve with an RDBMS - you either need to continue to throw more hardware at the problem, to the point of diminishing returns, or re-architect your data at the cost of possible significant downtime, while still attempting to serve up the data in a timely manner.

      Without any details this sounds like an urban legend. If you designed your system as you would have with a lesser system like a simple "key/value" pair, how would a RDBMS be any different?

      i've been deep in the bowels of oracle RAC, fighting to get just 5% more speed out of a query over a billion rows and realizing that i have to start over with a new schema, just to squeeze more data out. compare that to simply adding another machine and letting the map functionality run across one more cpu before returning it for the reduce.

      Why do you need to start over with a new schema? Why not simply denormalize some of the tables? You can alter the table online.

      I don't mean to impugn your abilities, but if you can't get oracle to do what you need, you aren't going to get a lesser system to do better.

      once again, correct, but having to denormalize to a snowflake or a star isn't always the best solution. you're taking the best parts of the relational database model, and throwing them out - normalization, referential integrity, just to squeeze more out of something that may not be the best tool for the job.

      Normalization is a tool. Denormalization is also a tool. They are approaches to handling data. The RDBMS doesn't care. Why bitch about referential integrity if you aren't going to use it anyway?

      do you hammer with a wrench? i have before, and i managed to hurt my thumb.

      Poor analogy. Do you use the bottle opener screwdriver in your swiss army knife or do you use a real screwdriver?

    5. Re:Supid people who don't understand data by DragonWriter · · Score: 2, Insightful

      If you are willing to get rid of ACID like the other solutions, there are no limitations.

      The other solutions (see below) do not, in all cases, "get rid of ACID".

      Please cite one example, just one, where a simple key/value data system is the "better" solution for a high volume site and a more powerful database like PostgreSQL wouldn't do a better job.

      Scalaris, a distributed transactional key/value store that does not get rid of ACID, is one of the "other solutions" (and one that has been demonstrated, by replicating Wikipedia on a distributed cluster, to scale better, at least, than Wikipedia's existing MySQL platform).

    6. Re:Supid people who don't understand data by DiegoBravo · · Score: 1

      Sorry for the ignorance, but could you elaborate more about those map/reduce systems? Some product name? Why isn't hashing enough for those big queries?

    7. Re:Supid people who don't understand data by DougWebb · · Score: 3, Interesting

      Without any details this sounds like an urban legend. If you designed your system as you would have with a lesser system like a simple "key/value" pair, how would a RDBMS be any different?

      The difference is optimization vs generalization. Many problems can be handled using simple key/value pair relationships. You can model this in an RDBMS using two-column tables that you never join across, where all of your queries are SELECT val FROM tab WHERE key=? and INSERT INTO tab (key,val) VALUES (?,?). However, if you use the RDBMS this way, you're paying for the overhead of the SQL engine, (usually) a client/server connection, and your language's library for interacting with an RDBMS.
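      For concreteness, the two-column-table pattern might look something like this. It is a rough sketch using Python's built-in sqlite3 module; the `put`/`get` helpers and the sample keys are made up for illustration, and `INSERT OR REPLACE` is SQLite-specific syntax rather than the plain INSERT quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One two-column table that is never joined against anything.
conn.execute("CREATE TABLE tab (key TEXT PRIMARY KEY, val TEXT)")


def put(key, val):
    # Overwrite any existing value for this key (SQLite-specific upsert).
    conn.execute("INSERT OR REPLACE INTO tab (key, val) VALUES (?, ?)", (key, val))
    conn.commit()


def get(key):
    row = conn.execute("SELECT val FROM tab WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None


put("user:42", "alice")
put("user:42", "alice2")      # second write replaces the first
found = get("user:42")
missing = get("user:43")      # no such key
```

      Every call still goes through SQL parsing and the driver layer, which is the overhead being described.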

      The alternative is a non-relational database like BerkeleyDB, which is optimized for key/value pair operations. All the fetch and store operations do is fetch and store the value for a given key, with a minimum of overhead. BerkeleyDB is also an in-process database, where your application is accessing the database files directly using the BerkeleyDB library code. (The library handles locking so that multiple processes can use the database files at the same time.) Again, the overhead is kept to a minimum.
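      As a rough feel for the in-process style, here is a sketch using Python's standard dbm module. This is a stand-in, not BerkeleyDB itself: which on-disk backend dbm picks depends on the platform, and the path and keys below are hypothetical.

```python
import dbm
import os
import tempfile

# Hypothetical location for the database file(s).
path = os.path.join(tempfile.mkdtemp(), "store")

# "c" opens for reading and writing, creating the database if needed.
# The library code runs inside this process; there is no server to talk to.
with dbm.open(path, "c") as db:
    db[b"user:42"] = b"alice"       # store a value under a key
    fetched = db[b"user:42"]        # fetch it back
    missing = db.get(b"user:43")    # absent key: get() returns None
```

      There is no query language here at all; store and fetch by key are essentially the whole interface.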

      BerkeleyDB is much less flexible than an RDBMS, but for the problem domains where that flexibility is not needed, BerkeleyDB is much more efficient. I've easily achieved over 6000 read/write transactions per second on modest hardware in a single-threaded process; a multi-threaded and/or multi-process application can achieve much higher rates. Compare that to a typical Oracle database connection, where you're lucky to get as many as a few hundred transactions per second, just because of the network round-trip.

    8. Re:Supid people who don't understand data by DougWebb · · Score: 4, Informative

      Map/Reduce was developed at Google. It's a bit tough to wrap your head around at first, and once you get it you wonder what the big deal is, until you realize how suitable it is for Google's datacenters.

      Basically, you take a dataset (a bunch of key/value pairs) and a mapping function, and you run the mapping function over every item in the dataset. This gives you an intermediate dataset with different keys and values. You then run that through a reducing function, which produces your final dataset. This can be a single result, or a dataset that can then be processed with a different map/reduce pair of functions.

      The big deal for Google is that many of their problems can be expressed in terms of map and reduce functions that can operate in parallel over their datasets, and that their datacenters can handle absolutely enormous quantities of parallel operations. So, for the mapping operation, they take the original dataset and mapping function, subdivide the dataset over thousands of servers, and let them run the mapping function in parallel. When these servers return their results, it's common for many different servers to return the same or related keys in the intermediate set. These are collated, so that when the intermediate dataset is distributed with the reduce function, all of the values with the same keys go to the same servers. This helps the reduce function to be run in parallel; it's often counting the number of original items that were assigned to the same key in the intermediate set.
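      A single-machine toy version of the map, collate, reduce pipeline described above might look like this. It is only a sketch: the function names and the word-count task are illustrative choices, and nothing here is distributed, whereas in a real deployment each phase would be spread across many servers.

```python
from collections import defaultdict


def map_fn(key, value):
    # key: document name, value: document text. Emit one (word, 1) per word.
    for word in value.split():
        yield word.lower(), 1


def reduce_fn(key, values):
    # All values that share a key arrive together; here we just count them.
    return key, sum(values)


def map_reduce(dataset, map_fn, reduce_fn):
    # Map phase: run the mapping function over every item in the dataset.
    intermediate = []
    for key, value in dataset.items():
        intermediate.extend(map_fn(key, value))

    # Collation: group intermediate values by their new key.
    groups = defaultdict(list)
    for k, v in intermediate:
        groups[k].append(v)

    # Reduce phase: one call per intermediate key.
    return dict(reduce_fn(k, vs) for k, vs in groups.items())


docs = {"a.txt": "the cat sat", "b.txt": "The cat ran"}
counts = map_reduce(docs, map_fn, reduce_fn)
```

      Because each map call and each reduce call only looks at its own inputs, the loops can be farmed out to separate machines without changing the functions.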

    9. Re:Supid people who don't understand data by mlwmohawk · · Score: 1

      The difference is optimization vs generalization. Many problems can be handled using simple key/value pair relationships. You can model this in an RDBMS using two-column tables that you never join across, where all of your queries are SELECT val FROM tab WHERE key=? and INSERT INTO tab (key,val) VALUES (?,?). However, if you use the RDBMS this way, you're paying for the overhead of the SQL engine, (usually) a client/server connection, and your language's library for interacting with an RDBMS.

      This assumption is one of those things that needs a LOT of discussion to explain why it is wrong. There is not enough space to have a real technological argument.

      Suffice to say that the client/server relationship between the application and server actually helps your application.

      The "overhead" of the SQL engine can be minimized to almost negligible by using stored procedures.

      As for the interface library, no matter what data storage system you use you will always have an interface library.

      The alternative is a non-relational database like BerkeleyDB [wikipedia.org], which is optimized for key/value pair operations. All the fetch and store operations do is fetch and store the value for a given key, with a minimum of overhead. BerkeleyDB is also an in-process database, where your application is accessing the database files directly using the BerkeleyDB library code. (The library handles locking so that multiple processes can use the database files at the same time.) Again, the overhead is kept to a minimum.

      You say a couple things that don't make sense. "BerkeleyDB is an in-process database" and "library handles locking so that multiple processes can use the database files at the same time." You are saying there is a mechanism being used to synchronize access across multiple processes? An MVCC database blows the doors off a locking system.

      BerkeleyDB is much less flexible than an RDBMS, but for the problem domains where that flexibility is not needed, BerkeleyDB is much more efficient.

      The "problem domain" for BerkeleyDB is far far more limited than people think.

      I've easily achieved over 6000 read/write transactions per second on modest hardware in a single-threaded process; a multi-threaded and/or multi-process application can achieve much higher rates. Compare that to a typical Oracle database connection, where you're lucky to get as many as a few hundred transactions per second, just because of the network round-trip.

      Well, the network "round trip" isn't even possible with BerkeleyDB so it is not an issue. However, I would compare PostgreSQL and BerkeleyDB in a simple key/value database shootout with 50 or more reader/writer processes on the same box and see which scales better. I have a hint, BerkeleyDB would spend all its time managing locks.

      where you're lucky to get as many as a few hundred transactions per second

      Are you insane? A few hundred a second? Yea, back in 1999 on a 10BaseT ethernet connection with IDE hard disks and PIIIs.

    10. Re:Supid people who don't understand data by DiegoBravo · · Score: 1

      Thanks for the response. As you point out, Google is a perfect user for this, but I couldn't find much overlap with the RDBMS niche.

      From your link: 'MapReduce is useful in a wide range of applications, including: "distributed grep, distributed sort, web link-graph reversal, term-vector per host, web access log stats, inverted index construction, document clustering, machine learning, statistical machine translation..." '

      To me it looks more like a useful extension to RDBMS, for example, for optimizing big sorts in certain heavy offline processes.

      regards

    11. Re:Supid people who don't understand data by Matt+Perry · · Score: 2, Insightful

      do you hammer with a wrench? i have before, and i managed to hurt my thumb.

      Not usually, but I have done so before. If it hurts your thumb, you're holding it wrong.

      --
      Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.
    12. Re:Supid people who don't understand data by drinkypoo · · Score: 1

      do you hammer with a wrench? i have before, and i managed to hurt my thumb.

      You can hurt your thumb with a hammer, too. Professionals either tool up (power hammer - or in this case, just use a hammer) or get over it (get more accurate.) Nerds find a craftier way to not get hurt - with a magnet on a stick. Try harder, grasshopper.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    13. Re:Supid people who don't understand data by oliderid · · Score: 1

      Is the notion of a "join" obsolete? No, but it is typically impractical in a high volume system. You would probably use denormalization as a strategy.

      once again, correct, but having to denormalize to a snowflake or a star isn't always the best solution. you're taking the best parts of the relational database model, and throwing them out - normalization, referential integrity, just to squeeze more out of something that may not be the best tool for the job.

      I'm currently developing a DB which is supposed to handle around 80 requests per sec, and I make intensive use of joins (MySQL) to join 3 tables (around 40,000 entries each). I would be quite interested to read about the side effects you both are talking about.

    14. Re:Supid people who don't understand data by Anonymous Coward · · Score: 0

      Supid people, don't spell stupid properly.

    15. Re:Supid people who don't understand data by Bryan+Ischo · · Score: 1

      You sound very wedded to relational database technology, to the point where you are impugning other technologies based on preconceptions and strawmen. It isn't very hard to understand that BerkeleyDB is a simple persistent hashtable (with some support for automatically updating indexes with newly hashed values). I am not a database expert at all but I can't imagine how you could store name/value pairs with less overhead. Now if the data storage semantics you require are no more sophisticated than key/value pair, or if the logic to wrap key/value pairs sufficiently to work with your application is sufficiently simple, then I cannot even imagine that anything would be more streamlined than BerkeleyDB for this purpose.

      Note that I am not saying that every type of data storage problem is best addressed by BerkeleyDB. I'm just talking about simple key/value pairs. If you said that most applications requiring a database need more sophisticated data access mechanisms than key/value pair, I wouldn't argue. But to argue that within this limited domain, something as straightforward as BerkeleyDB wouldn't be the most efficient solution, must come simply from an unwillingness to divorce yourself from relational database technology.

      And with regards to locks, you do realize that BerkeleyDB is developed by a professional software company that seems to know pretty well what they are doing? Do you really think that with data access mechanisms as simple as a persisted hashtable, they can't implement very efficient locking? Do you really think that in this very simple type of database, it's not possible to achieve fast operation by 50 or more concurrent readers/writers?

      As a final point, I am pretty sure that MySQL was originally developed as a layer on top of BerkeleyDB. People were in love with MySQL because it was so fast. And it was a wrapper around BerkeleyDB. Doesn't that indicate that BerkeleyDB itself used without a wrapping layer might even be faster? Once again note, that this is only for some kinds of applications. I am not claiming that every huge database application could use BerkeleyDB sanely.

    16. Re:Supid people who don't understand data by mlwmohawk · · Score: 1

      You sound very wedded to relational database technology, to the point where you are impugning other technologies based on preconceptions and strawmen.

      I've been in the industry for over 25 years, I've learned a thing or two about databases both theoretically and practically. I made no strawmen and I evaluate everything with an open mind. An "open mind" does not need to be an empty one.

      It isn't very hard to understand that BerkeleyDB is a simple persistent hashtable (with some support for automatically updating indexes with newly hashed values).

      There is no lack of understanding.

      I am not a database expert at all but I can't imagine how you could store name/value pairs with less overhead.

      Well, as a database expert, I assure you that "overhead" in databases is there for a reason. A "simple" system does not have the ability or the functionality to manage scalability correctly. Any dumb system can have one process and store and retrieve and not have an issue.

      A more complex system embodies a lot of knowledge about data and access. Techniques like ordered writes, MVCC, lazy writes, and others are far more complex and provide much higher throughput than a simpler system.

      But to argue that within this limited domain, something as straightforward as BerkeleyDB wouldn't be the most efficient solution, must come simply from an unwillingness to divorce yourself from relational database technology.

      You are assuming because I disagree with you, that I am inflexible or something. That's your mistake.

      Now, if you have more than one process accessing your BerkeleyDB database, say a web site, I'll bet you PostgreSQL will outperform BerkeleyDB under load with 50 processes. The performance curve of BerkeleyDB under load is bad. PostgreSQL, on the other hand, stays fairly flat.

      And with regards to locks, you do realize that BerkeleyDB is developed by a professional software company that seems to know pretty well what they are doing?

      This is not an argument. It has no place in this discussion.

      Do you really think that with data access mechanisms as simple as a persisted hashtable, they can't implement very efficient locking?

      There is no such thing as efficient locking.

      Do you really think that in this very simple type of database, it's not possible to achieve fast operation by 50 or more concurrent readers/writers?

      I have written many high speed systems with a high degree of concurrency, and it is possible to get "high performance," but locking is ALWAYS a performance hit. ALWAYS. Think of it this way, all locking mechanisms rely on the OS process scheduling to synchronize processes to a single resource.

      As a final point, I am pretty sure that MySQL was originally developed as a layer on top of BerkeleyDB. People were in love with MySQL because it was so fast. And it was a wrapper around BerkeleyDB. Doesn't that indicate that BerkeleyDB itself used without a wrapping layer might even be faster? Once again note, that this is only for some kinds of applications. I am not claiming that every huge database application could use BerkeleyDB sanely.

      No.

    17. Re:Supid people who don't understand data by Anonymous Coward · · Score: 0

      Jeeze, you people just don't understand. He obviously knows everything there is to know about databases. It's all right there in his php webapp. What's this middle-ware you all keep talking about? 4tb peoplesoft database? That's just crazy talk.

    18. Re:Supid people who don't understand data by mlwmohawk · · Score: 1

      In what way is Scalaris anything like a database? It doesn't even store data on disk.

    19. Re:Supid people who don't understand data by sl0ppy · · Score: 1

      google for star schema - or snowflake schema, or denormalizing data.

      but if you have so much data that 80 queries/second is a strain, mysql 3 may not be the best tool for the job.

      modern rdbms's rely on several nice tricks to make things a little bit easier, ranging from partitioning of data to join collapses, to materialized views, and partial indexes. none of which you will get from mysql 3.

    20. Re:Supid people who don't understand data by Tablizer · · Score: 1

      Is it possible that this is really a "labor" issue and not really a performance issue? High-end RDBMS can often be tuned to optimize performance for specific access patterns. However, such may require a seasoned and expensive expert. It may be that these narrow-purpose databases have a lower learning curve as long as your usage fits what it was designed for. They are fast out-of-the-box as long as you stick to its comfort zone.

    21. Re:Supid people who don't understand data by Anonymous Coward · · Score: 0

      oodbms is the futurrre! its teh sql !!

    22. Re:Supid people who don't understand data by DragonWriter · · Score: 1

      In what way is Scalaris anything like a database?

      It's a persistent, transactional data store.

      It doesn't even store data on disk.

      It achieves persistence by using a cluster of nodes storing data in memory and being tolerant of nodes dropping out, rather than storing on disk. This is, admittedly, not the usual means by which databases achieve persistent data storage, but so what? It provides data storage and retrieval and is transactional and persistent. A database is defined by function, not by how the function is achieved. It is, pretty clearly, a database, albeit a non-relational one.

    23. Re:Supid people who don't understand data by mlwmohawk · · Score: 1

      It's a persistent, transactional data store.

      Persistent in what way?

      It achieves persistence by using a cluster of nodes storing data in memory and being tolerant of nodes dropping out, rather than storing on disk. This is, admittedly, not the usual means by which databases achieve persistent data storage, but so what? It provides data storage and retrieval and is transactional and persistent. A database is defined by function, not by how the function is achieved. It is, pretty clearly, a database, albeit a non-relational one.

      Arguments just to win arguments are useless. It is not a database. It may have some redundancy, but it does not have persistence.

    24. Re:Supid people who don't understand data by DragonWriter · · Score: 1

      Persistent in what way?

      Persistent in that it is designed to be durable over time, not transient. It relies on cluster architecture rather than storage that is itself persistent at any given node to achieve this, which is a pretty significant difference from the point of view of, say, a system administrator setting up a system relying on it as a data store, but irrelevant from the perspective of its use by applications.

      Its function is the same as any other database; it differs in the means by which it achieves it. Saying it's not a database because it doesn't use a disk is like saying that an automobile isn't a mode of land transportation because it doesn't use a horse.

  33. Not by dedazo · · Score: 1

    As an architect I tend to see databases as fancy storage systems, and in general they annoy me. I love object databases and distributed key-value pair store mechanisms. But even as jaded as I am I can see that the RDBMS isn't going anywhere any time soon. There are a lot of things that alternative storage systems simply cannot do well.

    As for the Google argument (i.e., "but Google does it and it works"), I've heard it a few times in meetings where a bright-eyed company executive is trying to make a case for their use. My response is usually "yeah, all you need now is to hire people like the ones that work at Google", at which point the argument is usually dropped. Making applications work with storage systems like that takes an engineering mindset that's simply different than the talent at the average Fortune 500. RDBMS are pretty good at masking crappy development practices.

    --
    Web2.0: I love when people Flickr my cuil and digg my boingboing until my google is reddit and I start to yahoo
  34. Ignorant doomspeaking. by Jonas+Buyl · · Score: 1

    There are plenty of solutions yet to be explored to handle that problem without having to dump the relational database model. One I can think of right now might be to view a server park as a hard drive, view a table like a file, and apply all that filesystem technology to a problem that arises today. We can for example apply the UNIX inode filesystem design so you can efficiently store a database with great scalability.

  35. Some credibility... by jernejk · · Score: 3, Insightful

    From the article: "For example, a relatively simple SELECT statement could have hundreds of potential query execution paths, which the optimizer would evaluate at run time. All of this is hidden to us as users, but under the cover, RDBMS determines the "execution plan" that best answers our requests by using things like cost-based algorithms." So, you have no idea how optimizers work and how you can access tuning information, and you'd like to tell us RDBMSs are bad? Get off my lawn! (yay, I'm getting old)

    1. Re:Some credibility... by jernejk · · Score: 2, Funny

      no, really.. this is utter crap: so called benefit: "The first benefit is that they are simple and thus scale much better than today's relational databases. If you are putting together a system in-house and intend to throw dozens or hundreds of servers behind your data store to cope with what you expect will be a massive demand in scale, then consider a key/value store." but then: "Bugs in a properly designed relational database usually don't lead to data integrity issues; bugs in a key/value database, however, quite easily lead to data integrity issues." and then it just goes on about how RDBMSs are really cool... oh, I got it, it was really written by Oracle's reverse marketing department!

  36. He can't even explain relations correctly... by iamhigh · · Score: 3, Informative

    Does that example of a relational DB have a serious error, or is that just me? Why have the make key in two tables?

    He lost cred right then.

    --
    No comprende? Let me type that a little slower for you...
    1. Re:He can't even explain relations correctly... by reginaldo · · Score: 1

      Not really an error. If make is an attribute of model, then landing it in the model table is fine, the grain is still at model level. If model is an attribute of car, then landing make in the car table is fine as well for the same reason.

      Basically it negates the need for a multi-table join to get a make for a car.

    2. Re:He can't even explain relations correctly... by iamhigh · · Score: 1

      True, you *could* put it in either and it would make sense, but not both. If it's in the makemodel table, you don't need it in the car table. The point of a relational db is to normalize data, removing duplicate data, right? If the Plymouth Neon changes to a Chrysler (or if it was something that would change more often) you then have to change it in 2 tables. If one gets updated incorrectly, you have conflicting data, thus rendering everything screwed up.
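
      A minimal sketch of that failure mode, using Python's built-in sqlite3 module. The makemodel and car tables and their columns are made up for illustration and are not TFA's actual schema; the only point is that two copies of make can drift apart after a partial update.

```python
import sqlite3

# Hypothetical schema: "make" is stored in BOTH tables, as in the complaint above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE makemodel (model TEXT PRIMARY KEY, make TEXT NOT NULL);
    CREATE TABLE car (id INTEGER PRIMARY KEY, model TEXT NOT NULL, make TEXT NOT NULL);
    INSERT INTO makemodel VALUES ('Neon', 'Plymouth');
    INSERT INTO car VALUES (1, 'Neon', 'Plymouth'), (2, 'Neon', 'Plymouth');
""")

# The Neon moves to Chrysler, but only one of the two tables gets updated.
conn.execute("UPDATE makemodel SET make = 'Chrysler' WHERE model = 'Neon'")

# Find rows where the two copies of "make" now disagree.
conflicts = conn.execute("""
    SELECT car.id, car.make, makemodel.make
    FROM car JOIN makemodel ON car.model = makemodel.model
    WHERE car.make <> makemodel.make
    ORDER BY car.id
""").fetchall()
print(conflicts)
```

      Every car row now contradicts makemodel, and nothing in the schema says which copy is right.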

      --
      No comprende? Let me type that a little slower for you...
    3. Re:He can't even explain relations correctly... by reginaldo · · Score: 1

      Lol. I think we are differing because you are thinking OLTP and I am thinking OLAP. You want the cars table to always reflect what is current, and I never want to throw away (i.e. update) data ever. In OLAP the proper approach is to never update the make or model key in the cars table. You want to retain all historical information by using slowly changing date-driven attribute tables. From an OLAP perspective, TFA has a decent data model.
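
      A rough sketch of what a date-driven attribute table could look like, again with Python's sqlite3. The table name model_make_history, its columns, and the dates are all invented for illustration (they are not real product history and not TFA's model); the idea is only that a change adds a row instead of overwriting one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE model_make_history (
        model      TEXT NOT NULL,
        make       TEXT NOT NULL,
        valid_from TEXT NOT NULL,   -- ISO dates compare correctly as text
        valid_to   TEXT             -- NULL means "still current"
    );
    -- Made-up dates: the old row is closed off, never updated in place.
    INSERT INTO model_make_history VALUES
        ('Neon', 'Plymouth', '1995-01-01', '2001-01-01'),
        ('Neon', 'Chrysler', '2001-01-01', NULL);
""")

def make_as_of(model, day):
    """Return the make that applied to `model` on ISO date `day`, or None."""
    row = conn.execute(
        """
        SELECT make FROM model_make_history
        WHERE model = ?
          AND valid_from <= ?
          AND (valid_to IS NULL OR ? < valid_to)
        """,
        (model, day, day),
    ).fetchone()
    return row[0] if row else None

print(make_as_of("Neon", "1998-06-01"))
print(make_as_of("Neon", "2005-06-01"))
```

      Both answers stay available, which is the whole point of never updating in place.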

    4. Re:He can't even explain relations correctly... by deraj123 · · Score: 1

      The point of a relational db is to normalize data, removing duplicate data, right?

      (or if it was something that would change more often)

      I think you just answered your own question there. There are times to denormalize things. Given that most cars don't change their make after they are created, this could possibly be a worthwhile performance based decision. That being said, having that particular model directly after defining Normalization...seems a bit off...

    5. Re:He can't even explain relations correctly... by Anonymous Coward · · Score: 0

      He is also missing the Model table.

  37. Not buying it. by reginaldo · · Score: 5, Interesting

    In theory, I agree the most costly actions in a database are joins. It seems like the key/value model is a great solution to this, on the surface. However, what the key/value model does is push the cost to the application layer. Instead of ensuring relational integrity and conformity in the database, suddenly all app code has to do this on the frontend. Also, instead of managing this process in a single place, suddenly this process is distributed among multiple methods. Sure, the DB is more scalable, but suddenly the app is a mess.
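
    To make the integrity point concrete, here is a small sketch in Python. The customer/orders tables, the "customer:1" style keys, and the dangling_orders helper are all hypothetical; a plain dict stands in for a key/value store, which is a simplification and not how any particular product behaves.

```python
import sqlite3

# Relational side: the database itself refuses an order for a missing customer.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id)
    );
    INSERT INTO customer VALUES (1, 'alice');
""")

try:
    conn.execute("INSERT INTO orders VALUES (10, 99)")  # there is no customer 99
    db_rejected = False
except sqlite3.IntegrityError:
    db_rejected = True

# Key/value side: the store accepts anything, so the check lives in app code.
kv = {"customer:1": {"name": "alice"}}
kv["order:10"] = {"customer": "customer:99"}  # accepted silently

def dangling_orders(store):
    """App-level integrity check that every writer has to remember to run."""
    return [key for key, value in store.items()
            if key.startswith("order:") and value["customer"] not in store]

print(db_rejected, dangling_orders(kv))
```

    The check still exists in the key/value version; it has just moved into every piece of code that writes an order.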

    1. Re:Not buying it. by anothy · · Score: 1

      it's even worse than that. joins are the most costly part of DBs, but in more ways than people usually mean when they say that. sure, they're the most costly in terms of cpu time and disk access time on the DB server, but they're also often the most costly in terms of application design, testing, maintenance, and so on.
      the two common ways of removing joins from the picture are views or, as you note, moving them to the application. views can, depending on implementation, eliminate the cpu cost, and in conjunction with other tuning procedures, the disk cost, but do nothing about the other costs. moving the joins into the application saves cpu time on the DB server, but does nothing for disk access time (and makes some of the other tuning methods harder), and has no impact on the other costs (and yes, it'll make your application more of a mess).

      --

      i speak for myself and those who like what i say.
  38. Here's a match.. by Slicker · · Score: 3, Interesting

    Relational databases need to die. I loved them and preached the goodness of them 10 years ago, but they are just too rigid for contemporary needs. I've learned better ways of organizing and filtering data.. but the old RDBMS school is too canonical (stubborn) and self-indulging to realize that needs are changing and their model doesn't fit.

    We need efficient attribute/value models. We need to stop referencing data by where it is and start referencing it by what it is. There is too much data that needs to exist in different views, based on policy--not explicit placement.

    Dumb-tags (attributes without values) like those used with Delicious bookmarks are also broken. They are too vague.

    My own approach is that every attribute may have any number of value instances. Each value instance may, in turn, have sub-attributes. So you can look up data based on its characteristics even with disregard for its name. For example: /mycompany/mailserver1/ip of zone = infirewall

    This returns all IP addresses under the "zone" attribute while also under the mailserver1 attribute that is under the mycompany attribute.

    When validating instances of the "ip" attribute, it looks backward in the path because it is extremely quick that way.

    The data server's sole responsibility is storing and retrieving information (not just data) in context (aka filtering).

    Sorting is the responsibility of the client. This makes sense because there are an infinite number of algorithms one could have for sorting data (e.g. alphabetic mixed case, ASCII order, etc). To facilitate this, I wrote a method to return the number of values that would be returned if the values were requested. If that is too big a bite for the client, it can re-request the size of a smaller chunk, segmented according to the client's ordering method. This is useful for scale, in any case. Processing in chunks makes sense whether over a network of limited capacity or directly from disk with limited memory.

    And--this is a columnar approach like Google's BigTable is.. That means you get 10+ times faster read performance.

    Matthew

    1. Re:Here's a match.. by lgw · · Score: 2, Insightful

      What do you mean by "information, not just data"? It seems like you have specific, personal definitions of those words that others might not share.

      If you make sorting the responsibility of the client, what do you do with large result sets? You can't sort chunked data client-side, as you have to sort before chunking. There should be *some* answer for result sets that don't fit in memory (client or server). I'd be happy with only being able to get results in a certain order if I've already built an index according to that ordering criterion, or something equally elaborate, but what's an index in your scheme?

      --
      Socialism: a lie told by totalitarians and believed by fools.
    2. Re:Here's a match.. by DarkOx · · Score: 4, Informative

      Wow, um where to begin really....

      So you realize that the structure you are suggesting can be easily built in a traditional RDB, using a star-schema or cluster design right?

      Next you suggest doing the sorting on the client, and then say that if there is more data than a client can handle the server can be asked to send chunks according to the client's sort order. That means the server has to have all the sort logic the client has and probably in all but the most trivial applications do all the sorting anyway... Seems to me a star schema and indexing the fact table on the attributes that are most commonly going to be used for sorting makes much more sense; because as I said the server is going to be sorting anyway.

      Now there are data sets for which non-relational structures do make more sense, but we have hierarchical and navigational designs for those; yours is not one of them.

      --
      Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
    3. Re:Here's a match.. by segedunum · · Score: 1

      So you realize that the structure you are suggesting can be easily built in a traditional RDB, using a star-schema or cluster design right?

      I think the point is that you have to put those things on top of a RDB and hammer it into shape. Yes, it's doable, but the basic design just doesn't really reflect what needs to be done.

    4. Re:Here's a match.. by spaceman375 · · Score: 1

      The database you describe used to be called Pick. I worked full-time on Pick from 1986 to 1990. It was written in the late '60s by the army and was WAY fantastically advanced for its time. I've seen an office with 12 people, 4 printers, and a streaming tape backup all running simultaneously on a 286 with fine performance and response times. Their "multidimensional" database was amazingly flexible.
      I really love this bit: Since computers weren't so great in 1969, they wrote a kernel that just emulates a better computer, then they wrote the actual OS on that virtual machine. When RISC became popular in the '80s, the virtual processor in the Pick machine was so close to the real RISC chips that Pick was the first OS ported to the new IBM RISC servers, even before any IBM OS's like AIX.
      I Love PICK. There are many variations these days (PICK, UniData, Universe, Ultimate, PICK OA, R83, R9, Advanced PICK, D3, MvEnterprise, Prime Information, Revelation, Mentor, jBase, Sequoia), many of which started in the '70s. And Yes, I think Pick-like databases can run rings around any other db. They aren't just a db that runs on some OS; Pick IS the OS (tho it can be a guest too). IBM actually came to Pick first for an OS for the original PC, but its requirements were too much for the hardware, so Bill Gates won out. The owner of Pick actually laughed at the IBM guys; I'll bet he choked on that memory quite a few times.
      Check out Pick - it really is what you describe and a LOT more.

      --
      On the one hand you take life too seriously, and on the other, you do not take playful existence seriously enough. Seth
    5. Re:Here's a match.. by molarmass192 · · Score: 1

      LDAP and more specifically Berkeley DB have been doing things like that for a very very long time. The reality is that that model doesn't scale well. As archaic as RDBMSes are, they are built to scale and be generic in how they store data. I've used BigTable via GoogleApps and its limitations as compared to an RDBMS are readily apparent when you want to share common data between objects.

      --

      Good people do not need laws to tell them to act responsibly, while bad people will find a way around the laws-Plato
    6. Re:Here's a match.. by jadavis · · Score: 3, Insightful

      We need to stop referencing data by where it is and start referencing it by what it is.

      You say that without any explanation of your apparent position that the relational model requires you to reference data by "where it is".

      You seem to think that the semantics of your system are somehow richer -- providing "information" rather than "data".

      Do you even know what a relation is?

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    7. Re:Here's a match.. by Anonymous Coward · · Score: 1, Informative

      Need to die? Really. I'm not starting a pissing contest but I work on a 300GB database with single tables containing 700,000,000+ rows. And in the right hands, this stuff flies.

      And I think you go on to invent hierarchical databases. From personal experience, 1993 is calling. (And your indexing may get complicated. Wonder if any other systems have had to address this?)

      Sorting on the client? Comedy gold. There are lots of ways to sort but the big-ass server can usually cope. You're not suggesting something that might involve sorting a few million rows across the network (in chunks) just to return the first 100 or so? That would be stupid. If you're serious I'll sell you lots of hardware and bandwidth that's completely unnecessary.
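
      A small sketch of the difference, using Python's sqlite3 as a stand-in for the big server. The events table, its score column, and the row count are made up; "crossing the network" is simulated by counting how many rows each approach has to fetch.

```python
import random
import sqlite3

random.seed(0)  # fixed seed so repeated runs see the same data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, score INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO events (score) VALUES (?)",
    [(random.randrange(1_000_000),) for _ in range(20_000)],
)
conn.execute("CREATE INDEX events_score ON events (score)")

# Server-side sort: only the 100 rows that were asked for come back.
top_server = conn.execute(
    "SELECT id, score FROM events ORDER BY score DESC, id LIMIT 100"
).fetchall()

# Client-side sort: every row has to be shipped before the client can order them.
all_rows = conn.execute("SELECT id, score FROM events").fetchall()
top_client = sorted(all_rows, key=lambda row: (-row[1], row[0]))[:100]

print(len(top_server), len(all_rows), top_server == top_client)
```

      Same answer either way, but one version moves 100 rows and the other moves the whole table.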

      RDBMSes work. Other stuff will work too. Doesn't mean RDBMSes need to die.

      As for faster read performance, it'll depend on what you're doing. But I can pull numbers out of my arse/ass too.

    8. Re:Here's a match.. by Anonymous Coward · · Score: 0

      My own approach is that every attribute may have any number of value instances. Each value instance may, in turn, have sub-attributes. [...]

      Dude - whilst you're creating an application to hold data in this new unique way some of us have used an existing, reliable database to solve the actual problem and gone home...

    9. Re:Here's a match.. by cswiger · · Score: 1

      Gah. I remember dealing with PICK also; it's used in a bunch of airline reservation and courier/package delivery systems. You deal with this beast in something resembling BASIC, rather than in SQL, and you'll quickly discover that all it supports is unary and binary operations: no operator precedence, no parentheses, and not even compound statements. No equivalent of JOIN, no notion of atomic commits, no transaction model, logging, or rollback, etc.

      Think of Berkeley DB 1.x, or early versions of MySQL which only had the MyISAM storage type, and then remove the C or SQL-based API and replace it with a crippled BASIC variant instead, and you've got something pretty close to PICK. Basically, all you get is a filesystem with data kept in hash tables, maybe with a B-tree index added for newer versions of PICK. To do anything beyond the trivial with it, you end up doing all of the heavy lifting on the client side.

      Anybody remember DeMorgan's laws? Well, if you want to use PICK you'd better, because to do:

      RESULT = NOT (A or B)

      You end up having to do three separate statements:

      C = NOT A
      D = NOT B
      RESULT = C AND D
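
      For anyone who doesn't remember them, a quick truth-table check in Python (not PICK BASIC) that the three-statement version really is equivalent to the one-liner. The function names direct and stepwise are just labels for this sketch.

```python
from itertools import product

def direct(a, b):
    # What you would like to write: RESULT = NOT (A or B)
    return not (a or b)

def stepwise(a, b):
    # What you end up writing, one unary/binary operation per statement.
    c = not a
    d = not b
    return c and d

# Enumerate all four combinations of A and B.
rows = [(a, b, direct(a, b), stepwise(a, b))
        for a, b in product([False, True], repeat=2)]
for row in rows:
    print(row)
```

      The last two columns match on every row, which is De Morgan's law for NOT (A or B).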

      --
      "The human race's favorite method for being in control of the facts is to ignore them." -Celia Green
  39. Re:Top 25 Reasons the Relational Database is Doome by Anonymous Coward · · Score: 0

    This is slashdot, it was on digg a week ago and reddit a week before that.

  40. Not the only one thinking this is silly... by micromuncher · · Score: 1

    ...or at least an attempt at bad advertising or persuasive writing (cognitive justification.)

    OODBMS have been pushing this, and many of them are pushed as light weight key-value stores.

    http://en.wikipedia.org/wiki/ODBMS

    This isn't new, like OpenDoc's Bento
    http://en.wikipedia.org/wiki/OpenDoc

    That became IronDoc
    http://linuxfinances.info/info/oodbms.html

    The problem with any of it is that relational databases rule the enterprise space. You cannot get away from them, and they are far from dead, because you will always have business people wanting to do ad hoc reports, and those are best done against denormalized models (where object stores tend to get super normalized which is just bad for reporting because cross table joins are the most expensive thing you can do in any database.)

    Yay.

    --
    /\/\icro/\/\uncher
    1. Re:Not the only one thinking this is silly... by DragonWriter · · Score: 1

      The problem with any of it is that relational databases rule the enterprise space. You cannot get away from them, and they are far from dead, because you will always have business people wanting to do ad hoc reports, and those are best done against denormalized models (where object stores tend to get super normalized which is just bad for reporting because cross table joins are the most expensive thing you can do in any database.)

      Huh? Object databases tend to be extremely denormalized in the relational sense; the whole point of them is to have a closer mapping to the application domain than a normalized relational representation would provide. Furthermore, regular, anticipated reports are often best done against denormalized models, since you know exactly how the report is going to be structured and can optimize the data storage to support it; truly ad hoc reports (i.e., where you've got a pile of data and the questions being asked about it aren't anticipated in advance) gain the least, performance-wise, from denormalization, and are often made much more difficult (in terms of actually putting a query together to get what you want) by denormalization.

    2. Re:Not the only one thinking this is silly... by xelah · · Score: 1

      where object stores tend to get super normalized which is just bad for reporting because cross table joins are the most expensive thing you can do in any database

      That's rather simplistic. Normalizing can make your queries faster if they reduce duplicated storage. Imagine having a massive table of cars with a long manufacturer text field, vs having a table of manufacturers. By taking out the manufacturer field you make your table smaller, which means less IO for your queries and better use of your caches.

      Besides, I'm sure there are many more expensive operations than a join like the one I've implied. Imagine, say, sorting a few million of those car records by a text attribute. That's likely to add much more time than a hash-join to a small second table.
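
      As a rough illustration of why a hash join against a small table is cheap, here is a toy version in Python. The manufacturers and cars lists and the hash_join function are invented for this sketch; a real database engine does this internally and far more carefully.

```python
# Small dimension table: (manufacturer_id, name)
manufacturers = [(1, "Plymouth"), (2, "Chrysler")]

# Big fact table: (car_id, model, manufacturer_id). Only an id is stored per car,
# not a long manufacturer text field.
cars = [(100, "Neon", 1), (101, "Voyager", 1), (102, "300M", 2)]

def hash_join(cars, manufacturers):
    # Build phase: hash the small table once.
    by_id = {mid: name for mid, name in manufacturers}
    # Probe phase: one dictionary lookup per row of the big table.
    return [(car_id, model, by_id[mid])
            for car_id, model, mid in cars
            if mid in by_id]

joined = hash_join(cars, manufacturers)
print(joined)
```

      The work is one pass over each table, whereas sorting the cars by a text attribute needs on the order of n log n comparisons.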

  41. Re:WTF? seconded by patiodragon · · Score: 1

    As long as there is data that is related, I predict a form of relational database.

    Genius! I know, pure genius.

  42. hell no by Thaelon · · Score: 1

    There are still multi-billion dollar businesses operating the core of their business on COBOL systems, and they're decades older than relational database technology.

    So don't bet on it.

    --

    Question everything

    1. Re:hell no by plopez · · Score: 1

      COBOL is a programming language. Relational databases are a data management paradigm. COBOL can be used to access Oracle, PostgreSQL, IDMS, SQL Server and a host of other data management systems.

      I'm not sure what your point is.

      --
      putting the 'B' in LGBTQ+
    2. Re:hell no by Thaelon · · Score: 1

      Technologies that businesses get built on tend to stick around for decades, and are replaced about as often as foundation blocks.

      Evidence of this is found in the fact that COBOL is alive, if not well.

      That their roles are different is irrelevant.

      --

      Question everything

  43. Doomed I tell you, by geekoid · · Score: 1

    DOOOOOOOOOOMED!

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  44. some stories you could submit by Xtifr · · Score: 1

    I keep hoping that someday I'll read a Slashdot article headlined, "Is xxx doomed?" and the answer is... Yes!

    Feel free to submit any or all of the following:

    Is CP/M doomed?
    Is Microsoft Word for OS/2 doomed?[*]
    Is pets.com doomed?
    Is the passenger pigeon doomed?
    Is T-Rex doomed?
    Is Shoemaker-Levy doomed?

    Get any of these accepted, and your wish will come true. :)

    [*] Note, I own a copy of this, but I still suspect it's pretty doomed.

  45. RDBMS don't scale! by Anonymous Coward · · Score: 1, Funny

    Quick someone tell CCP!

  46. Relational DBs will be around for a long time... by terryfunk · · Score: 0

    it will be a long time before the death of Relational DBs.

    People will make them work regardless. It is a bit like COBOL, there's lots of it out there and it will be maintained for years to come...put it in the bank.

  47. my question by systematical · · Score: 1

    Yes I read most of the article and perhaps reading the rest would answer this, but how is a key/value database different from a MySQL database running MyISAM where you store a bunch of different objects as a string, maybe json_encoded or whatever in the row?

    1. Re:my question by DragonWriter · · Score: 1

      Yes I read most of the article and perhaps reading the rest would answer this, but how is a key/value database different from a MySQL database running MyISAM where you store a bunch of different objects as a string, maybe json_encoded or whatever in the row?

      Purpose-designed distributed key/value stores (which are what is interesting, not just "key/value stores") that aren't pretending to be RDBMS's are generally easier to scale out, since scalability is what they are designed for.
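
      A toy sketch of why plain key lookups distribute so easily, in Python. The node names, the node_for function, and the ShardedStore class are hypothetical, and the simple hash-modulo placement is a stand-in: it is not a claim about how SimpleDB, Voldemort, BigTable, or any other real store places data.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical servers

def node_for(key):
    """Pick a node from the key alone, so any client can route a request."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]

class ShardedStore:
    """In-memory stand-in for a cluster: one dict per node."""

    def __init__(self):
        self.shards = {node: {} for node in NODES}

    def put(self, key, value):
        self.shards[node_for(key)][key] = value

    def get(self, key):
        return self.shards[node_for(key)].get(key)

store = ShardedStore()
store.put("user:42", {"name": "alice"})
print(store.get("user:42"))
```

      Every get or put touches exactly one node, so there is never a cross-node join to coordinate. Note that naive modulo placement reshuffles most keys when the node list changes, which is one reason real systems use more elaborate schemes.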

  48. Yes by Anonymous Coward · · Score: 0

    I am already in the process of migrating all my enterprise systems from Oracle to Project Voldemort.

  49. Re:Karma Whoring by deraj123 · · Score: 1

    Except...usually the xkcd links that get modded up are...relevant. That comic is about validating user input, it really doesn't have much to do with RDBMS at all.

  50. Re:WTF? seconded by Anonymous Coward · · Score: 0

    As long as there is data that is related, I predict a form of relational database.
    Genius! I know, pure genius.

    You might want to look up what relational means, because it has nothing to do with one piece of data being related to another.

  51. Blah, blah, blah, etc. by SIR_Taco · · Score: 1

    1. Netcraft...
    2. Overlords...
    3. Shampoo...
    4. Skynet...
    5. ????
    6. Profit!

    --
    I say don't drink and drive, you might spill your drink. Before you get behind the wheel just stop and think.
  52. Answer: by ardle · · Score: 1

    About as doomed as COBOL

    1. Re:Answer: by ardle · · Score: 1

      ... the only difference being that SQL is human-readable ;-)

    2. Re:Answer: by Qzukk · · Score: 1

      My experience is that COBOL is quite human readable, but who the fuck would ever think to write
      MULTIPLY x BY y GIVING z?

      --
      If I have been able to see further than others, it is because I bought a pair of binoculars.
  53. Longest death in tech history by TheWoozle · · Score: 1

    Well, apparently the relational database has been doomed for the last 20 years or so.

    You'd think that the people running the thousands of systems with databases managing data on everything from bank accounts to medical records to what you bought from Wal-Mart last Tuesday would have heard the news by now and moved on to the Next Big Thing.

    --
    Insisting on "correct" English is like saying that there is only one, definitive recipe for chili.
  54. Ummm what regular graph/object databases by Grapedrink · · Score: 1

    It seems like every time I read one of these articles, it is written by someone with no knowledge of what is actually out there. I suppose that is normally because anyone with enough time to mouth off on the internet in article or blog form is not actually doing real work. The rest of us are too busy, you know, working?

    Anyway, as someone that has been programming, particularly against just about every major database platform out there, I can tell you that there has always been a battle between relational vs. other types of databases, most notably graph databases. Full disclosure: I am partial to graph databases, but that is because I find a lot of utility in working in pure code and I also apply graph theory to a lot of my work.

    There is no silver bullet and each kind of database is better at certain tasks. Long ago, hardware made much more of a difference than it does today and was one reason relational databases "won" out. Other reasons include marketing, the development of SQL and other standards, and the ease of applying relational mathematics which are easy to understand. I spent most of my life working with relational databases as they were simply in my comfort zone, until I realized I needed to get off my ass and learn more than just new languages and programming techniques (and realized relational dbs were such a huge fail at tasks like traversing graphs, storing dynamic columns ala a CRM, etc).

    Particularly when you look at older languages like LISP, relational was a great fit vs. graph. Since then, many factors have caught up and many more languages, solutions, and designs are out there. In the meantime, graph databases never went away. It seems suddenly since the outbreak of "web 2.0" frameworks and ORMs, a lot of people who don't have a clue about SQL, and especially databases in general of all types find it a great idea to go out and make their own or put some huge hack on top of a relational db, or perhaps worse, try to come up with something entirely new that is not based on fact or need.

    I've played around with some of the databases mentioned above, and just like MySQL, they are mainly reinventing the wheel badly. The google implementation is the only one I have seen that is not completely shoddy, but color me somewhat unimpressed as well. FYI, just because xyz company such as google or IBM uses some product does not make it good. On the contrary, it's often a warning sign as big companies are often what keeps sites like thedailywtf going.

    Regarding graph databases, I'm currently using Gemstone with Smalltalk for instance, and I have used it with Java as well. I can tell you it is great, but it is no panacea. It's been around forever like many of its competitors which are also noteworthy, but a side-by-side comparison is best left outside the scope of this comment. Gemstone lets me avoid any ORM overhead and I can write and maintain queries, transactions, reports, etc in one place: my code, where they belong for my projects. It's fast as hell and I have collections with millions of objects and there is no slowdown. In fact, I migrated many of my databases from other systems including Oracle, DB2, MS SQL, and Postgres, and it smokes them all, but I am confident that I can think of a huge laundry list of tasks where the opposite will be true.

    For instance, I can do reporting using Gemstone as I mentioned, but it is left up to me writing my own code or using a library, where in something like MS SQL, I can just use reporting services to do more complicated cubes, or one of the 100 enterprise reporting tools of various size for Oracle. It also sucks in some ways if you need to share a lot of data between applications because you have to make decisions about what do you want to split out and how to manage the memory and load.

    Even Gemstone itself is not so bold as to presume there is no use for relational databases. You can build a SQL layer on top of it and it has many tools to move data back and forth between any relational db, for instance.

    The authors

    1. Re:Ummm what regular graph/object databases by DragonWriter · · Score: 2, Insightful

      Long ago, hardware made much more of a difference than it does today and was one reason relational databases "won" out.

      Hardware makes just as big of a difference today, which is why distributed key/value stores are gaining currency at the moment. The hardware-related difference that was a big win for relational databases was their efficient use of disk space when normalized; the hardware-related difference that is a big win for distributed key/value stores now is their efficient scalability by distribution across multiple nodes.

      I am going to tear my eyes out if I see "yet another tuple store or graph db." Welcome to the last century, please try again.

      The big thing isn't "tuple stores or graph dbs", it's distributed tuple stores, and, even better, distributed transactional tuple stores. Not a whole lot of them from the last century.

    2. Re:Ummm what regular graph/object databases by Grapedrink · · Score: 1

      You are misunderstanding me, but perhaps I was not clear. Architecture with the right hardware makes a difference of course. My point was it was no longer technically impractical to run certain types of databases. Certainly every database loves more hardware.

      There are plenty of tuple and graph stores that support distributed transactions. There are also plenty that have not been updated, but just as many in the relational world. It is in the nature of graph and tuple stores to scale out in this way. Most of the major implementations already do this quite well and much more elegantly than any replication based solution in the relational world.

      I am not sure what you are getting at here other than straying a bit off topic from what I was commenting about. Regardless, yes, I agree with you.

  55. Optional reading by LordMyren · · Score: 1

    oreilly radar recently covered the topic, as did Richard Jones, a last.fm person. Some decent reading in both

  56. Oracle has this - it is called Coherence by (H)elix1 · · Score: 1

    Coherence gives you all the magic you would want to have were you starting to put together a high-powered hashmap. Does key/value pairs, across multiple machines, with cache invalidation, etc. It also lets you perform interesting queries on the cache. It also can front end Hibernate or other databases and act as a cache there too. It also works better than most HTTP session stores. It also.... Gah. This is one of my favorite JARs in my toolkit.

  57. Just to be pendactic by plopez · · Score: 3, Insightful

    There really isn't a true implementation of the relational model as per Codd and Date.

    Also, SQL is a nightmare. A badly designed programming language which is not quite functional and not quite procedural and so needs a bunch of hacks to work properly. And then there is the issue of NULLS. And the fact that you can end up with ugly bag operations and path dependencies in SQL.
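
    A few of those warts can be seen directly with Python's built-in sqlite3. This is only a sketch of the NULL and bag (duplicate row) behavior as SQLite implements it; the table t is made up, and other engines may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL compared with NULL is not true; it is NULL (Python sees None).
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# One NULL in the list makes NOT IN unknown, so the row silently vanishes.
without_null = conn.execute("SELECT 1 WHERE 2 NOT IN (1)").fetchall()
with_null = conn.execute("SELECT 1 WHERE 2 NOT IN (1, NULL)").fetchall()

# Bags, not sets: duplicate rows survive unless you ask for DISTINCT.
conn.executescript("CREATE TABLE t (x INTEGER); INSERT INTO t VALUES (1), (1);")
bag = conn.execute("SELECT x FROM t").fetchall()
as_set = conn.execute("SELECT DISTINCT x FROM t").fetchall()

print(null_equals_null, without_null, with_null, bag, as_set)
```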

    And just to start yet another flame war (I know, I just know someone is going to mod me as a troll today) key/value is just another way of saying "network database".

    And another thing which I will probably get hammered for, if you normalize a DB properly you will get your objects almost for free. And vice versa. Where I see people having problems is that they are either:

    1) lazy about defining and understanding their data
    2) or likewise for their objects
    3) or both.

    If you do it properly you will get a nice set of multidimensional objects and fact/attribute tables which are orthogonal and lean. Easy to understand, search, join, build, compose, decompose, signal and track.

    As opposed to a snarled up hacked together, overloaded, over inherited nightmare with hidden dependencies which I have seen too many times.

    OK, you can slam me now.

    --
    putting the 'B' in LGBTQ+
    1. Re:Just to be pendactic by reginaldo · · Score: 1

      You're right, but you are missing the point. This isn't about creating objects, it's about removing the necessity to join in the DB, hence de-normalizing.

      Sure, a normalized database model gives you a ton of discrete objects, but you have to join them all together to get everything you need, so performance suuccks.

      Not that I'm advocating the key/value approach :P

    2. Re:Just to be pendactic by rycamor · · Score: 1

      No, you're missing the point. You are mistaking the concept for the implementation. Relational database management systems are a concept (well, a combination of concepts), and they can be implemented any way you want at the physical level. Most of the perceived problems with 'relational databases' are really problems with how SQL systems are implemented. It doesn't matter what storage mechanisms you have. As long as the *front end* (or API, or querying interface) allows you to express your instructions and receive results via declarative set logic rather than hierarchical navigation, then you have a relational database management system.
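
      A small sketch of that distinction in Python. The dept/emp tables and the nested tree dict are invented for illustration; sqlite3 stands in for "a front end that accepts declarative set logic", and the dict stands in for a structure you have to navigate.

```python
import sqlite3

# Declarative: say WHAT you want; the system decides how to find it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                      dept_id INTEGER NOT NULL REFERENCES dept(id));
    INSERT INTO dept VALUES (1, 'ops'), (2, 'dev');
    INSERT INTO emp VALUES (1, 'ann', 1), (2, 'bob', 2), (3, 'cy', 2);
""")
devs_sql = [row[0] for row in conn.execute("""
    SELECT emp.name FROM emp JOIN dept ON emp.dept_id = dept.id
    WHERE dept.name = 'dev' ORDER BY emp.name
""")]
bobs_dept_sql = conn.execute("""
    SELECT dept.name FROM dept JOIN emp ON emp.dept_id = dept.id
    WHERE emp.name = 'bob'
""").fetchone()[0]

# Navigational: the same data as a hierarchy, walked by hand.
tree = {"ops": {"employees": ["ann"]}, "dev": {"employees": ["bob", "cy"]}}
devs_nav = sorted(tree["dev"]["employees"])       # easy: follows the hierarchy
bobs_dept_nav = next(dept for dept, node in tree.items()
                     if "bob" in node["employees"])  # awkward: scan every branch

print(devs_sql, bobs_dept_sql, devs_nav, bobs_dept_nav)
```

      Both questions look the same in the declarative version; in the navigational one, the second question fights the shape the data was stored in.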

    3. Re:Just to be pendactic by coryking · · Score: 1

      You have to join everything together at some point no matter what. If you join them on the way out the door, it buys you flexibility to basically see your dataset any damn way you need. If you join them going in, you can only see your data in ways envisioned by whoever designed the system initially.

      Joins only kill performance if you are either *huge* or are using a certain database that happens to have a dolphin for a logo. In most cases, on a properly indexed, properly tuned database server a 10 table join shouldn't take more than a few milliseconds. You run into more problems with things like nested subqueries--but it really depends on how well you know your system and how many knobs you can turn.

      Once you outgrow PostgreSQL, both of the big-boys offer all kinds of neat ways to optimize particularly "interesting" queries. Things like materialized views and such are good hacks to work around crazy views of your data that get hit often. Those kinds of things are the reason people pay big-bucks for Oracle and such. And if they went some crazy "object database" or key-value pair system like in the article, I'd argue the company would have never grown to the size that it could afford Oracle--those systems would have made it impossible to know the metrics and statistics needed to grow to a large enough size.
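
      To make the "join on the way out" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customer/orders schema and every name in it are invented for illustration; it is not taken from any real system.

```python
import sqlite3

# Hypothetical schema, invented for illustration: customers and their orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ann', 'Oslo'), (2, 'Bob', 'Lima');
    INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0);
""")

# Joined "on the way out": one view of the data, orders with customer names.
orders_with_names = conn.execute("""
    SELECT o.order_id, c.name
    FROM orders o
    JOIN customer c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

# A different view nobody planned for up front: total spend per city.
spend_by_city = conn.execute("""
    SELECT c.city, SUM(o.total)
    FROM orders o
    JOIN customer c ON c.customer_id = o.customer_id
    GROUP BY c.city
    ORDER BY c.city
""").fetchall()

# Joined "going in": one denormalized blob per order id. Any question other
# than "fetch order N" means scanning every blob and regrouping by hand.
denormalized = {
    10: {"name": "Ann", "city": "Oslo", "total": 5.0},
    11: {"name": "Ann", "city": "Oslo", "total": 7.5},
    12: {"name": "Bob", "city": "Lima", "total": 3.0},
}

def spend_by_city_scan(blobs):
    """Recompute the per-city totals from the denormalized blobs."""
    totals = {}
    for blob in blobs.values():
        totals[blob["city"]] = totals.get(blob["city"], 0.0) + blob["total"]
    return sorted(totals.items())
```

      Both routes give the same totals on this toy data; the difference is that the second view costs one more query in the normalized case and hand-written scanning code in the denormalized one.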

  58. SQL is the problem, not RDBMSs by Savantissimo · · Score: 3, Interesting

    SQL and all its pointy-headed progeny are the real problem with databases, not the relational vs. newMarketingBuzzwordDuJour arguments.

    Database operations do not need to look like code or algorithms, the only reason they do is to provide jobs for database programmers.

    Over 15 years ago Paradox's query-by-example was light-years ahead of today's soul-killing SQL crap.

    SQL is not going away, though, any more than its idiot older brother Mumps (M, Caché).

    --
    "Is life so dear, or peace so sweet, as to be purchased at the price of chains and slavery?" - Patrick Henry
    1. Re:SQL is the problem, not RDBMSs by Grapedrink · · Score: 2, Insightful

      Microsoft has CLR code running on top of MS SQL but it sucks performance-wise. Oracle has Java. That's about as close as we have gotten, but both are just crutches.

      Unfortunately, you are right that SQL is terrible and not going away. The status quo, industry, and marketing will make sure we suffer for years to come.

    2. Re:SQL is the problem, not RDBMSs by emurphy42 · · Score: 2, Insightful

      Paradox's query-by-example

      *looks up* GUI query builder? Highly appropriate for simple things (e.g. Crystal Reports), but absolutely terrible for more complex things.

    3. Re:SQL is the problem, not RDBMSs by WuphonsReach · · Score: 4, Informative

      Over 15 years ago Paradox's query-by-example was light-years ahead of today's soul-killing SQL crap.

      QBE grids are nothing more than a UI abstraction of the underlying SQL SELECT statement. In fact, in MS-Access (which has a QBE grid), you can flip between looking at the QBE and looking at the raw SQL SELECT statement.

      Sometimes it's faster to do it in raw SQL, sometimes it's faster to set up the query in a QBE grid.

      --
      Wolde you bothe eate your cake, and have your cake?
    4. Re:SQL is the problem, not RDBMSs by Just+Some+Guy · · Score: 4, Insightful

      Database operations do not need to look like code or algorithms, the only reason they do is to provide jobs for database programmers.

      From Wikipedia:

      Relational database theory uses a different set of mathematical-based terms, which are equivalent, or roughly equivalent, to SQL database terminology.

      SQL looks like SQL because it's based on set theory. As an exercise, invent your own language that's as powerful (read: also based on a strong theoretical basis) but simpler. See you in a couple of decades!

      --
      Dewey, what part of this looks like authorities should be involved?
    5. Re:SQL is the problem, not RDBMSs by neonskimmer · · Score: 1

      I don't understand all the hate for SQL.

      Generally speaking, the only issues I've ever had with databases are platform-specific and have nothing to do with the query language, e.g. Oracle's insanely complex configuration and endless lists of parameters, performance tuning, indexing, replication, etc.

      SQL in itself is easy to learn. What saddens me more is seeing people badly use atrocities like Hibernate instead of figuring out (or hiring someone who will) how a database works.

      That being said, I totally see how databases such as CouchDB will become more and more popular. Regardless of how proficient one is in SQL, the possibilities are pretty much endless when you get to write the map/reduce functions yourself using an expressive language (like JavaScript in CouchDB).

    6. Re:SQL is the problem, not RDBMSs by coryking · · Score: 1

      I really don't get the SQL-hate either. SQL is not scary once you learn it. I think people who learned SQL on MySQL 3.0 got scared off. MySQL didn't really like things like "JOIN" and instead wanted you to shove all your relations in your WHERE. Plus it didn't support aliasing tables without an "AS".


      -- the kind of query one might find on MySQL 3
      -- it is hard to figure this kind of thing out
      -- because all the join conditions are not spelled out
      -- and jammed in the WHERE clause
      SELECT o.name, p.pet_name, v.name AS vet_name
      FROM owner AS o, pet AS p, vet AS v
      WHERE o.owner_id = p.owner_id AND o.vet_id = v.vet_id

      vs.

      -- how you are supposed to write it.
      -- you can tell exactly what the hell is
      -- happening in this query
      SELECT o.name, p.pet_name, v.name as vet_name
      FROM owner o
      JOIN pet p ON o.owner_id=p.owner_id
      JOIN vet v ON o.vet_id=v.vet_id

      SQL has an easy learning curve. You can do about 75% of the queries you need without learning more than SELECT, FROM, * JOIN, ORDER BY, etc. The fun stuff is once you do aggregate queries and subqueries. The best is when you get to optimize some complex query with indexes and a good query analyzer (hint: the dolphin logo'd database has horrible query analysis tools).
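
      For anyone who wants to check that the two styles really are the same query, here is a small sketch using Python's built-in sqlite3 module. The table contents are made up, and the implicit form combines its join conditions with AND; the point is only that both forms return identical rows.

```python
import sqlite3

# Toy data for the owner/pet/vet tables used in the queries above (made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vet (vet_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE owner (owner_id INTEGER PRIMARY KEY, name TEXT, vet_id INTEGER);
    CREATE TABLE pet (pet_name TEXT, owner_id INTEGER);
    INSERT INTO vet VALUES (1, 'Dr. Lee');
    INSERT INTO owner VALUES (1, 'Ann', 1), (2, 'Bob', 1);
    INSERT INTO pet VALUES ('Rex', 1), ('Tom', 2);
""")

# Old style: comma-separated tables, join conditions buried in WHERE.
implicit = conn.execute("""
    SELECT o.name, p.pet_name, v.name AS vet_name
    FROM owner AS o, pet AS p, vet AS v
    WHERE o.owner_id = p.owner_id AND o.vet_id = v.vet_id
    ORDER BY p.pet_name
""").fetchall()

# Explicit style: each JOIN carries its own ON condition.
explicit = conn.execute("""
    SELECT o.name, p.pet_name, v.name AS vet_name
    FROM owner o
    JOIN pet p ON o.owner_id = p.owner_id
    JOIN vet v ON o.vet_id = v.vet_id
    ORDER BY p.pet_name
""").fetchall()
```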

    7. Re:SQL is the problem, not RDBMSs by Savantissimo · · Score: 1

      QBE grids are nothing more than a UI abstraction of the underlying SQL SELECT statement.

      Well they are both direct abstractions of the underlying algebra so neither one is necessarily more fundamental. QBE is easier because all the variations on SELECT are a mouse click or keystroke or two away. The amount of typing is typically 1/10 that of SQL and the opportunities for error are similarly reduced.

      While it sounds like the Access QBE - SQL translator is a good way of generating parts for more complicated SQL work, Access was to decent QBE something like what MS Word formatting was to WordPerfect's "reveal codes". (Perhaps things have changed since I used them.) Nothing wrong with instant decaf if you like that sort of thing, but it's not quite real coffee.

      --
      "Is life so dear, or peace so sweet, as to be purchased at the price of chains and slavery?" - Patrick Henry
    8. Re:SQL is the problem, not RDBMSs by Anonymous Coward · · Score: 0

      No. SQL looks like SQL because the people who wrote it followed the "should look like english" design rule (like COBOL) with the justification that "that way non-programmers can use it", which has never been true.

      But even if SQL was a straightforward implementation of the relational model, that wouldn't make it right. All programming languages follow the Turing model, yet I don't see anyone arguing that Brainfuck is the way forward. What some relational advocates (and it *is* a very strong and flexible model) fail to realise is that a model is just that, a model. It allows you to reduce the implementation to a few rules, so that it can be compared with others and you know what is and isn't possible with it. For instance Prolog without recursion is equivalent to SQL, except you wouldn't know without reducing them to models.

      Just because SQL follows the model doesn't mean I shouldn't be able to traverse a foreign key in a pointer-like fashion instead of writing joins. Or retrieve rows from 2 different tables in a single query without having to rename the fields to merge them. None of these operations bring anything to the table in terms of models, just like C structs bring nothing over offsetting a pointer. They just make the language *much* easier to use. And that's something I wish people such as yourself or the third manifesto guys would understand.

    9. Re:SQL is the problem, not RDBMSs by Savantissimo · · Score: 1

      SQL looks like SQL because it's based on set theory. As an exercise, invent your own language that's as powerful (read: also based on a strong theoretical basis) but simpler. See you in a couple of decades!

      OK, I went back in my time machine and gave the query-by-example spec to this guy at IBM back in the mid-'70s.

      It's based on set theory but it doesn't require typing nearly as much because it isn't a language as such, but instead a means of specifying queries that uses the computer to keep track of and display in human-preferred form what fields and operations are relevant to a given table or joined set of tables.

      Computer languages let you specify anything you can type. Nearly all the things you can type are errors. Human-appropriate interfaces moving beyond the simple character stream so ingrained in programming languages- simple things like columns and grids - can not only make things easier without losing flexibility, they can prevent specifying most wrong or impossible commands. Geometrical relations are more fundamental and intrinsically mathematical than symbolic relations, and brains are better at geometry and physics than they are at algebra and arithmetic, so why not give a nod to the human side of the human-computer interface bottleneck when it's not hard to do?

      --
      "Is life so dear, or peace so sweet, as to be purchased at the price of chains and slavery?" - Patrick Henry
    10. Re:SQL is the problem, not RDBMSs by WuphonsReach · · Score: 1

      MS-Access QBE was just the ready example at hand that most people would have exposure to. I have used it in the past to figure out more complex SQL statements, but it's also Jet SQL - which is about 2 steps away from standard SQL in a weird direction. Never used Paradox (we switched from a mainframe SQL query server on top of DB2 to MS Access 2.0).

      I still prefer for people to learn SQL rather than some QBE tool. SQL is fairly universal, and QBE isn't always available. QBE then becomes a learning tool for the times when the QBE simply can't represent what you need it to. To only know one or the other is not wise.

      I still use QBE for about 80% of my workload though.

      --
      Wolde you bothe eate your cake, and have your cake?
    11. Re:SQL is the problem, not RDBMSs by Tablizer · · Score: 1

      No. SQL looks like SQL because the people who wrote it followed the "should look like english" design rule (like COBOL) with the justification that "that way non-programmers can use it", which has never been true.

      Indeed. IBM was selling to PHB's, not programmers when they designed SQL. The earliest relational languages were more math-like, and would probably have made heavy-duty programmers happier. I even modeled my pet relational language, SMEQL, on the earlier languages rather than SQL. SMEQL looks more like a functional language.
               

    12. Re:SQL is the problem, not RDBMSs by Tablizer · · Score: 1

      Yes, but MS-Access mangles your SQL like nobody's mother. It removes your line-feeds, puts in long bloaty table specifiers even if they are unnecessary, adds parentheses like a Lisper on crack, and does very odd things with "or" statements. All without asking. One ends up using mostly the mouse approach largely because they don't want to deal with Access's reformatted SQL mess.

    13. Re:SQL is the problem, not RDBMSs by convolvatron · · Score: 1

      ullman - datalog - 1989 plus or minus a couple years. strictly more powerful and quite a bit easier to program in. and substantially easier to process.

      i've done a couple derived languages and they took months/weeks rather than the years the single sql-derived engine i worked on took.

    14. Re:SQL is the problem, not RDBMSs by Anonymous Coward · · Score: 0

      Whatever suits you works best. I learned SQL at university, and it has served me for 13 years since without fail - not some "query builder for dummies" that only exposes 0.1% of what you can do with SQL. SQL can't be reduced in complexity unless you take away features, so there you go... make your choice. You (or your delegate) must learn how to use SQL, or you can't avail yourself of its power. There's no 3rd option. If you're not a programmer, as painful as this concept may be, YOU CAN'T UTILIZE THE FULL POWER OF DATABASES!

  59. STILL "relational" - Dynamic Relational by Tablizer · · Score: 2, Interesting

    Some of those systems appear to more or less still be "relational". If each row is treated like a map (associative array) of strings, then the "schema" for a given table is the set union of all attributes used in the table, and non-existing columns for a given row can be treated as nulls.

    As long as an asterisk is not used in a query (ex: "select * from tableX"), and as long as the type-explicitness issues are resolved based on dynamic language conventions, it will pretty much act like an existing RDBMS. (Asterisks can be implemented perhaps, but it could be computationally expensive.)

    It's kind of like dynamic (AKA "scripting") languages versus static or type-heavy languages. The static kind of languages require more up-front info that "protects" the integrity of the thing at the expense of flexibility and declaration volume. The same dichotomy can be applied to RDBMS also. We have RDBMS that like a lot of info up-front, and now those which accept incremental or ad-hoc insertions are starting to be common (but still less standardized).

    And constraints can be incrementally added, such as later requiring that every new record in a "Cars" table have a value for "brand" or the like.

    One possible exception is that there were some examples that violated "map-ness" of records, such as having two colors for a car. If they instead supplied "color_1" and "color_2", then map set rules would not be violated, keeping it closer to true relational.

    In short: We don't have to abandon relational to get dynamism.
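
    A rough sketch of the idea, in Python, assuming each row is a plain dict of strings. The "Cars" rows and the helper names (schema, select, insert) are hypothetical; this is not how any particular product works.

```python
# Hypothetical "Cars" table where each row is a map (dict) of strings.
cars = [
    {"id": "1", "brand": "Saab", "color_1": "red"},
    {"id": "2", "brand": "Fiat", "color_1": "blue", "color_2": "white"},
    {"id": "3", "doors": "4"},
]

def schema(table):
    """The implied schema: the set union of every attribute used in any row."""
    columns = set()
    for row in table:
        columns.update(row)
    return sorted(columns)

def select(table, columns, where=lambda row: True):
    """Project the named columns; a column a row lacks reads as None (null)."""
    return [tuple(row.get(c) for c in columns) for row in table if where(row)]

def insert(table, row, required=()):
    """Append a row, enforcing constraints that were added after the fact."""
    missing = [c for c in required if row.get(c) is None]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    table.append(row)
```

    Here select(cars, ["id", "brand"]) gives a null brand for row 3, and passing required=("brand",) to insert plays the role of a constraint added later: old rows stay as they are, new ones must comply.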

  60. presidential term limits by falconwolf · · Score: 0, Offtopic

    Last I checked FDR was our only Unconstitutionally long office holder in the presidency....

    No, FDR was constitutionally the president the whole time he was president. Amendment 22, limiting presidents to two terms, wasn't passed by Congress until 21 March 1947 and wasn't ratified until 27 February 1951, both after he died.

    Falcon

  61. relavance of key value data stores by falconwolf · · Score: 1

    Actually i read TFA, and I just couldnt make sense of the benefits offered by the key value thing.

    While I don't expect RDBMS to ever disappear, there are cases where key value data stores can be more efficient. One such is name, address. Look up a name to get the address associated with the name. When all you need is the name and address why use a database? Where this falls down is when you have more data such as orders the named entity placed and what was ordered.

    Falcon

    1. Re:relavance of key value data stores by EastCoastSurfer · · Score: 1

      This may show my lack of knowledge on key-value stores, but using your example how would I look up everyone who lives at the same address? How about return to me all the people who have more than 1 address?

      Since the key is the name it doesn't sound like you could answer either of those queries without pulling all the data back and going through it yourself. At that point you've killed any of the perceived gain in performance over a traditional, normalized rdbms schema.

    2. Re:relavance of key value data stores by falconwolf · · Score: 1

      This may show my lack of knowledge on key-value stores, but using your example how would I look up everyone who lives at the same address? How about return to me all the people who have more than 1 address?

      Key: Name1
      Values:
      Address 1
      Address 2
      Address 3

      Same thing for Name2.

      That's how my black, er green, book is for phone numbers.

    3. Re:relavance of key value data stores by EastCoastSurfer · · Score: 1

      So you go through each key and count the number of addresses and then return the ones that have more than one? Can look things up based on the Value and not the key? That would seem to be the only way to efficiently answer my questions posed above.

      Sounds like a huge table scan for something that is simple in a relational model.
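
      A quick Python sketch of the problem, with a plain dict standing in for the key/value store. The names and addresses are made up, and real stores may offer secondary indexes that this toy ignores.

```python
from collections import defaultdict

# Hypothetical address book stored as a key/value map: name -> list of addresses.
store = {
    "Alice": ["1 Elm St", "9 Oak Ave"],
    "Bob": ["1 Elm St"],
    "Carol": [],
}

def addresses_of(name):
    """The query a key/value store is built for: one lookup by key."""
    return store[name]

def people_at(address):
    """Lookup by value: without a secondary index, every key must be visited."""
    return sorted(name for name, addrs in store.items() if address in addrs)

def people_with_multiple_addresses():
    """Also a full scan, since the count lives inside each value."""
    return sorted(name for name, addrs in store.items() if len(addrs) > 1)

def build_address_index():
    """One workaround: maintain a second map, address -> names, by hand."""
    index = defaultdict(list)
    for name, addrs in store.items():
        for addr in addrs:
            index[addr].append(name)
    return index
```

      The hand-built index makes the reverse lookup cheap again, but now the application has to keep two maps in sync on every write, which is the job an RDBMS index does for you.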

    4. Re:relavance of key value data stores by falconwolf · · Score: 1

      So you go through each key and count the number of addresses and then return the ones that have more than one? Can look things up based on the Value and not the key? That would seem to be the only way to efficiently answer my questions posed above.

      What exactly is your question? My own questions: Why would I want to go through each key to see if there's more than one value? Can who look things up?

      Falcon

    5. Re:relavance of key value data stores by EastCoastSurfer · · Score: 1

      First, the 'find everyone who lives at the same address' question is a valid one.

      Why would I want to go through each key to see if there's more than one value?

      The simple case is to find any contacts that don't have an address yet. Other reasons could be to find ones that don't have a home address or business address yet.

      What about history of addresses?

      See, you're only thinking about your own little address book. I'm thinking about millions of addresses and all the possible analytics that could be done on them. From your answers it doesn't seem that a key-value store allows the same types of questions to be answered that a very simple relational model would allow.

    6. Re:relavance of key value data stores by falconwolf · · Score: 1

      First, the 'find everyone who lives at the same address' question is a valid one.

      I can easily find everyone in my phone book.

      The simple case is to find any contacts that don't have an address yet. Other reasons could be to find ones that don't have a home address or business address yet.

      My phone book again. For instance, I have 3 phone numbers I can use: her home, cellphone, and work numbers. Though I haven't, I could also include the addresses.

      What about history of addresses?

      I can cross out one address and write in another.

      See, you're only thinking about your own little address book. I'm thinking about millions of addresses and all the possible analytics that could be done on them.

      I no longer keep one but I used to use a text file on my computer to do the same thing I can do with my phone book. And searching it is easy, [Ctrl + F] brings up the search box. Because it's a .txt file many programs can open it whereas if I had used a database I'd have to use the database software to open it. Heck, when I took both Java and Perl one of the things we did was to read and write text files.

      Falcon

  62. MySQL = key value store by ppierre · · Score: 1

    MySQL : Doomed or key value store ?

    1. Re:MySQL = key value store by MarkRose · · Score: 1

      The InnoDB engine is pretty nice. The coming Falcon engine looks promising, too. MySQL will be around for a long time to come.

      --
      Be relentless!
  63. Smells like PICK to me... by Anonymous Coward · · Score: 0

    Had to chime in here. I work with a PICK database daily and can tell you, it blows. Lack of tools, compatibility, and structure makes living with it more than a notion. If this is the way of the DB; I'd rather shovel shit at a hog farm.

    http://en.wikipedia.org/wiki/Pick_operating_system

  64. Yep, this will happen by ghjm · · Score: 2, Funny

    I can see the meeting now.

    Developer: "Hey boss, I found a better product for the transaction processing data! It might save us a bunch of money on Oracle licenses!"
    Boss: "Great, what is it?"
    Developer: "Project Voldemort!"
    Boss: "..."
    Developer: "No really, let me explain..."
    Boss: "I have a meeting to get to, but hey, let me know if you have any other great ideas."

  65. A SQL query walks into a bar... by SystematicPsycho · · Score: 4, Funny

    A SQL query walks into a bar and sees two tables. He walks up to them and says 'Can I join you?'

    From Tom Kyte's blog sql joke

    --
    Analytic & algebraic topology of locally Euclidean meterization of infinitely differentiable Riemmanian manifold
    1. Re:A SQL query walks into a bar... by dcooper_db9 · · Score: 1

      That would be a rhetorical query.

      --
      I do not block ads. I do block third party scripts.
  66. Stupidest Title Yet by Nom+du+Keyboard · · Score: 1

    The conclusion suggests that relational databases and key value stores aren't really mutually exclusive and instead are different tools for different requirements.

    So how does this have anything to do with dooming relational databases anyway?

    --
    "It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
  67. Limited schema... by Anonymous Coward · · Score: 0

    The real problem isn't table selection but all of your columns have to be named "pika!" with only different intonations.

  68. The answer is no! by BillKaos · · Score: 1

    I don't see relational databases going away any time soon.

    Most (>70%) of the web is using them, and so far, they've worked very well.

    What is missing is good support for them from the programming language point of view.

    The nature of relational databases is declarative, as you define mathematically what you want, not how. That's a job for the database, and they've got huge compilers and optimizers for it.

    Of course, the SQL language is a leaky abstraction of the pure relational calculus, and you have to know certain rules in order for your query to be answered efficiently.

    SQL doesn't fit well in imperative languages, where all you can do is write down instructions. Compare that with a language like Prolog, which is OOTB a relational database.

  69. Slashdot: Never news, nothing that matters. by Anonymous Coward · · Score: 0

    With apologies to Family Guy:

    "Coming up next: Can bees think? A new study indicates that no, they cannot."

    Congratulations, Slashdot, you're as ridiculous as a parody.

    Seriously, when are you going to get real jobs, it must be hard lying to your parents every Christmas that you're doing something more honest and worthwhile like pimping.

    1. Re:Slashdot: Never news, nothing that matters. by Vexorian · · Score: 1

      Random AC rant said:
      Seriously, when are you going to get real jobs,

      I find the irony hilarious.

      --

      Copyright infringement is "piracy" in the same way DRM is "consumer rape"
    2. Re:Slashdot: Never news, nothing that matters. by scotch · · Score: 1

      Please explain the irony. I wish to subscribe to your newsletter.

      --
      XML causes global warming.
  70. I think relational DBs are best for storage only by alexibu · · Score: 1

    My view on relational DBs is that architecturally they are a bad way to implement software.
    I think they should be just used for tables with indexes: no stored procs, triggers, or anything else.
    There should be code written in the language of your choice to control all the transactions and business logic etc.
    Giving out database schemas as an interface and giving out database logins to client software is a disaster IMHO.
    Much richer, more explicit and typesafe interfaces can be provided by modern programming languages than are possible with DB scripting procedures.
    The DB providers have a vested interest in developers using all their more complicated, DB specific features to avoid their product being a mere commodity.
    But like any other API or technology on offer, it is just as much what you reject that makes good software as what you accept.

    In summary, I think the relational DB as a marketable technology may be dead, as to my way of thinking it is just an API that does indexes on tables larger than memory and knows all the searching tricks and disk access performance tricks necessary to scale to large data sizes.

  71. Relational databases are so doomed.... by drolli · · Score: 1

    The transition to other databases is probably imminent and will be a matter of months. The big database companies will die soon. Just outsource rewriting billions of lines of code to India and to newbie programmers who don't have such fixed ideas yet. This will work out great.

  72. Object class anyone? by kilodelta · · Score: 2, Insightful

    Reading this I keep seeing OOP in there, and data as an object class.

    This is just the OOP crowd trying to not learn SQL and do things their way. It won't replace a full RDBMS. And an RDBMS can scale quite nicely if you know what the hell you're doing.

  73. PICK by ryadex · · Score: 1

    Smells like PICK to me. Those days are gone, let them go. http://en.wikipedia.org/wiki/Pick_operating_system

  74. Scaling? by Max_W · · Score: 1

    Scaling when a hard disk was 6 MB was a serious matter. But scaling when a hard disk is 10 TB and a server has 4 processors and 12 GB of RAM? How many websites are there that could overwhelm such a server?

    Scaling to, like, 50 TB of data? What can it be?

    1. Re:Scaling? by anothy · · Score: 1

      um, there are applications other than web sites out there. take a look at sawzall, hancock, brook, or aurora. the data sets those applications get used for simply will not fit in a relational database without massive efforts in distribution and replication (and that introduces other problems). the stuff google's doing with BigTable is fascinating (er, if you're into that kinda thing), and is very non-RDB. oh, and it's used for web sites.

      --

      i speak for myself and those who like what i say.
  75. MapReduce is a bunch of hype by Estanislao+Martínez · · Score: 4, Interesting

    The name of the MapReduce framework comes from the functional programming operations "map" and "reduce." Map takes as its input a collection of data, and a function that transforms data elements into other elements; it outputs a collection where each element of the input collection has been replaced by the result of applying that function to it. Reduce takes a collection of elements, an initial value of the same type as the elements, and a two-place, commutative, associative and symmetric operation; it produces as its output the value that results from applying the operation to the initial value and each element of the collection in turn, accumulating the partial results.

    Map and reduce are operations that can be trivially parallelized. To parallelize map, you divide the collection into subcollections (in any arbitrary manner), and map over each of them in parallel. To parallelize reduce, you divide the collection into subcollections, also arbitrarily, reduce each subcollection independently, then apply the reduction operation to the partial results. (That works because the reduction operation is commutative, associative and symmetric.)
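
    A minimal Python sketch of that chunk-and-combine structure. The helper names (chunks, parallel_map, parallel_reduce) are my own, and a thread pool is used only to show the shape of the computation; it assumes the initial value is an identity for the operation (0 for addition), since it gets applied once per chunk.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import add

def chunks(items, n):
    """Split items into at most n contiguous subcollections."""
    size = max(1, -(-len(items) // n))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

def parallel_map(fn, items, workers=4):
    """Map each chunk independently, then concatenate in the original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda part: [fn(x) for x in part], chunks(items, workers))
        return [y for part in parts for y in part]

def parallel_reduce(op, items, initial, workers=4):
    """Reduce each chunk independently, then reduce the partial results.

    Assumes `initial` is an identity element for `op`, because it is used
    once per chunk and once more when combining the partials.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda part: reduce(op, part, initial), chunks(items, workers))
        return reduce(op, partials, initial)
```

    For example, parallel_reduce(add, list(range(101)), 0) sums each quarter of the list separately and then adds the four partial sums.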

    Well, guess what: this sort of technique is trivially applicable to relational database queries. A SQL query translates down to a combination of joins (the FROM clause), filters (the WHERE clause) and maps (the SELECT clause). Joins are trivially parallelizable; you give each execution unit a subset of the tuples of the driving relation. Filtering (the WHERE clause) is a kind of reduce operation. SELECT is a kind of map operation. This means that relational queries are not any less amenable to parallel execution than the stuff Google does.
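
    To illustrate with a toy, here is a sketch in Python of one query run whole and then run in two slices of the driving relation. The owner/pet relations and the run_query helper are hypothetical, and a real optimizer does far more than this.

```python
# Hypothetical relations, as lists of tuples.
owners = [(1, "Ann"), (2, "Bob"), (3, "Cy")]              # (owner_id, name)
pets = [("Rex", 1), ("Tom", 2), ("Kit", 1), ("Zed", 3)]   # (pet_name, owner_id)

def run_query(driving_rows, owners_by_id):
    """Roughly: SELECT o.name, p.pet_name FROM pet p JOIN owner o
    ON o.owner_id = p.owner_id WHERE p.pet_name <> 'Zed'"""
    joined = ((p, owners_by_id[p[1]]) for p in driving_rows)   # FROM / JOIN
    kept = ((p, name) for p, name in joined if p[0] != "Zed")  # WHERE
    return [(name, p[0]) for p, name in kept]                  # SELECT

owners_by_id = dict(owners)

# The whole driving relation in one go.
whole = run_query(pets, owners_by_id)

# Each "execution unit" gets a slice of the driving relation; the slices
# never need to talk to each other, so their results are just concatenated.
partitioned = run_query(pets[:2], owners_by_id) + run_query(pets[2:], owners_by_id)
```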

    But the killer thing here is that MapReduce says absolutely nothing about the updates problem. This is one of the big features of RDBMSs: the ability to handle concurrent query and modification. It also says nothing about the data integrity problem, which is also one of the big RDBMS features.

    So, when you get down to it, there is a good argument to be made that many applications could make use of database technologies that support much faster querying, at the expense of very little updating. But there's no convincing argument that that technology isn't best implemented in the context of an RDBMS.

  76. Scaling? by Chysn · · Score: 1

    Okay, I RTFA. I've made a living with relational databases for ten years, but I have no experience with key/value databases. So maybe someone can explain this better than TFA did.

    Bain writes: "The first benefit is that they are simple and thus scale much better than today's relational databases. If you are putting together a system in-house and intend to throw dozens or hundreds of servers behind your data store to cope with what you expect will be a massive demand in scale, then consider a key/value store."

    So, they scale well if you're going to throw tons of hardware at them? Okay, I guess I can live with that. But then, in the "The Bad" section:

    "In the cloud, key/value databases are usually multi-tenanted, which means that a lot of users and applications will use the same system. To prevent any one process from overloading the shared environment, most cloud data stores strictly limit the total impact that any single query can cause.... These limitations aren't a problem for your bread-and-butter application logic (adding, updating, deleting, and retrieving small numbers of items)."

    So the resource usage in the cloud with a large number of users consumes so many resources (of your "hundreds of servers") that you need to limit your application to retrieving "small numbers of items."

    I can see how there'd be some benefit to providing a subset of RDBMS functionality to improve efficiency. But, if I'm understanding this article, they're apparently offsetting any gains from simplicity by duplicating data, justified by "hard drives are cheap." How the hell is this scaling "much better than today's relational databases?"

    --
    --I'm so big, my sig has its own sig.
    -- See?
  77. The real performance issue is joins. by Animats · · Score: 1

    Getting past the buzzwords, the real issue here is that key/value databases don't do joins. Joins are expensive and hard to distribute across machines. For many web apps, one key is enough to find the relevant data, and general joins aren't necessary.

    Both Google and Amazon realized this, and their key/value systems don't support joins. Developers of both systems have spoken in EE380 at Stanford, and were grilled over this issue. The big advantage of a join-free system is that a database can be split across machines without the need for elaborate intercommunication between them. You can simply put keys A-L on machine 1, and keys M-Z on machine 2. There are no crosslinks between the machines, and you don't have to do inter-machine locking. The front end machines just direct the query to the appropriate back-end machine based on the key.
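
    The A-L / M-Z split can be sketched in a few lines of Python. The Backend and Router classes are hypothetical stand-ins for the back-end machines and the front end; a real system would also handle replication, failures and uneven key distribution.

```python
class Backend:
    """Stand-in for one back-end machine holding a slice of the key space."""
    def __init__(self, name):
        self.name = name
        self.data = {}

class Router:
    """Front end that picks a backend from the first letter of the key."""
    def __init__(self):
        self.low = Backend("machine-1")   # keys A-L
        self.high = Backend("machine-2")  # keys M-Z

    def backend_for(self, key):
        # Anything sorting at or before "L" (including digits and the empty
        # key, an arbitrary choice for this sketch) goes to the first machine.
        first = key[:1].upper()
        return self.low if first <= "L" else self.high

    def put(self, key, value):
        self.backend_for(key).data[key] = value

    def get(self, key):
        return self.backend_for(key).data.get(key)
```

    Every get or put touches exactly one backend, so the two machines never need to coordinate. A join across keys on different machines is precisely what this layout cannot do cheaply.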

    There's a set of things you can't do this way, but they seem not to be the high-volume queries in web applications. That's the real insight here.

    Arguably, the web crowd just reinvented ISAM.

  78. Tokyo Products by chrysalis · · Score: 1

    Just in case you never heard about them, have a look at Tokyo Products http://tokyocabinet.sourceforge.net/index.html by the wonderful guy who already wrote QDBM, Hyper Estraier, etc.

    The presentation tells you the basics, but Tokyo Products are quickly improving, and there's already a bunch of useful new features since the presentation, as seen in the Mixi's Blog : http://alpha.mixi.co.jp/blog/

    Tokyo Products + Flare ( http://labs.gree.jp/Top/OpenSource/Flare-en.html ) makes SQL relational databases totally useless for almost every web app, except for beginners or conservative people.

    Also, with the rise of products like Terracotta (for Java) and Maglev (Ruby VM), getting back to SQL really seems backward.

    --
    {{.sig}}
  79. object db by hey · · Score: 1

    I would love to find an object database that keeps relations between objects and the data, e.g. child, siblings, parents. It could be done with one of these key/value DBs, but not so nicely.
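
    A rough sketch of what "not so nicely" might look like, assuming a plain dict stands in for the key/value store. The key names and helper functions are hypothetical; the point is that every relation has to be maintained by hand.

```python
# Hypothetical sketch: a plain dict plays the key/value store, and
# relations are kept as lists of keys inside each value.
store = {}


def add_person(key, name, parents=()):
    store[key] = {"name": name, "parents": list(parents), "children": []}
    for parent in parents:
        # The back-reference is maintained by hand; nothing enforces it.
        store[parent]["children"].append(key)


def siblings(key):
    """Walk parent -> children links to find keys sharing a parent."""
    found = []
    for parent in store[key]["parents"]:
        for child in store[parent]["children"]:
            if child != key and child not in found:
                found.append(child)
    return found


add_person("p:1", "Pat")
add_person("p:2", "Sam", parents=["p:1"])
add_person("p:3", "Lee", parents=["p:1"])
```

    Here siblings("p:2") follows the stored links to find "p:3", but deleting or re-parenting a person would need more hand-written bookkeeping of the same kind.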

  80. Think vs Thing by coryking · · Score: 1

    There is absolutely no reason to thing that President Obama has ever smoked crack.

    Glad I'm not the only one who makes this typo. I think that our fingers get so used to "ing" that we automatically type it. But why don't we mis-type "pink" as "ping", "link" as "ling", or "fink" as "fing"?

    Maybe it is the "th" part. "Something", "Everything", "Nothing".

    I'm not sure how to break this habit either. Word 2007 seems to pick this up, but none of the browser spell checks will. ... anyway, back on topic I suppose.

  81. BigTable is what you should be comparing with by CoughDropAddict · · Score: 1

    But the killer thing here is that MapReduce says absolutely nothing about the updates problem.

    That's because MapReduce is a data processing system, not a data storage system. You should read about BigTable, which is the data storage system we use (I work at Google), which does support updates.

    In your comments on this thread, I think you miss the key difference between an RDBMS and a system like BigTable. BigTable is almost perfectly horizontally scalable. When you need more capacity, it really is as simple as throwing more machines at the problem.

    RDBMS's can never give you this kind of horizontal scalability, because they make a promise to you that you can transactionally modify any two bits of data anywhere in your database. Fulfilling this promise requires that either your whole database lives on a single machine, or that you use a distributed transaction protocol like 2PC (which totally kills performance).

    So when your database gets busier than a single machine can handle, you have to manually partition your database into multiple physical databases. All the nice RDBMS features like transactions, joins, foreign keys, triggers, etc. can only (reasonably) work within a single physical database. The divide between physical databases is something your application code has to deal with -- it has to know to direct its queries to the correct partition. And repartitioning your data to run on more machines later is an invasive procedure, both operationally and to your application's code.

    BigTable is designed around the reality that a database of any significant size will need to run on more than one machine. It only guarantees that you can transactionally modify data within a single row. This gives BigTable the ability to move rows around between machines without the application even knowing this is happening. If you add more machines, BigTable can immediately start moving some subset of your rows onto this new machine.
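
    A toy model of that idea, not BigTable's actual API: updates are all-or-nothing within one row, and a row can be moved to another node while callers keep using the same row key. All class and method names are invented, and the sketch is single-threaded; a real system would also need locking and logging.

```python
# Toy model only: this is NOT BigTable's API. Names are invented to
# illustrate "atomic within one row, rows free to move between machines".

class Node:
    def __init__(self, name):
        self.name = name
        self.rows = {}  # row key -> dict of columns


class Table:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.location = {}  # row key -> Node currently holding the row

    def _node_for(self, row_key):
        # Rows seen for the first time land on the least-loaded node.
        if row_key not in self.location:
            self.location[row_key] = min(self.nodes, key=lambda n: len(n.rows))
        return self.location[row_key]

    def mutate_row(self, row_key, update):
        """Apply update(columns) to a single row, all or nothing."""
        node = self._node_for(row_key)
        columns = dict(node.rows.get(row_key, {}))
        update(columns)               # if this raises, the row is untouched
        node.rows[row_key] = columns  # swap the finished copy in

    def read_row(self, row_key):
        return dict(self._node_for(row_key).rows.get(row_key, {}))

    def move_row(self, row_key, target):
        """Rebalance: callers keep using the same row key afterwards."""
        source = self._node_for(row_key)
        target.rows[row_key] = source.rows.pop(row_key, {})
        self.location[row_key] = target


a, b = Node("a"), Node("b")
table = Table([a, b])
table.mutate_row("user:1", lambda c: c.update(name="Ann", visits=1))
table.move_row("user:1", b)  # the application code below does not change
table.mutate_row("user:1", lambda c: c.update(visits=c["visits"] + 1))
```

    Note what is missing: there is no way to change two rows in one step, which is exactly the guarantee being traded away for the freedom to move rows.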

    I recommend reading this paper for a far more in-depth look at this pattern. The key point of this paper is:

    A scale-agnostic programming abstraction must have the notion of entity as the boundary of atomicity.

    BigTable calls such entities "rows."

    1. Re:BigTable is what you should be comparing with by CoughDropAddict · · Score: 1

      A couple addenda to my last post:

      • I said BigTable is "the" data storage system we use. I should have said "a" data storage system we use -- of course it is not the only one.
      • I work for Google, but of course do not speak for them.
    2. Re:BigTable is what you should be comparing with by Estanislao+Martínez · · Score: 1

      RDBMS's can never give you this kind of horizontal scalability, because they make a promise to you that you can transactionally modify any two bits of data anywhere in your database. Fulfilling this promise requires that either your whole database lives on a single machine, or that you use a distributed transaction protocol like 2PC (which totally kills performance).

      This isn't as big of a problem as you make it out to be. That just means that if you limit or eliminate that RDBMS ability, you can query the data much faster and distribute it over a bigger cluster.

      You're still missing the point that the relational model is a logical data model. This is the one biggest misconception in all arguments against relational technology. If row-based RDBMSs oriented toward small transactional updates have intrinsic performance limitations, this doesn't defeat the relational model; it just means that it's the wrong implementation of the model for one type of application.

      All the nice RDBMS features like transactions, joins, foreign keys, triggers, etc. can only (reasonably) work within a single physical database.

      But other than joins, you're not even addressing the most fundamental relational feature, which is the separation of the logical and the physical data models.

  82. Successful object relational mapping? by weston · · Score: 1

    If you do it properly you will get a nice set of multidimensional objects and fact/attribute tables which are orthogonal and lean. Easy to understand, search, join, build, compose, decompose, signal and track.

    I'm led to believe it's not that easy, but I'd love to be shown wrong.

    Also, SQL is a nightmare

    I agree, and I think one of the interesting questions is why we don't have something better, or even just something else. There are probably millions of man-hours put into ORM or QBE layers, some with their own hacked-up query languages... that are eventually re-written as an SQL query. But as far as I know, despite the fact that we have open source databases, despite the fact that storage engines aren't married to queries... we don't have any other query languages directly supported by the database (unless, I don't know, is QUEL still supported by Postgres?).

    Where's D? Why not have Prolog (or a tabled prolog if you're worried about unbounded queries)? Given the fecundity of the field with regards to all kinds of different programming languages, I don't understand why there seems to have to be One Query Language(TM).

  83. Right Tool for the Job by CyberLife · · Score: 1

    Do you really need automatic referential integrity, or are you just trying to save the programmers some time?

    Do you really need ad-hoc queries, or are you just trying to save the programmers some time?

    Do your application programmers really write such buggy code that they cannot be trusted to write their own integrity/query code, or does your QA just suck ass?

    Do you really need a client/server solution for data-storage, or is it just what you know and are comfortable with?

    Do you really need an RDBMS as an integration tool for several applications, or are there better options?

  84. Re:WTF? seconded by Anonymous Coward · · Score: 0

    Even in mathematical usage, a relation determines a relationship between entities. If (x,y) is in a relation R, we say that x and y are related by R.

    So, shut the fuck up, you illiterate twat.

  85. No such thing as "materialized view" by Crazy+Taco · · Score: 1

    well, that's a kind of limited-functionality materialized view with a special engine to access it.

    I know I'm being pedantic, but it grates on me to hear the term "materialized view". There is no such thing, and "materialized view" is a contradiction in terms, since a view by definition is never fixed and changes as the data in the tables it references change. You would be better off referring to "materialized views" as snapshots. Because once you "materialize" something, it is definitely no longer a view.

    --
    Beware of bugs in the above code; I have only proved it correct, not tried it.
    1. Re:No such thing as "materialized view" by Estanislao+Martínez · · Score: 1

      I know I'm being pedantic, but it grates on me to hear the term "materialized view".

      I think this kind of objection is silly; but, what's worse, in this particular case, the factual basis of the objection is wrong. I'll get to that.

      There is no such thing, and "materialized view" is a contradiction in terms, since a view by definition is never fixed and changes as the data in the tables it references change. You would be better off referring to "materialized views" as snapshots. Because once you "materialize" something, it is definitely no longer a view.

      Various databases support automatic refresh of the physical tables in question when changes are made to the base logical tables. In this case, you're materializing the view on disk, and changing it as the tables in question change.

      Indexed views in SQL Server, for example, don't even allow the refresh to be deferred; an update to the base tables must refresh any indexed views that use that table. By your criterion, these are neither snapshots (because they change as the data in the base table change) nor views (because they are materialized).
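
      A sketch of the non-deferred refresh idea, with a loud caveat: SQLite has no materialized or indexed views, so a summary table kept current by triggers stands in for one here. This is not SQL Server's mechanism, the table and trigger names are made up, and updates to existing rows are not handled.

```python
import sqlite3

# Sketch only: a trigger-maintained summary table emulates a materialized
# view whose refresh cannot be deferred. All names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER);

-- The "materialized view": physically stored, like any table.
CREATE TABLE order_totals (customer TEXT PRIMARY KEY, total INTEGER);

-- The refresh runs as part of the statement that changes the base table.
CREATE TRIGGER orders_ai AFTER INSERT ON orders BEGIN
    INSERT OR IGNORE INTO order_totals VALUES (NEW.customer, 0);
    UPDATE order_totals SET total = total + NEW.amount
        WHERE customer = NEW.customer;
END;

CREATE TRIGGER orders_ad AFTER DELETE ON orders BEGIN
    UPDATE order_totals SET total = total - OLD.amount
        WHERE customer = OLD.customer;
END;
""")

conn.execute("INSERT INTO orders (customer, amount) VALUES ('ann', 10)")
conn.execute("INSERT INTO orders (customer, amount) VALUES ('ann', 5)")
conn.execute("DELETE FROM orders WHERE amount = 5")

# Reads hit the stored table; no aggregation over orders happens here.
totals = conn.execute("SELECT customer, total FROM order_totals").fetchall()
```

      The stored result is materialized and also changes as the base table changes, which is the combination the parent post says cannot exist.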

  86. Nope by AlgorithMan · · Score: 1

    The threat of Key/Value stores to Relational Databases is about as big as the threat of Visual Basic to C++. Yes, they're easier (so average script-kiddie posers might prefer them), but for many applications (especially in the pro business) they're just not powerful enough.

    --
    The MAFIAA is a bunch of mindless jerks who will be the first up against the wall when the revolution comes
  87. It is prepared statement.. by Anonymous Coward · · Score: 0

    Sure I know - ? marks the place where a parameter is inserted in a prepared statement

    In other words: you have to read the actual code below to know ...
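
    For anyone who hasn't seen one, a small sketch using Python's sqlite3 module, which uses ? as its positional placeholder. The users table and the values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Each ? is a positional placeholder: the value is bound to the
# statement as data, not spliced into the SQL text.
insert = "INSERT INTO users (name) VALUES (?)"
conn.execute(insert, ("alice",))
conn.execute(insert, ("Robert'); DROP TABLE users;--",))

# The same mechanism works for queries.
rows = conn.execute("SELECT name FROM users WHERE id = ?", (2,)).fetchall()
```

    Because the second name is bound as a value, it is stored as an ordinary string and the table survives.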

  88. Re:WTF? seconded by Anonymous Coward · · Score: 0

    Way to miss the point - there are countless NON-relational database systems that can also handle "data that is related."

    Oh, and here's your gratuitous insult: you're a cranially underdeveloped moribund coprophage.

  89. Tutorial-D and SMEQL by Tablizer · · Score: 1

    SQL looks like SQL because it's based on set theory. As an exercise, invent your own language that's as powerful (read: also based on a strong theoretical basis) but simpler.

    1. Tutorial-D

    2. SMEQL
         

  90. I've used both and like them both. by Lee+Cremeans · · Score: 1

    I've used "real" RDBMSes (MySQL and PostgreSQL -- yes, I know, MySQL doesn't really count compared to Postgres, Oracle or DB2, but I've used it...), MS-Jet (ugh), and SQLite, and my favorite by far is SQLite. PostgreSQL is lovely, and can do maybe 90% of what Oracle can for 0% of the price, but it's massive overkill for small apps (not to mention not very friendly to non-DBA system admins).

    As for things that can get away with just a simple key-value mapping and don't need to be written to disk, I love std::map. The STL makes C++ that much nicer to live with (and it beats Microsoft's MFC utility classes any day).

    -lee

  91. C++ to doom C by aoheno · · Score: 1

    PL/1 to doom COBOL.
    Linux to doom Windows.
    Ruby to doom PHP.
    Chrome to doom Firefox.
    Boxers to doom Y-Fronts.
    Derivatives to doom economy (OK, OK - they did).

    --
    Her lips were softer than a duck's bill, but her quacks ...
  92. That's not making things better by Anonymous Coward · · Score: 0

    I think having this arbitrary pool of key/value pairs, especially when a value is at most (and at least) a string, has potential to cause more chaos and havoc than what relational DBs have already imposed on the software development industry. Yes, simple key/value pairing has the benefits of simplicity and can be distributed easily, but ask yourself, would you really want to distribute it?

    I agree with the author that relational databases are not a solution - they never were, but let's not lower ourselves to the level of simple key/value pairs. Just reading the specs of the free implementations out there, two followed the eventual consistency model which, let's face it, is absolutely pathetic for anything that calls itself a "database" in this day and age. It's fine for a social network site like Facebook, but would you use it in a financial application? Or, as you put it, in a ticket reservation app.

    A proper database is one that mirrors and evolves in parallel with the application that uses it. And the only database flavour that manages to do this right is a native object oriented database. Yes, it has been kludged up in the past, and there are way too many post-relational flavours that are nothing more than a glorified relational database. In my opinion, nothing that offers a SQL-like syntax for querying a database can be considered relational: not in this universe and not in any one of the parallel universes that may exist. If you really want to solve a long term problem, you need to solve it right. Have a look at native object databases, and transaction-oriented databases, which are a completely new breed of OO DBs combining transaction processing, auditability and native object support. And they can run in a distributed environment, scaling out as the load goes up. Take a look at DTS/S1 from Obsidian Dynamics.
    http://www.obsidiandynamics.com/dts

    The idea behind this kind of offering is dead simple. If I'm writing my application using objects, instantiating classes, passing references around, then why would I resort to a relational product, let alone a key/value (albeit distributed) hashmap to persist my data? Trust me, and I have been in this industry for many years, all roads lead to objects.