Slashdot Mirror


Enthusiasts Convene To Say No To SQL, Hash Out New DB Breed

ericatcw writes "The inaugural NoSQL meet-up in San Francisco during last month's Yahoo! Apache Hadoop Summit had a whiff of revolution about it, like a latter-day techie version of the American Patriots planning the Boston Tea Party. Like the Patriots, who rebelled against Britain's heavy taxes, NoSQLers came to share how they had overthrown the tyranny of burdensome, expensive relational databases in favor of more efficient and cheaper ways of managing data, reports Computerworld."

82 of 423 comments

  1. Quit Whining by KingPin27 · · Score: 5, Funny

    Just use flat text files --- no need for expensive db's .... think of the freedom!

    --
    "i lost my dignity on a slippery wiener"
    1. Re:Quit Whining by Anonymous Coward · · Score: 4, Insightful

      The horrible lag I get when using address completion in Firefox 3 makes me wish more people thought that way!

    2. Re:Quit Whining by MichaelSmith · · Score: 2, Funny

      This is one of the main objectives of ReiserFS, to make such things easy, a project which unfortunately has run into some difficulty of late.

      I wonder if I could sneak Hans an eeepc inside a birthday cake...

    3. Re:Quit Whining by phantomfive · · Score: 2, Insightful

      You didn't learn to backup after the first time?

      --
      Qxe4
    4. Re:Quit Whining by Paradise+Pete · · Score: 4, Funny

      I've lost data in two filesystems thanks to the Slasher's shoddy work.

      Have you looked near Redwood Regional Park? On the side of a hill?

    5. Re:Quit Whining by fooslacker · · Score: 2, Insightful

      I have no idea what rubycodez's experience was but just because I can recover from backup doesn't mean data wasn't lost by the filesystem. More importantly external risk management schemes (i.e. backup) don't relieve the file system of the obligation not to kill the copy of my data I've entrusted it with.

    6. Re:Quit Whining by CyberLife · · Score: 3, Insightful

      Flat files are a perfectly viable option in some circumstances. Not everything requires data uniformity or the ability to run complex ad-hoc queries, nor does everything need information to be controlled by a separate process running on a different machine. Not every system integrates multiple applications through a shared data-store. The NoSQL crowd isn't arguing that SQL is bad, just overused. There are a great many situations where something like flat files or Berkeley DB is more than sufficient, and yet people still use relational technology. In my experience it's generally because that's all they know. In their mind, if one needs to store data one uses SQL. They don't select the right tool for the job because they honestly don't know there are other tools.

    7. Re:Quit Whining by jadavis · · Score: 4, Insightful

      One of the reasons is because RDBMSs offer a lot of tools, like atomicity, durability, backup/restore, centralization, point-in-time-recovery, etc. Many application developers need these things without actually needing the abstraction of a relational system.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    8. Re:Quit Whining by Anonymous Coward · · Score: 2, Informative

      The horrible lag I get when using address completion in Firefox 3 makes me wish less people thought that way!

      fixed that for ya. your address completion would probably not lag if there was an efficient data structure behind it, like a btree, like a rdbms would use

    9. Re:Quit Whining by Profane+MuthaFucka · · Score: 3, Funny

      Even flat files are an abstraction. My files are long, skinny, spiral, and spin around like Linda Blair's head.

      --
      Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
    10. Re:Quit Whining by dna_(c)(tm)(r) · · Score: 2, Insightful

      If your data isn't complex enough to require a RDBMS, you almost certainly don't need a program.

      Really? IM, word processor, spreadsheet, vector graphics, photo editor, ... Google probably uses MapReduce without a "normal" RDBMS behind it - is that data complex enough?

    11. Re:Quit Whining by Chelloveck · · Score: 2, Informative

      The horrible lag I get when using address completion in Firefox 3 makes me wish fewer people thought that way!

      fixed that for ya.

      Fixed that for ya!

      --
      Chelloveck
      I give up on debugging. From now on, SIGSEGV is a feature.
  2. A time and place for everything by Marillion · · Score: 3, Insightful

    There is a time and place for SQL. There is a time and place to avoid SQL.
    SQL is great for financial data. SQL is terrible for genetic data.

    --
    This is a boring sig
    1. Re:A time and place for everything by Bromskloss · · Score: 3, Insightful

      It would be interesting to hear why this is.

      --
      Swedish plasma phys. PhD student; MSc EE; knows maths, programming, electronics; finance interest; seeks opportunities
    2. Re:A time and place for everything by Carewolf · · Score: 3, Insightful

      Design an efficient table relating a tree structure. Then design queries to answer questions such as:
      * Find the nodes in the subtree under B.
      * Find all ancestors of G
      * Find the nearest common ancestor of D and H

      Trees are a well-known problem in SQL, but the fact is that SQL can't handle most data structures and complex relations, only very simple one-dimensional ones.

    3. Re:A time and place for everything by MichaelSmith · · Score: 2, Insightful

      It would be interesting to hear why this is.

      My guess would be that because SQL is a Structured Query Language it is best used for handling structured data. If you have serial, unstructured data you have to invent your own format for it to use inside the database, and then the query language isn't helping you.

    4. Re:A time and place for everything by Marillion · · Score: 2, Interesting

      Genetic sequences are long strings of alphabetic characters. One of the most common representations is the FASTA format, which deals with the most common type of nucleotide polymorphisms. You can't use exact string searching to find a match, which makes BLOBs and CLOBs useless. That said, the metadata of genetic data is reasonably structured and does load into relational databases fairly well.

      --
      This is a boring sig
    5. Re:A time and place for everything by Marillion · · Score: 3, Insightful

      Right, I went into a little detail on another post. When I said "genetic," I meant genes: DNA. There are four main nucleobases in DNA: adenine, cytosine, guanine, and thymine, abbreviated ACGT. So you could store a gene sequence as ACGCCTGCAATC. But in other populations, Asian for example, the same gene is more commonly found as ACTCCTGCAATC. (The third nucleotide is different.) Exact string matches won't find matches between different population groups. So they create wild-card letters, such as K, which represents either G or T. ACKCCTGCAATC would then match both of the sequences commonly found in western and eastern populations. Data of this nature has no business being in a relational database. For that matter, it doesn't belong in these pseudo databases either.
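
      A minimal sketch of the wild-card matching described above, assuming the standard IUPAC ambiguity letters (only a subset is listed) and using a regular expression in place of an exact string comparison. The function name `pattern_for` and the third test sequence are made up for illustration.

```python
import re

# Subset of the IUPAC nucleotide ambiguity codes: each letter maps to the
# set of bases it may stand for (K = G or T, as in the comment above).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "[GT]", "M": "[AC]", "R": "[AG]", "Y": "[CT]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}

def pattern_for(consensus):
    """Translate a sequence containing ambiguity codes into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in consensus))

pattern = pattern_for("ACKCCTGCAATC")
candidates = ["ACGCCTGCAATC", "ACTCCTGCAATC", "ACACCTGCAATC"]

# Keep only the candidates that the ambiguous sequence can stand for.
matches = [seq for seq in candidates if pattern.fullmatch(seq)]
print(matches)
```

      The two variants from the comment match; the hypothetical third sequence, with A in the third position, does not.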

      --
      This is a boring sig
    6. Re:A time and place for everything by E+IS+mC(Square) · · Score: 5, Interesting

      >> Trees are a well-known problem in SQL, but the fact is that SQL can't handle most data structures and complex relations, only very simple one-dimensional ones.

      Sorry, that's not true. Have you tried analytical functions? You would be amazed how complex scenarios can be handled easily with them. And they are part of ANSI SQL standards. And db providers (Oracle etc) have taken the concept and improved a lot on it.

      I think the anti-sql 'movement' has more to do with new (internet era) languages and their developers than so called 'lack' of features. In my limited experience, I have observed people coming from C (and such) background have no problem with sql, while java developers (and this is probably true for most developers working on web-based applications) are the worst kind when it comes to understanding even basics of sql. All they want is their objects.

      I strongly believe that a competent programmer designing or developing a system that includes data and data storage should at least know normalization, indexes, and what 3NF means. Programming language is one thing, database is another, and knowledge of both is required to build a decent system.

    7. Re:A time and place for everything by MichaelSmith · · Score: 2, Insightful

      Our own nervous system works as a pretty good associative database, though (in my case at least) it seems to be designed to associate places with objects, ie, it is intended to answer the question "what happened the last time I was here?". So as we develop new applications we tend to develop spatial or geographical models for our data.

      The genetic data you describe is not too different from other things we all have to deal with. Trace or log data. Video streams. Sequences of real time events from practically anything. All of these things consist of partly structured streams from which we need to extract meaning. And yes, for all these things storage in a relational database doesn't add any value.

    8. Re:A time and place for everything by julesh · · Score: 4, Informative

      SQL is great for financial data.

      Actually, this isn't true either. See this article for pointers to some of the failings of SQL in dealing with financial data, particularly time series (e.g. sales figures, share prices, etc.). Here's another take on the problem, which essentially is that SQL doesn't recognise that there can be relationships between the rows of a table (e.g., "this happened after this").

    9. Re:A time and place for everything by diamondmagic · · Score: 5, Informative

      Design an efficient table relating a tree structure.

      Huh? Tree structures are best handled by relational databases, as it is far faster than recursion. Give each row a unique ID and a parent ID and, in addition, a left-hand and a right-hand number. The root node has a left-hand value of 1 and a right-hand value of (number of rows * 2); the first child node has a left-hand value one more than its parent's; and each node's right-hand value is one less than the left-hand value of its next younger sibling.

      Then design queries to answer questions such as:
      * Find the nodes in the subtree under B.

      SELECT * FROM rows WHERE left > [left hand value of B] AND right < [right hand value of B]

      * Find all ancestors of G

      SELECT * FROM rows WHERE left < [left hand value of G] AND right > [right hand value of G]

      * Find the nearest common ancestor of D and H

      SELECT * FROM rows WHERE left < [lowest left hand value from D,H] AND right > [highest right hand value from D,H] ORDER BY right LIMIT 1

      Trees are a well-known problem in SQL, but the fact is that SQL can't handle most data structures and complex relations, only very simple one-dimensional ones.

      Are you saying trees are easy or hard? And for more complex systems, that is what JOINs are for. SQL is by far the most powerful way and often the fastest way to manipulate data that I know of. The only time I can recall that I had to use a non-SQL solution that was faster than the SQL solution was a matrix operation.
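
      The three queries above can be tried out with a small sketch like the following, which uses Python's built-in sqlite3 module. The table name `nodes`, the column names `lft` and `rgt` (used because LEFT and RIGHT are SQL keywords), and the eight-node sample tree are assumptions made for illustration, not part of the original comment.

```python
import sqlite3

# Sample tree (hypothetical): A is the root; B and C are its children;
# D and E are children of B; G and H are children of E; F is a child of C.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, lft INTEGER, rgt INTEGER)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [("A", 1, 16), ("B", 2, 11), ("D", 3, 4), ("E", 5, 10),
     ("G", 6, 7), ("H", 8, 9), ("C", 12, 15), ("F", 13, 14)],
)

def bounds(name):
    """Return the (lft, rgt) pair stored for one node."""
    return conn.execute(
        "SELECT lft, rgt FROM nodes WHERE name = ?", (name,)
    ).fetchone()

def names(sql, params):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql, params)]

# Nodes in the subtree under B: strictly inside B's interval.
b_lft, b_rgt = bounds("B")
subtree_of_b = names(
    "SELECT name FROM nodes WHERE lft > ? AND rgt < ? ORDER BY lft",
    (b_lft, b_rgt),
)

# Ancestors of G: intervals that strictly contain G's interval.
g_lft, g_rgt = bounds("G")
ancestors_of_g = names(
    "SELECT name FROM nodes WHERE lft < ? AND rgt > ? ORDER BY lft",
    (g_lft, g_rgt),
)

# Nearest common ancestor of D and H: the tightest interval containing both.
d_lft, d_rgt = bounds("D")
h_lft, h_rgt = bounds("H")
nearest_common = names(
    "SELECT name FROM nodes WHERE lft < ? AND rgt > ? ORDER BY rgt LIMIT 1",
    (min(d_lft, h_lft), max(d_rgt, h_rgt)),
)

print(subtree_of_b, ancestors_of_g, nearest_common)
```

      Each question is answered by a single range comparison, with no recursion in either the SQL or the Python.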

    10. Re:A time and place for everything by lumbercartel.ca · · Score: 2, Insightful

      PostgreSQL also has specialized index algorithms for handling exotic arrangements of data. One index type that I'll be looking into in the near future [I've been told] makes it possible to efficiently take two dimensional data, and return rows that all fit within the specified radius starting from two coordinates. Although this can be done with a combination of indexes and various formulas, it's not as elegant as what PostgreSQL can do now. So, when I see statements that SQL hasn't progressed, I question the level of expertise of those making such statements.

      As for "all they want is their objects," I think that's true -- just look at all the PHP newbies out there who like to create tables with hundreds of columns instead of taking advantage of one of SQL's greatest hallmarks: Relationships between tables

      Java (with JDBC) and Perl/mod_perl/mod_perl2 (with DBIC, a.k.a., DBIx::Class) should make these folks happy though because they do provide slick OO interfaces to SQL that, unless you keep all your hundred or so columns in a single table, also require at least a basic understanding of SQL.

      Anybody working with databases in a serious way has to have at least a basic understanding of the underlying technology. Creating an alternative to SQL seems ludicrous to me because it will eventually take away from the large pool of SQL expertise that exists today (I never knew about an anti-SQL movement until I read about it here today on SlashDot); many other alternatives do already exist too, such as BTrieve which has a totally different interface from SQL, and although it performs well it's just not as popular anymore (didn't Pervasive, the company that currently owns the BTrieve technology, switch to a customized rendition of PostgreSQL or something like that anyway? I wonder what their real reasons were for focusing on SQL in favour of BTrieve?).

      As a developer, I see performance (speed) as a very important factor, but it's not the only one for me. Quality of code, helpful documentation, and overall system reliability matter too, because they mean easier development with fewer potential problems to deal with in the future if something does fail. Reliability very importantly includes the ability to always recover gracefully from a power outage that occurs while thousands of INSERTs, UPDATEs, and DELETEs are active; in some informal "pull-the-plug" testing many years ago, I confirmed that PostgreSQL and Oracle both do this by issuing ROLLBACKs at database mount time after the OS recovers.

    11. Re:A time and place for everything by raddan · · Score: 3, Insightful

      And to expand on that a little, I think each part of the MVC idiom has its own domain-specific language because those languages are well-suited to those applications. An imperative language with an emphasis on objects (e.g., Java) just doesn't do the same thing that a declarative set-theoretical language (SQL) does. Well, it can, but doing so is a royal pain in the ass. That same imperative language is also total overkill for defining a layout. HTML does that job beautifully and simply.

      There are certainly common CS themes running between all three. We have three languages not because people haven't thought about those things, but because they make our lives easier.

      Whenever I hear people bitching about 'doing away' with SQL, I always wonder what they think is wrong with it. SQL certainly has some limitations, don't get me wrong, but it is a great language for the vast, vast majority of cases. If your application is so specialized that SQL isn't appropriate, well, bravo, but that does not mean that the relational database concept is flawed. Personally, I think if people spent a few moments doing some formal analysis before they built their databases (imagine that, thinking before doing?!), they would find that SQL is a beautiful thing. If your implementation of SQL doesn't cut the mustard, maybe you just need a better query optimizer?

    12. Re:A time and place for everything by allenthelee · · Score: 2, Informative

      That's true, but you should mention that this data representation comes at a tradeoff for update efficiency - insertion of new nodes forces you to update the entire subtree's left and right hand values.

      I wouldn't call SQL the natural way to model a tree data structure. It's possible but it comes with a price.

    13. Re:A time and place for everything by raddan · · Score: 3, Informative

      A basic premise of the relational model is that there is no relationship between rows. So it isn't surprising that SQL can't see any. Maybe you need to organize that data differently? You can solve a lot of problems in SQL using triggers, temporary tables, and the built-in aggregate and sorting functions.

    14. Re:A time and place for everything by lumbercartel.ca · · Score: 2, Interesting

      TIMESTAMPs are very useful for retaining temporal order of data. Those articles don't deal with timestamps at all. If they knew about TIMESTAMPs (like what PostgreSQL has), things could be a lot better.

      Cumulative totals can be achieved as well, either within a SELECT statement (that sums from the starting date to the current date being processed) or with a stored procedure (which I'd prefer to ensure efficiency since I could just keep adding to an internal variable along the way instead of re-summing everything for each row returned).

      Moving averages (with varying temporal ranges), Ranks, etc., can all be handled with a stored procedure using some fairly straight-forward "plpgsql" or other DBMS back-end language. With enough inventiveness, someone could probably figure out how to do the same with a SELECT statement (possibly needing sub-SELECTs).

      Adding features such as "accreting" (see bottom of first article referenced at dbmsmag.com above) seems like a nice idea to me -- I'm certainly in favour of adding more features to an RDBMS since it will make it more useful to people. I worry, however, that those authors are expecting SQL to do more than simply return data sets (which their reporting code should be responsible for formatting for the user) since that would be missing the point of what SQL is for.
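
      As a rough illustration of the cumulative total "within a SELECT statement" mentioned above, here is a sketch using Python's sqlite3 module and a correlated sub-SELECT. The `ledger` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ledger (day TEXT PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO ledger VALUES (?, ?)",
    [("2009-07-01", 100), ("2009-07-02", -30), ("2009-07-03", 45)],
)

# For every row, the sub-SELECT sums all amounts from the first day up to
# and including the day of the row currently being returned.
running_totals = conn.execute(
    """
    SELECT cur.day,
           cur.amount,
           (SELECT SUM(earlier.amount)
              FROM ledger AS earlier
             WHERE earlier.day <= cur.day) AS running_total
      FROM ledger AS cur
     ORDER BY cur.day
    """
).fetchall()

for row in running_totals:
    print(row)
```

      This form re-sums the earlier rows for each row returned, which is the inefficiency the comment suggests a stored procedure with an internal variable would avoid.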

    15. Re:A time and place for everything by smittyoneeach · · Score: 3, Interesting

      It seems like a huge chunk of all programming is like being Gerardus Mercator.
      You've got a bunch of information with one shape, but you really need it in another shape.
      The code is about dealing with the embarrassment of it all.
      If you've got tabular stuff, and you so very often do, the relational model is fantastic.
      If you're dealing with some kind of graph, you're hating life.
      People coming in complaining that their aircraft makes a poor submarine are initially amusing, but become tedious.

      --
      Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
    16. Re:A time and place for everything by TheNarrator · · Score: 5, Insightful
      I think the main problem that the web 2.0 dynamic language crowd has with RDBMSs is that:
      1. Relational data is strongly typed. You cannot easily add new fields to a table or store arbitrary types in a column and expect acceptable performance.
      2. Migrating large amounts of relational data to a new structure takes a very, very long time. Constant refactoring of data models is to be avoided. You have to get it right the first time, or at least very early in the development cycle, to avoid major headaches.
      3. Databases are hard to mock in a testing context. Automated tests can be significantly slowed down with even a test database.
      4. Errors in database architecture are very difficult to correct due to 1 and 2, especially when used with a dynamically typed language.
      5. It's difficult to maintain the data integrity that RDBMSs take for granted in highly scalable distributed systems and still have acceptable performance.

      The only real show stopper, and a real reason to replace RDBMSs, is #5. All the others can be worked around by deeper study of data modeling techniques. Data modeling is not something most developers can figure out intuitively. There is a lot of theory to be learned to do it right, and it can very easily be done badly, leading to severe performance problems and an unmaintainable application.
      With regard to #5: I went to a presentation at JavaOne where some eBay engineers explained that they do not use transactions in any of their database operations. They just leave junk rows around in the db if a transaction half completes, and as long as they aren't reachable they don't consider it a big deal. They have to very carefully organize the order in which they manipulate data to avoid data corruption, but that lets them get around #5.

    17. Re:A time and place for everything by lawpoop · · Score: 3, Informative

      For anyone wondering, parent is talking about a preorder tree traversal algorithm:
      Link 1
      Link 2

      And parent is right. I was doing an adjacency list in MySQL for a while, because I thought that preorder trees were just a little too complicated, but they are *way* easier and more intuitive.

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
    18. Re:A time and place for everything by aztracker1 · · Score: 2, Interesting

      >> Find all ancestors of G

      > SELECT * FROM rows WHERE left < [left hand value of G] AND right > [right hand value of G]

      Now promote a tree node so it's before G...

      --
      Michael J. Ryan - tracker1.info
    19. Re:A time and place for everything by pvera · · Score: 2, Insightful

      As a web programmer, I wanted to take offense at your statement, but something that happened to me only a few weeks ago is making me have a hell of a lot less faith about the available pool of web programmers out there:

      During a round of interviews, we sent out a take-home quiz. We mostly wanted to know if the candidates either knew the actual answers, or could at least google it. One of the questions involved simple aggregates in SQL. Given a table with a unique id and a date of birth, I wanted a query that would produce a list of the months of the year, and how many unique records had a DOB that fell on that month. It's a one-liner.

      One of my candidates wrote TWELVE counting queries, each one counting DOBs between the (hardcoded) start and end of each month, then she used UNION to make it send out the 12 one-row queries as one 12-row query.

      Both of us evaluating the results screamed when we read her answer, and we did not pursue her further. I used to complain that programmers simply didn't give a shit about learning beyond the querying aspects of the RDBMS, which kept us at the mercy of overpaid DBAs. Now? Now we are starting to see that programmers don't even give a shit about learning how to query.
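
      For reference, one possible shape of the "one-liner" is sketched below using Python's sqlite3 module. The `people` table and its sample rows are hypothetical, and `strftime('%m', ...)` is SQLite's way of extracting the month; other databases would typically use something like EXTRACT(MONTH FROM dob) instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, dob TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "1980-01-15"), (2, "1975-01-30"), (3, "1990-03-02"),
     (4, "1988-12-25"), (5, "2001-03-19")],
)

# One grouped aggregate instead of twelve counting queries glued with UNION.
births_per_month = conn.execute(
    "SELECT strftime('%m', dob) AS birth_month, COUNT(*) "
    "FROM people GROUP BY birth_month ORDER BY birth_month"
).fetchall()

print(births_per_month)
```

      Months with no birthdays simply do not appear in this result; listing all twelve would need an outer join against a table of month numbers.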

      --
      Pedro
      ----
      The Insomniac Coder
    20. Re:A time and place for everything by Kjella · · Score: 4, Informative

      1) 1996 called, they want their arguments back. For example, most RDBMS have ranking functions now.
      2) Even in 1996, he doesn't know SQL worth shit

      SELECT (prev.sales + now.sales + next.sales) / 3 three_day_average
      FROM sales prev,
           sales now,
           sales next
      WHERE prev.day_number = now.day_number - 1
      AND next.day_number = now.day_number + 1

      Easy as pie making most of the calculations he wants. Maybe he should ask someone knowledgeable in SQL?
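
      A runnable variant of the self-join above, sketched with Python's sqlite3 module. The table name `daily_sales`, the column `amount`, the aliases, and the sample figures are assumptions chosen for illustration; dividing by 3.0 rather than 3 avoids integer division on integer columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (day_number INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 60), (4, 40), (5, 50)],
)

# Join each day to its immediate neighbours and average the three amounts.
three_day_averages = conn.execute(
    """
    SELECT today.day_number,
           (prev_day.amount + today.amount + next_day.amount) / 3.0
      FROM daily_sales AS prev_day,
           daily_sales AS today,
           daily_sales AS next_day
     WHERE prev_day.day_number = today.day_number - 1
       AND next_day.day_number = today.day_number + 1
     ORDER BY today.day_number
    """
).fetchall()

print(three_day_averages)
```

      The first and last days drop out of the result because they lack a neighbour on one side, so an inner self-join of this kind only covers days that have both.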

      --
      Live today, because you never know what tomorrow brings
    21. Re:A time and place for everything by slashdotwannabe · · Score: 2, Insightful

      I want to say you're breaking fourth normal form, but I can't.

      I want to say you're storing derived data, but I can't.

      I CAN say that data structure is just butt-ass-ugly.

      --
      This comment is my opinion and does not represent an official position of Donald Trump or others I do not work for
    22. Re:A time and place for everything by mcrbids · · Score: 3, Insightful

      Design an efficient table relating a tree structure. Then design queries to answer questions such as:...

      I don't know, but I recall reading that Postgres 8.4 is now out and includes support for recursive queries. (trees) Not sure about the reputation of the blog in question, but you may have heard of it?

      the fact is that SQL can't handle most data structures and complex relations, only very simple one-dimensional ones

      You are kidding, right? Just today I cooked up a 7-table query including 2 subselects, and a left outer join to a meta table consisting of 2 inner joined tables. Total of some 11 tables comprising a highly complex data set. Don't know what you mean by "very simple one dimensional ones" but 11 tables each joined in either a one-to-many or many-to-many mapping provides at least 11 dimensions. (more if you self-join tables, often needed) And this isn't particularly hard for me - often I have joins combining 12 or more very large tables with unrestrained combinations somewhere in the billions to trillions of possibilities that all somehow seem to parse in just a few seconds thanks to a few well-placed indexes and a well-structured query.

      Methinks you don't really understand SQL?
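
      A small sketch of what a recursive query over a plain parent-pointer table looks like. It is run here through Python's sqlite3 module, which accepts the same WITH RECURSIVE form in reasonably recent SQLite versions. The `tree` table and the sample hierarchy are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree (name TEXT PRIMARY KEY, parent TEXT)")
conn.executemany(
    "INSERT INTO tree VALUES (?, ?)",
    [("A", None), ("B", "A"), ("C", "A"), ("D", "B"),
     ("E", "B"), ("F", "C"), ("G", "E"), ("H", "E")],
)

# Start from the direct children of B, then keep joining children of the
# rows found so far until no new rows appear.
descendants_of_b = [
    row[0]
    for row in conn.execute(
        """
        WITH RECURSIVE subtree(name) AS (
            SELECT name FROM tree WHERE parent = ?
            UNION ALL
            SELECT t.name
              FROM tree AS t
              JOIN subtree AS s ON t.parent = s.name
        )
        SELECT name FROM subtree ORDER BY name
        """,
        ("B",),
    )
]

print(descendants_of_b)
```

      Unlike the left/right numbering scheme discussed elsewhere in the thread, this keeps inserts cheap (one row, no renumbering) and moves the traversal work into the query.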

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    23. Re:A time and place for everything by dzfoo · · Score: 3, Insightful

      So what you are saying is that you do not know enough of SQL to understand that query; therefore you are not qualified to comment on the practicality and viability of using SQL for complex structures.

      I suspect the same applies to Perl.

                -dZ.

      --
      Carol vs. Ghost
      ...Can you save Christmas?
    24. Re:A time and place for everything by lawpoop · · Score: 2, Informative

      That's true, but you should mention that this data representation comes at a tradeoff for update efficiency - insertion of new nodes forces you to update the entire subtree's left and right hand values.

      This is not entirely correct. If you want to insert a new node, you need only update the values of the nodes to the right of the insertion point. This covers more and more nodes the closer you get to the left side of the tree. If you wanted to add a new node onto the right of the tree, you need update only one value, the rgt value of the root node.
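
      To make the cost difference concrete, here is a sketch of a nested-set insert using Python's sqlite3 module. The `nodes` table, the `lft`/`rgt` column names, the sample tree, and the helper `insert_last_child` are all assumptions for illustration; the helper returns how many stored lft/rgt values had to be rewritten.

```python
import sqlite3

ROWS = [("A", 1, 16), ("B", 2, 11), ("D", 3, 4), ("E", 5, 10),
        ("G", 6, 7), ("H", 8, 9), ("C", 12, 15), ("F", 13, 14)]

def fresh_tree():
    """Build the sample tree in a new in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, lft INTEGER, rgt INTEGER)")
    conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)", ROWS)
    return conn

def insert_last_child(conn, parent, name):
    """Add `name` as the last child of `parent`; return the number of
    lft/rgt values that were rewritten to make room for it."""
    (parent_rgt,) = conn.execute(
        "SELECT rgt FROM nodes WHERE name = ?", (parent,)
    ).fetchone()
    # Open a gap of two numbers at the parent's old right-hand value.
    moved_rgt = conn.execute(
        "UPDATE nodes SET rgt = rgt + 2 WHERE rgt >= ?", (parent_rgt,)
    ).rowcount
    moved_lft = conn.execute(
        "UPDATE nodes SET lft = lft + 2 WHERE lft > ?", (parent_rgt,)
    ).rowcount
    conn.execute(
        "INSERT INTO nodes VALUES (?, ?, ?)", (name, parent_rgt, parent_rgt + 1)
    )
    return moved_rgt + moved_lft

right_tree = fresh_tree()
rewritten_right = insert_last_child(right_tree, "C", "I")  # rightmost branch
left_tree = fresh_tree()
rewritten_left = insert_last_child(left_tree, "D", "I")    # far-left leaf

print(rewritten_right, rewritten_left)
```

      In this sample tree, inserting on the rightmost branch rewrites only the right-hand values of the new node's ancestors, while inserting under the far-left leaf rewrites values on almost every row.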

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
  3. This is what happens by Anonymous Coward · · Score: 4, Funny

    When you get a lot of morbidly obese nerds with no life to program for you.

    Meanwhile SQL users get laid.

    1. Re:This is what happens by Anonymous Coward · · Score: 5, Funny

      It's true. I do a lot of INNER JOINing. Often with multiple tables.

    2. Re:This is what happens by turing_m · · Score: 3, Funny

      I call BS. You are in your mom's basement with one eye watching the door. While you construct intricate combinations of self-joins. Until your fingers cramp up. Because you are too scared to even query another table let alone join with it.

      --
      If I have seen further it is by stealing the Intellectual Property of giants.
  4. Don't Like Traditional Relational Databases? by ChoboMog · · Score: 3, Funny

    Go fork yourself!

    1. Re:Don't Like Traditional Relational Databases? by CorporateSuit · · Score: 2, Funny

      It seems an idiot has modded you down because they don't understand very basic database expressions.

      No need to get mad at Slashdot's mod point system, because, after all, if they outlaw giving mod points to stupids, then only stupid outlaws will have mod points... or something like that.

      --
      I am the richest astronaut ever to win the superbowl.
  5. Tilting at windmills by Anonymous Coward · · Score: 5, Insightful

    Seems to be a silly thing to be against. Relational databases and the structured query language may not be perfect, but I bet these people could die in their 90's and people will still be using relational dbs and sql.

    If you want to tout open or cheap dbs and more lightweight types of storage/db servers, then they might have some points, but being against sql is just plain dumb.

    1. Re:Tilting at windmills by Qzukk · · Score: 5, Insightful

      SQL isn't the only way possible to query relational databases. It's nice and does a really good job for even mildly complex queries and I would not want to ditch it just yet, but seriously... who hasn't had a business need for multiple levels of aggregates (eg averages of sums across multiple groupings, say "average across all customers' total balances")? As it is, you end up splitting the logic between the database and the application, or creating a view of the first level of aggregation, then querying against that and hoping that the performance doesn't suck total ass.
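
      For comparison, databases that accept a sub-SELECT in the FROM clause can express the "average across all customers' total balances" example in one statement, without a named view. A minimal sketch using Python's sqlite3 module follows; the `invoices` table and its rows are hypothetical, and whether this performs acceptably on a given RDBMS is a separate question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (customer TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [("acme", 100), ("acme", 50), ("globex", 30),
     ("initech", 20), ("initech", 100)],
)

# Inner query: first level of aggregation (total balance per customer).
# Outer query: second level (average of those totals).
(average_total,) = conn.execute(
    """
    SELECT AVG(per_customer.total)
      FROM (SELECT customer, SUM(balance) AS total
              FROM invoices
             GROUP BY customer) AS per_customer
    """
).fetchone()

print(average_total)
```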

      --
      If I have been able to see further than others, it is because I bought a pair of binoculars.
    2. Re:Tilting at windmills by profplump · · Score: 2, Insightful

      I agree, there are problems SQL doesn't solve well. But I think it's unlikely that other, better solutions to those problems will also be superior to SQL where it *does* perform well. As such, "no SQL" is probably not the right plan any more than "SQL only".

    3. Re:Tilting at windmills by Strudelkugel · · Score: 3, Informative

      OLAP was designed to answer that type of question. MDX is the language used to perform multi-dimensional queries.

      --
      Imagine how much harder physics would be if electrons had feelings! -Feynman, maybe
    4. Re:Tilting at windmills by TheRaven64 · · Score: 2, Interesting
      I was at an HPC talk a few years ago, where the speaker said:

      I don't know what kind of language people will be using for HPC programming in 20 years. I don't know the features it will have. I do know that it will be called Fortran.

      I wouldn't be surprised if the same applies to SQL. The language has evolved a lot over the years, to better express different kinds of data. In 20 years time, I wouldn't be surprised if the most commonly-used subset of SQL is nothing like the subset currently popular, but I'd be surprised if the thing to replace SQL isn't called SQL.

      --
      I am TheRaven on Soylent News
  6. Flat Earth by Seumas · · Score: 3, Insightful

    I've seen strong reactions from various camps with regard to concern over saying no to SQL. I'm not sure why people freak out over it. First, you have to strike out toward new things if you want to progress the world. Second, SQL hasn't caused people to stop using spreadsheets or Access databases. Third, there are groups that get together to dispute that the earth is round; insisting that it is flat. Or that gray aliens are visiting earth regularly and probing our anuses.

    Bring on the next fascinating data technology. SQL will continue to have a major place for many years to come, no matter what happens.

    1. Re:Flat Earth by syzler · · Score: 3, Interesting

      I've seen strong reactions from various camps with regard to concern over saying no to SQL.. Third, there are groups that get together to dispute that the earth is round; insisting that it is flat.

      Corporations represented in this group included the likes of Google, Last.fm, Amazon, and Facebook. Hardly the same caliber of people who claim the earth is still flat. I'm inclined to listen to engineers from these companies if they say that an SQL database does not scale well for vast amounts of data.

    2. Re:Flat Earth by MightyMartian · · Score: 2, Interesting

      The whole thing is just reactionary mumbo-jumbo. There are kinds of data that relational databases are fantastic for, and kinds of data they're not, and sometimes none of it is exactly perfect. SQL is actually a pretty damned good, single-purpose language. It's not hard to learn, and once you learn it, the differences between RDBMS implementations become a little like Javascript, just something you have to put up with, not that a lot of people actually have to worry all that much about writing fully-portable SQL queries.

      --
      The world's burning. Moped Jesus spotted on I50. Details at 11.
    3. Re:Flat Earth by MightyMartian · · Score: 5, Insightful

      And yet where are the other corporations: the oil companies, the banks, the large merchant conglomerates? In IT we seem to have this sort of myopic view that if it isn't an IT company of some kind, it doesn't exist. Google, as compared to the huge companies that use tools like Oracle, is a bit player. I know that's hard to deal with for all of us who have sucked at the teat of Silicon Valley for so long, but a significant amount of data processing that has nothing to do with social networking and finding pr0n goes on and does use tools like SQL.

      --
      The world's burning. Moped Jesus spotted on I50. Details at 11.
    4. Re:Flat Earth by Clover_Kicker · · Score: 2, Funny

      You're not going to get many page hits with an attitude like that...

    5. Re:Flat Earth by Vellmont · · Score: 3, Informative

      And so you're saying this is all the fault of the relational database, and would all be solved by using some sort of object based database? That's the topic at hand here, not developers dealing with legacy systems patched together.

      --
      AccountKiller
    6. Re:Flat Earth by Threni · · Score: 2, Informative

      > Second, SQL hasn't caused people to stop using spreadsheets or Access databases

      If it weren't for SQL there wouldn't be any Access databases...

  7. The problem is performance not SQL by presidenteloco · · Score: 3, Interesting

    The problem is the performance of transactions and persistence and distribution of data techniques, not
    whether we are using a logic-like STRUCTURED QUERY LANGUAGE to ask for data matching certain conditions.

    The latter is still, and will continue to be, very useful.

    It's just that now that we can assume local clusters and WANs worth of co-operating data stores, there
    are probably better, more performant ways of implementing persistence, replication, distribution of data
    than traditional RDBMS implementations.

    The two concerns: The logical model of how we QUERY for data (or combine it in bulk), which is the core of SQL,
    and how we persist it and retrieve it quickly, now have more options for being separated.
     

    --

    Where are we going and why are we in a handbasket?
    1. Re:The problem is performance not SQL by oGMo · · Score: 4, Insightful

      It's just that now that we can assume local clusters and WANs worth of co-operating data stores, there are probably better, more performant ways of implementing persistence, replication, distribution of data than traditional RDBMS implementations.

      You can also assume magical fairy dust and free energy, but that doesn't make it so. You can ask if there are better ways, but you can't assume it, and in the end you will find there is no magic.

      Clusters and replication are NOT NEW. Not even remotely new. There is, in fact, nothing new architecturally at all that would indicate some new capability that hasn't already been repeatedly analyzed and tried. That doesn't mean you can't tweak something for a situation, or that you need a giant Oracle database for everything, but "the web" and "cheap hardware" change the equation by precisely nothing.

      What has changed the equation is cheap, unimportant data, which covers the majority of the web. "Real" applications, where data integrity is important (like say, your bank account), and immediate accuracy guaranteed, require the main thing you use a database for: data integrity. Your facebook page, your google search, that blog entry, or some video on youtube: these don't matter. If it's a little slow, or doesn't update immediately, or you get an error, no one is losing money. No one cares.
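
The bank-account case can be made concrete. The following is only a minimal sketch, assuming Python's bundled sqlite3 module and a made-up `accounts` table with a hypothetical `transfer` helper; it shows the kind of all-or-nothing guarantee being described, not how any real bank does it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False


ok = transfer(conn, "alice", "bob", 30)
overdrawn = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

The second call credits bob first and only then fails on the debit; the rollback is what keeps that half-finished credit from sticking.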

      In essence, if a reliable database isn't important for your app, your app isn't really handling important data. This may be fine; in the mainstream, there's a lot of noncritical stuff. But this doesn't make databases unimportant.

      --

      Don't think of it as a flame---it's more like an argument that does 3d6 fire damage

  8. Not mutually exclusive by JobyOne · · Score: 3, Insightful

    It's pretty easy to say "yes" to alternatives without saying "no" to SQL.

    Just because a crowbar can pull out a stubborn nail better doesn't mean they should replace all the hammers. Then what would we put nails in with? Different tools for different jobs.

    --
    Porquoi?
  9. Yeah, so why are they better? by Anonymous Coward · · Score: 5, Insightful

    If I was to read the article, I bet somewhere someone would be wittering on about Key Value Datastores.

    The brainchild of a generation brought up on high level collections, they learn one (in this case Map) and apply it to everything.

    Sadly SQL, and RDBMS, works for most people. It maps object data well (oh whaaaa, i have to do foreign keys - GROW SOME FUCKING BALLS YOU LAZY GRADUATE!) and it is well understood. And with abstractions like LINQ to query them, even the lazy dumb Windows .NET programmer doesn't have to strain their brain to learn SQL.

    And when you have terabytes of specific unique data, you clearly should go away to work out how best to store it. Even an RDBMS/SQL solution is too generic for all problems.

    1. Re:Yeah, so why are they better? by fabs64 · · Score: 3, Insightful

      Saying RDBMSs map object data well is a bit of a stretch; they map relational data well and that's it.

      http://www.codinghorror.com/blog/archives/000621.html for some good background on the problems.

      For me, using an RDBMS as the persistence layer for an object-oriented application has ALWAYS felt like a bit of a kludge. Like we're using it just because it's what we have, rather than the best tool for the job.

    2. Re:Yeah, so why are they better? by kage.j · · Score: 2, Interesting

      Linq to SQL/Entities (on your entity provider) has its benefits and downfalls.

      But damn, Linq to everything else fucking rocks faces, and anyone who says otherwise seriously needs to buy a linq book and actually use the shit. Linq to XML/collections .. I don't know where I'd be without it. And I don't want to know!

      Yeah, linq is handy with Entities, but you run into a whole mess of problems if you aren't careful with it. (And people who don't understand relational databases should stay away from it.)

      At least, that is my opinion...but don't take it too seriously

      --
      he demonstrated by A plus B minus C divided by Z that the sheep must be red, and die of the rot
  10. What's the benefit exactly? by SendBot · · Score: 3, Insightful

    I'm not seeing anything that offers a real advantage over using advanced features like one finds in postgres combined with memcached. Some of my program likes to think of its data as a structured object while other parts like seeing that data as rows in a table (they even link up to other tables through foreign keys!).

  11. How about saying yes to the alternative by syousef · · Score: 4, Insightful

    Saying no to SQL and relational databases is just fine if you've got something better to replace it with. However I know of no such thing. The reason they're popular is that they are so powerful for data storage. If something better came along you wouldn't even need to say no to SQL. You'd just say yes to the newer better rival.

    --
    These posts express my own personal views, not those of my employer
    1. Re:How about saying yes to the alternative by g2devi · · Score: 2, Informative

      Sure. There are several.

      If you do clinical work, you're fairly familiar with EAV databases:
      http://en.wikipedia.org/wiki/Entity-attribute-value_model

      and The Associative Model of Data:
      http://www.lazysoft.com/docs/other_docs/AMD.pdf

      These data models are best when either your schema is inherently hazy (e.g. in the case of patient information) or where the schema is so big that it's impossible to manage (e.g. enterprise data warehousing).
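
For readers who haven't met EAV, here is a rough sketch of the idea, assuming Python's sqlite3 module; the `eav` table, the `load_entity` helper and the patient attributes are all invented for illustration. Each fact is one (entity, attribute, value) row, so a new kind of attribute needs no schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One generic table instead of one column per attribute (hypothetical layout).
conn.execute("CREATE TABLE eav (entity TEXT, attribute TEXT, value TEXT)")
conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", [
    ("patient-1", "blood_type", "O+"),
    ("patient-1", "allergy", "penicillin"),
    ("patient-2", "blood_type", "A-"),
    ("patient-2", "pacemaker_model", "X100"),  # attribute only one patient has
])


def load_entity(conn, entity):
    """Reassemble one entity's sparse attributes into a dict."""
    rows = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ?", (entity,))
    return dict(rows)


p1 = load_entity(conn, "patient-1")
p2 = load_entity(conn, "patient-2")
```

The flexibility has a price: every value is stored as text here, and the database no longer enforces which attributes an entity must have.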

  12. SQL is not a database by j.+andrew+rogers · · Score: 5, Insightful

    SQL is not a database, it is a standard interface to a feature set commonly associated with relational models. Before everyone standardized on SQL, there were other relational query languages. The "No" part of "NoSQL" refers to the fact that some basic elements of relational implementations cannot be usefully expressed using a much simpler distributed hash table model.

    All "NoSQL" does is eliminate all the parts of traditional relational databases that do not scale -- discarding the bottleneck rather than fixing it. These are things like joins and external indexing. Unfortunately, discarding those things means you discard a lot of very important functionality as a practical matter, notably the ability to do fast, complex analytics. Adopting the NoSQL architecture runs contrary to the trend toward more real-time, contextual analytical processing. There are a great many analytical applications that are not amenable to batch-mode pattern-matching, and the NoSQL model is a lot less applicable than I think some people want to acknowledge. In its domain, it is a great tool but it has many, many prohibitive limits. We are essentially trading power for scale.
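
To make the trade concrete, here is a toy sketch in Python in which plain dicts stand in for the buckets of a key-value store; the data and the `total_by_user_name` helper are hypothetical. Single-key lookups are trivial, but anything resembling a join becomes application code that scans one bucket and looks up the other.

```python
# Two "buckets" of a key-value store, modeled as plain dicts (an assumption;
# a real distributed hash table would spread these across machines).
users = {
    "u1": {"name": "ann"},
    "u2": {"name": "bo"},
}
orders = {
    "o1": {"user": "u1", "total": 20},
    "o2": {"user": "u1", "total": 5},
    "o3": {"user": "u2", "total": 7},
}


def total_by_user_name(users, orders):
    """Hand-written stand-in for a JOIN plus GROUP BY over the two buckets."""
    totals = {}
    for order in orders.values():            # full scan of one bucket
        name = users[order["user"]]["name"]  # one key lookup per order
        totals[name] = totals.get(name, 0) + order["total"]
    return totals


totals = total_by_user_name(users, orders)
```

In SQL the same question is one declarative statement; here the access path is fixed in code and every new question means a new function.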

    That said, do not take this as an endorsement of traditional SQL relational databases either, as they have a number of serious limitations themselves. As just mentioned, a number of the core analytical operations those models support are based on algorithms that scale poorly. The SQL language itself has mediocre support for many abstract data types (e.g. spatial) and data models (e.g. graph), which in part reflects the inadequacies of the assumed underlying database algorithms (e.g. B-trees) that are implicit in SQL. The inability to efficiently do event-driven/real-time applications is also more a reflection of the access methods used in databases than any intrinsic weakness in SQL; SQL may be clunky for that purpose, but that is not the real limiter.

    A truly revolutionary deviation from SQL would usefully implement a superset of the features SQL supports, not take them away. Of course, we would need access methods more capable than hash tables and B-trees to usefully implement those features, which is a lot more work than discarding features that scale poorly. NoSQL is a stopgap technical measure for that small subset of applications where the serious tradeoffs are acceptable.

  13. Pros & Cons of non-relational solutions by kpharmer · · Score: 5, Interesting

    Note that most of these solutions come from the interwebs, social networks, etc. And it isn't so much anti-sql as it is anti-relational database (sql != rdb).

    The basic premise is that we need different solutions that: can scale very high for very narrowly scoped reads & writes, don't need to perform ranged queries / reporting /etc, and don't need ACID compliance. And that may be the case. Sites like slashdot, facebook, reddit, digg, etc don't need the data quality that ebay needs.

    On the other hand, ebay achieves scalability AND data quality with relational databases. And when I've worked with teams whose architectures scale massively and avoid the relational trap for better solutions, they inevitably later regret the lack of data quality and the complete inability to actually get trends and analysis out of their data. It *always* goes like this:
        Me: So, is this thing (msg type, etc) increasing?
        Developer: No idea.
        Me: Ok, so let's find out.
        Developer: How?
        Me: I don't know - typical approach - let's query the database.
        Developer: It'll take four+ hours to write & test that query and then days to run. And when it's done we might find that we wrote the query wrong.
        Me: What?!?
        Developer: We had to do it this way, you can't report on 10TB databases anyhow.
        Me: What?!? Are you on crack? There are dozens of *100TB* relational databases out there that people are reporting on.
        Developer: Well, we probably don't need to know what that trend is anyhow.
        Me: I'm outta here
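
For scale, the "is this msg type increasing?" question is roughly the following in a relational store. This is just a sketch, assuming Python's sqlite3 module and a made-up `messages` table with invented rows; it says nothing about how such a query behaves at 10TB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per message, with the day it arrived.
conn.execute("CREATE TABLE messages (day TEXT, msg_type TEXT)")
conn.executemany("INSERT INTO messages VALUES (?, ?)", [
    ("2009-07-01", "alert"), ("2009-07-01", "info"),
    ("2009-07-02", "alert"), ("2009-07-02", "alert"),
    ("2009-07-03", "alert"), ("2009-07-03", "alert"),
    ("2009-07-03", "alert"), ("2009-07-03", "info"),
])

# Daily counts for one message type, oldest day first.
trend = conn.execute(
    "SELECT day, COUNT(*) FROM messages "
    "WHERE msg_type = ? GROUP BY day ORDER BY day",
    ("alert",),
).fetchall()

# True when every day has strictly more alerts than the day before.
increasing = all(a[1] < b[1] for a, b in zip(trend, trend[1:]))
```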

    1. Re:Pros & Cons of non-relational solutions by GryMor · · Score: 2, Informative

      Yah, good luck with that. Unless the index already exists to do nearly exactly what you want, queries against multi terabyte production oracle tables have this bad tendency of never completing, if you are lucky, and (effectively, from the perspective of the app running on top of it) taking down the database if you are not so lucky.

      If the index doesn't exist, good luck adding it in less than a week or two.

      For the most part, novel trend and behavior information is trivial to instrument in the service layer as a side effect of the app's normal operation, at which point you record it in your query logs and build reports based on the logs.
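
A rough sketch of that log-based approach, in plain Python; the `handle_message` handler, the log line format and the sample events are all assumptions made for illustration. The handler appends a line as a side effect of doing its work, and the report is built from the log alone.

```python
from collections import Counter

query_log = []  # stands in for an append-only log file


def handle_message(msg_type, day):
    """Hypothetical service-layer handler: do the work, then log it."""
    # ... the app's normal operation would happen here ...
    query_log.append(f"{day} msg_type={msg_type}")


def report(log_lines):
    """Count events per (day, msg_type) without touching the database."""
    counts = Counter()
    for line in log_lines:
        day, field = line.split(" ", 1)
        counts[(day, field.split("=", 1)[1])] += 1
    return counts


for day, msg_type in [
    ("2009-07-01", "alert"),
    ("2009-07-02", "alert"),
    ("2009-07-02", "alert"),
    ("2009-07-02", "info"),
]:
    handle_message(msg_type, day)

alert_counts = report(query_log)
```

The catch is that this only answers questions someone thought to instrument beforehand.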

      --
      Realities just a bunch of bits.
  14. Hogwash! by gbutler69 · · Score: 2

    Check out "Window Aggregates" etc in Oracle and PostgreSQL 8.4

    --
    Over-the-top Response Guy! Giving "Over-the-Top Responses" since 1970.
  15. Data out-lives applications by 4to6Offshore · · Score: 5, Insightful

    First: my mantra: Data belongs to the organization, not the application... if the app fails and data is accessible then we all go on - if the data fails or is locked away - what was the point of the app again?

    In a SQL database the data is understood by the organisation, DBAs and data architects. If left to app developers taking an app-centric approach to data... I get nervous quickly.

    So long as the data is just as definable and accessible as in current SQL databases, all good - give me an app with some odd-ball storage and it is bye-bye.

  16. Cartesian Product? by gbutler69 · · Score: 2, Insightful

    Epic Fail. You're wrong. It in no way results in a "Cartesian Product". That would be a "Cross Join", not an "inner join". From my experience, people who complain about SQL and relational databases are, for the most part, ignorant. They really don't even understand what they are saying or what they are talking about. I've seen so much abuse and misunderstanding of relational data and SQL in my career, that I just have to laugh at this sort of thing.
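
The distinction is easy to check for yourself. A minimal sketch, assuming Python's sqlite3 module and two invented tables: the cross join returns every pairing of rows, while the inner join returns only the pairs that satisfy the join condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: 3 authors, 3 books, one author with no books.
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
INSERT INTO authors VALUES (1, 'ann'), (2, 'bo'), (3, 'cy');
INSERT INTO books VALUES (1, 1, 'A'), (2, 1, 'B'), (3, 2, 'C');
""")

# Cartesian product: every author paired with every book.
cross_rows = conn.execute(
    "SELECT COUNT(*) FROM authors CROSS JOIN books").fetchone()[0]

# Inner join: only pairs where the book actually belongs to the author.
inner_rows = conn.execute(
    "SELECT COUNT(*) FROM authors "
    "INNER JOIN books ON books.author_id = authors.id").fetchone()[0]
```

With 3 authors and 3 books the cross join yields 9 rows and the inner join yields 3.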

    --
    Over-the-top Response Guy! Giving "Over-the-Top Responses" since 1970.
  17. RDBMS and application logic by gd2shoe · · Score: 4, Insightful

    That is one view. It's nice and all, but incomplete. The issue is performance.

    Any time you're dealing with a large quantity of data, it's always easiest to process or filter where it's located. Transmitting it, processing it, and transmitting back changes adds an unreasonable amount of overhead. Hence, SQL is a "Query" language. In other words, you have the RDBMS do reasonable data processing and filtering of records for you. Your application should only need to specify the operations performed, and should only process data if your computation is particularly unusual. This makes feasible computations that would otherwise be entirely unreasonable. (note that an application working on the same machine generally has the same issue as one working on a separate system. SQL servers present the application with a stream of data - pipe, socket, etc)
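
A small sketch of the difference, assuming Python's sqlite3 module and a made-up `sales` table; both helper functions are hypothetical. The first version pulls every row across the boundary and filters in the application, the second lets the database filter and sum so that only one value comes back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10), ("west", 20), ("east", 5), ("north", 7)])


def total_in_app(conn, region):
    """Fetch every row, then filter and sum in application code."""
    rows = conn.execute("SELECT region, amount FROM sales").fetchall()
    return sum(amount for r, amount in rows if r == region)


def total_in_db(conn, region):
    """Let the database filter and aggregate; one row comes back."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE region = ?",
        (region,)).fetchone()
    return row[0]


east_app = total_in_app(conn, "east")
east_db = total_in_db(conn, "east")
```

Both return the same answer; what changes is how much data has to move to get it, which is the overhead at issue once the table is large.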

    My opinion: SQL is horrendous. It's a pain to use, and many basic data transforms cannot be described in that language (at least without some huge, awful, convoluted command == maintenance nightmare).

    --
    I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
  18. I don't understand by 93+Escort+Wagon · · Score: 4, Funny

    So a bunch of Excel users got together for dinner in San Francisco - why is this news?

    --
    #DeleteChrome
  19. SQL is to data query languages by jd2112 · · Score: 2, Insightful

    ...What democracy is to methods of government.

    The worst ever devised excepting everything else that has ever been tried.

    --
    Any insufficiently advanced magic is indistinguishable from technology.
  20. How hard is it to understand? by dread · · Score: 2, Insightful

    Use the appropriate tool. Always. There are tons.

    Don't use a relational database to try to represent hierarchical data. Don't try to use LDAP to do analytics. Think of the performance implications before you have more than two users accessing your system. Data storage is a very different animal, you are often (though not always) I/O bound. This is very different from being limited by the number of instructions you can deal with per unit of time. Don't think otherwise because it will bite you in the ass.

    And still I see people making the same stupid mistakes over and over. But it's pretty simple really:

    A solution designed to be generic will ALWAYS be slower than a solution that is customized. This shouldn't be surprising. If you have serious performance requirements (ESPECIALLY if they are coupled with huge amounts of data) then a custom solution is definitely something you should look into. At some point you will run into a brick wall and find out that there is stuff you can't do with the solution you have in place. This is natural. Custom solutions to hard problems always lead to restrictions in terms of future features. Always. You will NEVER be able to anticipate all features that you would like to have. (Yes, this is true for Google as well. No they don't have any special kind of magic dust that they sprinkle on their things there, they do the best they can and then they get bitten in the ass too, just like everybody else.)

    --
    I've had a wonderful time, but this wasn't it -- Groucho Marx
  21. The RDBMS responds to the troll by smittyoneeach · · Score: 5, Funny

    See, I don't think there is ever a good time or place for SQL.

    SELECT text FROM mild_introductory_statements WHERE id=random();

    Anyone who says so has never had to use it.

    SELECT text FROM statements_indicating_superior_experience WHERE id=random();

    I like to compare it with JavaScript.

    SELECT text FROM unrelated_tool WHERE id=random();

    It's a language that is difficult to refactor, maintain, and while it's a standard, the standard is so vague that it's useless.

    SELECT text FROM seemingly_valid_yet_unsubstantiated_objections WHERE id=random();

    Like JavaScript, people are trying to build other languages on top of it to hide its shortcomings -- for javascript you have tools like GWT, and for SQL you have HQL, Linq, etc.

    SELECT text FROM wrongheaded_causal_analysis WHERE id=same_one_as_two_queries_ago();

    Not to say that there is anything wrong with relational databases, we just lack a good tool to interface with them.

    SELECT text FROM reasonable_sounding_parthian_shot_to_obscure_trolling WHERE id=random();

    --
    Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
    1. Re:The RDBMS responds to the troll by Rich0 · · Score: 2, Informative

      I hate to nitpick, but:

      1. Your database needs normalization. Almost all of that data should have been in one table, using fields to indicate what kind of statements they were.

      2. Your use of random() isn't going to work - unless the random number generator just happens to generate a number that just happens to be the ID of one of your records. The typical way to do what you're doing is to order by random() and limit the query to one record.
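
Both nitpicks fit in a few lines. This is only a sketch, assuming Python's sqlite3 module; the single `statements` table, its `kind` column, the sample rows and the `random_statement` helper are invented here to mirror the suggestion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One table with a field for the kind of statement, per point 1.
conn.execute(
    "CREATE TABLE statements (id INTEGER PRIMARY KEY, kind TEXT, text TEXT)")
conn.executemany("INSERT INTO statements (kind, text) VALUES (?, ?)", [
    ("mild_introductory", "See, I don't think..."),
    ("mild_introductory", "Honestly, in my view..."),
    ("superior_experience", "Anyone who says so has never had to use it."),
])


def random_statement(conn, kind):
    """Pick one random row of the given kind, per point 2."""
    row = conn.execute(
        "SELECT text FROM statements WHERE kind = ? "
        "ORDER BY random() LIMIT 1",
        (kind,)).fetchone()
    return row[0] if row else None


pick = random_statement(conn, "mild_introductory")
missing = random_statement(conn, "no_such_kind")
```

Ordering by random() sorts every matching row, so it suits small tables like this one better than very large ones.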

  22. Seriously misguided by Stu+Charlton · · Score: 2, Interesting

    Trash SQL in favour of coding all your data access needs. Welcome back to 1973, guys.

    It's not like we couldn't do parallel SQL in the 1980s. Or that you can't do parallel SQL in a compute cloud today.

    No, it basically seems like they don't want to pay software vendors any money for database technology. That's mostly what the arguments boil down to. Oracle RAC is very scalable, arguably easier to do at massive scale than MySQL - but you have to pay Oracle money. For an Internet startup, I can understand why you'd take your chances with "roll your own". For an enterprise... I think not.

    --
    -Stu
  23. I see SQL as more of a "protocol" by TheLink · · Score: 2, Interesting

    You can use SQL with flat files.

    SQL is going to be around for a long time, because it's useful as an "API" - as a protocol or layer of abstraction.

    Programmers can write all sorts of programs in all sorts of programming languages and then use SQL to talk to the DB. If the DB changes a bit, they can often use the same SQL or modify it slightly.

    You often see lots of grumbling and cursing in various companies because people actually end up doing that and companies end up with lots of stuff hooked to the DB - MS Access, perl, python, ruby, java, radius servers, openvpn, accounting and finance stuff...

    They grumble, but the fact is the database is being used. The data has become more useful.

    If you have your database locked up behind some newfangled protocol that only 20 people in the world know, it's not going to be as easy to do that - and often each bunch will start creating their own databases and you end up with a different mess, and a mess that's not as useful.

    Having everyone use SQL to talk to the DB is not actually a bug; it's a feature.

    One man's impedance mismatch is another man's layer of abstraction.

  24. Re:What? by SQLLord · · Score: 3, Insightful

    How many Googles or Yahoos are there? Like, 5. Let them do whatever broken things they want -- it works for them... for now. It's still expensive, probably just as much as "big iron". Not to mention the countless engineer hours and hosting/electricity costs for their "scale out" systems. It's what happens when you let a bunch of ivory tower PhDs solve real engineering problems.

    In the end, the rest of us serious enterprise engineers will allow Oracle, Microsoft, and the people who have been doing this for 30 years to optimize their code to run on multicore mainframes ... which is where massive computing belongs. Then we query it with a few lines of SQL instead of convoluted algorithms in some "Map Reduce" environment, and we move on with our lives.

  25. Ditch SQL, not Relational DBs! by Lord+Bitman · · Score: 2, Interesting

    SQL syntax sucks, is inconsistent, and just non-standard enough at its corners that it's completely annoying to write anything for more than one DB. Also lacks various features which logically _should_ be there, because of the relational back-end. SQL is a toy, and though I'm the guy everyone in the office turns to if they want to write a query that does more than SELECT * FROM sometable, that doesn't mean I have to like it.

    But that's not the fault of relational databases. The relational logic makes sense, and we'll be seeing it referenced in countless "new ideas" that come along for years, just as ideas which Lisp already had in 1970 will be touted as new features for the next millennium (you hear? PHP can do Lambda functions as of yesterday!)

    SQL sucks, but SQL is NOT what makes something relational.

    --
    -- 'The' Lord and Master Bitman On High, Master Of All
  26. Re:Relational good, SQL not so by Hognoxious · · Score: 3, Funny

    I'd rather just get the data across a simple protocol (which can be wrapped in a simple API binding for the language in use).

    While you're here, can you fill in the following form.

    I would like my pony to be:
    [ ] female
    [ ] male, entire
    [ ] male, gelded

    with coat colour:
    [ ] white
    [ ] black
    [ ] brown
    [ ] piebald
    [ ] other, please specify _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  27. Cloudscape/Derby by Kupfernigk · · Score: 2, Interesting
    Yes, and Java persistence systems (Hibernate) suck dreadfully; they are a solution for which there is no problem. By the time you've learned the mess that is Hibernate, you can have learned SQL and the Java Collections well enough to be able to knock up any persistence model you need in no time flat.

    Derby 10.5, meanwhile, still has a tiny footprint, and can do most if not all of the SQL you will ever want for a typical Java application, along with features like the ability to do live backups, live table compaction from within the application while running, and now at last the ability to do cursoring in SELECT statements. Installation and configuration are simple.

    I think the actual problem is that we old C programmers learned programming and data structures, and as a result know a lot about the kind of problems for which SQL is well suited, while a lot of modern programmers learn a lot of theory about OO, but don't actually learn to program. Therefore, they have to try to reinvent wheels that were in fact designed in the 70s, and have no idea of what tools are available and how they map onto typical real-world application level problems.

    --
    From scarped cliff or quarried stone she cries "A thousand types are gone, I care for nothing, no not one."
  28. The RDBMS responds to the nitpick by smittyoneeach · · Score: 2, Funny

    SELECT text FROM thank_you_for_sharing_your_views_but_you_have_not_seen_the_schema_my_friend
    UNION
    SELECT text FROM same_goes_for_point_two__if_you_lack_the_source_code_what_then_do_you_really_know
    UNION
    SELECT text FROM besides_it_got_plus_five_funny_so_neener_neener_neener;

    --
    Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear