Slashdot Mirror


Enthusiasts Convene To Say No To SQL, Hash Out New DB Breed

ericatcw writes "The inaugural NoSQL meet-up in San Francisco during last month's Yahoo! Apache Hadoop Summit had a whiff of revolution about it, like a latter-day techie version of the American Patriots planning the Boston Tea Party. Like the Patriots, who rebelled against Britain's heavy taxes, NoSQLers came to share how they had overthrown the tyranny of burdensome, expensive relational databases in favor of more efficient and cheaper ways of managing data, reports Computerworld."

423 comments

  1. Quit Whining by KingPin27 · · Score: 5, Funny

    Just use flat text files --- no need for expensive db's .... think of the freedom!

    --
    "i lost my dignity on a slippery wiener"
    1. Re:Quit Whining by Anonymous Coward · · Score: 4, Insightful

      The horrible lag I get when using address completion in Firefox 3 makes me wish more people thought that way!

    2. Re:Quit Whining by phantomfive · · Score: 1

      This is one of the main objectives of ReiserFS, to make such things easy, a project which unfortunately has run into some difficulty of late.

      --
      Qxe4
    3. Re:Quit Whining by Anonymous Coward · · Score: 0

      Like his wife?

    4. Re:Quit Whining by Anonymous Coward · · Score: 1, Funny

      She was screwing Mark Sanford!

    5. Re:Quit Whining by MichaelSmith · · Score: 2, Funny

      This is one of the main objectives of ReiserFS, to make such things easy, a project which unfortunately has run into some difficulty of late.

      I wonder if I could sneak Hans an eeepc inside a birthday cake...

    6. Re:Quit Whining by rubycodez · · Score: 1

      I've lost data in two filesystems thanks to the Slasher's shoddy work.

    7. Re:Quit Whining by phantomfive · · Score: 2, Insightful

      You didn't learn to back up after the first time?

      --
      Qxe4
    8. Re:Quit Whining by Paradise+Pete · · Score: 4, Funny

      I've lost data in two filesystems thanks to the Slasher's shoddy work.

      Have you looked near Redwood Regional Park? On the side of a hill?

    9. Re:Quit Whining by fooslacker · · Score: 2, Insightful

      I have no idea what rubycodez's experience was but just because I can recover from backup doesn't mean data wasn't lost by the filesystem. More importantly external risk management schemes (i.e. backup) don't relieve the file system of the obligation not to kill the copy of my data I've entrusted it with.

    10. Re:Quit Whining by auLucifer · · Score: 1

      so . hard . to . not . have . dna copy in murdered daughter . joke

      Parent is definitely right. What if the implementation of the filesystem has issues and the data is not correctly saved? It doesn't matter whether you back up or not: if you're using a shoddy filesystem which doesn't save correctly, the data will ultimately be lost.

      --
      If I was witty I'd put something funny here but, as it stands, I am not and have just wasted seconds of your life
    11. Re:Quit Whining by Klinky · · Score: 1

      I'd hope the Eee PC is RoHS compliant =|.

    12. Re:Quit Whining by CyberLife · · Score: 3, Insightful

      Flat files are a perfectly viable option in some circumstances. Not everything requires data uniformity or the ability to run complex ad-hoc queries, nor does everything need information to be controlled by a separate process running on a different machine. Not every system integrates multiple applications through a shared data-store. The NoSQL crowd isn't arguing that SQL is bad, just overused. There are a great many situations where something like flat files or Berkeley DB is more than sufficient, and yet people still use relational technology. In my experience it's generally because that's all they know. In their mind, if one needs to store data one uses SQL. They don't select the right tool for the job because they honestly don't know there are other tools.

    13. Re:Quit Whining by Anonymous Coward · · Score: 0

      At least she got promoted from Ensign!

    14. Re:Quit Whining by VGPowerlord · · Score: 1

      That's because Relational Databases are an abstraction.

      Heck, isn't SQLite essentially a library that uses SQL commands to store to and retrieve data from a flat file?

      --
      GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
    15. Re:Quit Whining by Anonymous Coward · · Score: 0

      Hey, an Excel spreadsheet is a great substitute for a multi-gig DB too.

    16. Re:Quit Whining by jadavis · · Score: 4, Insightful

      One of the reasons is because RDBMSs offer a lot of tools, like atomicity, durability, backup/restore, centralization, point-in-time-recovery, etc. Many application developers need these things without actually needing the abstraction of a relational system.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    17. Re:Quit Whining by Anonymous Coward · · Score: 2, Informative

      The horrible lag I get when using address completion in Firefox 3 makes me wish less people thought that way!

      fixed that for ya. your address completion would probably not lag if there was an efficient data structure behind it, like a btree, like a rdbms would use
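
To make the "efficient data structure" point concrete, here is a minimal sketch of prefix completion over sorted keys. It uses a sorted Python list with binary search as a stand-in for a B-tree index; the `complete` function and the sample `history` list are hypothetical, and this is not a description of how Firefox actually implements completion.

```python
import bisect


def complete(sorted_urls, prefix, limit=5):
    """Return up to `limit` entries that start with `prefix`.

    `sorted_urls` must already be sorted; binary search finds the first
    candidate in O(log n) instead of scanning the whole history.
    """
    start = bisect.bisect_left(sorted_urls, prefix)
    matches = []
    for url in sorted_urls[start:start + limit]:
        if not url.startswith(prefix):
            break  # sorted order means no later entry can match either
        matches.append(url)
    return matches


# Hypothetical browsing history, kept sorted like the keys of an index.
history = sorted([
    "slashdot.org",
    "sqlite.org",
    "slate.com",
    "python.org",
    "slashdot.org/faq",
])

print(complete(history, "sl"))
```

The same ordered-key idea is what lets an index answer "everything starting with X" without touching unrelated rows.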

    18. Re:Quit Whining by knightghost · · Score: 1

      "The right tool for the job" is always worth striving for. However, rhetoric such as "tyranny of slow, expensive relational databases" merely shows incompetence when it comes to understanding what a database is. If you have structured data then there is nothing more efficient, effective, or faster than a relational database. For unstructured data, go Google indexing or Perl heaps.

    19. Re:Quit Whining by Profane+MuthaFucka · · Score: 1

      I judge a filesystem by its fans. By that measure reiserfs is a fanatic asshole. You can be sure that reiserfs didn't lose your data, and if you say so again you might get yelled at.

      --
      Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
    20. Re:Quit Whining by Profane+MuthaFucka · · Score: 3, Funny

      Even flat files are an abstraction. My files are long, skinny, spiral, and spin around like Linda Blair's head.

      --
      Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
    21. Re:Quit Whining by Anonymous Coward · · Score: 0

      The horrible lag I get when using address completion in Firefox 3 makes me wish fewer people thought that way!

      Fixed that for ya. The way you wrote it made it sound like he didn't wish as much that people thought that way.

      (Annoying, isn't it)

    22. Re:Quit Whining by Profane+MuthaFucka · · Score: 1

      Never underestimate the bandwidth of a Honda with no passenger seat, filled with backup tapes.

      --
      Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
    23. Re:Quit Whining by dna_(c)(tm)(r) · · Score: 1

      I really hate SQL. It failed at being a natural language search tool for business people. Others think this too: The Third Manifesto

      With that out of the way, you mention the point-in-time-recovery and that touches on another issue I profoundly dislike with databases. It is possible to go back to a previous state, but then often applications depending on the DB schema break. When managing software versioning, one of the most difficult parts to get right, one of the highest risks in breaking things, is keeping track of the database schema changes.

    24. Re:Quit Whining by dna_(c)(tm)(r) · · Score: 1

      Oh. Forgot tree structures (or anything not tabular) and retrieving them.

    25. Re:Quit Whining by Mjec · · Score: 1

      If your data isn't complex enough to require a RDBMS, you almost certainly don't need a program.

      Ok, so this is an outright lie - if you're storing a list of integers, you don't need a relational database. But what I'm tired of is people not understanding their data. A LOT of data is relational and proper, normalised storage is generally non-trivial. Of course, well-structured data leads to a well-structured program leads to fun buzzwords like extensibility.

      --
      "But everyone should know everything." -markab
    26. Re:Quit Whining by dna_(c)(tm)(r) · · Score: 2, Insightful

      If your data isn't complex enough to require a RDBMS, you almost certainly don't need a program.

      Really? IM, word processor, spreadsheet, vector graphics, photo editor, ... Google probably uses MapReduce without a "normal" RDBMS behind it - is that data complex enough?

    27. Re:Quit Whining by rbrausse · · Score: 1

      huh? I didn't know that EU directives are obligatory for American prisons...

      a huge step forward for Europe - one set of rules to bind them all :)

    28. Re:Quit Whining by Jarlsberg · · Score: 1

      Well, that's why I am so in favor of SQLite. While it does use SQL syntax, in reality you're simply rummaging through a flat file with superspeed. I use SQLite on all my simpler web projects that don't need the advanced functions that you get with a fully fledged relational database.

    29. Re:Quit Whining by jsiren · · Score: 1

      From wikipedia:

      The theme of the manifestos is how to avoid the 'object-relational impedance mismatch'...

      Electrical impedance mismatch between balanced and unbalanced lines is handled by an autotransformer, a balun, short for balanced-unbalanced. What we need, obviously, is an object-relational transformer, an obre, if you will.

      --
      Usage: km/h for speed (kilometers per hour); kph for very slow impulses (kilopond hours).
    30. Re:Quit Whining by TheSunborn · · Score: 1

      Yes, but if you use a relational-to-OO mapper such as Torque or Hibernate, you lose much of the flexibility of SQL without getting the benefit of an OO database. So you get the worst of both worlds. If you know your database is going to store objects, you might as well just use an OO database instead of a relational database with an adaptor.

      The main reason for using a relational database with an OO adaptor (I use Torque) is that relational databases have had the last 20 years of tuning, so they are stable and quite fast.

    31. Re:Quit Whining by pjt33 · · Score: 1

      It makes me wish that Firefox searched whatever it's searching - whether a flat file or a tiny integrated RDBMS - in a separate thread to the one it uses for processing keyboard input and rendering.

    32. Re:Quit Whining by ShieldW0lf · · Score: 1

      How these databases work:

      1) Ask X nodes if they have information that relates to subject Y.
      2) Wait Z period of time for an answer
      3) Take the answers you got from what nodes answered in Z time, form a conclusion

      The way this works in plain English is, poll a million people on a subject, give them a time limit to answer, ignore those who don't answer in the time limit, form their collective opinions into a result, call it a day. If 10,000 of those people on your list are dead and 50,000 are in jail and 30,000 didn't get the message and none of those people answered, you don't know, you don't care.
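
The three steps above can be sketched as a scatter-gather call with a deadline. This is a minimal illustration only: the `ask_node` function, the node list, and the delays are all hypothetical stand-ins for network calls, and real systems add retries, quorums, and so on.

```python
import concurrent.futures
import time


def ask_node(node_id, delay, answer):
    """Hypothetical node: takes `delay` seconds to respond with `answer`."""
    time.sleep(delay)
    return answer


def scatter_gather(nodes, timeout):
    """Ask every node in parallel and keep only answers that arrive in time.

    Returns (answers, number_of_nodes_ignored). Nodes that miss the
    deadline are simply dropped: "you don't know, you don't care".
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(nodes))
    futures = [pool.submit(ask_node, *node) for node in nodes]
    done, not_done = concurrent.futures.wait(futures, timeout=timeout)
    answers = [future.result() for future in done]
    pool.shutdown(wait=False)  # do not block on the stragglers
    return answers, len(not_done)


# (node id, seconds until it answers, its partial answer); node "c" is slow.
NODES = [("a", 0.0, 3), ("b", 0.05, 5), ("c", 1.0, 100)]

answers, ignored = scatter_gather(NODES, timeout=0.3)
print(sum(answers), "from", len(answers), "nodes;", ignored, "ignored")
```

Note that the result silently excludes node "c", which is exactly the trade-off the comment describes: an answer on time, without a guarantee that it is complete.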

      If you can re-think your application so it works within this paradigm, this technology will work. If you can't, it won't.

      For things like search engines or suggesting a vaguely related item that you might also want to purchase, this is fine. Anywhere you need to spit out SOME kind of answer, without consistent accuracy being essential, this is a good way to do it.

      It's not going to become a replacement for RDBMS' as a tool. Pitching it that way is stupid. It is a supplemental tool for an entirely different category of problems.

      --
      -1 Uncomfortable Truth
    33. Re:Quit Whining by mjtaylor24601 · · Score: 1

      Yeah, trees in RDBMs can be a PITA. The best way I've seen for handling them is Nested Set Trees, but even that can be a pain if your tree structure gets updated frequently.

      --
      I wish I were as sure of anything as some people are of everything
    34. Re:Quit Whining by mckinnsb · · Score: 1

      This is why developing using a framework that takes database migrations into account will make your life much easier. Either that, or just roll your own migration script. It really isn't that bad, especially if you're writing for only one target database. Then you hook up your versioning system to your framework, and presto, most of your problems will be solved.
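
As a rough idea of what a hand-rolled migration script can look like, here is a minimal sketch against SQLite. It assumes the schema version is tracked in SQLite's `PRAGMA user_version`; the `MIGRATIONS` list, the `users` table, and the `migrate` function are hypothetical, and a framework's migration tool would do considerably more (rollbacks, locking, multiple targets).

```python
import sqlite3

# Ordered list of schema changes; position N brings the schema to version N+1.
# These statements are illustrative only.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "ALTER TABLE users ADD COLUMN email TEXT",
]


def migrate(conn):
    """Apply every migration the database has not seen yet.

    Returns the schema version after migrating. Running it twice is a
    no-op the second time, because the version is stored in the database.
    """
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    pending = MIGRATIONS[current:]
    for version, statement in enumerate(pending, start=current + 1):
        conn.execute(statement)
        # `version` is an int generated above, never user input.
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
print(migrate(conn))  # applies both migrations
print(migrate(conn))  # nothing left to apply
```

Because the migration list lives in source control next to the application code, checking out an old release also tells you which schema version it expects.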

      (Unless of course, you never implemented a versioning system, and never built on or developed your own framework. Then you've got some work to do)

      Also, SQL was never designed to be a natural language search tool for business people. IBM is not in the 'friendly computing' business.

    35. Re:Quit Whining by loufoque · · Score: 1

      Why it lags is funnily enough because they're using a file-based RDBMS instead of a server.

      Indeed, what causes the lag is the constant disk locking and syncing, which isn't needed with a client/server architecture.

      I filed a bug requesting the ability to toggle between SQLite and other RDBMSs, but it was refused.

    36. Re:Quit Whining by loufoque · · Score: 1

      I really hate SQL. It failed at being a natural language search tool for business people. Others think this too: The Third Manifesto

      Business people shouldn't be programming.

      Also, what your link does is say SQL is inappropriate with Object Oriented Programming. Maybe the problem is OOP, actually, that is inappropriate.

    37. Re:Quit Whining by Anonymous Coward · · Score: 0

      Tips to avoid that horrible lag:

      1) Reduce the amount of stuff in your browser history. You really don't need the past five years of your personal browsing history stored. That just makes more work for your computer and also makes it easier for the authorities when the party van comes to pick you up.

      2) Visit a site often? Make a "bookmark". A "bookmark" lets you store the address of a site you visit often and then return to that site later with a single click. Exciting stuff! Having been around since shortly after the invention of the Web Browser this "bookmark" technology is fairly mature and reliable now. Give it a try.

      3) Buy a new freaking computer. Once your Pentium starts getting some numbers equal to 2 or greater after it or maybe even is called some weird name and claims to have Gee-hurts in it then let me tell you that your web browser's performance will really take off.

    38. Re:Quit Whining by umberleigh · · Score: 1

      you kid, but I used to work for a db/information management company that was retarded enough to do this.

    39. Re:Quit Whining by Keeper+Of+Keys · · Score: 1

      Actually Firefox uses SQLite to power the Awesome Bar.

    40. Re:Quit Whining by jadavis · · Score: 1

      But what I'm tired of is people not understanding their data.

      I'd agree with that, but it's pretty vague.

      A lot of developers put too much emphasis on the writing of data somewhere, without much emphasis on how they're going to interpret it later amongst some other few million pieces of data. If you store everything as a text blob, that severely limits your searching options.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    41. Re:Quit Whining by jadavis · · Score: 1

      The theme of the manifestos is how to avoid the 'object-relational impedance mismatch'...

      Having read a lot of The Third Manifesto, I don't think I'd put it that way at all. For one thing, I don't think the authors use the words "impedance mismatch" anywhere. But correct me if I'm wrong.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    42. Re:Quit Whining by jadavis · · Score: 1

      I really hate SQL.

      I tend to agree, but there are no serious implementations of alternative relational languages. NULL in particular is an abomination, and I don't think most developers really understand its behavior.
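
A few of the NULL behaviors that tend to surprise people can be shown in a short sketch. It uses Python's bundled `sqlite3` module with a hypothetical one-column table `t`; the three-valued logic shown is standard SQL behavior, though other engines differ in details not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (None,), (3,)])

# NULL = NULL is not true; it evaluates to NULL (None in Python).
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# A NULL inside NOT IN makes the predicate unknown for every row,
# so no rows come back at all.
not_in_rows = conn.execute(
    "SELECT x FROM t WHERE x NOT IN (2, NULL)"
).fetchall()

# "x <> 1" silently drops the NULL row as well as the row where x = 1.
not_one = [row[0] for row in conn.execute("SELECT x FROM t WHERE x <> 1")]

# COUNT(*) counts rows; COUNT(x) skips NULLs.
count_all, count_x = conn.execute("SELECT COUNT(*), COUNT(x) FROM t").fetchone()

print(null_equals_null, not_in_rows, not_one, count_all, count_x)
```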

      It failed at being a natural language search tool for business people.

      But it succeeded at bringing good implementations of an approximation of relational calculus to a lot of people.

      I would rather that SQL followed some better language design principles, but it's far better than a non-relational language.

      It is possible to go back to a previous state, but then often applications depending on the DB schema break.

      I have no idea what you're talking about here. It's not "I feel like resetting the database state on my production machine". It's point-in-time recovery. That means, if you delete or drop something important, you can restore to the time right before you did that, hopefully on a non-production machine and then you merge the old data in where it's supposed to be.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    43. Re:Quit Whining by jadavis · · Score: 1

      Business people shouldn't be programming.

      Lots of people in business (like accounting, management, and executives) are actually very savvy and able to ask questions very precisely. Particularly if they are just reading data, what's the problem? Not everyone works at a company with a team of DBAs; sometimes the "business people" have to fend for themselves. It's either SQL, or they are downloading the entire thing into spreadsheets. I think SQL is an improvement (of course spreadsheets can still be useful in conjunction with a database system).

      This is, in no way, an endorsement of SQL's attempt at a "natural language". I think that was a bad idea. However, to be so dismissive of business people is quite arrogant.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    44. Re:Quit Whining by jadavis · · Score: 1

      While it does use SQL syntax, in reality you're simply rummaging through a flat file with superspeed.

      1. SQLite is so far from the standard you can hardly call it SQL. I think even MySQL is closer to the standard, at least if you turn on the right set of options.

      2. What makes you think that SQLite is fast? It has a small footprint, and can be fast in some situations, I'm sure, but I wouldn't just assume that it's fast.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    45. Re:Quit Whining by Chelloveck · · Score: 2, Informative

      The horrible lag I get when using address completion in Firefox 3 makes me wish fewer people thought that way!

      fixed that for ya.

      Fixed that for ya!

      --
      Chelloveck
      I give up on debugging. From now on, SIGSEGV is a feature.
    46. Re:Quit Whining by rubycodez · · Score: 1

      sure, I do backups. doesn't keep me from losing the data acquired since last backup

    47. Re:Quit Whining by Hawke666 · · Score: 1

      Pretty sure that was referring to not contaminating the cake with lead (and other hazardous materials). Just because your cake has an eeePC inside it doesn't mean you don't want to eat it!

    48. Re:Quit Whining by rbrausse · · Score: 1

      woosh :P

      btw, I like those crunchy chips in the eeePC - can I have one without all the cake around?

    49. Re:Quit Whining by Sloppy · · Score: 1

      If you want to know real pain, run Firefox when your home directory is NFS-mounted over a busy LAN on an already over-worked server.

      If FF would let me use anything other than my NFS server, I would leap for joy.

      --
      As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
    50. Re:Quit Whining by Anonymous Coward · · Score: 0

      Horrible lag? What are you running? a 386DX?

    51. Re:Quit Whining by CyberLife · · Score: 1

      Many application developers need these things ...

      And many do not.

    52. Re:Quit Whining by CyberLife · · Score: 1

      I think the point many are missing is that software system architecture is not all about the data. It may be a large part of things, but it's by no means the only concern. Walking into a project with the preconception that one requires what a relational system has to offer is rather naive.

    53. Re:Quit Whining by Cico71 · · Score: 1

      FWIW they use the words in the second and third edition (Databases, Types, and the Relational Model) of the third manifesto.

      Anyway, the whole point, let me be a little bit reductive, is that with a properly implemented type system in the dbms, and a proper language (along the lines of the educational Tutorial D), we will have a rich formal model and can live without the collection of best practices that people call object "model" <g> ok, ok, a lot of people will need to get used to relations instead of collections or arrays but that's life ;-)

    54. Re:Quit Whining by dfetter · · Score: 1

      Try Common Table Expressions. All the cool database management systems have them :)
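
For readers who have not met them, here is a minimal sketch of a recursive Common Table Expression walking a tree stored as a plain parent-pointer table. The `nodes` table and its contents are hypothetical, and the example assumes an SQLite build recent enough to support `WITH RECURSIVE` (3.8.3 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id TEXT PRIMARY KEY, parent_id TEXT)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [("A", None), ("B", "A"), ("C", "A"), ("D", "B"), ("E", "B"), ("F", "D")],
)

# Start from the direct children of the given node, then keep joining
# children of anything already found until no new rows appear.
SUBTREE = """
WITH RECURSIVE subtree(id) AS (
    SELECT id FROM nodes WHERE parent_id = ?
    UNION ALL
    SELECT n.id FROM nodes AS n JOIN subtree AS s ON n.parent_id = s.id
)
SELECT id FROM subtree ORDER BY id
"""

descendants = [row[0] for row in conn.execute(SUBTREE, ("B",))]
print(descendants)
```

Unlike nested sets, this needs no renumbering when the tree changes, at the cost of doing the traversal at query time.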

      --
      What part of "A well regulated militia" do you not understand?
    55. Re:Quit Whining by Jarlsberg · · Score: 1

      Yes, that's why I said you can use SQL syntax, but it's not a proper SQL database -- far from it. It's the simplest things that give it away, like the fact that you can define a CHAR field with 10 chars, and SQLite will happily store 255 chars in the field anyway ;)
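
The CHAR(10) behavior is easy to reproduce. The sketch below uses Python's bundled `sqlite3` module with hypothetical tables `t` and `strict_t`; the second table shows one way to get the length limit back with an explicit CHECK constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The declared length is accepted by the parser but not enforced.
conn.execute("CREATE TABLE t (code CHAR(10))")
conn.execute("INSERT INTO t VALUES (?)", ("x" * 255,))
stored_length = conn.execute("SELECT length(code) FROM t").fetchone()[0]

# An explicit CHECK constraint does enforce the limit.
conn.execute(
    "CREATE TABLE strict_t (code CHAR(10) CHECK (length(code) <= 10))"
)
try:
    conn.execute("INSERT INTO strict_t VALUES (?)", ("x" * 255,))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(stored_length, rejected)
```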

      But it is fast. Very fast. I've measured it, and compared it to proper databases. For my main web site http://www.magicode.org/, I use SQLite 3 with prepared statements. It's lightning. (Comments are stored in mysql5 though, as I access it through my admin, which is on another host, and I would have to ftp download and ftp upload the sqlite-base if I wanted to admin it remotely).

      It's an alternative, and a pretty good one, I think.

    56. Re:Quit Whining by Anonymous Coward · · Score: 0

      The on-disk representation is fairly independent of the query/update language. SQL could very easily be implemented on top of tables living in text files. For read-only queries, the lookup could be done fairly easily with existing Unix text processing tools, so an SQL query could be turned into a BASH script using things like sed, sort, and cut.
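
To illustrate the idea that the query language and the storage format are separable, here is a minimal sketch of a SELECT/WHERE/ORDER BY evaluated over a delimited text table. It is written in Python rather than as the shell pipeline the comment suggests, and the `select` helper and the sample `TABLE` text are hypothetical.

```python
import csv
import io

# A "table" living in plain text, header row first.
TABLE = "id,name,score\n1,alice,90\n2,bob,72\n3,carol,85\n"


def select(text, columns, where, order_by):
    """Rough equivalent of SELECT columns FROM text WHERE ... ORDER BY ...

    `where` is a predicate over a row dict; all values are strings,
    as they would be in the file.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    kept = [row for row in rows if where(row)]           # WHERE
    kept.sort(key=lambda row: row[order_by])             # ORDER BY
    return [tuple(row[c] for c in columns) for row in kept]  # projection


# SELECT name, score FROM table WHERE score > 80 ORDER BY name
result = select(
    TABLE,
    columns=["name", "score"],
    where=lambda row: int(row["score"]) > 80,
    order_by="name",
)
print(result)
```

What this sketch leaves out (indexes, concurrent writers, atomic updates) is most of what a database engine actually provides.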

    57. Re:Quit Whining by jadavis · · Score: 1

      But it is fast. Very fast.

      That's interesting. The SQLite website has some very old and very bad benchmarks, so I don't know much about the current performance characteristics.

      I'm a little skeptical that it is faster in a wide range of cases, but if it's faster for most of your queries, I suppose that's all that matters.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    58. Re:Quit Whining by einhverfr · · Score: 1

      Well, the problem though is that it isn't mathematically possible to translate in all cases physical db relations into object data structures. Hence the way ORM's end up skewing development is that they make folks program around the ORM, leading to the worst of both worlds-- dysfunctional relations and objects.

      A better solution would be to make intelligent use of views FIRST and then build some sort of ORM inside the db, based on these views. It can be done, but it isn't a piece of software you just plug in as it requires the real db engineering stuff be done first.

      --

      LedgerSMB: Open source Accounting/ERP
  2. A time and place for everything by Marillion · · Score: 3, Insightful

    There is a time and place for SQL. There is a time and place to avoid SQL.
    SQL is great for financial data. SQL is terrible for genetic data.

    --
    This is a boring sig
    1. Re:A time and place for everything by snl2587 · · Score: 0, Offtopic

      It was the best of times, it was the worst of times,
      it was the age of wisdom, it was the age of foolishness,
      it was the epoch of belief, it was the epoch of incredulity,
      it was the season of Light, it was the season of Darkness,
      it was the spring of hope, it was the winter of despair,
      we had everything before us, we had nothing before us,
      we were all going direct to heaven, we were all going direct the other way

    2. Re:A time and place for everything by Bromskloss · · Score: 3, Insightful

      It would be interesting to hear why this is.

      --
      Swedish plasma phys. PhD student; MSc EE; knows maths, programming, electronics; finance interest; seeks opportunities
    3. Re:A time and place for everything by SendBot · · Score: 0

      I'm just throwing a half-educated guess out there, but genetic algorithms have so many outputs tied back into their inputs, all changing around quite frequently, such that an SQL implementation would be painfully contorted.

      But then, I don't quite see how neural network programs need mass replication the way db's do.

      or would they.... ?

      this is an interesting issue!

    4. Re:A time and place for everything by Carewolf · · Score: 3, Insightful

      Design an efficient table relating a tree structure. Then design queries to answer questions such as:
      * Find the nodes in the subtree under B.
      * Find all ancestors of G
      * Find the nearest common ancestor of D and H

      Trees is a wellknown problem of SQL, but the fact is that SQL can't handle most datastructures and complex relations, only very simple one dimensional ones.

    5. Re:A time and place for everything by MichaelSmith · · Score: 2, Insightful

      It would be interesting to hear why this is.

      My guess would be that because SQL is a Structured Query Language it is best used for handling structured data. If you have serial, unstructured data you have to invent your own format for it to use inside the database, and then the query language isn't helping you.

    6. Re:A time and place for everything by Marillion · · Score: 2, Interesting

      Genetic sequences are long strings of alphabetic characters. One of the most common representations is the FASTA format, which deals with the most common type of nucleotide polymorphisms. You can't use exact string searching to find a match, which makes BLOBs and CLOBs useless. That said, the meta-data of genetic data is reasonably structured and does load into relational databases fairly well.

      --
      This is a boring sig
    7. Re:A time and place for everything by Anonymous Coward · · Score: 0

      See, I don't think there is ever a good time or place for SQL. Anyone who says so has never had to use it. I like to compare it with JavaScript. It's a language that is difficult to refactor, maintain, and while it's a standard, the standard is so vague that it's useless. Like JavaScript, people are trying to build other languages on top of it to hide its shortcomings -- for javascript you have tools like GWT, and for SQL you have HQL, Linq, etc.

      Not to say that there is anything wrong with relational databases, we just lack a good tool to interface with them.

    8. Re:A time and place for everything by Marillion · · Score: 3, Insightful

      Right, I went into a little detail on another post. When I said, "genetic," I mean genes - DNA. There are four main nucleobases in DNA: adenine, cytosine, guanine, and thymine, abbreviated ACGT. So you could store a gene sequence as ACGCCTGCAATC. But in other populations, Asian for example, the same gene is more commonly found as ACTCCTGCAATC. (The third nucleotide is different.) Exact string matches won't find matches between different population groups. So they create wild-card letters, such as K, which represents either G or T. So ACKCCTGCAATC would match either of the sequences commonly found in western and eastern populations. Data of this nature has no business being in a relational database. For that matter, it doesn't belong in these pseudo databases either.
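
The wild-card matching described above can be sketched in a few lines. The table below is a subset of the IUPAC nucleotide ambiguity codes; the `matches` function is a hypothetical helper that only handles equal-length, position-by-position comparison, which is far simpler than real sequence alignment.

```python
# Subset of the IUPAC nucleotide codes: each letter maps to the bases it allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "GT",    # G or T, the wild-card from the example above
    "M": "AC",
    "R": "AG",
    "Y": "CT",
    "S": "CG",
    "W": "AT",
    "N": "ACGT",  # any base
}


def matches(pattern, sequence):
    """True if every base in `sequence` is allowed by the code at the
    same position in `pattern`. Both strings must be the same length."""
    if len(pattern) != len(sequence):
        return False
    return all(base in IUPAC[code] for code, base in zip(pattern, sequence))


print(matches("ACKCCTGCAATC", "ACGCCTGCAATC"))
print(matches("ACKCCTGCAATC", "ACTCCTGCAATC"))
print(matches("ACKCCTGCAATC", "ACACCTGCAATC"))
```

An ordinary equality or LIKE comparison on a text column has no notion of these per-position alternatives, which is the mismatch the comment is pointing at.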

      --
      This is a boring sig
    9. Re:A time and place for everything by Threni · · Score: 1

      One dimensional? So my really-fast snowflake design doesn't exist in your world? 2D tables no good? Sigh...and I was so happy with them.

    10. Re:A time and place for everything by E+IS+mC(Square) · · Score: 5, Interesting

      >> Trees is a wellknown problem of SQL, but the fact is that SQL can't handle most datastructures and complex relations, only very simple one dimensional ones.

      Sorry, that's not true. Have you tried analytical functions? You would be amazed how complex scenarios can be handled easily with them. And they are part of ANSI SQL standards. And db providers (Oracle etc) have taken the concept and improved a lot on it.

      I think the anti-sql 'movement' has more to do with new (internet era) languages and their developers than so called 'lack' of features. In my limited experience, I have observed people coming from C (and such) background have no problem with sql, while java developers (and this is probably true for most developers working on web-based applications) are the worst kind when it comes to understanding even basics of sql. All they want is their objects.

      I strongly believe that a competent programmer designing/developing a system which includes data and data-storage should at least know normalization, indexes, and what 3NF means. Programming language is one thing, database is another, and knowledge of both is required to build a decent system.

    11. Re:A time and place for everything by atmurray · · Score: 1

      This is a thousand monkeys working on a thousand SQL joins. Soon, they'll have written the most efficient yet confusing SQL query known to mankind. (reads one of the resultsets) "It was the best of times, it was the blurst of times"?! you stupid monkey! (monkey screeches) Oh, shut up.

    12. Re:A time and place for everything by MichaelSmith · · Score: 2, Insightful

      Our own nervous system works as a pretty good associative database, though (in my case at least) it seems to be designed to associate places with objects, ie, it is intended to answer the question "what happened the last time I was here?". So as we develop new applications we tend to develop spatial or geographical models for our data.

      The genetic data you describe is not too different from other things we all have to deal with. Trace or log data. Video streams. Sequences of real time events from practically anything. All of these things consist of partly structured streams from which we need to extract meaning. And yes, for all these things storage in a relational database doesn't add any value.

    13. Re:A time and place for everything by CorporateSuit · · Score: 1

      Considering SQL uses a language you can practically master in 15 minutes, I'll have to disagree. For 99% of applications, Query Analyzer was the only tool I've ever needed for SQL, and for 99% of applications, it will get whatever job you need done.

      --
      I am the richest astronaut ever to win the superbowl.
    14. Re:A time and place for everything by julesh · · Score: 4, Informative

      SQL is great for financial data.

      Actually, this isn't true either. See this article for pointers to some of the failings of SQL in dealing with financial data, particularly time series (e.g. sales figures, share prices, etc.). Here's another take on the problem, which essentially is that SQL doesn't recognise that there can be relationships between the rows of a table (e.g., "this happened after this").

    15. Re:A time and place for everything by rs79 · · Score: 1

      You're both right of course, and as a programmer I don't give a shit either way.

      I just want the fastest one. That's the compelling argument to me. That doesn't lose data of course; pity to hear the slasher FS drops bits.

      --
      Need Mercedes parts ?
    16. Re:A time and place for everything by complete+loony · · Score: 1

      Searching DNA is a very subtle and complex CS problem. This was written by a good friend of mine; DASH: Localising Dynamic Programming for Order of Magnitude Faster, Accurate Sequence Alignment

      --
      09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
    17. Re:A time and place for everything by diamondmagic · · Score: 5, Informative

      Design an efficient table relating a tree structure.

      Huh? Tree structures are best handled by relational databases, as it is far faster then recursion. Give row a unique ID and a parent ID, and in addition, a left hand and right hand number, the root node having a left-hand value of 1 and a right hand value of (number rows * 2), the first child node has a left-hand value of one more than the parent's, the right-hand value is one less then the left-hand of a younger sibling.

      Then design queries to answer questions such as:
      * Find the nodes in the subtree under B.

      SELECT * FROM rows WHERE left > [left hand value of B] AND right < [right hand value of B]

      * Find all ancestors of G

      SELECT * FROM rows WHERE left < [left hand value of G] AND right > [right hand value of G]

      * Find the nearest common ancestor of D and H

      SELECT * FROM rows WHERE left < [lowest left hand value from D,H] AND right > [highest right hand value from D,H] ORDER BY right LIMIT 1

      Trees is a wellknown problem of SQL, but the fact is that SQL can't handle most datastructures and complex relations, only very simple one dimensional ones.

      Are you saying trees are easy or hard? And for more complex systems, that is what JOINs are for. SQL is by far the most powerful way and often the fastest way to manipulate data that I know of. The only time I can recall that I had to use a non-SQL solution that was faster then the SQL solution was a matrix operation.
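      The three queries above can be tried out with a small sketch. Everything here is an assumption made for illustration: the eight-node tree, the table name nodes, and the column names lft and rgt (used in place of left and right, which are reserved words in many engines). It runs on SQLite through Python's sqlite3 module rather than on any particular server.

```python
import sqlite3

# Hypothetical 8-node tree; lft/rgt are assigned by a depth-first walk.
#   A -> B -> (D, E)
#   A -> C -> (F -> G, H)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (name TEXT PRIMARY KEY, parent TEXT, lft INTEGER, rgt INTEGER)"
)
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?, ?)",
    [
        ("A", None, 1, 16),
        ("B", "A", 2, 7),
        ("D", "B", 3, 4),
        ("E", "B", 5, 6),
        ("C", "A", 8, 15),
        ("F", "C", 9, 12),
        ("G", "F", 10, 11),
        ("H", "C", 13, 14),
    ],
)


def bounds(name):
    """Return the (lft, rgt) pair stored for one node."""
    return conn.execute(
        "SELECT lft, rgt FROM nodes WHERE name = ?", (name,)
    ).fetchone()


def subtree(name):
    """Nodes strictly below `name`, in depth-first order."""
    lft, rgt = bounds(name)
    rows = conn.execute(
        "SELECT name FROM nodes WHERE lft > ? AND rgt < ? ORDER BY lft", (lft, rgt)
    )
    return [r[0] for r in rows]


def ancestors(name):
    """Ancestors of `name`, root first."""
    lft, rgt = bounds(name)
    rows = conn.execute(
        "SELECT name FROM nodes WHERE lft < ? AND rgt > ? ORDER BY lft", (lft, rgt)
    )
    return [r[0] for r in rows]


def nearest_common_ancestor(a, b):
    """Tightest node that strictly encloses both `a` and `b`."""
    la, ra = bounds(a)
    lb, rb = bounds(b)
    row = conn.execute(
        "SELECT name FROM nodes WHERE lft < ? AND rgt > ? ORDER BY rgt LIMIT 1",
        (min(la, lb), max(ra, rb)),
    ).fetchone()
    return row[0] if row else None
```

      Each question is a single range scan with no recursion, which is the point being made; the parent column is carried along only because the description mentions it.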

    18. Re:A time and place for everything by lumbercartel.ca · · Score: 2, Insightful

      PostgreSQL also has specialized index algorithms for handling exotic arrangements of data. One index type that I'll be looking into in the near future [I've been told] makes it possible to efficiently take two dimensional data, and return rows that all fit within the specified radius starting from two coordinates. Although this can be done with a combination of indexes and various formulas, it's not as elegant as what PostgreSQL can do now. So, when I see statements that SQL hasn't progressed, I question the level of expertise of those making such statements.

      As for "all they want is their objects," I think that's true -- just look at all the PHP newbies out there who like to create tables with hundreds of columns instead of taking advantage of one of SQL's greatest hallmarks: Relationships between tables

      Java (with JDBC) and Perl/mod_perl/mod_perl2 (with DBIC, a.k.a., DBIx::Class) should make these folks happy though because they do provide slick OO interfaces to SQL that, unless you keep all your hundred or so columns in a single table, also require at least a basic understanding of SQL.

      Anybody working with databases in a serious way has to have at least a basic understanding of the underlying technology. Creating an alternative to SQL seems ludicrous to me because it will eventually take away from the large pool of SQL expertise that exists today (I never knew about an anti-SQL movement until I read about it here today on SlashDot); many other alternatives do already exist too, such as BTrieve which has a totally different interface from SQL, and although it performs well it's just not as popular anymore (didn't Pervasive, the company that currently owns the BTrieve technology, switch to a customized rendition of PostgreSQL or something like that anyway? I wonder what their real reasons were for focusing on SQL in favour of BTrieve?).

      As a developer, I see that performance (speed) is a very important factor, but that's not the only factor for me -- quality of code, helpful documentation, and overall system reliability matter too (including, very importantly, the ability to always recover gracefully from a power outage that occurred while thousands of INSERTs, UPDATEs, and DELETEs are active, which I confirmed in some informal "pull-the-plug" testing many years ago that PostgreSQL and Oracle both do by issuing ROLLBACKs at database mount time after the OS recovers) because it will mean easier development with fewer potential problems for me to deal with in the future if something does fail.

    19. Re:A time and place for everything by raddan · · Score: 3, Insightful

      And to expand on that a little, I think each part of the MVC idiom has its own domain-specific language because those languages are well-suited to those applications. An imperative language with an emphasis on objects (e.g., Java) just doesn't do the same thing that a declarative set-theoretical language (SQL) does. Well, it can, but doing so is a royal pain in the ass. That same imperative language is also total overkill for defining a layout. HTML does that job beautifully and simply.

      There are certainly common CS themes running between all three. We have three languages not because people haven't thought about those things, but because they make our lives easier.

      Whenever I hear people bitching about 'doing away' with SQL, I always wonder what they think is wrong with it. SQL certainly has some limitations, don't get me wrong, but it is a great language for the vast, vast majority of cases. If your application is so specialized that SQL isn't appropriate, well, bravo, but that does not mean that the relational database concept is flawed. Personally, I think if people spent a few moments doing some formal analysis before they built their databases (imagine that, thinking before doing?!), they would find that SQL is a beautiful thing. If your implementation of SQL doesn't cut the mustard, maybe you just need a better query optimizer?

    20. Re:A time and place for everything by allenthelee · · Score: 2, Informative

      That's true, but you should mention that this data representation comes at a tradeoff for update efficiency - insertion of new nodes force you to update the entire subtree's left and right hand values.

      I wouldn't call SQL the natural way to model a tree data structure. It's possible but it comes with a price.
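      To make the update cost concrete, here is a rough sketch of one common way to insert into a left/right numbered tree. The three-node tree, the table layout, and the helper name insert_last_child are all assumptions for illustration. In this variant, every stored value at or past the insertion point gets renumbered, so the work grows with however much of the tree sits to the right of the new node.

```python
import sqlite3

# Hypothetical tree: root A with two leaf children, B and C.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, lft INTEGER, rgt INTEGER)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [("A", 1, 6), ("B", 2, 3), ("C", 4, 5)],
)


def insert_last_child(parent, name):
    """Add `name` as the last child of `parent`; return the number of row updates."""
    (edge,) = conn.execute(
        "SELECT rgt FROM nodes WHERE name = ?", (parent,)
    ).fetchone()
    with conn:  # one transaction, so readers never see a half-renumbered tree
        shifted = conn.execute(
            "UPDATE nodes SET rgt = rgt + 2 WHERE rgt >= ?", (edge,)
        ).rowcount
        shifted += conn.execute(
            "UPDATE nodes SET lft = lft + 2 WHERE lft > ?", (edge,)
        ).rowcount
        conn.execute("INSERT INTO nodes VALUES (?, ?, ?)", (name, edge, edge + 1))
    return shifted


updates = insert_last_child("B", "X")
layout = conn.execute("SELECT name, lft, rgt FROM nodes ORDER BY lft").fetchall()
```

      Adding one leaf to a three-node tree already rewrites all three existing rows, which is the tradeoff against the cheap reads.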

    21. Re:A time and place for everything by lumbercartel.ca · · Score: 1

      I don't agree either. JavaScript is a vague language, and partly thanks to the web browser wars. If JavaScript was actually properly standardized across web browsers, then web developers wouldn't be stuck having to write huge amounts of conditional code just to make it work in Internet Exploder, and then more when doing really complicated stuff. Take a look at any cross-browser JavaScript menu system that supports Opera, Firefox, Safari, Google Chrome, Internet Explorer, and others, and it's easy to find lots of these conditional statements to work-around various infuriating web browser incompatibilities.

      SQL, on the other hand, doesn't tend to suffer from these problems because the client-side isn't choosing which database engine is being used behind the scenes, so the DBAs and other system administrators can get the consistency they need. Sure, every database vendor is tweaking the standards (or relying on outdated standards like MySQL has for a long time -- I don't know if they're still locked into the old SQL standard anymore because I use PostgreSQL for all my database needs), but generally most of the more common statements work very similarly, so moving between database engines (and adapting and testing code accordingly) is far easier than making JavaScript work consistently across multiple versions of multiple web browsers.

    22. Re:A time and place for everything by raddan · · Score: 3, Informative

      A basic premise of the relational model is that there is no relationship between rows. So it isn't surprising that SQL can't see any. Maybe you need to organize that data differently? You can solve a lot of problems in SQL using triggers, temporary tables, and the built-in aggregate and sorting functions.

    23. Re:A time and place for everything by lumbercartel.ca · · Score: 2, Interesting

      TIMESTAMPs are very useful for retaining temporal order of data. Those articles don't deal with timestamps at all. If they knew about TIMESTAMPs (like what PostgreSQL has), things could be a lot better.

      Cumulative totals can be achieved as well, either within a SELECT statement (that sums from the starting date to the current date being processed) or with a stored procedure (which I'd prefer to ensure efficiency since I could just keep adding to an internal variable along the way instead of re-summing everything for each row returned).

      Moving averages (with varying temporal ranges), Ranks, etc., can all be handled with a stored procedure using some fairly straight-forward "plpgsql" or other DBMS back-end language. With enough inventiveness, someone could probably figure out how to do the same with a SELECT statement (possibly needing sub-SELECTs).

      Adding features such as "accreting" (see bottom of first article referenced at dbmsmag.com above) seems like a nice idea to me -- I'm certainly in favour of adding more features to an RDBMS since it will make it more useful to people. I worry, however, that those authors are expecting SQL to do more than simply return data sets (which their reporting code should be responsible for formatting for the user) since that would be missing the point of what SQL is for.
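      For the cumulative-total case, the SELECT-only approach might look something like the sketch below. The sales table, its columns, and the sample figures are invented for illustration, and dates are stored as ISO text because SQLite (used here through Python's sqlite3 module) has no dedicated TIMESTAMP type.

```python
import sqlite3

# Hypothetical daily sales figures, keyed by ISO date so text order is date order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day TEXT PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("2009-07-01", 10),
        ("2009-07-02", 20),
        ("2009-07-03", 30),
        ("2009-07-04", 40),
    ],
)

# The correlated sub-SELECT re-sums every earlier row for each row returned.
RUNNING_TOTAL = """
SELECT s1.day,
       s1.amount,
       (SELECT SUM(s2.amount) FROM sales s2 WHERE s2.day <= s1.day) AS running_total
FROM sales s1
ORDER BY s1.day
"""

running = conn.execute(RUNNING_TOTAL).fetchall()
```

      This is the re-summing version, so it does quadratic work; a stored procedure that carries a variable forward avoids that.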

    24. Re:A time and place for everything by smittyoneeach · · Score: 1
      Sure, performance. What about power and ease of use?

      I just want the fastest one.

      I, for one, love to hand-optimize loops in Assembly. No, I don't. I read Hyde and said, "Yeah, I'll wait until after the YAGNI moment passes".

      --
      Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
    25. Re:A time and place for everything by smittyoneeach · · Score: 3, Interesting

      It seems like a huge chunk of all programming is like being Gerardus Mercator.
      You've got a bunch of information with one shape, but you really need it in another shape.
      The code is about dealing with the embarrassment of it all.
      If you've got tabular stuff, and you so very often do, the relational model is fantastic.
      If you're dealing with some kind of graph, you're hating life.
      People coming in complaining that their aircraft makes a poor submarine are initially amusing, but become tedious.

      --
      Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
    26. Re:A time and place for everything by timmarhy · · Score: 1
      you know you can create your own custom data types and functions in an sql database, it pretty much invalidates your point.

      i always lol at people trying to invent an SQL killer - lots of talk and whinging about SQL, nothing concrete

      --
      If you mod me down, I will become more powerful than you can imagine....
    27. Re:A time and place for everything by TheNarrator · · Score: 5, Insightful
      I think the main problem that the web 2.0 dynamic language crowd has with RDBMs is that:
      • 1. Relational data is strongly typed. You cannot easily add new fields to a table or store arbitrary types in a column and expect acceptable performance.
      • 2. Migrating large amounts of relational data to a new structure takes a very very long time. Constant refactoring of data models is to be avoided. You have to get it right the first time or at least very early in the development cycle to avoid major headaches...
      • 3. Databases are hard to mock in a testing context. Automated tests can be significantly slowed down with even a test database.
      • 4. Errors in database architecture are very difficult to correct due to 1 and 2, especially when used with a dynamically typed language.
      • 5. It's difficult to maintain the data integrity that RDBMSs take for granted in highly scalable distributed systems and have acceptable performance.

      The only real show stopper and a real reason to replace RDBMSs is #5. All the others can be worked around by just deeper study of data modeling techniques. Data modelling is not something most developers can figure out intuitively. There is a lot of theory to be learned to do it right and it can very easily be done badly leading to severe performance problems and an unmaintainable application.
      With regard to #5: I went to a presentation at JavaOne where some eBay engineers explained that they do not use transactions in any of their database operations. They just leave junk rows around in the db if a transaction half completes, and as long as they aren't reachable they don't consider it a big deal. They have to very carefully organize the order in which they manipulate data to avoid data corruption, but that lets them get around #5.

    28. Re:A time and place for everything by Anonymous Coward · · Score: 0

      *Nothing* except the most trivial select from group by is simple in SQL (granted, this is already a bit on the complex side, but still).

      The reason for this is simple.

      You cannot assign a select to a "variable" to be reused several times, so anything even remotely complex degenerates into a massive cut and paste or programmed repetition that more often than not fails due to the idiotic column-name-must-be-unique-in-the-whole-schema-or-else b***shit.

      Anyway.

      SQL blows. Ever tried to compare an IDE from the early 1990s with current SQL "development environments"? That's right, Turbo Pascal 4.0 was better. Why? SQL is unparseable. Even if the vendors could agree on a non-useless subset to support (since even sequences are out, I'm not holding my breath), the language itself is not parseable by any modern language development tool (lex, yacc, javacc, whatever). Yeah, "modern" in that context means anything written since the mid 70s.

      The relational model is cr*p. Well, my bad, it's good the same way democracy is. Everything else is worse. The fact of the matter is that every project designed in the past 10 years includes *some* concept of inheritance, and SQL has no good way to handle such a simple thing. You can have a wide table, and constraints are right out (which is half of what databases are good for IMHO), or you have several tables, which duplicate a lot, and transform a single select into a mess. Yet you get the 3rd manifesto guys explaining that the problem is that SQL didn't do relational right. A model is a *model* for crying out loud! It tells you what a system can and cannot do, that's it. I can't hear anyone saying we should program in brainfuck because that's "The Turing Model". Can you?

      I don't personally agree with this whole noSQL thing, because the first thing they do is toss out transactions, and that's the most important thing. I don't care about SQL, I don't care about the language, or the model, I might accept checking constraints, foreign keys even in the app layer, but not transactions.

      Cheers.

    29. Re:A time and place for everything by Anonymous Coward · · Score: 0

      That's nice.

      How do you find the *immediate* children of a node?

      How do you insert into this?

      What do you do when you blow your right hand value?

      And let's not forget that your "efficient" implementation transforms a O(depth)=O(log n) operation (finding the ancestry) into a O(n) merge.

    30. Re:A time and place for everything by lawpoop · · Score: 3, Informative

      For anyone wondering, parent is talking about a preorder tree traversal algorithm:
      Link 1
      Link 2

      And parent is right. I was doing an adjacency list in MySQL for a while, because I thought that preorder trees were just a little too complicated, but they are *way* easier and more intuitive.

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
    31. Re:A time and place for everything by Anonymous Coward · · Score: 0

      That's not hard at all and something I have implemented many times. Some people's ideas of difficulty are a little strange. If there wasn't some degree of difficulty with programming, they'd call it sitting on the beach drinking martinis.

    32. Re:A time and place for everything by Vader82 · · Score: 1

      I couldn't agree more about databases being terrible for genetic data. I do some work in bioinformatics right now. Completely aside from the terrible code quality is the overwhelming desire to shoehorn all this stuff into databases because nobody knows any better. And when you're working with genetic data you don't need transactions, you don't need some kind of rigid consistency, etc. You just need something to hold all the data for you and take care of picking it up from the disk.

      A college friend recently wrote a great article about the shortcomings of regular databases for a lot of the challenges the community faces today. It's available at http://www.roadtofailure.com/

    33. Re:A time and place for everything by aztracker1 · · Score: 2, Interesting

      >> Find all ancestors of G

      > SELECT * FROM rows WHERE left < [left hand value of G] AND right > [right hand value of G]

      Now promote a tree node so it's before G...

      --
      Michael J. Ryan - tracker1.info
    34. Re:A time and place for everything by stubob · · Score: 1

      We were just talking about this at work today. I believe that the disconnect between SQL and domain objects comes from the different ways to model data available now. For example, if you define your data with XML first, you can easily add more features than a database can handle. The simple example that seems to keep popping up at work is inheritance. You create a Shape base class, then extend it with Point, Circle, Polygon, etc. etc. Logical, object-oriented and totally allowed in XML. However, putting it into a database is a mess. You wind up with a parent Point table, a type, and then a set of child tables, one of which is populated. (If there's a better way to do this, let me know).

      So the only way to create a useable database schema is to create that first, and work around the limitations in the relational table model. Otherwise, you will wind up jumping through hoops of your own creation. And I think that's what this group is up to: allowing persistence of arbitrary objects, defined and created agnostically of the storage mechanism.

      --
      Planning to be moderated ± 1: Bad Pun.
    35. Re:A time and place for everything by pvera · · Score: 2, Insightful

      As a web programmer, I wanted to take offense at your statement, but something that happened to me only a few weeks ago is making me have a hell of a lot less faith about the available pool of web programmers out there:

      During a round of interviews, we sent out a take-home quiz. We mostly wanted to know if the candidates either knew the actual answers, or could at least google it. One of the questions involved simple aggregates in SQL. Given a table with a unique id and a date of birth, I wanted a query that would produce a list of the months of the year, and how many unique records had a DOB that fell on that month. It's a one-liner.

      One of my candidates wrote TWELVE counting queries, each one counting DOBs between the (hardcoded) start and end of each month, then she used UNION to make it send out the 12 one-row queries as one 12-row query.

      Both of us evaluating the results screamed when we read her answer, and we did not pursue her further. I used to complain that programmers simply didn't give a shit about learning beyond the querying aspects of the RDBMS, which kept us at the mercy of overpaid DBAs. Now? Now we are starting to see that programmers don't even give a shit about learning how to query.
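      For reference, the kind of one-liner the quiz was after might look like the sketch below. The people table, its columns, and the sample rows are made up for illustration, and strftime is SQLite's spelling (other engines would use something like EXTRACT(MONTH FROM dob)). Note that months with no birthdays simply do not appear, which may or may not be what the quiz wanted.

```python
import sqlite3

# Hypothetical table: a unique id plus a date of birth stored as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, dob TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [
        (1, "1980-01-15"),
        (2, "1975-01-30"),
        (3, "1990-03-02"),
        (4, "1988-12-25"),
    ],
)

# One grouped query instead of twelve counting queries glued together with UNION.
BIRTHDAYS_PER_MONTH = """
SELECT CAST(strftime('%m', dob) AS INTEGER) AS month, COUNT(*) AS born
FROM people
GROUP BY month
ORDER BY month
"""

per_month = conn.execute(BIRTHDAYS_PER_MONTH).fetchall()
```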

      --
      Pedro
      ----
      The Insomniac Coder
    36. Re:A time and place for everything by Tablizer · · Score: 1

      Maybe sifting individual DNA "letters" by itself is not the best job of a RDBMS, but if you have thousands of sequences, thousands of specimens, thousands of potential match addresses/spots, etc., then a RDBMS may help you manage those. In other words, the "infrastructure" and inventory of things and relationships between things. Just because a RDBMS does not do the whole thing from top to bottom does not mean it's not useful. It's a matter of using the right tool for the right portion.
         

    37. Re:A time and place for everything by Anonymous Coward · · Score: 0

      faster than

    38. Re:A time and place for everything by Anonymous Coward · · Score: 0

      You can actually do slightly better than this using single numbers (rather than ranges) via Farey trees. That gives optimal memory usage for a given tree size, and doesn't require changing extant nodes to insert a new node.

    39. Re:A time and place for everything by cheekyboy · · Score: 1

      I would love to see someone create a massive SQL query which can decode mpeg4 video.
      Yes, it's the wrong language for the problem, but decoding compressed video is a BIG data manipulation problem.

      I guess what people are saying is that sure SQL might do its job, but it has some arcane stupid syntaxes that look like cobol. The differences between each commercial/OS vendor are large enough as to be annoying.

      What people are looking for is a common language that is vendor neutral, and more modern looking. I guess kind of like OpenGL or DX10, some api/lang set that runs via a driver to the final DB solution. So the database is like a graphics card, separated from the api/language. So it's easy to swap vendors.

      This is the key that sucks with SQL, once you commit to one vendor, it's hard to escape. Today we have OS neutral 3d apis, window apis, network apis, sound apis. A truly layered DB api/lang that isn't set in stone by the vendor? No chance.

      --
      Liberty freedom are no1, not dicks in suits.
    40. Re:A time and place for everything by Kjella · · Score: 4, Informative

      1) 1996 called, they want their arguments back. For example, most RDBMS have ranking functions now.
      2) Even in 1996, he doesn't know SQL worth shit

      SELECT (prev.sales+now.sales+next.sales)/3 three_day_average
      FROM sales prev,
                    sales now,
                    sales next
      WHERE prev.day_number = now.day_number-1
      AND next.day_number = now.day_number+1

      Easy as pie making most of the calculations he wants. Maybe he should ask someone knowledgeable in SQL?
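      A runnable take on that query, under a few assumptions: the five sample rows are invented, now.day_number is added to the select list so each average can be matched to its day, and it runs on SQLite via Python's sqlite3 module. It relies on day_number being consecutive with no gaps.

```python
import sqlite3

# Hypothetical data: one row per consecutive day.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day_number INTEGER PRIMARY KEY, sales REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)],
)

# Self-join: each row is paired with the day before it and the day after it.
THREE_DAY_AVERAGE = """
SELECT now.day_number,
       (prev.sales + now.sales + next.sales) / 3 AS three_day_average
FROM sales prev,
     sales now,
     sales next
WHERE prev.day_number = now.day_number - 1
  AND next.day_number = now.day_number + 1
ORDER BY now.day_number
"""

averages = conn.execute(THREE_DAY_AVERAGE).fetchall()
```

      The first and last days drop out of the result because they lack a neighbour on one side, which is the usual behaviour for a centred moving average.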

      --
      Live today, because you never know what tomorrow brings
    41. Re:A time and place for everything by shutdown+-p+now · · Score: 1

      Microsoft SQL Server 2008 has an interesting take on trees in tables in form of a special data type for unique tree node IDs that make relatively efficient queries on ancestors/descendants possible. It does that at a cost of referential integrity, however.

      I'm sure that other major vendors have something in that department, too. There might not be a standard, but "SQL" and "standard compliance" haven't been in the same sentence in practice for the last, what, 15 years or so - so it's not really any different than countless variations on other common things (such as autogenerated PK).

    42. Re:A time and place for everything by slashdotwannabe · · Score: 2, Insightful

      I want to say you're breaking fourth normal form, but I can't.

      I want to say you're storing derived data, but I can't.

      I CAN say that data structure is just butt-ass-ugly.

      --
      This comment is my opinion and does not represent an official position of Donald Trump or others I do not work for
    43. Re:A time and place for everything by Anonymous Coward · · Score: 0

      Trees is a wellknown problem of SQL, but the fact is that SQL can't handle most datastructures and complex relations, only very simple one dimensional ones.

      http://www.slideshare.net/quipo/trees-in-the-database-advanced-data-structures

    44. Re:A time and place for everything by daVinci1980 · · Score: 1

      This is the key that sucks with SQL, once you commit to one vendor, its hard to escape.

      This is laughably false if you stick to ANSI/ISO SQL. SQL is itself vendor-agnostic, it's a promise for an api.

      Don't use vendor specific queries and you'll be completely portable. I've migrated many applications and DBs between postgresql, mysql, sqlite and oracle with little difficulty.

      --
      I currently have no clever signature witicism to add here.
    45. Re:A time and place for everything by TheLink · · Score: 1

      What do you think about Unison and Postgresql?

      http://harts.net/reece/pubs/2009/unison-UCSF-sfpug.pdf

      Video of above presentation:

      http://blog.thebuild.com/sfpug/sfpug-unison-20090311.mov
      http://www.vimeo.com/3732938

      See also: http://psb.stanford.edu/psb-online/proceedings/psb09/hart.pdf

      Presenter is not very good though in my opinion :).

      --
    46. Re:A time and place for everything by frenchbedroom · · Score: 1

      I remember using that model in a project about 5 years ago. Great performance when querying. Deleting a node is straightforward, you can leave the child nodes as they are and your queries will still work. But inserting or updating can be expensive operations.

    47. Re:A time and place for everything by chthon · · Score: 1

      What people really want then are ISAM files. These can deliver all the performance and security of an RDBMS without the R or the SQL.

      In fact, I have written applications in such a system : WANG PACE, which was a relational database system built on top of the WANG VS ISAM file structure.

      Admit it, guys : deep down inside you all want to program in COBOL.

    48. Re:A time and place for everything by Anonymous Coward · · Score: 0

      Care to show queries for
      - inserting nodes
      - deleting nodes
      - moving nodes around ?
      Preordered tree traversal algorithms have tradeoffs, too

    49. Re:A time and place for everything by Hognoxious · · Score: 1

      JavaScript is a vague language, and partly thanks to the web browser wars.

      Exactly. Whenever people speak of JavaScript I ask, "which one?"

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    50. Re:A time and place for everything by mcrbids · · Score: 3, Insightful

      Design an efficient table relating a tree structure. Then design queries to answer questions such as:...

      I don't know, but I recall reading that Postgres 8.4 is now out and includes support for recursive queries. (trees) Not sure about the reputation of the blog in question, but you may have heard of it?
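      As a rough idea of what a recursive query over a plain parent-pointer table looks like, here is a sketch. The table, the node names, and the helper descendants are assumptions for illustration, and it runs on SQLite through Python's sqlite3 module rather than on Postgres, though the WITH RECURSIVE form is the standard one.

```python
import sqlite3

# Hypothetical adjacency list: each row only knows its parent.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, parent TEXT)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [
        ("A", None),
        ("B", "A"),
        ("C", "A"),
        ("D", "B"),
        ("F", "C"),
        ("G", "F"),
        ("H", "C"),
    ],
)

# Seed with the direct children, then keep joining children of what was found.
DESCENDANTS = """
WITH RECURSIVE sub(name) AS (
    SELECT name FROM nodes WHERE parent = ?
    UNION ALL
    SELECT n.name FROM nodes n JOIN sub ON n.parent = sub.name
)
SELECT name FROM sub ORDER BY name
"""


def descendants(name):
    """All nodes below `name`, at any depth."""
    return [r[0] for r in conn.execute(DESCENDANTS, (name,))]
```

      Inserts stay cheap with this layout (one row, no renumbering); the cost moves to read time, where the engine walks the tree level by level.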

      the fact is that SQL can't handle most datastructures and complex relations, only very simple one dimensional ones

      You are kidding, right? Just today I cooked up a 7-table query including 2 subselects, and a left outer join to a meta table consisting of 2 inner joined tables. Total of some 11 tables comprising a highly complex data set. Don't know what you mean by "very simple one dimensional ones" but 11 tables each joined in either a one-to-many or many-to-many mapping provides at least 11 dimensions. (more if you self-join tables, often needed) And this isn't particularly hard for me - often I have joins combining 12 or more very large tables with unrestrained combinations somewhere in the billions to trillions of possibilities that all somehow seem to parse in just a few seconds thanks to a few well-placed indexes and a well-structured query.

      Methinks you don't really understand SQL?

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    51. Re:A time and place for everything by Burnhard · · Score: 1

      Trees is a wellknown problem of SQL, but the fact is that SQL can't handle most datastructures and complex relations, only very simple one dimensional ones.

      Trees aren't so difficult to manage in SQL. There are various strategies you could use, the simplest of which is to store a path string with each node (of the form 0001\0002\0041, etc.) there are more complex solutions of course). MSSQL now has hierarchyid enabling native tree structure handling too. I'm not saying that they are conceptually easy to implement or particularly efficient if that is your primary requirement, but they can be done without too much pain and suffering.
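      The path-string strategy could be sketched like this. The table, the zero-padded ids, and the two helper functions are assumptions for illustration, and a forward slash is used as the separator instead of a backslash purely to avoid escaping noise in the Python strings.

```python
import sqlite3

# Hypothetical nodes; each row stores the full path from the root to itself.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, path TEXT UNIQUE)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [
        ("A", "0001"),
        ("B", "0001/0002"),
        ("C", "0001/0003"),
        ("D", "0001/0002/0004"),
        ("H", "0001/0003/0008"),
    ],
)


def descendants(path):
    """Every node whose path starts with `path` plus a separator."""
    rows = conn.execute(
        "SELECT name FROM nodes WHERE path LIKE ? || '/%' ORDER BY path", (path,)
    )
    return [r[0] for r in rows]


def ancestors(path):
    """Every node whose own path is a proper prefix of `path`, root first."""
    rows = conn.execute(
        "SELECT name FROM nodes WHERE ? LIKE path || '/%' ORDER BY length(path)",
        (path,),
    )
    return [r[0] for r in rows]
```

      Inserting a leaf touches only the new row, but moving a subtree means rewriting the path of every node in it.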

    52. Re:A time and place for everything by ubersoldat2k7 · · Score: 1

      Yes, please kill JavaScript and implement another standard language. Was Perl that hard? Where did JavaScript come from? All CGI was written in Perl, so why not use Perl instead of JavaScript... I have a need of murder.

    53. Re:A time and place for everything by Ambiguous+Puzuma · · Score: 1

      This is the key that sucks with SQL, once you commit to one vendor, its hard to escape.

      This is laughably false if you stick to ANSI/ISO SQL. SQL is itself vendor-agnostic, it's a promise for an api.

      Don't use vendor specific queries and you'll be completely portable.

      This article, along with my own experience with Oracle, suggests this is frequently impossible due to vendor-specific quirks. Have you tried a comparison involving an empty string, for instance? Or case-sensitive and case-insensitive string comparisons?

    54. Re:A time and place for everything by dna_(c)(tm)(r) · · Score: 1

      database generated id

    55. Re:A time and place for everything by dna_(c)(tm)(r) · · Score: 1

      Thanks for the information.

      So in order to insert one node you have to traverse the entire tree to recalculate left/right values? that is, update each row (either left or right value)? That kind of rules it out when there are lots of concurrent users modifying the tree.

    56. Re:A time and place for everything by Anonymous Coward · · Score: 0

      Huh? Tree structures are best handled by relational databases, as it is far faster then recursion. Give row a unique ID and a parent ID, and in addition, a left hand and right hand number, the root node having a left-hand value of 1 and a right hand value of (number rows * 2), the first child node has a left-hand value of one more than the parent's, the right-hand value is one less then the left-hand of a younger sibling.

      Great, now tell what would happen when you insert 1.000.000th node on distributed system ...

    57. Re:A time and place for everything by ardor · · Score: 1

      How about this:
      learn to distinguish between the actual language, and the browser API?
      JavaScript is standardized, and a fine, Lisp-like language. Browser APIs (DOM among others) are not.

      --
      This sig does not contain any SCO code.
    58. Re:A time and place for everything by Anonymous Coward · · Score: 0

      Sounds more like a textbook example than a real world case to me.

    59. Re:A time and place for everything by ukyoCE · · Score: 1

      Funny enough, I get wish list e-mails from half.com (owned by ebay) and I get the same item listed in EVERY email, which no longer actually exists. I even emailed them about it, but every month or so I get another wishlist e-mail with that same cheap imaginary item for sale...

    60. Re:A time and place for everything by ukyoCE · · Score: 1

      I think you're confused - HQL and Linq are there to hide the shortcomings of the *developers* (and development frameworks), not of SQL. They're basically there to try to force bad developers into using an overly-standard methodology, instead of doing things Right and Fast. You redefine Right to be "however the framework does it", and redefine Fast to be....extremely Slow.

    61. Re:A time and place for everything by Anonymous Coward · · Score: 0

      If you're relying on nulls in that way you're a stupid fuck. Step away from that keyboard NOW.

    62. Re:A time and place for everything by dargaud · · Score: 1

      * Find the nodes in the subtree under B.
      * Find all ancestors of G
      * Find the nearest common ancestor of D and H

      I am a very bad SQL programmer (but an experienced C coder), so I'd be interested to know how to do this in SQL... I've never been able to make much sense out of this so called 'language'.
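
      For anyone else in the same boat, here is a minimal sketch of those three questions against the left/right numbering described upthread. It uses Python's built-in sqlite3 so it can be run as-is. The tree shape, the table name nodes and the column names lft/rgt are all made up for illustration (LEFT and RIGHT are SQL keywords, so they make awkward column names).

```python
import sqlite3

# Hypothetical tree, numbered depth-first:
# A(1,16)
#   B(2,11): D(3,4), E(5,10): G(6,7), H(8,9)
#   C(12,15): F(13,14)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, lft INTEGER, rgt INTEGER)")
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)", [
    ("A", 1, 16), ("B", 2, 11), ("D", 3, 4), ("E", 5, 10),
    ("G", 6, 7), ("H", 8, 9), ("C", 12, 15), ("F", 13, 14),
])

def bounds(name):
    """Return the (lft, rgt) pair of one node."""
    return conn.execute(
        "SELECT lft, rgt FROM nodes WHERE name = ?", (name,)).fetchone()

def subtree(name):
    """Everything strictly inside the node's interval is a descendant."""
    lft, rgt = bounds(name)
    rows = conn.execute(
        "SELECT name FROM nodes WHERE lft > ? AND rgt < ? ORDER BY lft",
        (lft, rgt))
    return [r[0] for r in rows]

def ancestors(name):
    """Every interval that strictly encloses the node's interval is an ancestor."""
    lft, rgt = bounds(name)
    rows = conn.execute(
        "SELECT name FROM nodes WHERE lft < ? AND rgt > ? ORDER BY lft",
        (lft, rgt))
    return [r[0] for r in rows]

def nearest_common_ancestor(a, b):
    """The tightest interval enclosing both nodes; smallest rgt means deepest."""
    a_lft, a_rgt = bounds(a)
    b_lft, b_rgt = bounds(b)
    row = conn.execute(
        "SELECT name FROM nodes WHERE lft < ? AND rgt > ? ORDER BY rgt LIMIT 1",
        (min(a_lft, b_lft), max(a_rgt, b_rgt))).fetchone()
    return row[0] if row else None

print(subtree("B"))
print(ancestors("G"))
print(nearest_common_ancestor("D", "H"))
```

      Each question is a single range comparison with no recursion, which is the selling point of this layout. The cost shows up on writes, as the rest of the thread argues.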

      --
      Non-Linux Penguins ?
    63. Re:A time and place for everything by TheRaven64 · · Score: 1

      I think this illustrates the problem with SQL. You have to design your database for the type of query you want to run. For example, try writing an arbitrary transitive closure in SQL. You can't, unless you specifically stored your data in a form that computes this on insert.

      This is fine, except that requirements change over time. It's very easy to get into a situation where the queries you want to run are painfully slow on the data layout you have. Your only solution is to move the data to a brand new schema. SQL, however, doesn't encourage the kind of abstraction that makes this kind of refactoring easy.

      When I evaluate a programming language, one of my main criteria is how easy it is to adapt code written by an average (i.e. not very good) programmer to a new set of requirements. SQL does very badly here.

      If you have good SQL programmers / database architects and you have a set of requirements that aren't likely to change much over time, you can do some great things with SQL. Unfortunately, this isn't a luxury most of us have.

      --
      I am TheRaven on Soylent News
    64. Re:A time and place for everything by dargaud · · Score: 1

      When I look at an example such as yours, all I can think of is: "Huh, yeah, maybe". It's like looking at Perl code: there's no way to know if it'll actually work unless you type it in, all you get is a vague feeling that it might work. Maybe. I suspect that's the reason why a lot of people hate SQL and Perl.

      --
      Non-Linux Penguins ?
    65. Re:A time and place for everything by cyclomedia · · Score: 1

      Actually 1 and 2 are pretty big pains in the bottom when all you want is a customer -> order -> products * qty relationship or a news -> items -> attached images relationship. A lot of the time you will never ever in a zillion years need to see the orderID from code because all you need to do is some pointer munging, something like:

      theCustomer.currentOrder.Items[0].Quantity++;

      One of the new-age / agile mantras of code is refactor-as-you-go-(just-make-sure-it-all-compiles-and-runs-too). And #1 and #2 are bitches when you need to change an object's property from a straight yes/no (mapped to a database boolean) to a choice between yes, no, doesn't-know, hasn't-filled-in-form-yet. (Not to mention the pains of getting an enum/lookup into and out of a database, what with lookup tables and data columns that are just filled with meaningless ids.) It'd be nice if, from a developer's point of view, refactoring the class also refactored the underlying data storage. This would be fine for 95% of database driven apps and websites, with the RDBMS expert only needing to be called in when you need to archive, mirror, distribute and stabilise the backend because your customers per day went above eight.

      --
      If you don't risk failure you don't risk success.
    66. Re:A time and place for everything by Anonymous Coward · · Score: 0

      You've designed a structure where *selection* is efficient, but you've forgotten *updates*. These will be very inefficient, as whenever you insert a child node you need to modify a whole bunch of records.

    67. Re:A time and place for everything by DiegoBravo · · Score: 1

      In other words, the "web 2.0 people" want a database that can be altered in any conceivable way that is convenient to the application programmer as he/she is yet unsure about the needed data model; of course, this can only work if nothing else is using or will use that database (typically, pet projects.)

      And yes, the ALTER TABLE commands are not very flexible; yet are very understandable for all the IT actors, and reasonably unobtrusive for the production SQL-applications.

    68. Re:A time and place for everything by Anonymous Coward · · Score: 0

      Huh? Tree structures are best handled by relational databases, as it is far faster than recursion.

      Far faster than recursion? That's a tautology that just isn't true. In many cases, a properly tail-recursive language can handle recursion much faster than the alternative. (Scheme, Lisp)

      But in most SQL systems, tree tables are HORRIBLE to deal with. If you're dealing with a tree table in MySQL, and have a table with 10,000 rows and you want to traverse the tree, you have to make another SQL call for each iteration. Having O(n) sql calls for a tree will destroy your efficiency. It forces you to make LOTS of small SQL calls and that's terrible.

      The only exception that I know of is CONNECT BY PRIOR in Oracle, which can do it in one single query. This will speed up your queries a lot.
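
      For comparison, the SQL standard's recursive common table expression (WITH RECURSIVE) also walks a parent-pointer table in one statement. Below is a rough sketch using Python's built-in sqlite3, whose SQLite engine supports it. The table, ids and names are invented for the example, and it says nothing about how any particular MySQL or Oracle version behaves.

```python
import sqlite3

# Plain adjacency list: each row only knows its parent.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tree (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO tree VALUES (?, ?, ?)", [
    (1, None, "root"),
    (2, 1, "a"), (3, 1, "b"),
    (4, 2, "a1"), (5, 4, "a1x"),
    (6, 3, "b1"),
])

def descendants(node_id):
    """One round trip: the database does the level-by-level expansion itself."""
    rows = conn.execute("""
        WITH RECURSIVE sub(id, name, depth) AS (
            SELECT id, name, 0 FROM tree WHERE id = ?
            UNION ALL
            SELECT t.id, t.name, sub.depth + 1
            FROM tree AS t JOIN sub ON t.parent_id = sub.id
        )
        SELECT name, depth FROM sub WHERE depth > 0 ORDER BY depth, name
    """, (node_id,))
    return rows.fetchall()

print(descendants(2))
print(descendants(1))
```

      The application issues one call instead of one per level or per node. Whether the engine executes it efficiently still depends on having an index on parent_id and on the engine itself.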

    69. Re:A time and place for everything by dzfoo · · Score: 3, Insightful

      So what you are saying is that you do not know enough of SQL to understand that query; therefore you are not qualified to comment on the practicality and viability of using SQL for complex structures.

      I suspect the same applies to Perl.

                -dZ.

      --
      Carol vs. Ghost
      ...Can you save Christmas?
    70. Re:A time and place for everything by adamjaskie · · Score: 1

      EVERYTHING comes with a price. In some cases, an RDBMS is the answer. In other cases, a key:value store is the answer. Everything has trade-offs, and there is no one answer that fits all use-cases. Dismissing an RDBMS because it doesn't work perfectly in some specific situations is short-sighted. Use the right tool for the job.

      --
      /usr/games/fortune
    71. Re:A time and place for everything by loufoque · · Score: 1

      Zomg, logarithmic complexity to update a tree! What a tradeoff...

    72. Re:A time and place for everything by Anonymous Coward · · Score: 0

      Every DBMS seems to require its very own special SQL syntax that's oh so slightly different from every other DBMS SQL syntax.

      So, I'd like to have a standardized SQL syntax. And have it actually be implemented in the DBMS's and work.

      A lot of platforms have database syntax independence layers for exactly this reason. Not because we are in love with objects, but because it's a pain writing the same query in 10 different ways for 10 different vendors.

    73. Re:A time and place for everything by mjtaylor24601 · · Score: 1
      Most things have tradeoffs. The question isn't "is an RDBMS the perfect way to store a tree?" (of course it isn't, in real life things are rarely perfect), but rather "is there a better way than an RDBMS for storing a tree?"

      The original point was that RDBMSs were bad at storing trees.
      The GP pointed out one way to store trees in an RDBMS.
      You (and several other posters) correctly point out that this structure has limitations and trade-offs associated with it.

      Fair enough. But to suggest that RDBMSs are bad at storing trees, don't we need to suggest a system that's better?

      So my question is, without using an RDBMS how would you design persistent storage for a tree structure that meets all the conditions that have been raised:

      1. Efficiently finds all ancestors/descendants

      2. Efficiently finds only the immediate parent/children

      3. Inexpensive to update the structure of the tree (insert nodes, delete nodes, move nodes around, etc)

      4. Deals with arbitrarily sized trees (no storing the whole tree in memory, that's cheating).

      And for bonus points it should really also deal with all the other fringe benefits we get from using a RDBMS that have nothing to do with this particular data structure

      1. Automated backup/recovery

      2. Transactional integrity across multiple concurrent users

      3. Easily accessible from a wide variety of programming platforms/languages

      4. Etc.

      When it comes to tree data RDBMSs may not be perfect, but I'm not sure they're any worse than anything else.

      --
      I wish I were as sure of anything as some people are of everything
    74. Re:A time and place for everything by lawpoop · · Score: 1

      So in order to insert one node you have to traverse the entire tree to recalculate left/right values? that is, update each row (either left or right value)? That kind of rules it out when there are lots of concurrent users modifying the tree.

      You don't have to update *each* row, just those to the right of the leftmost value before the node you're inserting. But you're not 'traversing' a tree, you're just doing a simple update: "UPDATE trees SET rgt = rgt + 27 WHERE rgt > 107 AND tree_id = 34". That creates the space for you to insert a new branch.

      And yes, you would have to lock the table ( or at least the rows of that particular tree, if your database has that capability ). How big are your trees, and how many concurrent users do you want updating them? The structure of this is pretty simple, so I would think that the updates would go pretty quickly.

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
    75. Re:A time and place for everything by Anonymous Coward · · Score: 0

      Damn you!

      I'm a Java developer, one who actually studied computer science, and I not only ADORE SQL, I use it every day at work. Java's SQL support is better than any other language I've used, and using my objects with an Oracle database (or JavaDB embedded database, which I use in my main project) is a piece of cake.

      Good Java programmers LOVE SQL, because the process of entity-relationship modeling can be run in parallel to (and coordinated with) the process of object modeling. If you're any damn GOOD, the two processes go hand in hand, and mirror each other.

      Object to RDBMS mapping is as easy as writing insert, update, fetch and delete methods in your objects. If you want to make your objects fully extensible, all you have to do is put most of your database code in an abstract base class, and provide abstract methods for setting SQL and assigning bind variables, and let your child classes implement them. It's not rocket science, it's not even difficult.
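
      The shape being described looks roughly like the sketch below. It is written in Python with the built-in sqlite3 rather than Java and JDBC, purely so it runs without any setup; the class names Persistent and Customer, the method names and the customers table are all hypothetical, not anything from the poster's project.

```python
import sqlite3
from abc import ABC, abstractmethod

class Persistent(ABC):
    """Hypothetical base class: owns the database plumbing once."""

    @abstractmethod
    def insert_sql(self):
        """Child classes supply the SQL text."""

    @abstractmethod
    def bind_values(self):
        """Child classes supply the values for the placeholders."""

    def insert(self, conn):
        # Shared code path: execute whatever the child class described.
        cursor = conn.execute(self.insert_sql(), self.bind_values())
        return cursor.lastrowid

class Customer(Persistent):
    def __init__(self, name, city):
        self.name = name
        self.city = city

    def insert_sql(self):
        return "INSERT INTO customers (name, city) VALUES (?, ?)"

    def bind_values(self):
        return (self.name, self.city)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
new_id = Customer("Ada", "London").insert(conn)
stored = conn.execute(
    "SELECT name, city FROM customers WHERE id = ?", (new_id,)).fetchone()
print(new_id, stored)
```

      Update, fetch and delete would follow the same pattern, one pair of abstract methods each.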

      Don't you DARE lump us Java guys in with the PHP degenerates! You fool! You dilettante!

      You... PHP CODER!

    76. Re:A time and place for everything by lawpoop · · Score: 1

      I thought that this representation of the data structure was extraordinarily beautiful, in the sense of elegant. Perhaps you just don't like tree data structures, aesthetically?

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
    77. Re:A time and place for everything by lawpoop · · Score: 2, Informative

      That's true, but you should mention that this data representation comes at a tradeoff for update efficiency - insertion of new nodes force you to update the entire subtree's left and right hand values.

      This is not entirely correct. If you want to insert a new node, you need only update the values of the nodes to the right of the insertion point. This covers more and more nodes the closer you get to the left side of the tree. If you wanted to add a new node onto the right of the tree, you need update only one value, the rgt value of the root node.

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
    78. Re:A time and place for everything by Anonymous Coward · · Score: 0

      WHERE gene_sequence = 'ACGCCTGCAATC' OR gene_sequence = 'ACTCCTGCAATC'

      See what I did there?

    79. Re:A time and place for everything by Anonymous Coward · · Score: 0

      isn't this the technique where you have to lock and update the whole table on insert?

    80. Re:A time and place for everything by jadavis · · Score: 1

      Then design queries to answer questions such as

      Those questions can be answered quite easily using either a materialized path representation (using something like ltree) or WITH ... RECURSIVE.

      Materialized path would be an efficient way to answer those questions. However, if you try to use some graph or hierarchical database system, then that will be the only way you can efficiently access your data; with a SQL system, it's easier to allow the user to ask a variety of very different questions without bias toward any one particular type of question.

      Consider storing something like an iPod. You can store it at: /hardware/apple/ipod; or /apple/hardware/ipod

      The former makes it very hard to find all the products that are made by apple (because you have to search other higher-level categories, like software, etc). The latter makes it very hard to find all hardware products for the same reason (you have to search /ibm/hardware, /microsoft/hardware, etc).

      In a relational system, you'd just say something like "where vendor=apple" or "where category=hardware". The relational organization more closely resembles reality because the hierarchy is 100% artificial.
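
      A small sketch of that asymmetry, using Python's built-in sqlite3. The products table and its rows are invented for the example; the path column stores the first of the two layouts above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (name TEXT, vendor TEXT, category TEXT, path TEXT)")
conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?)", [
    ("ipod", "apple", "hardware", "/hardware/apple/ipod"),
    ("macbook", "apple", "hardware", "/hardware/apple/macbook"),
    ("itunes", "apple", "software", "/software/apple/itunes"),
    ("mainframe", "ibm", "hardware", "/hardware/ibm/mainframe"),
])

def names(sql, param):
    """Run a one-parameter query and return the product names, sorted."""
    return [r[0] for r in conn.execute(sql + " ORDER BY name", (param,))]

# Relational attributes: both questions have the same shape.
by_vendor = names("SELECT name FROM products WHERE vendor = ?", "apple")
by_category = names("SELECT name FROM products WHERE category = ?", "hardware")

# Path layout: the question the hierarchy was built for is a prefix match...
by_prefix = names("SELECT name FROM products WHERE path LIKE ?", "/hardware/%")
# ...while the other one needs a wildcard in front and has to look inside every path.
by_middle = names("SELECT name FROM products WHERE path LIKE ?", "%/apple/%")

print(by_vendor, by_category, by_prefix, by_middle)
```

      Both path queries return the right rows here, but only the prefix form lines up with how the hierarchy is laid out. The leading-wildcard form generally cannot use an ordinary index on path, which is the bias being described.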

      I'm sure you can always find a particular set of queries that run faster on your favorite hierarchical or graph database system, but that's not the point. The point is that it biases the queries you ask (both in terms of ease and also efficiency) so heavily that the mixed queries found in most real businesses will cause problems for a hierarchical/graph database system.

      SQL can't handle most datastructures and complex relations, only very simple one dimensional ones

      That shows ignorance of the relational model. Relations are n-dimensional, where n is the number of attributes.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    81. Re:A time and place for everything by jadavis · · Score: 1

      I just want the fastest one.

      The user just wants the fastest response. So, you have to look at the system as a whole; you can't look at the pieces in isolation and ignore the performance features they offer. Also, you have to look at the many different types of users of a system, not just one particular user.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    82. Re:A time and place for everything by jadavis · · Score: 1

      efficiently take two dimensional data, and return rows that all fit within the specified radius

      Yeah, you're probably talking about the GiST generalized index access method, on top of which everything from spatial search like PostGIS to full text search has been implemented (FTS can also work on top of the GIN generalized index access method). I would recommend looking at PostGIS first, and if you have more specialized needs, you can use another thing built on GiST or build your own.

      but that's not the only factor for me

      Agreed, a database system is more than just the language. The language is a critical part, however.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    83. Re:A time and place for everything by Anonymous Coward · · Score: 0

      in addition, a left hand and right hand number, the root node having a left-hand value of 1 and a right hand value of (number rows * 2),

      And what happens when you need to insert an object (row) into the tree? Or 1,000 objects? Do you now have to adjust all of the right hand numbers? All 5 million of them?

    84. Re:A time and place for everything by jadavis · · Score: 1

      I always wonder what they think is wrong with it.

      Most people who think this fall into one or more of the following categories:
      1. They have specialized needs, very few queries, and don't expect to need more queries or ad-hoc queries later on. Because of this simplicity they have little use for SQL.
      2. They need the other tools offered by an RDBMS, like backup/restore, atomicity, durability, etc., but for some reason don't want to use SQL (probably due to #1).
      3. They have narrow complaints about the language design of SQL, for instance the ugly natural language syntax that's inflexible and inconsistent. This leads to a general dislike of SQL.
      4. They don't understand SQL or relational database theory, and would just prefer that it went away.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    85. Re:A time and place for everything by jadavis · · Score: 1

      So, I'd like to have a standardized SQL syntax.

      The only way that will happen is if people stop using MySQL and SQLite. Those are the two worst offenders when it comes to standard violations.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    86. Re:A time and place for everything by jadavis · · Score: 1

      You cannot easily add new fields to a table

      Some database systems make this easy.

      Migrating large amounts of relational data to a new structure takes a very very long time.

      That's true for any large amount of data. Relational data tends to be much more flexible, so this is less important than in, say, a hierarchical database where reorganization is not only incredibly expensive, but may be a prerequisite for even passable performance of some queries.

      It's difficult to maintain the data integrity... and have acceptable performance

      Again, not specific to relational database systems. It's much more difficult and expensive to maintain data integrity in a hierarchical system unless the constraint and the hierarchy are one and the same.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    87. Re:A time and place for everything by Anonymous Coward · · Score: 0

      I can't hear anyone saying we should program in brainfuck because that's "The Turing Model". Can you?

      I think the 3rd manifesto guys just want some serious implementation that adheres more closely to the relational model. SQL tables are multisets, so clearly SQL didn't make much of an attempt to follow the relational model. The 3rd manifesto guys don't necessarily say that you should follow any bad implementation that exists.

    88. Re:A time and place for everything by Marillion · · Score: 1

      Most of those are meta-data about the sequences. I think relational databases work brilliantly in that context. Especially when you pull data from many of the world-wide institutions (NCBI, EMBL, KEGG), all of whom have their own accession IDs for the same things.

      Another place where relational databases work brilliantly is gene names and symbols. As an example, HOP (Entrez 84525) was renamed to HOPX a few months back. We only store the Entrez ID in our tables and then join to a gene table to get gene symbols. Thus when changes come (and they do) it's easier to re-export the data.

      --
      This is a boring sig
    89. Re:A time and place for everything by uhoreg · · Score: 1

      No, it's linear complexity.

      --

      To get something done, a committee should consist of no more than three persons, two of them absent.

    90. Re:A time and place for everything by celtic_hackr · · Score: 1

      Design an efficient table relating a tree structure.

      Huh? Tree structures are best handled by relational databases, as it is far faster than recursion. Give each row a unique ID and a parent ID, and in addition, a left hand and right hand number, the root node having a left-hand value of 1 and a right hand value of (number rows * 2), the first child node has a left-hand value of one more than the parent's, the right-hand value is one less than the left-hand of a younger sibling.

      Unfortunately, your solution grows geometrically. Let's take as a sample my family tree, which I've traced, incompletely, to 16 generations.
      Number of people in a generation:
      my daughter (1),
      her parents (2),
      her grandparents (4),
      gr-grandparents (8),
      ...
      12th gr-grandparents (2^15 ~ 32,000)
      Using your algorithm, to find any one of those 12th gr-grandparents, her record would have to have over 16000 left sides and over 16000 right sides, creating a solution that grows in time as 2O(2^(n/2)).

      That is not to say that SQL can't do trees efficiently. For an example of a successful algorithm, you might want to check out the source for PHPGedView ( a genealogy web database app that can use SQL ).

      Then design queries to answer questions such as: * Find the nodes in the subtree under B.

      SELECT * FROM rows WHERE left > [left hand value of B] AND right < [right hand value of B]

      Won't work in your scenario as all the grandchildren of B will have different left hand and right hand values. Or your solution grows in time with O(n^2).

      * Find all ancestors of G

      SELECT * FROM rows WHERE left < [left hand value of G] AND right > [right hand value of G]

      This is just so wrong, where do I begin.

      * Find the nearest common ancestor of D and H

      SELECT * FROM rows WHERE left < [lowest left hand value from D,H] AND right > [highest right hand value from D,H] ORDER BY right LIMIT 1

      Still not working.

      Trees are a well-known problem of SQL, but the fact is that SQL can't handle most datastructures and complex relations, only very simple one dimensional ones.

      Are you saying trees are easy or hard? And for more complex systems, that is what JOINs are for. SQL is by far the most powerful way and often the fastest way to manipulate data that I know of. The only time I can recall that I had to use a non-SQL solution that was faster than the SQL solution was a matrix operation.

      Finding degrees of affinity in genealogy IS a MATRIX operation (or can be one)! It can't be solved by a tree alone. Using a tree philosophy, the fastest solution would require you to: traverse a person's family relationships, store that result, traverse the other person's family relationships, then return the relationship with the newest date. Something that can be done very fast in a properly designed SQL dbs [2O(log n) + O(log(O(log n)))]. Unfortunately, your design isn't one of those I would call properly designed. Of course, I've oversimplified the problem by ignoring the complexities of the Real World. Such as: adopted children, multiple parents (step parent families), multiple family relationships as a result of cousin marriages, et cetera. It's really a much more complex problem than your simplistic solution would suggest. Which is why it is such an interesting problem.

    91. Re:A time and place for everything by celtic_hackr · · Score: 1
      Oops! That should have been.

      "Won't work in your scenario as all the grandchildren of B will have different left hand and right hand values. Or your solution grows in time with 2O(2^(n/2))."

      Then design queries to answer questions such as: * Find the nodes in the subtree under B.

      SELECT * FROM rows WHERE left > [left hand value of B] AND right < [right hand value of B]

      Won't work in your scenario as all the grandchildren of B will have different left hand and right hand values. Or your solution grows in time with O(n^2).

    92. Re:A time and place for everything by Hognoxious · · Score: 1

      If the implementations all have a different DOM, then for all practical purposes it's hardly the same language, is it?

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    93. Re:A time and place for everything by TheLink · · Score: 1

      But they do store protein sequences in the pseq table in their RDBMS[1]. While protein sequences are not the same thing as DNA sequences, I believe they're not very different from a database storage and retrieval perspective - just store them as a string.

      And you don't need to use exact string matching on RDBMSes, many are fine with other types of matches. While that's slower than an exact match, it's still doable (and in some cases still indexable).

      [1] One of the pdfs I linked to also gives the address of their DB and website.

      --
    94. Re:A time and place for everything by dna_(c)(tm)(r) · · Score: 1

      2 UPDATES, you forgot lft. And since the tree is bludgeoned into the square table, yes you have to 'traverse' the tree. That UPDATE accesses each row that is affected. My first impression was that every node/row was accessed.

      How big are your trees, and how many concurrent users do you want updating them?

      My conclusion was: OK if update/insert is rare. Not OK when large tree, frequent structural modification, many concurrent users. That I've had to work with very large trees is another story, but it was about billing customers while operations continue, SQLServer stored procedures.

      It's an intelligent hack, but it is not proof that RDBMS/SQL is a best fit for trees.

    95. Re:A time and place for everything by loufoque · · Score: 1

      My bad, I thought this was some kind of self-balancing tree.

    96. Re:A time and place for everything by lawpoop · · Score: 1

      2 UPDATES, you forgot lft. And since the tree is bludgeoned into the square table, yes you have to 'traverse' the tree. That UPDATE accesses each row that is affected. My first impression was that every node/row was accessed.

      Yes, I forgot lft, but correct me if I'm wrong, those updates are done in one fell swoop, no? Here's the corrected query:
      UPDATE trees SET rgt = rgt + 27, lft = lft + 27 WHERE lft > 107 AND tree_id = 34
      I don't know much about how databases write physical data, but I assumed that an update to a single row, regardless of how many columns are updated, counts as a single update.

      How do you have to "traverse" the tree, any more than any other query? Do you mean the B-tree ( or whatever ) physical storage of the data on disk? In that case, any SQL query is a traversal of the "tree". But talking about this logical tree made up of rows in a database, you only need to access and write the rows that would be affected, which are all the rows to the right of the insertion point. In the example above, all rows with lft < 107 are not affected. Are you saying those rows have to be scanned? Well, any SQL queries have to scan the table the query refers to.

      It's an intelligent hack, but it is not proof that RDBMS/SQL is a best fit for trees.

      Well, I never claimed that RDBMS/SQL were the "best" for trees; op did that :) I was just trying to clue people in about what op was talking about. Is it any more a hack than any other data modeling in SQL?

      Can you clue me in to some other data structures that are more efficient when dealing with trees? My initial inclination is that the modified pre-order tree has nothing to do with relational databases or SQL; it's a data structure that just happens to be implemented in SQL in these examples.

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
    97. Re:A time and place for everything by BitZtream · · Score: 1

      You use LDAP for genetic data then? Seems like it would fit perfectly.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    98. Re:A time and place for everything by BitZtream · · Score: 1

      Does your database not have regular expressions? I mean, just looking at it from an utterly simplistic view, sequences stored as text like you've posted (which is not the most efficient way I know) could be searched like you've suggested with a regexp. Of course you could also just use a simple OR ;)

      There are far more efficient ways to store it and search like you want; however, it sounds more like you're just not really all that good with SQL. Which is fine, I'm certainly no expert, that's why I work with someone who is really good with SQL when I need to deal with a SQL DB. Having a clue makes things much easier.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    99. Re:A time and place for everything by BitZtream · · Score: 1

      "this happened after this").

      Guess they never heard of unique timestamps or sequences?

      Unique timestamp is just a timestamp with a numeric sequence tagged on the end if the timestamp happens to not be unique in the table, which is easy to come across on a high traffic db.

      Sequences are just that, auto incrementing unique ids, I don't think you should be allowed to say anything about SQL if you don't know what a sequence is and how it effectively solves the 'this happened after this' problem. Use a single sequence for the entire database and you know EXACTLY what order that every row was added in.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    100. Re:A time and place for everything by dna_(c)(tm)(r) · · Score: 1

      UPDATE trees SET rgt = rgt + 27, lft = lft + 27 WHERE lft > 107 AND tree_id = 34

      I don't know much about how databases write physical data, but I assumed that an update to a single row, regardless of how many columns are updated, counts as a single update.

      I don't know what you mean with the tree_id, but you have to update all nodes that follow the inserted one, so on average, half the nodes in the tree or half the rows in that table. According to your link 1

      UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myLeft; UPDATE nested_category SET lft = lft + 2 WHERE lft > @myLeft;

      E.g. if you take the image with the venn diagrams, and insert a new child "ipod" in "mp3", after "flash" these are the changes:

      • Electronics rgt = 22
      • Portable Electronics rgt = 21
      • MP3 rgt = 16
      • CD rgt = 18, lft = 17
      • 2 Way R. rgt = 20, lft = 19
      • Insert iPod, lft = 14, rgt = 15

      5 of the 10 nodes are updated, and another 3 are updated again. I don't think there is any situation where only one node is affected.
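
      The two-statement version can be checked with a short sketch in Python's built-in sqlite3. The starting lft/rgt values are an assumption, reconstructed from the numbers quoted in this thread rather than copied from the linked article, and insert_after is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nested_category (name TEXT PRIMARY KEY, lft INTEGER, rgt INTEGER)")
conn.executemany("INSERT INTO nested_category VALUES (?, ?, ?)", [
    ("Electronics", 1, 20),
    ("Televisions", 2, 9), ("Tube", 3, 4), ("LCD", 5, 6), ("Plasma", 7, 8),
    ("Portable Electronics", 10, 19),
    ("MP3 Players", 11, 14), ("Flash", 12, 13),
    ("CD Players", 15, 16), ("2 Way Radios", 17, 18),
])

def snapshot():
    """Map each category to its current (lft, rgt) pair."""
    return {name: (lft, rgt) for name, lft, rgt in
            conn.execute("SELECT name, lft, rgt FROM nested_category")}

def insert_after(sibling, new_name):
    """Open a gap of width 2 just right of `sibling`, then fill it."""
    (anchor,) = conn.execute(
        "SELECT rgt FROM nested_category WHERE name = ?", (sibling,)).fetchone()
    with conn:  # one transaction, so readers never see a half-shifted tree
        conn.execute(
            "UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > ?", (anchor,))
        conn.execute(
            "UPDATE nested_category SET lft = lft + 2 WHERE lft > ?", (anchor,))
        conn.execute(
            "INSERT INTO nested_category VALUES (?, ?, ?)",
            (new_name, anchor + 1, anchor + 2))

before = snapshot()
insert_after("Flash", "iPod")
after = snapshot()
changed = sorted(name for name in before if before[name] != after[name])
print(changed)
print(after["iPod"])
```

      Note that the ancestors (MP3 Players, Portable Electronics, Electronics) only need their rgt moved, while the later siblings need both columns moved. That is why a single UPDATE keyed on lft alone would miss the ancestors.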

    101. Re:A time and place for everything by lawpoop · · Score: 1

      I don't know what you mean with the tree_id, but you have to update all nodes that follow the inserted one, so on average, half the nodes in the tree or half the rows in that table. According to your link 1

      With a tree_id, you can store multiple trees in a single table with the appropriate columns. In other words, One table for each tree is a poor use of your relational system. It's like having a separate invoices table for each customer -- just add a customer_id column and you can have all the invoices for all customers in one table.

      It's better to put a tree_id column ( or some other way of distinguishing the trees from each other ) in that table, and you can have as many "virtual" separate hierarchical trees in that table as you like. So then once you do that, tree_id is the way to specify which tree ( or subset of rows in the trees table ) you're talking about.

      So yeah, on average ( if on average, people tend to put new nodes in the middle -- but are trees necessarily symmetrical? ), you would be updating half the rows in your *subset* of tree rows. If you only had one tree in the table, you would be updating half the rows in your table.

      I don't think there is any situation where only one node is affected.

      This would be rare, but if you add a node to the rightmost side of the tree, then the only update you need to do is update the rgt column value of the root node.

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
    102. Re:A time and place for everything by dna_(c)(tm)(r) · · Score: 1

      This would be rare, but if you add a node to the rightmost side of the tree, then the only update you need to do is update the rgt column value of the root node.

      All its ancestors' rgt values change, so there is only one case where no other nodes are affected: inserting the root in the empty tree.

      Thanks for a constructive discussion, I actually learned something on /. today :-)

    103. Re:A time and place for everything by lawpoop · · Score: 1

      All its ancestors' rgt values change, so there is only one case where no other nodes are affected: inserting the root in the empty tree.

      I'm confused -- are you talking about INSERTs or UPDATEs? The only time when there is one transaction -- an INSERT -- is when you insert the root node. The only time when you can INSERT a node while having to make only one UPDATE to keep the tree ordered is if you INSERT one node on the rightmost side of the tree.

      Have a look at this image. '19' is the rgt value of 'Portable Electronics', which is the rightmost node. '20' is the rgt value of 'Electronics', the root node. If you wanted to add one element, 'Gameboys' to the rightmost of the tree, just under the root node, that's one UPDATE:
      UPDATE tree SET rgt = 22 WHERE category = 'Electronics';
      and one INSERT:
      INSERT INTO tree ( category, lft, rgt ) VALUES ( 'Gameboys', 20, 21 );
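
      A rough check of both claims, again sketched with Python's sqlite3 rather than MySQL. The append_last_child helper and the Televisions branch values are assumptions for illustration; the helper adds 'Gameboys' as the rightmost child of a given parent in a fresh copy of the tree and reports how many existing rows the UPDATEs touched.

```python
import sqlite3

# Sample tree before the insert; the Televisions branch values are assumed.
ROWS = [
    ("Electronics", 1, 20), ("Televisions", 2, 9), ("Tube", 3, 4),
    ("LCD", 5, 6), ("Plasma", 7, 8), ("Portable Electronics", 10, 19),
    ("MP3", 11, 14), ("Flash", 12, 13), ("CD", 15, 16), ("2 Way R.", 17, 18),
]

def append_last_child(parent):
    """Add 'Gameboys' as the rightmost child of `parent`; return (rows updated, new row)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tree (category TEXT PRIMARY KEY, lft INTEGER, rgt INTEGER)")
    conn.executemany("INSERT INTO tree VALUES (?, ?, ?)", ROWS)
    (edge,) = conn.execute(
        "SELECT rgt FROM tree WHERE category = ?", (parent,)).fetchone()
    # The parent and every ancestor must widen by 2 to make room.
    updated = conn.execute(
        "UPDATE tree SET rgt = rgt + 2 WHERE rgt >= ?", (edge,)).rowcount
    # Nothing lies to the right of a rightmost insert, so this touches no rows here.
    updated += conn.execute(
        "UPDATE tree SET lft = lft + 2 WHERE lft > ?", (edge,)).rowcount
    conn.execute("INSERT INTO tree VALUES ('Gameboys', ?, ?)", (edge, edge + 1))
    new_row = conn.execute(
        "SELECT lft, rgt FROM tree WHERE category = 'Gameboys'").fetchone()
    return updated, new_row

print("under the root:", append_last_child("Electronics"))
print("under Portable Electronics:", append_last_child("Portable Electronics"))
```

      Directly under the root, only the root row is updated. One level deeper, the parent and the root both change, so the number of UPDATEd rows for a rightmost insert equals the length of the ancestor chain.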

      I realize there may be a distinction between SQL terms and tree manipulation terms, which is why I'm trying to be pedantic. If you are talking about 'updates to the tree' in an abstract sense, as in 'changes to the tree', then yes, there is only one time when there is one update to the tree, when you are INSERTing the root node.

      Thanks for a constructive discussion, I actually learned something on /. today :-)

      Mark this on the calendar! :)

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
    104. Re:A time and place for everything by ardor · · Score: 1

      Obviously, DOM != language. The DOM is more like a standard library.

      --
      This sig does not contain any SCO code.
    105. Re:A time and place for everything by deander2 · · Score: 1

      why not use floats for your left/right values, allowing you to partition your space without requiring the rewrite of all left/right values downwind of your insert?
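
      A small sketch of what that might look like, in plain Python and assuming the lft/rgt columns hold IEEE 754 double-precision values. The between and halvings_until_exhausted helpers are hypothetical names. The first places a new leaf inside an existing gap without touching any other row; the second counts how many times one gap can be bisected before the midpoint collides with an endpoint.

```python
def between(lo, hi):
    """Return (lft, rgt) for a new leaf placed strictly inside the open gap (lo, hi)."""
    gap = hi - lo
    return lo + gap / 3, lo + 2 * gap / 3

def halvings_until_exhausted(lo=1.0, hi=2.0):
    """Count bisections of (lo, hi) toward lo until the midpoint equals an endpoint."""
    count = 0
    while True:
        mid = (lo + hi) / 2
        if mid == lo or mid == hi:
            return count
        hi = mid
        count += 1

# In the sample tree, Flash ends at rgt = 13 and its parent MP3 ends at rgt = 14,
# so a new sibling after Flash has to fit in the gap between 13 and 14.
lft, rgt = between(13.0, 14.0)
print(lft, rgt)
print(halvings_until_exhausted())
```

      So inserts stop rewriting rows downstream, but the gaps are finite: repeated inserts at the same spot eventually run out of precision, and the tree then needs renumbering anyway.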

    106. Re:A time and place for everything by lawpoop · · Score: 1

      That's a good question. I don't think a lot of RDBMSes have floats as native column types. At least MySQL doesn't, anyway.

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
    107. Re:A time and place for everything by lumbercartel.ca · · Score: 1

      The problem is that, from an end-user's perspective (because nearly all JavaScript code is focused on improving functionality for end-users), some web sites simply don't work in all web browsers.

      Explaining to end-users that it's not JavaScript, but that it's the stuff that JavaScript uses which is different, is not nearly as easy as telling the user something like "JavaScript works differently on each web browser."

      I really like the idea of client-side PerlScript. That could open up some really amazing possibilities given how powerful Perl can be.

  3. This is what happens by Anonymous Coward · · Score: 4, Funny

    When you get a lot of morbidly obese nerds with no life to program for you.

    Meanwhile SQL users get laid.

    1. Re:This is what happens by Anonymous Coward · · Score: 5, Funny

      It's true. I do a lot of INNER JOINing. Often with multiple tables.

    2. Re:This is what happens by Anonymous Coward · · Score: 1

      Jocks get to SELECT * FROM sys.tables, so they always get the tables with the lovely columns and big BLOB's. The ones we can access have a lot of constraints, but also integrity.

    3. Re:This is what happens by Anonymous Coward · · Score: 0

      I prefer using the INSERT statement through the backend, myself

    4. Re:This is what happens by ZeRu · · Score: 1

      I use only one table, but try to put all of my data in it so I can end up with a high number when I query it with count(distinct).

      --
      If you post as an AC, don't expect me to spend a mod point on you.
    5. Re:This is what happens by turing_m · · Score: 3, Funny

      I call BS. You are in your mom's basement with one eye watching the door. While you construct intricate combinations of self-joins. Until your fingers cramp up. Because you are too scared to even query another table let alone join with it.

      --
      If I have seen further it is by stealing the Intellectual Property of giants.
    6. Re:This is what happens by Anonymous Coward · · Score: 0

      It's true. I do a lot of INNER JOINing. Often with multiple tables.

      That's just sick. UNIONs are what it's all about. Did enough INNER JOINing in my teenage years.

    7. Re:This is what happens by ZeRu · · Score: 1

      It's true. I do a lot of INNER JOINing. Often with multiple tables.

      It isn't the number of tables that matters, but the number you get when you query them for COUNT(DISTINCT)

      --
      If you post as an AC, don't expect me to spend a mod point on you.
  4. Don't Like Traditional Relational Databases? by ChoboMog · · Score: 3, Funny

    Go fork yourself!

    1. Re:Don't Like Traditional Relational Databases? by CorporateSuit · · Score: 2, Funny

      It seems an idiot has modded you down because they don't understand very basic database expressions.

      No need to get mad at Slashdot's mod point system, because, after all, if they outlaw giving mod points to stupids, then only stupid outlaws will have mod points... or something like that.

      --
      I am the richest astronaut ever to win the superbowl.
  5. Tilting at windmills by Anonymous Coward · · Score: 5, Insightful

    Seems to be a silly thing to be against. Relational databases and the structured query language may not be perfect, but I bet these people could die in their 90's and people will still be using relational dbs and sql.

    If you want to tout open or cheap dbs and more lightweight types of storage/db servers, then they might have some points, but being against sql is just plain dumb.

    1. Re:Tilting at windmills by Qzukk · · Score: 5, Insightful

      SQL isn't the only way possible to query relational databases. It's nice and does a really good job for even mildly complex queries and I would not want to ditch it just yet, but seriously... who hasn't had a business need for multiple levels of aggregates (eg averages of sums across multiple groupings, say "average across all customers' total balances")? As it is, you end up splitting the logic between the database and the application, or creating a view of the first level of aggregation, then querying against that and hoping that the performance doesn't suck total ass.

      --
      If I have been able to see further than others, it is because I bought a pair of binoculars.
    2. Re:Tilting at windmills by profplump · · Score: 2, Insightful

      I agree, there are problems SQL doesn't solve well. But I think it's unlikely that other, better solutions to those problems will also be superior to SQL where it *does* perform well. As such, "no SQL" is probably not the right plan any more than "SQL only".

    3. Re:Tilting at windmills by Anonymous Coward · · Score: 0

      As it is, you end up splitting the logic between the database and the application

      I always thought that was intentional. The DBMS is for data integrity and access. Business logic belongs in the applications.

    4. Re:Tilting at windmills by Anonymous Coward · · Score: 0

      (...) who hasn't had a business need for multiple levels of aggregates (eg averages of sums across multiple groupings, say "average across all customers' total balances") (...)

      I think it's called partitioning. Recent versions of PostgreSQL have it and I think Oracle has it too.

    5. Re:Tilting at windmills by CrashandDie · · Score: 1

      Cheaper and more lightweight than Oracle?

      Next thing we're going to hear people wanting a free DBMS...

    6. Re:Tilting at windmills by Strudelkugel · · Score: 3, Informative

      OLAP was designed to answer that type of question. MDX is the language used to perform multi-dimensional queries.

      --
      Imagine how much harder physics would be if electrons had feelings! -Feynman, maybe
    7. Re:Tilting at windmills by quantum+bit · · Score: 1

      who hasn't had a business need for multiple levels of aggregates (eg averages of sums across multiple groupings, say "average across all customers' total balances")

      Funny you should mention that. Window functions in the SQL:2003 standard address that need, and there was an article on Slashdot earlier today about PostgreSQL 8.4 being released with support for them. Oracle has for a while now.

    8. Re:Tilting at windmills by SanityInAnarchy · · Score: 1

      I bet these people could die in their 90's and people will still be using relational dbs and sql.

      They'll probably still be using FORTRAN and COBOL. If your only argument is job security, you win.

      being against sql is just plain dumb.

      And making a blanket statement like that is just plain uninformed.

      --
      Don't thank God, thank a doctor!
    9. Re:Tilting at windmills by djbckr · · Score: 1

      ... who hasn't had a business need for multiple levels of aggregates (eg averages of sums across multiple groupings, say "average across all customers' total balances")

      Oracle supports queries like this in their SQL natively. Other vendors are working on it... Just sayin'

    10. Re:Tilting at windmills by tepples · · Score: 1

      who hasn't had a business need for multiple levels of aggregates (eg averages of sums across multiple groupings, say "average across all customers' total balances")

      MySQL, MSSQL, and SQLite all support sub-selects. I'm not in front of a sample database right now to test it, but the SQL code should look like this:

      SELECT AVG(TotalBalance)
      FROM (
      SELECT CustomerID, SUM(Balance) AS TotalBalance
      FROM Accounts
      GROUP BY CustomerID
      ) AS AccountTotals
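
      For what it's worth, a quick way to try the query above without a server is Python's sqlite3 module. This is only a sketch: the Accounts rows below are made-up sample data, and the plain AVG(Balance) is included just to show that the two averages differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (CustomerID INTEGER, Balance REAL)")
# Made-up sample data: customer 1 holds two accounts, customer 2 one, customer 3 two.
conn.executemany("INSERT INTO Accounts VALUES (?, ?)",
                 [(1, 100), (1, 200), (2, 100), (3, 50), (3, 150)])

# Average of the per-customer totals (the two-level aggregate).
avg_of_totals = conn.execute("""
    SELECT AVG(TotalBalance)
    FROM (
        SELECT CustomerID, SUM(Balance) AS TotalBalance
        FROM Accounts
        GROUP BY CustomerID
    ) AS AccountTotals
""").fetchone()[0]

# Single-level average over individual accounts, for comparison.
avg_per_account = conn.execute("SELECT AVG(Balance) FROM Accounts").fetchone()[0]

print(avg_of_totals, avg_per_account)
```

      With this data the per-customer totals are 300, 100 and 200, so the average across customers is 200, while the average across individual accounts is 120.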

    11. Re:Tilting at windmills by TimothyDavis · · Score: 1

      I have used indexed views in MS-SQL to do this. There are quite a few limitations (including which SKUs support this), but essentially you can pre-aggregate the data on a view which itself is indexed.

    12. Re:Tilting at windmills by Foofoobar · · Score: 1

      You are confusing the purpose of the database. The database is used to store and retrieve data, not process it. Heavy processing leads to a slower database and slower queries. Heavy processing SHOULD be done by the application.

      --
      This is my sig. There are many like it but this one is mine.
    13. Re:Tilting at windmills by TheRaven64 · · Score: 2, Interesting
      I was at an HPC talk a few years ago, where the speaker said:

      I don't know what kind of language people will be using for HPC programming in 20 years. I don't know the features it will have. I do know that it will be called Fortran.

      I wouldn't be surprised if the same applies to SQL. The language has evolved a lot over the years, to better express different kinds of data. In 20 years time, I wouldn't be surprised if the most commonly-used subset of SQL is nothing like the subset currently popular, but I'd be surprised if the thing to replace SQL isn't called SQL.

      --
      I am TheRaven on Soylent News
    14. Re:Tilting at windmills by Anonymous Coward · · Score: 0

      Postgres does too. And when you have a billion invoices, this takes a while. And then once it's done, you're running another query on top of that.

      Postgres has a lot of nonstandard stuff that needs to be standardized. Want to know each customer's most recent invoice in standard SQL? Tough shit! You can

      SELECT customer.name, invoice.id, invoice.date, invoice.owes
      FROM customer
      JOIN invoice ON invoice.customer=customer.id
      JOIN (SELECT inv.customer, MAX(inv.date) as date FROM invoice inv GROUP BY inv.customer) AS latestinvoices
      ON latestinvoices.customer=customer.id AND latestinvoices.date=invoice.date

      and hope that nobody placed two orders on the same day. And wait for both passes through the invoice table. If you're not using a replication system that affects row numbering, you might (assuming that your dbm guarantees monotonically increasing numbers even in the event of rollbacks and other session chicanery) get away with MAX(inv.id) but most replication systems assign per-server pools of row numbers (either even/odd or blocks of N000) to prevent conflicts.

      But wait! With postgres, I can

      SELECT DISTINCT ON (customer.name) customer.name, invoice.date, invoice.id, invoice.owes
      FROM customer
      JOIN invoice ON invoice.customer=customer.id
      ORDER BY customer.name ASC, invoice.date DESC

      , retrieving the information I want in a single pass through the joined table thanks to postgresql's "halfway distinct" query mode. This is not the same as windowed functions, which still return all of the rows of a group (and which according to postgres's docs, is done after WHERE and HAVING, meaning that it looks like I can't do something like

      WHERE invoice.id=first_value(invoice.id) OVER (PARTITION BY invoice.customer ORDER BY invoice.date DESC))
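
      To make the same-day problem with the join-on-MAX(date) query concrete, here is a small sketch using Python's sqlite3 module. The sample rows are invented, and the second query is just one possible workaround, not the DISTINCT ON approach: it picks one invoice per customer with a correlated subquery and uses invoice.id as an arbitrary but deterministic tie-breaker. LIMIT is not standard SQL, and as noted above the id is not guaranteed to reflect insertion order under replication.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer INTEGER, date TEXT, owes INTEGER);
    INSERT INTO customer VALUES (1, 'Joe'), (2, 'Jim');
    -- Invented data: Joe has two invoices on the same (latest) day.
    INSERT INTO invoice VALUES
        (1, 1, '2009-06-01', 100),
        (2, 1, '2009-07-01', 250),
        (3, 1, '2009-07-01', 75),
        (4, 2, '2009-05-15', 80);
""")

# The join against MAX(date): ties on the latest date produce duplicate customers.
join_on_max_date = conn.execute("""
    SELECT customer.name, invoice.id, invoice.date, invoice.owes
    FROM customer
    JOIN invoice ON invoice.customer = customer.id
    JOIN (SELECT inv.customer, MAX(inv.date) AS date
          FROM invoice inv GROUP BY inv.customer) AS latestinvoices
      ON latestinvoices.customer = customer.id AND latestinvoices.date = invoice.date
    ORDER BY customer.name, invoice.id
""").fetchall()

# Correlated subquery: exactly one invoice per customer, ties broken by highest id.
tie_broken = conn.execute("""
    SELECT customer.name, invoice.id, invoice.date, invoice.owes
    FROM customer
    JOIN invoice ON invoice.customer = customer.id
    WHERE invoice.id = (SELECT inv.id FROM invoice inv
                        WHERE inv.customer = customer.id
                        ORDER BY inv.date DESC, inv.id DESC LIMIT 1)
    ORDER BY customer.name
""").fetchall()

print(join_on_max_date)
print(tie_broken)
```

      The first result lists Joe twice; the second lists each customer once.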

    15. Re:Tilting at windmills by Qzukk · · Score: 1

      The problem with WINDOW/PARTITION functions is that (in postgres) they're applied after WHERE and HAVING, meaning that the value appears on every row of the table. In other words I could get

      cust. | avg_charge
      Joe | $1000
      Joe | $1000
      Joe | $1000
      Joe | $1000
      Jim | $80
      Jim | $80

      etc. if there are 4 invoices averaging $1000 for Joe. This isn't fatal; I just have to be sure that I order the table by customer and discard the unwanted rows in my application. Postgres's docs don't say whether I could use DISTINCT on it, but I suspect that I could take care of it there.

      There are other problems with SQL's aggregation model too: it's usually not possible to join a sum to a sum. Let's say that I've broken payments out into a separate table. (invoice.paid just doesn't cut it anymore when people split the check between two credit cards.) Now the problem is figuring out whether SUM(invoice.owes)=SUM(payment.paid), especially in the case of multiple payments, since just joining invoices against payments would result in an internal table like

      cust | owes | paid
      Jim | $1000| $250
      Jim | $1000| $750

      Where sum(owes) would be $2000 (even though there's only one invoice) and sum(paid) would be correct. Of course, subselects "fix" this by essentially creating a table of customers and total owed, and customers and total paid, but that's a lot of work for thousands of customers with tens of thousands of invoices, just to find the two delinquent customers.
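
      The double counting described above is easy to reproduce. This sketch uses Python's sqlite3 module with invented rows: Jim's single $1000 invoice is paid in two parts, and a hypothetical second customer, Joe, still owes money. The table layout (payment.invoice pointing at invoice.id) is an assumption. The second query is the subselect "fix": each side is summed per customer before the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer TEXT, owes INTEGER);
    CREATE TABLE payment (id INTEGER PRIMARY KEY, invoice INTEGER, paid INTEGER);
    INSERT INTO invoice VALUES (1, 'Jim', 1000), (2, 'Joe', 500);
    INSERT INTO payment VALUES (1, 1, 250), (2, 1, 750), (3, 2, 100);
""")

# Naive join: Jim's invoice row is repeated once per payment, inflating SUM(owes).
naive = conn.execute("""
    SELECT invoice.customer, SUM(invoice.owes), SUM(payment.paid)
    FROM invoice JOIN payment ON payment.invoice = invoice.id
    GROUP BY invoice.customer
    ORDER BY invoice.customer
""").fetchall()

# Aggregate each side per customer first, then join the two small result sets.
delinquent = conn.execute("""
    SELECT owed_totals.customer, owed_totals.total, COALESCE(paid_totals.total, 0)
    FROM (SELECT customer, SUM(owes) AS total
          FROM invoice GROUP BY customer) AS owed_totals
    LEFT JOIN (SELECT invoice.customer AS customer, SUM(payment.paid) AS total
               FROM payment JOIN invoice ON invoice.id = payment.invoice
               GROUP BY invoice.customer) AS paid_totals
      ON paid_totals.customer = owed_totals.customer
    WHERE owed_totals.total <> COALESCE(paid_totals.total, 0)
""").fetchall()

print(naive)
print(delinquent)
```

      The naive join reports Jim as owing $2000 against $1000 paid, while the pre-aggregated version returns only the customer whose totals really differ. It still has to sum every customer on both sides first, which is the cost being complained about.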

      --
      If I have been able to see further than others, it is because I bought a pair of binoculars.
    16. Re:Tilting at windmills by tepples · · Score: 1

      Postgres does too. And when you have a billion invoices, this takes a while.

      Some SQL servers' query optimizers seem to handle subqueries efficiently. If EXPLAIN shows a plan with huge temporary tables, you can shrink these temporary tables by including a few basic sanity checks in your innermost WHERE clause, such as checking only those orders placed in the past year.

      Want to know each customer's most recent invoice in standard SQL? Tough shit!

      In pretty much every invoicing system I've seen, each invoice has an auto-incrementing primary key. For example, in Transact-SQL, you can find the most recent invoice for each customer who has placed an order in the past year or so:

      SELECT CustomerID, MAX(OrderNumber)
      FROM Orders
      WHERE DATEDIFF(day, OrderDate, CURRENT_TIMESTAMP) <= 366
      GROUP BY CustomerID

      and hope that nobody placed two orders on the same day.

      Even without an auto-incrementing primary key, they'd have to be in the same second, not the same day.

      and which according to postgres's docs, is done after WHERE and HAVING

      If you want something done before WHERE and HAVING, put it in a subquery.

  6. Protest taxes, not databases by Anonymous Coward · · Score: 0, Troll

    Too bad they can't protest the current regimes taxes with as much enthusiasm. At least it would be a protest against something that actually matters.

  7. Flat Earth by Seumas · · Score: 3, Insightful

    I've seen strong reactions from various camps with regard to concern over saying no to SQL. I'm not sure why people freak out over it. First, you have to strike out toward new things if you want to progress the world. Second, SQL hasn't caused people to stop using spreadsheets or Access databases. Third, there are groups that get together to dispute that the earth is round; insisting that it is flat. Or that gray aliens are visiting earth regularly and probing our anuses.

    Bring on the next fascinating data technology. SQL will continue to have a major place for many years to come, no matter what happens.

    1. Re:Flat Earth by syzler · · Score: 3, Interesting

      I've seen strong reactions from various camps with regard to concern over saying no to SQL.. Third, there are groups that get together to dispute that the earth is round; insisting that it is flat.

      Corporations represented in this group included the likes of Google, Last.fm, Amazon, and Facebook. Hardly the same caliber of people who claim the earth is still flat. I'm inclined to listen to engineers from these companies if they say that an SQL database does not scale well for vast amounts of data.

    2. Re:Flat Earth by MightyMartian · · Score: 2, Interesting

      The whole thing is just reactionary mumbo-jumbo. There are kinds of data that relational databases are fantastic for, and kinds of data they're not, and sometimes none of it is exactly perfect. SQL is actually a pretty damned good, single-purpose language. It's not hard to learn, and once you learn it, the differences between RDBMS implementations become a little like Javascript, just something you have to put up with, not that a lot of people actually have to worry all that much about writing fully-portable SQL queries.

      --
      The world's burning. Moped Jesus spotted on I50. Details at 11.
    3. Re:Flat Earth by MightyMartian · · Score: 5, Insightful

      And yet where are the other corporations: the oil companies, the banks, the large merchant conglomerates? In IT we seem to have this sort of myopic view that if it isn't an IT company of some kind, it doesn't exist. Google, as compared to the huge companies that use tools like Oracle, is a bit player. I know that's hard to deal with for all of us who have sucked at the teat of Silicon Valley for so long, but a significant amount of data work that has nothing to do with social networking and finding pr0n goes on and does use tools like SQL.

      --
      The world's burning. Moped Jesus spotted on I50. Details at 11.
    4. Re:Flat Earth by Clover_Kicker · · Score: 2, Funny

      You're not going to get many page hits with an attitude like that...

    5. Re:Flat Earth by moderatorrater · · Score: 1

      Agreed. SQL is a generalized solution that works well for a lot of different things and works extremely well for a subset of those things. For other applications (like indexing the internet), more specialized solutions are going to kick its ass. It's the same way as any programming you do: the easier and more general the tool, the more you sacrifice for it in terms of speed, efficiency, scalability, whatever.

    6. Re:Flat Earth by Vellmont · · Score: 1


      I'm inclined to listen to engineers from these companies if they say that an SQL database does not scale well for vast amounts of data.

      This statement, taken as a whole is pure nonsense. "Databases" scale quite well for "vast" amounts of data. There's retailers that store millions of transactions a day on relational databases that would be out of business if they didn't.

      If I had to guess, I'd say that relational databases might not be a great solution for a quickly evolving web company with possibly constantly changing data structures and new requirements being added. Doing all that glue code sucks, and patchwork solutions like Hibernate aren't much better (and IMO worse).

      It shouldn't be surprising that a tool developed for one purpose isn't well suited to all purposes. Creating some kind of "movement" out of it is about as stupid as being against hammers in favor of screwdrivers. Down with hammers! Yeah screwdrivers!!

      --
      AccountKiller
    7. Re:Flat Earth by hachete · · Score: 1

      I work for a financial company and if the rest use their bright shiny Oracle databases like we do - and I don't think we're atypical - then, no, they have no idea how to use a database. Or build applications. At all. Not a clue. I can't begin to describe the inability, the sheer awesome crap-ness of what they do. The amount of work-arounds that the programmers implement to short-circuit the crap-ness. Really, you have no idea what you're talking about.

      --
      Patriotism is a virtue of the vicious
    8. Re:Flat Earth by kraut · · Score: 1

      Actually, the oil companies almost certainly have huge amounts of non-SQL data; I'm not sure whether seismology data comes in HDF, but it certainly doesn't come in SQL ;) Ditto banks have enormous amounts of non-SQL market data in specialised tick databases. That doesn't stop them from also having other important systems using SQL.

      Vice versa, I'm pretty sure that while Google doesn't store its petabytes of web indexing info in a relational database (why on earth would you?), I'm equally sure that its billing, accounting and HR systems use relational databases; why on earth wouldn't they? Same thing applies to Amazon.

      Horses for courses may be an old saying, but it's still true.

      --
      no taxation without representation!
    9. Re:Flat Earth by syzler · · Score: 1

      In IT we seem to have this sort of myopic view that if it isn't an IT company of some kind, it doesn't exist.

      I understand that not all companies that maintain large data sets are technology companies. My only point was that when a group of companies known to manage large sets of data say that SQL does not always fit the bill, then I am inclined to listen rather than calling them nuts.

    10. Re:Flat Earth by Anonymous Coward · · Score: 1, Interesting

      You sure there's absolutely no difference between the nature of a bank and the nature of a massive search engine?

      And how sure are you that a bank's IT staff are on the leading edge of innovative technologies? If anything, they lag behind because it's "safer" than risking the untested new thing.

      Try a few of the Post-Relational databases, read up on the CAP Theorem, understand the -nature- of the problem you're talking about, and then come back.

      Or I'll save you some time. RDBMS systems focus on Consistency, and trade Availability for it. Your bank's computer can be down for an hour... inconvenient, but acceptable. But they cannot, under ANY circumstances, be incorrect. Period. Google, on the other hand, can handle some slightly incorrect data... but being offline is totally unacceptable.

      Amazon's CTO gave a great example. He talked about how a Shopping Cart must have Availability, and slight inconsistencies in the data as that data propagates a network are acceptable. In the end, the data is eventually consistent anyways, and you NEVER want your customer to not be able to add a cart item. The checkout, however, is financial, and heavily needs Consistency. Alternatively, after the order is done, the list of past transactions again can lose consistency a tiny bit (since it's read-mostly anyways) in exchange for always being up.

      Hmm... more to the issue than you thought? XD

    11. Re:Flat Earth by Vellmont · · Score: 3, Informative

      And so you're saying this is all the fault of the relational database, and would all be solved by using some sort of object based database? That's the topic at hand here, not developers dealing with legacy systems patched together.

      --
      AccountKiller
    12. Re:Flat Earth by fabs64 · · Score: 1

      I look to the oil companies to innovate in drilling technologies.
      I look to financial companies to hopefully not innovate too much anywhere :-)
      I look to IT companies to innovate in IT.

      I dunno about you, but I've seen an incredible amount of money spent in the last 10 years or so attempting to change those massive relational databases into formats that can be reported on, as well as huge amounts of energy put into moving from one relational schema to another.

      Pretending the big conglomerates present the best answer just because they're big is a recipe for non-movement.

    13. Re:Flat Earth by leenks · · Score: 1

      There's retailers that store millions of transactions a day on relational databases that would be out of business if they didn't.

      He said vast...

    14. Re:Flat Earth by Threni · · Score: 2, Informative

      > Second, SQL hasn't caused people to stop using spreadsheets or Access databases

      If it weren't for SQL there wouldn't be any Access databases...

    15. Re:Flat Earth by SanityInAnarchy · · Score: 1

      I think the point being made was that a successful IT company like Google is probably much better positioned to know WTF it's doing with regards to IT stuff. Oracle clients may have no clue.

      And no, "object-based" is not the only alternative.

      --
      Don't thank God, thank a doctor!
    16. Re:Flat Earth by jedidiah · · Score: 1, Interesting

      The RDBMS wasn't the first thing. If you bother to crack open a text book and review the history, you might find a data storage model that's better suited to your problem.

      If your problem is "big", any solution is bound to appear overly complex and too expensive. That's just how the solutions to big problems tend to work out.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    17. Re:Flat Earth by wizzat · · Score: 1

      This statement, taken as a whole is pure nonsense. "Databases" scale quite well for "vast" amounts of data. There's retailers that store millions of transactions a day on relational databases that would be out of business if they didn't.

      A couple of comments:
      - Databases do not in fact scale well to "vast" amounts of data. In fact, stating that they scale to vast amounts of data makes me think that either you've essentially got an unlimited hardware budget to throw at your problems (and even that only takes you so far) or you've never actually dealt with something that *is* a vast amount of data.
      - Millions of rows per day is barely a noticeable amount of data... let alone "vast"

      If I had to guess, I'd say that relational databases might not be a great solution for a quickly evolving web company with possibly constantly changing data structures and new requirements being added. Doing all that glue code sucks, and patchwork solutions like Hibernate aren't much better (and IMO worse).

      Actually, I'd argue the opposite. A relational database is exactly the way someone in that situation should go unless they know from the start that they will be seeing "vast" amounts of data (hundred million rows/day, etc). You can think of SQL as a really high level programming language for data access, complete with very mature libraries for delivering that data to whatever language you should choose to write your app in. Think of it like Rapid App Development.

      However, there will (sometimes) come a time when you've pretty well exceeded what a database is going to do for you (I'll spare you the details of how painful this process will be when you discover where that limit is). Then you either need to distribute your database (Greenplum, RAC, etc) or you need to reevaluate why you're keeping it in a relational database in the first place. Where I work, where we work with marginally "vast" amounts of data (~10TB of online data for the last 6 months I think) we chose to get our tushes largely out of the database for data processing.

      It shouldn't be surprising that a tool developed for one purpose isn't well suited to all purposes. Creating some kind of "movement" out of it is about as stupid as being against hammers in favor of screwdrivers. Down with hammers! Yeah screwdrivers!!

      Actually, I find it funny that you assume they're trying to use the wrong tool for the job. Like you said: it shouldn't be surprising that a tool developed for one purpose (relational databases) isn't well suited to all problems (actually vast amounts of data) in its problem domain. Think of it like this: you wouldn't use a wrench to change your car tire (though you probably *could*). You'd use a 4-way, or maybe even an air gun. SQL can be just as underwhelming as said wrench when dealing with actually vast amounts of data, specifically for the same reason I'd pick it when dealing with lesser amounts of data: it's simple and fast to write.

    18. Re:Flat Earth by cervo · · Score: 1

      Sybase is pretty popular on Wall St. too.

    19. Re:Flat Earth by Anonymous Coward · · Score: 0

      included the likes of Google, Last.fm, Amazon, and Facebook. Hardly the same caliber of people who claim the earth is still flat.

      Actually, economists advising those organizations publicly claim the world is flat...and getting flatter! (from an economic perspective)

    20. Re:Flat Earth by hunangarden · · Score: 1

      The whole thing is just reactionary mumbo-jumbo

      Well if by "the whole thing" you mean this /. topic then you are right. Here's some non-reactionary things we could discuss on this topic:

      • What are some examples of when a non-RDBMS solution is better than RDBMS and SQL?
      • What are some examples of the opposite?
      • Is there a way we can generalize these so that it's easier to identify the class of problems where each solution would be optimal?
      • What are some non-SQL/non-RDBMS solutions out there, that are generalizable to a large class of problems, and don't require developers to roll their own system for each different problem?
    21. Re:Flat Earth by Anonymous Coward · · Score: 0

      SQL does not scale for the *TYPE* of data they are trying to use.

      They have thousands of rows of lists of reports. They cannot send a user a list of 20000k records, so they need to have pages. The problem with this is if you want to send page 18 of 5382: what records end up on that page? You need at least pages 1-17 to figure out what 18 should be. With a 'static' environment you can precalculate it and you are done. But they have dynamic environments, where people pick 20 preferences and what they already bought is not in the list. They will find that SQL doesn't do any better than the 'other' approaches. You STILL need pages 1-17. What they are seeing is a problem in their data. How do you keep those pages around? How long? As people 'page' through things, do you run more and more expensive queries? Blaming SQL for this problem is akin to me blaming the hammer that helped build my house for a leak in the sink, when I should blame the builder for not putting the pipe together correctly...

      They have a O(n^m) problem and are trying to solve it in O(log(n)). It just isnt going to happen. It is NP problem. NP problems do not scale very well. Blaming SQL for that is the wrong approach. They should be working on minimizing the cost of doing those searches. By saving them off and reusing them in some way. That way even though you pay the cost for it you are not doing it over and over.

      I would be a little frightened working with people who do not understand that. Especially if they are in corps THAT big...
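
      The paging cost described above can be sketched with SQLite from Python. The `reports` table, its columns, and the page size are hypothetical. The sketch only shows that an `OFFSET` query makes the engine step over every earlier row, while a keyset ("seek") query resumes from the last key the client saw, assuming a stable, indexed sort key. It does not solve the dynamic-preference case raised above.

```python
import sqlite3

PAGE_SIZE = 20  # hypothetical page size

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO reports (id, title) VALUES (?, ?)",
    [(i, f"report {i}") for i in range(1, 1001)],
)

def page_by_offset(page_number):
    """Page N via OFFSET: the engine still steps over pages 1..N-1."""
    offset = (page_number - 1) * PAGE_SIZE
    return conn.execute(
        "SELECT id, title FROM reports ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, offset),
    ).fetchall()

def page_after(last_seen_id):
    """Keyset paging: seek directly past the last id the client saw."""
    return conn.execute(
        "SELECT id, title FROM reports WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, PAGE_SIZE),
    ).fetchall()

page_18_offset = page_by_offset(18)
# The client remembers the last id of page 17 (here 17 * 20 = 340).
page_18_keyset = page_after(340)
```

      Both calls return the same rows here; the difference is in how much work the engine does to reach them, and keyset paging gives up the ability to jump to an arbitrary page number.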

    22. Re:Flat Earth by Anonymous Coward · · Score: 0

      I'm pretty sure that Google's largest data store is larger than, say, Chevron's. And also vastly different. They've "invented" (er, rediscovered, really) a lot of technology that works well for their data and applications. In fact, trying to do what Google mainly does using SQL would be like attacking a lunatic asylum with a banana: useless but good for a laugh. But I'll wager good money that Google's backend financial systems are running on Oracle. Probably on Solaris.

      SQL and RDBMSs for what they're good for and !SQL||RDBMSs for what they're not. Oh no! The world's going to end. Ahhhhhhhhhhhhhhh.

    23. Re:Flat Earth by lgw · · Score: 1

      This statement, taken as a whole is pure nonsense. "Databases" scale quite well for "vast" amounts of data. There's retailers that store millions of transactions a day on relational databases that would be out of business if they didn't.

      Millions of rows per day is a modest amount of data. If you're not measuring data in PB, you're small fry. An RDBMS doesn't scale to the needs of Google, or Visa, or other big players. Even in my own small-scale work, I run into scalability problems at just millions of rows, which is really pathetic.

      It will take a movement like this to get people to realize that screwdrivers *exist*. You'd be amazed how many people don't realize that there are other approaches to databases!

      --
      Socialism: a lie told by totalitarians and believed by fools.
    24. Re:Flat Earth by Vellmont · · Score: 1

      I really have no idea how many rows/day the likes of Walmart throw into a database. 100 million a day wouldn't surprise me. I just have a hard time believing that Google/Amazon/ are the biggest DB users in the world.

      (btw. throwing around the word "vast" like it has some specialized meaning outside of some small group of people is just incredibly wrong)

      --
      AccountKiller
    25. Re:Flat Earth by gd2shoe · · Score: 1

      "When the only tool you have is a hammer..."

      Part of the problem is the fact that SQL is being used for all types of data, even data that RDBMSs are not designed to handle. All we have is a hammer, and I, for one, am tired of bashing in screws with it!

      --
      I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    26. Re:Flat Earth by Tablizer · · Score: 1

      ...who have sucked at the teat of silicon valley for so long have a hard time dealing with, but a significant amount of data that has nothing to do with social networking and finding pr0n goes on and does use tools like SQL.


          SELECT mpeg AS movie
          FROM porn
          WHERE cupSize >= 'DD'
              AND hairLength > 12
              AND hornyPercent > 80.0
              AND (doesBJ='Yes' OR doesAnal='Yes')
              AND age BETWEEN 18 AND 28
          ORDER BY price ASC

      Do you still want to give up SQL?

    27. Re:Flat Earth by wizzat · · Score: 1

      I really have no idea how many rows/day the likes of Walmart throw into a database. 100 million a day wouldn't surprise me. I just have a hard time believing that Google/Amazon/ are the biggest DB users in the world.

      There are two kinds of data stores: those that simply store data and those that retrieve it as well. I very seriously doubt that Walmart needs fast random access into the database. Consider:
      - Walmart (as of last reckoning, Aug 2007) had a 4PB database and handles 276M+ events/day
      - Yahoo handled (as of May 2008) 24 billion events per day over a 2 PB database. For reference, Visa handled 50 million events per day, and the NYSE 225M/day.
      - Google seems to be secretive about their db size, but indexes at least 3x as much as Yahoo (implying at least a 6PB database just for web searches... totally neglecting gmail, youtube, etc), and accounts for roughly half of all internet searches. And they store every bit of it.

      So, I guess what I'm saying is that it doesn't matter whether you have a difficult time believing that companies like that are the biggest DB users in the world.. they are. Saying otherwise simply because you have a "hard time believing it" is roughly equivalent to sticking your head in the sand.

      (btw. throwing around the word "vast" like it has some specialized meaning outside of some small group of people is just incredibly wrong)

      I would say it's actually quite correct. Judging from your estimation of 100M rows/day as being huge, and from the email in my inbox, I'd say that most people consider a few hundred gig to be a really large database and a terabyte to be a virtually inconceivable amount of data.

      So yeah, I'd say that the people that deal with petabytes of data have a rather different definition of "vast". Arguably, one that you and I really don't comprehend... but I comprehend enough of it not to doubt them when they say that the relational model breaks down with a hundred TB almost no matter how much hardware you throw at it!

      Sources:
      - http://www.reviewlab.net/2008/05/23/size-matters-yahoo-claims-2-petabyte-database-is-worlds-biggest-busiest/
      - http://storefrontbacktalk.com/story/080307walmart.php
      - http://www.businessintelligencelowdown.com/2007/02/top_10_largest_.html

    28. Re:Flat Earth by shutdown+-p+now · · Score: 1

      SQL is actually a pretty damned good, single-purpose language.

      I disagree. SQL is actually a rather crappy language for what it does - unnecessarily verbose, lots of weird corner cases and exceptions, very unobvious solutions to some rather common problems, etc. In comparison, something like Date's language (as implemented in e.g. Suneido) is far cleaner...

      C is also a crappy language. It has messy grammar, hard both for humans to read and for compilers to parse (a feat only surpassed by C++ and Perl), necessitating tools for something that really shouldn't require one. It has some totally unused and pointless features, such as separate namespaces for structs/unions/enums, or the "auto" keyword. It has weird unsigned arithmetic rules. You can write code in it that puts Perl to shame. Truly, Modula-2 is a far better system programming language...

      But here's the thing. Everyone knows C. There are dozens of compilers, IDEs and associated tools - every platform has one targeting it. There are hundreds of books and thousands of tutorials, and millions of programmers who know it. C API and ABI are de facto standards for reusable libraries and FFI. All this because, really, C is good enough. It may be messy, but it does mostly everything that needs doing. And in areas where it doesn't do so good, it's easier to extend it than reinvent the wheel.

      Same mostly goes for SQL. It's a mammoth spec now, very complicated and not truly implemented by anyone. But the real-world subset works. For all the quirks you have to learn, it does the job, and most SQL skills actually do transfer between implementations. Again, where there are pain points that are just too inconvenient it's easier to extend the language (and, hopefully, eventually standardize such extensions) than to start from scratch.

    29. Re:Flat Earth by dodobh · · Score: 1

      RDBMSes don't do non-relational data very well. Most of the names mentioned above don't deal with too much relational data in the first place.

      --
      I can throw myself at the ground, and miss.
    30. Re:Flat Earth by The_reformant · · Score: 1

      Their data is inherently low value though, and data corruption is largely unimportant. I bet you'd end up with the exact opposite view if you looked at banks and financial institutions.

      --
      I have discovered a truly remarkable sig which this post is too small to contain.
    31. Re:Flat Earth by dzfoo · · Score: 1

      Thom, is that you?

      OHAI!

              -dZ.

      --
      Carol vs. Ghost
      ...Can you save Christmas?
    32. Re:Flat Earth by Anonymous Coward · · Score: 0

      It seems to me every time I go do business in person at one of those non-IT places, someone takes my name or account number then talks about how slow their computer is for a few minutes while we wait for my record to come up.

    33. Re:Flat Earth by jadavis · · Score: 1

      First, you have to strike out toward new things if you want to progress the world.

      In this case, people are striking out towards things even older than SQL: key-value and graph database systems. Just because it has a new name doesn't make it new.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    34. Re:Flat Earth by jadavis · · Score: 1

      to change those massive relational databases into formats that can be reported on

      Change them into what? Key value stores? Graphs? Motion is not necessarily progress.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    35. Re:Flat Earth by Anonymous Coward · · Score: 0

      What you said may have been true in 2002, but not anymore.

      ~$20B worth of ad clicks per year have to be recorded and within SOX compliance. God knows how many PB of user email, photos, etc. that users would notice if you lost (or even if they become unavailable). No planned downtime is acceptable, even on the weekend or holidays.

      Financial companies may deal with larger numbers, but they are probably on par or behind in the number of transactions and 24/365 availability requirements.

    36. Re:Flat Earth by BitZtream · · Score: 1

      Ironically, BIGTABLE seems to look a lot like SQL if you can abstract yourself from the specifics.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    37. Re:Flat Earth by fabs64 · · Score: 1

      http://en.wikipedia.org/wiki/Star_schema

      It's standard practice for any large data-warehouse to have a process of transforming their relational data into *something else* before it's used by whatever generates reports.

    38. Re:Flat Earth by jadavis · · Score: 1

      Ok, I think I see what you're saying. I think you're talking about denormalization rather than transforming into a non-relational system.

      Denormalization is the symptom of an imperfect separation between the physical and the logical layers. It's not (strictly speaking) non-relational, but it's perhaps bad logical design.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
  8. I've been using text files and Excel by Anonymous Coward · · Score: 0

    I keep track of all my car bills and cat names with Notepad and Excel. I don't know why anyone would need anything more than that. If I need to sort my text file, I go to this thing called the command line and use the "SORT" command. If I need to find something in my text file, likewise, I use the command line and the "FIND" command

  9. RDB by MichaelSmith · · Score: 1

    I thought DEC RDB was a pretty good query language. I never got into SQL as a result. I am glad people are thinking about alternatives.

    1. Re:RDB by butlerm · · Score: 1

      DEC (now Oracle) RDB is an SQL database. A pretty nice one, actually.

    2. Re:RDB by MichaelSmith · · Score: 1

      DEC (now Oracle) RDB is an SQL database. A pretty nice one, actually.

      Yes but it had its own query language, complete with a precompiler for C so you could embed queries directly into your code. I believe SQL was an option back when DEC owned it as well.

  10. Next Up... by grepya · · Score: 1, Funny

    ...say no to the tyranny of... er.. English. Let's stick with the combination of grunts, squeals, crying and gesturing that has proven so effective for toddlers all over the globe for thousands of years. And if we surrendered the traditional languages that we are so irrationally attached to, who knows what revolutionary new communication scheme the next-generation kids will come up with.

     

  11. Re:Next Up... by FlyingBishop · · Score: 1

    Python?

  12. The problem is performance not SQL by presidenteloco · · Score: 3, Interesting

    The problem is the performance of transactions and persistence and distribution of data techniques, not
    whether we are using a logic-like STRUCTURED QUERY LANGUAGE to ask for data matching certain conditions.

    The latter is still, and will continue to be, very useful.

    It's just that now that we can assume local clusters and WANs worth of co-operating data stores, there
    are probably better, more performant ways of implementing persistence, replication, distribution of data
    than traditional RDBMS implementations.

    The two concerns: The logical model of how we QUERY for data (or combine it in bulk), which is the core of SQL,
    and how we persist it and retrieve it quickly, now have more options for being separated.
     

    --

    Where are we going and why are we in a handbasket?
    1. Re:The problem is performance not SQL by Crias · · Score: 1

      The problem, I think, runs a little deeper than all that though.

      SQL is unfortunately tied fairly tightly to an RDBMS implementation. All those "join" statements, various ways of expressing "constraints" such as "foreign keys" - all are considered "integral" parts of SQL.

      No, you don't have to provide them. A post-relational store like Amazon SimpleDB could, theoretically, use SQL for querying and just trim back the feature set.

      But perhaps it'd be wiser to look at a query language more specific to the Post-Relational model?

      Perhaps SQL stopped being "SQL" and started being "Structured Relational Query Language". *shrugs*

    2. Re:The problem is performance not SQL by oGMo · · Score: 4, Insightful

      It's just that now that we can assume local clusters and WANs worth of co-operating data stores, there are probably better, more performant ways of implementing persistence, replication, distribution of data than traditional RDBMS implementations.

      You can also assume magical fairy dust and free energy, but that doesn't make it so. You can ask if there are better ways, but you can't assume it, and in the end you will find there is no magic.

      Clusters and replication are NOT NEW. Not even remotely new. There is, in fact, nothing new architecturally at all that would indicate some new capability that hasn't already been repeatedly analyzed and tried. That doesn't mean you can't tweak something for a situation, or that you need a giant Oracle database for everything, but "the web" and "cheap hardware" change the equation by precisely nothing.

      What has changed the equation is cheap, unimportant data, which covers the majority of the web. "Real" applications, where data integrity is important (like say, your bank account), and immediate accuracy guaranteed, require the main thing you use a database for: data integrity. Your facebook page, your google search, that blog entry, or some video on youtube: these don't matter. If it's a little slow, or doesn't update immediately, or you get an error, no one is losing money. No one cares.

      In essence, if a reliable database isn't important for your app, your app isn't really handling important data. This may be fine; in the mainstream, there's a lot of noncritical stuff. But this doesn't make databases unimportant.

      --

      Don't think of it as a flame---it's more like an argument that does 3d6 fire damage

    3. Re:The problem is performance not SQL by Anonymous Coward · · Score: 0

      Everyone thinks their data is important you insensitive clod!

    4. Re:The problem is performance not SQL by dannannan · · Score: 1

      Cheap, unimportant data is not the only factor. A more significant factor is scale. Some of the big "NoSQL" players in TFA have a very real monetary stake in the data they are putting into these systems.

      No one is saying "no to SQL" because they can do without the reliability. Quite the opposite. Put a DBMS under crushing load, and availability is the first thing to go. The big players want a system that is highly available and maintains data integrity.

      A typical DBMS makes strong consistency guarantees across the entire dataset. e.g. After an update is committed, all subsequent reads MUST reflect the change. Turns out this costs a lot; it is a major sacrifice to the potential throughput that could otherwise be achieved with the same hardware, and fundamentally limits the scale of the dataset. Strong consistency adds nothing to reliability and is unnecessary for many apps.

      You are pretty close in your point that when you upload something to Facebook, it doesn't matter if everyone sees it instantly the next time they refresh their browser. That is absolutely true. However this is not to say that the underlying system is lacking in integrity or reliability. An "eventually consistent" data store can reliably guarantee that the data will eventually be reflected in all queries, without requiring it to be resubmitted.

    5. Re:The problem is performance not SQL by celtic_hackr · · Score: 1

      You can also assume magical fairy dust and free energy, but that doesn't make it so. You can ask if there are better ways, but you can't assume it, and in the end you will find there is no magic.

      Clusters and replication are NOT NEW. Not even remotely new. There is, in fact, nothing new architecturally at all that would indicate some new capability that hasn't already been repeatedly analyzed and tried.

      Wheel bearings were invented by the Danes around 400 AD (or BC); ball bearings were invented around 1770.
      Just because something is repeatedly analyzed and tried doesn't mean that there isn't a better way. Using your logic we might as well say there is nothing new or undiscovered under the Sun, Moon, and Stars (tm). Just because something is good and works doesn't mean we shouldn't constantly be looking for better ways. I'm sure the first time a person saw sulfur on the stick used to make fire it would seem like magic to someone still using a flint to make fire. Who is to say there isn't some drastically better algorithm just over the horizon? I personally welcome these innovative, skeptical people looking for something better.

  13. Not mutually exclusive by JobyOne · · Score: 3, Insightful

    It's pretty easy to say "yes" to alternatives without saying "no" to SQL.

    Just because a crowbar can pull out a stubborn nail better doesn't mean they should replace all the hammers. Then what would we put nails in with? Different tools for different jobs.

    --
    Porquoi?
    1. Re:Not mutually exclusive by convolvatron · · Score: 1

      yes, and you can have relational algebra without sql

    2. Re:Not mutually exclusive by Anonymous Coward · · Score: 0

      The only reason SQL won't be around longer than COBOL is that COBOL was created before SQL.

  14. RDBs are good, but SQL is horrible by Cyberax · · Score: 0

    The idea of RDB is cool, relational algebra is quite neat. But SQL itself is horrible.

    I'd like to have a language which will allow me to access intermediate tuples cleanly and return hierarchic structures. For example, if I want to fetch all customers and all their bids in one query I have to use inner join. And that results in LARGE number of rows (Cartesian product of customers and their bids).

    Also, I'd like to see stuff which is not easily expressed in relational algebra, like running sums or grouping on a computed field.

    1. Re:RDBs are good, but SQL is horrible by larry+bagina · · Score: 1

      If you want hierarchical data, you could use a hierarchic database.

      --
      Do you even lift?

      These aren't the 'roids you're looking for.

    2. Re:RDBs are good, but SQL is horrible by Anonymous Coward · · Score: 0

      The idea of RDB is cool, relational algebra is quite neat. But SQL itself is horrible.

      I'd like to have a language which will allow me to access intermediate tuples cleanly and return hierarchic structures. For example, if I want to fetch all customers and all their bids in one query I have to use inner join. And that results in LARGE number of rows (Cartesian product of customers and their bids).

      Also, I'd like to see stuff which is not easily expressed in relational algebra, like running sums or grouping on a computed field.

      If you're getting a Cartesian product from a query like that, either the DB architect was a moron or (most likely) you need to learn about the WHERE clause.

    3. Re:RDBs are good, but SQL is horrible by Cyberax · · Score: 1

      I do not want hierarchical data storage. I want to create trees from relational data.

      I don't see anything that prevents me from doing this in theory. In fact, ANSI SQL already has support for hierarchic queries (which makes it Turing-complete, BTW).
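
      As a rough illustration of the hierarchic queries mentioned above, here is a minimal sketch of a recursive common table expression (`WITH RECURSIVE`), run against SQLite from Python. The `node` table and its rows are made up for the example; the query walks from the root down and carries the depth along, which is enough for a client to rebuild the tree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
)
# Hypothetical adjacency-list data: each row points at its parent.
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?)",
    [(1, None, "root"), (2, 1, "a"), (3, 1, "b"), (4, 2, "a1"), (5, 4, "a1x")],
)

# Start at the root row, then repeatedly join children onto rows found so far.
rows = conn.execute(
    """
    WITH RECURSIVE tree(id, name, depth) AS (
        SELECT id, name, 0 FROM node WHERE parent_id IS NULL
        UNION ALL
        SELECT n.id, n.name, t.depth + 1
        FROM node n JOIN tree t ON n.parent_id = t.id
    )
    SELECT name, depth FROM tree ORDER BY depth, name
    """
).fetchall()
```

      The result is still a flat rowset; turning it into nested structures is left to the client, which is the gap the comment is complaining about.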

    4. Re:RDBs are good, but SQL is horrible by Anonymous Coward · · Score: 1, Informative

      select * from customers c, bids b where c.customer_id=b.fk_customer_id order by c.customer_id, b.bid_date

      Seems pretty simple. What's wrong with an inner join? You're getting exactly the number of rows that you need to answer your question, no more, no less.

      A cartesian product would be more like: select * from customers c, bids b. But that's not what you want.
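
      The difference in row counts is easy to check with a small sketch in SQLite from Python. The table and column names follow the query above, but the rows (and the `bid_id` and `amount` columns) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bids (bid_id INTEGER PRIMARY KEY, fk_customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO bids VALUES (10, 1, 5), (11, 1, 7), (12, 2, 9);
""")

# Inner join: one row per bid, paired with the customer who placed it.
joined = conn.execute(
    "SELECT c.name, b.amount FROM customers c, bids b "
    "WHERE c.customer_id = b.fk_customer_id "
    "ORDER BY c.customer_id, b.bid_id"
).fetchall()

# No join condition: every customer paired with every bid.
cartesian = conn.execute(
    "SELECT c.name, b.amount FROM customers c, bids b"
).fetchall()
```

      With 3 customers and 3 bids, the join yields 3 rows and the unconstrained query yields 9.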

      As for hierarchical structures, Oracle db has ways to do this, although I admit the syntax isn't that straightforward: http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/queries003.htm

    5. Re:RDBs are good, but SQL is horrible by godrik · · Score: 1

      For example, if I want to fetch all customers and all their bids in one query I have to use inner join. And that results in LARGE number of rows (Cartesian product of customers and their bids).

      I am not sure I get your point. If you do an inner join it means you want all the tuples < player,bid > that make sense. If there are a lot of them, well, there are a lot of them; there is nothing to do about it. If you complain about each player being repeated on several bids (since they bid more than once), it should not be a problem: as long as you stay in the RDBMS, this should not incur any overhead. When you read them, you can just compress them on the fly.

      If you really want to remove those "extra" player values, why do you want to have a single query ? You can just make a query for each player.

      Also, I'd like to see stuff which is not easily expressed in relational algebra, like running sums or grouping on a computed field.

      Technically, they cannot be expressed in relational algebra; you have to add non-algebraic operators to do that. SUM and GROUP BY in SQL are not part of relational algebra. The problem with those operators is that it is difficult to do any optimization on them. However, the user may still want to have them. I would also be interested in a language that can express such things efficiently.

    6. Re:RDBs are good, but SQL is horrible by jyx · · Score: 1

      For example, if I want to fetch all customers and all their bids in one query I have to use inner join. And that results in LARGE number of rows (Cartesian product of customers and their bids)

      You might want to explain yourself a bit more there, if I want all customers and all their bids I would expect a LARGE number of rows. What magic algorithm is out there that will give you all your data, but at the same time make it less than what it is?

      Or do you want just one row of customer data and then all their orders under that? Good luck getting your admin staff creating reports off that spreadsheet.

      I think what you are interested in reporting tools - they do the things you ask for often in a nice drag and droppy way - but you still need to get the data to those reports and I haven't found anything better for that job than SQL (yet).

      Data is hard work; eventually any solution to querying databases is going to be as complicated as SQL because there is an infinite number of ways people will eventually want to look at it.

    7. Re:RDBs are good, but SQL is horrible by Cyberax · · Score: 1

      Let's suppose that we have 1000 customers and each customer has 100 bids, and each bid has 5 sub-items.

      If we retrieve all of them using inner joins - we'll have to transmit and read 100*1000*5 rows. Quite a large number.

      If we first fetch customers and then fetch their bids (using a second query) and then sub-items we'll have to read 1000+1000*100+1000*100*5 rows. However, each time we fetch only relevant data which can result in huge savings of bandwidth (some database protocols are naive enough to transmit full rows).
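
      The trade-off can be put in numbers with a short Python sketch. The row counts are the ones given above; the per-table column counts are assumptions picked only to make the point that the joined result repeats the customer and bid columns on every row.

```python
CUSTOMERS = 1000
BIDS_PER_CUSTOMER = 100
ITEMS_PER_BID = 5

# Hypothetical column counts per table, chosen only for illustration.
CUSTOMER_COLS, BID_COLS, ITEM_COLS = 10, 5, 4

bids = CUSTOMERS * BIDS_PER_CUSTOMER
items = bids * ITEMS_PER_BID

# One big join: one wide row per sub-item.
join_rows = items
join_cells = join_rows * (CUSTOMER_COLS + BID_COLS + ITEM_COLS)

# Three separate queries: more rows in total, but each row is narrow.
split_rows = CUSTOMERS + bids + items
split_cells = CUSTOMERS * CUSTOMER_COLS + bids * BID_COLS + items * ITEM_COLS
```

      Under these assumed widths the join returns fewer rows (500,000 against 601,000) but far more cells (9,500,000 against 2,510,000), so the saving claimed above is in repeated column data, not in row count.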

    8. Re:RDBs are good, but SQL is horrible by Marcos+Eliziario · · Score: 1

      The way to store tree like structures on a relational database is using nested sets, not pointer-like ids.
      The main current backlash against databases is that most developers don't have a clue about set theory, relational algebra, let alone the inner workings of a concurrent database system.
      Many of the problems solved by RDBMS are going to have to be solved again by those new tools that are promised to replace RDBMS.

      Those who ignore history, are condemned to repeat it.
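
      For readers who have not met nested sets, here is a minimal sketch in SQLite from Python. Each node stores a left and right number (`lft`, `rgt`) assigned by a depth-first walk, so a node's descendants are exactly the rows whose interval falls inside its own. The `tree` table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree (name TEXT, lft INTEGER, rgt INTEGER)")
# Intervals from a depth-first numbering: root spans 1..10, "a" spans 2..7, etc.
conn.executemany("INSERT INTO tree VALUES (?, ?, ?)", [
    ("root", 1, 10), ("a", 2, 7), ("a1", 3, 4), ("a2", 5, 6), ("b", 8, 9),
])

def descendants(name):
    """Return all nodes nested strictly inside the named node's interval."""
    return [row[0] for row in conn.execute(
        "SELECT child.name FROM tree AS parent JOIN tree AS child "
        "ON child.lft > parent.lft AND child.rgt < parent.rgt "
        "WHERE parent.name = ? ORDER BY child.lft",
        (name,),
    )]

under_a = descendants("a")
under_root = descendants("root")
under_b = descendants("b")
```

      Reads need no recursion, but inserting a node means renumbering every interval to its right, so the scheme suits trees that are read far more often than they change.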

      --
      Your ad could be here!
    9. Re:RDBs are good, but SQL is horrible by caerwyn · · Score: 1

      I'm not sure what your point here is. People do that sort of sequential querying all the time- each query simply asks for the subset of data of interest. What, exactly, are you unable to do in SQL that you want to be able to do?

      --
      The ringing of the division bell has begun... -PF
    10. Re:RDBs are good, but SQL is horrible by caerwyn · · Score: 1

      Also, I'd like to see stuff which is not easily expressed in relational algebra, like running sums or grouping on a computed field.

      Grouping on a computed field is quite easy, so if you're waiting for SQL to support it... you've been waiting too long, it already does.

      As for running sums, that's the sort of thing that Oracle already has and just went into the Postgres 8.4 release that was on slashdot the other day.

      Your SQL complaints are a bit out of date. :)
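
      Both points can be shown with a small sketch in SQLite from Python, using a made-up `bids` table. The running sum here uses a correlated subquery so that it runs on engines without window functions; where window functions are available, `SUM(amount) OVER (ORDER BY bid_id)` expresses the same thing more directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bids (bid_id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO bids VALUES (?, ?)", [(1, 5), (2, 12), (3, 7), (4, 20)]
)

# Group on a computed expression: bucket amounts by tens (integer division).
by_bucket = conn.execute(
    "SELECT amount / 10 AS bucket, COUNT(*) FROM bids "
    "GROUP BY amount / 10 ORDER BY bucket"
).fetchall()

# Running sum: for each bid, total the amounts of this bid and all earlier ones.
running = conn.execute(
    "SELECT b.bid_id, "
    "       (SELECT SUM(p.amount) FROM bids p WHERE p.bid_id <= b.bid_id) "
    "FROM bids b ORDER BY b.bid_id"
).fetchall()
```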

      --
      The ringing of the division bell has begun... -PF
    11. Re:RDBs are good, but SQL is horrible by Cyberax · · Score: 1

      Why do I need to do several queries? It would be nice to be able to do this in a single query.

    12. Re:RDBs are good, but SQL is horrible by Anonymous Coward · · Score: 0

      Yes, but isn't "repeating it" how a lot of IT folks make money? :-)

    13. Re:RDBs are good, but SQL is horrible by jedidiah · · Score: 1

      If you are trying to "get everything", this will be a problem. It doesn't
      matter what sort of technology you are using. The great white hype (sql
      replacement) will still have the same problem that SQL does. Your data
      is the size of your data.

      The only way to really avoid this is to architect your system as a
      data warehouse or run your database out of memory. At that point,
      the great white hype will probably not have any advantage over a
      plain vanilla SQL engine.

      There are no shortcuts. The size of the relational resultset represents the size of the data you're dealing with.

      No magical pixie dust will make it smaller.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    14. Re:RDBs are good, but SQL is horrible by caerwyn · · Score: 1

      You *can*. What you're complaining about is that the DB is giving you back too much information to deal with all at once. Okay, fine- then ask it for smaller chunks of data by making several queries.

      If you ask it for a lot of data and it gives it all back to you, that's hardly the DB's fault. Tell it what you want (ie, a smaller subset) and you'll get that.

      If you want to make only one query and still get small subsets at a time, use a cursor.
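
      A minimal sketch of the cursor approach, using Python's `sqlite3` module: one query is issued, and the client pulls the result in fixed-size chunks with `fetchmany`. The `bids` table, its contents, and the chunk size are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bids (bid_id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO bids VALUES (?, ?)", [(i, i * 2) for i in range(1, 26)]
)

def iter_chunks(cursor, size):
    """Yield lists of up to `size` rows until the cursor is exhausted."""
    while True:
        chunk = cursor.fetchmany(size)
        if not chunk:
            return
        yield chunk

# One query, consumed ten rows at a time.
cur = conn.execute("SELECT bid_id, amount FROM bids ORDER BY bid_id")
chunk_sizes = [len(chunk) for chunk in iter_chunks(cur, 10)]
```

      With 25 rows and a chunk size of 10, the client sees chunks of 10, 10, and 5 rows without ever holding the full result at once.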

      --
      The ringing of the division bell has begun... -PF
    15. Re:RDBs are good, but SQL is horrible by Cyberax · · Score: 1

      It's OK if I get a lot of data. That's what I ask for.

      But I can't get it optimally. And that's the problem.

    16. Re:RDBs are good, but SQL is horrible by Anonymous Coward · · Score: 0

      Hmm, I dont see how fetching 500.000 rows (100*1000*5) will consume more bandwidth than 601.000 rows (1000+1000*100+1000*100*5).

    17. Re:RDBs are good, but SQL is horrible by voidphoenix · · Score: 1

      Inner joins produce intersections of data -- they limit the result set. Outer joins produce Cartesian result sets. That's why they're also referred to as Cartesian joins.

    18. Re:RDBs are good, but SQL is horrible by CorporateSuit · · Score: 1

      How much more optimal is it supposed to be? In storage, you're eliminating the huge amount of space that would be required for redundancy when you go relational (instead of needing to include the client's information and the information on the item they're bidding on for every bid). In either case, if you were looking up an item with 12,000 bids on it, you're going to get a recordset of 12,000 bids to look at. If it's

      Select C.CustomerName,I.ItemName,B.* from Customers C inner join Bids B on C.CustomerID=B.CustomerID inner join Items I on I.ItemsID=B.ItemsID where B.ItemsID=3323432234 to get bid information on an item, and some basic information to go along with it

      or

      Select C.CustomerName,B.* from Customers C inner join Bids B on C.CustomerID=B.CustomerID where B.ItemsID=3323432234 to get just the bid information on the item and the customer's name to keep it more human-readable. If you don't need customer information to look at, just run a select query on the Bids table.

      --
      I am the richest astronaut ever to win the superbowl.
    19. Re:RDBs are good, but SQL is horrible by Anonymous Coward · · Score: 0

      The way to store tree like structures on a relational database is using nested sets, not pointer-like ids.
      The main current backlash against databases is that most developers don't have a clue about set theory, relational algebra, let alone the inner workings of a concurrent database system.
      Many of the problems solved by RDBMS are going to have to be solved again by those new tools that are promised to replace RDBMS.

      Those who ignore history, are condemned to repeat it.

      The parent post should be modded +5 and put at the top of the comment list. If someone figures out better math than the set theory that SQL is based on, I'll be ready to listen. If someone is upset because they don't like the performance of a MySQL database with a few tens of millions of rows running on a shared virtual host that was set up and designed by someone who doesn't even know what normalization means and who thinks schema only refers to some programming language they heard of in a class one time, then tell them to take a class or two and read some books to better educate themselves, first.

      The most interesting thing in this story, though, goes with my own anecdotal evidence regarding the current crop of majors graduating with various computer programming related degrees. It's rare to find one that had even one decent class on relational databases, let alone any clue about set theory, etc. What the hell are they teaching kids in class nowadays? Other than Java, I mean.

    20. Re:RDBs are good, but SQL is horrible by ckaminski · · Score: 1

      How do you know what you want your subset to be unless you query it?

    21. Re:RDBs are good, but SQL is horrible by genner · · Score: 1

      No magical pixie dust will make it smaller.

      To be fair no one has actually tried using magical fairy dust to make their data smaller.

    22. Re:RDBs are good, but SQL is horrible by Anonymous Coward · · Score: 0

      It's OK if I get a lot of data. That's what I ask for.

      But I can't get it optimally. And that's the problem.

      No, the problem you're describing is not that you can't get it optimally (you yourself already described how to get it optimally), it's that the system doesn't do the optimization for you.

      So what? You have to optimize it yourself. The power is in the developer's hands. If you just want massive data, you can get it. If you want it in chunks, well, you can get that too. It's all up to you.

      If you want something that does the work for you, well, Microsoft has this thing called Access.....

    23. Re:RDBs are good, but SQL is horrible by Anonymous Coward · · Score: 0

      An honest question from someone who does not get a lot of SQL

      How is that different from something like

      from customers c join bids b on c.customer_id=b.fk_customer_id
      ?

      (I get left and right join saves a lot of logic in the query, but is there any other gain)

      I mean, what is there to gain from using a 'join' over using a 'where'?

    24. Re:RDBs are good, but SQL is horrible by colinrichardday · · Score: 1

      Take two "tables" of ordered pairs: {(2, 5), (8, 7), (3, 1)} and {(5 , 9), (4, 3), (7, 5)}. If we do an inner join between the two tables, looking for where the second coordinate of data from the first table matches the first coordinate of data from the second table and then return the first coordinate of data from the first table with the second coordinate of data from the second table, we get {(2,9), (8, 5)}. Inner joins are more like composition of relations.
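
      The example above can be checked with a few lines of Python. This is only a sketch of composition of relations over sets of pairs; the function name `compose` is made up for illustration.

```python
def compose(first, second):
    """Pair a with c whenever (a, b) is in first and (b, c) is in second."""
    return {(a, c) for (a, b) in first for (b2, c) in second if b == b2}

first = {(2, 5), (8, 7), (3, 1)}
second = {(5, 9), (4, 3), (7, 5)}

# (3, 1) drops out because no pair in `second` starts with 1.
result = sorted(compose(first, second))
print(result)
```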

    25. Re:RDBs are good, but SQL is horrible by Cyberax · · Score: 1

      Inner joins are expressed in relational algebra as a selection and projection over a Cartesian product of two relations.

    26. Re:RDBs are good, but SQL is horrible by colinrichardday · · Score: 1

      The results of inner joins are subsets of Cartesian products, but I would hope that an RDBMS would calculate inner joins without taking the Cartesian product.

      And one can express inner joins without mentioning Cartesian products.
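
      To make the two views concrete, here is a rough Python sketch. One function follows the relational-algebra definition (filter the Cartesian product); the other builds a hash index on the join column and never materializes the product. Both function names are made up, and a real RDBMS chooses among several join strategies, so treat this as an illustration rather than a description of any particular engine.

```python
from collections import defaultdict
from itertools import product

def join_via_product(r, s):
    """Selection over the Cartesian product: examines len(r) * len(s) pairs."""
    return {(a, b, c) for (a, b), (b2, c) in product(r, s) if b == b2}

def hash_join(r, s):
    """Index s on its first column, then probe once per row of r."""
    index = defaultdict(list)
    for b, c in s:
        index[b].append(c)
    return {(a, b, c) for a, b in r for c in index.get(b, ())}

r = {(2, 5), (8, 7), (3, 1)}
s = {(5, 9), (4, 3), (7, 5)}
print(join_via_product(r, s) == hash_join(r, s))
```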

    27. Re:RDBs are good, but SQL is horrible by rjstanford · · Score: 1

      You could always do it (in pseudo-SQL) with 4 statements:

      1. Turn on "Repeatable Read" or "Read Consistency" mode and begin a transaction
      2. Select customer.* from customer join bid on bid.customer_id = customer.id
      3. Select bid.* from bid join customer on bid.customer_id = customer.id
      4. Close transaction (and possibly reset mode for next time).

      There. Done, without changing SQL. Of course, doing it as a join returns more bytes (but fewer rows) - but with a reasonably intelligent compression algorithm on the connection between the database server and the querying host, this may not actually have a significant performance impact - it would be worth benchmarking this before doing anything "creative" around it.

      Also, most (but by no means all) of the time that you're trying to get this much data from the database, there may be ways that you can offload some of the post-processing to the DB itself, saving cycles and reducing the data transfer needed. That's not always the case, of course.
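
      A minimal sketch of the four steps, using SQLite through Python purely so the example runs. The tables and rows are made up, and the isolation-mode switch in step 1 is database-specific and not shown here; the transaction simply brackets both reads. The sketch also adds DISTINCT to the customer query (an assumption on top of the steps as written) so each customer comes back once rather than once per bid.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so BEGIN and COMMIT below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE bid (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customer VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO bid VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 6.0);
""")

def fetch_customers_and_bids(conn):
    """Read both result sets inside one transaction."""
    conn.execute("BEGIN")
    try:
        customers = conn.execute(
            "SELECT DISTINCT customer.* FROM customer "
            "JOIN bid ON bid.customer_id = customer.id ORDER BY customer.id"
        ).fetchall()
        bids = conn.execute(
            "SELECT bid.* FROM bid "
            "JOIN customer ON bid.customer_id = customer.id ORDER BY bid.id"
        ).fetchall()
    finally:
        conn.execute("COMMIT")
    return customers, bids

customers, bids = fetch_customers_and_bids(conn)
print(len(customers), "customers,", len(bids), "bids")
```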

      --
      You're special forces then? That's great! I just love your olympics!
    28. Re:RDBs are good, but SQL is horrible by rjstanford · · Score: 1

      We did, and I think it would have worked too, but we added it on as a patch to Informix 6. Big mistake.

      --
      You're special forces then? That's great! I just love your olympics!
  15. Yeah, so why are they better? by Anonymous Coward · · Score: 5, Insightful

    If I was to read the article, I bet somewhere someone would be wittering on about Key Value Datastores.

    The brainchild of a generation brought up on high level collections, they learn one (in this case Map) and apply it to everything.

    Sadly SQL, and RDBMS, works for most people. It maps object data well (oh whaaaa, i have to do foreign keys - GROW SOME FUCKING BALLS YOU LAZY GRADUATE!) and it is well understood. And with abstractions like LINQ to query them, even the lazy dumb Windows .NET programmer doesn't have to strain their brain to learn SQL.

    And when you have terabytes of specific unique data, you clearly should go away to work out how best to store it. Even a RDBMS/SQL solution is too generic for all problems.

    1. Re:Yeah, so why are they better? by Hurricane78 · · Score: 1

      Well, file systems, databases, object inheritance trees, etc, they all are based on the incomplete concept of hierarchical trees and maps. While in reality, everything can be generalized through graphs. Generic graphs. Of course everyone has their own poor fix for this. File systems have links, databases have foreign keys, and OO languages have interfaces or multiple inheritance. It's a mess, because it is an afterthought.

      I stopped using all those approximations of data structures, and use my own high-performance ontologic graph library for everything that I would use a treelike structure for. I also can stick it on top of a file system or RDBMS, and even have a UI element to browse it. I do not look back. :)

      --
      Any sufficiently advanced intelligence is indistinguishable from stupidity.
    2. Re:Yeah, so why are they better? by spiffmastercow · · Score: 1

      I'd say LINQ is significantly harder to use than SQL most of the time. The only real exception is when you need to convert the value of a subquery into a comma-delimited list

    3. Re:Yeah, so why are they better? by fabs64 · · Score: 3, Insightful

      Saying RDBMSs map object data well is a bit of a stretch; they map relational data well and that's it.

      http://www.codinghorror.com/blog/archives/000621.html for some good background on the problems.

      For me, using an RDBMS as the persistence layer for an object-oriented application has ALWAYS felt like a bit of a kludge. Like we're using it just because it's what we have, rather than the best tool for the job.

    4. Re:Yeah, so why are they better? by schnablebg · · Score: 1

      You use your own hand rolled libraries for standard data structures? I really hope I don't end up on a project with you or inheriting one you've worked on.

    5. Re:Yeah, so why are they better? by Skapare · · Score: 1

      SQL can do just about everything you need in a data store ... if you aren't considering performance and cost. For many things, the whole point is that SQL is overkill. The alternative solves FEWER problems ... it just solves them BETTER.

      --
      now we need to go OSS in diesel cars
    6. Re:Yeah, so why are they better? by VGPowerlord · · Score: 1

      I've never tried to use an OODB, but from what I've heard... it ends up being a mess, with a custom query language.

      Relational databases, on the other hand, are relatively easy to understand, as they just have tuples (or rows) that are related to other tuples through a specific value.

      --
      GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
    7. Re:Yeah, so why are they better? by kage.j · · Score: 2, Interesting

      Linq to SQL/Entities (on your entity provider) has its benefits and downfalls.

      But damn, Linq to everything else fucking rocks faces, and anyone who says otherwise seriously needs to buy a linq book and actually use the shit. Linq to XML/collections .. I don't know where I'd be without it. And I don't want to know!

      Yeah, linq is handy with Entities, but you run into a whole mess of problems if you aren't careful with it. (And people who don't understand relational databases should stay away from it.)

      At least, that is my opinion...but don't take it too seriously

      --
      he demonstrated by A plus B minus C divided by Z that the sheep must be red, and die of the rot
    8. Re:Yeah, so why are they better? by lgw · · Score: 1

      It sounds like he's been using the same hand-rolled library for everything for years. It's probably pretty solid by now.

      --
      Socialism: a lie told by totalitarians and believed by fools.
    9. Re:Yeah, so why are they better? by spiffmastercow · · Score: 1

      LINQ to SQL has its uses. It saves a lot of developer time by writing all of the classes for you. I've got a nice little class derived from the DataContext class that writes to a table all the changes made and who made them (accountability is shit at my workplace). If I wanted to do that in SQL, it would require either a trigger on every column of every table, or require me to write the code to record everything manually on every update/insert/delete transaction. Instead I get it done in about 100 lines of C# code. There are a lot of things it's not good for, but I just use SQL when it's easier/better to accomplish the goal in SQL. Sure, it's not a consistent code design scheme, but it lets me get things done quickly, which leaves me more time to troll /. But yeah, they do need to get their heads out of their ass and learn what "Distinct" means in a query context though.

    10. Re:Yeah, so why are they better? by Anonymous Coward · · Score: 0

      Why should storing the objects your application uses inside a persistence store 'become a mess'?

      Have a look at db4o. It uses a NATIVE query language in Java or .Net.

    11. Re:Yeah, so why are they better? by jawahar · · Score: 1

      I bet somewhere someone would be wittering on about Key Value Datastores.

      I believe there is a trivial difference between SQL and Key Value Datastores if both the APPLICATION and complete DATA are loaded into RAM (the way Google does).

    12. Re:Yeah, so why are they better? by Anonymous Coward · · Score: 0

      Like we're using it just because it's the best tool we have, rather than anything else.

      Fixed that for ya.

    13. Re:Yeah, so why are they better? by Anonymous Coward · · Score: 0

      I wonder if you wouldn't mind elaborating on the design of your graph library. I think your analysis of the problem is bang on - most of the time what starts out as a tree ends up as a DAG or even contains cycles and is thus a graph in the general sense.

      I've been thinking a lot about this problem as the application I work on essentially represents its project files as giant trees of diverse types of objects. Problem is, properties of some of those objects refer in various ways to other objects in different branches of the tree. Thus it's really a graph in the general sense, but the edges can be labelled differently according to relationship, e.g. parent/child, some sort of weaker "reference" relationship, reference by name or ID, etc. The types of relationships have a bearing on how to handle operations on the tree, e.g. if a relationship is really a whole/part/composition relationship then deletion or copying should extend to the related objects. We get into a lot of complicated merge situations for operations like copy/paste, when sets of IDs and so on need to be made unique or automatically renamed/renumbered. And of course we have to do a lot of other things with the tree, like query it, validate it, serialize it, represent it in the UI in various forms, and so on, and the structure of the tree and types of relationships between nodes have an effect on that as well.

      I guess it is the "ontologic" portion of your statement that makes me curious. What does your graph look like, and do you have different types of links, annotations, etc? Any general insight on how you process them?
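
      One way to picture the labelled-edge idea described above is the rough sketch below. It is not the parent poster's library; `TypedGraph`, the "part" and "reference" labels and the node names are all hypothetical. Deletion follows composition ("part") edges and leaves plain references alone.

```python
from collections import defaultdict

class TypedGraph:
    """Directed graph whose edges carry a relationship label."""

    def __init__(self):
        self.nodes = set()
        self.edges = defaultdict(set)  # source -> {(label, target)}

    def add_edge(self, source, label, target):
        self.nodes.update((source, target))
        self.edges[source].add((label, target))

    def delete(self, node):
        """Remove node plus everything reachable through 'part' edges."""
        doomed, stack = set(), [node]
        while stack:
            current = stack.pop()
            if current in doomed:
                continue
            doomed.add(current)
            stack.extend(
                target
                for label, target in self.edges.get(current, ())
                if label == "part"
            )
        self.nodes -= doomed
        for source in list(self.edges):
            if source in doomed:
                del self.edges[source]
            else:
                # Drop dangling edges that pointed at deleted nodes.
                self.edges[source] = {
                    (label, target)
                    for label, target in self.edges[source]
                    if target not in doomed
                }
        return doomed

g = TypedGraph()
g.add_edge("project", "part", "layer")
g.add_edge("layer", "part", "shape")
g.add_edge("shape", "reference", "style")
g.add_edge("project", "part", "style")
removed = g.delete("layer")
print(sorted(removed), sorted(g.nodes))
```

      Copying could use the same traversal, with the extra step of remapping IDs for the copied part-subtree.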

    14. Re:Yeah, so why are they better? by BitZtream · · Score: 1

      even the lazy dumb Windows .NET programmer doesn't have to strain their brain to learn SQL

      Fuck you, I'm a lazy .NET programmer, but I'm not stupid, most of the time anyway.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    15. Re:Yeah, so why are they better? by Tablizer · · Score: 1

      everything can be generalized through graphs

      Same with sets. OO tends to be graph-centric and relational tends to be set-centric in its "feel". It's tricky to get these to live together smoothly. Which one is "better" may depend on the problem at hand and personal psychology. I personally think sets "conceptually" scale better than graphs. Sets are less visual than graphs, but as you scale, graphs also get less visual, at least in a usable sense. Sets offer better "filtering" idioms than graphs when things scale up. It's easier to get "relative" viewpoints with sets so far.

  16. What's the benefit exactly? by SendBot · · Score: 3, Insightful

    I'm not seeing anything that offers a real advantage over using advanced features like one finds in postgres combined with memcached. Some of my program likes to think of its data as a structured object while other parts like seeing that data as rows in a table (they even link up to other tables through foreign keys!).

    1. Re:What's the benefit exactly? by phantomfive · · Score: 1

      The main problem with relational databases is that they use a completely different storage scheme than your program does. Databases are organized into tables, rows and columns, but programs are organized into random access variables, structs, and classes. Thus, to use data from a relational database in your program, you need to have a conversion layer that converts from tables to random access, and back. These guys are saying it would be nice if we had a way to store this stuff that didn't require a conversion layer.

      And I agree. However I also think it would be nice if I could keep all my data in RAM all the time, for easy access. It's just not practical. If all your queries are straightforward selects on ID, then there really is no great reason to use a full database. But once you start doing more complex joins and searches, a database is, while not always convenient, still more convenient than the alternatives.
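
      For anyone who hasn't had to write one, a hand-rolled conversion layer looks roughly like the sketch below. The `Customer` class, the table and the helper names are made up for illustration, and SQLite is used only so the example is runnable. Every field is named three times (class, INSERT, SELECT), which is the tedium being complained about.

```python
import sqlite3
from dataclasses import dataclass, astuple

@dataclass
class Customer:
    id: int
    name: str
    email: str

def save(conn, customer):
    """Object -> row: flatten the dataclass into positional parameters."""
    conn.execute(
        "INSERT OR REPLACE INTO customer (id, name, email) VALUES (?, ?, ?)",
        astuple(customer),
    )

def load(conn, customer_id):
    """Row -> object: rebuild the dataclass, or None if no row matches."""
    row = conn.execute(
        "SELECT id, name, email FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    return Customer(*row) if row else None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
save(conn, Customer(1, "alice", "alice@example.com"))
print(load(conn, 1))
```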

      --
      Qxe4
    2. Re:What's the benefit exactly? by hibiki_r · · Score: 1

      A conversion layer is wasteful when there's only one way to look at your data. In that case, key value pairs can perform better, no question.

      The problem lies in situations where you need to look at the data in 5 different ways, or 50. Then a single object model for your data is a whole lot less practical than having a conversion layer and keeping the data in a very flexible format, like a relational model.

    3. Re:What's the benefit exactly? by phantomfive · · Score: 1

      Honestly, I don't even care about performance; relational databases perform reasonably well......I just hate all the time I have to spend actually WRITING the conversion layer.

      --
      Qxe4
    4. Re:What's the benefit exactly? by SendBot · · Score: 1

      Well, I have a conversion layer to create the object my program uses, but I can't think of a need to convert it back. All the things that make it what it is are a result of all the little things that interact with the db. Using triggers, it knows when to update parts of itself. The parts that interact with the db often don't care about the object, even if it's being used as an input to those parts.

      When I DO need to care about the object, replicate it, or maintain persistence, then I use...

      DUN DUN DUN!

      memcached. (I rtfa'd and even amazon's thing said it was basically a key -> value system)

      If I did this exclusively with my object instead of sql, I don't know right away how I would do all my searches and processing because everything is so hugely related... and I think the whole point of this nosql thing is that it's a non-relational alternative for when things are pretty basic, but comprise enormous data size.

    5. Re:What's the benefit exactly? by tg2k · · Score: 1

      Seriously. If your database is of any decent size you should find a good ORM tool so you can keep to what you do best: the business logic. Having written data access layers with minimal help, and used a good ORM, I find there is no substitute for the ORM, especially if it allows you some additional flexibility for corner cases.

  17. Except in the in end by xednieht · · Score: 1

    The Patriots themselves levied their own heavy taxes emulating those against which they had originally rebelled

    In the end it's all just 1's and 0's.

    --

    Hope is the currency of fools
    1. Re:Except in the in end by Anonymous Coward · · Score: 0

      The Patriots themselves levied their own heavy taxes emulating those against which they had originally rebelled

      Why do we call them "patriots"? They were traitors. They rebelled against the legitimate government of the day and started a full-blown war. Sure, they won that war and subsequently set up a great democratic nation that now leads the world, but that doesn't mean we can retrospectively claim that they were being patriotic towards a country that didn't even exist at the time they made their stand.

      I guess that's the answer, of course. We don't want to give people the idea that rebelling is good, even though the USA wouldn't exist unless those treacherous founders had rebelled. So we make up lies instead, and try to pretend that while it was good for them to take up arms against their government, we should implicitly obey everything our government tells us to do -- and we call both these things "patriotism".

  18. What else is there to use besides SQL by Orion+Blastar · · Score: 1

    go back to flat files aka DAT files.

    Use the old DBase III standard DBF files?

    Use the old Lotus 123 WK1 files?

    Use MS-Office MS-Access MS-Excel etc files?

    Use comma separated values files?

    SQL set a standard for relational databases, a structured query language that almost any database can use and then build extensions to it.

    Will the Post-SQL age begin, and will it be object oriented and a fifth generation language?

    --
    Remember, Slashdot does not have a -1 disagree moderation, and no, troll, flamebait, and overrated are not substitutes.
    1. Re:What else is there to use besides SQL by godrik · · Score: 1

      go back to flat files aka DAT files.

      Technically, that is what they do. Basically, they just say that they do not need a classical RDBMS to do their job. I agree with them that an RDBMS makes a poor implementation of big dictionaries. :)

    2. Re:What else is there to use besides SQL by Orion+Blastar · · Score: 1

      Yeah but DAT files don't work too well in a multiuser environment. For example as one person is changing the DAT file, another could be trying to change it from a different machine. Ever got that "Cannot write to file, it is in use" error before?

      I suppose for small databases a DAT file would work, but in an environment of over 100 users you need something like a RDBMS to handle the requests and make sure that the database isn't overloaded and there are no conflicts and performance issues.

      Can you imagine a terabyte DAT file? Then trying to cycle through records to find what you wanted? Now maybe a megabyte or under DAT file can handle it, but once the database grows to enterprise level, there might be some problems.

      --
      Remember, Slashdot does not have a -1 disagree moderation, and no, troll, flamebait, and overrated are not substitutes.
    3. Re:What else is there to use besides SQL by godrik · · Score: 1

      Well, the dat file is only the stored version of the 'database'; you still keep a daemon running to manage access to it. Or even several daemons. That's basically what a DHT is.
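
      A toy sketch of that idea, assuming each "daemon" is just an in-process object holding a dict. `Node` and `TinyDHT` are made-up names, and a real DHT adds networking, routing, replication and usually consistent hashing so that nodes can join and leave without reshuffling every key. The core point survives, though: a stable hash of the key decides which daemon owns it.

```python
import hashlib

class Node:
    """Stands in for one daemon and the file it manages."""
    def __init__(self, name):
        self.name = name
        self.store = {}

class TinyDHT:
    def __init__(self, names):
        self.nodes = [Node(name) for name in names]

    def _owner(self, key):
        # Stable hash, so every client maps a key to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def put(self, key, value):
        self._owner(key).store[key] = value

    def get(self, key, default=None):
        return self._owner(key).store.get(key, default)

dht = TinyDHT(["node-a", "node-b", "node-c"])
for i in range(100):
    dht.put(f"user:{i}", {"id": i})
print(dht.get("user:42"))
```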

    4. Re:What else is there to use besides SQL by MBGMorden · · Score: 1

      Sounds pretty awesome to me. I'm certainly with you. It might could use some standardized way to request information from that daemon program though. Perhaps some type of structured language that could be used to execute queries against the data. I think we'd be all set with that!

      --
      "People who think they know everything are very annoying to those of us who do."-Mark Twain
  19. Nailguns by tehdaemon · · Score: 1

    Most nails are put in with nailguns. Hammers these days are mostly used for demolition of various sorts, including pulling nails. T

    --
    Laws are horrible moral guides, moral guides make even worse laws.
    1. Re:Nailguns by JobyOne · · Score: 1

      Are we still talking about hammers and nails? Are we talking about SQL now? I'm confused...

      --
      Porquoi?
    2. Re:Nailguns by Anonymous Coward · · Score: 0

      Most nails are put in with nailguns. Hammers these days are mostly used for demolition of various sorts, including pulling nails.

      T

      Frame much?

  20. How about saying yes to the alternative by syousef · · Score: 4, Insightful

    Saying no to SQL and relational databases is just fine if you've got something better to replace it with. However I know of no such thing. The reason they're popular is that they are so powerful for data storage. If something better came along you wouldn't even need to say no to SQL. You'd just say yes to the newer better rival.

    --
    These posts express my own personal views, not those of my employer
    1. Re:How about saying yes to the alternative by Joe+U · · Score: 1

      I'm betting Slashdot runs on a standard SQL type database, most sites like slashdot do as well. It's absolutely the wrong server to use.

      Why no one has realised that NNTP servers are designed to do just this is beyond me.

      SQL is a crutch.

    2. Re:How about saying yes to the alternative by g2devi · · Score: 2, Informative

      Sure. There are several.

      If you do clinical work, you're fairly familiar with EAV databases:
      http://en.wikipedia.org/wiki/Entity-attribute-value_model

      and The Associative Model of Data:
      http://www.lazysoft.com/docs/other_docs/AMD.pdf

      These data models are best when either your schema is inherently hazy (e.g. in the case of patient information) or where the schema is so big that it's impossible to manage (e.g. enterprise data warehousing).
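
      A minimal sketch of the EAV layout, using SQLite through Python so it runs as-is. The `observation` table and the patient rows are invented for illustration. Each fact is one (entity, attribute, value) row, so a new kind of attribute needs no schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE observation (
    entity    TEXT,
    attribute TEXT,
    value     TEXT,
    PRIMARY KEY (entity, attribute)
);
INSERT INTO observation VALUES
  ('patient-1', 'blood_pressure', '120/80'),
  ('patient-1', 'allergy', 'penicillin'),
  ('patient-2', 'heart_rate', '72');
""")

def attributes_of(conn, entity):
    """Collect one entity's sparse attributes into a dict."""
    rows = conn.execute(
        "SELECT attribute, value FROM observation WHERE entity = ? ORDER BY attribute",
        (entity,),
    )
    return dict(rows)

print(attributes_of(conn, "patient-1"))
```

      The cost is that values lose their types and cross-attribute queries turn into self-joins or pivots, which is why this tends to be reserved for genuinely sparse or open-ended schemas.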

  21. XML / XPATH / XQUERY / XSLT / Xhausted by Anonymous Coward · · Score: 0

    SQL can suck. The alternatives the PHB might choose aren't necessarily better. Be careful what you wish for.

    1. Re:XML / XPATH / XQUERY / XSLT / Xhausted by Tablizer · · Score: 1

      Those are mostly tree-based or graph-based query languages. There is a place and time for them, but tree- and graph-based query languages were tried in the late 60's and early 70's and fell out of favor. Reliance on set theory (such as relational's) for associations appears more flexible for multi-purpose use of a given piece of data than the "pointers-and-nodes" of trees/graphs. It was known as the Codd-Vs-Bachman debates, and by most accounts, Codd won.

  22. Whatever by Anonymous Coward · · Score: 0

    Like unix being dead - someone else thinks SQL is dead and worthless.

    I disagree, there is no ONE solution. SQL works great for many types of data access. But an object based db can be great for other types.

    SQL is dead. Long live SQL. :-)

    1. Re:Whatever by lgw · · Score: 1

      Like unix being dead - someone else thinks SQL is dead and worthless.

      Wake up, it's later than you think. Commercial Unix is dead, and in the ground. Oracle just threw the first handful of dirt on the coffin. Linux and BSD have a future, but all the commercial Unixes that used to dominate are swiftly going the way of the minicomputer.

      --
      Socialism: a lie told by totalitarians and believed by fools.
  23. Misses the point by kc8jhs · · Score: 1

    There are plenty of ways to store data inexpensively in a RDBMS. There are plenty of GPL and low cost RDBMS available.

    The real issue is that the more we move into complex data structures and push the limits of what an ORM can do with those simple, inexpensive RDBMS, the more problems we run into trying to map our objects into rows in tables.

    Here is one of the more interesting solutions that I've seen to the problem, but it only works over relatively simplistic data where managing indexes by hand is ok, and it's okay for the indexes to be incomplete at any given moment. Ironically, that gives them more availability than trying to force MySQL to do indexes. But it really depends on the data and needs.
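
    As a rough illustration of what "managing indexes by hand" with an index that may be incomplete can look like (a generic sketch, not the specific solution referred to above; all names are made up): writes go to the primary store first, a separate step updates the index later, and reads re-check each candidate so stale index entries are filtered out.

```python
entities = {}        # primary store: id -> record
index_by_city = {}   # hand-maintained index: city -> ids; may lag behind

def put(entity_id, record):
    """Write the record only; the index is updated by a separate step."""
    entities[entity_id] = record

def reindex(entity_id):
    """Add the current value to the index; stale entries are left in place."""
    record = entities[entity_id]
    index_by_city.setdefault(record["city"], set()).add(entity_id)

def find_by_city(city):
    """Use the index for candidates, then verify against the primary store."""
    candidates = index_by_city.get(city, set())
    return sorted(i for i in candidates if entities.get(i, {}).get("city") == city)

put(1, {"city": "SF"})
reindex(1)
put(1, {"city": "NYC"})           # index still lists id 1 under "SF"
before = find_by_city("NYC")      # not indexed yet, so not found
stale = find_by_city("SF")        # stale entry is filtered by the re-check
reindex(1)
after = find_by_city("NYC")
print(before, stale, after)
```

    Readers never see a wrong answer, only a possibly incomplete one, which is the trade described above.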

  24. SQL is not a database by j.+andrew+rogers · · Score: 5, Insightful

    SQL is not a database, it is a standard interface to a feature set commonly associated with relational models. Before everyone standardized on SQL, there were other relational query languages. The "No" part of "NoSQL" refers to the fact that some basic elements of relational implementations cannot be usefully expressed using a much simpler distributed hash table model.

    All the "NoSQL" does is eliminate all the parts of traditional relational databases that do not scale -- discarding the bottleneck rather than fixing it. These are things like joins and external indexing. Unfortunately, discarding those things means you discard a lot of very important functionality as a practical matter, notably the ability to do fast, complex analytics. Adopting the NoSQL architecture runs contrary to the trend toward more real-time, contextual analytical processing. There are a great many analytical applications that are not amenable to batch-mode pattern-matching, and the NoSQL model is a lot less applicable than I think some people want to acknowledge. In its domain, it is a great tool but it has many, many prohibitive limits. We are essentially trading power for scale.

    That said, do not take this as an endorsement of traditional SQL relational databases either, as they have a number of serious limitations themselves. As just mentioned, a number of the core analytical operations those models support are based on algorithms that scale poorly. The SQL language itself has mediocre support for many abstract data types (e.g. spatial) and data models (e.g. graph), which in part reflects the inadequacies of the assumed underlying database algorithms (e.g. B-trees) that are implicit in SQL. The inability to efficiently do event-driven/real-time applications is also more a reflection of the access methods used in databases than any intrinsic weakness in SQL; SQL may be clunky for that purpose, but that is not the real limiter.

    A truly revolutionary deviation from SQL would usefully implement a superset of the features SQL supports, not take them away. Of course, we would need access methods more capable than hash tables and B-trees to usefully implement those features, which is a lot more work than discarding features that scale poorly. NoSQL is a stopgap technical measure for that small subset of applications where the serious tradeoffs are acceptable.

    1. Re:SQL is not a database by Anonymous Coward · · Score: 0

      A true evolution of SQL was something that Informix came up with after their acquisition of Illustra, an Object-Relational DB.

      Since the Informix 9.x server, you could apply OO concepts to tables and rows. You could create a base table called Person and then create a new one called Employee which would inherit all the attributes from Person plus some additional data attributes, like salary for example. With this model you would create the proper functions cast to the right datatype, so queries became simpler.

      Another feature was that new access methods, along with their datatypes and functions, could be created to represent data that is not well represented in a standard row/column format. Case in point: spatial datatypes. In this case special datatypes were created and their physical representation was also different. For these you didn't have regular B-tree indexes pointing to a specific coordinate dataset; instead the new access method allowed you to create two-dimensional indexes, like a grid on a map that subdivides itself in two dimensions to find the right point on a map based on an (x,y) coordinate pair.

      There were other special datatypes created, like the TimeSeries, which was created for the financial markets to store and query stock value changes (ticks) for a given stock. In this model the time element was introduced as some sort of "row" separator between value elements. Performance improvements were at least 10x and often around 50x for this specific datatype compared to the regular 3NF representation, even on the same Informix server.

      You would get all these great features in an environment that provided transaction consistency, security and scalability for clusters and multi CPU/Core servers.

      The problem with Informix was its mismanagement during the late 90s, when they were trying to bite off more than they could chew, i.e., they were growing too fast and the CEO and board were too aggressive with growth targets. Hence restatements were issued and sales declined.

      In the end IBM bought it, and they have kept on supporting it and adding some features, but not with the same inventiveness as the original Informix lead architects...

    2. Re:SQL is not a database by Anonymous Coward · · Score: 0

      The real problem is: SQL isn't standardized, it's only structured.

  25. Pros & Cons of non-relational solutions by kpharmer · · Score: 5, Interesting

    Note that most of these solutions come from the interwebs, social networks, etc. And it isn't so much anti-sql as it is anti-relational database (sql != rdb).

    The basic premise is that we need different solutions that: can scale very high for very narrowly scoped reads & writes, don't need to perform ranged queries / reporting /etc, and don't need ACID compliance. And that may be the case. Sites like slashdot, facebook, reddit, digg, etc don't need the data quality that ebay needs.

    On the other hand, ebay achieves scalability AND data quality with relational databases. And when I've worked with architectures that scale massively and avoid the relational trap for better solutions - they inevitably later regret the lack of data quality and complete inability to actually get trends and analysis of their data. It *always* goes like this:
        Me: So, is this thing (msg type, etc) increasing?
        Developer: No idea.
        Me: Ok, so let's find out.
        Developer: How?
        Me: I don't know - typical approach - let's query the database.
        Developer: It'll take four+ hours to write & test that query and then days to run. And when it's done we might find that we wrote the query wrong.
        Me: What?!?
        Developer: We had to do it this way, you can't report on 10TB databases anyhow
        Me: What?!? Are you on crack? there are dozens of *100TB* relational databases out there that people are reporting on
        Developer: well, we probably don't need to know what that trend is anyhow
        Me: I'm outta here

    1. Re:Pros & Cons of non-relational solutions by GryMor · · Score: 2, Informative

      Yah, good luck with that. Unless the index already exists to do nearly exactly what you want, queries against multi terabyte production oracle tables have this bad tenancy of never completing, if you are lucky, and (effectively, from the perspective of the app running on top of it) taking down the database if you are not so lucky.

      If the index doesn't exist, good luck adding it in less than a week or two.

      For the most part, novel trend and behavior information is trivial to instrument in the service layer as a side effect of the app's normal operation, at which point you record it in your query logs and build reports based on the logs.

      --
      Realities just a bunch of bits.
    2. Re:Pros & Cons of non-relational solutions by lgw · · Score: 0

      That was very well put, up to the point of associating Ebay with quality. There's nothing inherent in, say, key-value-pair databases that makes reporting hard - it's just that people rolling their own solutions often didn't think about that until it was too late. Massively parallel reporting works as well as massively parallel queries, with a little forethought.

      In the other direction, companies like Visa that have a strong need for data quality seem to scale up, not out, which suggests that there are still difficulties there. Of course, I'm pretty sure Visa doesn't use a RDB, although I guess they might be using DB2 or something these days. It's been quite some time since I heard the CIO of Visa speak, but back then they were exploring ways to stop having to write their entire system themselves - but no possible Sun/Oracle cluster was even close. If you can afford to scale up instead of out, it's really hard to beat running the app and database local to the storage and each other (at which point SQL is pointless, but as you say, SQL != RDB).

      --
      Socialism: a lie told by totalitarians and believed by fools.
    3. Re:Pros & Cons of non-relational solutions by Anonymous Coward · · Score: 0

      After re-reading your post, I'm not sure if you know but ebay is transactionless:
      http://martinfowler.com/bliki/Transactionless.html

    4. Re:Pros & Cons of non-relational solutions by kpharmer · · Score: 1

      Large databases use a combination of range partitioning AND indexing. If you're lucky you've also got hash partitioning - to distribute your database across N servers. Partitioning is far more general and forgiving for purposes like this than indexing.

      And creating trends in your application layer then storing them in logs can work, but:
      1. you still rely on figuring out in advance what you're going to need
      2. if you don't load those logs back into the relational database then you can't effectively join it to all that data. And you then either must log redundant data or have useless logs.
      3. grep & sed or custom reporting code against your logs is a poor substitute for any standard reporting tool or custom code against your database.
      4. database features like partitioning, indexing, parallelism result in in-database aggregates being faster than logs
      5. did i mention automatic query rewrite? where the database automatically converts eligible queries against the base table to actually run against the summary tables...
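
      A minimal sketch of how range and hash partitioning combine to locate a row, assuming a hypothetical four-server cluster, monthly partitions, and made-up names (`hash_partition`, `range_partition`, `events_YYYY_MM`). Real databases do this internally; this only illustrates the routing idea:

```python
import zlib
from datetime import date

NUM_SERVERS = 4  # hypothetical cluster size


def hash_partition(customer_id: str, num_servers: int = NUM_SERVERS) -> int:
    """Hash partitioning: spread keys across N servers.

    crc32 is used because it is stable across runs, unlike Python's hash().
    """
    return zlib.crc32(customer_id.encode("utf-8")) % num_servers


def range_partition(event_day: date) -> str:
    """Range partitioning: one partition per calendar month (an assumption)."""
    return f"events_{event_day.year:04d}_{event_day.month:02d}"


def locate(customer_id: str, event_day: date):
    """Return (server number, partition name) for a single row."""
    return hash_partition(customer_id), range_partition(event_day)


server, partition = locate("bob", date(2009, 7, 4))
```

      A query restricted to one month only has to touch that month's partition on each server, which is why this tends to be more forgiving than relying on an index that may not exist.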

    5. Re:Pros & Cons of non-relational solutions by Anonymous Coward · · Score: 0

      Fabian Pascal would have a field day with your little dialog there.

    6. Re:Pros & Cons of non-relational solutions by Anonymous Coward · · Score: 0

      You need better developers.

    7. Re:Pros & Cons of non-relational solutions by Thundersnatch · · Score: 1

      We query unindexed, multi-terabyte tables in Microsoft SQL Server all the time. The DBs don't go down or offline. In MSSQL, use WITH NOLOCK if you can afford dirty reads (you usually can for analytics). I am surprised Oracle has no equivalent hint that would allow a long-running query to not generate large transaction state. Even if you turn on MVCC in MS SQL Server, you can still specify locking behavior for individual statements.

    8. Re:Pros & Cons of non-relational solutions by Anonymous Coward · · Score: 0

      The basic premise is that we need different solutions that: can scale very high for very narrowly scoped reads & writes, don't need to perform ranged queries / reporting /etc, and don't need ACID compliance. And that may be the case. Sites like slashdot, facebook, reddit, digg, etc don't need the data quality that ebay needs.

      So you're saying those organizations can get by with a "half-ACID" solution?

  26. forgotten revolution by MaoTse · · Score: 1

    http://www.zope.org/ - both WLS and hibernate made obsolete decade ago
    that "both" - unfortunately the case when too much is too much ;-)

  27. Hogwash! by gbutler69 · · Score: 2

    Check out "Window Aggregates" etc in Oracle and PostgreSQL 8.4

    --
    Over-the-top Response Guy! Giving "Over-the-Top Responses" since 1970.
    1. Re:Hogwash! by Mattwolf7 · · Score: 1

      Oracle and PostgreSQL are SQL...

    2. Re:Hogwash! by colinrichardday · · Score: 1

      I believe that was his point, that SQL supports (or could support) such functions. He was responding to the post two up from his.

  28. Data out-lives applications by 4to6Offshore · · Score: 5, Insightful

    First: my mantra: Data belongs to the organization, not the application... if the app fails and data is accessible then we all go on - if the data fails or is locked away - what was the point of the app again?

    In a SQL database the data is understood by the organisation, DBAs and data architects. If left to app developers taking an app-centric approach to data... I get nervous quickly.

    So long as the data is just as definable and accessible as current SQL databases then all good - give me an app with some odd-ball storage then it is bye-bye.

    1. Re:Data out-lives applications by presidenteloco · · Score: 1

      Most relational databases I've seen are written at an "implementation" level of table names, normalization, etc, made very specific to one main application,
      and the databases require the particular application to add high-level (logical level) domain semantics and business rule constraints to the data. i.e. to add interpretation to it.
      i.e. the data without the program that interprets and presents it and controls its modification is pert' near worthless.

      --

      Where are we going and why are we in a handbasket?
    2. Re:Data out-lives applications by Tony-A · · Score: 1

      >i.e. the data without the program that interprets and presents it and controls its modification is pert' near worthless.
      You mean like hijacking credit card numbers, personal data, etc. etc.
      That's kinda like without the facade the building and the land it stands on is of little value.

    3. Re:Data out-lives applications by Rich0 · · Score: 1

      True, but I think the point is that you STILL have easy access to the data - at least for reads.

      I've seen plenty of data migration headaches dealing with proprietary software. Trust me - they're always 10X easier if the data is in a database. Otherwise you spend more time just trying to figure out what's in some proprietary data store than it would otherwise take to just migrate all the data in the database.

      Often with a database you don't even need to migrate most of the data. In many cases you can just write up a few reports and point them at the final read-only snapshot of the database. This takes almost no effort.

      I can see why Google needs BigTable - they had one problem with an ENORMOUS scale and their whole company was built on being able to solve that one problem. If they're using that same solution to run their payroll system, I'd be concerned. Ditching the RDBMS is a viable solution to certain problems, but those are probably only 0.00001% of the problems a company will encounter.

  29. Cartesion Product? by gbutler69 · · Score: 2, Insightful

    Epic Fail. You're wrong. It in no way results in a "Cartesian Product". That would be a "Cross Join", not an "inner join". From my experience, people who complain about SQL and relational databases are, for the most part, ignorant. They really don't even understand what they are saying or what they are talking about. I've seen so much abuse and misunderstanding of relational data and SQL in my career that I just have to laugh at this sort of thing.

    --
    Over-the-top Response Guy! Giving "Over-the-Top Responses" since 1970.
    1. Re:Cartesion Product? by MightyMartian · · Score: 1

      Amen to that. There are a helluva lot of people getting pushed out of IT schools (maybe they're coming out of the one next door to the Cordon Bleu School of Cooking) who have at best a passing knowledge of SQL, probably gained from shitty little LAMP projects.

      When I worked for a small ISP, I had a database that stored every single login and logout from about 1998 until the place shut down in 2006. There were millions of records, and I could still run queries to find out the total number of minutes Bob Jones had been dialed in, or do more complex groupings based on the type of dialup account (number of hours, where the account was geographically, etc.). Even better, I actually wrote an Accounts Receivable system that could hook into the data to do monthly billings and detailed customer histories.

      --
      The world's burning. Moped Jesus spotted on I50. Details at 11.
    2. Re:Cartesion Product? by Cyberax · · Score: 1

      Nope. It DOES result in a Cartesian product of tuples.

      Notice, I never said that it results in a Cartesian product of the whole tables.

    3. Re:Cartesion Product? by lgw · · Score: 1

      If managed carefully, sure, SQL and RDBMSs can scale to "millions of records". That's not an impressive amount of data any more. I'm regularly frustrated by the amount of careful design required just to make a DB perform properly when dealing with 100 million records on hardware my customers can actually afford.

      I've worked on non-relational, non-SQL systems that were an order of magnitude faster in practice than any RDBMS I've used: programs in a real language doing the simplest queries, and doing all the heavy lifting in the program (all local to the storage, of course). Sure, that environment was a bit of a pain in the ass compared to SQL for small problems, but with big problems SQL just isn't very expressive. When the SQL abstraction starts leaking, and the subtle details of your particular implementation start to matter, it's just easier to go back to lightning-fast rows and columns, and manage the "relational" yourself.

      These "new" key-value pair databases are also supposed to be hot shit, but they seem to me to be slower for small problems, and only really shine once you're past what a cluster of commodity machines can handle. Of course, that's a more and more common problem.

      --
      Socialism: a lie told by totalitarians and believed by fools.
    4. Re:Cartesion Product? by shutdown+-p+now · · Score: 1

      How do you define a Cartesian product of two tuples?

    5. Re:Cartesion Product? by TheThiefMaster · · Score: 1

      An "inner join" never results in more rows than are in the other table, if you're joining on something that's unique in one table, e.g. customer id.

      No product about it.
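
      A small sketch of the difference, using an in-memory SQLite database. The `customers` and `orders` tables and their contents are invented for illustration. Because `customer_id` is unique in `customers`, each order matches at most one customer, so the inner join cannot return more rows than `orders` holds, while the cross join returns every pairing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2), (13, 2);
""")


def count(sql):
    """Run a COUNT(*) query and return the single number it yields."""
    return conn.execute(sql).fetchone()[0]


# Join on a key that is unique in customers: one match per order at most.
inner_rows = count(
    "SELECT COUNT(*) FROM orders o "
    "INNER JOIN customers c ON c.customer_id = o.customer_id"
)

# No join condition at all: every order paired with every customer.
cross_rows = count("SELECT COUNT(*) FROM orders CROSS JOIN customers")

print(inner_rows, cross_rows)
```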

  30. Unorthodox user of LDAP? by Zombie+Ryushu · · Score: 1

    Could we see a rise in the use of tree/hierarchical databases like LDAP?

    1. Re:Unorthodox user of LDAP? by MightyMartian · · Score: 1

      Could we see a rise in the use of tree/hierarchical databases like LDAP?

      Yeah, I think Microsoft wrote a SQL library for that.

      --
      The world's burning. Moped Jesus spotted on I50. Details at 11.
    2. Re:Unorthodox user of LDAP? by hrvatska · · Score: 1

      Under the covers Tivoli Directory Server, an LDAP server, uses IBM's DB2 as a data store. It's basically a relational database that's been optimized for use by an LDAP server.

  31. Re:SQL.... by rjstanford · · Score: 1

    SQL is an OSS tool?

    Why'd I pay so much to Informix around 1990 then I wonder?

    For that matter... SQL is a thing? I always thought it was a spec and a language :)

    --
    You're special forces then? That's great! I just love your olympics!
  32. RDBMS and application logic by gd2shoe · · Score: 4, Insightful

    That is one view. It's nice and all, but incomplete. The issue is performance.

    Any time you're dealing with a large quantity of data, it's always easiest to process or filter where it's located. Transmitting it, processing it, and transmitting back changes adds an unreasonable amount of overhead. Hence, SQL is a "Query" language. In other words, you have the RDBMS do reasonable data processing and filtering of records for you. Your application should only need to specify the operations performed, and should only process data if your computation is particularly unusual. This makes feasible computations that would otherwise be entirely unreasonable. (note that an application working on the same machine generally has the same issue as one working on a separate system. SQL servers present the application with a stream of data - pipe, socket, etc)
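
    To make the "process it where it lives" point concrete, here is a rough sketch with SQLite standing in for the RDBMS. The `sessions` table and its rows are made up. Both halves compute the same answer, but the first drags every row to the application while the second only ships the operation and gets the finished totals back:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (username TEXT, minutes INTEGER)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [("bob", 30), ("bob", 45), ("alice", 10), ("alice", 5), ("carol", 120)],
)

# Application-side: fetch every row, then aggregate and filter in the app.
totals_app = {}
for username, minutes in conn.execute("SELECT username, minutes FROM sessions"):
    totals_app[username] = totals_app.get(username, 0) + minutes
heavy_app = {user: total for user, total in totals_app.items() if total > 60}

# Database-side: describe the operation; only the matching totals come back.
heavy_db = dict(conn.execute(
    "SELECT username, SUM(minutes) FROM sessions "
    "GROUP BY username HAVING SUM(minutes) > 60"
))

print(heavy_db)
```

    With five rows the difference is invisible; with millions, the first version spends most of its time moving data it is about to throw away.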

    My opinion: SQL is horrendous. It's a pain to use, and many basic data transforms cannot be described in that language (at least without some huge, awful, convoluted command == maintenance nightmare).

    --
    I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    1. Re:RDBMS and application logic by complete+loony · · Score: 1

      The 2 biggest issues I have with sql used in application development;

      Slightly changing the requirements often involves completely restructuring the query. eg, A group by and having clause no longer cut it, now you need to use a work table and at least 3 separate statements.

      Queries on the same tables are difficult to reuse for similar tasks, resulting in many copies of similar statements that must be maintained.

      --
      09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
    2. Re:RDBMS and application logic by Anonymous Coward · · Score: 0

      ...which is why you have a competent DBA wrap "huge, awful, convoluted command ==" nightmares into a some stored procedures and give the programmers access to those.

    3. Re:RDBMS and application logic by ukyoCE · · Score: 1

      Eh, in my view the data is the app. You could put 20 different front-ends on the same data, the data always stays the same. I don't understand devs who look at apps as code only, and then get annoyed that changing the data requires the data change, in addition to the code.

      The problem is when an app has so many redundant layers that a data change means making the same change to 8-10 different layers of abstraction. And that's a *code* problem, not a data problem, and certainly not an SQL problem.

  33. Re:Next Up... by Samah · · Score: 1

    Python?

    If by that you mean "saying no to it", I applaud you. :)
    Lua is the only way to program.

    --
    Homonyms are fun!
    You're driving your car, but they're riding their bikes there.
  34. IBM's IMS is a Hierarchical Database by aoheno · · Score: 1

    IBM has had a hierarchical database called IMS since 1966 http://en.wikipedia.org/wiki/IBM_Information_Management_System

    It holds and manipulates data in a hierarchy accessed and manipulated with the DL/I query language http://en.wikipedia.org/wiki/Data_Language_Interface

    Now, if the venerable IBM would please grow up and Open Source IMS we could have the best of both worlds.

    Better still, reverse engineer the thing from IBM specs, dropping all the legacy fluff accumulated over 40 years, and call it MyHQL.

    --
    Her lips were softer than a duck's bill, but her quacks ...
    1. Re:IBM's IMS is a Hierarchical Database by nitio · · Score: 1

      Please, oh please, do NOT suggest IMS. I have to work with this piece of software every day and, I'm not kidding, I want to stab myself in the eyes every time I have to work with it. Maybe it's because I have to use it together with COBOL, but it sure does not make my life easier.

      Of course, this is very biased and personal. In a few years I might end up loving IMS but until then I want it to die a painful slow death.

      --
      http://stoploudness.org/
    2. Re:IBM's IMS is a Hierarchical Database by Dark$ide · · Score: 1

      Now, if the venerable IBM would please grow up and Open Source IMS we could have the best of both worlds.

      That's not going to happen.

      About 45% of IMS is still supplied as assembler source to licensed customers. It was first generally available before the IBM diktat that said "all source is confidential" and that only OCO (object code only) materials would be supplied to customers.

      DB2, IBM's relational database, was developed by the same people originally as a complete replacement for DL/I. But DL/I databases are still alive and well, and supporting them is paying my mortgage.

      It was used as the parts database for the Apollo program, Neil Armstrong wouldn't have made it to "One small step ..." without IMS

      IMS is still faster than DB2, but DB2 is easier to use from an application design and application programming point of view. That's why there's a Java/SQL interface to IMS databases.

      --

      Sigs. We don't need no steenking sigs.

  35. Relational good, SQL not so by Skapare · · Score: 1

    Relational databases have plenty of good uses. They are not the universal solution to data storage, but there are plenty of cases where they are the best choice. I would certainly not use a relational database just to store user passwords. Something like Berkeley DB would be good. I have used raw files individually (one file per user named by username) with good success (the underlying filesystem was Reiserfs with tail packing enabled). But even where a relational database is called for (a CRM system, for example), I find that SQL is always a hindrance. I'd rather just get the data across a simple protocol (which can be wrapped in a simple API binding for the language in use).

    --
    now we need to go OSS in diesel cars
    1. Re:Relational good, SQL not so by Hognoxious · · Score: 3, Funny

      I'd rather just get the data across a simple protocol (which can be wrapped in a simple API binding for the language in use).

      While you're here, can you fill in the following form.

      I would like my pony to be:
      [ ] female
      [ ] male, entire
      [ ] male, gelded

      with coat colour:
      [ ] white
      [ ] black
      [ ] brown
      [ ] piebald
      [ ] other, please specify _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    2. Re:Relational good, SQL not so by pjt33 · · Score: 1

      You missed "pink" from the list of colours.

    3. Re:Relational good, SQL not so by Anonymous Coward · · Score: 0

      [X] other, please specify _ pink _ _ _ _ _ _ _

    4. Re:Relational good, SQL not so by pjt33 · · Score: 1

      Be serious. It's like Ford missing "black" from the list of options for the Model T. Having an "other" box doesn't make up for it.

  36. Re:Next Up... by grepya · · Score: 1

    Great job moderator. It's an attempt to make a reasonable point with the help of a deliberately over-the-top analogy. You may not agree with the point I'm trying to make (that much is obvious), or you may not agree with the applicability of my analogy (which, I humbly submit, is pretty close). That doesn't make it trolling.

  37. Late to the party by jedidiah · · Score: 1

    Invent the successor to SQL and clueless corporate drones that call
    themselves developers but really aren't will come along and abuse
    that too.

    SQL is a useful construct. What replaces it or augments it should
    be similar. IOW, where's the corresponding ACM article for this
    holy-grail-SQL-replacement?

    As far as SQL representing expensive and bloated DBMS engines, you're
    a bit late to the party to be whining about that. Sure perhaps 10 years
    ago you might have had a point, but the overhead of SQL just
    isn't there anymore. Cheap/free low overhead SQL engines are plentiful.

    SQL in general may be suboptimal but it is an effective standard.
    It allows a degree of standardization and flexibility that will
    likely not be matched by a successor any time in the near future.

    --
    A Pirate and a Puritan look the same on a balance sheet.
  38. Re:Next Up... by Anonymous Coward · · Score: 0

    Nah, Perl (or, in Perl, $a = )(J0(JE#R09J@#_)JQ_)@J(!)![RL}).

  39. Not going to happen by russotto · · Score: 1

    SELECT * FROM DB WHERE DBCLASS='REAL' AND QUERY_LANGUAGE NOT LIKE '%SQL%'
    0 Rows Returned

    (Stupid slashdot yelling filter. I hate queries in lowercase. )

  40. I don't understand by 93+Escort+Wagon · · Score: 4, Funny

    So a bunch of Excel users got together for dinner in San Francisco - why is this news?

    --
    #DeleteChrome
    1. Re:I don't understand by shutdown+-p+now · · Score: 1

      The news is that it was actually a bunch of Gnumeric users this time.

    2. Re:I don't understand by Anonymous Coward · · Score: 0

      So a bunch of Excel users got together for dinner in San Francisco - why is this news?

      exactly ;)

  41. SQL by Baseclass · · Score: 1

    I had no idea this was even an issue. Very enlightening indeed.

    --
    ^^vv<><>BA
  42. Nested sets by tepples · · Score: 1

    Trees is a wellknown problem of SQL

    And the "nested set" representation of a tree, explained by Joe Celko, is a well-known solution. Give each node an ID making sure that the IDs' collation is pre-order, and then in each node, store the ID of its first and last descendants. So node B is A's descendant if A.firstdesc <= B.nodeid AND B.nodeid <= A.lastdesc. There are some situations where INSERT statements can become slow (worst case: O(n)), but planning your ID space carefully can ease those.
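
    A minimal sketch of that descendant test, run against an in-memory SQLite table. The table name `tree`, the sample nodes, and the choice to store NULL for a leaf's `firstdesc`/`lastdesc` are my assumptions, not part of Celko's write-up. The IDs are assigned in pre-order so each subtree occupies one contiguous ID range:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tree ("
    " nodeid INTEGER PRIMARY KEY, name TEXT,"
    " firstdesc INTEGER, lastdesc INTEGER)"
)
# Pre-order numbering: root(1) -> a(2) -> a1(3), a2(4); root -> b(5) -> b1(6).
# Leaves store NULL, so the range comparison never matches for them.
conn.executemany("INSERT INTO tree VALUES (?, ?, ?, ?)", [
    (1, "root", 2, 6),
    (2, "a", 3, 4),
    (3, "a1", None, None),
    (4, "a2", None, None),
    (5, "b", 6, 6),
    (6, "b1", None, None),
])


def descendants(name):
    """All descendants of the named node, found with one range comparison."""
    rows = conn.execute(
        "SELECT b.name FROM tree a JOIN tree b "
        "ON a.firstdesc <= b.nodeid AND b.nodeid <= a.lastdesc "
        "WHERE a.name = ? ORDER BY b.nodeid",
        (name,),
    )
    return [row[0] for row in rows]


print(descendants("root"))
print(descendants("a"))
```

    The whole subtree comes back without recursion, which is the appeal. The cost shows up on insert, when ranges of existing nodes may need renumbering.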

  43. SQL is to data query languages by jd2112 · · Score: 2, Insightful

    ...What democracy is to methods of government.

    The worst ever devised excepting everything else that has ever been tried.

    --
    Any insufficiently advanced magic is indistinguishable from technology.
    1. Re:SQL is to data query languages by arevos · · Score: 1

      The worst ever devised excepting everything else that has ever been tried.

      Uh, actually there have been plenty of relational query languages that are superior to SQL.

  44. Can you say "Cloud"? by Eskarel · · Score: 1

    That's who these people are. Google and Amazon in particular have heavy investments in the cloud.

    SQL and relational databases work fine for most real purposes, they even scale fairly well. What they don't do, is scale very well via a cloud infrastructure without very serious amounts of data analysis and design. This isn't so much a problem with relational databases or the way they scale, but with the way they distribute. You can't magically throw more small servers at an sql database and make it work faster because they're not designed that way.

    Amazon managed to put their databases in a cloud, but only because they did a lot of analysis on their data and found some very convenient patterns. Those patterns don't directly translate to other data sources.

    If you have to spend half a million dollars redesigning your database it's cheaper to buy servers than to pay Amazon to host them in the cloud. Amazon and Google both want you to pay them to host your data in the cloud. For obvious reasons that means they want a new database system which works well when applied in small amounts over their "computational unit" servers as opposed to when stuck on a gigantic powerful single server as is the case with current SQL databases. They don't really care if their new alternative is actually any better, just that it's good enough that people find it cheaper to use the cloud, if they can get a cut of the use of the technology as well as the hosting, all the better.

  45. 9 years ago by stimpleton · · Score: 1

    I did my BSc in Information Systems 9 years ago. A strong part of the practical in the final year was Oracle DBMS with SQL, along with theory about object-oriented databases and modeling.

    I doubted back then that OO in databases would ever see broad use.

    I still think it's years away. It's all very good for a few visionaries who are clever with coding and wish to see their better mousetrap take control of data. But in the real world, when an agile business needs to step up to the plate on a monthly basis to adapt to some crap board decision that you have no control over, DBMSs allow one to respond to their decisions.

    --

    In post Patriot Act America, the library books scan you.
  46. You can do cache tables in SQL by tepples · · Score: 1

    RDBMS systems focus on Consistency, and trade Availability for it. Your bank's computer can be down for an hour... inconvenient, but acceptable. But they cannot, under ANY circumstances, be incorrect. Period. Google, on the other hand, can handle some slightly incorrect data... but being offline is totally unacceptable.

    You can simulate this behavior in pure SQL by inserting from a select. This makes a table that caches what needs to be cached on each partition.

    INSERT INTO cache_newitems (column, column, column)
    SELECT column, column, column
    FROM catalog_items
    WHERE conditions

    You can even TRIGGER this to automatically update when practicable, giving similar functionality to the "materialized view" features in some DBMS implementations.
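
    Here is a runnable sketch of the same idea in SQLite. The table names `cache_newitems` and `catalog_items` come from the statement above; the columns (`item_id`, `title`, `is_new`), the trigger name, and the sample rows are hypothetical stand-ins for "column" and "conditions". Trigger syntax differs between DBMS products, so treat this as an illustration rather than portable SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE catalog_items (
        item_id INTEGER PRIMARY KEY, title TEXT, is_new INTEGER);
    CREATE TABLE cache_newitems (item_id INTEGER PRIMARY KEY, title TEXT);

    INSERT INTO catalog_items VALUES
        (1, 'widget', 1), (2, 'gadget', 0), (3, 'gizmo', 1);

    -- Initial fill: INSERT ... SELECT copies the rows worth caching.
    INSERT INTO cache_newitems (item_id, title)
    SELECT item_id, title FROM catalog_items WHERE is_new = 1;

    -- Keep the cache current for rows inserted later.
    CREATE TRIGGER cache_new_item AFTER INSERT ON catalog_items
    WHEN NEW.is_new = 1
    BEGIN
        INSERT INTO cache_newitems (item_id, title)
        VALUES (NEW.item_id, NEW.title);
    END;
""")

# Only the first of these two inserts satisfies the trigger's WHEN clause.
conn.execute("INSERT INTO catalog_items VALUES (4, 'doohickey', 1)")
conn.execute("INSERT INTO catalog_items VALUES (5, 'thingamajig', 0)")

cached = [row[0] for row in conn.execute(
    "SELECT title FROM cache_newitems ORDER BY item_id")]
print(cached)
```

    This sketch only handles inserts. A fuller version would also need triggers for updates and deletes, which is roughly the bookkeeping a materialized view does for you.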

  47. How hard is it to understand? by dread · · Score: 2, Insightful

    Use the appropriate tool. Always. There are tons.

    Don't use a relational database to try to represent hierarchical data. Don't try to use LDAP to do analytics. Think of the performance implications before you have more than two users accessing your system. Data storage is a very different animal, you are often (though not always) I/O bound. This is very different from being limited by the amount of instructions you can deal with per unit of time. Don't think otherwise because it will bite you in the ass.

    And still I see people making the same stupid mistakes over and over. But it's pretty simple really:

    A solution designed to be generic will ALWAYS be slower than a solution that is customized. This shouldn't be surprising. If you have serious performance requirements (ESPECIALLY if they are coupled with huge amounts of data) then a custom solution is definitely something you should look into. At some point you will run into a brick wall and find out that there is stuff you can't do with the solution you have in place. This is natural. Custom solutions to hard problems always lead to restrictions in terms of future features. Always. You will NEVER be able to anticipate all features that you would like to have. (Yes, this is true for Google as well. No they don't have any special kind of magic dust that they sprinkle on their things there, they do the best they can and then they get bitten in the ass too, just like everybody else.)

    --
    I've had a wonderful time, but this wasn't it -- Groucho Marx
  48. Hash tables, caches and time sequences... by flyingfsck · · Score: 1

    I must be getting old, but there are many types of simple table-based systems around, and they are used in things like sendmail, postfix, squid-cache, round robin database, look-aside tables, computer instruction caches and so on. This stuff is nothing new. It is a matter of using the right tool for the job without the need to consult an oracle.

    --
    Excuse me, but please get off my Pennisetum Clandestinum, eh!
  49. The RDBMS responds to the troll by smittyoneeach · · Score: 5, Funny

    See, I don't think there is ever a good time or place for SQL.

    SELECT text FROM mild_introductory_statements WHERE id=random();

    Anyone who says so has never had to use it.

    SELECT text FROM statements_indicating_superior_experience WHERE id=random();

    I like to compare it with JavaScript.

    SELECT text FROM unrelated_tool WHERE id=random();

    It's a language that is difficult to refactor, maintain, and while it's a standard, the standard is so vague that it's useless.

    SELECT text FROM seemingly_valid_yet_unsubstantiated_objections WHERE id=random();

    Like JavaScript, people are trying to build other languages on top of it to hide its shortcomings -- for javascript you have tools like GWT, and for SQL you have HQL, Linq, etc.

    SELECT text FROM wrongheaded_causal_analysis WHERE id=same_one_as_two_queries_ago();

    Not to say that there is anything wrong with relational databases, we just lack a good tool to interface with them.

    SELECT text FROM reasonable_sounding_parthian_shot_to_obscure_trolling WHERE id=random();

    --
    Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
    1. Re:The RDBMS responds to the troll by Rich0 · · Score: 2, Informative

      I hate to nitpick, but:

      1. Your database needs normalization. Almost all of that data should have been in one table, using fields to indicate what kind of statements they were.

      2. Your use of random() isn't going to work - unless the random number generator just happens to generate a number that just happens to be the ID of one of your records. The typical way to do what you're doing is to order by random() and limit the query to one record.
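
      Both nitpicks can be sketched together in SQLite. The single `statements` table, its `kind` and `body` columns, and the sample rows are all made up for illustration. Ordering by `random()` and limiting to one row picks a random record regardless of which IDs actually exist:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One table for every statement, with a field saying what kind it is.
conn.execute(
    "CREATE TABLE statements (id INTEGER PRIMARY KEY, kind TEXT, body TEXT)"
)
conn.executemany("INSERT INTO statements (kind, body) VALUES (?, ?)", [
    ("mild_introduction", "See, I don't think there is ever a good time."),
    ("mild_introduction", "I have some concerns about this tool."),
    ("superior_experience", "Anyone who says so has never had to use it."),
])


def random_statement(kind):
    """Return one randomly chosen statement of the given kind, or None."""
    row = conn.execute(
        "SELECT body FROM statements WHERE kind = ? "
        "ORDER BY random() LIMIT 1",
        (kind,),
    ).fetchone()
    return row[0] if row else None


print(random_statement("mild_introduction"))
```

      Sorting the whole result set just to keep one row is wasteful on big tables, but for a handful of canned replies it is the simplest thing that works.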

  50. What? by SQLLord · · Score: 1

    That article you linked to is one of the craziest things I've ever read. RDBMSs are powerful enough to do anything -- but so many engineers are too lazy to learn proper SQL.

    If you need to handle more transactions or queries, buy a bigger box. You can have hundreds of TB in a nice enterprise-grade server. If you HAVE to run your data across multiple machines, just spend some time and actually *think* about what data you want where, and then write a little code to send it to the right database.

    Even Google used a ton of MySQL boxes for a long time to deliver their searches. Scale-out architecture is a lot more of a lie than its proponents make it out to be, check out this post on CodingHorror.

    Eventually, all these new "databases" will need to actually be used in a *real* environment, and they'll have transactions, SQL support, indexing, a good UI, and everything else. All MS or Oracle need to do is a little tweaking to make their multi-box configurations more robust, and they'll crush everything else.

    1. Re:What? by Vader82 · · Score: 1

      That article you linked to is one of the craziest things I've ever read. RDBMSs are powerful enough to do anything -- but so many engineers are too lazy to learn proper SQL.

      Actually there are plenty of engineers who know not only SQL but other query languages too! Proper SQL doesn't help you get extra parallelism in a write-intensive environment.

       

      If you need to handle more transactions or queries, buy a bigger box. You can have hundreds of TB in a nice enterprise-grade server. If you HAVE to run your data across multiple machines, just spend some time and actually *think* about what data you want where, and then write a little code to send it to the right database.

      And all that disk does you how much good when you need to read ALL the data into RAM to run a query? Oh, right.

       

      Even Google used a ton of MySQL boxes for a long time to deliver their searches. Scale-out architecture is a lot more of a lie than its proponents make it out to be, check out this post on CodingHorror.

      Emphasis on "used" rather than "currently uses." If you've got more than a few TB worth of data, scaling out is really the only good way to go. And Jeff Atwood takes a simplistic look at licensing fees (and assumes that you're not using FOSS).

       

      Eventually, all these new "databases" will need to actually be used in a *real* environment, and they'll have transactions, SQL support, indexing, a good UI, and everything else. All MS or Oracle need to do is a little tweaking to make their multi-box configurations more robust, and they'll crush everything else.

      You know what? You're right. In fact, Google and Yahoo are currently in the process of converting BACK to SQL right now. They did the whole scale-out thing and it totally sucked. While it was good enough to run the company and ensure that everything worked reliably and whatnot, it wasn't good enough to appease the automatic report generator machine. That and pretty GUIs are what really matter in life. Not simplicity, robustness, and CERTAINLY not scalability.

    2. Re:What? by SQLLord · · Score: 3, Insightful

      How many Googles or Yahoos are there? Like, 5. Let them do whatever broken things they want -- it works for them... for now. It's still expensive, probably just as much as "big iron". Not to mention the countless engineer hours and hosting/electricity costs for their "scale out" systems. It's what happens when you let a bunch of ivory tower PhDs solve real engineering problems.

      In the end, the rest of us serious enterprise engineers will allow Oracle, Microsoft, and the people who have been doing this for 30 years to optimize their code to run on multicore mainframes ... which is where massive computing belongs. Then we query it with a few lines of SQL instead of convoluted algorithms in some "Map Reduce" environment, and we move on with our lives.

    3. Re:What? by SQLLord · · Score: 1

      The more I read that post on the "Road To Failure" blog, the less sleep I want to get.

      You simply can't scale when you have to deal with network latency. More cores, RAM, and disks talking to each other will always be much, much more efficient than waiting for little packets to cross the ethernet.

  51. Toss the bathwater, not the dBaby by Tablizer · · Score: 1

    The problem with the existing RDBMS is not so much relational nor SQL[1], but the fact that they are not dynamic. Why not allow columns and tables to be created willy-nilly and without (up-front) width limits, like in a dynamic programming language? It would need a more flexible type system, or even do away with internal types and make the operators a bit more type-explicit, like Perl does. We have static/compiled programming languages and dynamic/interpreted programming languages. Why not have a similar choice with RDBMS? Stop cloning Oracle's style of relational so closely.

    [1] There's a few minor tweaks that could be done to SQL to improve some really sore spots, like named temporary query references, a kind of user-defined and/or temporary views. This would reduce run-on sentence queries. There's also graph and tree traversal query needs, but somebody pointed out that these already exist to some extent.

  52. Re:Tilting at windmills [Sub-Queries] by Tablizer · · Score: 1

    who hasn't had a business need for multiple levels of aggregates

    I created a draft query language called SMEQL (Structured, Meta-Enabled Query Language, pronounced "smeegol"), which is roughly based on IBM's early BS-12 language. It partitions relational operations at a finer level than SQL clauses and allows one to use named references such that one does not have to massively nest sub-queries. It borrows heavily from FP concepts. It also has a simpler syntax than SQL, making it easier to create DBA-defined extensions. I just wish some bored sucker locked in jail or something would program a production version of it for me. Hans Reiser? :-)

    http://www.c2.com/cgi/wiki?TqlRoadmap
       

  53. SQL Needs Competition by Tablizer · · Score: 1

    Slightly changing the requirements often involves completely restructuring the query. eg, A group by and having clause no longer cut it, now you need to use a work table and at least 3 separate statements. Queries on the same tables are difficult to reuse for similar tasks, resulting in many copies of similar statements that must be maintained.

    I created a draft query language called SMEQL that purposely allows better partitioning and reuse of concepts than SQL. I described it in a nearby reply. It borrows heavily from "functional programming". The problem is that it only exists on paper. Sigh.

    We have thousands of app programming languages, but only three[1] relational query language contenders: SQL, Tutorial-D[2] (and derivatives), and SMEQL. Time for some real competition to SQL.

    [1] There are some older ones, but most consider them "legacy" and I see no interest from others in reviving them.

    [2] Tutorial-D's syntax is unnecessarily complex in my opinion.
       

  54. Been there, done that already! by itsybitsy · · Score: 1, Flamebait

    Dah! Everyone with any technical skills has known for many decades that relational databases are junk. Sure, they do a particular job, but so does a Swiss Army knife, and you'd not use that to cut a forest down; you'd get a proper volume tool, and nothing beats object databases with a simple file structure. Flat files are great too; however, they tend to require extra parsing, slowing things down. You can roll your own object database system easily even with full on transactions and concurrency support in under a man month. Death to Relational Databases.

    1. Re:Been there, done that already! by adnonsense · · Score: 1

      You can roll your own object database system easily even with full on transactions and concurrency support in under a man month.

      OK, go on then.

    2. Re:Been there, done that already! by itsybitsy · · Score: 0, Flamebait

      [meta]How the heck can the parent comment be flame bait? You have a posting about a "new" anti-sql approach that has been done N times before, where N is a very large number, being passed off as "new", and then you don't like a comment that points that out? WTF? It's not flamebait, it's a dissident's comment. I most certainly support their efforts to promote anti-sql agendas. I was merely pointing out that it's been done before. Oh, and I use Relational Databases and SQL all the time. I've also written object to relational mapping layers that have actually worked with transactions in memory! So I know of what I speak, which I doubt of the moderators as they are clearly blind. In fact why can't we see who moderated? Why can't we see all the ratings for a post? If someone moderates a post up as interesting why can't we see that too? The moderation system here at slashdot sucks big time for it's highly denigrating to those of us who attempt to share our hard earned wisdom. Sigh.[/meta]

    3. Re:Been there, done that already! by itsybitsy · · Score: 1

      I've done it already a number of times on various projects that have a need for speed. I know of what I speak which is why I can say it with certainty.

    4. Re:Been there, done that already! by itsybitsy · · Score: 1

      [meta] The fact that someone moderated the parent comment to flame bait indicates that my thesis described above is 100% correct and that mindless fools abound here at slashdot. This isn't flame bait it's a slashdot dissident speaking freely. I have no respect for your silly moderations and neither should you if you support free speech.[/meta]

  55. Seriously misguided by Stu+Charlton · · Score: 2, Interesting

    Trash SQL in favour of coding all your data access needs. Welcome back to 1973, guys.

    It's not like we could do parallel SQL in the 1980's. Or that you can't do parallel SQL in a compute cloud today.

    No, it basically seems like they don't want to pay software vendors any money for database technology. That's mostly what the arguments boil down to. Oracle RAC is very scalable, arguably easier to do at massive scale than MySQL - but you have to pay Oracle money. For an Internet startup, I can understand why you'd take your chances with "roll your own". For an enterprise... I think not.

    --
    -Stu
  56. lots of these guys belong to flat earth society by Dan667 · · Score: 0, Troll

    Think of the discussions!

  57. Replacing SQL? :) by youn · · Score: 1

    Yawn... they said that about Cobol... call me when they have actually eliminated Cobol... which in my opinion is a much larger horror than SQL.

    SQL is here to stay... maybe a few enhancements... but it shouldn't change much... too much critical stuff is dependent on it... besides, it's surprisingly flexible when it comes to accessing data

    --
    Never antropomorphize computers, they do not like that :p
  58. I see SQL as more of a "protocol" by TheLink · · Score: 2, Interesting

    You can use SQL with flat files.
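
    One way to read "SQL with flat files", sketched under the assumption that you are happy to load the file into an in-memory SQLite table first. The CSV contents and the `load_flat_file` helper are invented for the example.

```python
import csv
import io
import sqlite3

# Stand-in for a flat file on disk; the columns are made up for illustration.
FLAT_FILE = io.StringIO("name,score\nalice,5\nbob,3\ncarol,4\n")


def load_flat_file(conn, handle):
    """Copy CSV rows into an in-memory table so they can be queried with SQL."""
    rows = list(csv.DictReader(handle))
    conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
    conn.executemany(
        "INSERT INTO scores (name, score) VALUES (:name, :score)", rows
    )


conn = sqlite3.connect(":memory:")
load_flat_file(conn, FLAT_FILE)

# From here on the flat file is queried like any other table.
top = conn.execute(
    "SELECT name FROM scores WHERE score >= 4 ORDER BY score DESC"
).fetchall()
```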

    SQL is going to be around for a long time, because it's useful as an "API" - as a protocol or layer of abstraction.

    Programmers can write all sorts of programs in all sorts of programming languages and then use SQL to talk to the DB. If the DB changes a bit, they can often use the same SQL or modify it slightly.

    You often see lots of grumbling and cursing in various companies because people actually end up doing that and companies end up with lots of stuff hooked to the DB - MS Access, perl, python, ruby, java, radius servers, openvpn, accounting and finance stuff...

    They grumble, but the fact is the database is being used. The data has become more useful.

    If you have your database locked up behind some newfangled protocol that only 20 people in the world know, it's not going to be as easy to do that - and often each bunch will start creating their own databases and you end up with a different mess, and a mess that's not as useful.

    Having everyone use SQL to talk to the DB is not actually a bug, it's a feature.

    One man's impedance mismatch is another man's layer of abstraction.

  59. Ditch SQL, not Relational DBs! by Lord+Bitman · · Score: 2, Interesting

    SQL syntax sucks, is inconsistent, and just non-standard enough at its corners that it's completely annoying to write anything for more than one DB. Also lacks various features which logically _should_ be there, because of the relational back-end. SQL is a toy, and though I'm the guy everyone in the office turns to if they want to write a query that does more than SELECT * FROM sometable, that doesn't mean I have to like it.

    But that's not the fault of relational databases. The relational logic makes sense, and we'll be seeing it referenced in countless "new ideas" that come along for years, just as ideas which Lisp already had in 1970 will be touted as new features for the next millennium (you hear? PHP can do Lambda functions as of yesterday!)

    SQL sucks, but SQL is NOT what makes something relational.

    --
    -- 'The' Lord and Master Bitman On High, Master Of All
    1. Re:Ditch SQL, not Relational DBs! by rycamor · · Score: 1

      I wish more developers understood this.

      SQL is the perfect example of how a temporary hack becomes a standard that exists well beyond the original intention of the developers.

      Read Hugh Darwen's "The Askew Wall" and "The Importance of Column Names" for a couple teeth-grinding examples of how needlessly screwed up this language is.

      In fact, the language itself forces a suboptimal implementation of relational databases at the lower level. For one simple example, allowing for duplicate rows of necessity means storing more data than is needed, and not allowing for certain back-end optimizations. Another insanity is the need to preserve column ordering, and even allow for duplicate column names in query output.

      Unfortunately, rather than fix SQL at the core, we get tons of additional features jammed into the SQL standard, many of which are contradictory to the relational model, and which often force 'implementation showing through in the interface'.
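
      The duplicate-row and duplicate-column-name complaints above are easy to reproduce. A small sketch using SQLite through Python's `sqlite3` module; the table `t` and the alias `x` are arbitrary, and other engines may behave differently in the details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER)")  # no key, so duplicates are legal
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (1,)])

# A bag, not a set: the same row comes back twice.
dup_rows = conn.execute("SELECT a FROM t").fetchall()

# Two output columns with the same name are accepted as well.
cur = conn.execute("SELECT a AS x, a + 1 AS x FROM t")
column_names = [d[0] for d in cur.description]

# DISTINCT is the opt-in way to get set semantics back.
distinct_rows = conn.execute("SELECT DISTINCT a FROM t").fetchall()
```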

    2. Re:Ditch SQL, not Relational DBs! by BitZtream · · Score: 1

      SQL is as standardized as HTML. Do you suggest we ditch HTML as well?

      It's not that there isn't a standard, it's that none of the vendors follow it to the letter.

      This is both good and bad. Good because pretty much every DB server worth mentioning supports things that are VERY useful and not part of the SQL standard. It's bad because I can't just switch an app from PostgreSQL to Oracle to MySQL if it's anything more than basic queries without rewriting a lot of those queries. Okay, you've got a good chance of jumping between Oracle and PostgreSQL, but good luck going between them and MySQL or MSSQL, or any of the others.

      SQL does not suck, it works rather well, which is one of the reasons it's still here.

      Vendors who don't support the standard to the letter and THEN extend it, suck.

      Don't hate the game, hate the players who don't follow the rules.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    3. Re:Ditch SQL, not Relational DBs! by Lord+Bitman · · Score: 1

      COBOL is still here, so I can't depend on that argument too well.

      As for "as standard as HTML", there's a simple test to see if that's true. It's completely anecdotal, but "in my experience:"

      In every company I've worked in, whenever we were writing queries (queries that had to run on multiple systems), we ALWAYS looked at each DB's documentation individually, and ALMOST ALWAYS needed to add at least a switch() for what the database we were talking to was. (I can't actually think off-hand of any query which didn't have such a thing, but I'm assuming one probably existed)

      In every company I've worked in, whenever we were writing HTML (HTML that needed to render correctly on multiple systems, hell, even some that technically didn't) we ALWAYS looked at general information online and ALMOST ALWAYS didn't need to worry about what browser the page was written for. (Note that this is talking about HTML, not CSS or Javascript, which IE shits all over. Even for those, we can rely on standards for day-to-day things at least 98% of the time, without looking anything up.)

      Look up any "help me with this query!" whine on a forum, it will almost ALWAYS be for a specific database. Look for the same for HTML, and it will almost ALWAYS be talking about general information.
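
      To make the per-database switch() mentioned above concrete, here is a rough sketch for one well-known divergence: asking for the first n rows. The `first_n_rows` helper and the dialect labels are invented for this example; the query shapes themselves are the commonly documented ones for each product.

```python
def first_n_rows(table, n, dialect):
    """Return a 'first n rows' query for the given dialect.

    `table` is assumed to be a trusted identifier, not user input.
    """
    n = int(n)
    if dialect in ("postgresql", "mysql", "sqlite"):
        return f"SELECT * FROM {table} LIMIT {n}"
    if dialect == "mssql":
        return f"SELECT TOP {n} * FROM {table}"
    if dialect == "oracle":
        return f"SELECT * FROM {table} WHERE ROWNUM <= {n}"
    raise ValueError(f"unsupported dialect: {dialect}")


queries = {d: first_n_rows("orders", 10, d) for d in ("postgresql", "mssql", "oracle")}
```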

      --
      -- 'The' Lord and Master Bitman On High, Master Of All
  60. ur doin it rong by Hognoxious · · Score: 1

    Are the two sequences equivalent, in that they both code for the same thing, like (just an example) red hair? In which case the solution should be another layer of abstraction plus a join; get the code(s) for being a carrot top and find everybody who has them.

    If they aren't coding for the same thing, why would you want them both returned anyway?

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    1. Re:ur doin it rong by Marillion · · Score: 1

      Genome wide association studies. http://www.genome.gov/20019523

      --
      This is a boring sig
  61. Cloudscape/Derby by Kupfernigk · · Score: 2, Interesting
    Yes, and Java persistence systems (Hibernate) suck dreadfully; they are a solution for which there is no problem. By the time you've learned the mess that is Hibernate, you can have learned SQL and the Java Collections well enough to be able to knock up any persistence model you need in no time flat.

    Derby 10.5, meanwhile, still has a tiny footprint, and can do most if not all of the SQL you will ever want for a typical Java application, along with features like the ability to do live backups, live table compaction from within the application while running, and now at last the ability to do cursoring in SELECT statements. Installation and configuration are simple.

    I actually think that the actual problem is that we old C programmers actually learned programming and data structures, and as a result know a lot about the kind of problems for which SQL is well suited, while a lot of modern programmers learn a lot of theory about OO, but don't actually learn to program. Therefore, they have to try to reinvent wheels that were in fact designed in the 70s, and have no idea of what tools are available and how they map onto typical real-world application level problems.

    --
    From scarped cliff or quarried stone she cries "A thousand types are gone, I care for nothing, no not one."
  62. Re:Tilting at windmills [Sub-Queries] by gazbo · · Score: 1
    I've not fully read through your linked page, but I suggest you really go and look at the SQL WITH statement (now in Postgres 8.4!) that addresses a huge chunk of what you propose - namely aliasing subqueries for use in subsequent queries without nesting. And using WITH RECURSIVE extends that functionality even further, allowing for e.g. better querying of hierarchical data.

    But still, props for actually doing something, rather than most of the other whiny pissants in this story whose comments can be summed up as "I've never learnt anything beyond an inner join, therefore SQL sucks."
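
    For anyone who has not met WITH RECURSIVE yet, a minimal sketch of the hierarchical case, run against SQLite from Python rather than Postgres so that it needs no server. The `comment` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comment (id INTEGER PRIMARY KEY, parent_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO comment VALUES (?, ?, ?)",
    [(1, None, "root"), (2, 1, "reply"), (3, 2, "reply to reply"), (4, None, "other")],
)

# Walk from one root comment down through all of its replies.
thread = conn.execute(
    """
    WITH RECURSIVE thread(id, title, depth) AS (
        SELECT id, title, 0 FROM comment WHERE id = ?
        UNION ALL
        SELECT c.id, c.title, t.depth + 1
        FROM comment AS c JOIN thread AS t ON c.parent_id = t.id
    )
    SELECT id, depth FROM thread ORDER BY depth
    """,
    (1,),
).fetchall()
```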

  63. My Biggest problem with sql by TheSunborn · · Score: 1

    My biggest problem with SQL is that I have defined my database as a graph, but I never get data back as a graph. Example for slashdot.org:

    There are users, who can post stories, and there are comments to each story. Imagine that I want all users that have posted a story where a comment contains the word Amiga. And I also want the stories, and the comments that contain the word. With SQL that's easy: I just do a join between user and story, and between story and comment, and take the comments that contain the word Amiga.

    No problem, EXCEPT that what I get back is a bag of data with no specific order (and with much redundancy). That's not what I want.

    What I want is a list of users, and for each user I want the stories that he has submitted with at least one comment that contains the given word. And for each story I want all the comments that contain the word. The SQL result does contain all the information I need, but extracting it is quite some work.

    My database is defined as a graph where foreign keys are the links, so why does SQL (and relational algebra) insist on not using this graph when returning data?
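
    The extraction work described above can be sketched roughly as follows: run the join with an ORDER BY, then fold the flat rows into a nested structure on the client. The schema, table names and sample rows are assumptions made up for the example, not Slashdot's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stories (id INTEGER PRIMARY KEY,
                          user_id INTEGER REFERENCES users(id), title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY,
                           story_id INTEGER REFERENCES stories(id), body TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO stories VALUES (10, 1, 'Retro hardware'), (11, 2, 'Databases');
    INSERT INTO comments VALUES (100, 10, 'My Amiga still boots'),
                                (101, 10, 'Amiga forever'),
                                (102, 11, 'Use an index');
""")

rows = conn.execute("""
    SELECT u.name, s.title, c.body
    FROM users AS u
    JOIN stories AS s ON s.user_id = u.id
    JOIN comments AS c ON c.story_id = s.id
    WHERE c.body LIKE '%Amiga%'
    ORDER BY u.id, s.id, c.id
""").fetchall()

# Fold the flat, redundant rows into user -> story -> [comments].
tree = {}
for name, title, body in rows:
    tree.setdefault(name, {}).setdefault(title, []).append(body)
```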

    1. Re:My Biggest problem with sql by BitZtream · · Score: 1

      You do realize that with SQL you can order your data, right? Both rows and columns? It can group data to remove redundancy or just provide unique responses.

      I'm not sure about other databases or the official SQL standard, but ... PostgreSQL has no problem returning to me a list of rows contained in a single column of a single row in a result. The tree you are looking for is easy to return, I have stored procedures that work this way myself. It may be PostgreSQL specific, but I really doubt there isn't something like this in Oracle and MSSQL. Sorry if MySQL doesn't support it, and I realize this is going to raise my troll flag, but with a couple of exceptions, no one uses MySQL for anything important. Yes, wikipedia uses it, yes it's a popular website, no, nothing on it is actually important, contrary to what you might think. Yes, we would lose cultural value if it disappeared, but since that can be recovered from other sources, using wikipedia as an example of how awesome and perfect MySQL is just makes you look like a fanboy. It is one of those rare places where MySQL shines. Of course, it's a safe bet that PostgreSQL outshines it, as well as Oracle, and possibly even MSSQL. Sorry, now I sound like a pgsql fanboy :(

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    2. Re:My Biggest problem with sql by TheSunborn · · Score: 1

      Re-reading my text: Nope, I don't mention MySQL. Most likely because I like PostgreSQL much more, and only really use MySQL on legacy systems. (The kind I take over from someone else, because they fired the original developer.)

      But can you clarify this part:

      PostgreSQL has no problem returning to me a list of rows contained in a single column of a single row in a result. The tree you are looking for is easy to return, I have stored procedures that work this way myself. It may be PostgreSQL specific,

      I looked in the manual, but I could not find anything doing what you described above. Got any sql samples or tutorials that explain how you do it?
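
      I can't say what the parent's stored procedures actually do, but one guess at the general idea is an aggregate that folds child rows into a single column of the parent row. The sketch below shows that shape with SQLite's group_concat so it runs without a server; on PostgreSQL the array_agg aggregate would be the nearest equivalent. The tables and rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stories (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, story_id INTEGER, body TEXT);
    INSERT INTO stories VALUES (1, 'Retro hardware'), (2, 'Databases');
    INSERT INTO comments VALUES (1, 1, 'first'), (2, 1, 'second'), (3, 2, 'third');
""")

# One result row per story; the child rows are folded into a single column.
# On PostgreSQL the aggregate would be array_agg(c.body) instead of group_concat.
result = conn.execute("""
    SELECT s.title, group_concat(c.body, '|') AS bodies
    FROM stories AS s JOIN comments AS c ON c.story_id = s.id
    GROUP BY s.id
    ORDER BY s.id
""").fetchall()

# group_concat gives no ordering guarantee, so sort after splitting.
nested = {title: sorted(bodies.split("|")) for title, bodies in result}
```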

  64. Tree structures in relational by JerryQ · · Score: 1

    I do it all the time,

    I add a string key, e.g., AAA_BBB_CCC_DDD..... and alpha index it

    To find all the offspring of AAA_BBB_CCC look for like 'AAA_BBB_CCC*'

    To find all the siblings of AAA_BBB_CCC look for like 'AAA_BBB_%%%' where % is a single char wildcard.
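
    A runnable sketch of the same path-key idea, with two assumptions that differ from the post: it uses standard SQL LIKE wildcards (% matches any run of characters, _ matches exactly one), and it separates segments with "/" so the separator does not collide with the _ wildcard. The `node` table and the helper names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (path TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO node VALUES (?)",
    [("AAA",), ("AAA/BBB",), ("AAA/BBB/CCC",), ("AAA/BBB/DDD",),
     ("AAA/BBB/CCC/EEE",), ("AAA/BBB/CCC/EEE/FFF",)],
)


def descendants(path):
    """All offspring of `path`, at any depth."""
    rows = conn.execute(
        "SELECT path FROM node WHERE path LIKE ? ORDER BY path", (path + "/%",)
    )
    return [r[0] for r in rows]


def siblings(path):
    """Nodes sharing the parent of a non-root `path`, assuming 3-character segments."""
    parent = path.rsplit("/", 1)[0]
    rows = conn.execute(
        "SELECT path FROM node WHERE path LIKE ? AND path <> ? ORDER BY path",
        (parent + "/___", path),
    )
    return [r[0] for r in rows]
```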

  65. Absolute BS! by Anonymous Coward · · Score: 0

    As a DBA for a major, well-known RDBMS product, we get this crap every few years: "A new paradigm!" Yeah, well, the first time you load your wonderful non-optimised SQL into any pukka RDBMS system, it will balk, and like a bunch of pre-schoolers you all go crying to the DBA that your app won't work any more. It worked fine with 2 test users, but now with 2,000 it balks and crashes! Duh! Wonder why? 'Cos you're a plank who believes all the RAD product BS!

  66. The End Of ... by jandersen · · Score: 1

    How many times have we now had somebody announcing "the end of" something? COBOL, FORTRAN, C, mainframes, UNIX - and now SQL. All these things are still around because they serve a useful purpose. It is well possible that this "No-SQL" concept can serve a purpose other than hype, but that largely remains to be seen.

    The big, fundamental advantage of SQL databases, as far as I can see, is not that they are transactional or scalable or fast, but that you can organise your data in a way that fits fairly naturally with your data, and then you can analyse things in ways that you hadn't thought of when you designed the database using the select statement, even if it isn't always the most efficient of tasks. This is one thing that is hard to build in to hierarchical or networked databases, and of course even more so in simple, indexed files. And that is why RDBMSes are going to be the most important kind of databases for a long time yet.

    That is not to say that simpler mechanisms don't have their place; few things beat a simple ISAM file when it comes to whipping up a program that can quickly look up and organise a simple dataset.

  67. Access is good enough for anything by FreakyGreenLeaky · · Score: 1

    I'm not surprised; these are probably the same class of people who maintain that MS Access is all you need for a high-traffic website backend.

    They're also the same class of people who are at a loss to explain why their site/application is so slow under load...

    Stupid dummies.

    Then again, I haven't RTA, so I'm probably babbling.

    Anyway, it's Friday and it's been a long week dealing with fuck'n dummies who don't apologise when shown the error of their ways, so fuckit, fuckthem, fuckemall. Say ghello to my leetle friend...

  68. SQl works by nurb432 · · Score: 1

    And works well. Let them go off on their own and play around, and let the rest of us get some work done.

    What is next, complaining that wheels aren't the coolest way to get around town?

    --
    ---- Booth was a patriot ----
  69. The RDBMS responds to the nitpick by smittyoneeach · · Score: 2, Funny

    SELECT text FROM thank_you_for_sharing_your_views_but_you_have_not_seen_the_schema_my_friend
    UNION
    SELECT text FROM same_goes_for_point_two__if_you_lack_the_source_code_what_then_do_you_really_know
    UNION
    SELECT text FROM besides_it_got_plus_five_funny_so_neener_neener_neener;

    --
    Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
    1. Re:The RDBMS responds to the nitpick by Anonymous Coward · · Score: 1, Insightful

      Ok, you owe me a new keyboard. Coffee EVERYWHERE!

  70. What do we learn from the answers here? by drolli · · Score: 1

    There are some people who have reasonable objections to pressing data into a relational schema because they really understand what it can't do for them, and there are some people who have unreasonable objections to it because they really don't understand what it can do for them.

  71. Prolog/Datalog by weston · · Score: 1

    who hasn't had a business need for multiple levels of aggregates (eg averages of sums across multiple groupings, say "average across all customers' total balances") As it is, you end up splitting the logic between the database and the application

    This is one class of problem Prolog or Datalog or some other variant would be great at, and I often wish not so much that I had a non-relational database as I had a better way of querying relational data.

  72. You are 100% Correct by gbutler69 · · Score: 1

    I've been struggling with maintaining someone else's multiple, redundant, pointless layers of abstraction that all need to be updated to add a F-ing field to a table. If I ever meet the guy who created it in a dark alley, I'll surely injure him severely (i.e. Cave his skull in with my bare hands).

    It drives me crazy when developers can't get the understanding that the DATA is central, not their idiotic layers of abstractions and so-called "Frame-Works" (I'm going to vomit now)!

    --
    Over-the-top Response Guy! Giving "Over-the-Top Responses" since 1970.
  73. Parent is right on the money!!! by dooguls · · Score: 1

    I just spent 3 months doing cloud computing where we used a 'columnar database' similar to Bigtable. We got around the problems because the database was auto-indexed lexicographically by key, and we'd make up different keys to help us index the data to find various 'cells' of data quickly and easily. It was beautiful because if we decided a day later that we wanted a new column, we just changed our keys to include the new column and it was there. That way we could quickly prototype, eventually settle on a good 'table structure' for lack of a better term, and we could withdraw results very quickly. We could also add new columns much later if we needed to, like if we wanted to store totally different types of data in the same tables later.

    The downside to this database is that it's very inefficient for rapid transactions. So you would never use it for something like ebay, where the records change 'status' (from for sale to sold). But you could easily use it for something like craigslist or google which stores __lots__ of data that doesn't change.
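
    A toy illustration of the lexicographically ordered key trick described above. This is not the API of the product we used or of any real store; `SortedKeyStore` and the "row:column" key layout are assumptions made up for the sketch.

```python
import bisect


class SortedKeyStore:
    """Toy key-value store kept in lexicographic key order."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan_prefix(self, prefix):
        """Return (key, value) pairs whose key starts with `prefix`, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        out = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            out.append((key, self._values[key]))
        return out


store = SortedKeyStore()
# Hypothetical composite keys: row id, then column name.
store.put("user42:email", "a@example.com")
store.put("user42:name", "Ann")
store.put("user7:name", "Bob")
# "Adding a column" is just writing a key with a new suffix.
store.put("user42:signup_date", "2009-07-01")
```

    Because related cells sort next to each other, `store.scan_prefix("user42:")` reads one contiguous run of keys, which is where the fast lookups come from.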

    --
    Sig 'em boy!
  74. Re:Tilting at windmills [Sub-Queries] by Tablizer · · Score: 1

    The "with" clause definitely looks promising. I hope it or something like it becomes standard. Another way that may require less new syntax is to use the existing INTO clause with some special character or sub-clause, something like this:

        SELECT AVG(grade) AS GPA INTO $T1 FROM Grades GROUP BY studentID;
        SELECT AVG(GPA) FROM $T1;

        OR

        SELECT AVG(grade) AS GPA INTO VIRTUAL T1 FROM Grades GROUP BY studentID;
        SELECT AVG(GPA) FROM T1;
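
    For comparison, the same two-step average written with the WITH clause as it exists today, run on SQLite from Python. The `Grades` rows are invented sample data; `T1` plays the role of the `$T1` reference above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Grades (studentID INTEGER, grade REAL)")
conn.executemany(
    "INSERT INTO Grades VALUES (?, ?)",
    [(1, 4.0), (1, 2.0), (2, 1.0)],
)

# T1 is a named, query-local intermediate result: one GPA per student.
(avg_gpa,) = conn.execute("""
    WITH T1 AS (
        SELECT AVG(grade) AS GPA FROM Grades GROUP BY studentID
    )
    SELECT AVG(GPA) FROM T1
""").fetchone()

# For contrast: the plain average over all grades weights students unevenly.
(avg_grade,) = conn.execute("SELECT AVG(grade) FROM Grades").fetchone()
```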

    Still, SQL is syntactically far more complex than needed in my opinion. I attempted to address that in my pet language by using a simpler syntax style. A DBA could add a new "operation" without having to change the syntax.

    Thanks

  75. Re:Tilting at windmills [Sub-Queries] by Nicolay77 · · Score: 1

    If you think anyone with half a brain will program a production version of a spec created by a megalomaniac bastard that calls coders 'suckers', then you either have a lot of money or are delusional.

    Nerds need and deserve respect.

    --
    We are Turing O-Machines. The Oracle is out there.