Slashdot Mirror


The Future of Databases

gManZboy writes "Ever wonder where database technology is going? This is something that Turing award winner Jim Gray from Microsoft has given a lot of thought to. He recently published an article in which he looks at the many forces pushing database technologies forward, and what those new technologies will look like. Gray writes, 'the greatest of these [research challenges] will have to do with the unification of approximate and exact reasoning. Most of us come from the exact-reasoning world -- but most of our clients are now asking questions that require approximate or probabilistic answers.'"

21 of 315 comments

  1. Great Article by Spaceman40 · · Score: 5, Insightful

    The requirements for a database today aren't too much different from those twenty years ago - except for what we want to get out of them.

    Now that data mining is a $[insert large number here]million industry, databases are being asked to do a lot more processing with this data than before. For example: old database query = get these attributes from tuples that match this pattern. New database query = determine how likely a user who has accessed 30 or more times this last month is to subscribe to the second-level pay service within the next ninety days, with or without an email advertising said service.
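
    To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `users` table, its columns, and the toy rows are all invented for illustration; the "new" query is only a naive observed frequency, not a real predictive model.

```python
import sqlite3

# Hypothetical schema and toy data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT,
        logins_last_month INTEGER,
        got_email INTEGER,   -- 1 if the user was sent the advertising email
        subscribed INTEGER   -- 1 if the user subscribed within ninety days
    )
""")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?, ?)",
    [
        (1, "ann", 45, 1, 1),
        (2, "bob", 31, 1, 0),
        (3, "cho", 60, 0, 1),
        (4, "dee", 33, 0, 0),
        (5, "eli", 5, 1, 0),
        (6, "fay", 38, 0, 0),
    ],
)

# Old-style query: return attributes of the tuples that match a pattern.
exact = conn.execute(
    "SELECT id, name FROM users WHERE name LIKE 'a%'"
).fetchall()

# New-style question, answered crudely: among users with 30 or more logins,
# what fraction subscribed, with and without the email?
rates = dict(
    conn.execute(
        """
        SELECT got_email, AVG(subscribed)
        FROM users
        WHERE logins_last_month >= 30
        GROUP BY got_email
        """
    ).fetchall()
)
```

    The first query has one correct answer. The second only yields an estimate (here keyed by `got_email`), and its quality depends on how much history the table holds.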

    --
    I [may] disapprove of what you say, but I will defend to the death your right to say it.
  2. In other words ... by Daniel Dvorkin · · Score: 5, Insightful

    ... MBAs want the magic glowy box to do their thinking for them.

    Fortunately, Microsoft will be there to take their money.

    --
    The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
  3. Re:Why complicate things so much? by dioscaido · · Score: 4, Insightful

    This random example just serves to clarify what you mean -- how would you implement an airline database that has entries for 1,000,000 customers, 150,000 flights a year, and 12,000,000 reservations a year? And what would a query look like to find an open flight in a particular date range and register a reservation? And how would doing all this on ReiserFS be any less prone to data corruption than an often backed-up database?
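
    For comparison, here is roughly what the relational side of that question could look like, as a small sketch with Python's sqlite3 module. The table names, columns, and the `book_first_open` helper are assumptions made up for this example, not anything from the article.

```python
import sqlite3

# Hypothetical schema: one row per flight, one row per reservation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE flights (
    id INTEGER PRIMARY KEY,
    origin TEXT,
    destination TEXT,
    departs TEXT,      -- ISO date, e.g. '2005-06-01'
    seats INTEGER
);
CREATE TABLE reservations (
    id INTEGER PRIMARY KEY,
    flight_id INTEGER REFERENCES flights(id),
    customer_id INTEGER
);
""")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?, ?, ?)",
    [
        (1, "SEA", "SFO", "2005-06-01", 1),
        (2, "SEA", "SFO", "2005-06-03", 2),
        (3, "SEA", "SFO", "2005-07-10", 2),
    ],
)
conn.execute("INSERT INTO reservations (flight_id, customer_id) VALUES (1, 100)")
conn.commit()

# Flights in the date range that still have fewer reservations than seats.
OPEN_FLIGHTS = """
SELECT f.id
FROM flights f
LEFT JOIN reservations r ON r.flight_id = f.id
WHERE f.departs BETWEEN ? AND ?
GROUP BY f.id, f.seats, f.departs
HAVING COUNT(r.id) < f.seats
ORDER BY f.departs
"""


def book_first_open(conn, customer_id, start, end):
    """Reserve a seat on the earliest open flight; return its id or None."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(OPEN_FLIGHTS, (start, end)).fetchone()
        if row is None:
            return None
        conn.execute(
            "INSERT INTO reservations (flight_id, customer_id) VALUES (?, ?)",
            (row[0], customer_id),
        )
        return row[0]
```

    Calling `book_first_open(conn, 200, "2005-06-01", "2005-06-30")` skips the full flight 1 and books flight 2. A real multi-user system would also need locking or a constraint to stop two sessions from taking the last seat; this sketch ignores that.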

  4. Re:A real problem comes full circle by zappepcs · · Score: 4, Insightful

    I think I agree with the parent. Databases are methods of storing and retrieving data. Trying to make queries fuzzy, or less structured is just wrong.

    If you want to be able to ask probabilistic queries of a database, you need to add some code between you and the database.

    More to the point, the fuzzier your logic is, the higher the probability that your database will not contain all of the answers on its own, and you will have to cross reference your data to the data owned by someone else or gathered from a different disparate source.

    It sounds like M$ is going to try to re-invent data warehousing? and then of course, patent it.

    Trying to make the database do everything is not right and simply doesn't make sense. The code that accesses the data for you needs to do the fuzzy probabilistic stuff.

    P.S. I have no faith that M$ (no matter who they hire) can effectively provide the code required to make it work in the idealistic manner spoken of... mostly because they would have to patent accessing other people's data before they could do it.

    Just my thoughts

  5. Re:A real problem comes full circle by kfg · · Score: 2, Insightful

    When did adding overhead become the mark of skill?

    The second it became profitable to market it as such.

    KFG

  6. That is what SAS is for... by the eric conspiracy · · Score: 3, Insightful

    most of our clients are now asking questions that require approximate or probabilistic answers

    What are my chances of getting laid tonight...
    What are the odds of my winning the lottery...
    What are the chances that my boss will find out about that phony dinner receipt...

    Seriously, SAS stat analysis software does exactly what this numbskull is talking about. You don't need a new kind of database, merely somebody with training in stats.

  7. Re:Accountable bitemporal DBs by TopSpin · · Score: 3, Insightful

    The rise of Sarbanes-Oxley highlights a key insecurity in the accountability of enterprise systems.

    Yeah, I've heard that one too. Reality has a way of factoring out the ambiguity of such abstract, open-ended claims.

    One way to deal with the problem of DBAs and their ability to access/modify financial data is to register them with the exchange, just like the finance and executive types. Now they're Sarbanes-Oxley insider compliant! That's what has been done where I earn my living.

    Thus, we may dispense with elaborate schemes of secure data version control using unspecified, hypothetical systems, paid for with budgets that don't exist. Next!

    Until some future revision of Sarbanes-Oxley begins to specify the design and implementation of electronic finance systems, no one can claim a database is more or less susceptible to malfeasance than a locked filing cabinet. That's why the auditors stop once they've concluded you're changing your password with adequate frequency.

    --
    Lurking at the bottom of the gravity well, getting old
  8. Re:Turing award winner? by Khashishi · · Score: 2, Insightful

    He must not be a slashdot user then.

  9. Re:moving past relational model? I thinketh not by kpharmer · · Score: 2, Insightful

    > it reads like a battle cry for us to move beyond the relational model. ... it's NOT going to happen...

    a couple of thoughts on that -

    1. relational databases are really quite wonderful for analytical apps. Need to store two years of firewall/sales/whatever data - then churn away analysis? Great - no problem. And it's easy enough to do either through hand-written SQL or via a tool. There's plenty that requires third-party tools (and data stores), but even in this scenario the staging area is almost always the relational database.

    2. a lot of folks who would like to eliminate relational databases fail to account for point #1. They complain of the object/relational mapping problems. Ok, that's fine - but if you put your data in container-managed persistence or an object database - you'll then have to pay someone to pull it out and put it into a separate relational database for analysis. Of course, you might be fired right about that time...

    3. Java in the database is mostly a pain in the butt: on the performance side you've got optimization complexity, on the manageability side you've got unusual dependencies, build processes, etc., and on the availability side you've got the ability to take out the entire server with some bad code (ok, sometimes).

    4. a two-tiered architecture with a web service driven directly out of the database is more than just a pain in the butt. It's a security disaster. A cobbled-together architecture. And Jim Gray shilling for Microsoft.

    5. the column-store as Jim Gray described it has never really left us. And we don't really need new technology to handle it.

    6. users love tables. there are quite a few users out there that truly love tables - they understand them, they look just like spreadsheets, they can query them. This is important: it's fabulous when your users can easily understand your design.

    7. however - like Gray said, we now need methods of working with data that go far beyond boolean logic. We need fuzzy logic queries. And we need new types of models - allowing for multiple many-to-many relationships via relationship tables. This breaks Codd's rules - but is essential for agile & fast-moving projects.

  10. Google is a good example... by dantheman82 · · Score: 3, Insightful

    I'd personally ask a Google employee where the future of databases is heading. The Google FS really shows where databases are moving...

    I give Gray a lot of respect in most cases because he's a really smart guy. But the math and computationally-intensive parts should be focused in the probabilistic searches.

    In one sense, though, Gray is quite right. And this is the direction of speech recognition. I might add that the Speech Server beta out by Microsoft is quite good...even at this stage.

    --
    This sig donated to Pater. Long live /.
    1. Re:Google is a good example... by Anonymous Coward · · Score: 1, Insightful

      Google is a fine example of a specific kind of database that is optimized for a very specific kind of query. Google writes everything to support their access patterns.

      That's fine for Google, but they don't make general purpose DBs. There's no way that Wal-Mart would want to run their transaction processing or OLAP on anything resembling Google's DBs.

      dom

  11. Re:Umm, Yep! by Anonymous Coward · · Score: 1, Insightful

    Funny, I know a lot about databases, and I don't know what the hell he was talking about either. I think he was saying that it would be cool if Java was embedded in the DB, or something.

    I guess I've just learned to tune out any article that talks about "XML-enabled object-relational databases" and other nonsense.

    Show me the relational model. Show me how you've made a better implementation. If you haven't done either of those, go away. Leave the buzzwords for the vendors..oops, this guy IS a vendor. It all falls into place...

  12. Re:The future of databases is... no Database at al by Anonymous Coward · · Score: 1, Insightful

    Uhm, keeping your data in RAM with a serialized version on disk is a database, what makes you think it isn't?

    But what if you want to access your DB from a different application that has a different serialization format? What if you want to perform arbitrary, ad-hoc queries that have nothing to do with your original object structure? What if my DB grows beyond my RAM? Oops. Welcome to 1970, we're working on solving these problems.

    (For the record, the author did talk about memory databases.)

  13. Re:Why complicate things so much? by Tack · · Score: 2, Insightful
    That's not very big. It's downright small, in fact. [...] [T]his is not the beginning of a pissing contest.

    I must be missing something.

    Jason.

  14. Re:The future of databases is... no Database at al by rossifer · · Score: 4, Insightful

    [misc drivel] Read more at:
    http://www.prevayler.org/


    Oh my dear god. You've never actually used Prevayler have you? Prevayler isn't nearly as useful on actual data problems as Prevayler's worshippers would have you believe.

    I know this because I tried to use it. If you'd ever tried to use it, you'd know how unbelievably poorly it performed when attempting to implement real world queries. You have to implement every query in Java, and Java is a particularly poor implementation choice for creating complex queries.

    What if I said that this can be as fast as 8000 times faster than Oracle?

    This "performance comparison" that the Prevayler group trots out is particularly funny as their test uses a single ArrayList of objects as in-memory "storage" and then "queries" it by index. Not exactly a realistic problem. Try a query across four classes with a few million instances of each class and you'll quickly discover what relational databases are good for.
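
    To give a feel for what "implement every query yourself" means, here is a small sketch of a two-class join written by hand. It is in Python rather than Java and does not use Prevayler's API; the `Customer` and `Order` classes and both helper functions are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Customer:
    id: int
    name: str


@dataclass
class Order:
    id: int
    customer_id: int
    total: float


def join_nested(customers, orders):
    """Naive join: compares every pair, O(len(customers) * len(orders))."""
    return [
        (c.name, o.total)
        for c in customers
        for o in orders
        if o.customer_id == c.id
    ]


def join_hashed(customers, orders):
    """Hash join: one pass to build an index, one pass to probe it."""
    by_customer = defaultdict(list)
    for o in orders:
        by_customer[o.customer_id].append(o)
    return [
        (c.name, o.total)
        for c in customers
        for o in by_customer.get(c.id, [])
    ]
```

    Both functions return the same pairs, but with a few million instances per class only the second is usable, and you have to make that choice (and maintain the index) for every query. That is the work a relational engine's planner does for you.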

    Regards,
    Ross

  15. You mean in the last 25 years? Nowhere. by Qbertino · · Score: 3, Insightful

    Honestly, folks, databases are like crutches: pathetic, but when you need them, there's hardly an alternative. They are the living proof that abstract concepts and computer simulation of those on real world hardware need the strangest type of hacks to be mended together.

    On top of that - and this is the worst part - what we call databases today is nothing much more than a historically grown apocalyptic chaos. With one of the crappiest programming languages ever as a cornerstone of its technology. A weedy mumbo jumbo of wanna-be virtual machines, wanna-be server daemons, makeshift security layers, obtrusive user management and pseudo operating systems, and a bazillion proprietary variants of said programming language. With features bolted on left, right and center. This basically is the case with any current DB in widespread use, be it MySQL, Oracle or anything in between.
    And if you look at the core of database technology and how long it has been that way, there isn't much hope that DBs will go anywhere anytime soon.

    Then again, if you want to get a glimpse of a possibly brighter future, I'd actually recommend Zope. I consider its object relational DB a working proof of avant-garde "database" concepts and a prototype of what DBs generally could look like in the future if anyone were interested.

    --
    We suffer more in our imagination than in reality. - Seneca
  16. Most of our clients... by cardpuncher · · Score: 2, Insightful
    ... are now asking questions that require approximate or probabilistic answers

    I suspect that may translate as "most of our clients want to be given easy answers to difficult questions".

    I'm sure there'd be a big market for a database system that stored flight bookings and could answer the question "which of our customers is a terrorist?". You don't address that market with new technology, though, but by developing new sources of snake oil.

  17. Re:Why complicate things so much? by poot_rootbeer · · Score: 2, Insightful

    How many times have we heard of huge sites going down because databases become corrupt or unrecoverable, or of the huge resource strain (memory and CPU) from a large database?

    How many times? Not all that many, in my experience. How many times when the sites were running off a hardy RDBMS like Oracle, rather than something in the MySQL range? Even fewer.

    Of course, "websites going down" is not exactly the best indicator of database reliability in the first place...

    While you're proposing making databases more like filesystems, what Reiser and others are actually doing is trying to make filesystems more like databases. That's an important distinction to note -- databases are the superior design.

  18. Ask Chris Date and Hugh Darwen about ... by CodeArt · · Score: 2, Insightful

    .. future of Database Technology. Actually you don't need to ask them. Just go to any bookstore and buy one of their books and you will quickly learn that relational doesn't mean SQL. Relational databases are about two-valued predicate logic and set theory, and there is nothing more solid than this to be used as a basis for storing and manipulating information. Future databases will be truly relational truth systems with support for user-defined types and temporal data at the logical level and a much better implementation at the physical level. Jim Gray is an authority in the area of transaction processing, but not in the area of databases and database languages in general.

  19. Re:Why complicate things so much? by Anonymous Coward · · Score: 1, Insightful
    As others have pointed out....you're talking out of your ass.

    But...just for kicks...

    How many times have we heard of huge sites going down because databases become corrupt or unrecoverable, or of the huge resource strain (memory and CPU) from a large database?


    Um...never?

    How many times have I seen a small badly coded site not respond to me because of an application error? A few...one this week, and perhaps one 6 months ago.

    But the database going down? A REAL database? (Insert my disparaging view of MySQL as not counting as a real database here). Umm..never.

    Come on...let's get real here. "Fragile databases"...hahaha

  20. Re:A real problem comes full circle by WaterBreath · · Score: 2, Insightful
    Yet what do databases offer us to represent textual data: a block of text! Fifty years of computers and the best method we've come up with for representing a richly structured piece of writing like a Wikipedia article is: a block of text. Ok, there's a bit of markup there but it's all just stuffed together in a text blob. I might as well just dump it in a file for all the good a database does, indeed that's what Yahoo does.

    To my mind databases are broken beyond belief.

    Well, let us know when you think of an alternative to expressing concepts through some sort of language, in a way that simultaneously allows the measure of definition and ambiguity that all conceptual (i.e. ideas, not strict data) communication requires. I'm sure there will be great fanfare, as it would revolutionize life for all humanity.

    I propose an alternative complaint that gets more to the source of the issue:

    "Thousands of years of communication and the best method the human race has come up with for representing a rich tapestry of ideas and concepts is: words. Aural and written communication. Ok, sure, writing has a bit of markup in there and speech has little pauses strewn about to delimit blocks of thought, but it's all just stuffed together as a stream of letters or phonemes. I might as well just draw a picture in the dirt.

    To my mind, languages are broken beyond belief."

    Leaky != broken. However, using the improper abstraction = a waste of time, money, and effort. The problem is probably not that databases "only" support text as "blobs" of characters. It is more likely that people insist on applying a data-oriented abstraction (it's right there in the name: DATA-base) to a fluid body of information that requires human language.