Slashdot Mirror


The Future of Databases

gManZboy writes "Ever wonder where database technology is going? This is something that Turing award winner Jim Gray from Microsoft has given a lot of thought to. He recently published an article in which he looks at the many forces pushing database technologies forward, and what those new technologies will look like. Gray writes, 'the greatest of these [research challenges] will have to do with the unification of approximate and exact reasoning. Most of us come from the exact-reasoning world -- but most of our clients are now asking questions that require approximate or probabilistic answers.'"

59 of 315 comments

  1. Turing award winner? by Anonymous Coward · · Score: 5, Funny

    As in, he passed the Turing test?

    1. Re:Turing award winner? by Khashishi · · Score: 2, Insightful

      He must not be a slashdot user then.

    2. Re:Turing award winner? by Erpo · · Score: 2, Funny

      Turing award winner? As in, he passed the Turing test?
      --
      He must not be a slashdot user then.
      --
      Why do you say He must not be a slashdot user then.?

    3. Re:Turing award winner? by sydb · · Score: 2, Funny

      Do you often feel the need to ask questions like that?

      --
      Yours Sincerely, Michael.
  2. approximate answers.. by ShaniaTwain · · Score: 4, Funny
    most of our clients are now asking questions that require approximate or probabilistic answers.'"

    • 42..ish
  3. Why complicate things so much? by bigberk · · Score: 5, Interesting

    How many times have we heard of huge sites going down because databases become corrupt or unrecoverable, or of the huge resource strain (memory and CPU) from a large database?

    In my opinion, the future of databases is nothing so complicated as pitched here -- but rather a move to simpler, more reliable back ends where the filesystem is the database. This is certainly the vision pitched by Hans Reiser and reiserfs, which aims to put more database like intelligence within the filesystem. So you eliminate extra unnecessary layers that just eat up resources and create fragile databases.

    1. Re:Why complicate things so much? by fireboy1919 · · Score: 3, Informative

      Ah yes. Harken back to the earlier days, when databases were just files on a file system, and did not distribute the resources at all.

      Certainly that's not going to lead to more crashes.

      Certainly it's a better idea than, for example, distributing the databases and using load-balancing and regularly scheduled back-ups to ameliorate the loss of the least reliable portions of a database's design - the hard drives.

      When you've only got a hammer, everything seems like a nail...what does Hans Reiser do? He could be right. Microsoft is jumping on the filesystem-database wagon with their new filesystem, and we all know that if anyone knows and cares about reliability it's Microsoft.

      --
      Mod me down and I will become more powerful than you can possibly imagine!
    2. Re:Why complicate things so much? by dioscaido · · Score: 4, Insightful

      This random example just serves to clarify what you mean -- how would you implement an airline database that has entries for 1,000,000 customers, 150,000 flights a year, and 12,000,000 reservations a year? And what would a query look like to find an open flight in a particular date range and register a reservation? And how would doing all this on ReiserFS be any less prone to data corruption than an often backed up database?

    3. Re:Why complicate things so much? by quinnharris · · Score: 3, Interesting

      Hans Reiser's vision is about unifying namespaces (filesystem, relational database, XML, etc...) by providing the functionality in a filesystem to make this reasonable. In other words, making the file system better than current databases.

      Do we evolve the file system into a database (Reiser approach) or evolve a database into a file system (Microsoft WinFS approach)?

    4. Re:Why complicate things so much? by drsmithy · · Score: 3, Informative
      Microsoft is jumping on the filesystem-database wagon with their new filesystem, [...]

      No, they're not. WinFS is *not* a filesystem, it's a DB layer that sits on top of the filesystem.

      And when you consider NTFS *on its own* (like BFS) has the capabilities to do most of what WinFS is supposed to achieve, WinFS just looks sillier and sillier...

    5. Re:Why complicate things so much? by Nutria · · Score: 2, Interesting

      databases that are 150 GB large with hundreds of thousands of records

      That's not very big. It's downright small, in fact.

      These figures, on one of many systems I manage, are about 30 minutes old. And they don't include index space, rollforward logs, etc, etc.

      Names have been changed for privacy, of course.

      TABLE_NAME  CARDINALITY        TOT_BYTES
      TABLE_1     850,719,662  195,665,522,260
      TABLE_2     756,309,106  223,867,495,376
      TABLE_3     317,181,446   72,951,732,580
      TABLE_4     179,099,344   11,462,358,016
      TABLE_5     103,419,546    4,343,620,932
      TABLE_6      95,075,479    9,222,321,463
      TABLE_7      67,378,918   20,820,085,662
      TABLE_8      64,940,525   12,598,461,850

      Since I am fully aware that "my" databases are nowhere near the biggest, this is not the beginning of a pissing contest.

      --
      "I don't know, therefore Aliens" Wafflebox1
    6. Re:Why complicate things so much? by Tack · · Score: 2, Insightful
      That's not very big. It's down right small, in fact. [...] [T]his is not the beginning of a pissing contest.

      I must be missing something.

      Jason.

    7. Re:Why complicate things so much? by poot_rootbeer · · Score: 2, Insightful

      How many times have we heard of huge sites going down because databases become corrupt or unrecoverable, or of the huge resource strain (memory and CPU) from a large database?

      How many times? Not all that many, in my experience. How many times when the sites were running off a hardy RDBMS like Oracle, rather than something in the MySQL range? Even fewer.

      Of course, "websites going down" is not exactly the best indicator of database reliability in the first place...

      While you're proposing making databases more like filesystems, what Reiser and others are actually doing is trying to make filesystems more like databases. That's an important distinction to note -- databases are the superior design.

  4. A real problem comes full circle by Anonymous Coward · · Score: 5, Interesting

    Data is data. Just data. Save it, read it, sort it how you like. Efficient results mean having rapid, low-latency access to data.

    Add code to it, and you have data+code.

    Of course, code is data, and thus data can be treated as code, and handled by other code. LISP does this moderately well.

    But you can't avoid the fact that, as it stands, databases are just engines for keeping your data structures outside your code, or when you add code to them, engines for reading your data structure for you so that you don't have to think about how to do it. ... except that you still do, because SQL isn't a way of avoiding logical errors. ... and that they still don't save time. At best, they allow for some parallelism, external access to the data, and a separation of concerns.

    I'm getting rather tired of the fad that databases should be tacked on to everything, ranging from a shopping list to guidance systems. When did adding overhead become the mark of skill?

    1. Re:A real problem comes full circle by zappepcs · · Score: 4, Insightful

      I think I agree with the parent. Databases are methods of storing and retrieving data. Trying to make queries fuzzy, or less structured is just wrong.

      If you want to be able to ask probabilistic queries of a database, you need to add some code between you and the database.

      More to the point, the fuzzier your logic is, the higher the probability that your database will not contain all of the answers on its own, and you will have to cross reference your data to the data owned by someone else or gathered from a different disparate source.

      It sounds like M$ is going to try to re-invent data warehousing? and then of course, patent it.

      Trying to make the database do everything is not right and simply doesn't make sense. The code that accesses the data for you needs to do the fuzzy probabilistic stuff.

      P.S. I have no faith that M$ (no matter who they hire) can effectively provide the code required to make it work in the idealistic manner spoken of... mostly because they would have to patent accessing other people's data before they could do it.

      Just my thoughts

    2. Re:A real problem comes full circle by kfg · · Score: 2, Insightful

      When did adding overhead become the mark of skill?

      The second it became profitable to market it as such.

      KFG

    3. Re:A real problem comes full circle by WaterBreath · · Score: 2, Insightful
      Yet what do databases offer us to represent textual data: a block of text! Fifty years of computers and the best method we've come up with for representing a richly structured piece of writing like a wikipedia article is: a block of text. Ok, there's a bit of markup there but it's all just stuffed together in a textblob. I might as well just dump it in a file for all the good a database does, indeed that's what yahoo does.

      To my mind databases are broken beyond belief.

      Well, let us know when you think of an alternative to expressing concepts through some sort of language, in a way that simultaneously allows the measure of definition and ambiguity that all conceptual (i.e. ideas, not strict data) communication requires. I'm sure there will be great fanfare, as it would revolutionize life for all humanity.

      I propose an alternative complaint that gets more to the source of the issue:

      "Thousands of years of communication and the best method the human race has come up with for representing a rich tapestry of ideas and concepts is: words. Aural and written communication. Ok, sure writing has a bit of markup in there and speech has little pauses strewn about, to delimit blocks of thought, but it's all just stuffed together as a stream of letters or phonemes. I might as well just draw a picture in the dirt.

      To my mind, languages are broken beyond belief."

      Leaky != broken. However, using the improper abstraction = a waste of time, money, and effort. The problem is probably not that databases "only" support text as "blobs" of characters. It is more likely that people insist on applying a data-oriented abstraction (it's right there in the name: DATA-base) to a fluid body of information that requires human language.

  5. Great Article by Spaceman40 · · Score: 5, Insightful

    The requirements for a database today aren't too much different from those twenty years ago - except for what we want to get out of them.

    Now that data mining is a $[insert large number here]million industry, databases are being asked to do a lot more processing with this data than before. For example: old database query = get these attributes from tuples that match this pattern. New database query = determine how likely a user who has accessed 30 or more times this last month is to subscribe to the second-level pay service within the next ninety days, with or without an email advertising said service.

    --
    I [may] disapprove of what you say, but I will defend to the death your right to say it.
  6. Re:Umm, Yep! by Sinus0idal · · Score: 4, Interesting

    Imagine, for example, you want to use a database to store information about packets flowing through your network. That's all dandy on normal network links, but if we are talking about a multigigabit link, it is likely that your hard disk can't keep up with storing that data. Or that the hardware to do so would cost too much. So instead you could take every second packet and look at that, and approximate. This particular example is referring to data stream management systems.
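
    As a rough illustration of the "take every second packet and approximate" idea, here is a minimal sketch. The packet sizes are made up and the function names are hypothetical; a real data stream management system would read from a live capture and use more careful sampling and error bounds.

```python
from itertools import islice


def sample_every_nth(packets, n=2):
    """Return an iterator over every n-th packet (systematic sampling)."""
    return islice(packets, 0, None, n)


def estimate_total_bytes(packets, n=2):
    """Estimate the total byte count by scaling the sampled sum by n."""
    sampled = sum(size for size in sample_every_nth(packets, n))
    return sampled * n


# Hypothetical packet sizes in bytes, standing in for a live capture.
sizes = [1500, 40, 1500, 40, 576, 1500, 40, 1500]
exact = sum(sizes)                          # what storing everything would give
approx = estimate_total_bytes(sizes, n=2)   # what storing half of it gives
```

    The estimate is only as good as the sample: on a stream this short and this regular it can be noticeably off, which is exactly the exact-versus-approximate trade-off being discussed.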

  7. In other words ... by Daniel+Dvorkin · · Score: 5, Insightful

    ... MBA's want the magic glowy box to do their thinking for them.

    Fortunately, Microsoft will be there to take their money.

    --
    The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
    1. Re:In other words ... by furchin · · Score: 4, Funny

      MBA's want the magic glowy box to do their thinking for them.

      If I had to pick between a magic glowy box and an MBA to show signs of intelligence, I'm definitely going with the magic glowy box.

    2. Re:In other words ... by YrWrstNtmr · · Score: 5, Funny
      I don't know how many times I've heard that thought process over the years.

      [MBA tool]"I want to come in in the morning, push a button, and have the program distribute all my stuff."

      [me]"If I could make it do that, I could make it push its own button, and the company wouldn't need you anymore."

      [MBA tool]"Oh."

    3. Re:In other words ... by jallen02 · · Score: 2, Interesting

      Tragic really. I have seen it as well. I always held CS people to a higher standard for coding.. what with the hours of courses you spend just learning how to design and implement basic things like data structures. How do people make it out of college without being able to code these things? It always amazes me.

      Jeremy

  8. $article_title by $blowhard. by Anonymous Coward · · Score: 5, Funny

    $technology is dying. It will be replaced with $replacement. Insert 4000 more words sprinkled with $random_buzzwords. I am so smart! The end.

  9. Bioinformaticists (and spies) use this a lot by Dioscorea · · Score: 5, Informative

    most of our clients are now asking questions that require approximate or probabilistic answers

    Bioinformatics databases are a good example of this. DNA and protein sequence databases are often searched by approximate string-matching algorithms based on "dynamic programming" to hidden Markov models and other stochastic grammars.
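
    For anyone who hasn't seen dynamic programming used for approximate matching, a minimal sketch follows. It uses plain edit distance over a toy list of made-up DNA fragments; real sequence search tools use biologically motivated scoring and far faster heuristics, so treat this as an illustration of the idea only.

```python
def edit_distance(a, b):
    """Levenshtein distance between a and b, computed one DP row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


def approximate_search(query, sequences, max_distance):
    """Return (sequence, distance) pairs within max_distance, closest first."""
    hits = [(s, edit_distance(query, s)) for s in sequences]
    return sorted((h for h in hits if h[1] <= max_distance), key=lambda h: h[1])


# Hypothetical toy "database" of DNA fragments.
db = ["GATTACA", "GATTTACA", "CATTAGA", "TTTTTTT"]
hits = approximate_search("GATTACA", db, max_distance=2)
```

    The query returns ranked near-matches rather than a yes/no answer, which is the kind of result an exact-match index cannot produce on its own.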

    Historically, drug target-hunters in Big Pharma created a market for accelerated hardware to facilitate dynamic programming searches, some of which (e.g. Paracel's Fast Data Finder chip) was originally marketed to government agencies who, um, shared an interest in approximate string-matching ;)

  10. Accountable bitemporal DBs by G4from128k · · Score: 3, Informative

    The rise of Sarbanes-Oxley highlights a key insecurity in the accountability of enterprise systems. Although the high-level applications can do a good job of tracking who did what to the financial data, the core DB may be open to tampering. If a DB admin with the right password can manually diddle a field in a database, they can change the financials of the company.

    In contrast, a secure bitemporal DB would record not only the date of what the data refers to (e.g., the purchase order was entered on March 3rd, 2004) but also the date(s) of any modifications of the data (the quantity and total were changed on December 31, 2004, Uh-Oh!).

    This is more than just securing the DB with a hierarchy of privileges, it means that no one can overwrite the old data or change any data without creating an audit trail. This, of course, also means changes in the DB, OS or file system to make critical data only accessible through a secure DB layer that tracks changes (e.g., no accessible plain-text DB data structures). These same concepts could be used (probably are, for all I know) for OSS version control to track who did what and when to the code.
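
    A minimal sketch of the data model, assuming an in-memory append-only list. The class and field names are hypothetical, and nothing here makes the store tamper-proof; it only shows how a correction adds a version instead of overwriting one.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Version:
    key: str
    value: object
    valid_on: date      # when the fact applies in the real world
    recorded_on: date   # when this version was written to the store
    recorded_by: str


class BitemporalLog:
    """Append-only store: corrections add versions, nothing is overwritten."""

    def __init__(self):
        self._versions = []

    def record(self, key, value, valid_on, recorded_on, recorded_by):
        self._versions.append(Version(key, value, valid_on, recorded_on, recorded_by))

    def as_of(self, key, recorded_on):
        """Value of key as the store knew it on the given date, or None."""
        known = [v for v in self._versions
                 if v.key == key and v.recorded_on <= recorded_on]
        return max(known, key=lambda v: v.recorded_on).value if known else None

    def history(self, key):
        """Full audit trail for key, oldest version first."""
        return sorted((v for v in self._versions if v.key == key),
                      key=lambda v: v.recorded_on)


log = BitemporalLog()
# Hypothetical purchase order, using the dates from the example above.
log.record("PO-1 quantity", 100, date(2004, 3, 3), date(2004, 3, 3), "clerk")
log.record("PO-1 quantity", 250, date(2004, 3, 3), date(2004, 12, 31), "dba")
```

    Asking `log.as_of("PO-1 quantity", date(2004, 6, 1))` still gives the original figure, and `log.history("PO-1 quantity")` shows who changed it and when.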

    --
    Two wrongs don't make a right, but three lefts do.
    1. Re:Accountable bitemporal DBs by TopSpin · · Score: 3, Insightful

      The rise of Sarbanes-Oxley highlights a key insecurity in the accountability of enterprise systems.

      Yeah, I've heard that one too. Reality has a way of factoring out the ambiguity of such abstract, open-ended claims.

      One way to deal with the problem of DBAs and their ability to access/modify financial data is to register them with the exchange, just like the finance and executive types. Now they're Sarbanes-Oxley insider compliant! That's what has been done where I earn my living.

      Thus, we may dispense with elaborate schemes of secure data version control using unspecified, hypothetical systems, paid for with budgets that don't exist. Next!

      Until some future revision of Sarbanes-Oxley begins to specify the design and implementation of electronic finance systems, no one can claim a database is more or less susceptible to malfeasance than a locked filing cabinet. That's why the auditors stop once they've concluded you're changing your password with adequate frequency.

      --
      Lurking at the bottom of the gravity well, getting old
  11. I predict... by rainman_bc · · Score: 3, Interesting

    Better indexing, faster lookups...

    That's... about... it...

    Object relational was the "new thing" that didn't really take off as well as they'd hoped.

    Hell, I work with people who still can't handle compound keys and joins well...

    --
    09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
  12. I want clustered databases for high-availability by SpecialAgentXXX · · Score: 5, Interesting

    The "next great advancement" in databases will be when I can set up 2 or more linux servers and have them act as a single database server. Our database server is the most expensive item in our datacenter because it's an N-way IBM server.

  13. It warms my heart... by Baldrson · · Score: 2, Interesting
    To see Bill Gates' organization of "really smart people" spinning their wheels so energetically really warms my heart.

    I can see they've no hope of being any competition at all come the real db revolution.

  14. Atomicity in filestores is a great benefit by Anonymous Coward · · Score: 2, Informative

    The parent makes a good point, and it's pretty easy to see why if one holds off the usual anti-Reiser reactions and thinks it through a bit.

    Databases require a mechanism for atomicity to create their transactions, and because no common operating system has ever provided such, they need to implement it themselves at application level. It's like the bad old days before PCs provided networking, and you had to run up your own networking stack if your application needed comms.

    Well reiserfs has the goal of providing atomic transactions at filestore level, so in principle it will become possible to leave a good chunk of the very hairy rollback processing of conventional RDBMSs to the operating system.

    It won't remove the need for proper RDBMSs for power database applications, nor will it in any way obviate the need for database distribution, but it should make professional databases both simpler and more robust. And it should also allow mini-database applications to be coded directly around the filestore with better transactional properties than the traditional flat-file designs.

    1. Re:Atomicity in filestores is a great benefit by dioscaido · · Score: 2, Informative

      Does reiserFS support atomicity at the group level? Can I edit a group of 30 files, and only once the modifications are done for those 30 files do we commit it to the file system, and in any other case none of the files change? That is a major feature of a transactional database, where you can modify various tables simultaneously and if at any point there are issues, all the data is easily restored by doing a roll-back.
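
      To make the database side of that comparison concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables and the transfer function are made up for illustration; the point is that changes to several tables either all commit or all roll back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("CREATE TABLE transfers (src TEXT, dst TEXT, amount INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Apply all three changes, or none of them if any statement fails."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO transfers VALUES (?, ?, ?)", (src, dst, amount))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it rolls back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
transfer_count = conn.execute("SELECT COUNT(*) FROM transfers").fetchone()[0]
```

      The failed transfer leaves no trace in either table, even though two of its three statements had already run. That is the group-level guarantee a filesystem would need to match.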

    2. Re:Atomicity in filestores is a great benefit by Unordained · · Score: 2, Interesting

      ... oh, and the file system should also verify the integrity of the files, and the system as a whole -- make sure that your changes are "allowed" (both state-constraints and transition-constraints), make sure that everything works together (imagine your FS making sure that your changes to your mail server config match up with your changes to the user list?) ... ... oh, and allowing multiple users to modify files at the same time, and know enough about the file formats to reconcile possible conflicts (not stupidly like CVS does, where everything is either binary or treated as sequences of lines of text delimited by a carriage return) ... ... oh, and maybe we should resolve the issue of putting the type of the file in the filename (variables have names, values have types) ... ... oh, and don't forget support for, say, two-phase commits, nested transactions, and all those other things ... which, by the way, Jim Gray has one of the authoritative books on.

  15. That is what SAS is for... by the+eric+conspiracy · · Score: 3, Insightful

    most of our clients are now asking questions that require approximate or probabilistic answers

    What are my chances of getting laid tonight...
    What are the odds of my winning the lottery...
    What are the chances that my boss will find out about that phoney dinner receipt...

    Seriously, SAS stat analysis software does exactly what this numbskull is talking about. You don't need a new kind of database, merely somebody with training in stats.

  16. Sure... by Anonymous Coward · · Score: 5, Funny

    Could someone summarize it without using the letter 'e'?

    Sure.

    Th Futur of Databass
    Postd by timothy on Monday May 02, @08:12PM
    from th your-flight-status-is-'mayb' dpt.

    gManZboy writs "vr wondr whr databas tchnology is going? This is somthing that Turing award winnr Jim Gray from Microsoft has givn a lot of thought to. H rcntly publishd an articl in which h looks at th many forcs pushing databas tchnologis forward, and what thos nw tchnologis will look lik. Gray writs, 'th gratst of ths [rsarch challngs] will hav to do with th unification of approximat and xact rasoning. Most of us com from th xact-rasoning world -- but most of our clints ar now asking qustions that rquir approximat or probabilistic answrs.'"

    Hmmm, I kind of like 'databass'.

  17. Re:I want clustered databases for high-availabilit by jjohnson · · Score: 2, Funny

    And since Oracle is *way* cheaper than IBM, it's problem solved!

    --
    Anyone who loves or hates any language, platform, or manufacturer, doesn't know what they're talking about.
  18. Re:moving past relational model? I thinketh not by Spiked_Three · · Score: 2, Interesting

    I doubt if he is "confusing two issues" as he probably knows a lot more about it than you and I. He may indeed have a different opinion, but that is not confusing two issues.
    I will admit I was around before relational databases. Back then there were good old hierarchical databases, and they did a damn good job of what a relational database does 50% of the time these days. The problem was the other 50% they couldn't do. So along came relational databases. Now to think that there is nothing beyond relational is like IBM saying no one will ever need more than 16 colors on their PC: shortsighted.
    The part I really wish would die is SQL. It was invented as a way end users could enter data queries. It was adapted to be embedded in COBOL programs, and the fact that it's at the center of most enterprise applications today is hideous. I don't care when the next database technology comes along, but please get rid of the SQL dinosaur.
    Personally, I'd just as soon get rid of databases, I have already designed my business logic, why must I now design and code ways to store objects? Yes, I know, some technology is already out (and I use it), but it is not mainstream yet. That is what I would like to see sooner, persistent object oriented databases become mainstream.

    --
    slashdot troll = you make a compelling argument I do not like the implications of.
  19. Re:I want clustered databases for high-availabilit by kpharmer · · Score: 4, Informative

    > The "next great advancement" in databases will be when I can set up 2 or more linux servers and have
    > them act as a single database server. Our database server is the most expensive item in our datacenter
    > because it's an N-way IBM server.

    lol, IBM has supported *exactly* what you are talking about for at least five years.

    That is, you can spread your db2 database across 10, 100, or 1000+ linux commodity boxes (ideally blades). Or you can use windows, or aix, or solaris, or hp-ux, etc. Of course, those individual boxes can be SMPs in their own right - so a thousand 8-way aix boxes is certainly possible, if not cheap.

    Oracle is now in this game as well - oracle 10g can certainly support 32, and maybe 64 individual linux boxes in a cluster. The techniques are different between the two - oracle might be better at transactional systems. db2 is definitely better at data warehousing, data mining, etc.

    Of course, there are still benefits to a big smp: a single P570 16-way will cost you $250k. But each of those 16 cpus is multi-core (and far faster than intel or amd), and with its micro-partitioning it can run at least 150 linux or aix lpars (logical partitions). These lpars can grow or shrink as they need - so you aren't always over-buying for size, buying new hardly-used hardware, or having to colocate apps on a busy server when a different os would be preferable. Not to say everyone should go this way - but there are definite benefits.

  20. Hmmm Databases by Chitlenz · · Score: 4, Interesting

    As a 15 year DBA, I'm currently working with some of the would-be far reaching (to most people) concepts described in this paper. The idea of a TRUE SQL Debugger is like, so big it's sick. Quest offers some tools that kinda sorta do this for Oracle systems, but a true realtime debugger would save me YEARS of work during my career as a SQL coder. For an idea of scale, the last replication project I wrote for an employer propagated over Oracle DB_LINKS via triggers to synchronize a dataset in two cities, log it, and do something with errors. Because this particular system was a PeopleSoft installation, it was a subset of 6800 tables and 15k lines of code give or take some triggers, with NO debugger. OMG, it's like a "finally" moment to have someone even claim to be fixing this soon in their architecture.

    Next, there was some inane reference to reiserfs above, which clearly ignores what a database fundamentally both is, and is becoming. It really began (and I hate to admit this as a former Solaris/Oracle admin) with SQL Server 7 and Oracle 8, and the concept that a database should be object programmable. Reiser is not going to be streaming still frames of image data fast enough to a remote client to rebuild seamlessly into a movie, for instance. Or recalculate all of a company's business logic for point of sale systems so that, for instance, the wrong type of credit card gets rejected, or so a supply chain gets populated, the list is endless. Reiser, and for that matter VFS and the other myriad of database enhanced filesystems, are tools. Good ones, but tools...

    It's interesting to note that MS has finally figured out that the "n-tier" was a dumb idea. It's almost like, well you take all this shit, then sell it through a middle man, but expect to not have to pay him anything for brokering. Like, duh. We actively benchmarked this process, in fact, and discovered that it does, not surprisingly, take time to pass data through an extra server.

    Workflow is life. It's what makes this page exist (SD is I believe run in MySQL). The idea of publishing-subscribers with atomic transactions is hardly new, but I agree with the authors that this is the direction of the market, simply because businesses now are getting spread all over. Read - If your job just went to India, learn to be a DBA, cuz when all that shit they sent over there comes back, you can bet it's going to be a mess (and is a mess actually already, which is why, in particular, people in ERP fields that intertwine with mine (as a DBA) demand and receive very large salaries, $200 US an hour is not unusual). The reason this particular ramble is relevant is that lots of global companies are either looking at, or are already implementing, the idea of data grids, where all the data servers inside a global network stay in sync. Suzy the secretary checks out a document in Baltimore, and that document flags as in use in Madrid through transactional replication within a kind of database trust-relationship network. It's a very very good way for companies with lots of data to keep it all together, but today it's still a pain in the ass to manage.

    Vertical partitioning is pretty much worthless except to data warehousing installations, most of whom are probably running on strong equipment already (to have that much data). Not to mention, I believe (I'd have to check, since it's not a feature I'd really use) Oracle's 10G product allows for this already if you really want it. Materialized views are another point here that raises my hackles. This guy is writing about the wonder of materialized views and column partitions, which ARE a cool performance cheat in large systems, but make no mistake that by the time you get to this point, you are probably rearranging deck chairs on the Titanic anyway. Essentially, materialized views precache SQL resultsets into a temporary table which gets constantly updated so it can always provide a full resultset without having to parse the parent table. This is processor and space expensive. Vertical par

    --
    Imagination is the silver lining of Intelligence.
    1. Re:Hmmm Databases by julesh · · Score: 3, Interesting

      [XML] isn't bad as an over-the-wire protocol.

      Yes, it is. I've worked on a project that allowed offline modification of a database by replicating a copy to user's PCs, and it originally used XML as the format for data transfer. We got a 30% speedup by switching to tab-separated variables with a line of metadata at the start of each chunk of the stream. Any technology that costs that much in overhead and provides little or no perceivable benefit is a waste of time. (Of course, if your data isn't relational, this is probably not much use to you, but then... what are you storing it in? XML documents?)

      The only justification for XML is that there are a lot of tools out there that work with it -- I use it as an intermediate interchange format between different environments because the libraries available make it easy with just about anything I want to access the data with.
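
      To give a feel for the overhead, here is a minimal sketch that serializes the same rows both ways. The sample data and function names are made up, the "metadata line" is assumed to be just the column names, and this compares size only, not the speedup mentioned above.

```python
import xml.etree.ElementTree as ET


def to_xml(columns, rows):
    """Serialize rows as one <row> element per record, one child per column."""
    root = ET.Element("rows")
    for row in rows:
        elem = ET.SubElement(root, "row")
        for name, value in zip(columns, row):
            ET.SubElement(elem, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def to_tsv(columns, rows):
    """Serialize rows as a metadata header line plus tab-separated values."""
    lines = ["\t".join(columns)]
    lines += ["\t".join(str(v) for v in row) for row in rows]
    return "\n".join(lines) + "\n"


def from_tsv(text):
    """Parse the output of to_tsv back into (columns, rows of strings)."""
    header, *body = text.splitlines()
    return header.split("\t"), [line.split("\t") for line in body]


# Hypothetical sample data; values containing tabs or newlines would need escaping.
columns = ["id", "name", "qty"]
rows = [(i, f"item{i}", i * 3) for i in range(1000)]
xml_size = len(to_xml(columns, rows))
tsv_size = len(to_tsv(columns, rows))
```

      With relational data the column names are sent once in the header instead of twice per value, which is where most of the difference comes from.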

  21. Re:moving past relational model? I thinketh not by kpharmer · · Score: 2, Insightful

    > it reads like a battle cry for us to move beyond the relational model. ... it's NOT going to happen...

    a couple of thoughts on that -

    1. relational databases are really quite wonderful for analytical apps. Need to store two years of firewall/sales/whatever data - then churn away analysis? Great - no problem. And it's easy enough to do either through hand-written sql or via a tool. There's plenty that requires third-party tools (and data stores), but even in this scenario the staging area is almost always the relational database.

    2. a lot of folks who would like to eliminate relational databases fail to account for point #1. They complain of the object/relational mapping problems. Ok, that's fine - but if you put your data in container-managed persistence or an object database - you'll then have to pay someone to pull it out and put it into a separate relational database for analysis. Of course, you might be fired right about that time...

    3. java in the database is mostly a pain in the butt: On the performance side you've got optimization complexity, on the manageability side you've got unusual dependencies, build processes, etc, on the availability side you've got the ability to take out the entire server with some bad code (ok, sometimes).

    4. two-tiered architectures with a web service driven directly out of the database are more than just a pain in the butt. They're a security disaster. A cobbled-together architecture. And Jim Gray shilling for Microsoft.

    5. the column-store as Jim Gray described it has never really left us. And we don't really need new technology to handle it.

    6. users love tables. there are quite a few users out there that truly love tables - they understand them, they look just like spreadsheets, they can query them. This is important: it's fabulous when your users can easily understand your design.

    7. however - like Gray said, we now need methods of working with data that go far beyond boolean logic. We need fuzzy logic queries. And we need new types of models - allowing for multiple many-to-many relationships via relationship tables. This breaks Codd's rules - but is essential for agile & fast-moving projects.
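
    To make point 7 concrete, here is a rough sketch of what a "fuzzy" query might look like: instead of a boolean WHERE clause, each row gets a degree of match between 0 and 1 and the results come back ranked. The triangular membership function and the names closeness and fuzzy_select are illustrative assumptions, not anything from Gray's article.

```python
def closeness(value, target, tolerance):
    """Triangular membership: 1.0 at target, falling linearly to 0.0 at +/- tolerance."""
    return max(0.0, 1.0 - abs(value - target) / tolerance)


def fuzzy_select(rows, key, target, tolerance, threshold=0.0):
    """Return (score, row) pairs whose score exceeds threshold, best match first."""
    scored = [(closeness(row[key], target, tolerance), row) for row in rows]
    matches = [(score, row) for score, row in scored if score > threshold]
    # Sort on the score only, so rows themselves never need to be comparable.
    return sorted(matches, key=lambda pair: pair[0], reverse=True)


products = [
    {"name": "a", "price": 100},
    {"name": "b", "price": 110},
    {"name": "c", "price": 200},
]
# "Things that cost about 100, give or take 20."
about_100 = fuzzy_select(products, "price", target=100, tolerance=20)
```

    Row "c" drops out entirely, "a" is a perfect match and "b" a partial one. A boolean query can only say yes or no.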

  22. Google is a good example... by dantheman82 · · Score: 3, Insightful

    I'd personally ask a Google employee where the future of databases is heading. The Google FS really shows where databases are moving...

    I give Gray a lot of respect in most cases because he's a really smart guy. But the math and computationally-intensive parts should be focused in the probabilistic searches.

    In one sense, though, Gray is quite right. And this is the direction of speech recognition. I might add that the Speech Server beta out by Microsoft is quite good...even at this stage.

    --
    This sig donated to Pater. Long live /.
  23. The Future Of Databases? by brian_olsen · · Score: 2, Interesting

    I see the future now and it will happen in two phases: getting rid of SQL and then replacing it with something half-way decent (like a properly implemented relational algebra.)
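
    For anyone who hasn't seen relational algebra outside of SQL, a toy sketch of three of its operators over lists of dicts. This is only an illustration of the idea; the function names are mine, and a real implementation would also need types, keys, and an optimizer.

```python
def select(relation, predicate):
    """Restriction: keep the rows for which predicate holds."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes, dropping duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def natural_join(left, right):
    """Natural join: combine rows that agree on every shared attribute."""
    result = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in shared):
                result.append({**l_row, **r_row})
    return result


employees = [
    {"emp": "ann", "dept": "db"},
    {"emp": "bob", "dept": "os"},
    {"emp": "cy", "dept": "db"},
]
departments = [{"dept": "db", "floor": 2}, {"dept": "os", "floor": 3}]

# Who works on floor 2? The operators compose like ordinary functions.
on_floor_2 = project(
    select(natural_join(employees, departments), lambda r: r["floor"] == 2),
    ["emp"],
)
```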

  24. The future of databases is... no Database at all!! by vhogemann · · Score: 2, Interesting

    Picture this... memory nowadays is a hell of a lot cheaper than a full Oracle Licence. So, instead of investing in a DBMS why not buy massive quantities of ECC memory and keep all instances of your data in-memory for near instant access?

    Crazy idea, huh? What if I said that this can be as fast as 8000 times faster than Oracle? And 3000 times faster than MySQL!

    Crash recovery? No big deal, keep a serialized version of your in-memory-objects, and a transaction log and you're set!

    Read more at:
    http://www.prevayler.org/
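
    The recovery scheme described above (a serialized snapshot plus a transaction log) can be sketched roughly as follows. This is not Prevayler's API: the class name PrevalentStore, the file names, and the JSON command format are all assumptions made up for the illustration.

```python
import json
import os


class PrevalentStore:
    """In-memory dict made durable by a snapshot file plus a command log."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "commands.log")
        self.data = {}
        self._recover()

    def _apply(self, command):
        op, key, value = command
        if op == "set":
            self.data[key] = value
        elif op == "delete":
            self.data.pop(key, None)

    def _recover(self):
        # Load the last snapshot, then replay every command logged since.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    self._apply(json.loads(line))

    def execute(self, op, key, value=None):
        # Log first and force it to disk, then change the in-memory state.
        command = [op, key, value]
        with open(self.log_path, "a") as f:
            f.write(json.dumps(command) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self._apply(command)

    def snapshot(self):
        # Write to a temp file and rename, so a crash never leaves half a snapshot.
        tmp = self.snapshot_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.data, f)
        os.replace(tmp, self.snapshot_path)
        # Commands here are idempotent, so a crash before this truncation
        # only causes a harmless replay over the new snapshot.
        open(self.log_path, "w").close()
```

    Creating a second PrevalentStore on the same directory after a "crash" rebuilds the same dict. Note that this covers durability only; queries, concurrency, and data larger than memory are still your problem.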

    --
    ---- You know how some doctors have the Messiah complex - they need to save the world? You've got the "Rubik's" complex
  25. Database as file system by bananahead · · Score: 2, Interesting
    The only force that can change the nature and architecture of current database technology is a fundamental change in the way they are used. Change the requirements and the technology will change to meet the new requirements. Change the requirements in a radical way and you will get radically new technology.

    The use of a database as a file system will require radical new technological advances in database theory as the current methods break down under the new requirements. The functionality of the file system will change as the capabilities of an underlying database are realized. The two forces together will create an interesting discontinuity in the industry, the kind the venture capitalists look for.

    It's all good. Pray for WinFS.

    --
    A most overlooked advantage to owning a computer is if they foul up there's no law against wacking them around a bit.
  26. Re:I want clustered databases for high-availabilit by afabbro · · Score: 2, Informative
    Who moderated this interesting? "-1 Clueless" or "-1 ill-informed" is more like it.

    2 servers acting as a single database server has been available for many years...e.g., Oracle 9i RAC, Oracle 10g, DB/2's something or other, etc.

    --
    Advice: on VPS providers
  27. Re:$clever_title by protohiro1 · · Score: 2, Funny

    although both $buzzword and $sarcastic_comment should be stored in some sort of database...

    --
    Sig removed because it was obnoxious
  28. What ever happened to OODBs? by elgee · · Score: 2, Interesting

    At one time, I thought object oriented databases were going to be the next big thing.

  29. Re:The future of databases is... no Database at al by kpharmer · · Score: 2, Informative

    > So, instead of investing on a DBMS why not buy massive quantities of ECC memory and keep all
    > instances of your data in-memory for near instant access?

    because a *well-tuned* relational database with a 1:4 ratio of memory to disk is almost as fast as an in-memory database - due to efficient caching

    because some queries require an enormous amount of temp space. supporting them can easily double your space requirements - which have to be purchased in memory.

    because if you just want to run your database in-memory you can already do that with most databases.

    because you don't have the same speed requirements for every piece of data in your database. You might have some tables used for session & user management that are often read & written to and must be very fast. But other tables that just hold seldom-accessed historical data. A modern database would allow you to keep the small & fast tables effectively in memory, and the huge 100 gb history table on disk. And you don't have to buy 100 gbytes of memory to do it.

    because...it's just a bad idea.
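
    As a small illustration of mixing fast in-memory tables with a big on-disk one, here is a sketch using SQLite, which is my choice for the example and not a product named above. The table names and the helper open_mixed_database are hypothetical. Server databases usually get the same effect through buffer pools and cache tuning instead.

```python
import os
import sqlite3
import tempfile


def open_mixed_database(disk_path):
    """Open an on-disk database and attach an in-memory one next to it."""
    conn = sqlite3.connect(disk_path)
    # 'hot' lives purely in RAM and vanishes when the connection closes.
    conn.execute("ATTACH DATABASE ':memory:' AS hot")
    # Large, seldom-read history stays on disk in the main database.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS main.history (day TEXT, amount INTEGER)"
    )
    # Small, frequently-hit session data goes in memory.
    conn.execute(
        "CREATE TABLE hot.sessions (user_name TEXT PRIMARY KEY, token TEXT)"
    )
    return conn


path = os.path.join(tempfile.mkdtemp(), "shop.db")
conn = open_mixed_database(path)
conn.execute("INSERT INTO main.history VALUES ('2005-01-01', 10)")
conn.execute("INSERT INTO hot.sessions VALUES ('ann', 't1')")
conn.commit()

# One query can still span both the in-memory and the on-disk tables.
row = conn.execute(
    "SELECT s.user_name, h.amount FROM hot.sessions s, main.history h"
).fetchone()
conn.close()
```

    The in-memory half is not durable in this sketch, which is fine for session data and not fine for much else.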

  30. Re:The future of databases is... no Database at al by rossifer · · Score: 4, Insightful

    [misc drivel] Read more at:
    http://www.prevayler.org/


    Oh my dear god. You've never actually used Prevayler have you? Prevayler isn't nearly as useful on actual data problems as Prevayler's worshippers would have you believe.

    I know this because I tried to use it. If you'd ever tried to use it, you'd know how unbelievably poorly it performed when attempting to implement real world queries. You have to implement every query in Java, and Java is a particularly poor implementation choice for creating complex queries.

    What if I said that this can be as fast as 8000 times faster than Oracle?

    This "performance comparison" that the Prevayler group trots out is particularly funny as their test uses a single ArrayList of objects as in-memory "storage" and then "queries" it by index. Not exactly a realistic problem. Try a query across four classes with a few million instances of each class and you'll quickly discover what relational databases are good for.
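
    To show the gap between "query by index" and a real query, here is a toy comparison with two classes of data instead of four. The data, the function names, and the use of SQLite for the declarative version are assumptions for illustration. They are not taken from the Prevayler benchmark.

```python
import sqlite3

customers = [(1, "ann"), (2, "bob")]
orders = [(10, 1, "widget"), (11, 2, "gadget"), (12, 1, "gizmo")]


def lookup_by_index(items, i):
    """The benchmark-style 'query': constant-time positional access."""
    return items[i]


def products_for(name):
    """A hand-written join: every new question needs new code like this."""
    ids = {cid for cid, cname in customers if cname == name}
    return sorted(product for _, cid, product in orders if cid in ids)


def products_for_sql(conn, name):
    """The same question asked declaratively; the engine picks the plan."""
    query = (
        "SELECT o.product FROM customers c "
        "JOIN orders o ON o.customer_id = c.id "
        "WHERE c.name = ? ORDER BY o.product"
    )
    return [r[0] for r in conn.execute(query, (name,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, product TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
```

    With two tiny lists the hand-written version is fine. With four classes and millions of instances you end up writing your own indexes and join ordering, which is the work a relational engine already does.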

    Regards,
    Ross

  31. Bullshit artist... by Anonymous Coward · · Score: 2, Informative
  32. A difference between "DBA" and "clown" by Moraelin · · Score: 4, Interesting

    Yes, a good DBA and/or Database Developer is a very valuable addition to any team.

    The problem is that in a lot of corporations (e.g., the one I work for), they -- and all other admins -- have been taken and put in a different building. And more importantly they don't actually have to cooperate with any team.

    Their job's goal is no longer the same as the developers': to get a program done by a deadline. They've been turned into a bureaucracy whose only job is to see that the servers run. No more.

    That's an _awful_ job description, because it directly makes the developers their enemy. I'm not even talking "slippery slope", but direct cause-effect. Instead of being "the other half of the team that will make this program work", developers just become "those assholes who crash our servers."

    It's not hard to get from that point of view to pathological cases like the admin that limited our production servers to 3 connections per server. He kept his own servers running perfectly (which is his job description) at the expense of making the company's production programs grind to a halt (hey, it's not in his job description to care about those.)

    That's the problem with that kind of internal organization. As one BOFH-wannabe once said "The source of the problems on my network are the users. Would you prefer that I cut your access? Then there wouldn't be any problems any more." Another one threw a hissy fit that we dared ask that he does his job, during work hours. Yeah, how dare we bother him by asking if he could please reboot the test server he's managing.

    That's the underlying problem. Instead of providing a service _to_ the users, a whole caste has been created whose job is to serve the computer, and the users are just those pesky assholes disturbing his majesty the computer. That's a very unproductive situation to create.

    Worse yet, a bunch of companies invented the devastating practice of internal invoices. The admins in one department won't even go to the toilet unless they can send a bill to another department for it.

    They won't even talk to each other (e.g., the WebSphere admin telling the DBA and the Unix admin that he needs a Solaris patch and a newer version of Oracle for the "transactionBranchesLooselyCoupled" setting.) No, you have to personally talk to all three of them, because otherwise they can't send three bills for it.

    And predictably, they'll do _nothing_ more than the bare minimum that was requested and billed. E.g., you have to tell the DBA explicitly to set this and that, to this and that value, because she won't do that on her own. Which basically means you already need to have all the knowledge of a DBA, and she is just acting as a proxy over the phone... and sending you a bill for it.

    Basically if you're not that kind of a DBA, you have my respect. All I'm saying is that when you read about "teams of clowns" or about people who'd rather invent their own storage than deal with a DBA... well, they're not necessarily avoiding _your_ kind, but the kind of clown I've described above.

    --
    A polar bear is a cartesian bear after a coordinate transform.
    1. Re:A difference between "DBA" and "clown" by Chitlenz · · Score: 3, Informative

      I live in the middle. I'm a DBA Architect, which means I both design and build the databases our company uses. Add to that, we're a small company, and we design very specialized software in a way that not many people can do, so I also wear the hat of C# coder. I understand both sides of this fence, and have actually been in the odd position of fighting for both points of view. A good DBA is responsible for all of the flexible information that makes a modern corp. run. Think about that. All the paper, all the reports, your payroll, everything worth owning informationwise within a company is in a database somewhere. HELL YES these guys live at corporate hq. That said, in a healthy company, the DBAs and devs are able to debate rather than fight. One particularly obstinate Peoplesoft lead dev in my past and I have become very good friends over the years through this kind of argument, so it's not all bad =)

      My sympathy, however, does indeed go out to the poor devs who get stuck with some tool that doesn't really understand, or even want to understand, his position as an admin. Too many people slipped into the field with dollars in their eyes in the 90s, and it's led to some truly spectacular screwups. Essentially, in my mind, almost every single failed ERP implementation could and should be blamed on insufficient database administration, and there are LOTS of flameouts there.

      The upside ... hehe maybe.. is that corporate scrutiny of their IT staff is at an all time high! So if they really suck that bad, their days are probably numbered.

      --chitlenz

      --
      Imagination is the silver lining of Intelligence.
  33. The future may be open source databases by Anonymous Coward · · Score: 2, Interesting

    There's a lot of talk in database circles about the fact that open source databases may do to commercial databases what Linux did to commercial unixes, i.e. wipe them out. Recently LazyDBA, one of the most well known websites for database administrators, started supporting open source databases. Add to that the fact that Oracle is going on an app buying fest (Peoplesoft and now maybe Siebel), and database people see that the commercial database is in danger.

  34. To answer the question... by jim_v2000 · · Score: 2, Funny

    "Ever wonder where database technology is going?"

    Yeah, all the time.

    --
    Don't take life so seriously. No one makes it out alive.
  35. You mean in the last 25 years? Nowhere. by Qbertino · · Score: 3, Insightful

    Honestly, folks, databases are like crutches: Pathetic, but when you need them, there's hardly an alternative. They are the living proof that abstract concepts and computer simulation of those on real world hardware need the strangest type of hacks to be mended together.

    On top of that - and this is the worse part - what we call databases today is nothing much more than a historically grown apocalyptic chaos. With one of the crappiest programming languages ever as a cornerstone of its technology. A weedy mumbojumbo of wanna-be virtual machines, wanna-be server daemons, makeshift security layers, obtrusive user management and pseudo operating systems and a bazillion proprietary variants of said programming language. With features bolted on left, right and center. This basically is the case with any current DB in widespread use, be it MySQL, Oracle or anything in between.
    And if you look at the core of database technology and how long it has been that way, there isn't much hope that DBs will go anywhere anytime soon.

    Then again, if you want to get a glimpse of a possibly brighter future, I'd actually recommend Zope. I consider its object relational DB a working proof of avant-garde "database" concepts and a prototype of what DBs generally could look like in the future if anyone were interested.

    --
    We suffer more in our imagination than in reality. - Seneca
  36. Most of our clients... by cardpuncher · · Score: 2, Insightful
    ... are now asking questions that require approximate or probabilistic answers

    I suspect that may translate as "most of our clients want to be given easy answers to difficult questions".

    I'm sure there'd be a big market for a database system that stored flight bookings and could answer the question "which of our customers is a terrorist?". You don't address that market with new technology, though, but by developing new sources of snake oil.

  37. Ask Chris Date and Hugh Darwen about ... by CodeArt · · Score: 2, Insightful

    .. future of Database Technology. Actually you don't need to ask them. Just go to any bookstore and buy one of their books and you will quickly learn that relational doesn't mean SQL. Relational databases are about two-valued predicate logic and set theory, and there is nothing more solid than this to be used as a basis for storing and manipulating information. Future databases will be truly relational truth systems with support for user defined types and temporal data at the logical level and a much better implementation at the physical level. Jim Gray is an authority in the area of transaction processing, but not in the area of databases and database languages in general.

  38. Re:Pretty long by Woody77 · · Score: 2, Interesting

    (this is serious)

    Aside from the access mechanism on top, really, what's the difference? I've used both (OODBs heavily), and really, I've always looked at it as a bunch of tables with columns for member variables and rows for objects.

    Is it really all that different under the hood? Or is this more marketing hype/spin?
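
    The "tables with columns for member variables and rows for objects" view can be sketched in a few lines. The Person class and the helpers create_table, save, and load_all are hypothetical, and SQLite is just a convenient stand-in for the relational side.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class Person:
    name: str
    age: int


def create_table(conn, cls):
    """One table per class, one column per member variable."""
    columns = ", ".join(f.name for f in fields(cls))
    conn.execute(f"CREATE TABLE {cls.__name__} ({columns})")


def save(conn, obj):
    """One row per object."""
    marks = ", ".join("?" for _ in fields(obj))
    conn.execute(
        f"INSERT INTO {type(obj).__name__} VALUES ({marks})", astuple(obj)
    )


def load_all(conn, cls):
    """Rebuild objects from rows; column order matches field order."""
    return [cls(*row) for row in conn.execute(f"SELECT * FROM {cls.__name__}")]


conn = sqlite3.connect(":memory:")
create_table(conn, Person)
save(conn, Person("ann", 30))
save(conn, Person("bob", 41))
people = load_all(conn, Person)
```

    For flat objects like this the two views really are the same thing. The sketch leaves out object identity, references between objects, inheritance, and methods, and that is where the two models would be expected to differ, if they differ at all.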