Sneak Peek at IBM 'Viper' DB2 Release
Rob let us know that Computer Business Review magazine is reporting that IBM is about to add more fuel to the database fire. The company has offered up a sneak peek at their upcoming "Viper" release of their DB2 database. From the article: "DB2 Viper will be distinct from current DB2 database implementations in that it will be able to store XML formatted data inside the database natively--XML support will not be bolted onto the side. Viper will also support relational data stores, of course, and access to those database tables using the SQL programming language."
Talking about native XML databases... My company can't find a decent one, preferably open source.
That's probably because an XML database is NOT a decent idea. XML is NOT meant to be used as a way to store data! Rather, it's a way to communicate data between entities.
Sadly, XML is one of those words that have the magic power to make marketing people happy. So they put it everywhere. If that doesn't work, they just put more.
IBM reckons that the addition of native XML support will expand the $7.8bn relational database market by another $1.4bn. And IBM wants to get the bulk of that additional XML-related revenue for databases.
XML support has been on the most-wanted list for most companies for quite some time now. With Web Services being used everywhere, and most data formats going XML, representing all of those in old-style tabular form and querying them is such a pain. SQL Server 2005 and Oracle have excellent XML support right now, not next year. Which means, IBM, you are late. The desperate switchers are already switching (I know many who did to MSSQL 2005). And many for whom it is desirable have been playing around with it for at least a year now. By the time Viper is done, they will already be running some database that supports XML.
Which not only means that you will get very little of the XML pie, but also that you will have to work really hard to make sure your existing customers don't move to Oracle or MS, because they want XML support much earlier.
Life is just a conviction.
If you've been running these databases successfully, you're probably spending a lot of time writing and maintaining code to handle ACID issues, locking, and other headaches.
Why not pay someone else to do that kind of work?
[And yes, you can donate to PostgreSQL development!]
Oracle has stored XML data in a tree structure and allowed querying via XQuery since version 9.
Stephen
"Don't write down to your readers, the only people less intelligent than you can't read" - Sign on Newspaper Office Wall
> So for more agility in your database designs, you endorse LESS normalization? I can't imagine a less normalized
> database ever being more agile than a properly normalized one. Either I'm missing what you mean by dynamic
> model, or you don't understand the benefits of normalization.
Right - I'm not talking about 'denormalization' in the way that you would denormalize a model to simplify SQL and improve performance on a reporting application. I'm talking about not applying that set of database modeling rules at all.
> You do know one of the main goals of the relational model was to allow agility right?
Yep, and it has done that well: relational databases are far more agile than the hierarchical ones that preceded them. But - they aren't agile enough for some problems.
For example, let's say that you have a bicycle-shop-management application that you sell to small shops. You sell it for, what? $5,000 plus 18% annual maintenance. It handles bicycle inventory, sales, some light marketing, etc. Well, one day one of your customers decides to sell books about bicycles. Perhaps you've got a generic inventory table that he can describe things in - but if you've got a 3-5NF model, it isn't that generic. There are no columns specific to books in it. And he really can't afford to spend $10-50k on an update to support that.
So, ideally you've got a model in which some attributes of items are kept in key-value pair tables. This isn't wonderful for a lot of reasons - but it does give the application owner the ability to define new kinds of attributes that were unforeseen by the DBA. And, if done well, he can even define (in the database) rules for when some of these attributes are required, what their domain is, what their type is, what their default is, etc. These "dynamic attributes" would give the user the ability to create whatever new columns they want to describe the entity "book".
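To make the key-value idea concrete, here is a minimal sketch in Python with SQLite. Everything in it is my own assumption for illustration: the table names (item, attribute_def, item_attribute), the helper functions, and the sample book. It only covers "required" and "default" rules; domain and type rules would need more columns in attribute_def.

```python
import sqlite3

# Hypothetical schema: one fixed item table plus key-value attribute tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (
    item_id INTEGER PRIMARY KEY,
    kind    TEXT NOT NULL,
    name    TEXT NOT NULL
);
CREATE TABLE attribute_def (          -- rules the shop owner defines as data
    kind        TEXT NOT NULL,
    attr_name   TEXT NOT NULL,
    required    INTEGER NOT NULL DEFAULT 0,
    default_val TEXT,
    PRIMARY KEY (kind, attr_name)
);
CREATE TABLE item_attribute (         -- the actual key-value pairs
    item_id   INTEGER NOT NULL REFERENCES item(item_id),
    attr_name TEXT NOT NULL,
    value     TEXT NOT NULL,
    PRIMARY KEY (item_id, attr_name)
);
""")


def define_attribute(kind, attr_name, required=False, default=None):
    """Declare a new attribute for a kind of item; no schema change needed."""
    conn.execute(
        "INSERT INTO attribute_def VALUES (?, ?, ?, ?)",
        (kind, attr_name, int(required), default),
    )


def add_item(kind, name, attrs):
    """Insert an item after checking attrs against the stored rules."""
    defs = conn.execute(
        "SELECT attr_name, required, default_val FROM attribute_def WHERE kind = ?",
        (kind,),
    ).fetchall()
    unknown = set(attrs) - {attr_name for attr_name, _, _ in defs}
    if unknown:
        raise ValueError(f"undefined attributes for {kind!r}: {sorted(unknown)}")
    values = {}
    for attr_name, required, default_val in defs:
        if attr_name in attrs:
            values[attr_name] = str(attrs[attr_name])
        elif default_val is not None:
            values[attr_name] = default_val
        elif required:
            raise ValueError(f"missing required attribute {attr_name!r}")
    item_id = conn.execute(
        "INSERT INTO item (kind, name) VALUES (?, ?)", (kind, name)
    ).lastrowid
    conn.executemany(
        "INSERT INTO item_attribute VALUES (?, ?, ?)",
        [(item_id, key, value) for key, value in values.items()],
    )
    return item_id


def get_attributes(item_id):
    """Return the dynamic attributes of one item as a dict."""
    return dict(conn.execute(
        "SELECT attr_name, value FROM item_attribute WHERE item_id = ?", (item_id,)
    ))


# The shop starts selling books: two INSERTs of metadata, no ALTER TABLE.
define_attribute("book", "author", required=True)
define_attribute("book", "format", default="paperback")
book_id = add_item("book", "Winter Riding Handbook", {"author": "A. Writer"})
print(get_attributes(book_id))
```

The usual trade-offs apply: every value is stored as text here, and queries that filter on several attributes at once need a join or subquery per attribute.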
Additionally, you could design the model to support the concept of "dynamic entities": in which concepts such as book, bike, helmet, wrench, tire can be logical subtypes of inventory item. Not just identified through a single simple tag - these concepts can be related through many-to-many relationships to one another, to multiple stores, to customers, etc. The relationships between these entities can be dated, prioritized, weighted, and the entities can inherit from multiple parents in this case. Now when the store owner wants to add the concept of book they can *easily* also create overlapping sub-categories below it (mountain biking, road biking, family biking, competitive biking, history, etc) - and then relate these items to other inventory items that share that category. End result - you click on the bike shop's web site and look at a heading called "winter biking" - and see everything remotely related to this concept. And - it was easy to set up, and there's nothing specific to "winter biking" in the structure of the data.
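A rough sketch of the "dynamic entities" idea, again in Python with SQLite. The entity and relation tables, the helper names, and the sample data are all hypothetical; a real model would also carry dates and priorities on the relationship, and links to stores and customers. The point is only that "winter biking" is a row, not a column or a table.

```python
import sqlite3

# Hypothetical schema: every concept is a row, and relationships are rows too.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (
    entity_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE relation (               -- many-to-many, so multiple parents are fine
    parent_id INTEGER NOT NULL REFERENCES entity(entity_id),
    child_id  INTEGER NOT NULL REFERENCES entity(entity_id),
    weight    REAL NOT NULL DEFAULT 1.0,
    PRIMARY KEY (parent_id, child_id)
);
""")


def add_entity(name):
    conn.execute("INSERT INTO entity (name) VALUES (?)", (name,))


def relate(parent, child, weight=1.0):
    """Link two existing entities by name, with an optional weight."""
    conn.execute(
        """INSERT INTO relation (parent_id, child_id, weight)
           SELECT p.entity_id, c.entity_id, ?
           FROM entity p, entity c
           WHERE p.name = ? AND c.name = ?""",
        (weight, parent, child),
    )


def related_to(name):
    """Everything reachable below `name`, found with a recursive query."""
    rows = conn.execute(
        """WITH RECURSIVE below(entity_id) AS (
               SELECT entity_id FROM entity WHERE name = ?
               UNION
               SELECT r.child_id
               FROM relation r JOIN below b ON r.parent_id = b.entity_id
           )
           SELECT e.name
           FROM below b JOIN entity e ON e.entity_id = b.entity_id
           WHERE e.name <> ?
           ORDER BY e.name""",
        (name, name),
    )
    return [row[0] for row in rows]


for entity_name in ["inventory item", "book", "tire", "winter biking",
                    "studded tire", "winter riding handbook"]:
    add_entity(entity_name)

relate("inventory item", "book")
relate("inventory item", "tire")
relate("book", "winter riding handbook")
relate("tire", "studded tire")
# The same items also hang under an overlapping category.
relate("winter biking", "studded tire", weight=0.9)
relate("winter biking", "winter riding handbook", weight=0.6)

print(related_to("winter biking"))
```

UNION (rather than UNION ALL) in the recursive query drops duplicates, which also keeps it from looping forever if someone creates a cycle.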
Sort of similar to what the topic maps community is trying to do with XML:
http://www.topicmaps.org/
Though in my opinion they are only shooting for a subset of what we should be trying to do at this time, and what we can do via relational databases or whatever. Still, with strong DB2 support for topic maps, that may be the easiest way to go for now.
A few people have asked whether DB2 is going to support XQuery, or said that it won't, or that putting XML in databases is stupid, or that there are no advantages to having XML in relational databases.
The IBM article does say that their Viper product will support XML Query (it's also known as XQuery).
So yes, looks like they will be supporting XML Query.
Is it a good thing? Some pretty smart people seem to think it's a good idea, so maybe it's worth at least taking the time to listen to them.
If the only XML you've dealt with is the result of marking up relational tables, you might not see much advantage.
If you have a lot of XML documents, though (say, five million) that all validate to an XML Schema, you know some things about them. You might know, for example, that all of the price elements contain numbers. You might know that the description elements may contain embedded partnumber elements intermixed with the text, and that those partnumber elements contain part numbers formatted a particular way.
A database can build an index based on this sort of information, and can do very efficient searches and "joins".
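As a toy illustration of that kind of index (not how any particular database does it), here is a Python sketch using only the standard library. The three sample documents, the element names price and partnumber, and the function names are assumptions made up for the example. It relies on the schema guarantee that every price parses as a number.

```python
import bisect
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical documents; note the partnumber elements mixed into the text.
DOCS = {
    "doc1": "<item><price>19.95</price><description>Fits hub "
            "<partnumber>HB-100</partnumber> only.</description></item>",
    "doc2": "<item><price>45.00</price><description>Needs "
            "<partnumber>HB-100</partnumber> and <partnumber>TR-7</partnumber>."
            "</description></item>",
    "doc3": "<item><price>120.50</price><description>Complete wheel."
            "</description></item>",
}


def build_indexes(docs):
    """Scan every document once and build two lookup structures."""
    by_part = defaultdict(set)        # part number -> ids of documents citing it
    priced = []                       # (price, document id) pairs
    for doc_id, text in docs.items():
        root = ET.fromstring(text)
        for price in root.iter("price"):
            priced.append((float(price.text), doc_id))
        for part in root.iter("partnumber"):
            by_part[part.text].add(doc_id)
    priced.sort()
    prices = [price for price, _ in priced]
    doc_ids = [doc_id for _, doc_id in priced]
    return by_part, prices, doc_ids


def docs_in_price_range(prices, doc_ids, low, high):
    """Binary search on the sorted prices instead of re-parsing documents."""
    start = bisect.bisect_left(prices, low)
    end = bisect.bisect_right(prices, high)
    return doc_ids[start:end]


by_part, prices, doc_ids = build_indexes(DOCS)
print(docs_in_price_range(prices, doc_ids, 10, 50))
print(sorted(by_part["HB-100"]))      # a crude "join" on a shared part number
```

With five million documents you would not keep this in memory, but the principle is the same: the schema tells you which elements are worth indexing and what type they hold.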
You might also think about what you could do if you had all of the XHTML documents from some major Web site (perhaps an Intranet corporate site, or maybe your own personal site) stored in a database in such a way that you could easily make different views of the information.
I think the real niche for XQuery might be as middleware: the ability to run queries against multiple databases, whether XML or relational or flat file or whatever, without caring about how the data is stored, can be very interesting, not to say useful.
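To give a feel for the middleware idea, here is a hand-rolled Python sketch that answers one question from two differently stored sources: an XML price list and a relational stock table. This is not XQuery and not any product's API; the catalog markup, the stock table, and the function names are invented for the example. An XQuery engine would let you write the join once, declaratively, instead of gluing it together like this.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Source 1 (hypothetical): a supplier price list that arrives as XML.
SUPPLIER_XML = """
<catalog>
  <part number="HB-100"><price>19.95</price></part>
  <part number="TR-7"><price>45.00</price></part>
  <part number="ZZ-9"><price>5.00</price></part>
</catalog>
"""

# Source 2 (hypothetical): stock levels kept in a relational table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (part_number TEXT PRIMARY KEY, on_hand INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO stock VALUES (?, ?)",
    [("HB-100", 4), ("TR-7", 0), ("QQ-1", 12)],
)


def xml_prices(xml_text):
    """Read part number -> price from the XML source."""
    root = ET.fromstring(xml_text)
    return {
        part.get("number"): float(part.findtext("price"))
        for part in root.iter("part")
    }


def stock_levels(connection):
    """Read part number -> quantity on hand from the SQL source."""
    return dict(connection.execute("SELECT part_number, on_hand FROM stock"))


def in_stock_with_price(xml_text, connection):
    """Join the two sources on part number, keeping only parts in stock."""
    prices = xml_prices(xml_text)
    stock = stock_levels(connection)
    return sorted(
        (part, prices[part], stock[part])
        for part in prices.keys() & stock.keys()
        if stock[part] > 0
    )


print(in_stock_with_price(SUPPLIER_XML, conn))
```

The caller of in_stock_with_price never sees which half came from XML and which from SQL, which is the property that makes the middleware use case interesting.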
ISO SQL has also standardised on how to map between SQL and XML Query data types, and on how to evaluate XML Query expressions embedded in SQL expressions. The Java Community Process has been working on XQJ, a way to reach out to XQuery data stores from within Java.
The XML Query Home Page (disclaimer: I maintain this) lists some 45 implementations, both proprietary and open source. Not all of these are complete, but, as others have noted here, XML Query is a W3C Candidate Recommendation: we're asking for public feedback from implementors, and trying to make sure that the specification is clear and precise enough that implementations all work the same way.
I think XML Query support in SQL databases is likely to become pretty widespread. Until it is, you can also use some open source implementations that support JDBC, as well as one or other of the commercial implementations that support query optimisation over external SQL-based data stores.
Live barefoot!
free engravings/woodcuts