Sneak Peek at IBM 'Viper' DB2 Release
Rob let us know that Computer Business Review magazine is reporting that IBM is about to add more fuel to the database fire. The company has offered up a sneak peek at their upcoming "Viper" release of their DB2 database. From the article: "DB2 Viper will be distinct from current DB2 database implementations in that it will be able to store XML formatted data inside the database natively--XML support will not be bolted onto the side. Viper will also support relational data stores, of course, and access to those database tables using the SQL programming language."
the SQL programming language
It's a query language. Ffs, the name even says so.
Although, on second thought, the name also says it's structured.
Not "peak". Sheesh!
Talking about native XML databases... My company can't find a decent one, preferably open source.
That's probably because an XML database is NOT a decent idea. XML is NOT meant to be used as a way to store data! Rather, it's a way to communicate data between entities.
Sadly, XML is one of those words that have the magic power to make marketing people happy. So they put it everywhere. If that doesn't work, they just put more.
There aren't any. XML databases are a dumb idea, and they will never perform as well as regular relational databases. The best thing you can do is store your data the regular way, and use an application layer to read and write XML as needed.
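A minimal sketch of that "application layer" idea, for anyone who wants to see it spelled out. It assumes Python's built-in sqlite3 and ElementTree modules; the `parts` table, its columns, and the helper names are made up for illustration and are not from any product discussed here.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_db():
    """Create an in-memory database with a hypothetical parts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
    return conn


def rows_to_xml(conn):
    """Read ordinary rows and serialize them as one XML document."""
    root = ET.Element("parts")
    for part_id, name, price in conn.execute(
        "SELECT id, name, price FROM parts ORDER BY id"
    ):
        part = ET.SubElement(root, "part", id=str(part_id))
        ET.SubElement(part, "name").text = name
        ET.SubElement(part, "price").text = f"{price:.2f}"
    return ET.tostring(root, encoding="unicode")


def xml_to_rows(conn, xml_text):
    """Parse XML of the same shape and store it as ordinary rows."""
    for part in ET.fromstring(xml_text).iter("part"):
        conn.execute(
            "INSERT INTO parts (id, name, price) VALUES (?, ?, ?)",
            (int(part.get("id")), part.findtext("name"), float(part.findtext("price"))),
        )


# The data lives in plain tables; XML only exists at the boundary.
source = make_db()
source.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)", [(1, "bolt", 0.25), (2, "gear", 4.5)]
)
document = rows_to_xml(source)

target = make_db()
xml_to_rows(target, document)
```

The database never sees a tag, so indexes, constraints, and joins work as usual, and the XML shape can change without touching the schema.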
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
Oracle's had this for years. Since v 8, I think? (corrections welcome)
;-)
Glad to oblige
Oracle basically chucks its XML into a LOB, and you can search the LOB for strings, etc.
What IBM has actually breaks down the XML, creating a tree structure behind the scenes. There may be no out-and-out benefits at the moment, but the solution is a much better implementation than Oracle's. The applications will come.
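To make the LOB-versus-tree distinction concrete, here is a rough Python sketch using only the standard library. It says nothing about how either vendor actually implements storage; the sample document and variable names are invented for illustration.

```python
import xml.etree.ElementTree as ET

doc = (
    "<order>"
    "<customer>Acme</customer>"
    "<note>ship with the Widget order</note>"
    "<item>Gadget</item>"
    "</order>"
)

# LOB-style: the document is opaque text, so the only question we can ask
# is whether a string occurs somewhere in it.
lob_hit = "Widget" in doc  # matches, but only because of the <note> text

# Tree-style: the document is parsed into nodes, so we can ask about a
# specific element instead of the whole blob.
tree = ET.fromstring(doc)
item_names = [item.text for item in tree.findall("item")]
tree_hit = "Widget" in item_names  # no <item> is a Widget
```

The string search reports a hit that the structured query correctly rejects, which is the kind of difference a decomposed representation buys you.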
Visit here and have a look at the paper "An Overview of Native XML Support in DB2". Also maybe see "Learn how IBMs new XML technology differs from other XML storage", which is a link to a Register article.
Note to ACs: I won't mod you up, even if you are being funny or insightful. So take a chance! It's not real life!
Back when IBM bought Lotus, Notes was a very unique platform for document databases. I wonder if they've taken the old Notes document database concept and expanded it to XML. IBM owns so much esoteric intellectual property; you would hope they could find some interesting overlaps.
As IBM indicates in their press release, they're making sure it integrates with PHP as well.
BTW, The Register has some good coverage on the new XML integration.
I am the vinder viper!!!! I vill be there in three months!!! I come to vipe your vindows!!!
IBM reckons that the addition of native XML support will expand the $7.8bn relational database market by another $1.4bn. And IBM wants to get the bulk of that additional XML-related revenue for databases.
Sql support has been on the most wanted list for most companies for quite some time now. With Web Services being used everywhere, and most data formats going XML, representing all those in old-style tabular form and querying them is such a pain. Now, Sql Server 2005 and Oracle have excellent Xml support right now, not next year. Which means IBM, you are late. The desperate switchers are already switching (I know many who did to MSSQL 2005). And many for whom it is desirable have been playing around with it for at least a year now. By the time Viper is done, they would already be running some database which supports Xml.
Which not only means that you would get very little of the Xml pie, but also that you will have to work real hard to make sure your existing customers don't move to Oracle or MS, because they want Xml support much earlier.
Life is just a conviction.
If you've been running these databases successfully, you're probably spending a lot of time writing and maintaining code to handle ACID issues, locking, and other headaches.
Why not pay someone else to do that kind of work?
[And yes, you can donate to PostgreSQL development!]
Sql support has been on the most wanted list for most companies for quite some time now.
Indeed, SQL support is often the first thing I look for when shopping for a database.
The strange thing about this development is that the navigation model used by XML is essentially the old "network" model used by, among others, CODASYL in the early seventies. This model became unfashionable when the relational model gained popularity, but seems to be quite fashionable when it is wrapped in XML syntax :-)
I wonder what the price point for Viper is going to be in comparison. I already know what it is for the various versions of SQL Server 2005. Ouch! I'm waiting for my Enterprise and Developer versions to show up now so I can play more (I've been playing with the betas for a long time now as I do DBE work as well).
"[I]t is a wise man who admits the limits of his knowledge or skill, and that pretending either causes harm." --Terry Go
As I recall, that's the same sentiment expressed about relational databases when the current state of the art was hierarchical and CODASYL databases in the 70's. All it takes is one really good implementation of the new idea to change this perception.
There were huge debates about the "abstract model" of a relational database that didn't make sense in The REAL WORLD (TRW), because "real" problems were more complex than the relational model and performance would suffer.
I don't know that an XML database is "better", but then again, I don't know that it isn't. Maybe I'll learn something!
We develop documentation (manuals, online help, etc.), write our content modularly in XML, and publish it to different output formats (PDF, HTML, CHM, etc.). In this case XML is an excellent format for storing content.
Oops, of course I should have been more clear. What I meant is that I don't think XML is meant (IMHO) to be used to store MASSIVE amounts of the kind of data you USUALLY store in a DB.
While it can be extremely useful to use XML to represent the data you just extracted from the DB (or the data you are about to insert), saying "we store data as XML natively" sounds to me like just a silly marketing campaign.
Oracle has stored XML data in a tree structure and allowed querying via XQuery since version 9.
Stephen
"Don't write down to your readers, the only people less intelligent than you can't read" - Sign on Newspaper Office Wall
So it's not the storage that counts, it is the ability to extract useful information from the text field/clob without requiring a great amount of processing overhead. Which is where I wonder how useful this is except in situations where there is very little post-processing or querying to be done against the XML. For example, if I am always just going to render the XML or pass it along without any post-processing. Even then, in terms of processor time, etc. it just isn't that hard to write good code to pull the data from a regular SQL database, output it as XML, etc., thus gaining all of the other advantages that a modern dbms has over flat file storage without imposing the dreadful data overhead required for all of the xml tags, etc.
Am I missing something?
...Open Source isn't the only answer -- but it's almost always a better value than the alternatives...
Viper will also support relational data stores, of course, and access to those database tables using the SQL programming language.
Thank you Captain Obvious! Until I read the headline on slashdot, I was concerned the new DB2 might not support SQL queries. Now I can sleep tonight.
On a radical tangent, I was thinking of buying a new car. Has anyone heard if the new cars from GM have wheels that turn? I'm not sure because it doesn't say on the website anywhere. I really hope the new cars have wheels that turn. If the wheels didn't turn... that'd be like a database without SQL... or something.
XML is neither a means of storing data, nor a means of communicating data, but is only a means of *marking up* data.
"Times have not become more violent. They have just become more televised."
-Marilyn Manson
> It still sucks for real time applications. DB2 is a good warehouse DB, good for batch processes and such.
The differences between oracle & db2 for transactional apps are mostly:
- db2 is about 1/3rd the cost of oracle
- db2 is faster
- db2 includes some warehousing features (range-partitioning via MDC) for free which are often also useful in these applications
- db2 is simpler to administer
- oracle has a locking interface that's easier to use (MVCC instead of row-locks)
- db2 likes to use static sql that requires binds (pita, but optional)
> I must admit those IBM guys know how to butter the sales to the management with all those golf subscriptions,
> hockey tickets what have you.
Hmmm, I've worked with sales staff from quite a few different companies. But I've never worked with people as nasty as at Oracle. They go *way* beyond mere buttering up of management all the way to stabbing the technical staff in the back when they want their professional services team to get their work, or when the Oracle product fails to deliver the labor savings that sales promised. Oh, and then there's the famous Oracle trick of leaving vital pieces of the product out of the discounted original deal, and slaying the customer when they discover that these are required...
For processor intensive searches, you have the option to throw hardware at the problem, moving up into RISC and mainframe platforms if needed.
1 gig? Major database? BWAHAHAHAHAHAHA!
I think that all the folk saying that "XML is bad for databases" are dismissing it far too quickly. Let us think about the general case of "marked up" data being included in a database.
First, is there a difference between doing this in a relational database versus another kind (say object DB)? Perhaps so, but I wish to focus on RDBMS since it is the one that is on topic here and the one that seems so counterintuitive.
Marked up data (XML, HTML, perhaps even SGML) consists of field values _and_ the schema of the fields themselves (even if not always the base data type). Whilst it may be necessary to have the grammar to be certain about the full domain of the *ML there is enough in the marked up data to construct a record from the input data. Think about it, this means that each record arriving at the database contains some information about the schema of the record as well as the data itself.
A database that took this *ML and integrated it natively would, in my world, allow the user to create tables with an indeterminate number of fields that could vary from record to record whilst still allowing normal RDBMS functionality.
The complexity of such an implementation would be high, particularly within the context of a database that still has good indexing, table management and performance. Foreign keys would be an intriguing challenge. There is nothing about the problem that is inherently unsolvable but performance would be a real challenge.
I don't think that this functionality is a category killer. But I can imagine why some people love the idea. Lots of people would like to be able to define records in their RDBMS that have arbitrary fields that the designer of the schema did not know about when the database was built. SQL does not cope with this scenario at all. However, in my view correct normalisation solves most of these issues and removes the need for native XML. Perhaps it would have been easier for IBM to ship DB2 with a copy of McGovern and Date.
"The first thing to do when you find yourself in a hole is stop digging."
Oracle basically chucks its XML into a LOB
How *else* do you store a value from a type in a database?
How does Oracle store integers? "Uhh, that's different" I hear you mumble. No, it's not. An XML document and the associated tree representation is a *value*, an instance of an *XML data type*, with associated operators (xpath, text search, update, etc). So it goes into an attribute (column).
Go back and review your relational theory (that advice applies to 99.99% of users and vendors unfortunately).
If Oracle's marketing has convinced you of something different, then that's their marketing department's fault. The exact implementation (how the XML tree is stored) and syntax (how you query it) is irrelevant. The relational model, and classical type theory (which predates the RM) already tells you how to think logically and abstractly about any data storage and manipulation task, without regard to the peculiarity of any particular product.
For a more concrete example, not using the syntax of any particular product:
Of course, this needs a cup and a half of syntactic sugar to make it more pleasant when using XML-heavy applications (for instance, XPath could be embedded a little more gracefully), but surely you can see that all XML databases can reduce to the same model. Those that are created with ignorance of the relational model won't be as useful.
A well-designed relational database would already be an XML (hierarchic) database, and would already be an object (network) database, because those are both less general than the RM (entities related by arbitrary assertions).
One problem with today's SQL products (and they have MANY) is that you can't create your own types easily. You should be able to add XML support, object support, or whatever else, as easily as you can with a general-purpose programming language. You shouldn't have to wait for the vendor to "add" it.
Imagine if you had to *wait for a new release* to get XML support in Java or Perl. Yeesh. Yet database users seem perfectly content to suck down the crap from the vendors. They don't know what to ask for or how to evaluate what they get. Even though this was mostly figured out 30 years ago.
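A rough sketch of the "XML is just a type with operators" point, and of adding such an operator yourself rather than waiting for a vendor. It assumes Python's sqlite3 module, whose `create_function` lets an application register its own SQL function; the `manuals` table and the `xpath_text` name are hypothetical, and ElementTree only understands a limited XPath subset.

```python
import sqlite3
import xml.etree.ElementTree as ET


def xpath_text(document, path):
    """Operator on the XML value: text of the first node matching path, or None."""
    return ET.fromstring(document).findtext(path)


conn = sqlite3.connect(":memory:")
# Register the operator so it can be used inside ordinary SQL expressions.
conn.create_function("xpath_text", 2, xpath_text)

# The XML document is simply a value in a column, like an integer would be.
conn.execute("CREATE TABLE manuals (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO manuals VALUES (?, ?)",
    [
        (1, "<manual><title>Install</title><lang>en</lang></manual>"),
        (2, "<manual><title>Aufbau</title><lang>de</lang></manual>"),
    ],
)

# Relational selection and projection, with the XML operator applied per value.
titles = conn.execute(
    "SELECT id, xpath_text(body, 'title') FROM manuals "
    "WHERE xpath_text(body, 'lang') = 'en'"
).fetchall()
```

How the value is physically stored (text, LOB, tree) does not show up in the query at all, which is the logical-versus-physical separation being argued for. A real implementation would also want indexes over the operator, which this sketch ignores.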
Actually the pricing of DB2 is quite reasonable -- especially for the express version.
<flame suit on>The other issue is that many companies using products such as MySql have to re-implement features that are standard in other systems. Features such as robust replication, clustering, etc also are just coming on line for MySql and Postgres, but have been part of DB2 and friends for years.
<flame suit off>
A few people have asked whether DB2 is going to support XQuery, or said that it won't, or that putting XML in databases is stupid, or that there are no advantages to having XML in relational databases.
The IBM article does say that their Viper product will support XML Query (it's also known as XQuery).
So yes, looks like they will be supporting XML Query.
Is it a good thing? Some pretty smart people seem to think it's a good idea, so maybe it's worth at least taking the time to listen to them.
If the only XML you've dealt with is the result of marking up relational tables, you might not see much advantage.
If you have a lot of XML documents, though (say, five million) that all validate to an XML Schema, you know some things about them. You might know, for example, that all of the price elements contain numbers. You might know that the description elements may contain embedded partnumber elements intermixed with the text, and that those partnumber elements contain part numbers formatted a particular way.
A database can build an index based on this sort of information, and can do very efficient searches and "joins".
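As a toy illustration of that kind of index (not how any real engine does it), here is a Python sketch that extracts the numeric price from each document once and then answers range queries from the sorted index alone. The three sample documents and the function name are invented, and the sketch assumes every document has a valid numeric price element, as a schema would guarantee.

```python
import bisect
import xml.etree.ElementTree as ET

documents = {
    "doc-a": "<part><price>19.99</price><description>Bracket</description></part>",
    "doc-b": "<part><price>4.50</price><description>Washer</description></part>",
    "doc-c": "<part><price>120.00</price><description>Motor</description></part>",
}

# Build the index once: (price, document id) pairs in ascending price order.
price_index = sorted(
    (float(ET.fromstring(text).findtext("price")), doc_id)
    for doc_id, text in documents.items()
)
prices = [price for price, _ in price_index]


def ids_with_price_between(low, high):
    """Return ids of documents whose price is in [low, high], using only the index."""
    start = bisect.bisect_left(prices, low)
    end = bisect.bisect_right(prices, high)
    return [doc_id for _, doc_id in price_index[start:end]]


cheap = ids_with_price_between(1, 20)
```

The lookup is a binary search over the extracted values, so no document is re-parsed at query time. Knowing from the schema that price is always a number is what makes it safe to index it as one.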
You might also think about what you could do if you had all of the XHTML documents from some major Web site (perhaps an Intranet corporate site, or maybe your own personal site) stored in a database in such a way that you could easily make different views of the information.
I think the real niche for XQuery might be as middleware: the ability to run queries against multiple databases, whether XML or relational or flat file or whatever, without caring about how the data is stored, can be very interesting, not to say useful.
ISO SQL has also standardised on how to map between SQL and XML Query data types, and on how to evaluate XML Query expressions embedded in SQL expressions. The Java Community Process has been working on XQJ, a way to reach out to XQuery data stores from within Java.
The XML Query Home Page (disclaimer: I maintain this) lists some 45 implementations, both proprietary and open source. Not all of these are complete, but, as others have noted here, XML Query is a W3C Candidate Recommendation: we're asking for public feedback from implementors, and trying to make sure that the specification is clear and precise enough that implementations all work the same way.
I think XML Query support in SQL databases is likely to become pretty widespread. Until it is, you can also use some open source implementations that support JDBC, as well as one or other of the commercial implementations that support query optimisation over external SQL-based data stores.
Live barefoot!
free engravings/woodcuts
You can store XML according to a schema, and index it using B*Trees, or you can store it as a LOB and index it with function-based indexes or text indexes. There are tradeoffs to either. Here's a document explaining Oracle's implementation, though you need to sign up to OTN (free reg)
http://download-west.oracle.com/docs/cd/B14117_01/appdev.101/b10790/xdb01int.htm#sthref46
-Stu