Slashdot Mirror


Do XML-based Databases Live Up to the Hype?

douthitb asks: "I have recently started work as a contractor with a company developing/improving an application for exchanging large amounts of data. The current solution exchanges data via XML, but the data itself is stored in a SQL Server database. There is a concern about the overhead involved with wrapping and unwrapping the XML to get the data in and out of a relational database. The proposed solution is to use Tamino, an XML-based database. Neither I nor any of the other developers have any experience with Tamino, but the desired result is to remove the bottleneck of converting the XML back and forth. Does anyone have experience using Tamino (or any other XML-based database)? What benefits and/or difficulties did you have in using an XML database, as opposed to its relational counterpart? How large of a learning curve should be expected with a product like this? Do XML databases really live up to the hype? A similar topic was discussed on Slashdot way back when, so I was hoping to get some more up-to-date feedback on the subject."

"Sales reps from Software AG, the makers of Tamino, were brought in to discuss the benefits of their product with us. They, of course, presented Tamino as the end all, cure all database system (it will even clear your acne and make you popular with the girls!). The management of the company I'm contracting with were basically eating out of the sales reps' hands, without asking any of the "tough" questions about what the product can do; I was less convinced. Doing some initial searching on the Internet, I have had trouble finding much information about Tamino outside of the Software AG website."

8 of 105 comments

  1. yeah, i support a tamino server at work. by Anonymous Coward · · Score: 4, Informative

    it runs in tomcat or similar. it's really crashy. we can't wait to get rid of it.

  2. Berkeley DB XML by SchnauzerGuy · · Score: 4, Informative

    I haven't tried it, but the regular Berkeley DB is highly regarded, and both are open source and (depending on your situation) free, so it is definitely worth a look.

    Berkeley DB XML 2.0

    1. Re:Berkeley DB XML by selectspec · · Score: 2, Informative

      Berkeley DB (XML) is great for some applications, but it lacks high availability (remote replication, clustering, etc.).

      Tamino seems to claim recent support for "Enterprise High Availability" but I'm not sure what that means.

      Before I'd decide on XML, SQL, flat files, OODBMS, RDBMS, etc., I'd want to know four things:

      1. How will it be secured?
      2. How will I back it up and recover it?
      3. How will I replicate/mirror/cluster it locally and over distances in case of a failure/disaster?
      4. Do upgrades require downtime?

      Then I'd discuss the academic issues.

      --

      Someone you trust is one of us.

  3. The devil's in the details by LeninZhiv · · Score: 5, Informative

    The first question to answer is, why is this data in a relational database to begin with? More to the point, is this application the only one that accesses the data, or are there other, non-XML-centric databases that make use of the same data? The relational model gives you flexibility that XML does not for dealing with the data in arbitrary and unforeseen ways (XML can be quite flexible with XSLT, but a programmer must still intervene for each and every new way you want to use the data, with a much bigger performance hit). The normalised relational database stores your data in a mathematically sound way that puts the priority on integrity of data independently from its past, present or future structure; XML preserves data structure based on its present use while leaving the door open to moving from that to any arbitrary future use... which of the two ideals is more attractive depends on the nature of the data and how many applications need to use it.

    Relational databases with good XML support (my background is DB2 but most major databases should be able to do this) reach a good compromise by giving you access to normalised relational data as XML (which you can complement with XSLT if that's what needs to be done), while preserving it internally reduced to its bare essence as data (according to relational calculus' idea of what constitutes the bare essence of data, anyway).

    On the other hand, for single-app applications, or data that is more file oriented than datum-oriented (databases of XML documents where the document rarely or never needs to be abstracted from the data it contains), XML databases offer simplicity and efficiency by removing the need to work out a relational data model. Why break up your structured documents into a DBA's hand-tuned data model when 99.9% of your queries will just build these data sets back into XML documents (even when DB2, Oracle, and I assume SQL Server can automate this last task)? An XML database can give you more flexibility in querying than an all-XSLT solution, while saving a lot of unnecessary work over an SQL-to-XML solution for what is really an XML-to-XML application.

    As I see it, that's the big picture. The actual decision has to come down to your applications. An XML database will be less efficient for non-XML applications, plain and simple. Querying XML cannot be made as fast as querying relational tables, meaning extra overhead for non-XML apps. But *your* application incurs overhead in turning relational tables into XML (probably via the RDBMS's internal facility), and in transforming it if necessary. The question is therefore: who makes more queries on the database, this application or other non-XML ones? Who will make more queries in 5 years?
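    To make the wrapping and unwrapping overhead concrete, here is a minimal sketch of the round trip using only Python's standard library. The `orders` table, its columns, and the element names are hypothetical, and SQLite stands in for whatever RDBMS is actually in use; a real system would more likely rely on the database's own XML facility.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_db(rows):
    """Create an in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn


def rows_to_xml(conn):
    """Wrap relational rows in XML: one <order> element per row."""
    root = ET.Element("orders")
    query = "SELECT id, customer, total FROM orders ORDER BY id"
    for order_id, customer, total in conn.execute(query):
        order = ET.SubElement(root, "order", id=str(order_id))
        ET.SubElement(order, "customer").text = customer
        ET.SubElement(order, "total").text = f"{total:.2f}"
    return ET.tostring(root, encoding="unicode")


def xml_to_rows(conn, document):
    """Unwrap an XML document back into rows; returns the number inserted."""
    rows = [
        (int(o.get("id")), o.findtext("customer"), float(o.findtext("total")))
        for o in ET.fromstring(document).iter("order")
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return len(rows)
```

    Every exchange pays for both functions, once on the sending side and once on the receiving side. That is the cost an XML database is supposed to remove, and it is worth measuring before assuming it is the real bottleneck.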

    If you answer 'others' to either question, use a relational database--their XML support is decent now and will only get better, and they're far more popular in business which is an important CYA factor. If you answer 'your app' or 'other XML-based apps' for both questions, it's time to check out what XML databases have to offer right now. I expect other posts to comment on the current state of the art right now, but you can expect things to only get better as industry support for XQuery et al. improves--but don't expect them to *ever* pass up the relational databases in terms of raw performance, it's impossible. But as the evolution from Assembler to C to Java has shown in programming languages, the day may come when raw performance takes a back seat to other concerns.

  4. XML,SQL,XML Query, Databases by Ankh · · Score: 4, Informative

    There seem to be a lot of confused comments on this, but hey, it's slashdot :-)

    If you mostly deal with the sort of data for which relational databases are generally optimised, you'll probably not be very interested in XML solutions, as they are solving problems you don't have.

    If you routinely get questions like "how often is part 1976 mentioned in the same repair procedure as part 2001?" or "which of our 150,000 documents have chapters containing five or more subsections any of which does not yet have a summary?" then the XML approach becomes more interesting.
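    As a rough illustration of the second question, here is a sketch in Python using the standard library's ElementTree. The `chapter`, `section`, and `summary` element names and the `title` attribute are assumptions made for the example; an XML database would answer the same question with an XQuery or XPath expression over the whole collection rather than one document at a time.

```python
import xml.etree.ElementTree as ET
from typing import List


def chapters_needing_summaries(document: str, min_sections: int = 5) -> List[str]:
    """Return titles of chapters that have at least `min_sections` direct
    <section> children, any of which has no <summary> child."""
    root = ET.fromstring(document)
    flagged = []
    for chapter in root.iter("chapter"):
        sections = chapter.findall("section")
        if len(sections) < min_sections:
            continue
        if any(section.find("summary") is None for section in sections):
            flagged.append(chapter.get("title", ""))
    return flagged
```

    The query depends on hierarchy (sections inside chapters) and on the absence of a child element, which is the kind of structure that has to be modelled explicitly, usually with several joined tables, in a relational schema.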

    In my book on XML databases (1999 so I don't recommend going out and getting a copy today) I talked about using a hybrid system, with metadata picked out of XML whenever a changed version is stored (e.g. you might use a CVS commit script) and stored in a relational database.
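    A minimal sketch of that hybrid idea, assuming a hypothetical `doc_meta` table and `title`, `author`, and `chapter` elements (none of these names come from the book). The XML file stays the primary copy; a commit hook or similar would call `index_document` each time a changed version is stored.

```python
import sqlite3
import xml.etree.ElementTree as ET


def open_index() -> sqlite3.Connection:
    """Create the relational side of the hybrid: one metadata row per document."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE doc_meta ("
        "path TEXT PRIMARY KEY, title TEXT, author TEXT, chapters INTEGER)"
    )
    return conn


def index_document(conn: sqlite3.Connection, path: str, document: str) -> None:
    """Pick metadata out of the XML and store it, replacing any older row."""
    root = ET.fromstring(document)
    conn.execute(
        "INSERT OR REPLACE INTO doc_meta VALUES (?, ?, ?, ?)",
        (
            path,
            root.findtext("title"),
            root.findtext("author"),
            len(root.findall("chapter")),
        ),
    )
```

    Ordinary SQL then covers the metadata questions (who wrote what, which documents have more than ten chapters), and only the queries that really need document structure have to touch the XML itself.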

    With a relational database you have a lot of flexibility to change your queries but the data representation has to be static. Even changing the type of a column can be difficult in an RDBMS.

    Queries may be a little harder with the XML system, but the data storage is more flexible and you have native knowledge of sequence and hierarchy that are traditionally absent using SQL.

    More recent versions of SQL have added some XML support, understanding the different sorts of queries that people typically run against such very different sorts of data. There has been a lot of research over the past 30 or 40 years (hierarchical databases predate the relational model) on hierarchy, sequence and the sort of irregularity that RDBMS people call semistructured data and the rest of us call XML :-)

    XML Query is a query language designed to run over both relational and XML-native data sources (and others, for that matter) and to be optimized very efficiently, so that people like IBM (makers of DB2), Oracle, BEA, Software AG and others can have efficient implementations. There's also standards work on how to embed XML Query expressions in SQL.

    The public XML Query Web page is at www.w3.org/XML/Query and lists quite a large number of implementations. Software AG have participated in the XML Query development.

    You might like to look at the XML Query use case document and see how close the examples map to your own situation.

    Disclaimer: I work for the W3C, participate in the XML Query Working Group, and maintain the XML Query Web page. But it sounded like it's the sort of information you were looking for.

    I can't comment on the quality of Tamino, as I have not used it, but I will also note that if you stick to openly-defined standard query languages wherever you can, there's a good chance you could move to a different implementation if you needed to with relatively little cost. This is similar to SQL, of course.

    There was lots of hype around XML, but that doesn't mean it's all false, nor that it was all true. XML is a good way to interchange structured, hierarchical information, but it probably won't cure acne :-)

    Liam

    [slashdot::Ankh -- Liam Quin, W3C XML Activity Lead]

    --
    Live barefoot!
    free engravings/woodcuts
  5. Ananova was built on Tamino by munkinut · · Score: 2, Informative

    I worked on the http://www.ananova.com/ website, which was originally built on Tamino. Tamino couldn't handle the load and was a nightmare to admin at the time. Doubtless SoftwareAG will have fixed the lack of backup and restore tools by now. Not soon enough for us, though: we migrated the whole thing onto Oracle shortly after release.

    --
    re-invent wheels ... you never know
  6. Project 90% XML based by golgoth14 · · Score: 3, Informative
    I'm working on a project using an XML Native database, Java JAXB and the Mozilla platform.
    Currently, I'm using exist-db.org and it works fine, but I have some performance problems when I want to sort data.
    I have tested Ipedo, TextML, dbXML, XHive, ...
    and TextML was the fastest, but it doesn't support XQuery.
    Ipedo was the fastest of those that support XQuery, though only an old version of it. I think it's the best product because it provides an RDBMS bridge to query with SQL and XQuery and some other features like XViews.
    My application uses XQuery/XSLT to read data and JAXB to check and execute business methods before storing.
    I think the main problems with XML Native databases are performance and the lack of transaction support; there is only document locking.
    But, the advantages are:

    • Powerful query language XQuery
    • No code needed to modify data directly in the database.
    • Export the database as XML documents, modify them with "notepad", and re-import.
    • Easy and quick test data editing. No need to use SQL and Java to insert the test data.
    • Easy database deployment without DBA !!!
    • Powerful full-text searching

    I think XML Native databases are the ones to use when your application needs to manipulate a lot of text data,
    such as CRM, groupware, administrative applications, full-text and contextual searching, ...

    You must try at least one XML Native Database in your life to compare it with RDBMS and Object databases and form your own opinion.
  7. Re:Oracle and XSQL by Rich · · Score: 2, Informative

    It's a decent size - the results of around a million security assessments. The number of transactions per second is low, but the amount of data needed to generate reports is quite high.