Slashdot Mirror


What Do You Know About Databases And XML?

Dare Obasanjo writes: "XML has become a pervasive part of significant segments of software development in a relatively short time. From file formats to network protocols to programming languages, the influence of XML has been felt. I have written an overview of XML schemas, XML querying languages, XML-enabled databases and native XML databases. Below is a shortened version of the article." Obasanjo's original OODBMS article has been updated to reflect more of the disadvantages of picking an OODBMS over an RDBMS.

9 of 257 comments

  1. Re:Super short intro to XML by Skapare · · Score: 3, Insightful

    So if someone designs a new (not like XML) format for exchanging data, and manages to get it standardized, then won't this also allow two systems that do not share a predetermined data exchange protocol to share data? One could also be careful in this design and make sure it is more efficient than XML, not only in space and bandwidth, but also in CPU time and programming time. Now does such a format need to be text based as XML is?

    --
    now we need to go OSS in diesel cars
  2. Re:xml is an interchange format, not a storage for by sphealey · · Score: 5, Insightful
    Why is NASA switching to MySQL from Oracle [fcw.com] and noticing speed increases?
    I will defer to you on the advantages/disadvantages of using databases to store OO data.

    However, citing NASA as a source for technology or trends is a bit silly, for a number of reasons. The primary one is this: NASA is so large, and so diverse, that at one of their sites/on one of their projects they use one of just about every technology product you can name.

    I was once running two back-to-back software evaluations for products in the $20-million range. For both applications, the top ten vendors all claimed that their system was "used by NASA for the Space Shuttle". We checked up and guess what - they were all telling the truth.

    So you need a better example.

    sPh

  3. Re:xml is an interchange format, not a storage for by Skapare · · Score: 3, Insightful

    So what do you think of using XML for system configurations? On UNIX systems that tends to mean a lot of separate files, traditionally edited with vi, although today the tools are getting more and more dummy friendly and have a smaller space of possibilities.

    --
    now we need to go OSS in diesel cars
  4. Oracle vs. MySQL performance by Raul+Acevedo · · Score: 3, Insightful

    Comparing Oracle and MySQL performance in the context of XML is silly. It is a well-known fact that MySQL is significantly faster than Oracle, but not because of XML, Java, or other "OO crap". It is simply because MySQL doesn't have transactional support, and probably a host of other non-OO high end RDBMS features.

    I wouldn't be surprised if "OO crap" does indeed slow down Oracle, but I know the JVM for Oracle is completely optional. I can't speak to any XML features in Oracle, I'm not familiar with them.

    --
    In a real emergency, we would have all fled in terror, and you would not have been notified.
  5. Database storage in XML format is fine, if... by jlowery · · Score: 4, Insightful

    Of course, this is not an easy question to answer, but the right answer involves knowing three things:

    1) Can certain records be considered 'atomic'?
    This is similar to the RDBMS question of whether or not it makes sense to construct a view. View definitions represent a common query. If you consider a query as a means of tying together disparate data from many tables into a single, denormalized set of records, the record could just as easily be expressed in some XML format.

    Now, if that record represents some physical or conceptual entity in the data model, it is in fact a set of properties about an object. This is what XML is good at representing. Decomposing that set of object data (record) into normalized relations may not make sense if such 'objects' are frequently requested; but there are other considerations...

    2) Ad hoc queries are difficult when data is stored internally in XML, because each XML blob has to be parsed and checked for the query values. If you don't know in advance if the XML structure even has the fields you're looking for, then you must do an exhaustive search. Some have used indexed XPath information to work around this issue. Since we're mentioning indexes...

    3) How do you find the XML blobs you're looking for? We've used an ORDBMS for our XML data, and indexed on the ID or key values (as defined in an XML Schema) for each element stored in the database. This makes looking up element instances easier. It also makes relating them easier, too, if you use IDREF or keyrefs as your foreign keys.

    Now every XML document has a single root element. If you're storing that document in a database, you could choose to store just that one root element instance. More likely, you'll want to decompose the root so that accessing subelements by ID or key in the database will be easier.
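
    A minimal sketch of points 2) and 3), assuming SQLite in place of the ORDBMS mentioned above. The `elem` table, the `order` elements and the helper names are hypothetical, chosen only for illustration. It decomposes the root into child elements keyed by their `id` attribute, then contrasts an indexed lookup with an ad hoc query that has to parse every stored blob.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document: a single root whose children each carry an "id" attribute.
DOC = """<orders>
  <order id="o1"><customer>Ann</customer><total>12.50</total></order>
  <order id="o2"><customer>Bob</customer><total>99.00</total></order>
  <order id="o3"><customer>Ann</customer><note>gift</note></order>
</orders>"""


def load(conn, xml_text):
    """Decompose the root: store each child element as its own XML text, keyed by id."""
    conn.execute("CREATE TABLE elem (id TEXT PRIMARY KEY, tag TEXT, body TEXT)")
    for child in ET.fromstring(xml_text):
        conn.execute(
            "INSERT INTO elem VALUES (?, ?, ?)",
            (child.get("id"), child.tag, ET.tostring(child, encoding="unicode")),
        )


def by_id(conn, elem_id):
    """Indexed lookup: the primary key locates the row without parsing other blobs."""
    row = conn.execute("SELECT body FROM elem WHERE id = ?", (elem_id,)).fetchone()
    return ET.fromstring(row[0]) if row else None


def ad_hoc(conn, path, value):
    """Ad hoc query: every stored blob is parsed and checked (exhaustive search)."""
    hits = []
    for elem_id, body in conn.execute("SELECT id, body FROM elem ORDER BY id"):
        node = ET.fromstring(body).find(path)
        # Some blobs may not have the field at all, so a missing node is not an error.
        if node is not None and node.text == value:
            hits.append(elem_id)
    return hits


conn = sqlite3.connect(":memory:")
load(conn, DOC)
```

    Here `by_id(conn, "o2")` touches one row, while `ad_hoc(conn, "customer", "Ann")` has to parse all three, which is the cost that indexed XPath information is meant to avoid.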

    Got to run off now,

    Jeff Lowery

    --
    If you post it, they will read.
  6. Report writers for non-relational databases by cthlptlk · · Score: 2, Insightful

    I've been properly brainwashed in the Open Source way, and I use XML all the time as an interchange mechanism, but you'll have to pry Crystal Reports from my cold, dead fingers.

    I have spent a lot of time training non-technical users to get their own damn reports from databases. It's hard to imagine putting data--any data--into a system where the tools to get it out haven't been written yet.

  7. Missing the Big Picture by SuperKendall · · Score: 5, Insightful

    I can't believe no-one has posted my standard response to someone who thinks XML is just for "interchange".

    The interesting thing about XML to me is NOT that it solves the interchange problem (though it helps with that). The great thing is that it solves the PARSING problem. No longer do I have to write a parser every time I have some simple task of reading in something externally.

    What XML does is define for you a standard means of parsing and, by defining the API for parsing and the structure of the documents, lets you think about how you want to structure external information, not how you're going to read it in.

    Also, because the API for parsing is now hiding the engine details below, parsers can be specialized depending on what kind of task you have. Parsing thousands of 1k XML documents would seem to demand a different processor altogether from a few multi-GB documents, but you only have to know one parser (Ok, really two - SAX and the DOM interface). You could even have specialized XML processors that did write the stream out in a weird custom binary format for compactness and read it back in with the normal DOM API so clients wouldn't have to adjust. I'll grant you that there don't seem to be many specialized XML processors - yet.
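
    To make the "really two" concrete, here is a small sketch using Python's standard `xml.sax` and `xml.dom.minidom` modules. The `inventory` document and the function names are made up for the example. Both functions compute the same total: the SAX version reacts to events as the document streams past, while the DOM version builds the whole tree first and then queries it.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document.
DOC = '<inventory><item sku="a1" qty="3"/><item sku="b2" qty="5"/></inventory>'


class QtyHandler(xml.sax.ContentHandler):
    """SAX style: callbacks fire per element; nothing is kept unless we keep it."""

    def __init__(self):
        super().__init__()
        self.total = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.total += int(attrs.get("qty", "0"))


def total_sax(text):
    """Stream the document through the handler; memory use stays flat."""
    handler = QtyHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.total


def total_dom(text):
    """DOM style: build the full tree in memory, then walk it."""
    dom = minidom.parseString(text)
    items = dom.getElementsByTagName("item")
    # getAttribute returns "" for a missing attribute, hence the fallback.
    return sum(int(node.getAttribute("qty") or "0") for node in items)
```

    The streaming style suits the multi-GB case and the tree style suits small documents you want to navigate freely, yet the caller sees the same answer from either.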

    I also like the robustness of XML exchanges (here I'm getting more into your main point). If you add or drop attributes from an XML document, clients that read that document are less likely to break (unless of course they relied entirely on the node(s) you have removed!). That is especially true of XSL, where missing nodes of a document simply correspond to missing parts of output (which can also be a useful effect).
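
    A rough illustration of that robustness, using Python's `xml.etree.ElementTree` rather than XSL. The `user` documents and the `read_user` helper are hypothetical. The reader asks only for the nodes it cares about, so extra attributes and elements are ignored and a dropped attribute shows up as a missing value instead of a parse failure.

```python
import xml.etree.ElementTree as ET


def read_user(xml_text):
    """Pull out only the fields this client needs, with fallbacks for absent ones."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name", default="(unknown)"),
        "email": root.get("email"),  # None if the attribute was dropped
    }


# Three hypothetical revisions of the same document.
V1 = '<user email="a@example.org"><name>Ann</name></user>'
V2 = '<user email="a@example.org" locale="en"><name>Ann</name><phone>555</phone></user>'
V3 = '<user><name>Ann</name></user>'
```

    `read_user` gives the same result for V1 and V2, and for V3 it still returns the name with `email` set to `None`. A client that relied entirely on `email` would of course still have a problem.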

    You might think of XSL as a useless language, but I'll be happy to make a counter-prediction that it will grow and thrive. It's simply too useful a transformation tool to do anything else. I know the syntax seems overbearing, but for the kinds of short transformational work it's normally put to that's not much of an issue and you get used to it quickly.

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
  8. Re:most problems xml is used for by leandrod · · Score: 2, Insightful

    First one has to think about what XML is.

    XML is not a language, notwithstanding its own name. It's a metacodification, used to create codifications such as XQL, HTML, DocBook and so on.

    OO people are usually programmers with very little CS fundamentals, so they don't even get this right: when they are talking about XML in database contexts, they should at least specify the coding they want to use. And then it should be understood that you need to use it for storage encoding, or for data communications, or both.
    Thus one cannot say that XML was created for data interchange -- it was created for metacodification. One can create a data interchange codification based on XML -- but that's kind of stupid, since XML codifications usually will give big overheads. We've been doing data interchange with text files with few problems for years. The issue of agreeing on data model and codification between applications does not go away just because you agreed on using some codification with a big overhead.

    But I still haven't touched on the worst part of using XML codifications in database contexts -- it is that both XML and OO are hierarchical, thus a regression to thirty years ago when there were navigational databases, no data independence, hierarchical and network systems... we are throwing away thirty years of relational research without ever having implemented it right.

    But that's the way of an uneducated world... just as people adopting proprietary technologies have thrown away open systems ideals without ever having got it right.

    --
    Leandro Guimarães Faria Corcete DUTRA
    DA, DBA, SysAdmin, Data Modeller
    GNU Project, Debian GNU/Lin
  9. Re:A markup weenie rebuts. by blif · · Score: 2, Insightful
    Finally, a rational discussion of the merits of XML. Hate to use the buzzword, but it's all about repurposing baby! XML facilitates creating docs that you can then convert to a variety of output formats. This is pretty much 50% of my job, so that's why I think it's so cool. And after years of poring over binary dumps of other people's data (well, and my data too), it's very nice to use a human-readable, self-documenting format. I think a lot of the XML posts you see are from people who don't do this sort of stuff for a living.
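
    As a small sketch of what repurposing means in practice, the snippet below renders one XML source as both plain text and HTML. It uses Python's standard library instead of XSL, and the `article` markup and function names are invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document; note the escaped ampersand in the title.
SRC = (
    "<article><title>XML &amp; Databases</title>"
    "<para>First point.</para><para>Second point.</para></article>"
)


def to_text(root):
    """Plain-text rendering: underlined title followed by one line per paragraph."""
    title = root.findtext("title", default="")
    lines = [title, "=" * len(title)]
    lines += [p.text or "" for p in root.iter("para")]
    return "\n".join(lines)


def to_html(root):
    """HTML rendering of the same tree; text is re-escaped for the new format."""
    parts = ["<h1>%s</h1>" % escape(root.findtext("title", default=""))]
    parts += ["<p>%s</p>" % escape(p.text or "") for p in root.iter("para")]
    return "\n".join(parts)


root = ET.fromstring(SRC)
```

    Both outputs come from the same parsed tree, so a fix to the source document flows into every format without touching the renderers.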

    XSL is also cool, once you climb the steep learning curve and bend your mind around its declarative style.

    As for native xml db's - that is probably mostly hype.