What Do You Know About Databases And XML?
Dare Obasanjo writes: "XML has become a pervasive part of significant
segments of software development in a relatively short
time. From file formats to network protocols to
programming langauges, the influence of XML has been
felt. I have written an
overview of XML schemas, XML querying languages,
XML-Enabled databases and native XML databases.
Below is a shortened version of the article." Obasanjo's original OODBMS
article
has been updated to reflect more of the disadvantages
between picking an OODBMS over an RDBMS.
Databases are for storing data. End of Story.
Oracle is taking some BIGTIME performance hits for stacking all that OO crap in there, and MS SQL Server is seeing the same thing now that they've got the XML in theirs. Don't believe me?
Why is NASA switching to MySQL from Oracle and noticing speed increases?
Don't get me wrong, I'm a big fan of XML.. as a data interchange format.. but when i want tight storage and quick retrieval, give me a normalized RDBMS any day of the week. Because that's what it's for.
However, citing NASA as a source for technology or trends is a bit silly, for a number of reasons. The primary one is this: NASA is so large, and so diverse, that at one of their sites/on one of their projects they use one of just about every technology product you can name.
I was once running two back-to-back software evaluations for products in the $20-million range. For both applications, the top ten vendors all claimed that their system was "used by NASA for the Space Shuttle". We checked up and guess what - they were all telling the truth.
So you need a better example.
sPh
I hear you. The product that I'm working on right now is XML heavy. It's using entirely proprietary data formats, and the XML processing is taking up 80% of the query time. After achieving full buzzword compliance, we decided that the system is way too slow, and now have to strip the whole bloody lot back out again.
Note that there was no reason to use XML in the first place, other than some designers wanted to put it on their resumes. I kid you not.
If you were blocking sigs, you wouldn't have to read this.
Granted, XML has some advantages. Data interchange among disimilar clients, for one. But storing XML in a database is a gross waste of space and processing power, and is realistically impossible for all but the smallest of databases.
The society for a thought-free internet welcomes you.
xml is an interchange format, not a storage format
Absolutely, positively agree. Not only is XML only an interchange format, but it only makes sense in some situations (for instance if we have an embedded piece of hardware that we have to communicate with, and we're communicating to it from a Windows box, and there is no shared common data encapsulation format, I'd greatly prefer XML (with XSD) vastly over Jimmy the Programmer making up his own data encapsulation format/documentation method/extraction system, but if I have two Windows machines running SQL Server and they're in a common security context and they'll never change, I'd use DTS or replication, not XML).
and MS SQL Server is seeing the same thing now that they've got the XML in theirs
The XML "in" SQL Server is surface fluff (I love SQL Server and I'm saying this as a good thing, not a bad thing). i.e. Some modules that'll convert an XML query to an underlying DB query, and the results back to XML, and some basic XML importing and exporting routines. This hasn't affected the underlying operations of SQL Server whatsoever.
Can we please, please, please append the definition of XML to allow "</>" to close whatever the last tag was?
That simple change would probably cut the size of the average XML file in half.
(corrected post, please moderate my other one down. I have plenty of Karma to spare...)
Sometimes it's best to just let stupid people be stupid.
I can't believe no-one has posted my standard response to someone who thinks XML is just for "interchange".
The interesting thing about XML to me is NOT that it solves the interchange problem (though it helps with that). The great thing is that it solves the PARSING problem. No longer do I have to write a parser everytime I have some simple task of reading in something externally.
What XML does is define for you a standard means of parsing, and by defining the API for parsing and the structure of the documents lets you think about how you want to structure external information, not how you're going to read it in.
Also, because the API for parsing is now hiding the engine details below, parsers can be specialized depending on what kind of task you have. Parsing thousands of 1k XML documents would seem to demand a different processor altogether from a few multi-GB documents, but you only have to know one parser (Ok, really two - SAX and the DOM interface). You could even have specialized XML processors that did write the stream out in a wierd custom binary format for compactness and read it back in with the normal DOM API so clients wouldn't have to adjust. I'll grant you that there don't seem to be many specialized XML processors - yet.
I also like the robustness of XML exchanges (here I'm getting more into your main point). If you add or drop attributes from an XML document, clients that read that document are less likley to break (unless of course they relied entirely on the node(s) you have removed!). That is especially true of XSL, where missing nodes of a document simply correspond to missing parts of output (which can also be a useful effect).
You might think of XSL as a useless language, but I'll be happy to make a counter-prediction that it will grow and thrive. It's simply too useful a transformation tool to do anything else. I know the syntax seems overbearing, but for the kinds of short transformational work it's normally put to that's not much of an issue and you get used to it quickly.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
Separation of content, logic, and presentation is very difficult to do in current web-app developments environments.
The breakdown is not on the logic/content side of the equation, or the presentation/content side, but mainly in the presentation/logic arena.
Imagine an HTML designer who has mocked up a page for a web-app, and hands it off to the dev team for them to add in the neccessary laogic to dynamically include the user-name, current balance, contents of the shopping cart, etc. Depending on the exact paragdigm taht their tools use, they will either:
a) Chop up the page and include various fragments in the programs that are designed to emit said fragments at the opportune times to be assembled into a text stream eventually recived by a browser
or b) Various bits of logic get stuck into the page in oder to parameterize and/or conditionalize it, using either some sort of speacial tagging format or actual inlined blocks of code.
Whichever approach the dev team's tools use, the result is the same: the designer can no longer change the altered page.
Even in case b), which maintains some semblance of a coherent 'page', the designer cannot load the page-with-logic into their favorite visual editor and see anything resembling the actual page. They certainly can't edit it to change the look-and-feel without breaking the carefully constructed logic.
The end result is that the designer has no recourse other than to take their page design, change it, and hand it over to the dev-team again for them to re-include (in some cases re-code) all of their logic.
This is obviously a very wasteful approach.
Amazingly, there actually is a solution to this problem. It's called Template Attribute Language (TAL), and it solves the problem by adding programming directives to the page via XHTML attributes on the existing tags. The language is deliberately designed to only be suitable for presentation logic, relegating business logic code to some other objects, where the designer can't see them. This helps enforce the appropriate distinction between presentation logic and business logic that most current development environments ignore, thus encouraging their admixture.
Currently, TAL (and the related specifications TALES and METAL) are only implemented in one environment, but the language has been deliberately designed to be as platform agnostic as possible. Other implementations of the specification are possible, and even desireable.
Articles:
Zope Page Templates: Getting Started
Zope Page Templates: Advanced Usage
Using Zope with Amaya, Dreamweaver, and other WYSIWYG Tools
The real Webmaven is user ID 27463. I don't rate an imposter, because my ID is such a lame-ass high number.