DTD vs. XML Schema
AShocka writes "The W3C XML Schema Working Group has released the first public Working Draft of Requirements for XML
Schema 1.1. Schemas are a technology for specifying and constraining
the structure of XML documents. The draft adds functionality and
clarifies the XML Schema Recommendation Part 1 and Part 2. The XML Schema Valid FAQ
highlights development issues and resources using XML Schema. This article at webmasterbase.com addresses the
XML DTDs Vs XML Schema issue.
Also see the W3C Conversion Tool from DTD to XML Schema
and other XML Schema/DTD Editors."
There's no "vs."
XML Schemas are much more flexible and powerful.
They're also about 100 times more difficult and confusing.
1. DTD 2. XML Schema 3. CowboyNeal validation (via SOAP over SMTP)
I am a programmer for a commercial company (yes, I like to make money, and I program on WinTel). A year ago, during the XML craze, we converted all our internal protocols to XML. I discovered that XML was just a lot of hype about nothing. There is nothing self-describing about it. Or maybe there is, just like the section names in an INI file describe the keys in them...
On the other hand, the one thing that I did find XML useful for is easy parsing. If you use XML to develop a lower-level protocol you end up with bloated 10k messages. But for high-level protocols or for configuration files it's great for only one reason: there are lots of ready-made tools. If you want to parse XML in Windows, just load the IXMLDocument interface and it works at lightning speed. If you want to parse the messages in a web browser, throw together a quick DOM parser or even use the built-in DOM one! If you want to parse XML in Perl or C/C++ there are great libs. The only reason XML is good is because all the hype got people developing very neat tools. In one of my latest projects, which needs to pass information between two programs written in different languages, I used a home-made SOAP and designed a base class that persists using XML. I developed it in both languages in under an hour!
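For the curious, a base class that persists using XML can be very small when a ready-made parser does the work. This is only a minimal sketch in Python with the standard xml.etree.ElementTree module, not the poster's actual code: the names XmlPersistent and Job are made up for illustration, and every field is stored as plain text.

```python
import xml.etree.ElementTree as ET


class XmlPersistent:
    """Hypothetical base class: each instance attribute becomes a child element."""

    def to_xml(self) -> str:
        root = ET.Element(type(self).__name__)
        for name, value in vars(self).items():
            ET.SubElement(root, name).text = str(value)
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text: str):
        root = ET.fromstring(text)
        obj = cls.__new__(cls)  # skip __init__; the fields come from the XML
        for child in root:
            setattr(obj, child.tag, child.text or "")  # values stay strings
        return obj


class Job(XmlPersistent):  # illustrative subclass, not from the original post
    def __init__(self, name: str, priority: str):
        self.name = name
        self.priority = priority


message = Job("backup", "3").to_xml()
restored = Job.from_xml(message)
print(message)
```

The receiving program, in whatever language, only needs its own XML parser and the same element names.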
So although it wastes bandwidth and there really isn't anything neat about it, it is comfortable, I'll give it that.
God made the natural numbers; all else is the work of man - Kronecker
I think James Clark's RELAX NG and W3C XML Schema is the best description (if slightly biased ;--) of the relative strengths of the two technologies. Note that James Clark also just released a new version of Trang, a tool that does conversions between RELAX NG, W3C XML Schema and DTDs.
Look, that's why there's rules, understand? So that you think before you break 'em. (Terry Pratchett)
Better yet, use S-Expressions.
There are tons of parsers available.
markup is simple:
(this_is_the_tag
this is all data
(except_this_is_a_nested_tag with still more data))
Even better still, there are customizable parsers available that can treat these S-Expressions as data OR interpret them as a program OR a combination of both. One such parser is called "Lisp". Once again, several implementations are available.
Note that things like S-Expressions and Lisp have only been around for 40 years so you might want to give these technologies some time to mature.
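To show how little machinery the markup above needs, here is a rough sketch of a reader in Python (picked only to have something runnable; any Lisp does this natively). It assumes whitespace-separated atoms and balanced parentheses, and it does not preserve the original spacing of the data.

```python
def tokenize(text):
    """Split the text into parentheses and whitespace-separated atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one expression from the front of the token list, consuming it."""
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return node
    return token


source = """
(this_is_the_tag
this is all data
(except_this_is_a_nested_tag with still more data))
"""
tree = parse(tokenize(source))
print(tree)
```

The nested tag comes back as a nested list, so the tag name is simply the first item at each level.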
This approach to interfaces allows systems to interchange messages without exact version consistency, and without requiring a tight congruence of the applications. It allows a system to "tell what it knows" and another system to "read what it needs" without further ado.
Unfortunately, the use of schemas goes against this idea. It is IMHO a more old fashioned approach of rigidly constraining the messages to an exact specification. This can make interfaces far less robust and flexible, and increase the amount of work.
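The "tell what it knows" / "read what it needs" style might look something like the sketch below. It is in Python with the standard library, and the order message and its element names are invented for illustration: the reader pulls out only the fields it understands, skips the element it has never heard of, and falls back to a default for one that is missing.

```python
import xml.etree.ElementTree as ET

# Hypothetical message from a newer sender: it carries a <currency> element
# that this older reader does not know about, and no <note> element.
message = """
<order version="2">
  <id>1042</id>
  <currency>EUR</currency>
  <total>19.90</total>
</order>
"""


def read_order(text):
    """Read only the fields this reader needs; ignore everything else."""
    root = ET.fromstring(text)
    return {
        "id": root.findtext("id"),
        "total": float(root.findtext("total", default="0")),
        "note": root.findtext("note", default=""),
    }


order = read_order(message)
print(order)
```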
If you're talking about using XML for data messaging, not using schemas is just lazy. XML Schema allows optional elements and attributes and/or default values. So if it isn't required, then just make it optional. If you want multiversion interfaces, you have a different XML Schema for each version. Then each side knows explicitly what the messaging protocol is.
While it's probably true that things mostly kinda work if the versions don't match, you shouldn't be relying on this. There's lots of software out there that does this but that doesn't mean it's the ideal.
If you're using XML for markup of documents, schemas are somewhat less useful since the underlying semantics of the tags is usually more important.
I am not a number! I am a man! And don't you
Trimming bloat like namespaces and comments? Are you nuts?
How do you embed MathML in another document (like XHTML)? Currently it's with namespaces. How do you propose to do that without namespaces? Just the prefixes? What happens when two different markups use the same prefix? Wups! You're screwed!
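A quick way to see why the namespace URI matters and the prefix doesn't, sketched in Python with the standard library. The three tiny documents are made up for illustration; the two URIs are the real XHTML and MathML namespace names, and urn:example:other is a placeholder.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/1998/Math/MathML"

# Same content, different prefix choices.
doc_a = f'<h:p xmlns:h="{XHTML}" xmlns:m="{MATHML}">x is <m:math><m:mi>x</m:mi></m:math></h:p>'
doc_b = f'<p xmlns="{XHTML}">x is <math xmlns="{MATHML}"><mi>x</mi></math></p>'
# Same "m" prefix as doc_a, but bound to some other vocabulary.
doc_c = '<m:math xmlns:m="urn:example:other"/>'


def tags(text):
    """Return the fully qualified {namespace}name of every element."""
    return [el.tag for el in ET.fromstring(text).iter()]


print(tags(doc_a))
print(tags(doc_a) == tags(doc_b))  # prefixes differ, names are identical
print(tags(doc_c))                 # same prefix, different namespace
```

With bare prefixes there would be no way to tell doc_c's math element from the MathML one.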
No comments? This is supposed to make a better alternative to XML? It won't help readability, and it certainly isn't a major bottleneck during parsing.
Don't want the "bloat" of namespaces and comments? Wait for it... Wait for it... Don't use namespaces and comments in your documents! Wow! What a concept!
Maybe no Unicode in PXML, huh? So much for interoperability for any kind of data. You don't ever want your pet project used in East Asia (or Russia or Greece or most other places in the world), do you? Unicode too bloated? Why not just use ISO-8859-15 (basically Latin-1 with a Euro sign, a character that incidentally isn't available in ASCII)? Oh wait! That's right. You don't want to allow processing instructions, which in XML tell you what encoding is used.
What happens if you want to change some of the basic syntax of PXML? Because you've nuked processing instructions, you can't specify a markup version like you can in XML.
Yes, yes. We've all seen your little pet project. I hope it was just a class assignment.
- I don't need to go outside, my CRT tan'll do me just fine.
I can't believe nobody's mentioned this yet. Microsoft has a tool that will do several things:
This makes writing your XSD almost trivial. The code-generation capabilities are very powerful, as well, as you can generate runtime classes for serialization/deserialization or classes derived from DataSet so you can treat XML files like any other database, etc. It's very useful if you're doing any
I'd be very surprised if there weren't other tools out there doing similar things. I simply mentioned xsd.exe because that's what I'm familiar with.
This is a misunderstanding of the way schema validation is supposed to work. Schemas have what are called "location hints", which should be used in case you have never before encountered a particular namespace. The key word, however, is "hints" - i.e. you should never have to remotely obtain a schema if you don't need to.
..."master" XSD schema... you never ever have to get it remotely - the parser should be implementing it already...
In most cases, if you are doing schema validation, you already know what schema you can expect, so they should be not only locally available, but also cached in memory...
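Treating the location as a hint could look roughly like this. It is a sketch in Python with the standard library that only resolves where a schema would come from and does no validation; the LOCAL_SCHEMAS catalog, the file path and the urn:example namespaces are all made up for illustration. The xsi:schemaLocation attribute itself is a list of namespace/location pairs.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical local catalog: namespace URI -> schema file already on disk.
LOCAL_SCHEMAS = {"urn:example:orders": "schemas/orders.xsd"}


def resolve_schemas(text):
    """Map each hinted namespace to a local schema if known, else to the hint."""
    root = ET.fromstring(text)
    parts = root.get(f"{{{XSI}}}schemaLocation", "").split()
    hints = dict(zip(parts[0::2], parts[1::2]))  # pairs: namespace, location
    return {ns: LOCAL_SCHEMAS.get(ns, hint) for ns, hint in hints.items()}


doc = (
    '<orders xmlns="urn:example:orders" '
    f'xmlns:xsi="{XSI}" '
    'xsi:schemaLocation="urn:example:orders http://example.com/orders.xsd '
    'urn:example:other http://example.com/other.xsd"/>'
)
resolved = resolve_schemas(doc)
print(resolved)
```

Only the namespace that is missing from the catalog falls back to the remote hint.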
As for the
In my experience, many benefits of XML come when dealing with the presentation layers of many application architectures, with the ability to repurpose syndicated data at will. Here are a few examples:
Effective use of XML and XSLT allows you to easily aggregate informational data from one or multiple sources and "repurpose" for an infinite variety of business and technological goals.
One of the main benefits of XML is that it offers an effective, textual representation of "structured data" that can be conveniently accessed and manipulated according to a slew of surrounding standards such as XPath, DOM, XSLT and namespaces.
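As a small illustration of aggregating data from more than one source, here is a sketch in Python using the limited XPath subset that the standard xml.etree.ElementTree module supports, rather than XSLT. The two feeds and their element and attribute names are invented.

```python
import xml.etree.ElementTree as ET

# Two hypothetical syndicated feeds with the same simple structure.
feed_a = (
    "<feed><item cat='news'><title>Schema 1.1 draft</title></item>"
    "<item cat='ads'><title>Buy now</title></item></feed>"
)
feed_b = "<feed><item cat='news'><title>Trang released</title></item></feed>"


def news_titles(*feeds):
    """Collect the titles of all 'news' items across the given feeds."""
    titles = []
    for text in feeds:
        root = ET.fromstring(text)
        titles += [t.text for t in root.findall("./item[@cat='news']/title")]
    return titles


headlines = news_titles(feed_a, feed_b)
print(headlines)
```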
Absolutely. All the possible attributes, and kids of any element are there in one (OK, two) place(s) and you can garner the information about any element in a matter of seconds. With XML Schema you have to keep track of the levels of nesting and rifle through a series of name/value pairs to get the same information. It is in its greater expressiveness that the advantage of XSD is seen to lie. And there might be applications where this expressiveness necessitates the use of XSD.
However, XML Schema has, besides this expressiveness, one other great advantage: it is XML. As such it can be processed with the same XML tools one uses elsewhere with an XML application.
As an example, in one application, I take a DTD, translate it into XSD, and then run an XSL stylesheet over the XSD file to generate some base code used in my application. In this way I can ensure that my code will automatically be changed to reflect any minor changes made to my Schema.
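The same idea can be sketched without XSLT. The snippet below is not the stylesheet described above; it is an illustrative stand-in in Python that reads a made-up XSD with the standard xml.etree.ElementTree module and emits a class stub, treating minOccurs="0" as "optional". It assumes a single top-level element with a flat sequence of children.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema, of the kind a DTD-to-XSD conversion might produce.
xsd = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="year" type="xs:int" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def generate_class(schema_text):
    """Emit a Python class stub for the first top-level element."""
    schema = ET.fromstring(schema_text)
    top = schema.find(f"{XS}element")
    fields = top.findall(f".//{XS}sequence/{XS}element")
    lines = [f"class {top.get('name').capitalize()}:"]
    for field in fields:
        optional = field.get("minOccurs") == "0"
        default = " = None" if optional else ""
        lines.append(f"    {field.get('name')}: object{default}")
    return "\n".join(lines)


code = generate_class(xsd)
print(code)
```

Rerunning the generator after a schema change regenerates the stub, which is the point of keeping the schema in a machine-readable XML form.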
So while I continue to write DTDs, I look on XML Schema as a way to translate, and bring my DTD into the XML universe, with all its attendant advantages.
Better to be despised for too anxious apprehensions, than ruined by too confident a security. --Edmund Burke