DTD vs. XML Schema
AShocka writes "The W3C XML Schema Working Group has released the first public Working Draft of Requirements for XML
Schema 1.1. Schemas are technology for specifying and constraining
the structure of XML documents. The draft adds functionality and
clarifies the XML Schema Recommendation Part 1 and Part 2. The XML Schema Valid FAQ
highlights development issues and resources using XML Schema. This article at webmasterbase.com addresses the
XML DTDs Vs XML Schema issue.
Also see the W3C Conversion Tool from DTD to XML Schema
and other XML Schema/DTD Editors."
While the W3C continues to push Schema, they are also forming working groups for RELAX after pressure from XML luminaries such as James Clark.
XML Schema is also kinda whacked. It shows all the signs of being a committee specification.
The big problem with schema is that you actually have two type systems going. Element definitions are types for elements. Type definitions are actually types for types for elements. I saw a hopelessly confused attempt by some UML people to express XML schema in UML; they simply could not understand that there was no way it could ever work. UML has completely different semantics.
There are a bunch of schema proposals that folk have said good things about. Eve keeps telling me I should look at Relax. But for the time being XML schema is going to be the basis for standards in W3C and OASIS.
There might be an opportunity to do a cleanup job on XML schema in 4 or 5 years, but that will only happen if it is causing real problems.
Looking for an Information Security student project suggestion?
Try http://dotcrimeManifesto.com/
Better yet, use S-Expressions.
There are tons of parsers available.
Markup is simple:
(this_is_the_tag
  this is all data
  (except_this_is_a_nested_tag with still more data))
Even better still, there are customizable parsers available that can treat these S-Expressions as data OR interpret them as a program OR a combination of both. One such parser is called "Lisp". Once again, several implementations are available.
Note that things like S-Expressions and Lisp have only been around for 40 years so you might want to give these technologies some time to mature.
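For anyone who wants to see how little machinery the "S-Expressions as data" reading needs, here is a minimal sketch in Python. It is not any particular Lisp reader: the function names `tokenize` and `parse` are made up for illustration, data is split on whitespace into bare words, and there is no support for quoting, escaping, or comments.

```python
def tokenize(text):
    """Split an S-expression string into "(", ")" and bare-word tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume the tokens of one expression and return it as nested lists."""
    if not tokens:
        raise ValueError("unexpected end of input")
    token = tokens.pop(0)
    if token == ")":
        raise ValueError("unexpected )")
    if token != "(":
        return token  # a bare word is a leaf
    node = []
    while tokens and tokens[0] != ")":
        node.append(parse(tokens))
    if not tokens:
        raise ValueError("missing closing )")
    tokens.pop(0)  # discard the closing ")"
    return node


source = """
(this_is_the_tag
this is all data
(except_this_is_a_nested_tag with still more data))
"""

tree = parse(tokenize(source))
print(tree)
```

The first item of each list plays the role of the tag, and the remaining items are either words of data or nested lists.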
It's occurred to me maybe we are being too diligent in actually validating the schema itself, but I'm wondering what others think?
I can't believe nobody's mentioned this yet. Microsoft has a tool that will do several things:
This makes writing your XSD almost trivial. The code-generation capabilities are very powerful, as well, as you can generate runtime classes for serialization/deserialization or classes derived from DataSet so you can treat XML files like any other database, etc. It's very useful if you're doing any
I'd be very surprised if there weren't other tools out there doing similar things. I simply mentioned xsd.exe because that's what I'm familiar with.
The great thing about Lisp is that if you need to convert your communications, you can write Lisp against them to do the conversion while you convert your Lisp source... easily.
I plopped an XSLT processor in front of it. Took minutes to implement. In the meantime, I was able to properly rewrite the XML producing code. So I had some flexibility in terms of patching the protocol quickly, while taking the weeks I needed to fix things right.
I plopped a Lisp processor in front of it. Took minutes to implement. In the meantime, I was able to properly rewrite the Lisp producing code. So I had some flexibility in terms of patching the protocol quickly, while taking the weeks I needed to fix things right.
The point is, XML IS descriptive, so long as you use good names.
The point is, Lisp IS descriptive, so long as you use good names.
If you use XML to develop a lower level protocol you end up with bloated 10k messages.
If you use Lisp S-expressions to develop a lower level protocol you don't end up with bloated 10k messages.
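A rough way to see where the size difference comes from is to serialize the same small record both ways and count bytes. This is only a sketch, not a benchmark: the `order` record and its field names are hypothetical, the helpers `to_xml` and `to_sexp` are made up for illustration, each leaf holds a single text value, and the S-expression side does no quoting or escaping.

```python
import xml.etree.ElementTree as ET


def to_xml(node):
    """Build an Element from a nested list of the form [tag, child_or_text, ...]."""
    elem = ET.Element(node[0])
    for item in node[1:]:
        if isinstance(item, list):
            elem.append(to_xml(item))
        else:
            elem.text = str(item)  # leaves carry one text value in this sketch
    return elem


def to_sexp(node):
    """Render the same nested list as an S-expression string."""
    head, *rest = node
    parts = [to_sexp(item) if isinstance(item, list) else str(item) for item in rest]
    return "(" + " ".join([head] + parts) + ")"


record = ["order", ["id", "42"], ["customer", "alice"], ["total", "19.99"]]

xml_text = ET.tostring(to_xml(record), encoding="unicode")
sexp_text = to_sexp(record)

print(xml_text, len(xml_text.encode("utf-8")))
print(sexp_text, len(sexp_text.encode("utf-8")))
```

The extra bytes on the XML side come from repeating each tag name in the closing tag. Whether that matters depends on the protocol, and compression narrows the gap considerably.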
Besides, in Common Lisp you'll really appreciate MOP - Meta-Object Protocol. Much better than SOAP.
Trust me, I know well, actively use and actually love both Lisp *AND* XML.
Less is more!
that the same applications of XML that drive the keening about bloat and hype seen in these comments are precisely those which are driving the specs to the wrong side of the 80/20 for XML/XSL's original goals: bringing the semantic power of SGML and DSSSL to the Web. Goals for which its purist cousins RelaxNG, REST, et al. remain admirably suited.
The back-end curmudgeons are right, XML stinks for a universal wire format. But for loosely-coupled, message-based, semantically-rich systems it is hard to beat. And document-oriented systems which don't use XML barely deserve notice any longer.
I gently refer s-expression trolls to Paul and Oleg.
illegitimii non ingravare
There are tons of parsers available.
How does one specify the character set in some S-Expression markup, imagined or real? Do these "tons of parsers" support Unicode at least? Where do processing instructions go? Character entities? External entities? "Raw data" sections with markup suppressed? How does one specify the document type identifier? Namespaces? All these things fulfill important tasks that let XML be a universal, yet concise, markup language, and adding them all can make your dreamt-up S-Expression language as contrived as XML is sometimes perceived to be. Attributes, I presume, are of no concern to us? You note that the means for syntactic description of data trees have been around for 40 years. Yet there was a yearning for something more... handy, or something. Doesn't that give you a hint?
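To make a few of those questions concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`. It shows a stock XML parser handling an encoding declaration, a namespace, an attribute, a numeric character reference, and a CDATA ("raw data") section. The namespace URI and element names are invented for illustration, and processing instructions, external entities, and document type declarations are not covered here.

```python
import xml.etree.ElementTree as ET

# The document declares its own encoding, binds a namespace prefix,
# carries an attribute, and mixes a character reference with CDATA.
document = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<msg:note xmlns:msg="http://example.com/ns/msg" lang="en">'
    '<msg:body>caf&#233; <![CDATA[<not-a-tag>]]></msg:body>'
    '</msg:note>'
).encode("utf-8")

root = ET.fromstring(document)

# ElementTree expands prefixed names to the form {namespace-uri}localname.
body = root.find("{http://example.com/ns/msg}body")

print(root.tag)             # namespace-qualified element name
print(root.attrib["lang"])  # attribute value
print(body.text)            # character reference resolved, CDATA kept literal
```

An S-Expression format would need its own conventions for each of these before two independent parsers could agree on the same input.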
My exception safety is -fno-exceptions.