Domain: thaiopensource.com
Stories and comments across the archive that link to thaiopensource.com.
Comments · 24
-
Re:So let me get this rightNow, if you really want to complain about OOXML, what about the lack of a compliance test suite? There's a set of RELAX NG XML Schemas for the Transitional and Strict OOXML specs that are trivial enough to use with something like jing; what more do you want?
-
Make working with XML suck less...
"XML is really just data dressed up as a hooker."
XML does suck if you stick with some of the W3C standards and common tools. Suggestions to make it less painful:
- Ditch W3C's XML Schema
W3C Schema is painful; it forces object-oriented design concepts onto a hierarchical data model. Consider RELAX NG (an Oasis-approved standard) instead; it's delightful in comparison. Use the verbose XML syntax when communicating with the less technical - if you've seen XML before, it's pretty easy to comprehend:
<r:optional>
<r:element name="w3cSchemaDescription">
<r:choice>
<r:value>painful</r:value>
<r:value>ugly</r:value>
<r:value>inflexible</r:value>
</r:choice>
</r:element>
</r:optional>Switch to the compact syntax when you're among geeks:
element w3cSchemaDescription { "painful" | "ugly" | "inflexible" }?
There's validation support on major platforms, and even a tool (Trang) to convert between verbose/compact formats, and output to DTD and W3C Schemas. And, if you need to specify data types, it borrows the one technology W3C Schema got right: the Datatypes library.
- Don't use the W3C DOM
The W3C DOM attempts to be a universal API, which means it must conform to the lowest common denominator in the programming languages it targets. Consider the NodeList interface:
interface NodeList {
Node item(in unsigned long index);
readonly attribute unsigned long length;
};While similar to the native list/collection/array interfaces most languages provide, it's not an exact match. So, DOM implementers create an object that doesn't work quite like any other collection on the platform. In Java, this means writing:
for(int i = 0; i < nodeList.length(); i++)
{
Node node = nodeList.item(i); // Do something with node here...
}Instead of:
for(Node node : nodeList)
{ // Do something with node here...
}Dynamic languages allow an even more concise syntax. Consider this Ruby builder code to build a trivial XML document:
x.date {
x.year "2006"
x.month "01"
x.day "01"
}I thought about writing the W3C DOM equivalent of the above, but I'm not feeling masochistic tonight. Sorry.
The alternatives depend on your programming language, but plenty of choices exist for DOM-style traversal/manipulation.
- Forget document models entirely (maybe)
In-memory object models of large XML document can consume a lot of resources, but often, you only need part of the data. Consider using an XMLPull or StAX parser instead. Pull means you control the document traversal, only descending into (and fully parsing) sections of the XML that are of interest. SAX based parsers have equivalent capabilities, but the programming model is uncomfortable for many developers.
Even better, some Pull processors are wicked fast, even when using them to construct a DOM. In Winter 2006, I benchmarked an XML-heavy application, and found WoodStox to be an order of magnitude faster at constructing thousands of small DOM4J documents
- Ditch W3C's XML Schema
-
Maximizing Composability and Relax NG Trivia
Tim Bray is right, and he couldn't have put it better: W3C XML Schemas (XSD) suck. The reason Relax NG is so much cleaner and more powerful than committee-designed XML Schemas, is that it's based on a sound mathematical foundation (tree regular expressions, or "hedge automata theory"). While XML-Schemas suffer from ad-hoc design, committee-burn, lack of focus, and half-baked attempts to solve too many unrelated problems.
Here's some interesting stuff from my blog about the design and development of Relax NG.
-Don
James Clark wrote about maximizing composability:
First, a little digression. In general, I have made it a design principle in TREX to maximize "composability". It's a little bit hard to describe. The idea is that a language provides a number of different kinds of atomic thing, and a number different ways to compose new things out of other things. Maximizing composability means minimizing restrictions on which ways to compose things can be applied to which kinds of thing. Maximizing composability tends to improve the ratio between functionality on the one hand and simplicity/ease of use/ease of learning on the other.
Clark describes the derivative algorithm's lazy approach to automaton construction:
I don't agree that <interleave> makes automation-based implementations impossible; it just means you have to construct automatons lazily. (In fact, you can view the "derivative"-based approach in JTREX as lazily constructing a kind of automaton where states are represented by a canonical representative of the patterns that match the remaining input.)
The Relax NG derivative algorithm is implemented in a few hundred elegent declarative functional lines of Haskel, and also in tens of thousands of lines and hundreds of classes of highly abstract complex Java code.
Clark's Java implementation of Relax NG is called "jing", which is a Thai word meaning truthful, real, serious, no-nonsense, and ending with "ng".
Comparing the Java and Haskell implementations of Relax NG illustrates what a wicked cool and powerful language Haskell really is. The Java code must explicitly model and simulate many Haskel features like first order functions, memoization, pattern matching, partial evaluation, lazy evaluation, declarative programming, and functional programming. That requires many abstract interfaces,, concrete classes and brittle lines of code.
While the Java code is quite brittle and verbose, the Haskell code is extremely flexible and concise. Haskell is an excellent design language, a vehicle for exploring complex problem spaces, designing and testing ingenious solutions, performing practical experiments, weighin
-
Maximizing Composability and Relax NG Trivia
Tim Bray is right, and he couldn't have put it better: W3C XML Schemas (XSD) suck. The reason Relax NG is so much cleaner and more powerful than committee-designed XML Schemas, is that it's based on a sound mathematical foundation (tree regular expressions, or "hedge automata theory"). While XML-Schemas suffer from ad-hoc design, committee-burn, lack of focus, and half-baked attempts to solve too many unrelated problems.
Here's some interesting stuff from my blog about the design and development of Relax NG.
-Don
James Clark wrote about maximizing composability:
First, a little digression. In general, I have made it a design principle in TREX to maximize "composability". It's a little bit hard to describe. The idea is that a language provides a number of different kinds of atomic thing, and a number different ways to compose new things out of other things. Maximizing composability means minimizing restrictions on which ways to compose things can be applied to which kinds of thing. Maximizing composability tends to improve the ratio between functionality on the one hand and simplicity/ease of use/ease of learning on the other.
Clark describes the derivative algorithm's lazy approach to automaton construction:
I don't agree that <interleave> makes automation-based implementations impossible; it just means you have to construct automatons lazily. (In fact, you can view the "derivative"-based approach in JTREX as lazily constructing a kind of automaton where states are represented by a canonical representative of the patterns that match the remaining input.)
The Relax NG derivative algorithm is implemented in a few hundred elegent declarative functional lines of Haskel, and also in tens of thousands of lines and hundreds of classes of highly abstract complex Java code.
Clark's Java implementation of Relax NG is called "jing", which is a Thai word meaning truthful, real, serious, no-nonsense, and ending with "ng".
Comparing the Java and Haskell implementations of Relax NG illustrates what a wicked cool and powerful language Haskell really is. The Java code must explicitly model and simulate many Haskel features like first order functions, memoization, pattern matching, partial evaluation, lazy evaluation, declarative programming, and functional programming. That requires many abstract interfaces,, concrete classes and brittle lines of code.
While the Java code is quite brittle and verbose, the Haskell code is extremely flexible and concise. Haskell is an excellent design language, a vehicle for exploring complex problem spaces, designing and testing ingenious solutions, performing practical experiments, weighin
-
Maximizing Composability and Relax NG Trivia
Tim Bray is right, and he couldn't have put it better: W3C XML Schemas (XSD) suck. The reason Relax NG is so much cleaner and more powerful than committee-designed XML Schemas, is that it's based on a sound mathematical foundation (tree regular expressions, or "hedge automata theory"). While XML-Schemas suffer from ad-hoc design, committee-burn, lack of focus, and half-baked attempts to solve too many unrelated problems.
Here's some interesting stuff from my blog about the design and development of Relax NG.
-Don
James Clark wrote about maximizing composability:
First, a little digression. In general, I have made it a design principle in TREX to maximize "composability". It's a little bit hard to describe. The idea is that a language provides a number of different kinds of atomic thing, and a number different ways to compose new things out of other things. Maximizing composability means minimizing restrictions on which ways to compose things can be applied to which kinds of thing. Maximizing composability tends to improve the ratio between functionality on the one hand and simplicity/ease of use/ease of learning on the other.
Clark describes the derivative algorithm's lazy approach to automaton construction:
I don't agree that <interleave> makes automation-based implementations impossible; it just means you have to construct automatons lazily. (In fact, you can view the "derivative"-based approach in JTREX as lazily constructing a kind of automaton where states are represented by a canonical representative of the patterns that match the remaining input.)
The Relax NG derivative algorithm is implemented in a few hundred elegent declarative functional lines of Haskel, and also in tens of thousands of lines and hundreds of classes of highly abstract complex Java code.
Clark's Java implementation of Relax NG is called "jing", which is a Thai word meaning truthful, real, serious, no-nonsense, and ending with "ng".
Comparing the Java and Haskell implementations of Relax NG illustrates what a wicked cool and powerful language Haskell really is. The Java code must explicitly model and simulate many Haskel features like first order functions, memoization, pattern matching, partial evaluation, lazy evaluation, declarative programming, and functional programming. That requires many abstract interfaces,, concrete classes and brittle lines of code.
While the Java code is quite brittle and verbose, the Haskell code is extremely flexible and concise. Haskell is an excellent design language, a vehicle for exploring complex problem spaces, designing and testing ingenious solutions, performing practical experiments, weighin
-
Maximizing Composability and Relax NG Trivia
Tim Bray is right, and he couldn't have put it better: W3C XML Schemas (XSD) suck. The reason Relax NG is so much cleaner and more powerful than committee-designed XML Schemas, is that it's based on a sound mathematical foundation (tree regular expressions, or "hedge automata theory"). While XML-Schemas suffer from ad-hoc design, committee-burn, lack of focus, and half-baked attempts to solve too many unrelated problems.
Here's some interesting stuff from my blog about the design and development of Relax NG.
-Don
James Clark wrote about maximizing composability:
First, a little digression. In general, I have made it a design principle in TREX to maximize "composability". It's a little bit hard to describe. The idea is that a language provides a number of different kinds of atomic thing, and a number different ways to compose new things out of other things. Maximizing composability means minimizing restrictions on which ways to compose things can be applied to which kinds of thing. Maximizing composability tends to improve the ratio between functionality on the one hand and simplicity/ease of use/ease of learning on the other.
Clark describes the derivative algorithm's lazy approach to automaton construction:
I don't agree that <interleave> makes automation-based implementations impossible; it just means you have to construct automatons lazily. (In fact, you can view the "derivative"-based approach in JTREX as lazily constructing a kind of automaton where states are represented by a canonical representative of the patterns that match the remaining input.)
The Relax NG derivative algorithm is implemented in a few hundred elegent declarative functional lines of Haskel, and also in tens of thousands of lines and hundreds of classes of highly abstract complex Java code.
Clark's Java implementation of Relax NG is called "jing", which is a Thai word meaning truthful, real, serious, no-nonsense, and ending with "ng".
Comparing the Java and Haskell implementations of Relax NG illustrates what a wicked cool and powerful language Haskell really is. The Java code must explicitly model and simulate many Haskel features like first order functions, memoization, pattern matching, partial evaluation, lazy evaluation, declarative programming, and functional programming. That requires many abstract interfaces,, concrete classes and brittle lines of code.
While the Java code is quite brittle and verbose, the Haskell code is extremely flexible and concise. Haskell is an excellent design language, a vehicle for exploring complex problem spaces, designing and testing ingenious solutions, performing practical experiments, weighin
-
Re:I have to agree.Yes. I've done it using Relax NG and it was easy, simple and readable.
It also works really, really well with the nXML mode for emacs.
Finally, XML schemas in a way that are not verbose, ugly and unreadable. And if you do need one of the older schema languages there are translators from RelaxNG available.
-
Re:Check out Oxygen, its a cross platform XML edit
Even better than Oxygen is nxml-mode for emacs, written by James Clark (of expat fame).
http://www.thaiopensource.com/nxml-mode/ -
Paging James Clark...
I hope that James Clark will be able to help correct the situation.
In case you haven't heard of James Clark, he wrote groff (for displaying man pages amongst other things), XSLT, the expat XML Parser and the Relax NG schema language. I'd be very surprised if anybody here hasn't used his stuff... Take a look at his bio.
-Dom
-
Haskell rocks!
That's quite true: Haskell's notion of pattern matching is much more powerful and extensible than mere regular expressions.
Jim Clark's Haskell implementation of the derivative algorithm for validating Relax NG is a wonderful example of how powerful, elegant and concise Haskell is. Relax NG is all about tree structured regular expressions over "hedges" (mixed trees of XML elements and text). It's based on the same automata theory as regular expressions, extended to describe "hedges" (XML documents).
An Algorithm for RELAX NG Validation By James Clark, January 07, 2001. Author's note to XML-DEV: 'I have written a paper describing one possible algorithm for implementing RELAX NG validation. This is the algorithm used by Jing, which I believe has also been adopted by MSV... If you try to use this to implement RELAX NG and something isn't clear, let me know and I'll try to improve the description.' From the introduction: "This document describes an algorithm for validating an XML document against a RELAX NG schema. This algorithm is based on the idea of what's called a derivative (sometimes called a residual). It is not the only possible algorithm for RELAX NG validation. This document does not describe any algorithms for transforming a RELAX NG schema into simplified form, nor for determining whether a RELAX NG schema is correct. We use Haskell to describe the algorithm. Do not worry if you don't know Haskell; we use only a tiny subset which should be easily understandable." Jing is a validator for RELAX NG implemented in Java;
Another interesting Haskell library for XML is HaXml:
Consider Haskell in lieu of DOM, SAX, or XSLT for processing XML data. The library HaXml creates representations of XML documents as native recursive data structures in the functional language Haskell. HaXml brings with it a set of powerful higher order functions for operating on these "datafied" XML documents. Many of the HaXml techniques are far more elegant, compact, and powerful than the ones found in familiar techniques like DOM, SAX, or XSLT. Code samples demonstrate the techniques.
Here's some more stuff about Relax NG, comparing the Haskell implementation to the Java implementation (jing): Maximizing Composability and Relax NG Trivia:
Here's some interesting stuff about the design and development of Relax NG:
James Clark wrote about maximizing composability:
First, a little digression. In general, I have made it a design principle in TREX to maximize "composability". It's a little bit hard to describe. The idea is that a language provides a number of different kinds of atomic thing, and a number different ways to compose new things out of other things. Maximizing composability means minimizing restrictions on which ways to compose things can be applied to which kinds of thing. Maximizing composability tends to improve the ratio between functionality on the one hand and simplicity/ease of use/ease of learning on the other.
Clark describes the derivative algorithm's lazy approach to automaton construction:
I don't agree that <interleave> makes automation-based implementations impossible; it just means you have to construct automatons lazily. (In fact, you can view the
-
Haskell rocks!
That's quite true: Haskell's notion of pattern matching is much more powerful and extensible than mere regular expressions.
Jim Clark's Haskell implementation of the derivative algorithm for validating Relax NG is a wonderful example of how powerful, elegant and concise Haskell is. Relax NG is all about tree structured regular expressions over "hedges" (mixed trees of XML elements and text). It's based on the same automata theory as regular expressions, extended to describe "hedges" (XML documents).
An Algorithm for RELAX NG Validation By James Clark, January 07, 2001. Author's note to XML-DEV: 'I have written a paper describing one possible algorithm for implementing RELAX NG validation. This is the algorithm used by Jing, which I believe has also been adopted by MSV... If you try to use this to implement RELAX NG and something isn't clear, let me know and I'll try to improve the description.' From the introduction: "This document describes an algorithm for validating an XML document against a RELAX NG schema. This algorithm is based on the idea of what's called a derivative (sometimes called a residual). It is not the only possible algorithm for RELAX NG validation. This document does not describe any algorithms for transforming a RELAX NG schema into simplified form, nor for determining whether a RELAX NG schema is correct. We use Haskell to describe the algorithm. Do not worry if you don't know Haskell; we use only a tiny subset which should be easily understandable." Jing is a validator for RELAX NG implemented in Java;
Another interesting Haskell library for XML is HaXml:
Consider Haskell in lieu of DOM, SAX, or XSLT for processing XML data. The library HaXml creates representations of XML documents as native recursive data structures in the functional language Haskell. HaXml brings with it a set of powerful higher order functions for operating on these "datafied" XML documents. Many of the HaXml techniques are far more elegant, compact, and powerful than the ones found in familiar techniques like DOM, SAX, or XSLT. Code samples demonstrate the techniques.
Here's some more stuff about Relax NG, comparing the Haskell implementation to the Java implementation (jing): Maximizing Composability and Relax NG Trivia:
Here's some interesting stuff about the design and development of Relax NG:
James Clark wrote about maximizing composability:
First, a little digression. In general, I have made it a design principle in TREX to maximize "composability". It's a little bit hard to describe. The idea is that a language provides a number of different kinds of atomic thing, and a number different ways to compose new things out of other things. Maximizing composability means minimizing restrictions on which ways to compose things can be applied to which kinds of thing. Maximizing composability tends to improve the ratio between functionality on the one hand and simplicity/ease of use/ease of learning on the other.
Clark describes the derivative algorithm's lazy approach to automaton construction:
I don't agree that <interleave> makes automation-based implementations impossible; it just means you have to construct automatons lazily. (In fact, you can view the
-
Haskell rocks!
That's quite true: Haskell's notion of pattern matching is much more powerful and extensible than mere regular expressions.
Jim Clark's Haskell implementation of the derivative algorithm for validating Relax NG is a wonderful example of how powerful, elegant and concise Haskell is. Relax NG is all about tree structured regular expressions over "hedges" (mixed trees of XML elements and text). It's based on the same automata theory as regular expressions, extended to describe "hedges" (XML documents).
An Algorithm for RELAX NG Validation By James Clark, January 07, 2001. Author's note to XML-DEV: 'I have written a paper describing one possible algorithm for implementing RELAX NG validation. This is the algorithm used by Jing, which I believe has also been adopted by MSV... If you try to use this to implement RELAX NG and something isn't clear, let me know and I'll try to improve the description.' From the introduction: "This document describes an algorithm for validating an XML document against a RELAX NG schema. This algorithm is based on the idea of what's called a derivative (sometimes called a residual). It is not the only possible algorithm for RELAX NG validation. This document does not describe any algorithms for transforming a RELAX NG schema into simplified form, nor for determining whether a RELAX NG schema is correct. We use Haskell to describe the algorithm. Do not worry if you don't know Haskell; we use only a tiny subset which should be easily understandable." Jing is a validator for RELAX NG implemented in Java;
Another interesting Haskell library for XML is HaXml:
Consider Haskell in lieu of DOM, SAX, or XSLT for processing XML data. The library HaXml creates representations of XML documents as native recursive data structures in the functional language Haskell. HaXml brings with it a set of powerful higher order functions for operating on these "datafied" XML documents. Many of the HaXml techniques are far more elegant, compact, and powerful than the ones found in familiar techniques like DOM, SAX, or XSLT. Code samples demonstrate the techniques.
Here's some more stuff about Relax NG, comparing the Haskell implementation to the Java implementation (jing): Maximizing Composability and Relax NG Trivia:
Here's some interesting stuff about the design and development of Relax NG:
James Clark wrote about maximizing composability:
First, a little digression. In general, I have made it a design principle in TREX to maximize "composability". It's a little bit hard to describe. The idea is that a language provides a number of different kinds of atomic thing, and a number different ways to compose new things out of other things. Maximizing composability means minimizing restrictions on which ways to compose things can be applied to which kinds of thing. Maximizing composability tends to improve the ratio between functionality on the one hand and simplicity/ease of use/ease of learning on the other.
Clark describes the derivative algorithm's lazy approach to automaton construction:
I don't agree that <interleave> makes automation-based implementations impossible; it just means you have to construct automatons lazily. (In fact, you can view the
-
Haskell rocks!
That's quite true: Haskell's notion of pattern matching is much more powerful and extensible than mere regular expressions.
Jim Clark's Haskell implementation of the derivative algorithm for validating Relax NG is a wonderful example of how powerful, elegant and concise Haskell is. Relax NG is all about tree structured regular expressions over "hedges" (mixed trees of XML elements and text). It's based on the same automata theory as regular expressions, extended to describe "hedges" (XML documents).
An Algorithm for RELAX NG Validation By James Clark, January 07, 2001. Author's note to XML-DEV: 'I have written a paper describing one possible algorithm for implementing RELAX NG validation. This is the algorithm used by Jing, which I believe has also been adopted by MSV... If you try to use this to implement RELAX NG and something isn't clear, let me know and I'll try to improve the description.' From the introduction: "This document describes an algorithm for validating an XML document against a RELAX NG schema. This algorithm is based on the idea of what's called a derivative (sometimes called a residual). It is not the only possible algorithm for RELAX NG validation. This document does not describe any algorithms for transforming a RELAX NG schema into simplified form, nor for determining whether a RELAX NG schema is correct. We use Haskell to describe the algorithm. Do not worry if you don't know Haskell; we use only a tiny subset which should be easily understandable." Jing is a validator for RELAX NG implemented in Java;
Another interesting Haskell library for XML is HaXml:
Consider Haskell in lieu of DOM, SAX, or XSLT for processing XML data. The library HaXml creates representations of XML documents as native recursive data structures in the functional language Haskell. HaXml brings with it a set of powerful higher order functions for operating on these "datafied" XML documents. Many of the HaXml techniques are far more elegant, compact, and powerful than the ones found in familiar techniques like DOM, SAX, or XSLT. Code samples demonstrate the techniques.
Here's some more stuff about Relax NG, comparing the Haskell implementation to the Java implementation (jing): Maximizing Composability and Relax NG Trivia:
Here's some interesting stuff about the design and development of Relax NG:
James Clark wrote about maximizing composability:
First, a little digression. In general, I have made it a design principle in TREX to maximize "composability". It's a little bit hard to describe. The idea is that a language provides a number of different kinds of atomic thing, and a number different ways to compose new things out of other things. Maximizing composability means minimizing restrictions on which ways to compose things can be applied to which kinds of thing. Maximizing composability tends to improve the ratio between functionality on the one hand and simplicity/ease of use/ease of learning on the other.
Clark describes the derivative algorithm's lazy approach to automaton construction:
I don't agree that <interleave> makes automation-based implementations impossible; it just means you have to construct automatons lazily. (In fact, you can view the
-
XML mode and a little history
As for history, I started using EMACS in 1979 at about version 134, when it was written in TECO on DEC PDP-10s. When RMS stopped maintaining it to move on to the GNU project he had just started, I took over its maintance for a year or so. One of the things I added in 1981 was M-$ (which I called "correct word spelling") and m-x check buffer spelling (which is a much better name than the present m-x spell-buffer).
If you use XML you should try James Clark's NXML Mode which uses Relax NG schemas. It gives you on-the-fly validation and completion when editing XML. Relax NG is very easy to use, especially in its Compact RNC form, which looks a lot like BNF but is isomorphic to the XML representation of the RNG Schema.
At work, I find it easy to develop Emacs-lisp scripts or even simply keyboard macros to do large refactoring jobs in Emacs. I often find that I'm the only person who can accomplish tasks that cut across programming language boundaries.
I used Emacs for several years to take real-time minutes for various W3C working groups, and made great use of the dabbrev-expand function (which I bound to meta-space). I like to think of it as an approximation to a Hidden Markov Model. In fact, at a Face-to-Face meeting in Cambridge, MA, I was able to type an entire sentence that the person sitting next to me spoke by typing only Meta-Space (and judiciously using only one or two letters to redirect the completions), a feat that lead many to believe that some form of magic was involved...Thereafter when others asked how our group had such complete minutes, people always responded, "Oh, he uses Emacs." -
Re:Complaint about RelaxNG and acceptance
Well, RelaxNG was developed by OASIS, who developed ODF... Those confused costumers will probably have dealt with the anguish of non-W3C-ness when they decided they wanted to do ODF, which, remember, comes from OASIS too.
Also, there is Trang to do the conversion (and others, actually)
-
Re:DTDs are pass
I think
/you/ are completely confused.
It stands for Document Type Definition, and includes all the human-readable specification that describes what all the elements mean as well as the formal, machine-readable part of the specification
Hmm. You mean it has comments? Because a DTD is nothing more than a machine-readable specification, which often (but not always) comes with comments.
If you somehow came to the conclusion that xml schemas (in general) are not meant to be human readable, take a look at the compact syntax for RNG.
Norm Walsh (ie Mr DocBook) is already making progress replacing the DTD infrastructure of DocBook with RNG. And guess what, he uses an editor (nXML-mode for GNU Emacs) that supports RNG!
I guess he must wear a blindfold? Or maybe you should take a read of James Clark's paper on the design of Relax NG?
-
Re:DTDs are pass
I think
/you/ are completely confused.
It stands for Document Type Definition, and includes all the human-readable specification that describes what all the elements mean as well as the formal, machine-readable part of the specification
Hmm. You mean it has comments? Because a DTD is nothing more than a machine-readable specification, which often (but not always) comes with comments.
If you somehow came to the conclusion that xml schemas (in general) are not meant to be human readable, take a look at the compact syntax for RNG.
Norm Walsh (ie Mr DocBook) is already making progress replacing the DTD infrastructure of DocBook with RNG. And guess what, he uses an editor (nXML-mode for GNU Emacs) that supports RNG!
I guess he must wear a blindfold? Or maybe you should take a read of James Clark's paper on the design of Relax NG?
-
flexible & powerful != difficult and confusing
Suggest you (and anyone else finding W3C XML Schema confusing and/or difficult) take a look at RELAX-NG.
RELAX-NG is an alternative schema language that is much more flexible and easy to use than W3C XML Schema. RELAX-NG also defines a non-XML syntax that makes it even easier to work with, since the non-XML syntax can be translated into XML using a tools such as Trang. There are many other tools that make it easy to work with RELAX-NG.
Eric Van der Vlist is working on a RELAX-NG book that you can read at http://books.xmlschemata.org/relaxng/
Take a look at RELAX-NG - you might never want to work with XML Schema again... -
Check out Relax NG (RNG)
I recently decided to go with RNG for my schemas after reading up on W3C XML Schema (WXS) and Relax NG (RNG) . RNG is just so much easier to read and understand. The real clincher for me was the inability in WXS 1.0 to describe non-deterministic structures. I mean, give me a break. I can't allow people to put the elements in a different order? That's just lame.
What's more there's a fantastic tool dtdinst that converts DTDs into Relax NG. There's also tools to convert back and forth between WXS and RNG. So if I ever need to provide someone with a WXS schema I can just run it off automatically.
Now I'm working on a system using AxKit to parse out the RNG schema, generate HTML forms for completion, roundtrip the data back to the server, assemble an instance document using DOM and display it using XSLT and CSS. But that's another story. People who don't "get" XML should really check out AxKit.
simon -
Re:One is derided, one is end-of-life'd
I think James Clarke's RELAX NG and W3C XML Schema is the best description (if slightly biased
;--) of the relative strength of the 2 technologies. Note that James Clarke also just released a new version of Trang , a tool that does conversions between Relax NG, Schemas and DTDs. -
Re:Just Say NO!
There's no such thing as a perfect schema language in a general sense. It all depends on the application at hand. Actually, trying to be perfect is one of the main reasons for the W3C XML Schema bloat; they simply try to squeeze too much into a single specification.
RELAX NG is much more streamlined; it focuses on specifying grammars for XML structures. Nothing more. I doesn't try to glue on concepts like object orientation (which is another reason for the W3C XML Schema blur). It's just very pure and hence easy and intuitive to learn and use.
Also, with the recent addition of the Compact Syntax, editing and reading schemas has never been easier. Utilites for working with the compact syntax can be found here and here.
So even though there is no perfect schema language, I'd say RELAX NG is far more perfect than W3C XML Schema in many situations. If you have applications that require W3C XML Schema, you can use Trang to convert your RELAX NG schemas.
-
Re:Just Say NO!
There's no such thing as a perfect schema language in a general sense. It all depends on the application at hand. Actually, trying to be perfect is one of the main reasons for the W3C XML Schema bloat; they simply try to squeeze too much into a single specification.
RELAX NG is much more streamlined; it focuses on specifying grammars for XML structures. Nothing more. I doesn't try to glue on concepts like object orientation (which is another reason for the W3C XML Schema blur). It's just very pure and hence easy and intuitive to learn and use.
Also, with the recent addition of the Compact Syntax, editing and reading schemas has never been easier. Utilites for working with the compact syntax can be found here and here.
So even though there is no perfect schema language, I'd say RELAX NG is far more perfect than W3C XML Schema in many situations. If you have applications that require W3C XML Schema, you can use Trang to convert your RELAX NG schemas.
-
Re:Darn DTD's
-
Re:DTD is sooo 1999.
When every tool under the sun is using XML schemas, the House is announcing their support for DTDs.
Jeezus, why would you even consider using Schemas when there is there is Relax-NG, a much better, simply, and based on theory system. Note the author of that document I gave; it's James Clark; if you are using an XML parser, chances are good it was written by him (expat). Heck, there is not even any normative spec for XML-Scheme!