Tim Bray Says RELAX
twofish writes to tell us that Sun's Tim Bray (co-editor of XML and the XML namespace specifications) has posted a blog entry suggesting RELAX NG be used instead of the W3C XML Schema. From the blog: "W3C XML Schemas (XSD) suck. They are hard to read, hard to write, hard to understand, have interoperability problems, and are unable to describe lots of things you want to do all the time in XML. Schemas based on Relax NG, also known as ISO Standard 19757, are easy to write, easy to read, are backed by a rigorous formalism for interoperability, and can describe immensely more different XML constructs."
When you want to come.
On the other hand, RELAX NG "just works".
(all IME of course...:)
ant.
Has anyone here ever tried to read an XML schema for anything relatively complex? It's a nightmare. RELAX looks much cleaner and more direct, which I wholeheartedly approve of.
using namespace slashdot;
troll::post();
...and RELAX thinking of the fact someone wants to change your standards. What kind of programmer can't use XML effectively anyhow...oh wait... (No, I didn't read TFA!)
"W3C XML Schemas (XSD) suck"
Hey Tim, don't hold back, tell us what you really think.
Hey, Tim: p {line-height:1.7em;}, seriously.
Sig Sig Sputnik
if something, anything, is intended to be primarily parsed by human eyes, write it in c++/java style
if something, anything, is intended to be primarily parsed by machine, use xml
xml is a b**ch to read
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
Every time I have to interface with XML, I just groan.
It's extensible, rah rah.
It's self documented, rah rah.
rah rah blah blah
It's way over-hyped, unnecessary in most cases, and way overused.
Give me a DB connection or a flat file, period.
I refuse to use XML in any shape, way, or form, no matter what anyone says or does with it!!!
Neat article, but it took me a few clicks to find it. Does anyone else find it supremely annoying how there are always plenty of links in the story but you have to actually read the URLs to find which one goes to the thing being discussed? How about a (-story-) link, or perhaps a different color of link for all those that are just links to Wikipedia (I can look it up myself, thanks). _____ , it's infuriating.
Between this standard and REST, it looks like we have some very lazy web services, RESTing and RELAX NG all the time . . .
the first thing i did was misread the title as "Tim's bra says RELAX", then I reread it as "Tim Brays RELAX", ...
Relax NG has a compact non-XML syntax. But C++/Java is a horrible syntax to use if you want a language to be readable and easy to understand. Since when was 17 levels of operator precedence easy to understand? Of course any good programmer always uses parentheses to avoid ambiguity, so why should a language have 17 levels of built-in ambiguity just to make it that much easier to make hard-to-find mistakes?
-Don
From my blog: Relax NG Compact Syntax: no to operator precedence, yes to annotations!
James Clark is a fucking genius! He's the guy who wrote the Expat XML parser, works on Relax NG, and does tons of other important stuff. Relax NG is an ingeniously designed, elegant XML schema language based on regular expressions, which also has a compact, convenient non-XML syntax.
I totally respect the way he throws down the gauntlet on operator precedence (take that you Perl and C++ weenies!):
You can translate back and forth between Relax NG's XML and compact syntaxes with full fidelity, without losing any important information. Relax NG supports annotating the grammar with standard and custom namespaces, so you can add standard extensions and extra user defined meta-data to the grammar. That's useful for many applications like user interface generators, programming tools, editors, compilers, data binding, serialization, documentation, etc.
Here's an interesting example of a complex Relax NG application: OpenLaszlo is an XML/JavaScript based programming language, which the Laszlo compiler translates into SWF files for the Flash player. The Laszlo compiler and programming tools use this lzx.rnc Relax NG schema for the OpenLaszlo XML language. This schema contains annotations used by the Laszlo compiler to define the syntax and semantics of the XML based programming language.
The schema starts out by defining a few namespaces:
default namespace = "http://www.laszlosystems.com/2003/05/lzx"
namespace rng = "http://relaxng.org/ns/structure/1.0"
namespace a = "http://relaxng.org/ns/compatibility/annotations/1.0"
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
namespace lza = "http://www.laszlosystems.com/annotations/1.0"
The a: namespace defines some standard annotations like a:defaultValue, and the lza: namespace defines some custom annotations private to the Laszlo compiler like lza:visibility and lza:modifiers. Thanks to the ability to annotate the grammar, much of the syntax and semantics of the Laszlo programming language are defined directly in the Relax NG schema in the compact syntax, so any other tool can read the exact same definition the compiler is using!
To show how truly simple and elegant it is, here is the snake eating its tail: The Relax NG XML syntax, written in the Relax NG compact syntax:
# RELAX NG XML syntax specified in compact syntax.
default namespace rng = "http://relaxng.org/ns/structure/1.0"
namespace loc
Take a look and feel free: http://www.PieMenu.com
I've been picking up Emacs lately, and the xml-mode standardly used (nxml-mode) uses RELAX over XML Schema. I suspect that probably says a lot for RELAX's parseability. I've had just a little bit of experience playing around with Schemas and they seem about as navigable as DTDs, which is to say not very. I haven't tried RELAX though.
then why are you using an ASCII encoding in the first place? Those tags just lower the signal to noise ratio. Even Apple's given up and started saving their meta data in a "compiled" version of XML.
Oh, and, "Hi! How you doing? Long time no see!"
Clear, Dark Skies
Relax NG is a great example of the triumph of Design-by-Inspired-Individuals vs. Design-by-Committee.
In The State of XML, Edd Dumbill explains the secret behind the success of Relax NG:
-Don
"I may be old school, working with flat files and all for over 20 years, but I do work with a lot of newer technology."
Well, I'm your counterpart in India and I'm happy to hear you're having problems getting used to newer technologies. Keep up the good work.
If you want data to be human readable, then use a nice interface with data fields in the right places and functions for accessibility. (I would say "GUI", but a front-end isn't necessarily graphical.) Just the idea of a sequential stream of data sucks for human introspection and/or modification.
If you want data to be machine readable, then make it a flat binary file. A memory dump that you can mmap(), essentially. Having to parse/serialize sucks, from a programmer's point of view.
XML is a compromise between the two. It doesn't fit either mandate well. It is not easily human readable (try to shove it in the face of the average user), and it is not easily machine readable (bulky, slow to parse and generate).
However, XML is great for:
- Interoperability, when you suddenly find out that your flat mmap'ed file doesn't work across heterogeneous systems.
- Extensibility, because you can rather easily support prior file versions if you do your design correctly. And you can make some fields optional.
- Structure. Having to write a schema or a Relax NG grammar or a UML diagram forces you to think about the structure of your data before you start filling in global variables in your code.
Also, with a schema or grammar at hand, it can be useful to just shove files at a validating parser. But just having a schema or grammar for its data does not make any application magically interoperable. Without good documentation, an XML file is just as binary as before. Or how else should I know what a <fooStuff> tag does?
A good schema or grammar can go a long way to augment documentation (of data files for interoperability purposes), but it is no replacement for it.
Just my 2 cents.
With a notation similar to RELAX NG compact syntax. XML has been a killer of readable formats like windows-style ini files. It tries to be readable by both human and machine and succeeds at neither. It's like programming in assembler, because it can be read by a human better than machine code and compiled faster than C.
Tim Bray is right, and he couldn't have put it better: W3C XML Schemas (XSD) suck. The reason Relax NG is so much cleaner and more powerful than committee-designed XML Schemas is that it's based on a sound mathematical foundation (tree regular expressions, or "hedge automata theory"), while XML Schemas suffer from ad-hoc design, committee-burn, lack of focus, and half-baked attempts to solve too many unrelated problems.
Here's some interesting stuff from my blog about the design and development of Relax NG.
-Don
James Clark wrote about maximizing composability:
Clark describes the derivative algorithm's lazy approach to automaton construction:
The Relax NG derivative algorithm is implemented in a few hundred elegant, declarative, functional lines of Haskell, and also in tens of thousands of lines and hundreds of classes of highly abstract, complex Java code.
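For readers who have not met the derivative idea, here is a minimal sketch of it in Python. It is not Clark's code, and it is heavily simplified: it matches plain strings instead of XML trees, it does no memoization, and the class and function names are made up for illustration. It only shows the core trick: instead of building an automaton up front, you take the "derivative" of the pattern with respect to each input symbol as it arrives, so only the states the input actually reaches are ever constructed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Empty:
    """Matches only the empty string."""


@dataclass(frozen=True)
class NotAllowed:
    """Matches nothing at all."""


@dataclass(frozen=True)
class Char:
    c: str


@dataclass(frozen=True)
class Seq:
    first: object
    second: object


@dataclass(frozen=True)
class Choice:
    left: object
    right: object


@dataclass(frozen=True)
class Star:
    inner: object


def nullable(p):
    """Return True if pattern p can match the empty string."""
    if isinstance(p, (Empty, Star)):
        return True
    if isinstance(p, (NotAllowed, Char)):
        return False
    if isinstance(p, Seq):
        return nullable(p.first) and nullable(p.second)
    if isinstance(p, Choice):
        return nullable(p.left) or nullable(p.right)
    raise TypeError(f"unknown pattern: {p!r}")


def deriv(p, ch):
    """Return the pattern that matches what may follow ch under p."""
    if isinstance(p, (Empty, NotAllowed)):
        return NotAllowed()
    if isinstance(p, Char):
        return Empty() if p.c == ch else NotAllowed()
    if isinstance(p, Choice):
        return Choice(deriv(p.left, ch), deriv(p.right, ch))
    if isinstance(p, Star):
        return Seq(deriv(p.inner, ch), p)
    if isinstance(p, Seq):
        head = Seq(deriv(p.first, ch), p.second)
        if nullable(p.first):
            # The first part may be skipped, so ch may belong to the second.
            return Choice(head, deriv(p.second, ch))
        return head
    raise TypeError(f"unknown pattern: {p!r}")


def matches(p, s):
    """Consume s one symbol at a time, then check what is left over."""
    for ch in s:
        p = deriv(p, ch)
    return nullable(p)


# Illustrative pattern: (ab)* | c
pattern = Choice(Star(Seq(Char("a"), Char("b"))), Char("c"))
print(matches(pattern, "abab"), matches(pattern, "aba"))
```

A real validator would also simplify the patterns as it goes and cache derivatives it has already computed; this sketch skips both to stay short.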
Clark's Java implementation of Relax NG is called "jing", which is a Thai word meaning truthful, real, serious, no-nonsense, and ending with "ng".
Comparing the Java and Haskell implementations of Relax NG illustrates what a wicked cool and powerful language Haskell really is. The Java code must explicitly model and simulate many Haskell features like first-class functions, memoization, pattern matching, partial evaluation, lazy evaluation, declarative programming, and functional programming. That requires many abstract interfaces, concrete classes and brittle lines of code.
While the Java code is quite brittle and verbose, the Haskell code is extremely flexible and concise. Haskell is an excellent design language, a vehicle for exploring complex problem spaces, designing and testing ingenious solutions, performing practical experiments, weighin
Speaking of XML, how much smaller would XML files be if they made one minor simple change...
Add to mean "close the matching element".
*sigh* I wish I'd been on the committee when they specified the standard.
Sometimes it's best to just let stupid people be stupid.
Of course ASCII (or UNICODE for that matter) is a binary standard as well. So special tools called text editors were created so that people could read it.
There are more sophisticated binary standards that are more efficient than ASCII and it wouldn't take a lot of effort to create viewers/editors for them as well. Of course most markup documents would be significantly smaller if tags didn't have to be S-P-E-L-L-E-D O-U-T character by character. Each HTML tag could be encoded in just two bytes with lots of room to spare.
It always fascinates me that we have no problem making customers use a new specialized tool like a browser, but it's taboo to use a non-ASCII tool for development. So we continue to structure our data as if it were going to be processed by a VT100.
Simple, readable, concise. Ah, but not XML-like, maybe because they are simple, readable, and concise.
If XML Schema was a work colleague they would be Wally from Dilbert - it's not that things are impossible to do with it, it's just that the relatively simple things become hard and the complex almost impossible. Due to the fact that almost anything is possible with XML Schema with enough work (weeks, months, years...), instead of just scrapping it, people keep at it doggedly despite the number of times we get bitten. I'd love to see the community move more completely to RELAX NG if it makes my life easier.
From the xml-dev mailing list:
From: Rick Jelliffe
To: xml-dev@lists.xml.org
Date: Wed, 29 Nov 2006 12:46:06 +1100
Robert Koberg wrote:
Maybe a better analogy would be that the people who say that XSD is lovely is Mr Bush's "Mission Accomplished!"
Though of course there are differences between Iraq and XSD. One seems to be about people with their own fiefdom agendas stubbornly miring us in a quagmire, using a grabbag of thin reasons to justify it, denying any evidence that things are not rosy, perpetually promising that things are turning around, and enmeshing all sorts of decent people in a life of horror, difficulty and with no confidence in accomplishing the mission. The other is in the Middle East.
Just joking...
Rick
Slashdot tags are officially useless. Who the hell is going to search for "dontdoit" when looking for this article?
Mono has complete support for RelaxNG in the form of the Commons.Xml.Relaxng assembly.
In addition to RelaxNG, it provides NVDL and RNC support.
As someone who has used XML schemas pretty extensively, I was pretty amazed at how I was able to skim through the tutorial in about 10 minutes and understand Relax NG, versus reading an entire XML Schema book and still needing to refer to it whenever I write schemas.
One thing I really like about Relax NG is that it's possible (with very easy syntax) to constrain the XML structure based on an attribute value, something you can't do in schema or a DTD. For example, suppose you want to have an XML element:
<arg type="...">true</arg>
With Relax NG it's possible to constrain the text in the arg element (e.g. "true" or "false") based on the value of the type attribute. For example, if type="int", you could limit the text in arg to an integer value. This is something you can't do in schemas or DTDs.
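To make the idea concrete, here is a rough Python sketch of the check such a schema expresses. This is hand-written validation using only the standard library, not Relax NG itself, and the "boolean" type name and the function name are assumptions made up for the example. In Relax NG you would write a choice between groups, each pairing one value of the type attribute with one datatype for the text; the dictionary below plays the same role.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical table: value of the "type" attribute -> check on the text.
# Each entry mirrors one alternative of a Relax NG choice pattern.
CHECKS = {
    "boolean": lambda text: text in ("true", "false"),
    "int": lambda text: re.fullmatch(r"[+-]?[0-9]+", text) is not None,
}


def arg_is_valid(xml_text):
    """Return True if an <arg> element's text fits its type attribute."""
    elem = ET.fromstring(xml_text)
    if elem.tag != "arg":
        return False
    check = CHECKS.get(elem.get("type"))
    text = (elem.text or "").strip()
    # An unknown or missing type matches no alternative, so it is rejected.
    return check is not None and check(text)


print(arg_is_valid('<arg type="boolean">true</arg>'))
print(arg_is_valid('<arg type="int">true</arg>'))
```

The point of doing this in the schema rather than in code like the above is that every tool reading the schema gets the constraint for free.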
Since you are simplifying your life by making the schema for web requests simpler, why not go all the way, ditch SOAP, and embrace REST for XML-over-HTTP communications?
"There is more worth loving than we have strength to love." - Brian Jay Stanley
I believe James Clark, who co-designed Relax/NG, understands and programs in Lisp pretty well (as well as Haskell, Java, C and many other languages). He helped design and implement DSSSL (wikipedia article), which is based on Scheme, and led to XSLT, which he also designed.
-Don
It's good for transmitting information/energy, but it's not good for storing it.
-Don
I call this the Line of View (as in PoV) or 'Horizon' Problem. The general problem is this: In XML we've got a standard that is universal for displaying n-dimensional structures in a basically 1-dimensional environment. (For the time being, we're ignoring that XML text usually goes from left to right and top to bottom, making that something 2D to look at) ... At some point you will have to look at which way you want to handle your stuff and which way you're going to unravel it. This will undoubtedly influence how much XML clutter you will have to construct. With XML it's the same as with databases: they will always be pathetic crutches for us to latch on to the real work. Indispensable, but crutches nonetheless.
The question now is: where do you draw the line of view? Along which line do I take my knife to cut open my n-dimensional structure to unravel it and flatten it out into a 1-dimensional string of characters? This is a problem that is impossible to solve satisfactorily for all possible PoVs or - as I say - Lines of View, or better yet, Horizons to the structure. Will I unravel my DB of books by authors? By issues? By vendors? By publishers or by weight and size?
What I'm getting to is this: mapping n-dimensional stuff to 1-dimensional structures will always suck one way or the other. It's just that with XML we all start agreeing upon which way it's supposed to suck. I don't think that changing the Schema standard (or worse: introducing additional standards) will actually attack this hard problem. I have a strong suspicion that Relax NG's relief is illusory, short term, and re-introduces downsides that XML Schema has already tackled with its pesky and strict nature. One of them would be consistency with the View-Horizon, once chosen, all the way through the given data structure. I don't know for sure - go test and find out - but I do know that universal serialization will always come with downsides and RelaxNG (or any other schema) won't change that.
We suffer more in our imagination than in reality. - Seneca
This guy claims that this:
<element name="addressBook" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="card">
      <element name="name">
        <text/>
      </element>
      <element name="email">
        <text/>
      </element>
    </element>
  </zeroOrMore>
</element>
is easier to read than this:
<!DOCTYPE addressBook [
<!ELEMENT addressBook (card*)>
<!ELEMENT card (name, email)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT email (#PCDATA)>
]>
WTF ?!
Religion is what happens when nature strikes and groupthink goes wrong.
I have enough experience with Relax NG to say that it is great.
The compact syntax is enjoyable as you can be quite precise (compared to XSD), and there are tools that convert between the compact syntax and the XML Relax NG syntax, allowing you to use the syntax that suits your needs. In general, JING is quite a bit quicker than a few of the XSD validators for comparably complex schemas.
There are a few disadvantages:
* The full range of tools that are available are not advanced on a regular basis. I found a few bugs in the JING source code and had the opportunity to fix them where necessary.
* I feel that RelaxNG is marginalized because of XSD, and along with that goes a lot of additional OSS support. The tools are maintained by individuals instead of teams. I would recommend that the author of JING put his software forward to the Apache Foundation (Jakarta Commons) and see if it can attract a bit more attention.
* Web services are a bit of a sticking point. A Relax NG schema can be embedded into the WSDL; however, the various 3rd party clients may not necessarily understand the schema, and by extension they would not generate any supporting classes, making integration with a Relax NG-defined web service a little more complex than it needs to be.
Relax NG really is great.
-Tim
I don't see why XML schemas have to exist. BNF notation serves the exact same purpose: it describes a grammar. A BNF-like derivative is more than enough to define XML schemas. The compact syntax of RELAX NG is just that, and a bright idea.
It is really annoying when CS has to be discovered all over again. The problem of validating text against a certain format was solved many decades ago, and BNF and its variations have been known since the 60s...
...and chew it.
(damn short subject lines!)
I agree that RelaxNG is much easier to read, and it will much more completely describe a grammar than will the other standard - and MUCH more completely define it than will a DTD.
Unfortunately, as far as I can tell there is no way to, within an XML document, state "Use THIS RelaxNG schema file to validate this document", as you can with a DTD. Thus, even if I have placed my RelaxNG schema on my web server, I cannot set things up such that (for example) libXML2 can automatically fetch that schema when it starts parsing my document. I can map the RelaxNG schema to a DTD (losing information) and allow that to be fetched, but if I want to use a RelaxNG schema with libXML2 I the programmer must tell libXML2 where the schema is.
IMHO it would be a Good Thing if the W3C would standardize on some way to associate a RelaxNG schema with a given XML file - say, by some form of XML processing directive within the XML file.
www.eFax.com are spammers
As someone who has used XML Schema a little, it amazes me that no one thought to shoot the designers as soon as they published the first draft. I've learned entire Turing-complete programming languages in less time than it took me to get to even moderate competence with Schema (Lisp, Erlang and Smalltalk, for example, all took less long to learn than Schema; I could write a program in any of them that would validate an arbitrary XML document more easily than I could write a Schema, in spite of spending longer learning Schema).
I am TheRaven on Soylent News
Schema definition by its nature is tedious but necessary at this point. If you're going to take a standard that's already entrenched and suggest everyone stop and polish the edges from it, how about we kill the verbosity of the XML end-tag instead?
Do we lose anything other than bandwidth use by doing this,
<tagNameThatCanBeLong>Some Text</>
instead of this:
<tagNameThatCanBeLong>Some Text</tagNameThatCanBeLong>
If the next end tag must belong to the last start tag what's the point of naming it?
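As a rough illustration of why the name is redundant for a machine, here is a small Python sketch that expands the proposed short form back into ordinary XML by tracking open elements on a stack. The `</>` syntax is this comment's proposal, not part of XML, and the function name is made up. The sketch deliberately ignores comments, CDATA sections, processing instructions, and attribute values that contain ">".

```python
import re

# Matches a start tag, a named end tag, an empty-element tag, or the
# proposed short end tag "</>". Groups: slash, name, rest, self-closing slash.
TAG = re.compile(r"<(/?)([A-Za-z_:][\w.:\-]*)?([^<>]*?)(/?)>")


def expand_short_end_tags(text):
    """Rewrite every end tag using the name of the innermost open element."""
    stack = []

    def replace(match):
        closing, name, _rest, self_closing = match.groups()
        if closing:
            if not stack:
                raise ValueError("end tag without a matching start tag")
            open_name = stack.pop()
            if name and name != open_name:
                raise ValueError(f"expected </{open_name}>, got </{name}>")
            return f"</{open_name}>"
        if name is None:
            # Not a tag this sketch understands; leave it untouched.
            return match.group(0)
        if not self_closing:
            stack.append(name)
        return match.group(0)

    result = TAG.sub(replace, text)
    if stack:
        raise ValueError(f"unclosed element: {stack[-1]}")
    return result


print(expand_short_end_tags("<tagNameThatCanBeLong>Some Text</>"))
```

What the named end tag does buy you is error detection: with `</>` a missing or extra close silently shifts the nesting, whereas a named end tag lets the parser report the mismatch where it happens.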
You are checking your backups, aren't you?
Flame: to insult or criticize angrily
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
En tee
I just showed your example to a half dozen people (some programmers, some managers) and they agree that the longer form is vastly more readable and understandable.
Shit, you think people are born knowing what an asterisk postfix means? Terseness != Clarity.
Not Lisp, but S-expressions, which are the basis of Lisp syntax; Lisp is an "application" of S-expressions, the same as XML applications are applications of XML. S-expressions extended with something similar to XML's encoding declarations could substitute for XML and would be arguably cleaner (certainly cleaner to Lispers), though I'm not so sure that:
(foo
  (bar
    (baz (spam: "eggs"))))
is really more readable (rather than just more compact) than:
<foo>
  <bar>
    <baz spam="eggs"/>
  </bar>
</foo>
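The two notations carry the same tree, which a few lines of Python can show. This is a sketch of one possible S-expression rendering, with attributes written as (name: "value") pairs as in the example above; the function name and the exact output format are my own assumptions, and it skips mixed-content tail text and quote escaping.

```python
import xml.etree.ElementTree as ET


def to_sexpr(elem):
    """Render an element as (tag (attr: "value")... "text" children...)."""
    parts = [elem.tag]
    # Attributes become (name: "value") pairs right after the tag name.
    parts += [f'({key}: "{value}")' for key, value in elem.attrib.items()]
    text = (elem.text or "").strip()
    if text:
        parts.append(f'"{text}"')
    # Child elements are rendered recursively, in document order.
    parts += [to_sexpr(child) for child in elem]
    return "(" + " ".join(parts) + ")"


doc = ET.fromstring('<foo><bar><baz spam="eggs"/></bar></foo>')
print(to_sexpr(doc))
```

Whether the result is more readable or just shorter is exactly the question; the conversion itself is mechanical either way.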
I've learned entire Turing-complete programming languages in less time than it took me to get to even moderate competence with Schema
What do you expect? Schemas were a Microsoft initiative IIRC.
FreeSpeech.org
That's what XML catalogs are for.
http://www.oasis-open.org/committees/entity/spec-2001-08-06.html
The only thing I've found useful from the Schema namespaces is the set of datatypes (int, float, string, etc) which are quite useful for other things.
Could W3C please split these off into their own "standard" namespace family?
Some restriction examples: Is that enough restriction for you?
XSDs might be too complex for their own good, but if you're gonna bash them, at least know what you're talking about first. And btw, who the heck uses DTDs nowadays? I never thought I'd see people mentioning those in 2006! Who in their right mind would use a non-XML-compliant definition file to validate an XML file? Weird...
shana