Slashdot Mirror


Tim Bray Says RELAX

twofish writes to tell us that Sun's Tim Bray (co-editor of XML and the XML namespace specifications) has posted a blog entry suggesting RELAX NG be used instead of the W3C XML Schema. From the blog: "W3C XML Schemas (XSD) suck. They are hard to read, hard to write, hard to understand, have interoperability problems, and are unable to describe lots of things you want to do all the time in XML. Schemas based on Relax NG, also known as ISO Standard 19757, are easy to write, easy to read, are backed by a rigorous formalism for interoperability, and can describe immensely more different XML constructs."

35 of 180 comments

  1. Don't do it. by Anonymous Coward · · Score: 4, Funny

    When you want to come.

    1. Re:Don't do it. by Loconut1389 · · Score: 2, Funny

      I tried to tag the article with frankiegoestohollywood,zoolander,killthemalaysianprimeminister but it didn't all fit ;)

  2. Couldn't agree more by antonyb · · Score: 5, Insightful
    My experience with XML Schema is exactly that: hard to write in the first place, hard to maintain, and regular interop problems between different implementations that make the theory of web services a practical nightmare (idrefs are the first example that springs to mind).


    On the other hand, RELAX NG "just works".

    (all IME of course...:)

    ant.

    1. Re:Couldn't agree more by camperdave · · Score: 2, Funny

      RELAXiNG works for me too.

      --
      When our name is on the back of your car, we're behind you all the way!
  3. I have to agree. by JanusFury · · Score: 4, Insightful

    Has anyone here ever tried to read an XML schema for anything relatively complex? It's a nightmare. RELAX looks much cleaner and more direct, which I wholeheartedly approve of.

    --
    using namespace slashdot;
    troll::post();
    1. Re:I have to agree. by sien · · Score: 4, Interesting
      Yes. I've done it using Relax NG and it was easy, simple and readable.

      It also works really, really well with the nXML mode for emacs.

      Finally, XML schemas written in a way that is not verbose, ugly and unreadable. And if you do need one of the older schema languages, there are translators from Relax NG available.

    2. Re:I have to agree. by radtea · · Score: 4, Interesting


      I was at SGML '96 where XML was first announced, and was one of those people who went home and wrote a (non-validating) XML parser over the weekend, based on the draft spec. I've used both DTDs and XML Schemas and can say without question that schemas are actually a bigger pain to work with than DTDs. DTDs were bad enough, but schemas have been a major step backwards, adding complexity without adding the features one actually needs.

      Some years ago I wrote a code generator that used DTDs as the data modelling language. I sold it to the company I was working for at the time and someone I had no control over re-wrote it to use schemas because they were "simpler". The result had major bugs and dropped features, not entirely due to schema-related problems, although it is worth noting that the "simplifications" included handling schemas in completely incorrect ways, because if you handled them correctly they could not do the job. I created a new generator from scratch last year and tried to do things "properly" with schemas. It was essentially impossible, and I wound up creating a custom XML-based language to use as input.

      At the time there was no Relax NG standards process, so I stayed clear of it. But it has the blessing of James Clark too (author of the SP SGML parser and the expat XML parser). So it is probably worth another very hard look.

      --
      Blasphemy is a human right. Blasphemophobia kills.
  4. To the point. by jhd · · Score: 2, Funny

    "W3C XML Schemas (XSD) suck"

    Hey Tim, don't hold back, tell us what you really think.

  5. Re:it's a rather straightforward observation by GroovinWithMrBloe · · Score: 2, Insightful

    if something, anything, is intended to be primarily parsed by machine, use xml

    xml is a b**ch to read
    Don't forget what we used to use... binary is even worse. XML was designed with people in mind, which is why it's easier for people to read and manipulate than your traditional binary file format.
  6. Re:Just sit back... by ubernostrum · · Score: 4, Informative

    What kind of programmer can't use XML effectively anyhow...oh wait... (No, I didn't read TFA!)

    Helpful hint for understanding the above: Tim Bray, author of TFA, is one of the guys who originally developed and spec'd out XML. Really. His name's on the spec and everything. So if he says that a particular XML tool has problems, it's probably a good idea to take him at his word ;)

  7. Re:XML Totally Sucks - All of it! by beavis88 · · Score: 2, Insightful

    And if you can't have a DB connection?

    For flat data, sure a flat file is fine...for structured/hierarchical data, a flat file is :(

  8. I agree! by Maddog787 · · Score: 3, Funny

    I refuse to use XML in any shape, way or form no matter what anyone says or does with it!!!

  9. Re:it's a rather straightforward observation by Peter+Cooper · · Score: 4, Informative

    Check out YAML.

  10. Re:XML Totally Sucks - All of it! by Anonymous Coward · · Score: 2, Insightful

    XML would be great if people validated their XML files before sending them out. And cut the verbosity and redundancy down by 90%. And used English elements instead of numbers. Ahh XML, the ideal most people pay lip service to but up to which they fail to live.

  11. Relax NG's compact non-XML syntax by SimHacker · · Score: 2, Interesting

    Relax NG has a compact non-XML syntax. But C++/Java is a horrible syntax to use if you want a language to be readable and easy to understand. Since when was 17 levels of operator precedence easy to understand? Of course any good programmer always uses parentheses to avoid ambiguity, so why should a language have 17 levels of built-in ambiguity just to make it that much easier to make hard-to-find mistakes?

    -Don

    From my blog: Relax NG Compact Syntax: no to operator precedence, yes to annotations!

    James Clark is a fucking genius! He's the guy who wrote the Expat XML parser, works on Relax NG, and does tons of other important stuff. Relax NG is an ingeniously designed, elegant XML schema language based on regular expressions, which also has a compact, convenient non-XML syntax.

    I totally respect the way he throws down the gauntlet on operator precedence (take that you Perl and C++ weenies!):

    There is no notion of operator precedence. It is an error for patterns to combine the |, &, , and - operators without using parentheses to make the grouping explicit. For example, foo | bar, baz is not allowed; instead, either (foo | bar), baz or foo | (bar, baz) must be used. A similar restriction applies to name classes and the use of the | and - operators. These restrictions are not expressed in the above EBNF but they are made explicit in the BNF in Section 1.
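    The rule quoted above can be sketched as a small check. This is only an illustration, not code from any Relax NG tool: the function name mixes_operators is made up, it looks only at the |, & and , operators, and it ignores the - operator, braces, string literals and comments that a real compact-syntax parser has to handle.

```python
# Hypothetical helper, not part of any RELAX NG implementation: it reports
# whether a compact-syntax-like pattern mixes |, & and , at one nesting level.
OPERATORS = {"|", "&", ","}


def mixes_operators(pattern: str) -> bool:
    """Return True if two different operators share a parenthesis level."""
    seen = [set()]  # one set of operators per currently open nesting level
    for ch in pattern:
        if ch == "(":
            seen.append(set())
        elif ch == ")":
            if len(seen) == 1:
                raise ValueError("unbalanced ')'")
            seen.pop()
        elif ch in OPERATORS:
            seen[-1].add(ch)
            if len(seen[-1]) > 1:
                return True
    if len(seen) != 1:
        raise ValueError("unbalanced '('")
    return False


print(mixes_operators("foo | bar, baz"))    # True: rejected by the rule
print(mixes_operators("(foo | bar), baz"))  # False: grouping is explicit
print(mixes_operators("foo | (bar, baz)"))  # False: grouping is explicit
```

    The point of the design is that the check needs no precedence table at all: a pattern is either explicitly grouped or it is an error.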

    You can translate back and forth between Relax NG's XML and compact syntaxes with full fidelity, without losing any important information. Relax NG supports annotating the grammar with standard and custom namespaces, so you can add standard extensions and extra user defined meta-data to the grammar. That's useful for many applications like user interface generators, programming tools, editors, compilers, data binding, serialization, documentation, etc.

    Here's an interesting example of a complex Relax NG application: OpenLaszlo is an XML/JavaScript based programming language, which the Laszlo compiler translates into SWF files for the Flash player. The Laszlo compiler and programming tools use this lzx.rnc Relax NG schema for the OpenLaszlo XML language. This schema contains annotations used by the Laszlo compiler to define the syntax and semantics of the XML based programming language.

    The schema starts out by defining a few namespaces:

    default namespace = "http://www.laszlosystems.com/2003/05/lzx"
    namespace rng = "http://relaxng.org/ns/structure/1.0"
    namespace a = "http://relaxng.org/ns/compatibility/annotations/1.0"
    datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
    namespace lza = "http://www.laszlosystems.com/annotations/1.0"

    The a: namespace defines some standard annotations like a:defaultValue, and the lza: namespace defines some custom annotations private to the Laszlo compiler like lza:visibility and lza:modifiers. Thanks to the ability to annotate the grammar, much of the syntax and semantics of the Laszlo programming language are defined directly in the Relax NG schema in the compact syntax, so any other tool can read the exact same definition the compiler is using!

    To show how truly simple and elegant it is, here is the snake eating its tail: The Relax NG XML syntax, written in the Relax NG compact syntax:

    # RELAX NG XML syntax specified in compact syntax.

    default namespace rng = "http://relaxng.org/ns/structure/1.0"
    namespace loc

    --
    Take a look and feel free: http://www.PieMenu.com
    1. Re:Relax NG's compact non-XML syntax by Anonymous Coward · · Score: 2, Funny

      Stop cutting and pasting from your fucking blog already. Make your point without it, or if you need to, then link to it.

  12. Re:XML Totally Sucks - All of it! by pleb1024 · · Score: 2, Insightful

    Totally agree.

    While XML may have its places (I've yet to encounter one in the commercial world), passing large amounts of data is not one of them. A good flat file design is a lot more efficient than XML, and short of hardware acceleration I don't see that changing.

    I'm currently trying to assist a customer who's changing from one system to another. The current system generates flat files of approx 2gig in size every couple of days (billing data). The new system produces files of approx 13gig. The data contained within the files results in the exact same bill being produced for the customers.

    Needless to say, the extra diskspace (yes we do compress them), and processing time to parse/compress is such a waste.

    In my mind, XML trades shorter development time / 'portability' (well so the theory goes), for greater resource usage (CPU/Disk), whereas most customers I've dealt with would rather take a little longer to develop, and have a lot less resource limitation issues on the production systems. The old methods of 'just throw more hardware at it' just don't work in the real world anymore.

  13. Relax NG: Design-by-Inspired-Individuals by SimHacker · · Score: 3, Interesting

    Relax NG is a great example of the triumph of Design-by-Inspired-Individuals vs. Design-by-Committee.

    In The State of XML, Edd Dumbill explains the secret behind the success of Relax NG:

    Incidentally the RELAX NG success can equally well be framed as a case of design-by-inspired-individuals vs. design-by-committee as much as it can be seen as a OASIS vs. W3C thing.

    -Don

    --
    Take a look and feel free: http://www.PieMenu.com
  14. Great job, now to clean up XML itself by iamacat · · Score: 2, Insightful

    With a notation similar to RELAX NG compact syntax. XML has been a killer of readable formats like windows-style ini files. It tries to be readable by both human and machine and succeeds at neither. It's like programming in assembler, because it can be read by a human better than machine code and compiled faster than C.

    1. Re:Great job, now to clean up XML itself by killjoe · · Score: 3, Insightful

      I believe you are looking for lisp. It's XML cleaned up, simplified and hulkified.

      --
      evil is as evil does
  15. Re:it's a rather straightforward observation by radtea · · Score: 2, Informative

    XML was designed with people in mind, which is why it's easier for people to read and manipulate than your traditional binary file format.

    Err... no.

    XML was a step back from SGML's "human-friendly" clever tricks. XML was intended to be easy to PARSE, not easy to read.

    --
    Blasphemy is a human right. Blasphemophobia kills.
  16. Maximizing Composability and Relax NG Trivia by SimHacker · · Score: 4, Informative

    Tim Bray is right, and he couldn't have put it better: W3C XML Schemas (XSD) suck. The reason Relax NG is so much cleaner and more powerful than committee-designed XML Schemas is that it's based on a sound mathematical foundation (tree regular expressions, or "hedge automata theory"), while XML Schemas suffer from ad-hoc design, committee-burn, lack of focus, and half-baked attempts to solve too many unrelated problems.

    Here's some interesting stuff from my blog about the design and development of Relax NG.

    -Don

    James Clark wrote about maximizing composability:

    First, a little digression. In general, I have made it a design principle in TREX to maximize "composability". It's a little bit hard to describe. The idea is that a language provides a number of different kinds of atomic thing, and a number of different ways to compose new things out of other things. Maximizing composability means minimizing restrictions on which ways to compose things can be applied to which kinds of thing. Maximizing composability tends to improve the ratio between functionality on the one hand and simplicity/ease of use/ease of learning on the other.

    Clark describes the derivative algorithm's lazy approach to automaton construction:

    I don't agree that <interleave> makes automaton-based implementations impossible; it just means you have to construct automatons lazily. (In fact, you can view the "derivative"-based approach in JTREX as lazily constructing a kind of automaton where states are represented by a canonical representative of the patterns that match the remaining input.)
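    To make the "lazy automaton" idea concrete, here is a minimal sketch of derivative-based matching in Python. It is not Jing's or the Haskell implementation's code: it matches a flat sequence of token names rather than XML trees, and the tuple encoding and all function names are assumptions chosen for illustration. Each pattern acts as an automaton state, and the memoized deriv function builds a transition only the first time it is needed.

```python
from functools import lru_cache

# Patterns are plain tuples so they can be hashed and used as automaton states.
EMPTY = ("empty",)              # matches the empty sequence
NOT_ALLOWED = ("notAllowed",)   # matches nothing


def tok(name):
    return ("token", name)


def choice(p, q):
    if p == NOT_ALLOWED:
        return q
    if q == NOT_ALLOWED or p == q:
        return p
    return ("choice", p, q)


def group(p, q):
    if NOT_ALLOWED in (p, q):
        return NOT_ALLOWED
    if p == EMPTY:
        return q
    if q == EMPTY:
        return p
    return ("group", p, q)


def interleave(p, q):
    if NOT_ALLOWED in (p, q):
        return NOT_ALLOWED
    if p == EMPTY:
        return q
    if q == EMPTY:
        return p
    return ("interleave", p, q)


def nullable(p):
    """True if the pattern can match the empty sequence."""
    kind = p[0]
    if kind == "empty":
        return True
    if kind in ("group", "interleave"):
        return nullable(p[1]) and nullable(p[2])
    if kind == "choice":
        return nullable(p[1]) or nullable(p[2])
    return False  # token, notAllowed


@lru_cache(maxsize=None)  # the cache is the lazily built transition table
def deriv(p, name):
    """Pattern describing what may follow after consuming one token."""
    kind = p[0]
    if kind == "token":
        return EMPTY if p[1] == name else NOT_ALLOWED
    if kind == "choice":
        return choice(deriv(p[1], name), deriv(p[2], name))
    if kind == "group":
        left = group(deriv(p[1], name), p[2])
        return choice(left, deriv(p[2], name)) if nullable(p[1]) else left
    if kind == "interleave":
        return choice(interleave(deriv(p[1], name), p[2]),
                      interleave(p[1], deriv(p[2], name)))
    return NOT_ALLOWED  # empty, notAllowed


def matches(p, names):
    for name in names:
        p = deriv(p, name)
    return nullable(p)


# "title" may appear anywhere around the ordered pair author, year.
book = interleave(tok("title"), group(tok("author"), tok("year")))
print(matches(book, ["author", "title", "year"]))  # True
print(matches(book, ["year", "author", "title"]))  # False
```

    Nothing here enumerates the interleavings up front; only the states actually reached by the input are ever constructed.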

    The Relax NG derivative algorithm is implemented in a few hundred elegant declarative functional lines of Haskell, and also in tens of thousands of lines and hundreds of classes of highly abstract complex Java code.

    Clark's Java implementation of Relax NG is called "jing", which is a Thai word meaning truthful, real, serious, no-nonsense, and ending with "ng".

    Comparing the Java and Haskell implementations of Relax NG illustrates what a wicked cool and powerful language Haskell really is. The Java code must explicitly model and simulate many Haskell features like first-class functions, memoization, pattern matching, partial evaluation, lazy evaluation, declarative programming, and functional programming. That requires many abstract interfaces, concrete classes and brittle lines of code.

    While the Java code is quite brittle and verbose, the Haskell code is extremely flexible and concise. Haskell is an excellent design language, a vehicle for exploring complex problem spaces, designing and testing ingenious solutions, performing practical experiments, weighin

    --
    Take a look and feel free: http://www.PieMenu.com
    1. Re:Maximizing Composability and Relax NG Trivia by heinousjay · · Score: 2, Funny

      Thanks for the Java flame. I was worried that there wouldn't be any offtopic ranting in this story, but you eased my worries just a few comments into it.

      --
      Slashdot - where whining about luck is the new way to make the world you want.
    2. Re:Maximizing Composability and Relax NG Trivia by Erixxxxx · · Score: 2, Insightful

      From the Haskell implementation:

      "This document does not describe any algorithms for transforming a RELAX NG schema into simplified form, nor for determining whether a RELAX NG schema is correct."

      From the Jing implementation:

      "This version of Jing implements:

              * RELAX NG 1.0 Specification,
              * RELAX NG Compact Syntax, and
              * parts of RELAX NG DTD Compatibility, specifically checking of ID/IDREF/IDREFS."

      also from the Jing implementation:

      "Jing also has experimental support for schema languages other than RELAX NG; specifically

              * W3C XML Schema (based on Xerces-J);
              * Schematron;
              * Namespace Routing Language."

      Implement the same level of functionality in Haskell as is being implemented in Jing, then come back and compare.

      Also, number of lines of code is only one standard, how does the Haskell implementation hold up under heavy loads? How well does it scale?

      I personally think Jing tries to do too much, and I think there is definitely a need for a better Java implementation of a Relax NG validator, but your post (largely dealing with a nonsensical argument about semantics) is rather lazy.

    3. Re:Maximizing Composability and Relax NG Trivia by John+Whitley · · Score: 2, Informative
      That's an awful lot of cutting and pasting just to take a worthless jab at the Java language.


      For many problem domains, it often doesn't matter what language you throw up against Haskell -- the Haskell program will often be smaller by one or more orders of magnitude (for a sufficiently rich/interesting program, anyways). The grandparent poster didn't even craft the example in question; Java was just the victim-elect of this particular case. I'll observe that even if the Java program there could be made shorter by an order of magnitude (!!), it would still be an order of magnitude larger than the Haskell implementation.

      Although it's a bit long in the tooth now, Paul Hudak and Mark Jones wrote a paper that surveys the results of a Naval Surface Warfare Center prototyping study comparing a number of different programming languages. See Haskell vs. Ada vs. C++ vs. Awk vs. ... An Experiment in Software Prototyping Productivity. It's a fascinating read if you aren't already familiar with how different programming in Haskell is from many currently popular languages. I highly recommend delving into Haskell for any dedicated developer. Even if you don't find yourself developing in Haskell on a daily basis, the experience will positively impact how you think about code, and bring new conceptual models and patterns into your toolbox.
  17. Re:XML Totally Sucks - All of it! by Just+Some+Guy · · Score: 4, Insightful
    While XML may have its places (I've yet to encounter one in the commercial world), passing large amounts of data is not one of them.

    Yeah, well I have to look at EDI every day. I'd switch to XML in a heartbeat if it were up to me.

    You picked some obvious strawmen to shoot down. XML isn't for building gigabyte databases (regardless of whether some people try to use it for that). It's for easily moving data between applications. If you think writing a flat text parser is easy, then you've never had to deal with nested data or escaped characters. Say what you will about XML, but it's nice to have one set standard that deals with all that, even if suboptimally, because I never want to write another ad-hoc parser for as long as I live. Been there, done that, have no desire to bother again.
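    A small sketch of the escaping problem, using only Python's standard library. The field values are invented for illustration; the point is just that a naive delimiter split breaks as soon as a value contains the delimiter, while an XML library escapes and unescapes special characters on its own.

```python
import xml.etree.ElementTree as ET

# Made-up field values containing a delimiter and XML special characters.
fields = ["Smith, J. & Sons", "a < b", "line one"]

# Ad-hoc flat format: join with commas, split on commas.
naive_pieces = ",".join(fields).split(",")
print(len(fields), len(naive_pieces))  # 3 4: the first field was cut in two

# XML round trip: the library escapes & and < when writing
# and restores them when reading.
root = ET.Element("record")
for value in fields:
    ET.SubElement(root, "field").text = value
serialized = ET.tostring(root, encoding="unicode")
parsed = [child.text for child in ET.fromstring(serialized)]
print(parsed == fields)  # True
```

    A flat format can of course add its own quoting rules, but then every reader and writer has to agree on them, which is the ad-hoc parser problem again.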

    --
    Dewey, what part of this looks like authorities should be involved?
  18. Re:One fix to XML I'd like to have... by Reality+Master+101 · · Score: 2, Interesting

    Damn! I mean, add </>...

    (Argh, the "wait between comments" thing is infuriating...)

    --
    Sometimes it's best to just let stupid people be stupid.
  19. XML uses a binary format by ClosedSource · · Score: 4, Insightful

    Of course ASCII (or Unicode for that matter) is a binary standard as well. So special tools called text editors were created so that people could read it.

    There are more sophisticated binary standards that are more efficient than ASCII and it wouldn't take a lot of effort to create viewers/editors for them as well. Of course most markup documents would be significantly smaller if tags didn't have to be S-P-E-L-L-E-D O-U-T character by character. Each HTML tag could be encoded in just two bytes with lots of room to spare.

    It always fascinates me that we have no problem making customers use a new specialized tool like a browser, but it's taboo to use a non-ASCII tool for development. So we continue to structure our data as if it were going to be processed by a VT100.

    1. Re:XML uses a binary format by 2short · · Score: 3, Interesting


      You could certainly make XML vastly more compact if you had some table of tags mapped to 2-byte codes. You're not the first to have such an idea, and I and others will be happy to use it... as soon as you've got it standardized, implemented, and as widely accepted as ASCII. Point being, I, and everyone I've never even met who will ever touch some particular XML file, already has a text editor.

      We also all have some way of decompressing files in several standard compression formats, which will squash the XML down to the same size as your custom scheme, if storage space is an issue, which it generally isn't. There's all manner of custom schemes one can use to do various things better when one defines the platform. When you want to interoperate well, you need to use the capabilities that already exist on only semi-known systems.
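      A quick way to see how much of XML's bulk is repeated tag text is to run a standard compressor over a synthetic document. This sketch uses Python's zlib module; the document layout and the function name xml_sizes are invented for illustration, and the ratio on real data will differ. It does not measure a custom 2-byte tag encoding, so it says nothing about which approach ends up smaller.

```python
import zlib


def xml_sizes(record_count: int) -> tuple[int, int]:
    """Return (raw_bytes, compressed_bytes) for a synthetic billing document."""
    records = "".join(
        f"<customer><id>{i}</id><balance>{i * 7 % 1000}</balance></customer>"
        for i in range(record_count)
    )
    document = f"<billing>{records}</billing>".encode("utf-8")
    compressed = zlib.compress(document, 9)  # 9 = strongest standard level
    # Compression is lossless: the original bytes come back unchanged.
    assert zlib.decompress(compressed) == document
    return len(document), len(compressed)


raw, packed = xml_sizes(10_000)
print(raw, packed)
```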

      Generally we don't actually make customers use new specialized tools. We take advantage of the new specialized tools they already have. I'm pretty sure not one of my customers ever got a browser to read my documentation; I wrote it in HTML because they've all got browsers already.

  20. Re:One fix to XML I'd like to have... by nuzak · · Score: 4, Insightful

    That feature is in SGML. In fact it can be even shorter than that, you can express an entire tag and its content with is optional). SGML even lets you change the angle brackets to anything else you want. You can make any SGML doc look like nothing you or anyone else has ever seen ... all part of the feature set.

    SGML is full of fun little hacks like that, and it was a pain in the ass to read. Omitting the tag name from the end tag makes it impossible to know you have an improperly closed tag til the end of the document, and then you have no idea which tag wasn't closed. And no, that wasn't a theoretical problem either, this became a real problem with giant SGML docs that used all the shortcuts.

    If you really hate XML's verbosity so much, realize that it was designed for easy reading, not easy writing. I whipped up my own xml mode in emacs and made '</' trigger an "electric-slash" behavior that closes the tag automatically. Not rocket science.

    --
    Done with slashdot, done with nerds, getting a life.
  21. XML nightmare by rgaginol · · Score: 4, Insightful

    If XML Schema was a work colleague they would be Wally from Dilbert - it's not that things are impossible to do with it, it's just that the relatively simple things become hard and the complex almost impossible. Due to the fact that almost anything is possible with XML Schema with enough work (weeks, months, years...), instead of just scrapping it, people keep at it doggedly despite the number of times we get bitten. I'd love to see the community move more completely to RELAX NG if it makes my life easier.

  22. XSD: "Mission Accomplished!" by SimHacker · · Score: 3, Funny

    From the xml-dev mailing list:

    From: Rick Jelliffe
    To: xml-dev@lists.xml.org
    Date: Wed, 29 Nov 2006 12:46:06 +1100

    Robert Koberg wrote:

    I wonder if the people who think RNG won have "Re-elect Gore" bumper stickers...

    Maybe a better analogy would be that the people who say that XSD is lovely is Mr Bush's "Mission Accomplished!"

    Though of course there are differences between Iraq and XSD. One seems to be about people with their own fiefdom agendas stubbornly miring us in a quagmire, using a grabbag of thin reasons to justify it, denying any evidence that things are not rosy, perpetually promising that things are turning around, and enmeshing all sorts of decent people in a life of horror, difficulty and with no confidence in accomplishing the mission. The other is in the Middle East.

    Just joking...
    Rick

    --
    Take a look and feel free: http://www.PieMenu.com
  23. Re:Relax NG. by SeaFox · · Score: 2, Funny
    Mono has complete support for RelaxNG in the form of the Commons.Xml.Relaxng assembly.

    So should the lesson here be to "RELAX if you have MONO"?
  24. XML is like Electricity by SimHacker · · Score: 4, Insightful

    It's good for transmitting information/energy, but it's not good for storing it.

    -Don

    --
    Take a look and feel free: http://www.PieMenu.com
  25. I call this the LineOfView (as in PoV) Problem by Qbertino · · Score: 4, Insightful

    I call this the Line of View (as in PoV) or 'Horizon' Problem. The general problem is this: In XML we've got a standard that is universal for displaying n-dimensional structures in a basically 1-dimensional environment. (For the time being, we're ignoring that XML text usually goes from left to right and top to bottom, making it something 2D to look at.)
    The question now is: where do you draw the line of view? Along which line do I take my knife to cut open my n-dimensional structure, unravel it and flatten it out into a 1-dimensional string of characters? This is a problem that is impossible to solve satisfactorily for all possible PoVs or - as I say - Lines of View, or better yet, Horizons to the structure. Will I unravel my DB of books by authors? By issues? By vendors? By publishers or by weight and size? ... At some point you will have to look at how you want to handle your stuff and which way you're going to unravel it. This will undoubtedly influence how much XML clutter you will have to construct. With XML it's the same as with databases: they will always be pathetic crutches for us to latch on to the real work. Indispensable, but crutches nonetheless.

    What I'm getting at is this: mapping n-dimensional stuff to 1-dimensional structures will always suck one way or the other. It's just that with XML we all start agreeing on the way in which it's supposed to suck. I don't think that changing the Schema standard (or worse: introducing additional standards) will actually attack this hard problem. I have a strong suspicion that Relax NG's relief is illusory and short-term, and that it re-introduces downsides that XML Schema has already tackled with its pesky and strict nature. One of those would be consistency with the View-Horizon, once chosen, all the way through the given data structure. I don't know for sure - go test and find out - but I do know that universal serialization will always come with downsides and RelaxNG (or any other schema) won't change that.

    --
    We suffer more in our imagination than in reality. - Seneca