Tim Bray on the Birth of XML, 10 Years Later


Posted by ryuzaki0 on Monday February 18, 2008 @04:34AM from the all-bloated-and-grown-up dept.

lazyguyuk writes "Tim Bray posts a lengthy blog on the birth of XML, formalized as 1.0 in Feb 1998. 'XML is ten years old today. It feels like yesterday, or a lifetime. I wrote this that year (1998). It's really long. The title was originally Good Luck and Internet Plumbing but the filename was "XML-People" and I decided I liked that better. I never got around to publishing it, so why not now?'"

Java and XML, bad tastes that are worse together by Omnifarious · 2008-02-18 04:58 · Score: 4, Insightful

I've recently taken a job at a primarily Java shop. After seeing XML used and abused for ant, maven and various other things I've grown even more disenchanted with it. And now I've also gotten the chance to see that not only does Java represent a poor trade off between the annoyances of a strongly typed language and the speed of a dynamic interpreted one, it has a horrible mess of dependency issues that nobody really solves besides.

I'm much more hopeful about technologies like Thrift and/or D-Bus than I ever was about such abysmal abominations as SOAP, or the only slightly better XML-RPC.

The Java XML world seems like this little closed ecology of mutual masturbators who all come up with more Java and XML 'solutions' to problems that never existed before they started using Java and XML.

I see the value of XML for long-lived documents that don't spend a lot of their life on the wire. And possibly for config files, though IMHO it is too ugly and unreadable for those. But as a general tool for Internet plumbing it's awful.

--
Need a Python, C++, Unix, Linux develop
Java and XML - Addendum by Omnifarious · 2008-02-18 05:03 · Score: 3, Insightful

And, of course, my post is incomplete without a reference to my little rant on why CORBA and other forms of RPC are bad. Both Thrift and D-Bus are pretty close to the ideal solution I describe later. They focus on message content over semantics and are extremely easy to parse. SOAP and XML-RPC fail on both of those counts. They put semantics over content (you are making a remote function call that does some specific thing, not sending a hunk of data that has some particular content), and they are a huge pain to parse.

--
Need a Python, C++, Unix, Linux develop
1. Re:Java and XML - Addendum by Omnifarious · 2008-02-18 05:31 · Score: 4, Insightful
  
  CORBA is a minor pain to parse. From what I could tell you could just sit down with a spec and code up your own parser for ye-old random language in a day or two. But that's not my major issue with it.
  
  My major issue with it was that it promotes designing distributed systems that focus on the semantic roles of the participants instead of the data moving around. In fact it discourages programmers using it from even thinking of what they're doing as sending messages to some system many milliseconds away. Among other evils this leads to all kinds of interesting issues with threading and concurrency that didn't even have to exist.
  
  --
  Need a Python, C++, Unix, Linux develop
Re:Classic by smittyoneeach · 2008-02-18 05:07 · Score: 5, Insightful

In defense of XML, the parsing problem is handled.
Best wishes on solving the semantic snarls.
XML, like all good approaches, handles mechanism, not policy.

--
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
Here, let me fix that for you ... by trolltalk.com · 2008-02-18 05:33 · Score: 4, Insightful

If everyone would just use one of the already written XML producers or parsers (the big ones, the ones that work) life would be much easier around here from time to time.
If everyone would just go back to using simple delimited ASCII text life would be much easier around here.

--
Kevin Smith on Prince
1. Re:Here, let me fix that for you ... by kyz · 2008-02-18 06:28 · Score: 5, Insightful
  
  I have, and I can tell you that it's a waste of time.
  
  It amazes me how something that looks so simple can have so many corner cases, and how they can be solved so differently by different implementations.
  
  CSV is fine if you want to store data that has no quote marks, commas, carriage returns or linefeeds. For everything else, please use a better specified format, preferably one that has a formal definition. Like XML, for example.
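
A minimal sketch of those corner cases, assuming Python's standard `csv` module as the "better specified" handling; the sample row is made up for illustration. It compares a naive join/split on commas with a writer and reader that apply quoting rules:

```python
import csv
import io

# A made-up row containing the characters that break naive CSV handling:
# a comma, double quotes, and an embedded line break.
row = ["plain", "has, comma", 'says "hi"', "line one\nline two"]

# Naive approach: join on commas, then split on commas.
naive_line = ",".join(row)
naive_fields = naive_line.split(",")

# csv module: quotes the fields that need it and doubles embedded quotes.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(row)
encoded = buffer.getvalue()
decoded = next(csv.reader(io.StringIO(encoded, newline="")))

print(len(row), len(naive_fields))  # the naive split yields an extra field
print(decoded == row)               # the quoted round trip preserves the row
```

Even here, the result depends on both sides agreeing on the same quoting dialect, which is the interoperability problem described above.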
  
  --
  Does my bum look big in this?
2. Re:Here, let me fix that for you ... by CaptainPinko · 2008-02-18 07:22 · Score: 3, Insightful
  
  ASCII doesn't even support the letters needed by the majority of the world's languages.
  
  --
  Your CPU is not doing anything else, at least do something.
Re:Java and XML, bad tastes that are worse togethe by bckrispi · 2008-02-18 05:36 · Score: 4, Insightful

I'll take an Ant XML build file over an "is that a tab or a space" Makefile any day...

--
Xenon, where's my money? -Borno
Your comments seem tainted with inexperience. by sidragon.net · 2008-02-18 05:39 · Score: 3, Insightful

In general, if you have data to be structured and serialized, XML is one way to do it. If you think XML a poor choice, then could you suggest an alternative? Incidentally, that suggestion should not imply that everyone reinvent their own formats (again).

[N]ot only does Java represent a poor trade off between the annoyances of a strongly typed language and the speed of a dynamic interpreted one ...

Would you provide evidence aside from personal anecdotes, and possibly consider evidence to the contrary?

[Java] has a horrible mess of dependency issues that nobody really solves besides.

Perhaps you meant “modern software” instead. Any complex application these days relies on dozens of libraries and services to perform tasks. Not quite sure where exactly you are having difficulties, so I cannot elaborate further.

[XML] is too ugly and unreadable ... But as a general tool for Internet plumbing it's awful.

XML is intended for consumption by machines first, people second. You might also argue that in-memory data structures are ugly and unreadable.
Re:XML was formalized? by Jerf · 2008-02-18 05:43 · Score: 4, Insightful

Yes. XML was formalized. It is strictly defined and easy to check for compliance (with the right tools). Only a little bit of the definition has passed out of common usage, mostly focused around DTDs.

If you encounter a file that claims to be XML, but does not meet the XML standard, then it is not the XML standard that is to blame. The claim is wrong and the file is not XML.

XML is not a fuzzy-wuzzy adjective that can be applied willy-nilly to anything and magically turn it into "XML". It is not a marketing term or English Professor term. It is a rigidly specified engineer term for a document format, and a given document is XML if and only if it meets that format.

If someone wants to hack together a half-assed parser or emitter of any language, they will. I've seen half-assed XML parsers, I've seen half-assed JSON parsers, I've seen half-assed HTML parsers, I've seen half-assed YAML parsers, I've seen ... you get the idea. If a standard can't solve the problem, you can't count the lack of solution against it.
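
As a rough illustration of "easy to check for compliance (with the right tools)", here is a sketch that assumes Python's standard `xml.etree.ElementTree` as the tool. It checks well-formedness only, not validity against a DTD, and the helper name `is_well_formed` and the sample documents are hypothetical:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as well-formed XML (no DTD validation)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


good = "<note><to>reader</to></note>"
mismatched = "<note><to>reader</note>"        # <to> is never closed
bare_ampersand = "<note>this & that</note>"   # & must be written as &amp;

for sample in (good, mismatched, bare_ampersand):
    print(is_well_formed(sample), sample)
```

A conforming parser rejects the last two outright instead of guessing, which is what makes the "it is XML or it is not" test mechanical.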
Re:Classic by oyenstikker · 2008-02-18 05:45 · Score: 4, Insightful

There are only a few problems with this:
1) Non-ancestor relationships and references (i.e., having the same node at multiple locations in the XML document) are not covered by XML, but are possible with objects.
You can with refids and keys.
but with a more efficient format (binary)
It is wonderful to be able to easily read and edit the data in a text editor. If you want it more compact for storage and transmission, compress it. I understand that a binary format could lead to more efficient processing and parsing, but I think the benefits of readable text outweigh the efficiency.
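
A small sketch of the reference idea, assuming Python's standard `xml.etree.ElementTree`. The attribute names `id` and `author-ref` and the sample data are hypothetical; ElementTree does not read a DTD or schema, so the references are resolved by hand with a lookup table rather than through declared ID/IDREF types:

```python
import xml.etree.ElementTree as ET

# Two books point at the same author node instead of duplicating it.
doc = ET.fromstring("""
<library>
  <author id="a1">Example Author</author>
  <book title="First" author-ref="a1"/>
  <book title="Second" author-ref="a1"/>
</library>
""")

# Index every element that carries an id, then follow the references.
authors = {a.get("id"): a for a in doc.iter("author")}
books = list(doc.iter("book"))

for book in books:
    author = authors[book.get("author-ref")]
    print(book.get("title"), "->", author.text)

# Both references resolve to one and the same element object.
shared = authors[books[0].get("author-ref")] is authors[books[1].get("author-ref")]
print(shared)
```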

--
The masses are the crack whores of religion.
Re:Classic by Flambergius · 2008-02-18 06:08 · Score: 3, Insightful

To me that says that XML handles a problem that wasn't there. Parsing problem for pretty much everything is almost universally solved by regex...

XML doesn't handle parsing. XML makes parsing easier; in fact so easy that parsing XML isn't a problem anymore.

For an expert, I think XML and regex are complementary techniques. For anyone other than an expert, regexes are way too brittle. Ordinary people need to be able to operate on their data; it can't require voodoo. (Not that XML in all its arcane applications is anything close to plain English, but it's much better than custom data formats and regex.)
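
To make the brittleness concrete, a sketch assuming Python's standard `re` and `xml.etree.ElementTree`; the document and the pattern are made up for illustration. The regex silently misses an element whose attributes come in a different order or use single quotes, and it returns the escaped text as-is, while the parser handles all three:

```python
import re
import xml.etree.ElementTree as ET

# Two equivalent ways of writing an item: attribute order and quote style differ.
doc = '<items><item name="a &amp; b" id="1"/><item id="2" name=\'c\'/></items>'

# Regex written against the first form only.
pattern = re.compile(r'<item name="([^"]*)"')
regex_names = pattern.findall(doc)

# A real parser does not care about attribute order, quote style, or escaping.
parsed_names = [item.get("name") for item in ET.fromstring(doc).iter("item")]

print(regex_names)   # ['a &amp; b']
print(parsed_names)  # ['a & b', 'c']
```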

--
Computers are useless. They can only give you answers - Pablo Picasso
Re:10 Years and still waiting by iamacat · 2008-02-18 06:30 · Score: 3, Insightful

Here is another obvious rule: If a computer, at any time at all, has to parse or generate XML in large amounts, you are doing it wrong. There is really no need to resend the same string 100000 times, encode multi-megabyte binary data as Base64 or lose floating point precision by encoding to or from strings. If need be, an efficient binary format can represent the data with an arbitrary schema. Communicating parties can exchange their schemas at runtime and avoid sending attributes that the other end is not going to use.
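
A quick sketch of two of those costs, using only Python's standard library; the payload size and the six-digit format are arbitrary choices for illustration. Base64 grows binary data by a third, and a float written with a fixed number of digits does not read back as the same value (a shortest round-trip representation such as Python's `repr` does):

```python
import base64
import os

# Base64 turns every 3 bytes into 4 ASCII characters.
payload = os.urandom(3000)
encoded = base64.b64encode(payload)
print(len(payload), len(encoded))  # 3000 4000

# Formatting a float with a fixed digit count loses information...
value = 0.1 + 0.2
truncated = float(f"{value:.6f}")
print(truncated == value)  # False

# ...while a shortest round-trip string representation does not.
round_tripped = float(repr(value))
print(round_tripped == value)  # True
```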