Tim Bray On The Origin Of XML

Posted by Zonk on Friday March 18, 2005 @02:35PM from the makes-feed-users-happy dept.

gManZboy writes "Queue just posted an interview with XML co-inventor Tim Bray (currently at Sun Microsystems). Interestingly enough the interviewer is none other than database pioneer Jim Gray (currently at Microsoft). Among other things, in their discussion Tim reveals where the idea for XML actually came from: Tim's work on the OED at Waterloo."

11 of 218 comments

Re:Lisp strikes again by r2q2 · 2005-03-18 14:59 · Score: 2, Informative

I believe you are referring to Greenspun's Tenth Rule: http://c2.com/cgi/wiki?GreenspunsTenthRuleOfProgramming

--
My UID is prime is yours?
Re:OH come on.. by Mistlefoot · 2005-03-18 15:22 · Score: 2, Informative

Microsoft is not applying for a patent on XML itself, but rather for a patent "that cover[s] word processing documents stored in the XML (Extensible Markup Language) format. The proposed patent would cover methods for an application other than the original word processor to access data in the document."

<URL:http://news.com.com/2100-1013_3-5146581.html>
Not Very insightful! by stevens · 2005-03-18 18:01 · Score: 2, Informative

I hadn't thought about that. Very insightful.

Lots of people have thought about it. Not Very Insightful.

The reason is that if the parser encounters unbalanced end-tags, and they're all just </>, the parser will go farther and get very confused before it dies.

It will be very difficult to pinpoint *which* tag isn't closed, like C's optional {} after an if(), or SGML's optional closing tags.

It's much easier to correct if your parser can say "You forgot to close <account> on line 115" rather than "Something or other is unbalanced somewhere before line 224."
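
A minimal sketch of what named end tags buy you in practice, using Python's standard xml.etree.ElementTree. The sample document and the helper name first_error are made up for illustration; they are not from the thread or the interview.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <account> is opened on line 2 and never closed.
DOC = """<accounts>
<account>
<name>example</name>
</accounts>
"""

def first_error(xml_text):
    """Return (line, message) for the first well-formedness error, or None."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, _column = err.position  # position is a (line, column) tuple
        return line, str(err)
    return None

print(first_error(DOC))
```

On this sample the parser stops at line 4, where </accounts> arrives while <account> is still open. It does not name the unclosed element for you, but because the end tag carries a name, the mismatch is caught at the first wrong close rather than somewhere much later in the file.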
Re:This is article is amazingly honest by Anonymous Coward · 2005-03-18 19:33 · Score: 1, Informative

XML is just a representation of hierarchy data via named parameters and list.

That may be the only part of XML that your application is using, but don't make the mistake of believing that it is all that XML is good for. XML also gives you a platform-independent representation of your data with parsers already available for each platform. It also gives you automatic validation of the data structures using DTDs or XSDs and it gives you a framework and tools for doing data transformations (XSL).

It also gives you the ability to edit by hand. This is the biggest bloat area. XML could be much more compact and be parsed faster if you use a binary representation.

So just remember that you were using a small subset of XML's features. High performance is not one of them. If you need high performance, design your own format and write your own parser. It's not hard, just time consuming.
I COULD NOT AGREE MORE. gzip is our friend! by TheLittleJetson · 2005-03-18 21:30 · Score: 2, Informative

When I work with XML in Java, I generally just pass the XML through a GZIP stream. Need to see the file contents? zcat. XML compresses well since it's repetitive text. Lately I've been doing a lot of XUL code with PHP/Smarty as the back-end, and again, I transparently gzip this...

So, this solves the problem of the size of the XML to be stored on disk or transmitted over the network... The only difference is parsing. Again, when I'm in Java, I use PICCOLO to parse the XML -- it uses a lexical analyzer (jflex?) to parse XML more like a compiler parses code, by tokenizing it. Turns out, this is really fast.

Disk space is cheap. CPUs are fast. Mainstream XML parsing technology can always be made faster. Why must we abandon our beloved, human-readable, standardized format for files and protocols alike in favor of binary files?
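
The poster works in Java; the same idea can be sketched in Python with the standard gzip module. This is only an illustration under assumptions: the account records are invented sample data, and pack and unpack_and_parse are hypothetical helper names.

```python
import gzip
import xml.etree.ElementTree as ET

def pack(xml_text: str) -> bytes:
    """Compress XML text for storage or transmission."""
    return gzip.compress(xml_text.encode("utf-8"))

def unpack_and_parse(blob: bytes) -> ET.Element:
    """Decompress a gzip blob and parse it back into an element tree."""
    return ET.fromstring(gzip.decompress(blob))

# Invented, deliberately repetitive sample data.
records = "".join(
    f"<account><id>{i}</id><status>active</status></account>"
    for i in range(1000)
)
xml_text = f"<accounts>{records}</accounts>"

blob = pack(xml_text)
root = unpack_and_parse(blob)

# Raw size in bytes, compressed size in bytes, number of <account> children.
print(len(xml_text.encode("utf-8")), len(blob), len(root))
```

How much smaller the compressed form is depends entirely on the data; markup with many repeated tag names, like this sample, tends to compress well.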
Re:Why, oh why, did they have to repeat the tag na by ikkonoishi · 2005-03-18 21:32 · Score: 4, Informative

<ele1> <ele2> <ele3> </> </> <ele4> <ele5> </> </>

Which element did I forget to close?

<ele1> <ele2> <ele3> </ele3> </ele1> <ele4> <ele5> </ele5> </ele4>

Clearer now?
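
To make the difference concrete, here is a toy stack-based checker, a sketch only and not a real XML parser. The function name check and the regular expression are assumptions for illustration; the two inputs are the strings from this comment with the stray spaces inside the tags removed.

```python
import re

# Matches <name>, </name>, and the anonymous end tag </>.
TAG = re.compile(r"<(/?)([\w.-]*)>")

def check(markup):
    """Return a list of problems found in a toy markup string."""
    stack = []      # names of the elements that are currently open
    problems = []
    for closing, name in TAG.findall(markup):
        if not closing:
            stack.append(name)
        elif not stack:
            problems.append(f"unexpected </{name}>")
        elif name and name != stack[-1]:
            problems.append(f"expected </{stack[-1]}> but found </{name}>")
            # Recover by unwinding to the element the end tag names, if open.
            if name in stack:
                while stack and stack.pop() != name:
                    pass
        else:
            # A matching named end tag, or an anonymous </>.
            stack.pop()
    problems.extend(f"<{name}> never closed" for name in stack)
    return problems

anonymous = "<ele1> <ele2> <ele3> </> </> <ele4> <ele5> </> </>"
named = "<ele1> <ele2> <ele3> </ele3> </ele1> <ele4> <ele5> </ele5> </ele4>"

print(check(anonymous))
print(check(named))
```

With anonymous end tags the checker only notices at the end of input that something is left open, and it blames ele1. With named end tags it flags the problem at the first wrong close and identifies ele2 as the element that was expected to close.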
Re:Please explain by Anonymous Coward · 2005-03-18 22:23 · Score: 5, Informative

johannesg writes: "I've heard this quote in relation to XML before, and I don't get it. LISP is a programming language. XML is a method for storing data. About the only relation between the two that I can find is that both use nesting. So, why does this get brought up whenever XML is being discussed?"

Lisp source code is first parsed into S-expressions before being compiled. The programmer can manipulate these S-expressions to generate new programming constructs.

S-expressions are nested lists of dynamically typed data. The compiler turns these nested lists into bytecode or assembly code. But before this happens you're able to manipulate a well defined, concise and platform independent data format. The format is so useful that it is also used to store and transport non-code.

Here's a Lisp function call nested within another function call:

(/ (+ 1 2 3) 6)

[i.e. add 1, 2, and 3 together and then divide by 6] Let's first give different names to the function operators:

(divide (plus 1 2 3) 6)

Now introduce redundancy by duplicating the opening function names:

(divide (plus 1 2 3 /plus) 6 /divide)

Translate the dynamically typed integers to explicit type identifiers:

(divide (plus (integer 1 /integer) (integer 2 /integer) (integer 3 /integer) /plus) (integer 6 /integer) /divide)

Now convert the parentheses and spaces to angle brackets to generate XML:

<divide>
  <plus>
    <integer>1</integer>
    <integer>2</integer>
    <integer>3</integer>
  </plus>
  <integer>6</integer>
</divide>

Lisp S-expressions are a method for storing/expressing data AND code. They have less overhead than XML, solve more problems than XML (comfortably human readable programming languages can also be written in S-expressions, e.g. Scheme and Common Lisp) and they were invented decades earlier.
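
The transformation walked through above can be sketched mechanically. This is an illustration in Python rather than Lisp, with the S-expression modelled as nested tuples; the function name to_xml and the tuple encoding are assumptions, not anything from the original post.

```python
import xml.etree.ElementTree as ET

def to_xml(expr):
    """Convert a nested tuple such as ("divide", ("plus", 1, 2, 3), 6)
    into an element tree, wrapping each integer in an <integer> element."""
    if isinstance(expr, int):
        leaf = ET.Element("integer")
        leaf.text = str(expr)
        return leaf
    operator, *operands = expr
    node = ET.Element(operator)
    for operand in operands:
        node.append(to_xml(operand))
    return node

# (divide (plus 1 2 3) 6) written as nested tuples.
expression = ("divide", ("plus", 1, 2, 3), 6)
xml_text = ET.tostring(to_xml(expression), encoding="unicode")
print(xml_text)
# → <divide><plus><integer>1</integer><integer>2</integer><integer>3</integer></plus><integer>6</integer></divide>
```

The output is the same document as the XML above, minus the line breaks, which shows that the two notations carry the same tree.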

Regards,
Adam Warner
Re:This is article is amazingly honest by Decaff · 2005-03-19 02:16 · Score: 2, Informative

XML is just a representation of hierarchy data via named parameters and list.

It is far more than that.

It conforms to a standard. It allows its format to be extended in standard ways without breaking the original meaning. It has rules for allowing internationalisation. Also, there are a large number of efficient parsers and processors already written for it in almost every language.

Also with code structures you can add dynamic functionality like

'rsv_time' = localtime(time)

which you can't with XML...

The XML dialect known as XSLT allows for such dynamic functionality, and in a standard way.
Re:This is article is amazingly honest by hankaholic · 2005-03-19 02:25 · Score: 2, Informative

I think you may have misread. He said "blah blah blah instead using Data::Dumper", not "blah blah blah instead of using Data::Dumper".

If you haven't misread, your post was a little unclear, but I thought I'd respond by posting instead of with a nondescript "Overrated" mod.

--
Somebody get that guy an ambulance!
Re:SGML by smallpaul · 2005-03-19 18:57 · Score: 2, Informative

XML is defined as a subset of SGML. From the specification:
"The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document."
Re:SGML by JohnQPublic · 2005-03-21 03:11 · Score: 2, Informative

GML even had tags for doing Gantt charts, and I would dearly love to find a publishing system that could do printouts from such tags. ... ... Here it is 10 years later, and we still haven't gotten back to the level of ease of use and flexibility that GML had in the '80s

You're looking for Gary Richtmeyer's B2H program, available from IBM's z/VM download site. It's written in Rexx and runs on every system you're likely to be using, comes in source form, and can process just about everything the BookMaster markup can dish out (even the syntax diagram tags).