Why XML Doesn't Suck
Richard Eriksson writes "Recalling the earlier discussion on why XML sucks for programmers, Tim Bray clarifies his stance on his co-creation, XML, and gets back on his pulpit to declare that XML Doesn't Suck. He writes: 'Let's look at some of XML's chief virtues, then I'll address some of the XML-sucks arguments, in the same spirit that Sammy Sosa addresses a fastball.'"
in the same spirit that Sammy Sosa addresses a fastball
You mean he strikes out swinging on three pitches while trying to jack the ball in the stands instead of trying to make contact?
.... because people will pay you out the ying-yang to convert their system to use XML ...
... enough said!
Besides, it is a great buzz word!!!
HallmarkOrnaments.Com
Going from "XML sucks" to "XML doesn't suck" isn't clarifying your stance! It is doing a 360. Even Bill "I didn't have sex with that woman" Clinton would have a tough time with this one.
Just my 2 cents.
today is spelling optional day.
Mr. Bray makes a point about the longevity of XML based documents (where he says that tying up documents in a binary format is foolish), but this is a point that (La)TeX users have been arguing for years.
Will XML really solve this problem? Hopefully the OpenOffice format will help, but if Microsoft maintains its marketshare (and keeps its XML generation limited or even proprietary), are we really better off?
I'll just stick with LaTeX.
I haven't read the article yet, but XML does NOT suck because:
1. data and/or fields can be added at any time WITHOUT breaking anything
2. the data is in a hierarchical format, reducing data replication and allowing for a more sophisticated data structure.
3. the data can be changed with a text editor.
4. and BECAUSE the data is text, it compresses REALLY well.
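Points 1 and 4 above can be sketched in a few lines. This is only an illustration, not anything from the article: the document shape (`items`, `item`, `name`, `price`) and the `read_names` helper are made up, and the compression test uses a deliberately repetitive document, so the ratio it shows is flattering.

```python
import xml.etree.ElementTree as ET
import zlib

def read_names(xml_text):
    """A reader written against the 'old' document shape (hypothetical)."""
    root = ET.fromstring(xml_text)
    return [item.findtext("name") for item in root.findall("item")]

OLD_DOC = "<items><item><name>widget</name></item></items>"
# Same data after someone adds a <price> field the reader has never heard of.
NEW_DOC = "<items><item><name>widget</name><price>9.99</price></item></items>"

# The reader looks fields up by name, so the extra element is simply ignored.
old_names = read_names(OLD_DOC)
new_names = read_names(NEW_DOC)

# Repetitive markup is exactly what a general-purpose compressor likes.
big_doc = ("<items>"
           + "<item><name>widget</name><price>9.99</price></item>" * 1000
           + "</items>")
raw_size = len(big_doc.encode("utf-8"))
packed_size = len(zlib.compress(big_doc.encode("utf-8")))

print(old_names, new_names)
print(raw_size, "bytes raw,", packed_size, "bytes compressed")
```

The "nothing breaks" part only holds because the reader asks for elements by name. A reader that counts children or relies on position would still break.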
I don't get all this fuss over XML. It seems to me that it's just a pretty handy markup language for programmers to use to store data in a human-readable (and therefore human-editable) fashion, that (with the help of things like libxml) also happens to be fairly machine readable. It's also extensible (X- duh!) and yet also has its limits.
Why are there so many /. stories about this? Can somebody explain why this raises people's passions so? It seems to me like arguing the merits of HTML or SGML - it's all so bloody obvious!
As a web developer & admin, XML is my best friend. I have cases where I need non-webheads to develop content (better yet, portable content), and XML is the only way - they only have to know a basic set of HTML tags, they don't have to worry about HTML validation, formatting, or anything else, and everything they generate is consistent!
Not only can I transform their content into different views or formats, but (for example) the same XML file that is used to provide software documentation also is used to build the software GUI and provide tool tips and other forms of context sensitive help.
No database required. No parsing required. Just a couple libraries and tools, and we're set to go.
"Times have not become more violent. They have just become more televised."
-Marilyn Manson
XML is much better than anything else in certain situations.
XML is much worse than lots of other choices in certain situations.
Why can't you see the shades of grey, and insist on seeing all in black and white ?
Have fun,
Daniel
I saw a letter to Dr. Dobbs recently that was saying that XML needed to have the ability to embed things like Visual Basic and javascript in it to be really useful. I think that this is a horrible idea. The whole point of XML was to have a generic data model, i.e. one parser to rule them all.
I've been able to do things like export MySQL schemas into XML, then using XSLT generate an entire set of base classes providing persistent objects. What was once weeks worth of work now takes an afternoon (from concept to final product). The whole set is entirely consistent, no misspelled names or changed signatures. When bugs were found, I fixed all the files in one place and reran the XML/XSLT script. Massive productivity boost. If that isn't an argument on why XML doesn't suck I don't know what is.
The idea of embedding code in XML is a perverse distortion of what XML is really about. XML would suck if one uses it for unintended purposes. I don't use a hammer to tighten machine bolts, well I guess some people do.
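For anyone who hasn't seen the schema-to-classes trick, here is a rough sketch of the idea. It is not the XSLT setup described above: plain Python stands in for the stylesheet, and the schema layout (`schema`, `table`, `column`) and the `generate_classes` helper are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema export; the element and attribute names are made up.
SCHEMA_XML = """
<schema>
  <table name="customer">
    <column name="id" type="int"/>
    <column name="email" type="varchar"/>
  </table>
</schema>
"""

def class_name(table_name):
    """Turn a table name such as order_item into OrderItem."""
    return "".join(part.capitalize() for part in table_name.split("_"))

def generate_classes(schema_xml):
    """Emit Python source with one plain class per <table> element."""
    root = ET.fromstring(schema_xml)
    chunks = []
    for table in root.findall("table"):
        columns = [col.get("name") for col in table.findall("column")]
        args = ", ".join(columns)
        lines = [f"class {class_name(table.get('name'))}:",
                 f"    def __init__(self, {args}):"]
        # One attribute assignment per column, so names cannot drift apart.
        lines += [f"        self.{c} = {c}" for c in columns]
        chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"

generated_source = generate_classes(SCHEMA_XML)
print(generated_source)
```

Fix the generator once, rerun it, and every generated class picks up the fix, which is the productivity argument in a nutshell.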
I used to wonder what was so holy about a silent night, now I have a child.
The main thesis of Tim Bray's original post was that he didn't like having to choose between either storing all his data in memory (i.e. DOM) or using callbacks (i.e. SAX) when processing XML. The problem with this kind of thinking is that although it may have been true two or three years ago that the only way to process XML was via DOM or SAX, this is no longer the case.
There are more classes of APIs supported on multiple platforms for processing XML such as pull-based APIs and cursor based APIs which are represented by the System.Xml.XmlReader and System.Xml.XPath.XPathNavigator in the .NET Framework. Similar APIs exist in the Java world as well as Python from what I've heard. This is besides the current push in some quarters for programming languages that natively process XML (i.e. intrinsically understand an XML datamodel or datatype).
Tim Bray's original problem was that he doesn't have a pull-based API for XML parsing in Perl. I pointed out in my kuro5hin diary how the pseudo code he showed as being his ideal for processing XML already exists in C# and the .NET Framework. This article on XML.com points to other people who also point out that such pull-based APIs for processing XML are available on other platforms and languages as well.
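To make the pull-style idea concrete, here is a minimal sketch using `iterparse` from Python's standard `xml.etree.ElementTree` module. It is not the C# code from the diary; the log document and the `error_messages` helper are made up for illustration. The caller drives the loop and asks for the next event, instead of registering callbacks or loading the whole tree first.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical input; in practice this would be a large file on disk.
DOC = ("<log>"
       "<entry level='info'>started</entry>"
       "<entry level='error'>disk full</entry>"
       "</log>")

def error_messages(stream):
    """Collect the text of every <entry level='error'> element."""
    messages = []
    # iterparse hands back (event, element) pairs as the parser advances;
    # the loop below pulls them one at a time.
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "entry":
            if elem.get("level") == "error":
                messages.append(elem.text)
            # Drop the element's contents once we are done with it.
            elem.clear()
    return messages

errors = error_messages(io.BytesIO(DOC.encode("utf-8")))
print(errors)
```

The control flow reads top to bottom like ordinary code, which seems to be what the original complaint was asking for.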
poking around his site I came across this. hehe.
"Slashdot and Stupidity I visit Slashdot once per day, sometimes more, because they seem to do a really good job of relaying the geek zeitgeist. It's a long time since I read much of the follow-ups, but I thought I ought to this time, and I'm reminded why. How can a publication that caters (on the face of it) to smart people attract the attention of so many shallow, drivelling morons?"
"Interactivity Again There were a few smart things there in among the chaff on /., and by following back the links in from other blogs, I sure did learn a whole bunch about the state of the programming art as regards XML. Some of the things I said were wrong (or at least open to challenge), and I got fodder for a really substantial follow-up piece, which I'll get around to soon. I don't suppose it's mathematically possible for everyone to get their theses batted around by some tens of thousands of well-informed people, which is a real pity."
http://www.tbray.org/ongoing/When/200x/2003/03/1 9/ Who
People who say XML sucks are the people who are forced to look at it and change it by hand.
But XML is not for that!
XML is like dough. Nobody eats raw dough (it's probably OK to eat it, but it ISN'T tasty), but eats cookies and bread instead.
XML is NOT meant for routine exposure to users and/or administrators, XML is for application data transfer.
And applications that require XML to be written by a human are only half done: they should be used in combination with HumanInput -> XML generation programs.
If you're using the proper tools, and programming with the proper libraries, there's no reason you have to dig down into the XML in order to "write SOAP calls". I've used SOAP for a handful of tasks, and I can't tell you anything significant about how the requests are represented in XML. Developers don't necessarily need to know that. If things are breaking for you, and you're having to debug the actual XML data to figure out what's going wrong, then either your toolset is buggy or you're not using it correctly.
You read some of the arguments against XML, and you realize that people just don't "get it".
1 - XML sucks as a language
Repeat after me, XML is NOT a language. Certainly not in the sense that C++ is a language. XML is a standard that defines how one structures data.
2 - XML is bloated, I can send binary much cheaper/easier
DUH. If your application is fine using binary data transfer, then USE it. HOWEVER, many applications that either have to A) communicate with other applications or B) have to deal with varying data sets benefit greatly from using XML. Anyone who has been programming for any length of time knows that while binary is more compact, it is less flexible and potentially more error prone. Want to add a new field in the middle of your data? Boy, you better not get your software versions mixed. Want to write an app that can do reasonably intelligent things with ANY data it receives? Binary is not the way to go. As with all things in life, use the tool for that which it was intended (vs some people's view that it is the end all be all of data representation).
3 - It's slow
Same as 2 above. If absolute performance is an issue, then by all means, use whatever representation gives you what you need. XML is about flexibility and standardization, NOT performance.
4 - It's complex
Well, as complex as you want to make it, and it does sometimes encourage more complexity than is really needed, but it doesn't FORCE you into it. If you want/need schemas, go for it. If you need the functionality but in a simpler form, then do that (unless of course you need to communicate with another system expecting a schema, but this is obvious). It's just like C++, you don't HAVE to use templates and multiple inheritance (hell, you don't even have to create classes if you don't want/need), you use the parts of the tool that are useful and provide benefit, you don't use them just because they're there.
So I don't see what all the brouhaha is about. It has its strengths, it has its weaknesses. As with anything relatively new, people are trying it in various places. Some of these places don't really fit, others do. I've designed apps that benefited greatly; for others I've dismissed XML entirely.
I work for a VAN (Value Added Network) which is basically a middleman for data. You send an electronic purchase order to us; the company you're ordering from gets it from us. The value we add is that we'll confirm you sent it and tell you they got it.
However, we charge by the kilocharacter of data you send and receive per month. So, for us, XML is awesome, because it increases the size of an ASCII-X12 or EDIFACT document by a factor of 5 to a lot more (usually somewhere around 15-20 I think).
X12 and EDIFACT are standards for business document exchange that have been around for a while, but people are converting to XML because they think it's better (even though, usually, they just use the X12 or EDIFACT format, but with XML tags).
For example, a line item record may go from something like this:
LIN:0001
to something like this:
<LIN_GROUP>
<LIN>
<LIN_01>0001</LIN_01>
</LIN>
</LIN_GROUP>
It's not always that bad, but it can also be much worse. (Imagine replacing each instance of "LIN" above with "Line_Item" and "LIN_01" with "Line_Item_Number".) (And why won't that semi-colon after the LIN_01 end tag go away?)
so-- for us, XML doesn't suck-- it increases our revenue. For our clients, it sucks, because it increases their monthly bill.
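If you want to put a number on the line item example above, a quick sketch like this will do it. It only counts characters in the two snippets as shown, with line breaks and indentation left out; the `expansion_factor` helper is made up for the example and has nothing to do with how billing is really calculated.

```python
# The compact record and its XML equivalent, copied from the example above.
EDI_RECORD = "LIN:0001"
XML_RECORD = (
    "<LIN_GROUP>"
    "<LIN>"
    "<LIN_01>0001</LIN_01>"
    "</LIN>"
    "</LIN_GROUP>"
)

def expansion_factor(compact, verbose):
    """How many times longer the verbose form is, by character count."""
    return len(verbose) / len(compact)

factor = expansion_factor(EDI_RECORD, XML_RECORD)
print(f"{len(EDI_RECORD)} chars -> {len(XML_RECORD)} chars, "
      f"about {factor:.1f}x")
```

Longer tag names or pretty-printing whitespace would push the factor up further.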
Two words:
Human-readable.
As a programmer, this is the most useful property a data stream can take on. Why? Debugging. The reasoning here is twofold:
1. Non-parallel development of opposite ends of the data stream:
It's quite a challenge to develop the code which produces the data and the code which uses the data at the same time. If it doesn't work, you don't know where the problem is. With a human-readable format, you can simply pipe the data in or out of the app directly from a text file, and verify that it's correct yourself.
2. Debugging:
Something of an extension of the previous, if you have two bits of code communicating through XML, you can log the bad transmission and read it yourself to find out if the bug is in transmission or reception.
Now, I won't pretend that XML is the only human-readable data-structuring format, but it has a lot of nice advantages over the others, each of which is covered in the article. XML makes apps a pain to develop, but a breeze to debug--and the debugging is far more important!
-Amalcon
Are you smoking crack? I hate Microsoft as much as the next guy, but have you seen
Granted, MS hasn't backported everything to XML (think we'll ever see an XML registry?) but everything going forward has XML tattooed all over it. I happen to love XML, but if anything Microsoft tends toward the zealous side.
Ppppppht! *sprays water all over monitor* Microsoft's not "implementing it?" What in the heck do you mean by that? Have you taken a look at anything in the .NET suite lately? The entire system is built on XML. The solution files, project files, assembly manifests, application configuration files, setup binding files - they're all XML! Visual Studio .NET is built extensively on XML, and the .NET API includes some very intuitive and powerful classes for reading, manipulating, and building XML documents. I suggest you do at least a cursory investigation before spouting something so outrageously inaccurate next time.
Like woodworking? Build your own picture frames.
... Like most folks here, we've successfully used it in several situations, across different languages (Java, Perl, ASP) and different purposes (configuration, data transfer, web page generation, small online data storage, etc). It's da bomb.
XSL/XSLT on the other hand can be a pain to use in anything other than trivial transforms, in my unschooled opinion. The concept of recursive processing is great, but the math/logic syntax available is byzantine (eg "variable" is really a constant).
*sigh* I know this will get modded offtopic, but seriously... anyone agree with me, or do you actually like writing transform logic and processing in XSL? Please comment.
In the last five years, XML has - for instance - completely revolutionized the way my company writes software. We use code generators that mungle XML definitions into templates (imagine PHP controlling the generation not of HTML but of C or... PHP, and using XML to specify the abstract model in question).
We don't need schemas, stylesheets, xpaths,... just simple XML. And yet we can write very rich code in XML instead of in native code. Today we're producing about 25 lines of final code for 1 line of XML, and we're pushing this up all the time. My current project generates workflow engines from XML definitions, building a 10k workflow application from a single 500-line XML file.
My point is that XML is not just a handy way to store data. It is a meta language, able to formally define any concept, no matter how abstract. This is an incredible but subtle thing. The power comes not from XML technology itself, which is really very, very simple once you ignore the W3C fluff. The power comes from the freedom that XML technology gives you, namely the ability to abstract your problem to as high a level as your mind can take it, and to solve it at that level.
This is difficult, and takes time, but as the XML space settles down it will become clear that this is the real value of the technology.
The 'con' arguments all appear to be related to people trying to use XML in the wrong place, for the wrong thing, or to replace existing abstractions that work perfectly well.
Sig for sale or rent. One previous user. Inquire within.
Most of his (excellent) points have to do with exchanging data between applications (with long-term storage being essentially a special case of that). And he's right -- for those, XML is a huge win, and we should all bow down and worship at its feet.
However, because XML is such a huge buzzword now, people are proposing (or insisting on) using it as a format at the heart of complicated applications. Where anyone would have said 'Use a database' a few years ago.
In doing so, people are losing sight of the essential beauty of the relational data model. With an RDBMS, you, the programmer, have tremendous flexibility about *how* you view your data. This is a huge win inside of an application. XML forces you to commit to one specific view of your data. Yes, if that data needs to live forever and yes, if that data needs to get sent to someone else, then by all means, store it in an XML file. But if you need to *do* something with that data, you're going to be much happier with a relational db.
-Dan
I have written a truly remarkable operating system which this sig is too small to contain.
I work for a publishing services firm that is focusing on XML-based production of print and online materials, ranging from books to scientific journals to grade-school testing applications.
Simply put, XML is the best tool available for storing content to be databased, searched, rendered in multiple formats and broken apart and reconstituted into custom documents. XML also lends itself nicely to the representation of complex mathematics using MathML. Because of this, we've based many of our production processes on XML.
One particular journal we produce is a heavily mathematical, 250 page weekly scientific journal. This journal is produced in both print and online forms, as well as being databased by the publisher. Using tools such as Arbortext Epic (www.arbortext.com) for content editing and Advent 3B2 (www.advent3b2.com) for semi-unattended formatting we are able to produce the journal with a staff of only 10 people. A year ago, it took twice as many people and the end product was not nearly as flexible. In this application, XML rocks.
However, using XML in every application imaginable without considering whether or not it's the appropriate tool can be quite foolish. A hammer is great for pounding on things, but is pretty worthless in nearly every other application. A lot of the frustration felt by coders implementing XML solutions is due to the fact that it may not be the best tool for the job.
That said, the challenge stems from MV-fields. Those nifty things in PICK which give you the power of keeping associated fields within one table, with as many associations as you like. (for good or for bad, bad usually when it's been abused or good housekeeping neglected.) Piling MV stuff into CSV is just plain icky. Normalizing it first is also icky. However XML may offer a simple, elegant way of keeping it all together in the shape it existed in (which may be important down the road if someone has to produce a report from it: auditors, second guessers, or a55-covering because some account didn't have the right amount of debits or credits for years and the difference needs to be found.)
I'm off to explore XML more fully. There's probably yet-another O'Reilly book in my future...
A feeling of having made the same mistake before: Deja Foobar
Is it the best? Probably not. But it's undeniably an effective lingua franca. A human can easily create, edit, and manage it dynamically - you want a new tag, you just do it.
Then, it's also as easy on the software side to reflect those changes. The fashionable arguments people use against it (why is it so fashionable to bash anything that happens to be a buzzword?) are non sequiturs in terms of what XML is intended for.
I use it, hell I probably overuse it. It's so damn easy to parse that I don't want to waste time building a custom format just to save that extra 1K of space or 1/100th of a second.
if you were trying to convey the fact that MS has embraced and extended the fsck out of XML, thus totally destroying it and not properly implementing it, then yes, I would agree...
Micro$oft sure has some balls extending the "eXtensible Markup Language"...
But bureaucrats being what they are (and bureaucrats being in charge of environmental agencies), they've been told that XML is a GOOD THING, and want to force everything into that mold. And it doesn't fit!
Call it the "law of the instrument," as someone (Poul Anderson, I think) put it:
That's XML, to a tee!
"My opinions are my own, and I've got *lots* of them!"