Effective XML
Before I tell you what's inside though, let me tell you what you won't find in these pages. Primarily you need to know that this book does not teach XML. I know a lot of books say that, yet still include an introduction or appendix that covers the basics, but this isn't one of them. You're expected to know XML from page one. Even syntax is only covered from a proper usage angle. Personally, I appreciated this. It always bothers me when an obvious non-beginner's book starts off by wasting a chapter on things I should already know. You just need to be aware when you buy that you won't learn XML here. Knowledge of namespaces, DTDs, the W3C's Schema Language, XSLT, and more isn't strictly required to get something out of this book, but it certainly would help you get a lot more out of it.
What you will get here is coverage of fifty miscellaneous topics spread across four sections on "Syntax", "Structure", "Semantics", and "Implementation". In "Syntax", ten topics delve into the details of things like DTDs, entity references and the XML declaration itself. It may sound silly to dig deep into a single line of XML that simply declares the format, but I doubt you will think so after reading that topic. There's a lot going on in that line and you want to be in control of those decisions instead of just copying and pasting. Entity references are an even smaller chunk of XML output, but they too get illuminated by a rare insight on how and when they should be used, and for what. Did you know that it is possible to write a namespace savvy DTD? I do now and I learned that in this section as well.
The second section of the book covers "Structure", and to me it was the best part. This collection of seventeen topics is loaded with good advice about how to build an XML document that will be ideal for anyone who needs to work with it. Here you see how metadata should be stored in XML, get tips on embedding binary content, learn which schema language is better for which tasks, and finally understand rare XML constructs like processing instructions and exactly what they are for. Additionally, there's a lot of general advice on the right way to mark up content that's really worth its weight in gold. Just one example of what I learned here is that I underappreciate mixed content for great constructs like <name><given>John</given> <family>Doe</family>, <title>Ph.D.</title></name>. If you like that, you'll enjoy this whole section.
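To see why that mixed-content name is handy, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It is not taken from the book, and the variable names are my own; it only shows that the same markup gives you both the individual name parts and the display form exactly as written.

```python
import xml.etree.ElementTree as ET

# The name markup quoted in the review, as a standalone document.
doc = "<name><given>John</given> <family>Doe</family>, <title>Ph.D.</title></name>"
name = ET.fromstring(doc)

# Structured access: each part of the name is addressable on its own.
given = name.findtext("given")
family = name.findtext("family")
title = name.findtext("title")

# Mixed-content access: the text between the child elements (the space
# and the comma) is preserved, so the display form can be recovered.
display = "".join(name.itertext())

print(given, family, title)
print(display)
```

The punctuation lives in the document rather than in the program, so a consumer that only wants to print the name never has to guess how the parts are joined.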
Section three, "Semantics", deals primarily with parsers and their APIs. Again, you won't learn any APIs here. What's covered is their strengths and weaknesses and why you should choose a given API for a given task. SAX and DOM are the main focus of these ten topics, but there are other details sprinkled in, like XPath.
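For readers who have not used both styles, the trade-off can be sketched with Python's standard xml.sax and xml.dom.minidom modules. This is only an illustration of the streaming versus tree distinction; the sample document, the ItemCounter class, and the function names are assumptions of mine, not examples from the book.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, used only for this illustration.
DOC = b"<order><item>pen</item><item>ink</item><item>paper</item></order>"


class ItemCounter(xml.sax.ContentHandler):
    """Streaming style: react to events and keep only what you need."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.count += 1


def count_with_sax(data):
    # The parser pushes events at the handler; no tree is ever built.
    handler = ItemCounter()
    xml.sax.parseString(data, handler)
    return handler.count


def count_with_dom(data):
    # Tree style: the whole document is loaded into memory, then queried.
    tree = minidom.parseString(data)
    return len(tree.getElementsByTagName("item"))


print(count_with_sax(DOC), count_with_dom(DOC))
```

Both give the same answer here. The difference shows up with large inputs, where the SAX version never holds more than its counter, and with tasks that need random access, where the DOM version is far less work to write.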
The fourth and final section is all about "Implementation". The thirteen topics here address client-side XML styling, server-side transformations, signatures, encryption, compression, and more. My favorite topic here was a terrific coverage of Unicode and how it affects XML. All developers should know at least as much about Unicode as what's printed here and this is a fine source to learn it from.
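As a small illustration of why the encoding named in the XML declaration matters (my own sketch with Python's standard xml.etree.ElementTree, not an example from the book), the same word can be stored as different bytes and still parse to identical characters, provided the declaration tells the truth about the encoding.

```python
import xml.etree.ElementTree as ET

text = "café"  # sample word with one non-ASCII character

# The same content serialized under two different declared encodings.
utf8_doc = (
    f'<?xml version="1.0" encoding="UTF-8"?><word>{text}</word>'
).encode("utf-8")
latin1_doc = (
    f'<?xml version="1.0" encoding="ISO-8859-1"?><word>{text}</word>'
).encode("iso-8859-1")

# The parser reads the declaration and decodes the bytes accordingly.
utf8_word = ET.fromstring(utf8_doc).text
latin1_word = ET.fromstring(latin1_doc).text

print(utf8_doc != latin1_doc)    # the bytes on disk differ
print(utf8_word == latin1_word)  # the parsed characters agree
```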
One thing that really stands out in the whole text is that the author isn't afraid to cover the dark side of XML. He will tell you where the design process was less than perfect, which tools have little practical value, and some of the problems with where XML technologies are headed. This isn't complaining though. All of this is targeted at how it affects XML developers today. You learn what you can safely skip and what should be outright avoided. The author even tells you what XML is bad at and gives you advice about when you shouldn't use it. That's the mark of a man who knows his subject, if you ask me.
All told, I think the author failed to completely convince me his way is perfect on only 2 topics. That means I learned 48 expert XML tricks. Surely that's worth the cost of the book in time and money. This isn't the first XML book you need, but I think it is the second XML book everyone should read.
You can purchase Effective XML from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
I love the book, but once it encountered a humid day the binding fell apart. Anyone else have this experience?
Is that it's not a very machine-friendly language (more wordy than it ought to be; parsing of tags is not very efficient) and it's not a very human-friendly language (the human style is free-style, really). I don't think it's a very good universal data description language. Sorry that I had to go on a bit of a tangent...
One thing that really stands out in the whole text is that the author isn't afraid to cover the dark side of XML.
[Obligatory Star Wars joke]
____
~ |rip/\/\aster /\/\onkey
I want to say something funny about XML, but there is nothing.
-pyrrho
After seeing what can be done with simple javascript and XML, I'm wanting to get into this. Can someone point me to the best OSS way to do this (I can hear the groans now). I like Postgres but I don't see much in the way of getting it to spit out XML. I like documentation... MySQL? Am I missing something?
XML is all about loosely bound interfaces.
Get with the program.
<letter>
<salutation>Dear XML-Junkies</salutation>
<body>
I type all my business letters in <link href="http://www.google.com/?q=XML">XML</link>. Sometimes it can be a bit <link href="http://dictionary.reference.com/search?q=verbose">verbose</link>.
</body>
<signature>
<name><nickname>Letter</nickname></name>
</signature>
</letter>
XML seems cool to me. I like the thought of being able to design a schema to suit my personal needs. But when it comes time to make use of that schema and actually keep data in it, it seems to be useless, at least as far as an end user (non-programmer) is concerned.
Do I have the wrong impression?
I just bought the book a couple of days ago. It's a great one so far. It does not teach you XML, but for anyone who has even a little experience with XML, the book is still great. Just like me, you will pick it up really fast.
Bookpool has it for $28.50. Don't click the bn sponsored link (where it's a whopping $44.95).
/. gets a kickback from doing something dumb like clicking the link to overpriced merchandise.
PS, I don't work for Bookpool, I hate it when
If you like this book, don't forget to check out Scott Meyers' Effective C++ or Joshua Bloch's Effective Java. Both are great. I devoured Meyers' book when it first came out, and I was happy to see Bloch's book was similarly useful. There is also an Effective Perl book out, but I don't know how good it is -- it follows the same general format, but hasn't been updated since 1997. (Neither has the C++ book, but C++ hasn't changed that much since then.)
Eric
See your HTTP headers here
Sometimes, the most effective use of XML is to simply not use XML at all. XML is a wonderfully useful tool when applied correctly. It's architecture-independent and is a great way to communicate unstructured and/or hierarchical data.
Sometimes, though, your data can be simple enough that XML is overkill. Software developers need to make themselves aware of situations when they might be better served by a simple "flat file" of delimited data. In situations like this, using XML can amount to what I like to call "gratuitous complexity."
Always use the right tool for the job.
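A rough sketch of that point, using Python's standard csv and xml.etree.ElementTree modules. The inventory rows and element names are made up for illustration; nothing here comes from the book.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical flat records: every row has the same two fields.
rows = [("widget", "4"), ("gadget", "7")]

# Flat file: one line per record, fields separated by commas.
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerows(rows)
flat = buf.getvalue()

# The same records as XML: every value is wrapped in named tags.
root = ET.Element("inventory")
for name, qty in rows:
    item = ET.SubElement(root, "item")
    ET.SubElement(item, "name").text = name
    ET.SubElement(item, "quantity").text = qty
xml_text = ET.tostring(root, encoding="unicode")

print(flat)
print(xml_text)
print(len(flat), len(xml_text))
```

For uniform rows like these the markup adds size without adding information. The balance tips the other way as soon as records start to nest or vary in shape.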
$28.27 at overstock.com.
The essence of XML is this: the problem it solves is not hard, and it does not solve the problem well. - Phil Wadler
XML is not the end of our problems, it is the beginning of our problems. - ditto
Shortly after the release of XML, some folks, including some very important folks in W3C and its members, who had been big supporters of XML, actually got around to reading the spec, and discovered to their horror that they had an XML which included entities, DTDs, PIs, and assorted other baggage. - Tim Bray
When XMI came out, I had just been studying up on UML, and I thought "Cool! I'll print out the DTD so that I can look it over on the subway ride home!" When I saw how big the XMI DTD was, I decided not to print it out--I prefer not to spend that much time in the subway. - Robert DuCharme
XML was monocase until quite late in its design, when we ran across this ugliness. I had a Java-language processor called Lark - the world's first - and when XML went case-sensitive, I got a factor of three performance improvement, it was all being spent in toLowerCase(). - Tim Bray
XML-based technologies seem particularly susceptible to the "if we standardize it, everyone will use it" fallacy. - Simon St. Laurent
I'm more interested in using XML as a means for language-independent object persistence (not just cheesy .NET XmlSerializer class stuff either). How much coverage of such things is there in the book? I.e., creating an object in Java on one machine, persisting it and its state to an XML file, and recreating it on some other machine in C++ or C#. I'm tired of writing my own "protocols" to migrate running code from one app to another.
;)
You have obviously never looked into SOAP, which seems to be able to address every requirement you are describing.
But, not using SOAP is quite common on Slashdot.
Not sure if you were serious here or not, but this is necessary to disambiguate the following improperly formed XML:
<start> Now is the time for all good men to come to the aid of their <noun>country</noun></phrase>
which is either missing a "phrase" start tag or has mixed up the start & end tags... In a long XML document, the parser can give you a better hint where to look for the error.
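Here is roughly what that hint looks like in practice, as a minimal sketch with Python's standard xml.etree.ElementTree (my choice of parser; any conforming one reports something similar). The exact wording of the message depends on the underlying parser.

```python
import xml.etree.ElementTree as ET

# The improperly formed snippet from above: </phrase> closes a tag
# that was never opened, while <start> is never closed.
broken = (
    "<start> Now is the time for all good men to come to the aid "
    "of their <noun>country</noun></phrase>"
)

try:
    ET.fromstring(broken)
    error = None
except ET.ParseError as exc:
    error = exc

# The exception carries a message plus a (line, column) position,
# which is what makes the error findable in a long document.
print(error)
print(error.position)
```

With anonymous end-tags the parser could only notice a problem at the end of the document, when the counts fail to balance.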
Or you were kidding and I missed the joke, in which case I'm about to be called all sorts of impolite things... (I might even be referred to as Sean Penn).
Proud neuron in the Slashdot hivemind since 2002.
There are valid uses for XML. Just look at http://www.x-cp.org/
Ever try to debug deeply nested LISP in a plain vanilla text editor? Ever try to find exactly which closing parenthesis is missing where? That's why end-tags have names. It's pure human factors. Computers don't care about this. People do.
SGML (XML's precursor) did have minimized end-tags like </>. Experience proved this caused more pain than it alleviated. Hence the lack of minimized end-tags in XML.
ridiculing the verbosity of xml, on a web page.
Hmm, that's one I haven't been asked before.
I suspect what it offers is that you don't have to define and write your own BNF grammar, and then implement it in lex and yacc or similar tools.
Grammar design is non-trivial, especially if you need to consider issues like internationalization. Picking XML as the underlying format means you don't have to do this work yourself. Why reinvent the wheel?
Sometimes you do need something different, but a lot of alternative formats don't really have a good reason to exist. More often than not, custom parsers just come about because a programmer is more comfortable writing bad parsing code quickly than learning a new, more robust API in order to use someone else's parser.
After all, XUL and RDF together with JS, CSS and resource files - that's what makes Firefox tick.
You can't handle the truth.
There's a very real tension between making examples too trivial to be interesting and making them too long to be readable. I struggle with it in every book I write, and every other programming book author I know does so too. I've tried putting so-called real-world examples in books, and it's hopeless. It can't be done. There wouldn't be any space left for the explanatory text, nor would anyone put up with reading page after page of code.
Most importantly, while I tend to be writing about just one topic at a time, real world programs wander all over the map. I may be trying to explain how to use callbacks in SAX, but a realistic program also has to consider network latency, GUI design, error logging, numerical algorithms, internationalization, and a hundred other things that aren't on topic. Covering them all would obscure the subject I'm actually trying to explain. Some things you just have to leave for other books and other authors.
As an author, I try to strike the right balance between excessive simplicity and excessive length. Sometimes I hit it. Sometimes I don't. I actually think Effective XML hits it fairly well. In fact, this book was one of the toughest I ever had to write, precisely because it was so short that I couldn't spew pages like I did in Processing XML with Java (1100 pages) or the XML 1.1 Bible (1000 pages). I had to be really picky about how much code I included, and make sure that each example carried its weight, demonstrated just the point at hand, and nothing else.
By the way, the chapter with that specific example is online if anyone cares to see for themselves just what it is that makes names a more interesting and complex problem than "John Doe Ph.D" seems to be at first glance.
I give customers a specification showing how I would like data sent to me. They can use the specification to tell them how to store their data, because they can read it. They can check that their data matches the specification, because their machine can read it.
When I receive their data, I can check that it matches the specification, because my machine can read it. If there is something wrong with their data, I can point out where it's broken, because it's human-readable.
Writing specifications is easy. Writing generators and parsers is easy. The tools are ubiquitous. Generation and parsing are usually fast 'enough'. The standards are freely available. Complex data structures may be described. Data may be transformed using a common language based on XML itself.
Yes, I'd like it to be easier to write XML parsing tools. Yes, I'd like it to be easier to write tools which handle XML more efficiently. No, the two points above don't make XML the devil's data encapsulation.
Rik
When it comes to speed, XML sucks. It does provide incomparable interchange of data on a human- and machine-readable level. On the other hand, it would be nice to be able to select a faster standard when both ends of a transaction support it. XML would become the lowest common denominator.
... but yeah, you're right. Helps do away with the (ugh!) parenthesis matching crap in LISP, so actual people can edit it too, verbose as it may seem.
You can hold down the "B" button for continuous firing.
The review almost sold me on the fact that I could actually learn something from this book. Looking at the sample chapters here told me the truth