Slashdot Mirror


XML in a Nutshell

The indefatigable chromatic wrote this review of what sounds like another solid offering from the hard workers at O'Reilly & Associates. If you're in the market for dead-tree references to XML, it probably belongs on your list of candidates.

XML in a Nutshell
  author: Elliotte Rusty Harold & W. Scott Means
  pages: 480
  publisher: O'Reilly & Associates
  rating: 8.5
  reviewer: chromatic
  ISBN: 0-596-00058-8
  summary: A solid and useful reference for XML developers.

The Scoop

While one of the original goals of XML was to create a specification simple enough that a computer science student could produce a working parser in a week, a few new developments have complicated things slightly. The sea of W3C-recommended acronyms includes namespaces, XPath, XSL, XPointers, schemas, and dozens of specific XML applications. Adopting the simple rules of well-formed data helps, but the quickly-growing stable of related technologies is enough to make the sturdiest information architect weep. The specifications aren't as easy to read as, say, the latest Terry Pratchett novel, either.

XML in a Nutshell covers just the most important concepts. Cleanly written, it walks through the XML aspects likely to be used in most projects. As it assumes existing familiarity with the subjects, it does not spend much time in tutorial mode. Instead, these are the guts of the subjects, arranged nicely in dissection jars.

The first section covers XML basics. This includes the ubiquitous grove of angle brackets, the semantic intent and implication, a good chapter on DTDs, as well as internationalization concerns. The short discussion of namespaces is the clearest explanation this author has yet encountered.

Part two delves further into the reasons for using XML, exploring documents that use the structure to explain semantic relationships. DocBook and XHTML appear as extended examples. Further, it explores the assistive technologies of XSL, XPath, XLinks, and XPointers. Again, the discussions of XSL and XPath compare very favorably to longer works intended as tutorials. A brief examination of CSS and XSL Formatting Objects rounds out the section.
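As a rough illustration of the kind of path expression XPath provides, here is a minimal Python sketch using the standard library's xml.etree.ElementTree, which supports only a subset of XPath. The sample document and its element names are made up for illustration and do not come from the book.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; names and values are illustrative only.
DOC = """<library>
  <book year="2001"><title>First Title</title></book>
  <book year="1999"><title>Second Title</title></book>
</library>"""

root = ET.fromstring(DOC)

# Path with an attribute predicate: titles of books whose year is 2001.
titles_2001 = [t.text for t in root.findall("./book[@year='2001']/title")]

# Descendant search: every <title> element anywhere below the root.
all_titles = [t.text for t in root.findall(".//title")]

print(titles_2001)
print(all_titles)
```

A full XPath 1.0 engine offers far more (axes, functions, arithmetic) than this subset, so treat the sketch as a taste of the syntax rather than a tour of the language.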

Part three explores the use of XML as a data transport. In this section, programming languages come into play. There's a strong hint of Java in the air, though most of the discussion follows a language-neutral path. Both the DOM and SAX parsing models have a dedicated chapter. They're short, but the essential pieces are described simply and effectively.
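To make the difference between the two parsing models concrete, here is a small Python sketch (the book itself leans toward Java). It reads the same made-up document twice: once through a DOM tree built in memory with xml.dom.minidom, and once through SAX callbacks with xml.sax. The document, the `SkuHandler` class, and the variable names are assumptions for illustration only.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document.
DOC = "<order><item sku='a1'>widget</item><item sku='b2'>gadget</item></order>"

# DOM: parse the whole document into a tree, then query the tree.
dom = minidom.parseString(DOC)
dom_skus = [el.getAttribute("sku") for el in dom.getElementsByTagName("item")]


# SAX: the parser streams through the document and calls back on each event.
class SkuHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.skus = []

    def startElement(self, name, attrs):
        # Called once for every opening tag, in document order.
        if name == "item":
            self.skus.append(attrs.getValue("sku"))


handler = SkuHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_skus = handler.skus

print(dom_skus, sax_skus)
```

The DOM version is shorter to write because the whole tree is available at once; the SAX version never holds more than the current event, which is usually the reason to pick it for large inputs.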

The final section makes or breaks the book. Luckily, XML in a Nutshell won't have much chance to gather dust. The two-hundred-page reference section includes the most useful information. There's an annotated copy of the XML 1.0 Reference, arranged logically. The XSL reference, in particular, is quite good. DOM and SAX programmers will also enjoy their respective chapters. Finally, it's nice to have a large set of printed character tables handy.

What's to Consider

The parsing examples don't go much beyond DOM or SAX, and there's more than a strong Java flavor. (Of course, the models are very similar in most modern languages.) As well, some of the class interfaces in the SAX reference are hard to read. This is probably due to the complexity of the information instead of any editorial decision. There's also little discussion of actual XML applications. Instead, the book covers the principles behind perhaps 90% of XML usage. Again, this is not a complaint, just a clarification of the intended audience.

The Summary

The value of XML in a Nutshell should be readily apparent to XML developers. The material is well-organized and concise. It's a quintessential Nutshell book, upholding a tradition of utility and quality. Readers who've already been exposed to the presented material will likely keep this book close at hand.

Table of Contents
  1. XML Concepts
    1. Introducing XML
    2. XML Fundamentals
    3. Document Type Definitions
    4. Namespaces
    5. Internationalization
  2. Narrative-Centric Documents
    1. XML as a Document Format
    2. XML on the Web
    3. XSL Transformations
    4. XPath
    5. XLinks
    6. XPointers
    7. Cascading Stylesheets (CSS)
    8. XSL Formatting Objects (XSL-FO)
  3. Data-Centric Documents
    1. XML as a Data Format
    2. Programming Models
    3. Document Object Model (DOM)
    4. SAX
  4. Reference
    1. XML 1.0 Reference
    2. XPath Reference
    3. XSLT Reference
    4. DOM Reference
    5. SAX Reference
    6. Character Sets

You can purchase this book at Fatbrain.

40 of 122 comments

  1. <POST TYPE="FIRST"> by andy@petdance.com · · Score: 5, Funny


    I'm sorry. Really.
    </POST>

    1. Re:<POST TYPE="FIRST"> by jfunk · · Score: 2



      <MODERATION SCORE="-1">troll</MODERATION>

      <REPLY TYPE="response to troll">How can you say that Windows NT is better at running Broadcast 2000 than Linux? It doesn't even run under Windows! RTFM!</REPLY>

      <EXPRESSION TYPE="angry">Damn lameness filter!</EXPRESSION>

  2. XML In a Nutshell: by rkischuk · · Score: 5, Funny

    Take information you want to store and sandwich it between <{name}> and </{name}> where {name} describes the information in between. Mimic the structure of the data, and sprinkle in <{name} otherData="{neatStuff}"> every once in a while. Congratulations, that's XML.

    --
    Seen any BadMarketing lately?
    1. Re:XML In a Nutshell: by Foggy+Tristan · · Score: 3, Funny

      As opposed to the obvious joke...

      <NUTSHELL VERSION="1">XML</NUTSHELL>

      --
      Beware typoes.
  3. Great Book... by DA_MAN_DA_MYTH · · Score: 2, Insightful

    The only problem is that, while I learned a lot of the concepts, I usually learn a lot faster with code examples. The SAX and DOM areas have a little bit of code, but do not go into huge parsing examples. (Maybe I read it wrong...) Good book; O'Reilly usually doesn't put out bad ones. Hopefully there will be a Java / XML Cookbook. (I know there already is a Java Cookbook.) I love those...

    --
    "It takes many nails to build a crib, but one screw to fill it."
    1. Re:Great Book... by Wiggin · · Score: 2, Informative

      there already is. its title is "Java and XML". It is an O'Reilly book. It can be found here at fatbrain.

      --

      "I don't need a compass to tell me which way the wind shines." - Mr. Furious, Mystery Men
    2. Re:Great Book... by Wiggin · · Score: 2, Informative

      Sorry to reply to my own post, but elsewhere in the comments i was made aware that a second edition of this book exists, and can be found here.

      --

      "I don't need a compass to tell me which way the wind shines." - Mr. Furious, Mystery Men
  4. Re:XML is not likely to succeed by TechnoVooDooDaddy · · Score: 3, Insightful

    *sigh* xml is NOT JUST A WEB TECHNOLOGY...

    think of XML as the ultimate replacement for the comma delimited file.. it's delimited data, that's all.. has a lot of extensions hung on it, lots of neat features, handles hierarchical data pretty well, but it's JUST A MARKUP LANGUAGE..

    damn.. ok, i'm done now
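The comma-delimited comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CSV content and the `rows`/`row` element names are invented here, and the column names happen to be valid XML element names.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical comma-delimited input.
CSV_TEXT = "name,qty\nwidget,3\ngadget,5\n"

# Each CSV record becomes a <row>; each column becomes a child element
# named after the column header.
root = ET.Element("rows")
for record in csv.DictReader(io.StringIO(CSV_TEXT)):
    row = ET.SubElement(root, "row")
    for field, value in record.items():
        ET.SubElement(row, field).text = value

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Unlike the flat file, the XML form names every field in place and could nest further rows inside a row, which is the "hierarchical data" point made above.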

  5. No XML Schema, unfortunately by Anonymous Coward · · Score: 2, Interesting

    Unfortunately the book doesn't cover the successor to DTDs: XML Schema.

    Some people are under the misapprehension that XML's role is as the successor to HTML; that's a very limited viewpoint. Far more important and interesting is the role of XML as a language and host independent way of specifying data, particularly with respect to relational databases, and to type in conventional languages.

  6. Online XML references? by L-Wave · · Score: 2, Interesting

    Does anyone have any *good* links to online XML references? Whenever I look, all I find are things like "What is XML? ..It's not HTML"

    --
    I SURVIVED THE GREAT SLASHDOT BLACKOUT OF 2002!
    1. Re:Online XML references? by Matts · · Score: 3, Informative

      Try Zvon. http://www.zvon.org. They are a great site that is regularly updated.

      PS: I was a tech editor for XML in a Nutshell, so it's really cool to see it reviewed here :-)

      --

      Matt. Want XML + Apache + Stylesheets? Get AxKit.
    2. Re:Online XML references? by mir · · Score: 2, Informative

      And of course the basic reference is the annotated specification. The spec is actually quite simple (and short!) and the annotations are a great way to get the extra details that you usually can't get unless you sit in the working groups.

      It is really a shame that the rest of the XML-related specs (XSLT, DOM...) have forgotten one of the basic design goals of the XML spec: simplicity!

      --
      Look, that's why there's rules, understand? So that you think before you break 'em. (Terry Pratchett)
  7. RE:XML is not likely to succeed by pubjames · · Score: 4, Informative

    XML is not likely to succeed

    We had dumb comments like this last time XML was discussed here.

    Let me make this clear now, before we get too many more comments like this. HTML is a formatting language for displaying information in web browsers. XML is a data storage toolkit, a configurable vehicle for any kind of information. It is completely different from HTML - the majority of uses for XML have nothing to do with displaying information in a browser.

    XML is an extremely important standard and I urge everyone to learn it.

    And please, don't make comments on Slashdot about technologies you don't know much about.

  8. XML !=HTML by digital_freedom · · Score: 5, Informative

    XML is not going to replace HTML and that's great because XML is better suited to data than display.

    I have used XML on several projects not to send to Browsers to display, but to transfer data between disparate systems. Finally there is a way that two computers can exchange data & meta data without worrying about memory use, big/little endian, EDI formats, and character positions. XML is great in that almost everyone agrees to use it to transfer information. HTML is great for formatting display to a degree (PostScript people please don't flame me! ;) ). I have worked with EDI formats before and it is a pain in the butt to set up message positions for all of your data and to work with nested lists of information. XML makes that so much easier and lets you use DTDs to enforce stuff. I also like the fact that XML was made to be read by a human being. We can actually look at the data file and tell what a field is by looking at the tag. This is why XML is going to be ubiquitous.

    Don't expect it to be a browser language, it's just data. With nicely structured data you can use that to generate HTML, WML, anything...

    The future of data transfer looks bright.

    1. Re:XML !=HTML by digital_freedom · · Score: 2, Informative

      Here's some links to some good info & tutorials on XML

      W3C School -- excellent
      Anti-christ XML school -- MSDN site
      Sun's Java/XML school
      Crash Course in XML

      Hope these help!

    2. Re:XML !=HTML by jfunk · · Score: 2

      Right on!

      There are still people hung up on the perceived XML == ++HTML thing.

      I realised the importance as you have, and use it *constantly*. I use it for all stored files, data interchange, and I even stick XML-RPC into everything now.

      While it still is a format, I realised it was better to think of it as a protocol. It, for some reason, made more sense to me.
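For readers who have not seen XML used as a wire format, here is a minimal sketch of the XML-RPC case mentioned above, using Python's standard xmlrpc.client module to encode and decode a call without any network traffic. The method name `calc.add` and its arguments are hypothetical.

```python
import xmlrpc.client

# Encode a call to a hypothetical remote method as an XML-RPC request body.
payload = xmlrpc.client.dumps((2, 3), methodname="calc.add")
print(payload)

# Any XML-RPC implementation, in any language, can decode the same payload.
params, method = xmlrpc.client.loads(payload)
print(method, params)
```

The payload is plain XML text, so the two ends need to agree only on the XML-RPC conventions, not on byte order or memory layout.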

  9. Re:XML is not likely to succeed by nachoman · · Score: 2, Insightful

    XML != just a web technology...

    XML is much more than a technology for the web. Many applications are written today using XML for config files, data transmission protocols, and many other things that keep data easy to read for a user in the middle.

    HTML has a set syntax for creating web documents. With XML you can specify your syntax depending on what you are doing. XML is a buzzword that marketing people drool over, but the reason is that it's a powerful technology for standardized representation of data.

    On another note, to say that you don't expect any more advances in web technologies is utterly ridiculous... Of course there will be advances, just as with every other computer technology over the years.

  10. Re:XML is not likely to succeed by Anonymous Coward · · Score: 2, Interesting

    You're linking XML too tightly with web mark-up languages.

    XML allows mini-development languages to be made. For instance, an installation tool can offer XML objects for various actions (query-package, remove-package, install-package) and offer attributes for each (package-name, extra-args, show-statusbar, show-hourglass, show-drummingfingers, etc.). Using simple tags, these objects can be assembled into sophisticated installation apps with very little coding (or deep knowledge of C/C++/Java/C#/whatever). That's just one example.

    Creating super-high-level languages that allow non-techies to make sophisticated graphic apps is a Good Thing(TM), and XML can make it all possible. I think that's pretty cool.
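A minimal sketch of the installer idea above, in Python. The element and attribute names (`query-package`, `remove-package`, `install-package`, `package-name`, `show-statusbar`) come from the comment; the `install-script` root element, the package names, and the handler functions are assumptions made up for this illustration, and the handlers only describe what they would do.

```python
import xml.etree.ElementTree as ET

# Hypothetical installation script written by a non-programmer.
SCRIPT = """<install-script>
  <query-package package-name="libfoo"/>
  <remove-package package-name="foo-old"/>
  <install-package package-name="foo" show-statusbar="yes"/>
</install-script>"""


def query_package(attrs):
    return "query " + attrs["package-name"]


def remove_package(attrs):
    return "remove " + attrs["package-name"]


def install_package(attrs):
    suffix = " (with status bar)" if attrs.get("show-statusbar") == "yes" else ""
    return "install " + attrs["package-name"] + suffix


# Dispatch table: one handler per element name in the mini-language.
ACTIONS = {
    "query-package": query_package,
    "remove-package": remove_package,
    "install-package": install_package,
}


def run(script_text):
    """Walk the script in document order and run each action's handler."""
    steps = []
    for action in ET.fromstring(script_text):
        handler = ACTIONS.get(action.tag)
        if handler is None:
            raise ValueError("unknown action: " + action.tag)
        steps.append(handler(action.attrib))
    return steps


steps = run(SCRIPT)
print(steps)
```

All the parsing is handled by a stock XML parser, so the tool author writes only the dispatch table and the handlers.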

  11. Read between the links... by rkischuk · · Score: 2, Informative

    You can purchase this book at Fatbrain.

    The link:

    http://www1.fatbrain.com/asp/bookinfo/bookinfo.asp?theisbn=0596000588&from=MJF138

    Not a bad idea - using a slashdot posting to drive sales through a referral link. I'll be back later - I'm off to find some books to review...

    --
    Seen any BadMarketing lately?
    1. Re:Read between the links... by tmark · · Score: 2

      As someone else called it, this is a racket: most every Slashdot review does this. Why do you think so few of the books reviewed here get generally positive reviews ? If I really feel like buying a book recommended by Slashdot, or even finding out a bit more, I type in the URL to the main site myself.

    2. Re:Read between the links... by rkischuk · · Score: 2
      As someone else called it, this is a racket: most every Slashdot review does this.


      I'm not saying it's a racket - they have every right to get some cash for the massive amounts of time/cash/bandwidth they put into letting us use this site for free. I just found it amusing.

      --
      Seen any BadMarketing lately?
    3. Re:Read between the links... by chromatic · · Score: 3, Informative

      Just to be clear, this is a hobby for me. I receive no financial remuneration for book reviews. That's right -- no money from OSDN, no money from referral links. I've never even joined any sort of affiliate program. Hemos (and others) have sent me free review copies, though I've also purchased books on my own to review.

    4. Re:Read between the links... by spudnic · · Score: 2

      Why would you do this? Does it cost you any more to purchase the book when you give someone credit for it? Then why bother?

      Why not let someone make a bit of money off of it. You're just being petty.

      --
      load "linux",8,1
  12. XML is Lisp. by DGolden · · Score: 5, Funny

    Take LISP, make the syntax twice as annoying, and hey presto, XML!

    XML is just an annoyingly verbose way of representing s-expressions, data structures that lisp was designed around.

    So much so, in fact, that it's possible to do a 1:1 mapping of XML into Scheme - see this site for the most sensible way of processing XML - translate it into the equivalent scheme representation.

    This allows you to use all the LISPy tricks in the book to munge your XML data.

    --
    Choice of masters is not freedom.
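The XML-to-s-expression mapping described above can be sketched in Python rather than Scheme. This is not the code from the linked site; the `(@ ...)` attribute list is loosely modeled on SXML-style notation, and the function names `to_sexp` and `render` are made up for this illustration.

```python
import json
import xml.etree.ElementTree as ET


def to_sexp(el):
    """Turn an element into a nested list: [tag, attributes?, children...]."""
    node = [el.tag]
    if el.attrib:
        node.append(["@"] + [[k, v] for k, v in sorted(el.attrib.items())])
    if el.text and el.text.strip():
        node.append(el.text.strip())
    for child in el:
        node.append(to_sexp(child))
        # Text that follows a child element still belongs to the parent.
        if child.tail and child.tail.strip():
            node.append(child.tail.strip())
    return node


def render(node):
    """Print the nested list with parentheses; strings are quoted data."""
    if isinstance(node, str):
        return json.dumps(node)
    head, *rest = node
    return "(" + " ".join([head] + [render(n) for n in rest]) + ")"


sexp = to_sexp(ET.fromstring('<NUTSHELL VERSION="1">XML</NUTSHELL>'))
print(render(sexp))
```

This sketch drops whitespace-only text, comments, and processing instructions, so it is not a lossless round trip.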
  13. Java & XML, 2nd Edition by wangi · · Score: 2, Informative

    On a related note - O'Reilly's 'Java & XML' book by Brett McLaughlin was eventually released this week after sliding from its original July release date.

  14. A very helpful book by sben · · Score: 4, Informative

    Highly useful, and highly recommended.

    When I was between jobs earlier this year, I decided to learn XML, and bought this book after perusing several others in the bookstore. I'd had a vague introduction to it at my previous job, and understood the basic ideas behind it. The book gave me a thorough understanding, and I was able to talk about it intelligently (and correctly) at subsequent job interviews. I now work with it on a nearly-daily basis, and the book is a big source of my knowledge.

  15. XML isn't the problem. by Genom · · Score: 4, Insightful

    As someone already said, XML is the ultimate replacement for the comma-delimited file. For the purposes of storing human readable/modifiable data, it's great, and does fill many of the roles a comma separated file used to fill. XML itself is pretty darned easy to pick up.

    That's not the problem.

    The problem is with the description technologies - most of which just add a layer of abstraction to the XML data, and try to pass a secondary version of the data back to an HTML template.

    That's all well and good - but quite frankly, the current incarnation of XSL stinks. It's tough to comprehend, easy to butcher, and half the time doesn't make sense.

    Much easier (and more useful, I would think) are the parsers which transform an XML document into a data structure you can use in an existing language like Perl or PHP (for the web), or C, or whatever you want. Once you're in a native data format, you're set, and can manipulate the data just as you normally would.

    That's the way to leverage the strength of XML. Ditch XSL for now, until it can be made clearer - and use some existing backend technology to format the data once it's in a data structure.

    My 2 cents, anyway =)

    1. Re:XML isn't the problem. by jfunk · · Score: 2

      You're right. XSL sucks hard.

      Use your fave language and load the XML into your own data structure.

      I did a project using DOM and, while that's all well and good for C and Java (I recommend it highly for those languages), I was using Python and was spoiled by the way Python works with regards to large, complicated data structures, which is, quite well.

      Later, I found this article about a module called xml_objectify, which transforms XML into a data structure that Python people (and probably LISP and even Perl people as well) would feel more comfortable with. Remember that we could care less about index numbers half the time. :-)*

      Whether you use Python or not, I highly recommend the article for its discussion on the topic of converting XML into complex data structures in your fave language.
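The "load it into a native data structure" approach from this thread can be sketched with Python's standard library alone. This is not xml_objectify's API; `to_native` is a hypothetical helper, the sample document is invented, and the sketch ignores mixed content and assumes attribute names never collide with child element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration document.
DOC = """<config>
  <server host="a" port="80"/>
  <server host="b" port="81"/>
  <name>demo</name>
</config>"""


def to_native(el):
    """Convert an element into plain dicts, lists, and strings."""
    children = list(el)
    if not children and not el.attrib:
        # Leaf element without attributes: just its text.
        return (el.text or "").strip()
    out = dict(el.attrib)
    for child in children:
        # Repeated child tags are collected into a list under the tag name.
        out.setdefault(child.tag, []).append(to_native(child))
    return out


data = to_native(ET.fromstring(DOC))
print(data["server"][1]["port"])
print(data["name"][0])
```

Once the data is in this shape, ordinary language features (loops, comprehensions, key lookups) replace DOM navigation calls.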

  16. XML has *already* succeeded by tmoertel · · Score: 5, Insightful
    I think you misunderstand the main application of XML. It's not simply a better HTML. Rather it is a simple way to represent information richly, in a format that preserves the underlying meaning. (SGML does this, too, but is a much heavier representation.) With XML you can define document types and schemas that define the syntax of documents, and you can then associate your own semantics with the document types to capture the meaning of your information. Thus can you represent your information electronically without diluting or distorting its meaning. The "win" is that XML conforms to your information, not the other way around.

    All HTML documents, by contrast, are HTML documents. Does an H1 element represent a chapter title, a section title, a heading, or just a line of bold text separated from the rest? Who knows? The content and the presentation are mixed together in a one-size-fits-all syntax that forces you to throw away the underlying meaning of your information when you shoehorn it into HTML.

    For example, I'm working on a web site to help people affected by breast cancer. The main value of the site is the information it contains, so you can be darn sure that I'm preserving the information's meaning. I'm not using XML as a better HTML but rather as a rich medium that captures all of my information's value. Once captured, the information is easily "extruded" into HTML for web presentation, simple HTML for Palm and hand-held devices, and typeset pages in PDF for offline reading.

    Make no mistake about it, XML is already a winner.

  17. Java & XML by Anonymous Coward · · Score: 4, Interesting

    While everyone seems to agree that XML is important, a book simply about XML may not be as useful as a book with an explanation of XML and some examples of real life usage.

    Possibly a better book (also an O'Reilly title) is O'Reilly's Java & XML (ISBN: 0-596-00016-2 or EAN: 9780596000165). I have read this book and found it to be excellent. Although it is Java-centric, it discusses concepts that could be easily applied to other languages. The book has good coverage of XML as well as usage of SAX, DOM, and JDOM, and using XML with databases, as configuration files, and in wireless devices. It also covers XSL/T and focuses on Apache XML projects.

    A GOOD READ for anyone interested in using XML.

  18. XML/XSL Confusion by Kallahar · · Score: 2, Insightful

    I've had trouble with the implementation side of XML. While the concept behind XML is extremely simple, getting it to display is quite another matter. XSL chose some extremely hard to understand syntax for a data structure designed to be human-readable.

    Travis

  19. Schemas? by Malc · · Score: 2, Insightful

    The book review mentioned a chapter on DTDs, but what about schemas? Aren't schemas the way we're supposed to go? Without coverage of Schemas, I will stick with the ageing but excellent "Professional XML" book from the Wrox Press.

    1. Re:Schemas? by Matts · · Score: 2

      Disclaimer: I was one of the two tech editors for this book.

      We decided not to include schemas coverage because the Nutshell books cover not just a description of the technologies, but also best practices. Schemas best practices are only just becoming clear, as can be seen on the xml-dev mailing list. Along with that, Schemas were not yet ratified when the book went to tech review, so we could have only covered an old draft.

      Rest assured though, W3C Schemas (and if I can persuade Elliotte, RELAX too) will be covered in the second edition, which I believe is being worked on already.

      --

      Matt. Want XML + Apache + Stylesheets? Get AxKit.
  20. I'm Sorry But I don't Get It by robbyjo · · Score: 2, Interesting

    XML is constructed just like any other context-free language. Any language with a context-free grammar (C/C++, Java, Pascal, etc.) can easily be parsed in a functional language such as Scheme, LISP, ML, or OCaml. Because a context-free language is based on a recursive grammar, translating it into a functional language is pretty direct. Manipulating and constructing the AST are also very easy.

    The claim of a 1:1 mapping from XML to a functional-language representation is highly exaggerated. In ML, for example, one would have to build the table data structure, even though this can be easily done. There are still some idiosyncrasies that you have to handle too, although they are not as intricate as the ones in imperative languages like Java or C/C++.

    Mapping to an AST by itself does NOT yield the full usable extent of XML. XML itself is used to describe tuples of data. How can you flatten the AST out into records/structs/classes that are directly usable by the subsequent program? That's not that easy in a functional language either. Moreover, the resulting records are better suited to imperative languages than to functional ones.

    --
    Error 500: Internal sig error
  21. Re:Not Necessarily by sammy+baby · · Score: 2
    Well, I think XML is a generalization of HTML because of the repetition of HTML extension.
    That's not precisely correct. XML is a simplified subset of SGML, which means that XML is more like HTML's younger brother, or a cousin, than its descendant. It's probably accurate to say that XML is the more anal of the two, retaining more of the "no, there really is a right and a wrong way to do it" sense of SGML, but managing to avoid the unbelievable complexity.

    XHTML, on the other hand, is what happens when you marry HTML's document types to XML's rulebase. This is an exceedingly rare example of how inbreeding isn't necessarily a bad thing.

  22. XML is technology's Esperanto by Brad+Wilson · · Score: 2, Insightful

    The purpose of XML is not as an HTML replacement. Those who use XML to generate HTML are doing one moderately interesting thing with some powerful technologies. But the real power of XML is that everyone is speaking the same language.

    When you see technologies like SOAP and ebXML, you really start to understand the value of this common language. Don't judge XML as an HTML replacement.

  23. Re:XSL isn't the problem. by elBart0 · · Score: 3, Informative

    I spend most of every day working with XSL and XML, and continually have to listen to people complain how hard XSL is. It's not. Though it's a different method of writing code than some people are used to, most people I work with have no problem with it once they break out of the C-type syntax of coding. Once you comprehend the template concept of development, XSLT is actually rather easy.
    Don't get me wrong, there are limitations to the language, and hopefully, we'll see those limitations removed in 2.0.
    But, if you can make the conceptual jump in coding styles, it can be very effective.

    --
    09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
  24. XML doesn't need a Nutshell by dingbat_hp · · Score: 2, Informative

    XML is hard to learn, and easy to remember. Nutshell guides are best for complex lists of obscure settings in little-used config files. I have a bunch of similar Nutshell guides, and they see much hard and useful service.

    This book isn't a good tutorial (it isn't meant to be) and I see no need for a "handy quick reference" guide to the parts of XML that are covered here. It's not a bad book, but I see no real useful purpose to it.

    Sometimes I need to read the XML Spec. This is only ever for really obscure and bizarre minutiae, and in those cases I have to go back to the W3C original. Fortunately that's on-line and already on my desk in a well-thumbed paper copy. I've never felt the slightest need for an XML Nutshell.

    Omitting Schema is a real drawback. The Schema spec is one of the very few XML-related specs that's at all large and can't easily be memorised.

  25. Decent XSL Reference anyone? by thomis · · Score: 3, Informative
    I've accumulated a wide variety of links to resources that have 1 or 2 useful items... but I need the equivalent of an O'Reilly 'Definitive Guide' for XSL. Something that's heavy on Xpath, code examples and other red meat.
    I agree with the first post flamebait to an extent; XML is all well and good, nice way for my database guy to get me the goods for Web presentation, but I need to DO something with that data.
    The answer is XSL, but i've had to blunder around for what works. There isn't even a decent FAQ anywhere, that I know of. Suggestions anyone? Following is a list of links i've found useful; please don't send me to any of those...

    TIA

    http://www-106.ibm.com/developerworks/xml/

    http://www.ucc.ie/xml/

    http://www.vbxml.com/xsl/xsltref.asp

    http://www.xmlhack.com/

    http://www.xml.com/index.csp

    http://www.xmlpitstop.com/ --very good!

    http://www.biglist.com/lists/xsl-list/archives/

    http://www.xslt.com/

    Enjoy!

    --
    ceci n'est pas un 'sig'
    1. Re:Decent XSL Reference anyone? by chromatic · · Score: 2, Informative

      O'Reilly have just published an XSLT book. I've not read it yet, but will hopefully pick it up soon. It does include a chapter and an appendix on XPath.