Perl & XML
The book starts out with a brief explanation of why XML and Perl are well-suited for each other. It then provides a teaser of things to come: an explanation of how to use the XML::Simple module. The first chapter concludes with some warnings and gotchas that seem a little premature, since the authors have not yet really explained XML. Fortunately, most of these gotchas are covered in context later in the book.
The second chapter provides a whirlwind overview of XML -- covering its structure, DTDs, schemas, and XSLT (transformation). The discussion of XML in general, its history, and the parts of an XML document is well done. It gives someone who is familiar with static HTML the background needed to understand the structure of an XML document and the vocabulary used to describe it. Unfortunately, the coverage of where XML begins to distinguish itself from HTML, namely with DTDs, the new replacement for DTDs called schemas, and the transformation language XSLT, is too brief. The authors gloss over these topics with little explanation and few examples. That said, there are other books that do provide more in-depth coverage of XML (this book only promises an introduction).
The next five chapters cover Perl modules designed to process XML, starting with simple parsers and writers. Only methods and syntax relating to XML processing are explained. Therefore, if you are considering reading this book, you should be fairly comfortable with Perl and object-oriented (OO) interfaces to CPAN modules (nearly all the modules discussed provide OO APIs). Again, there are other books and perldoc documentation that cover Perl and its OO features, so read them first if you are not familiar with OO Perl. If you are familiar with OO Perl, these chapters provide a good overview of the different ways XML can be processed (stream- and tree-based approaches), the advantages and disadvantages of each, and the Perl modules best suited for each approach. These chapters are the biggest strength of this book. The modules discussed in these chapters are by no means an exhaustive list of XML-related modules available from CPAN, nor do the explanations of each module cover everything the module does. These chapters do, however, provide the reader with enough information that she can begin to process XML documents intelligently and know where to turn when she needs more information.
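For readers who have not met the stream/tree distinction before, here is a minimal sketch of the two approaches. It is not taken from the book and uses Python's standard library rather than the Perl modules the book covers. The sample document and the names `TitleCollector`, `titles_by_stream`, and `titles_by_tree` are invented for illustration.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = (
    b"<library>"
    b"<book><title>Perl &amp; XML</title></book>"
    b"<book><title>Learning XML</title></book>"
    b"</library>"
)


class TitleCollector(xml.sax.ContentHandler):
    """Stream style: react to parser events as they arrive."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect the pieces.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def titles_by_stream(data):
    """Collect titles without ever holding the whole document as a tree."""
    handler = TitleCollector()
    xml.sax.parseString(data, handler)
    return handler.titles


def titles_by_tree(data):
    """Tree style: build the whole document in memory, then query it."""
    root = ET.fromstring(data)
    return [title.text for title in root.iter("title")]
```

The stream version needs more bookkeeping but keeps only the current state in memory. The tree version is shorter, at the cost of loading the entire document first. That is roughly the trade-off these chapters weigh.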
The next chapter, Chapter 8, covers XML tree iterators, XPath, XSLT, and XML::Twig. All of these topics are covered in a span of 16 pages (with only slightly over two pages dedicated to XSLT). Indeed, after reading the chapter, you may get the feeling that it was only included so the authors could cram more trite colloquialisms into the book. The short shrift given to these topics creates the impression, which is strengthened in the chapters that follow, that this book was rushed a bit to press.
Chapter 9 discusses applications of XML, including RSS and SOAP, and Chapter 10 is mostly example code. These chapters are intended to give you a feeling for what is possible without really giving you enough information to make it happen. The main problem with these chapters is the examples: they are long and the explanations are short. Thus, they are more useful as templates or a quick reference than for learning these topics in detail. Of course, the authors never promised you would be programming SOAP applications when you were done reading this book. And again, there are other books out there which discuss these topics in more detail. So the authors stay true to their promise throughout the book: they will introduce you to XML and tell you how to interact with XML using Perl, no more.
Personally, I found this book did, in general, give me enough information to get started using XML and pointed me where I needed to go to get more information. I am an experienced Perl programmer who is new to XML and comfortable with on-line documentation. This book seems to be written for people who fit this profile and who want to learn by doing (finding the answers to the "hard" questions as they arise). It does introduce a wide variety of XML-related topics and the Perl modules used to interact with them, which is what the authors promised to do in the preface. While it is by no means an authoritative text on Perl and XML, there is something to be said for keeping promises ...
Index
As with most first-edition books, the index was adequate but not complete. For example, XML::Twig, which has an entire section covering it, does not appear in the index at all.
Contents
Preface
- Perl and XML
  - Why Use Perl with XML?
  - XML Is Simple with XML::Simple
  - XML Processors
  - A Myriad of Modules
  - Keep in Mind ...
  - XML Gotchas
- An XML Recap
  - A Brief History of XML
  - Markup, Elements, and Structure
  - Namespaces
  - Spacing
  - Entities
  - Unicode, Character Sets, and Encodings
  - The XML Declaration
  - Processing Instructions and Other Markup
  - Free-Form XML and Well-Formed Documents
  - Declaring Elements and Attributes
  - Schemas
  - Transformations
- XML Basics: Reading and Writing
  - XML Parsers
  - XML::Parser
  - Stream-Based Versus Tree-Based Processing
  - Putting Parsers to Work
  - XML::LibXML
  - XML::XPath
  - Document Validation
  - XML::Writer
  - Character Sets and Encodings
- Event Streams
  - Working with Streams
  - Events and Handlers
  - The Parser as Commodity
  - Stream Applications
  - XML::PYX
  - XML::Parser
- SAX
  - SAX Event Handlers
  - DTD Handlers
  - External Entity Resolution
  - Drivers for Non-XML Sources
  - A Handler Base Class
  - XML::Handler::YAWriter as a Base Handler Class
  - XML::SAX: The Second Generation
- Tree Processing
  - XML Trees
  - XML::Simple
  - XML::Parser's Tree Mode
  - XML::SimpleObject
  - XML::TreeBuilder
  - XML::Grove
- DOM
  - DOM and Perl
  - DOM Class Interface Reference
  - XML::DOM
  - XML::LibXML
- Beyond Trees: XPath, XSLT, and More
  - Tree Climbers
  - XPath
  - XSLT
  - Optimized Tree Processing
- RSS, SOAP, and Other XML Applications
  - XML Modules
  - XML::RSS
  - XML Programming Tools
  - SOAP::Lite
- Coding Strategies
  - Perl and XML Namespaces
  - Subclassing
  - Converting XML to HTML with XSLT
  - A Comics Index
You may also want to check out Erik T. Ray's home page, Jason McIntosh's home page, or O'Reilly's page for the book.
I am a professional developer, working mostly with Perl. I work in the field of biology and bioinformatics, but have spent the last 8 years working as a web and database Internet developer. And, I own practically every O'Reilly Perl book ever published (not that I necessarily think they're all worth buying). So, now that you know where I'm coming from...
If you are preparing to do a serious amount of XML development, and you're in the process of determining a) which Perl XML modules on CPAN you want to use, and b) how to use them, and you don't have a whole lot of time to spend tracking down the sometimes-hard-to-find documentation on these modules, then buying this book is a no-brainer. It covers all the major XML modules and how to use them, and really helps you figure out when to use the different modules.
Even if you're not new to XML and Perl, this book would serve as an excellent refresher course on what XML tools are available out there for you... Maybe you haven't looked at your code in a while, or want to update it to use a newer module from CPAN? Or maybe you're looking for a better way to do it? Then this book would definitely help you out.
While a fan of O'Reilly books in general, I'll be the first to admit some of them are more useful than others. I highly recommend this book, though, as it's actually useful, comprehensive and very well presented. I find myself cracking it open all the time, especially as my utilization of XML has grown more complicated. It has definitely earned its place in my Aqua Perl book collection.
Do you even lift?
These aren't the 'roids you're looking for.
The book is a little sparse, though. It's about the same thickness as Using csh and tcsh, so don't expect more than an overview of anything. In fact, it might be a little small for US$35.00 (although Bookpool has it for US$21.50). Another small gripe was that it covered parsing XML in far greater detail than generating XML (which was my task at the time I bought the book). Admittedly, parsing XML is what most people tend to do and is far more difficult than creating new XML, but I thought a little more coverage was warranted.
If you are faced with doing something involving XML and you're not sure what software bits are up to the task, then this is a good place to find out where to start. You could wind up looking elsewhere if you need lots of nitty-gritty details, but getting off on the right foot is a hard enough task and might be worth the price of the book.
-B
Ash and Hickory, straight-grained and true, make excellent bludgeons, dandy for the cudgeling of vegetarians.
I personally didn't want a handholding book, as I've worked with XML in other languages, but rather something that cut through the confusion of all the different ways to do the same thing.
This little book was perfect for me, as it's a nice overview of what is out there and how to pick the right library for the job. Don't expect a complete enterprise application in this book - it's for programmers who already know Perl and the basics of XML and just need a jumpstart in using the libraries available.
No, Thursday's out. How about never - is never good for you?
You are correct it is a biased opinion, but inquiring minds want to know why.
You can't seriously be suggesting that because Java "already handles XML efficiently" developers should switch to it, or that suddenly it's the holy grail of languages that can use XML? At least not without backing up your statements. There's an enormous amount of work involved in dumping a language for another one. Witness the number of dot-coms who tried it when they bought another company, only to fail (whether trying to make everything "Java" or "Enterprise" is a contributing factor or not is a judgement call).
Perl has incredibly efficient libraries for processing XML. For example, XML::LibXSLT is faster than every Java XSLT module out there according to freely published benchmarks, so it's hard to see where you find your bias.
Matt. Want XML + Apache + Stylesheets? Get AxKit.
I found this book an excellent introduction for Perl programmers who want (or have) to start processing XML. It cuts through the long list of XML modules on CPAN (485 results!) and gives you the basic techniques and tools you can use.
XML is really not that difficult to deal with but it can be a little intimidating. "Perl & XML" is written in a simple and direct style that gives the reader enough information to start writing code, and pointers to find more specific information once they have chosen the tools they need.
Armed with this book, the Perl-XML FAQ, and Kip Hampton's column on XML.com, any Perl programmer can start working confidently with XML.
Look, that's why there's rules, understand? So that you think before you break 'em. (Terry Pratchett)
Comparisons between HTML and XML are useful for those who already know HTML, but there are sufficient problems with HTML that, if both are new to you, then learning XML first is a better route.
HTML allows various constructs that are not proper XML. Browsers will gladly accept unclosed meta, img, p, and br elements. Most HTML tools do not enforce the same requirements as an XML tool.
Worse, some browsers can't handle certain markup; <br/> chokes certain versions of Navigator.
If you care to learn XML for some practical purpose, try learning XHTML. You'll learn about validation and well-formedness, and acquire a useful skill.
http://www.w3.org/TR/xhtml1/
If you can get your pages online, you can check them using the W3C HTML validation service:
http://validator.w3.org/
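To make the unclosed-element point concrete, here is a small sketch using Python's standard-library XML parser (my choice of tool, not something from the book or the W3C). The helper name `is_well_formed` and the two sample fragments are made up for illustration. It checks well-formedness only, which is a weaker test than the validation the W3C service performs.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Hypothetical fragments: a browser would render both, but only the
# second closes its br element the way XML requires.
HTML_STYLE = "<p>First line<br>second line</p>"
XHTML_STYLE = "<p>First line<br/>second line</p>"
```

Calling `is_well_formed(HTML_STYLE)` returns False because the parser reaches `</p>` while `br` is still open, whereas `is_well_formed(XHTML_STYLE)` returns True.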
Java is the blue pill
Choose the red pill
Not much larger than a pamphlet, the book packs an amazing amount of info into its svelte form. It covers standards, tools, thought process, programming tips, and history in an effortless, breezy tone. In the best tradition of O'Reilly books (particularly the Perl ones), you can sit down and read the book cover to cover and enjoy it, or jump in here and there for quick reference.
The authors manage to steer clear of the problem that plagues so many XML books: endless reams of theory without application. E.g., who the hell deals with PIs on a regular basis when parsing XML? And yet every book drones on and on about them, but when the time comes to actually parse a little XML, the example will be a cop-out, the XML equivalent of "hello world": parse these simple, one-level-deep key-value pairs in XML.
Not so with "Perl & XML". The authors cover the theory of XML, but are much more interested in getting you coding and producing than in being pedantic. The W3C has already got the monopoly on pedantry anyway.
I particularly liked the walk-through of XML::RSS late in the book, as an example of how to build something very much real-world and useful without being overly complicated.
And, at least for right now, the book is up-to-date, miracle of miracles, chronicling important new changes in the Perl XML parsing story (like the new Perl SAX work being done).
Contrast Perl & XML with New Riders' "XML & PHP", which I almost abandoned in the first 20 pages, when they tried to tell me that expat was a compliant SAX parser. Expat is important, and confusing, and it's understandable for the authors to feel defensive about PHP's XML toolset, but the solution isn't to lie, nor to be blithely ignorant. The book continues on from there, totally disorganized, with no sense of building upon what you've just learned. Also, an entire chapter is dedicated to WDDX? Who uses WDDX? And the authors contribute yet another half-assed PHP RSS parser to the world; is it possible to get negative karma for sharing source?
The reviewer mentions:
This seems to me to show a lack of understanding about much of the real work being done with XML. It's been my experience that most XML parsing being done, particularly in a scripting environment, does not check against a DTD, assuming one even exists. Plus, covering DTDs, the proposed W3C Schemas, the increasingly popular challenger RELAX, plus Schematron and others, could easily have added another 100 pages to the book. And XSLT is a book unto itself (and in fact has an O'Reilly book to itself). The reviewer suggests that the XPath coverage is included for the purpose of "trite colloquialisms", and while I'm not sure what that means, I think the fact that Perl has high-quality tools supporting standards like XPath is awesome, and very gratifying. Without that sort of work being done, Perl simply wouldn't be a competitive choice with Python and Java as an XML processing language.
And finally, "it is by no means an authoritative text on Perl and XML": there are good authoritative books on Perl (lots of them) and good authoritative books on XML (a handful). This book bridges the gap, does it nicely in my view, and I personally love the shortness, the focus, and the form factor.