Perl & XML
The book starts out with a brief explanation of why XML and Perl are well-suited for each other. It then provides a teaser of things to come: an explanation of how to use the XML::Simple module. The first chapter concludes with some warnings and gotchas that seem a little premature, since the authors have not yet really explained XML at that point. Fortunately, most of these gotchas are covered in context later in the book.
The second chapter provides a whirlwind overview of XML -- covering its structure, DTDs, schemas, and XSLT (transformation). The discussion of XML in general, its history, and parts of an XML document are well done. They give someone who is familiar with static HTML the needed background to understand the structure of an XML document and the vocabulary used to describe it. Unfortunately, the discussion of where XML begins to distinguish itself from HTML, namely with DTDs, the new replacement for DTDs called schemas, and the transformation language XSLT, is too brief. They gloss over these topics with little explanation and few examples. That said, there are other books that do provide more in-depth coverage of XML (this book only promises an introduction).
The next five chapters cover Perl modules designed to process XML, starting with simple parsers and writers. Only methods and syntax relating to XML processing are explained. Therefore, if you are considering reading this book, you should be fairly comfortable with Perl and object-oriented (OO) interfaces to CPAN modules (nearly all the modules discussed provide OO APIs). Again, there are other books and perldoc documentation that cover Perl and its OO features, so read them first if you are not familiar with OO Perl. If you are familiar with OO Perl, these chapters provide a good overview of the different ways XML can be processed (stream- and tree-based approaches), the advantages and disadvantages of each, and the Perl modules best suited for each approach. These chapters are the biggest strength of this book. The modules discussed in these chapters are by no means an exhaustive list of XML-related modules available from CPAN, nor do the explanations of each module cover everything the module does. These chapters do, however, provide the reader with enough information that she can begin to process XML documents intelligently and know where to turn when she needs more information.
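The difference between the two approaches can be sketched in a few lines. This example is not taken from the book: it uses Python's standard library (`xml.sax` for streams, `xml.etree.ElementTree` for trees) rather than the Perl modules the book covers, and the sample document, the `TitleCollector` class, and both function names are made up for illustration.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch.
SAMPLE = """<library>
  <book year="2002"><title>Perl and XML</title></book>
  <book year="2000"><title>Learning XML</title></book>
</library>"""


class TitleCollector(xml.sax.ContentHandler):
    """Stream-based: react to events as the parser emits them."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def titles_by_stream(xml_text):
    """Collect titles without ever holding the whole document as a tree."""
    handler = TitleCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles


def titles_by_tree(xml_text):
    """Tree-based: build the whole document in memory, then query it."""
    root = ET.fromstring(xml_text)
    return [title.text for title in root.iter("title")]
```

The stream version needs more bookkeeping but only keeps the current title in memory, while the tree version is shorter at the cost of loading the entire document first. That is roughly the trade-off these chapters describe.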
The next chapter, Chapter 8, covers XML tree iterators, XPath, XSLT, and XML::Twig. All of these topics are covered in a span of 16 pages (with only slightly over two pages dedicated to XSLT). Indeed, after reading the chapter, you may get the feeling that it was only included so the authors could cram more trite colloquialisms into the book. The short shrift given to these topics creates the impression, which is strengthened in the chapters that follow, that this book was rushed a bit to press.
Chapter 9 discusses applications of XML, including RSS and SOAP, and Chapter 10 is mostly example code. These chapters are intended to give you a feeling for what is possible without really giving you enough information to make it happen. The main problem with these chapters is the examples: they are long and the explanations are short. Thus, they are more useful as templates or a quick reference than for learning these topics in detail. Of course, the authors never promised you would be programming SOAP applications when you were done reading this book. And again, there are other books out there that discuss these topics in more detail. So the authors stay true to their promise throughout the book: they will introduce you to XML and tell you how to interact with XML using Perl, no more.
Personally, I found this book did, in general, give me enough information to get started using XML and pointed me where I needed to go to get more information. I am an experienced Perl programmer who is new to XML and comfortable with on-line documentation. This book seems to be written for people who fit this profile and who want to learn by doing (finding the answers to the "hard" questions as they arise). It does introduce a wide variety of XML-related topics and the Perl modules used to interact with them, which is what the authors promised to do in the preface. While it is by no means an authoritative text on Perl and XML, there is something to be said for keeping promises ...
Index
As with most first-edition books, the index was adequate but not complete. For example, XML::Twig, which has an entire section covering it, does not appear in the index at all.
Contents
Preface
- Perl and XML
- Why Use Perl with XML?
- XML Is Simple with XML::Simple
- XML Processors
- A Myriad of Modules
- Keep in Mind ...
- XML Gotchas
- An XML Recap
- A Brief History of XML
- Markup, Elements, and Structure
- Namespaces
- Spacing
- Entities
- Unicode, Character Sets, and Encodings
- The XML Declaration
- Processing Instructions and Other Markup
- Free-Form XML and Well-Formed Documents
- Declaring Elements and Attributes
- Schemas
- Transformations
- XML Basics: Reading and Writing
- XML Parsers
- XML::Parser
- Stream-Based Versus Tree-Based Processing
- Putting Parsers to Work
- XML::LibXML
- XML::XPath
- Document Validation
- XML::Writer
- Character Sets and Encodings
- Event Streams
- Working with Streams
- Events and Handlers
- The Parser as Commodity
- Stream Applications
- XML::PYX
- XML::Parser
- SAX
- SAX Event Handlers
- DTD Handlers
- External Entity Resolution
- Drivers for Non-XML Sources
- A Handler Base Class
- XML::Handler::YAWriter as a Base Handler Class
- XML::SAX: The Second Generation
- Tree Processing
- XML Trees
- XML::Simple
- XML::Parser's Tree Mode
- XML::SimpleObject
- XML::TreeBuilder
- XML::Grove
- DOM
- DOM and Perl
- DOM Class Interface Reference
- XML::DOM
- XML::LibXML
- Beyond Trees: XPath, XSLT, and More
- Tree Climbers
- XPath
- XSLT
- Optimized Tree Processing
- RSS, SOAP, and Other XML Applications
- XML Modules
- XML::RSS
- XML Programming Tools
- SOAP::Lite
- Coding Strategies
- Perl and XML Namespaces
- Subclassing
- Converting XML to HTML with XSLT
- A Comics Index
You may also want to check out Erik T. Ray's home page, Jason McIntosh's home page, or O'Reilly's page for the book.
Any Perl/Python bilingual folks out there care to comment how the XML abilities of the two compare nowadays?
"Programmers, hear my cry! Spend your precious hours working on your program interface, your error- checking, your overall design and modularity, don't spend time worrying about a scheme with a fancy name that saves data like this: value."
Argh! Slashdot cut out my pseudo-tags, in my original post I meant <variable>value</variable>.
I bet I won't be the only one to make that mistake today. If you're posting XML, be sure to save the post as "Extrans (html tags to text)" instead of "Plain old text" or "HTML formatted" to save your braces from being truncated.
This has been a public service announcement.
Now I'm depressed, I'm going to go work on my latest server. At least I have some control there.
-----
Slogan-free since April! We pass the savings on to you!
Ok, granted it is generally bad form to respond to trolls, but this one reminded me of a good story that I thought I would share.
Problem: Given a document in Word format containing a table on which various operations must be performed, resulting in an HTML page with a consistent format.
Now, first off, simply saving the document as HTML from within Word was far from sufficient. So, what to do? We tried various methods using Microsoft products to do the requisite transformations, all to no avail. We simply didn't have the control we needed.
Solution: Import the file into OpenOffice.org's Writer, save in OOo format (XML based), write a quick one-page Perl script using XML::Twig (even though I had never examined OOo XML format prior to this exercise), and voila, problem solved.
This was a great example to me of the power of XML. Sure, XML is verbose, but remember, it is all plain text, and compressing text is basically a solved problem in computer science, so the verbosity needn't create much of a storage hit.
Hooray for adoption of XML file formats!
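The claim that verbose XML compresses well is easy to check. Here is a rough sketch in Python using the standard `zlib` module; the table markup and the `compression_ratio` helper are invented for illustration, and a document this repetitive will overstate the savings you would see on real files.

```python
import zlib


def compression_ratio(text):
    """Return compressed size divided by original size (smaller is better)."""
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, 9)
    return len(packed) / len(raw)


# Hypothetical, deliberately repetitive markup: 1000 identical table rows.
row = "<row><cell>name</cell><cell>value</cell></row>\n"
document = "<table>\n" + row * 1000 + "</table>\n"

ratio = compression_ratio(document)
print(f"compressed size is {ratio:.1%} of the original")
```

Most of the bulk in a document like this is repeated tag names, which is exactly the kind of redundancy a general-purpose compressor removes.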
Does anybody know of any Perl XSLT module that allows Perl functions to be called from the templates? I.e., to format dates or stuff like that.
I'm looking into it for XML::LibXSLT, but it's non-trivial due to lack of docs and lack of context. Keep watching CPAN is all I can suggest!
Matt. Want XML + Apache + Stylesheets? Get AxKit.
As far as parsing it, there are libraries for that.
I have spent my Junior and Senior years in college working with XML (as a personal project). I just graduated and I am still working with XML. I will agree that in some places XML is being used in ways it was never intended. That is why they call it extensible. It will fit almost anywhere but is not the best solution for most problems.
One good example of XML use is in Open Office. I believe the Open Office file format will end up being the most important contribution that Sun made in the office application arena.
Another good place to use XML is in configuration files. The advantages are obvious.
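One of those advantages is that a stock parser reads the file, so nobody has to invent and maintain a custom config grammar. A minimal sketch in Python's standard `xml.etree.ElementTree`, assuming a made-up `<config>` layout; the element names, attribute names, and the `load_config` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration layout, invented for this sketch.
CONFIG_XML = """<config>
  <server host="localhost" port="8080"/>
  <logging level="debug">
    <file>/var/log/app.log</file>
  </logging>
</config>"""


def load_config(xml_text):
    """Pull a few settings out of the XML into a plain dictionary."""
    root = ET.fromstring(xml_text)
    server = root.find("server")
    logging_el = root.find("logging")
    return {
        "host": server.get("host"),
        "port": int(server.get("port")),  # attributes are strings; convert explicitly
        "log_level": logging_el.get("level"),
        "log_file": logging_el.findtext("file"),
    }


config = load_config(CONFIG_XML)
```

The nesting comes for free, and any other tool that speaks XML can read or rewrite the same file.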
Parsing XML takes resources, so in most applications you should not do it in real time. An example of this is in a dynamic web environment. Try implementing Slashdot with an XML-based backend. But with browsers becoming XML aware, you can offload this parsing to the browser.
The worst place to use XML IMHO is to describe logic. Some people have tried this - like XSP or JSP. There are some advantages to it but I think it ends up being a mess. XSLT got away with being a mess because it was one of the few solutions to the problem of XML transformations.
An argument about the evils of XML is akin to saying Perl is a nasty language. A professor who taught Perl actually told me that. It is a nasty language but it solves some problems elegantly.
XML on the other hand is quite pretty and it solves several problems elegantly.
It is when you fall for the hype and use XML because you want to advertise it as a "feature" that it fails. XML is not a feature - it is a solution to certain problems.
I couldn't care less if your program uses XML in some obscure place that I can't see. If you can give me a way to export my data to XML, I will be happy. If I can write a config file as XML, I will be happy. But if you say, I use XML in this application to implement the Help feature which is only accessible through the Help button, I couldn't care less.
P.S.: I hate it when interesting stuff gets posted in the middle of the day when I am at work.