Slashdot Mirror


XML and Perl

davorg writes "One of Perl's great strengths is in processing text files. That is, after all, why it became so popular for generating dynamic web pages -- web pages are just text (albeit text that is supposed to follow particular rules). As XML is just another text format, it follows that Perl will be just as good at processing XML documents. It's therefore surprising that using Perl for XML processing hasn't received much attention until recently. That's not saying that there hasn't been work going on in that area -- many of the Perl XML processing modules have long and honourable histories -- it's just that the world outside of the Perl community doesn't seem to have taken much notice of this work. This is all set to change with the publication of this book and O'Reilly's Perl and XML." Read on to see how well Davorg thinks this book introduces XML text processing with Perl to the wider world.

XML and Perl
author: Mark Riehl, Ilya Sterin
pages: 378
publisher: New Riders
rating: 8
reviewer: Davorg
ISBN: 0735712891
summary: Good introduction to processing XML with Perl

XML and Perl is written by two well-known members of the Perl XML community. Both are frequent contributors to the "perl-xml" mailing list, so there's certainly no doubt that they know what they are talking about, which is always a good thing in a technical book.

The book is made up of five sections. The first section has a couple of chapters which introduce you to the concepts covered in the book. Chapter one introduces you separately to XML and Perl and then chapter two takes a first look at how you can use Perl to process XML. This chapter finishes with two example programs for parsing simple XML documents.

Section two goes into a lot more detail about parsing XML documents with Perl. Chapter three looks at event-driven parsing, using XML::Parser and XML::Parser::PerlSAX to build example programs before going on to talk in some detail about XML::SAX, which is currently the state of the art in event-driven XML parsing in Perl. It also looks at XML::Xerces, which is a Perl interface to the Apache Software Foundation's Xerces parser. Chapter four covers tree-based XML parsing and presents examples using XML::Simple, XML::Twig, XML::DOM and XML::LibXML. In both of these chapters the pros and cons of each of the modules are discussed in detail so that you can easily decide which solution to use in any given situation.
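
For readers new to the distinction between event-driven and tree-based parsing, here is a minimal sketch. It is not taken from the book, and it uses Python's standard library (xml.sax and xml.etree.ElementTree) as stand-ins rather than the Perl modules named above. The sample document, the TitleHandler class and both function names are invented for illustration.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
SAMPLE = """<library>
  <book><title>XML and Perl</title></book>
  <book><title>Perl and XML</title></book>
</library>"""


class TitleHandler(xml.sax.ContentHandler):
    """Event-driven style: the parser calls these methods as it reads."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text can arrive in several chunks, so accumulate it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def titles_event_driven(xml_text):
    # Nothing is kept in memory except what the handler chooses to store.
    handler = TitleHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles


def titles_tree_based(xml_text):
    # Tree-based style: build the whole document first, then query it.
    root = ET.fromstring(xml_text)
    return [title.text for title in root.iter("title")]
```

The trade-off is roughly the one the chapters are said to weigh: the event-driven version needs more bookkeeping but never holds the whole document, while the tree version is shorter but loads everything before answering.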

Section three covers generating XML documents. In chapter five we look at generating XML from text sources using simple print statements and also the modules XML::Writer and XML::Handler::YAWriter. Chapter six looks at taking data from a database and turning that into XML using modules like XML::Generator::DBI and XML::DBMS. Chapter seven looks at miscellaneous other input formats and contains examples using XML::SAXDriver::CSV and XML::SAXDriver::Excel.
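
The difference between "simple print statements" and a writer module mostly comes down to who is responsible for escaping. The sketch below is an assumption-laden illustration, not the book's code: it uses Python's standard library, with xml.sax.saxutils.XMLGenerator standing in for a writer-style module, and the function names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import XMLGenerator, escape


def item_unescaped(name):
    # Naive string building: breaks as soon as the data contains & or <.
    return "<item>" + name + "</item>"


def item_by_print(name):
    # "Print statement" approach done carefully: the caller escapes by hand.
    return "<item>" + escape(name) + "</item>"


def item_by_writer(name):
    # Writer approach: the generator escapes character data itself.
    out = io.StringIO()
    writer = XMLGenerator(out, encoding="utf-8")
    writer.startElement("item", {})
    writer.characters(name)
    writer.endElement("item")
    return out.getvalue()


def is_well_formed(xml_text):
    # Helper for the illustration: True if the text parses as XML.
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False
```

With input such as "Tom & Jerry", the unescaped version produces text that an XML parser rejects, while the other two produce the same well-formed element.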

Section four covers more advanced topics. Chapter eight is about XML transformations and filtering. This chapter covers using XSLT to transform XML documents. It covers the modules XML::LibXSLT, XML::Sablotron and XML::XPath.

Chapter nine goes into detail about Matt Sergeant's AxKit, the Apache XML Kit which allows you to create a website in XML and automatically deliver it to your visitors in the correct format.

Chapter ten rounds off the book with a look at using Perl to create web services. It looks at the two most common modules for creating web services in Perl - XML::RPC and SOAP::Lite.

Finally, section five contains the appendices which provide more background on the introductions to XML and Perl from chapter one.

There was one small point that I found a little annoying when reading the book: each example was accompanied by a sample of the XML documents to be processed, together with both a DTD and an XML Schema definition for the document. This seemed to me to be overkill. Did we really need both DTDs and XML Schemas for every example? I would have found it less distracting if one (or even both) of these had been moved to an appendix.

That small complaint aside, I found it a useful and interesting book. It will be very useful to Perl programmers (like myself) who will increasingly be expected to process (and provide) data in XML formats.


7 of 138 comments

  1. XML is NOT just text! by Anonymous Coward · Score: 5, Insightful

    The whole point of XML is that it is NOT just a string of text. That's why Perl isn't particularly any better than Java or C++ or VB or whatever for processing XML - you're going to be using a library that gives you SAX or DOM access to your XML, and you'll never need to know that there's a text representation being serialized onto some wires somewhere.

  2. Re:Nice by DaRobin · Score: 5, Informative

    Would be nice to have a book with more than just one chapter on web services.

    You might be interested in Programming Web Services with Perl then.

    --
    Radioactive cats have 18 half-lives.
  3. This was a review? by Syris · Score: 4, Insightful
    I'm sorry, but this just wasn't a terribly deep review and well below par for /. Listing contents of a book and then nitpicking a detail don't a book review make.


    How effective were the examples? How easy to read and understand were the general concepts? Were the descriptions of libraries and APIs clear? Was the writing generally readable?


    Would this book even make a good reference?


    Jeez, anyone want to follow up the post with a real review?

  4. XML frees us from Perl by Euphonious+Coward · Score: 4, Interesting
    The whole point of XML is to free us from having to do the kinds of things Perl is meant for. Absent free-form text munging, Perl really has no advantage over other languages. At the same time, it has real deficits for people who need to know they have solved a problem correctly and completely.

    (For reference, see this rant by the brilliant net.kook Erik Naggum. The most quotable bit, for the lazy among you, is

    ...[Perl] rewards idiotic behavior in a way that no other language or tool has ever done, and on top of it, it punishes conscientiousness and quality craftsmanship -- put simply: you can commit any dirty hack in a few minutes in perl, but you can't write an elegant, maintainable program that becomes an asset to both you and your employer; you can make something work, but you can't really figure out its complete set of failure modes and conditions of failure. (how do you tell when a regexp has a false positive match?)
    )

    1. Re:XML frees us from Perl by glwtta · Score: 4, Insightful
      how do you tell when a regexp has a false positive match?

      A what? You (or rather the brilliant person being quoted) either mean that it matches a string that the expression isn't supposed to, which would be a serious bug in the language (and I am not aware of any such bugs); or you mean that it matches correctly, but matches things you didn't expect it to, in which case you tell, by (gasp!) testing your code. In any case, how do you tell a "false positive" regexp match in Java?

      but you can't write an elegant, maintainable program that becomes an asset to both you and your employer

      Perhaps you can't. I have, and I do.

      --
      sic transit gloria mundi
  5. Perl is a reflection of your soul by Nexus7 · Score: 4, Interesting

    Well, perhaps not your soul, but your Perl code just reflects the way you think to a greater extent than code in other languages. This isn't something that's done underhandedly; it is well advertised in every posting in c.l.perl, in the Camel book, and in every other book about Perl: Perl is not at all orthogonal, and TMTOWTDI (there's more than one way to do it). If you want to be rigorous and declare everything and not have your typos become references automatically, you "use strict" and your magic line is "#!/usr/bin/perl -w". If not, well, Perl allows you to do that too. If you want objects, you can do that; if not, not.

    It is possible to write quality code in Perl. Just because the language allows you to not do so isn't its fault. It doesn't stop you from doing it, because that'd stop you from doing brilliant things.

    To address some specific things you mentioned, you can do full-fledged exception handling in Perl if you want to (with eval and specific modules), or, you know, not. And I'm not familiar with the false positive matches in regexps (perhaps you're referring to some famous problem). But if a regexp doesn't do what you want it to, isn't it wrong? Between // and tr and split I get along just fine.

  6. hasn't received much attention until recently? by HealYourChurchWebSit · Score: 4, Informative



    The reviewer is correct, Perl is a good tool for slamming and jammin' text, including XML. What I'm not so sure of is the quote "It's therefore surprising that using Perl for XML processing hasn't received much attention until recently."

    I mean one need only scroll down the extensive list of CPAN Modules to see well over 50, as well as many sites/authors devoting time, energy and resource.

    Similarly, I would point out some press modules supporting web services via XML, such as SOAP::Lite as far back as 02/26/01 and XML-RPC also in '01 -- or O'Reilly's own XML.com with articles such as "Processing XML with Perl" written shortly after the turn of the millennium.

    Point is, though I personally love Perl, blatant plugs such as "... it's just that the world outside of the Perl community doesn't seem to have taken much notice of this work. This is all set to change with the publication of this book and O'Reilly's Perl and XML." don't inspire confidence in the reviewer's objectivity.

    --
    --- have you healed your church website?