The Perl Cookbook, 2nd Edition

Posted by timothy on Tuesday October 14, 2003 @05:50AM from the who's-cooking-tonight dept.

doom writes "For those of you who haven't been paying attention, when The Perl Cookbook by Tom Christiansen & Nathan Torkington came out in 1999 it immediately became one of the primary references in the perl world. It's one of the first places you should check before making a move with perl, right up there with search.cpan.org itself. Now we've got the second edition. What's the diff? The diff is 58 new recipes and program examples (list provided below), plus two new chapters on mod_perl and XML (which provide an additional 27)." Read on for doom's complete review.

The Perl Cookbook, 2nd Edition
author: Tom Christiansen & Nathan Torkington
pages: 927
publisher: O'Reilly
rating: 9
reviewer: doom
ISBN: 0596003137
summary: How to do common tasks in perl

The new recipes cover a number of subjects. One of the prominent themes is how to use perl's new Unicode support, as well as the new I/O layers feature. The coverage of web programming has definitely been fleshed out with recipes on XML-RPC, SOAP and so on, plus the new chapter on mod_perl. Also of interest, of course, are the additional recipes on database access with DBI.

The mod_perl chapter is a good succinct introduction, with some very cute recipes in it (though admittedly a lot of these are also covered in the excellent mod_perl Developer's Cookbook by Young, Lindner and Kobes, out from Sams). For example, "Transparently Storing Information in URLs" shows how to embed information at any arbitrary position inside a URL. This quickly shows the kind of things you can do with a PerlTransHandler and a PerlFixupHandler. The chapter closes with what looks like a good introduction to "Template Toolkit", which I would probably be very excited about if I weren't already familiar with the (also discussed) HTML::Mason.

I really enjoyed reading the XML chapter (a subject I'm less familiar with): I predict that you'll find this to be the fastest way through the XALPHABET XSOUP without drowning. For me, this was almost worth the price of the book.

Very little has been removed (hence the page count has gone from 757 to 927), and where I have been able to find a deletion, there are usually very good reasons for it. For example, the first edition takes the trouble to tell us that qr// was introduced in perl 5.005, but the new edition drops the babble about versions there, because for most of us, anything before 5.6 is now ancient history. However, I do miss this particular irrelevant parenthetic aside that's been deleted now:

Remember that the opposite of read is not write but print, although oddly enough, the opposite of sysread actually is syswrite. (split and join are opposites, but there's no speak to match listen, no resurrect for kill, and no curse for bless.)

(p.295, first edition, compare to p.323, second edition.)

In general, it's difficult to think of anything seriously wrong with the Perl Cookbook. I might suggest that in some places they fall into the trap of talking about all the ways to do it, rather than just the best ways (e.g. recipe 7.5 "Storing Filehandles into Variables" seems a bit complicated).

And maybe there are some slight problems with order of presentation, as with the new perl 5.8 feature of "I/O Layers", which is mentioned a few times before it's finally discussed at the beginning of Chapter 8 (though really, it's amazing that there aren't more problems like this: this is supposed to be a reference work, and yet it usually works well as a tutorial also).

I've got one big complaint about the 2nd edition though: they changed the numbering of existing recipes! I've been writing code with comments like

# Schwartzian transform. See Perl Cookbook, recipe 4.15

and now it turns out I should've been specifying an edition number also. Please: "Cookbook" authors, come up with a numbering scheme that remains invariant with new editions... if you can't always just append to the end of the chapter, there's nothing wrong with tacking another dotted decimal on the end. We're programmers, we can handle it.

And speaking of the "Schwartzian transform": that recipe has a very clear, self-explanatory name, "Sorting a List by Computable Field", but in the first edition there was also a footnote explaining that many people call this the Schwartzian Transform, named after Randal Schwartz, who invented the technique. With this second edition, that footnote has been quietly dropped. Guys, if you're going to carry on a feud, this is really not the way to do it. It just makes you look bad.
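For readers who haven't met the idiom: the transform computes the sort field once per item, sorts on that computed value, then throws the computed value away. The recipe itself is written in Perl; what follows is only a minimal sketch of the same decorate-sort-undecorate idea in Python, not the book's code. The `field_of` function, the helper name and the sample records are all made up for illustration.

```python
def field_of(line):
    """Hypothetical 'computable field': the numeric third column of a colon-separated line."""
    return int(line.split(":")[2])


def sort_by_computed_field(items, key_func):
    """Sort items by key_func(item), calling key_func exactly once per item."""
    # Decorate: pair each item with its computed key. The index breaks ties,
    # so the items themselves are never compared and equal keys keep their order.
    decorated = [(key_func(item), index, item) for index, item in enumerate(items)]
    # Sort on the precomputed keys.
    decorated.sort()
    # Undecorate: drop the keys and indexes, keeping only the original items.
    return [item for _key, _index, item in decorated]


# Made-up sample data, loosely shaped like passwd entries.
records = ["carol:x:1002", "alice:x:1000", "bob:x:1001"]
sorted_records = sort_by_computed_field(records, field_of)
print(sorted_records)
```

The payoff is that an expensive key function runs once per item instead of once per comparison, which is the point of the recipe's "computable field" wording.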

O'Reilly's perl.com site has a series of articles by the authors, featuring some recipes from the book.

Appendix: New recipes and examples (not including the two new chapters):

Using Named Unicode Characters
Treating Unicode Combined Characters as Single Characters
Canonicalizing Strings with Unicode Combined Characters
Treating a Unicode String as Octets
Properly Capitalizing a Title or Headline
Constant Variables
Implementing a Sparse Array
Creating a Hash with Immutable Keys or Values
Matching Nested Patterns
Writing a Subroutine That Takes Filehandles as Built-ins Do
Storing Multiple Files in the DATA Area
Reading an Entire Line Without Blocking
Treating a File as an Array
Setting the Default I/O Layers
Reading or Writing Unicode from a Filehandle
Converting Microsoft Text Files into Unicode
Comparing the Contents of Two Files
Pretending a String Is a File
Working with Symbolic File Permissions Instead of Octal Values
Writing a Switch Statement
Coping with Circular Data Structures Using Weak References
Program: Outlines
Overriding a Built-in Function in All Packages
Customizing Warnings
Writing Extensions in C with Inline::C
Cloning Constructors
Copy Constructors
Saving Query Results to Excel or CSV
Escaping Quotes
Dealing with Database Errors
Repeating Queries Efficiently
Building Queries Programmatically
Finding the Number of Rows Returned by a Query
Using Transactions
Viewing Data One Page at a Time
Querying a CSV File with SQL
Using SQL Without a Database Server
Graphing Data
Thumbnailing Images
Adding Text to an Image
Program: graphbox
Turning Signals into Fatal Errors
Multitasking Server with Threads
Writing a Multitasking Server with POE
Accessing an LDAP Server
Sending Attachments in Mail
Extracting Attachments from Mail
Writing an XML-RPC Server
Writing an XML-RPC Client
Writing a SOAP Server
Writing a SOAP Client
Program: rfrm
Using Cookies
Fetching Password-Protected Pages
Fetching https:// Web Pages
Resuming an HTTP GET
Parsing HTML
Extracting Table Data


Re:My problem with Perl by DG · 2003-10-14 06:13 · Score: 3, Insightful

Like learning any new language (and the regular expression syntax IS a sublanguage unto itself), the best way to learn it is to actually work with it for a while.

After a little hands-on work, you'll start to understand the logic behind all the line noise, and once you get to that point, the pure beauty of regexes and what they can do becomes clear.

In a way, it's a little bit like learning to program assembler. At first, all those opcodes are just a confusing mess, but once you get the hang of it, it starts to become clear.

DG

--
Want to learn about race cars? Read my Book

Re:My problem with Perl by Frater 219 · 2003-10-14 06:23 · Score: 2, Insightful

My problem with Perl is the ubiquitous use of the regular expressions.

It's true that people writing in Perl tend to use regular expressions in places where they're not necessarily appropriate. For instance, algorithmically speaking, plain substring matches are faster than regular-expression matches. (This is why Python has the .startswith and .endswith string methods, and the in operator.) However, Perl's regular-expression engine is optimized to heck and its raw speed can usually overcome this.
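To make the comparison concrete, here is a small Python sketch that puts the plain string tests mentioned above next to regular-expression tests that answer the same questions. The sample text is made up, and the sketch only shows that the two forms agree; it makes no timing claim.

```python
import re

# Made-up sample text for illustration.
text = "perl cookbook, 2nd edition"

# Plain string operations: fixed-text comparisons, no pattern involved.
starts_plain = text.startswith("perl")
ends_plain = text.endswith("edition")
contains_plain = "cookbook" in text

# The same three questions asked with regular expressions.
# re.escape keeps any metacharacters in the literal from being interpreted.
starts_regex = re.match(re.escape("perl"), text) is not None
ends_regex = re.search(re.escape("edition") + r"\Z", text) is not None
contains_regex = re.search(re.escape("cookbook"), text) is not None

print(starts_plain, ends_plain, contains_plain)
print(starts_regex, ends_regex, contains_regex)
```

The plain forms also say what they mean at a glance, which is a separate argument from speed.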
That said, the traditional regular-expression syntax is rather arcane. The only real alternative I've seen is the S-expression syntax of cl-ppcre -- a Perl-compatible regular-expression library for Common Lisp. This allows you to write complex regular expressions as tree structures rather than as strings of character glyphs.
For instance, in place of the regex string "(?:foo)|(?:bar)|(?:b(?:a|(?:uz))z)" you can write:
(:alternation "foo" "bar" (:sequence "b" (:alternation "a" "uz") "z"))

Now, that might not be any clearer to you if you don't know Lisp, but it gets better as the regex gets more complicated. (I've been a little tricky by putting a lot of ?: in the original regex string. That's the code for "I want to do grouping, but I don't want to capture groups into variables." In the Lisp syntax, you have to mark when you do want capture, not when you don't. People writing in Perl usually let their groups get captured even when they don't make any use of the resulting variables.)
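The difference between capturing and non-capturing groups is easy to see in any engine that supports the (?:...) syntax. The sketch below uses Python's `re` module rather than Perl, with two variations on the foo/bar/b..z pattern discussed above: one where every group captures and one where none do. The subject string is made up for illustration.

```python
import re

# Made-up subject string; both patterns accept foo, bar, baz and buzz.
subject = "buzz"

# Every pair of parentheses captures: five groups in total.
capturing = re.fullmatch(r"(foo)|(bar)|(b(a|(uz))z)", subject)

# Same grouping structure, but (?:...) groups without capturing anything.
non_capturing = re.fullmatch(r"(?:foo)|(?:bar)|(?:b(?:a|(?:uz))z)", subject)

capturing_groups = capturing.groups()
non_capturing_groups = non_capturing.groups()

print(capturing_groups)      # groups that did not take part in the match are None
print(non_capturing_groups)  # empty tuple: nothing was captured
```

Both patterns match the same strings; the only difference is how much bookkeeping the match object carries afterwards.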
Interestingly enough, the authors of cl-ppcre claim that it outperforms Perl -- a remarkable claim, but they seem to have pretty comprehensive statistics as to when it does and when it doesn't. It's odd to think that even though many people think Lisp is slow, compiled Lisp can really be quite speedy for tasks that people usually use a specialized language for.