Slashdot Mirror


mod_perl Developer's Cookbook

davorg writes "Over the last few years mod_perl has become a serious force in web development. If you're building a web site to run on an Apache server and you want to write the code in Perl, then you're going to want to install mod_perl on your server too as it's the best way to avoid many of the performance issues with traditional CGI. It's taken a while for publishers to wake up to the fact, however, and there haven't been many books in the shops. It looks like this will be the year that this changes. A number of mod_perl books are about to be published and this is the first." Read on below for davorg's thoughts on this one.

mod_perl Developer's Cookbook
author: Geoffrey Young, Paul Lindner & Randy Kobes
pages: 630
publisher: Sams
rating: 9
reviewer: Dave Cross
ISBN: 0-672-32240-4
summary: What mod_perl programmers have been waiting for

This book uses the popular "cookbook" approach, where the content is broken down into short "recipes" each of which addresses a specific problem. There are almost two hundred of these recipes in the book arranged into chapters which discuss particular areas of mod_perl development. In my opinion the cookbook approach works much better in some chapters than in others.

It's the start of the book where the cookbook approach seems most forced. In chapter 1 problems like "You want to compile and build mod_perl from source on a Unix platform" provide slightly awkward introductions to explanations about obtaining and installing mod_perl on various platforms (kudos to the authors for being up-to-date enough to include OS X in the list). All the information you want is there, however, so by the end of the chapter you'll have mod_perl up and running.

Chapter 2 looks at configuration options. It tells you how to get your CGI programs running under mod_perl using the Apache::Registry module, which simulates a standard CGI environment so that your CGI programs can run almost unchanged. This will give you an immediate performance increase as you no longer have the performance hit of starting up a Perl interpreter each time one of your CGI programs is run. This chapter also addresses issues like caching database connections and using mod_perl as a proxy server.
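
The performance gain described above comes from doing the expensive start-up work once and reusing the result on later requests. A rough way to picture it, as a Python analogy rather than actual mod_perl code, is a cache of compiled scripts. Everything here (`run_script`, the `version` argument standing in for a file modification time, the `stats` counter) is a hypothetical name chosen for illustration and says nothing about how Apache::Registry is implemented internally.

```python
# Hypothetical cache: script name -> (version, compiled code object).
_cache = {}
stats = {"compiles": 0}  # counts how often the expensive compile step runs


def run_script(name, source, version, env):
    """Run a script, recompiling only when its version changes."""
    cached = _cache.get(name)
    if cached is None or cached[0] != version:
        # The costly step a plain CGI setup would repeat on every request.
        code = compile(source, name, "exec")
        stats["compiles"] += 1
        _cache[name] = (version, code)
    else:
        code = cached[1]
    # Each request gets a fresh namespace, so per-request data is not shared.
    namespace = {"env": env, "output": []}
    exec(code, namespace)
    return "".join(namespace["output"])


SCRIPT = 'output.append("Hello, " + env["user"])'

first = run_script("hello.cgi", SCRIPT, 1, {"user": "alice"})
second = run_script("hello.cgi", SCRIPT, 1, {"user": "bob"})
```

Both calls produce their own output, but the compile step runs only once. Passing a different `version` forces a recompile, which is how a changed script would be picked up in this sketch.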

We then get to part II of the book. In this section we look at the mod_perl API, which gives us access to the full functionality of Apache. This allows us to write Perl code which is executed during any of the stages of Apache's processing.

Chapter 3 introduces the Apache request object which is at the heart of the API and discusses various ways to get useful information both out of and back into the object. Chapter 4 serves a similar purpose for the Apache server object which contains information about the web server and its configuration.

In chapter 5 the authors look at Uniform Resource Identifiers (URIs) and discuss many methods for processing them. Chapter 6 moves from the logical world of URIs to the physical world of files. This chapter starts by explaining the Apache::File module before looking at many ways to handle files in mod_perl.

The previous few chapters have built up a useful toolkit of techniques to use in a mod_perl environment. In chapters 7 and 8 we start to pull those techniques together and look in more detail at creating handlers, which are the building blocks of mod_perl applications. Chapter 7 deals with the creation of handlers and chapter 8 looks at how you can interact with them to build a complete application.

Chapter 9 is one of the most useful chapters in the book as it deals with benchmarking and tuning mod_perl applications. It serves as a useful guide to a number of techniques for squeezing the last drops of performance out of your web site. Chapter 10 is a useful introduction to using Object Oriented Perl to create your handlers. While the information is all good, this is, unfortunately, another chapter where the cookbook format seems a little strained.

Part III of the book goes into great detail about the Apache lifecycle. Each chapter looks at a small number of Apache's processing stages and suggests ways that handlers can be used during that stage. This is the widest ranging part of the book and it's full of example code that really demonstrates the power of the Apache API. I'll just mention one particular chapter in this section. Chapter 15 talks about the content generation phase. This is the phase that creates the actual content that goes back to the user's browser and, as such, is the most important phase of the whole transaction. I was particularly pleased to see that the authors took up most of this chapter looking at methods that separate the actual data from the presentation. They have recipes that look at all of the commonly used Perl templating systems, and a few more recipes cover the generation of output from XML.
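
To make the idea of separating data from presentation concrete, here is a minimal sketch in Python using the standard library's `string.Template`. It is not taken from the book and does not represent any of the Perl templating systems it covers. The template text and the `render` helper are assumptions made for illustration.

```python
from string import Template

# Presentation: the markup lives in one place, with named placeholders.
PAGE = Template("<h1>$title</h1>\n<p>Rated $rating/10 by $reviewer</p>")


def render(data):
    """Fill the template from a plain dictionary of values."""
    return PAGE.substitute(data)


# Data: could equally come from a database or an XML document.
html = render({
    "title": "mod_perl Developer's Cookbook",
    "rating": 9,
    "reviewer": "Dave Cross",
})
```

The code that gathers the data never touches markup, and the template can be changed without touching the code.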

Finally, two appendices give a brief reference to mod_perl hooks, build flags and constants and a third gives a good selection of pointers to further resources.

This is the book that mod_perl programmers have been waiting for. The three authors are all well-known experts in the field and it's great that they have shared their knowledge through this book. If you write mod_perl applications, then you really should read this book.

You can purchase mod_perl Developer's Cookbook from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

28 of 80 comments

  1. Not a really useful book by MrBoombasticfantasti · · Score: 3, Insightful

    It doesn't actually add much to the info already available at CPAN. Still nice to have it on the shelf.

    --
    !ERR: Signature not found.
    1. Re:Not a really useful book by thaigan · · Score: 2

      Except that I can read it while on the train.

      --

      42
    2. Re:Not a really useful book by thaigan · · Score: 2

      Maybe I'm the only one, but I like to read Cookbook style books even when I'm away from the computer.

      --

      42
    3. Re:Not a really useful book by consumer · · Score: 2

      I disagree. It is better organized and more clearly presented than most of the on-line documentation, and it provides more examples. It also shows how to do things that are not discussed anywhere else, like automatically caching the output of a content handler. It's a very handy book to have.

  2. mod_perl is not just "quicker CGI" by ajs · · Score: 5, Informative

    mod_perl provides a means for transparently wrapping CGI programs so that they run continuously instead of starting up (and thus parsing) every time a request requires them.

    However, it's much more than a CGI accelerator. It provides hooks into all of the stages of an Apache transaction.

    As an example of the kind of power this gives you, you can write a Perl plugin for Apache that intercepts 404s, and generates a dynamic page which you then cache to disk for future access (far out-stripping even native C dynamic page generation speeds on subsequent hits). This is just one example. You can write whole content management systems using mod_perl, and in fact many have.

    1. Re:mod_perl is not just "quicker CGI" by ajs · · Score: 3, Informative


      Request 1: xyz.html

      file not found
      mod_perl intercept of 404 calls xyz.pl
      xyz.pl writes xyz.html (e.g. from database)

      Request 2: xyz.html

      file exists
      sendfile or tux used to fire file to socket


      Even C cannot dynamically generate a file as fast as it can be read from disk. Granted, you could write the same plugin in C as you wrote in mod_perl (mod_perl uses the C API for Apache after all), but it would be a lot more work, and all you would get is the performance boost on that first page generation; after that they perform the same.

      This is the model used by at least one major content management system that uses a language that makes Perl look zippy by comparison. They still compete because most page views are found on disk.

      Of course, now you get to play the cache management game, but that's the right problem to have when serving lots of content.
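
The two-request flow described in this comment can be sketched in a few lines. This is a Python illustration of the pattern only, not mod_perl code: `generate_page` is a hypothetical stand-in for the expensive step (for example a database query), and a temporary directory plays the role of the document root.

```python
import os
import tempfile


def generate_page(name):
    # Hypothetical stand-in for the expensive dynamic generation step.
    return "<html><body>Content for %s</body></html>" % name


def handle_request(docroot, name, stats):
    """Serve a file from disk, generating and saving it on the first miss."""
    path = os.path.join(docroot, name)
    if os.path.exists(path):
        stats["static"] += 1      # request 2 onwards: plain file read
    else:
        stats["generated"] += 1   # request 1: the "404" case
        with open(path, "w") as f:
            f.write(generate_page(name))
    with open(path) as f:
        return f.read()


stats = {"static": 0, "generated": 0}
docroot = tempfile.mkdtemp()
first = handle_request(docroot, "xyz.html", stats)
second = handle_request(docroot, "xyz.html", stats)
```

The sketch never invalidates a cached file, so the cache management problem mentioned above is left open.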

    2. Re:mod_perl is not just "quicker CGI" by cperciva · · Score: 3, Insightful

      Even C cannot dynamically generate a file as fast as it can be read from disk.

      That depends upon what the file is, and how fast your disk system is. Many large scientific computations which, in the past, precomputed values and stored them to disk now recompute as necessary, simply because the recomputation is faster than a disk access.

      You won't be able to regenerate a file as fast as it can be read from cache; but unless you have an infinite amount of cache memory, there are likely to be cases where you're better off to recompute and allow something else to be cached.

    3. Re:mod_perl is not just "quicker CGI" by ajs · · Score: 2

      True, but only in the case of huge files that require no disk access to generate dynamically. Since most dynamic content on the Web requires a database....

      C's sendfile can (when possible) perform a DMA transfer from the disk controller to the ethernet controller, which will beat the snot out of any relational database access.

    4. Re:mod_perl is not just "quicker CGI" by WWWWolf · · Score: 2, Informative
      However, it's much more than a CGI accelerator. It provides hooks into all of the stages of an Apache transaction.

      Yeah, I found it extremely appealing for two reasons. First, I hate writing configuration file readers, and with mod_perl it's just $request->dir_config('whatever'); to read stuff that is set with PerlSetVar in .htaccess or server conf. The second reason: logging with various debug levels. Easy with Apache::Log.

    5. Re:mod_perl is not just "quicker CGI" by Glorat · · Score: 3, Insightful

      Actually, one of the barriers to mod_perl use is that mod_perl by default does *not* provide transparent wrapping of CGI programs. It can be made to do so using PerlRun modules, but I think the documentation needs to be more prominent about the fact that vanilla Apache::Registry scripts behave significantly differently from CGI. Perhaps the documentation should advertise more the PerlRun modules (etc) that do give transparent CGI wrapping. I, like many others, have fallen into the trap of just blindly switching a script from CGI to mod_perl and been bitten by many of the (documented) issues if you bother to RTFM, which of course I didn't at first =)

      Now that I know mod_perl in depth, the parent is correct about the immense flexibility of mod_perl with its ability to directly interface with Apache. Something you won't be able to do ever with CGI or even PHP.

      And about "You can write whole content management systems using mod_perl, and in fact many have.": of course the CMS running here at Slashdot is powered by Slashcode, which runs under Apache/mod_perl.

    6. Re:mod_perl is not just "quicker CGI" by cperciva · · Score: 2

      True, but only in the case of huge files that require no disk access to generate dynamically.

      Except that the database entries used are more likely to be reused for other requests -- so if the output could be cached, the database certainly would be.

      Obviously, in some cases it is better to precompute entire pages; but it is really something which has to be determined on a case-by-case basis.

    7. Re:mod_perl is not just "quicker CGI" by gorilla · · Score: 3, Insightful

      An example of one of these content management systems would be Mason, http://www.masonhq.com, and Mason apps such as Fuse CMS and Bricolage. I find Mason to be just as powerful as multi-thousand dollar applications such as StoryServer.

    8. Re:mod_perl is not just "quicker CGI" by ajs · · Score: 2

      Except that the database entries used are more likely to be reused for other requests -- so if the output could be cached, the database certainly would be.

      I'm going to explain why this is wrong, but first let me explain that you're in some very good company in having made this assumption. I and just about everyone I know who've seen a good caching content management system in use have been stunned by the simplicity and correctness of the solution. In the case of Vignette (the one I'm most familiar with), I was also stunned that such slipshod software written in a language that couldn't even do lexical scoping (TCL) was doing this one thing so well :-)

      Ok, on to the technical. Yes, you can cache your database in memory (Oracle lets you cache gigs and gigs in RAM), but that buys you a lot less than you would think.

      You still have to execute millions upon millions of instructions just to generate the simplest page. When an HTML file is on disk, apache just calls sendfile(2), which copies the file from disk to socket with no userland code in between. Trust me when I say that this is so much more efficient that it's not even worth the comparison.

    9. Re:mod_perl is not just "quicker CGI" by consumer · · Score: 2
      Caching data, as opposed to just caching generated HTML, allows you to reuse that data in other pages, some of which can't be cached. For example, I worked on an application where we would cache data from a product catalog and use that data in the browsing pages, the shopping cart, the gift registry, etc.

      A good system will allow for caching of both data and generated HTML.

    10. Re:mod_perl is not just "quicker CGI" by cperciva · · Score: 2

      You still have to execute millions upon millions of instructions just to generate the simplest page

      Only if you write crappy code, or you have extremely complicated pages. A few hundred thousand cycles is reasonable for well written code generating a web page from cached data.

    11. Re:mod_perl is not just "quicker CGI" by ajs · · Score: 2

      Only if you write crappy code

      Nope

      or you have extremely complicated pages.

      Nope

      A few hundred thousand cycles is reasonable for well written code generating a web page from cached data.

      Most assuredly nope!

      Sure, I too can come up with a home page for peeling paint that I can generate with a six-line C program. But, even moderate complexity would run you aground.

      How are you caching data? How are you locking/cleaning/managing/clearing that cache? Your page generation will have to be in bed with that to some extent in order to determine if a new page request invalidates some or all of the cached data that it touches. Then, you're going to have the small matter of how you share this cached data. Is it in a simple database (e.g. Berkeley DB) or a second-tier relational database, or do you try to manage a live, shared memory cache? Cache consistency management on that's going to get ugly fast!

      Now, you start dealing with protocol management, HEAD vs GET vs POST requests, parsing POST bodies, URL-encoding, cookie access, security, etc, etc.

      "Well written code" as defined by number of cycles consumed usually means that many of these needs are handled in a one-off way that does not take into account the mountain of special-cases that makes up what we call the World Wide Web.

      Instead I suggest you spend all of that premature optimization energy on writing a good cache management system that can mix and match static HTML cache with dynamically generated pages on the fly. That would benefit everyone, not just one Web page.

  3. website support by Anonymous Coward · · Score: 4, Informative

    we (the authors) support a companion website where you can find a number of useful items, such as all the code from the book (to save your fingers) and a full-text search engine (to supplement the index).

    http://www.modperlcookbook.org/

    enjoy

  4. Re:mod_perl slow, php good by consumer · · Score: 3, Informative

    PHP is fine, but it's not as fast as mod_perl.

  5. Re:Not a really useful book (to you?) by lindner · · Score: 5, Informative
    It doesn't actually add much to the info already available at CPAN. Still nice to have it on the shelf.

    [disclaimer: author post follows]

    The problem with CPAN is knowing what's useful and what's not. This book isn't just a collection of modules and documentation. Instead it's geared to people who are writing mod_perl code. The code examples are used to show you not just how to do some task, but also (in most cases) how the code does what it does.

    In fact, distilling mod_perl code into short, sweet examples was where most of the effort went into writing this book. You don't want pages and pages of code to illustrate one or two simple ideas.

    So, perhaps we didn't write a book that was useful to you. Given the feedback I've read, it is useful to many other people.

  6. It's taken a while for publishers to wake up to... by pizza_milkshake · · Score: 3, Informative

    it's taken /. a while as well; this book was published in January

  7. Re:It's taken a while for publishers to wake up to by lindner · · Score: 3, Funny
    As one of the authors it's been difficult to wait for this book to get more widespread exposure. One reason might be because it is published by SAMS. I suspect if there was a cute O'Reilly animal on the cover we'd be much more widespread at this point. Who knows, maybe we should have stuck with the (unfounded) SAMS stereotype and named the book mod_perl unleashed in 21 days for dummies. Nah..

    In any case, it's nice to see a new review on one of my favorite web sites. More good reviews over there at amazon and at the book's official web site.

  8. Re:It's taken a while for publishers to wake up to by davorg · · Score: 2, Informative

    It's partly my fault. I got my review copy in June :-/

  9. Not mod_perl 2.0 by PineHall · · Score: 2

    http://www.modperlcookbook.org/modperl2.html
    One thing to note is that it is for the 1.3 version not the new 2.0 version. They say though there are not too many differences.

    1. Re:Not mod_perl 2.0 by lindner · · Score: 2, Interesting
      One thing to note is that it is for the 1.3 version not the new 2.0 version. They say though there are not too many differences.


      Funny thing. We were worried that mod_perl 2.0 would steal our thunder. It's now September 2002 and we still don't have the official release.


      Apache 1.3 and mod_perl 1.x will be around for a long time though. Especially on all those production servers that don't get the latest greatest software, only the boring reliable stuff...

  10. A very useful book by barries · · Score: 2, Interesting
    Apache and mod_perl are incredibly powerful but complex systems; it's very difficult for any one person to keep all of the details and possible approaches to all of the things you can do with them in my^Wtheir head.

    This book's approach helps me find tried and true approaches to the things I need to make mod_perl do. It's far better organized and written than the freely available documentation and covers a range of modules (many written for the book) that do things I used to do the hard way. It's clear, concise, and the material is well chosen. You'll get a lot further along on your next mod_perl project a lot faster with this book close to hand than by repeatedly scouring CPAN and the web for the modules, mail messages, and documentation.

    Yours in mod_perl,

    Barrie

  11. Re:I'd like to use mod_perl but... by Anonymous Coward · · Score: 2, Informative

    I don't think so. Mod_perl gets its hooks so much deeper into Apache than CGI does, that it's hard to share.

    One problem is that a bad mod_perl program can bring down the server. A bad CGI program can't since by definition it's forking a new process to run, so all it can do is crash its own forked process.

    Another issue is that in order to load new or modified mod_perl scripts, you need the privileges of the process running the server.
    No way you can do that in a virtual hosting environment, unless you have a death wish.

    In order to install a mod_perl script you also have to be able to edit the apache config file. Typically (for good reasons) a file writable only by root.

    Another issue is that the apache processes running keep all the mod_perl programs in memory. If there were ten different mod_perl-enabled "web sites" run by the same apache server on the same box, that could get really inefficient.

    The more I think about it, the only way this could work is if each mod_perl virtual host has its own apache server instance, with full ownership of its own config file and privileges to bring it down and restart it.

    You would need some kind of gateway server answering all requests coming in to port 80 and redirecting them to a different port depending on the request's URL; each mod_perl virtual host would have its own port number which the gateway server would redirect to.
    This sort of internal redirect is often used now to run a server without mod_perl for static requests and a mod_perl-enabled server for dynamic requests on the same box.

    -->So I have changed my off-the-cuff conclusion. You could do this all on one box, but it would be complicated, and a fundamentally different model than shared hosting with CGI only. And I'm not sure any hosting company is doing it. Nor am I sure that there aren't some bad security or performance implications I haven't thought of.

    Just get DSL and start messing around running mod_perl on your own computer. If it's a low-volume or self-instructional site that should be quite adequate.
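
The gateway arrangement sketched in this comment boils down to a lookup from the incoming request to a backend port. Here is a minimal Python illustration of that lookup only, with no networking. The host names and port numbers are made up, and routing on the host name (rather than on some other part of the URL) is an assumption.

```python
# Hypothetical mapping: one private Apache/mod_perl instance per site.
BACKENDS = {
    "site-a.example": 8081,
    "site-b.example": 8082,
}
STATIC_PORT = 8080  # assumed plain server for everything else


def pick_backend(host_header):
    """Return the backend port for a request's Host header value."""
    hostname = host_header.split(":")[0].lower()  # drop any ":port" suffix
    return BACKENDS.get(hostname, STATIC_PORT)


port_a = pick_backend("site-a.example")
port_b = pick_backend("SITE-B.example:80")
port_other = pick_backend("unknown.example")
```

A real gateway would also have to forward the request and relay the response, which is where the security and performance questions raised above come in.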

  12. Re:Use PHP - faster, easier, more efficient by unicron · · Score: 2

    Maybe obfusgated(sp?) code. My perl looks just as good as my C/C++. I've never had a problem reading the perl of others, either, provided it was formatted and commented with just a little care.

    --
    Finally, math books without any of that base 6 crap in them.
  13. Re:Use PHP - faster, easier, more efficient by bitpusherdotorg · · Score: 2, Insightful

    True, Perl doesn't have to be messy. It's all a matter of coding style. If you are a considerate developer, you will write clean, easy to read code that's replete with useful comments.
    Unfortunately, there are people who want to write messy code out there, and I'll be damned if I have to maintain it!
    Thanks for the point - clarity of code is a matter of style, but certainly the choice of language helps as well.