The PHP Anthology - Volume II, 'Applications'
There are seven chapters in this volume, each dealing with real-world problems. Many problems are those you've seen solved on sites you admire and wondered "How did they do that?" Others are frameworks that allow your site to run smoothly, with nobody getting accidentally logged out or having to wait too long while your script gluttonously pulls the same data out of the database for the Nth time. At the end, Fuecks goes back to the beginning, to show how proper design and development can save you time when you start your next project.
Chapter 1: Access Control
Authentication is the process by which users identify themselves. This is difficult in HTTP, a stateless protocol in which the server handles one request at a time and instantly forgets you. Luckily HTTP allows cookies, which are bits of data the server sends to the client for it to reveal upon revisiting. At first cookies were used only to annoy ("Hello, Steve! You have visited this page 3 times"), but a cookie can hold the ID of a session record in a database, which contains any state information that you like.
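To make the cookie-plus-session-record idea concrete, here is a minimal sketch of my own (not code from the book), assuming PHP 7.4 or later. The SessionStore class is hypothetical and keeps its records in an array where a real site would use a database table; in practice PHP's built-in session_start() and $_SESSION handle this bookkeeping for you.

```php
<?php
// Hypothetical in-memory stand-in for a sessions table in a database.
final class SessionStore
{
    /** @var array<string, array> session ID => stored state */
    private array $rows = [];

    // Create a session record and return the opaque ID that goes in the cookie.
    public function create(array $state): string
    {
        $id = bin2hex(random_bytes(16)); // 32 hex characters, hard to guess
        $this->rows[$id] = $state;
        return $id;
    }

    // Look up the state for the ID the browser sent back, or null if unknown.
    public function find(string $id): ?array
    {
        return $this->rows[$id] ?? null;
    }
}

$store = new SessionStore();

// First request: remember who logged in, then hand only the ID to the browser.
$sessionId = $store->create(['user' => 'steve', 'visits' => 3]);
// In a web request this would be sent with something like:
// setcookie('sid', $sessionId, ['httponly' => true]);

// Later request: the browser returns the cookie and the server recovers the state.
$state = $store->find($sessionId);
```

The cookie carries nothing but the ID; everything worth protecting stays on the server.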
You can authenticate without sessions via HTTP server configuration, as long as you like the dull dialog box the browser pops up when users enter a restricted area. Oh, and you don't mind the fact that users won't be able to "log out" without quitting their browser, nor can you force a logout after a certain timeout value. Nor can you allow users to register themselves... these are all existing, solved problems and the author shows some of the best solutions. Common tasks like allowing users to change their passwords, recover their passwords if (I mean when) they forget them, and arranging users in groups to which you assign common permissions are also covered.
My favorite example from this chapter is the humans-only registration application. Remember when online voting for the Major League Baseball All-Star Game first started? Anyone who knew how to write a web client could have automated a task to vote as many times as the server could handle, and have his favorite players be the all-star team.* To bring it closer to home, what if somebody decides to bog down your site by automatically registering a huge number of times and filling up your database? You can keep these things from happening by making users look at images which contain text but are hard for computers to "read." PHP is in use at all stages of this game, from writing the registration form's HTML to generating the obscured image on-the-fly.
Chapter 2: XML
XML is a fact of life and, hype aside, is a great way to store and transmit machine-readable data. One of the most visible applications is the thousands of bloggers and news sites providing XML feeds of their headlines. You can write portal sites that grab these headlines, parse them all and present them on your site with links to the full text at the source.
There are two ways to parse XML: with events, or by using the Document Object Model (DOM). The methodologies are similar to reading a plain-text file line-by-line or all at once. Using events you can implement a finite-state machine based on which tags and text come down the pike. Or you can slurp the whole document into memory and find any part of it with ease. The built-in library for the former is based on the popular Simple API for XML (or SAX; don't you like those nested acronyms?), while the latter often uses XPath to find the particular document nodes you want.
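The difference is easier to see side by side. The sketch below is mine, not the book's, and assumes PHP 7.4 or later with the standard xml and dom extensions loaded. The feed and the function names saxItemTitles and domItemTitles are made up for illustration; both functions pull the item titles out of the same small RSS-like document.

```php
<?php
// A tiny RSS-like document (invented for this example).
$rss = <<<XML
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First headline</title><link>http://example.com/1</link></item>
    <item><title>Second headline</title><link>http://example.com/2</link></item>
  </channel>
</rss>
XML;

// Event-based parsing: a small state machine driven by callbacks.
function saxItemTitles(string $xml): array
{
    $titles = [];
    $inItem = false;
    $inTitle = false;
    $buffer = '';

    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$inItem, &$inTitle, &$buffer) {
            if ($name === 'item') {
                $inItem = true;
            } elseif ($name === 'title' && $inItem) {
                $inTitle = true;   // ignore the channel's own <title>
                $buffer = '';
            }
        },
        function ($p, $name) use (&$inItem, &$inTitle, &$buffer, &$titles) {
            if ($name === 'title' && $inTitle) {
                $titles[] = trim($buffer);
                $inTitle = false;
            } elseif ($name === 'item') {
                $inItem = false;
            }
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$inTitle, &$buffer) {
            if ($inTitle) {
                $buffer .= $data;  // text may arrive in several chunks
            }
        }
    );

    xml_parse($parser, $xml, true);
    return $titles;
}

// DOM parsing: load everything, then ask XPath for the nodes you want.
function domItemTitles(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);

    $titles = [];
    foreach ($xpath->query('//item/title') as $node) {
        $titles[] = trim($node->textContent);
    }
    return $titles;
}
```

The event version never holds more than one title in memory but needs flags to know where it is; the DOM version is shorter but loads the whole document first.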
The author shows how to parse RSS feeds with both SAX and the DOM, and how to render a feed with DOM. Further, you can use Extensible Stylesheet Language Transformations (known as XSLT) to transform XML -- whether it's to XHTML for regular browser reading, WML (Wireless Markup Language) for viewing on mobile phones, or even SQL to communicate with a database.
Another exciting XML application is in the area of web services, in which agents (often but not necessarily web servers) communicate with each other over an XML-based protocol built on top of HTTP. The two most popular protocols are XML-RPC (the RPC stands for "Remote Procedure Calls") and SOAP (which used to stand for "Simple Object Access Protocol" but now is just a name). Frequently changing information such as stock prices and weather is often offered through web services, but they can also be used as an object API between agents over the network. What's cool about using SOAP is you can publish to clients exactly what services you offer and how they can call them using the Web Services Description Language (WSDL).
Chapter 3: Alternative Content Types
If you've ever printed out a web page that was designed for browser viewing, you know the less-than-desired effect. The navigational elements, search boxes, and banners, while necessary for the web page, are useless once a static copy is printed. Furthermore, you need to extend your site to include users with less-featured browsers, such as mobile phones.
Fortunately, PHP has been taught many languages. PDF is the standard for print-quality documents, and there are several libraries (free and non-free) which allow you to generate them. WML is the HTML of cell phone browsers, in which screen space is at a premium and bandwidth scarce. SVG is an XML application which allows vector-based images like PostScript does. The coolest example, however, uses XUL (the XML User Interface Language, not to be confused with Zool) to make full GUI applications that you run through Mozilla. This isn't useful for the outside world where you can't force your users to use Mozilla (sigh), but works well for intranet applications that run on a variety of platforms.
The author also brings up in this chapter an HTML SAX parser he has written. You can process HTML pages chunk-by-chunk and extract the pieces you want. I hadn't known about such a class until I read the book and I'm very excited I know about it now. For sometimes it's necessary to parse a web page meant for humans to read (perhaps to pretend to be a user and automate your all-star voting), and most HTML pages won't validate as HTML, let alone XML.
A good point here is that a well-designed, tiered application will allow you to swap out different presentation classes with little code rewrite. Separating the task of extracting data from the database from the task of presenting it to the user in a variety of formats is a common chore, and once done right it makes each subsequent format easier to add.
Chapter 4: Stats and Tracking
Once your site is up and running, you'll be interested to know which parts are the most active, and how much traffic you're getting. Into a dynamic page you can obviously insert any logging mechanism, but a great place to put it is inside your site's logo. PHP can send binary data as easily as text.
Why would you want to do this?
- The logo is usually on every page (or it should be). You don't have to cut-and-paste code.
- You can serve the image, then use the flush command to send the output on and do extra processing. This way logging doesn't get in the way of page rendering.
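Here is roughly what a logging logo could look like. This is my own sketch under stated assumptions, not the book's code: the function name serveLogoAndLog, the file paths, and the tab-separated log format are all hypothetical. One caveat I am sure of: if PHP output buffering is switched on, flush() alone is not enough and the buffer has to be flushed too (with ob_flush(), for example).

```php
<?php
// Hypothetical helper: send the logo bytes first, then record the hit.
function serveLogoAndLog(string $imagePath, string $logPath, array $server): void
{
    if (!headers_sent()) {
        header('Content-Type: image/png');
        header('Content-Length: ' . filesize($imagePath));
    }
    readfile($imagePath); // write the image to the output
    flush();              // push PHP's pending output on toward the client

    // Everything below runs after the image has been handed off.
    // For an image request, the referer is the page that displayed the logo.
    $line = sprintf(
        "%s\t%s\t%s\n",
        date('c'),
        $server['REMOTE_ADDR'] ?? '-',
        $server['HTTP_REFERER'] ?? '-'
    );
    file_put_contents($logPath, $line, FILE_APPEND | LOCK_EX);
}

// In a script such as logo.php, the call might look like this:
// serveLogoAndLog(__DIR__ . '/logo.png', __DIR__ . '/hits.log', $_SERVER);
```

Every page then points its logo image at the script instead of at a static file.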
There are lots of packages available to collect and analyze data. The author goes through phpOpenTracker, which is quite rich in features. There are also ways to collect data on what links users follow to leave your site, and to keep requests from search engines from cluttering your log files.
Chapter 5: Caching
Another possible knock against PHP is that, while it's good to have dynamic pages, some pages are unnecessarily so. It is a waste of server resources to keep rendering the same page anew. There are different ways to conserve.
On the client side, you can use HTTP 1.1 headers like Cache-Control and Expires to tell browsers when it's okay to store cached copies locally.
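As a small illustration of the client-side approach (my sketch, not the book's), the hypothetical helper cacheHeaders below builds the two header lines for a page that may be cached for a given number of seconds. The "public" directive is an assumption that the page holds nothing user-specific.

```php
<?php
// Hypothetical helper: build the header lines that let a browser cache a page.
function cacheHeaders(int $maxAgeSeconds, int $now): array
{
    return [
        'Cache-Control: public, max-age=' . $maxAgeSeconds,
        // Expires takes an HTTP date, which is always expressed in GMT.
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $maxAgeSeconds) . ' GMT',
    ];
}

// In a web request, send them before any page output:
// foreach (cacheHeaders(3600, time()) as $line) { header($line); }
```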
On the server side, as can be expected, you have a greater level of control. You can use output buffering to delay sending of output to the browser, then save a copy of the output locally. On subsequent requests, you can serve the file rather than generate the HTML all over again. This can be implemented on a chunk (or block) level, so that you can keep some parts ultra-time sensitive and others not so much. The package PEAR::Cache_Lite can help with this.
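The output-buffering trick fits in a dozen lines. What follows is a hand-rolled sketch of the idea, not the PEAR::Cache_Lite API and not the book's code; the function name cachedOutput and its parameters are my own invention.

```php
<?php
// Hypothetical helper: return cached HTML if it is fresh, else render and store it.
function cachedOutput(string $cacheFile, int $ttlSeconds, callable $render): string
{
    // Serve the saved copy while it is younger than the time-to-live.
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttlSeconds) {
        return file_get_contents($cacheFile);
    }

    ob_start();              // capture everything $render echoes
    $render();
    $html = ob_get_clean();  // fetch the captured output and stop buffering

    file_put_contents($cacheFile, $html, LOCK_EX);
    return $html;
}

// Usage: wrap just the expensive block and echo whatever comes back.
// echo cachedOutput('/tmp/headlines.html', 300, function () {
//     echo '<ul><li>...slow database work here...</li></ul>';
// });
```

Because the function wraps any callable, it can cache a single block of a page while the rest stays fully dynamic.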
Chapter 6: Development Technique
The last two chapters were my favorites of the two-volume set. They are on a higher level of abstraction than the features of PHP's library of functions, or the previous five chapters on real-world solutions. After you've reached a certain level of expertise in PHP coding, you begin to wonder about the "right" way to do things. The author shows how to use Xdebug to find bottlenecks in your code, as well as a few quick optimization tips (for instance, design your flow control so that the first choice is the one most often taken).
He then discusses the principles of N-tiered design. N is usually 5, but the data layer (usually a database or file system) and presentation layer (usually the browser) are most often handled outside of PHP, so you normally have three levels to worry about:
- Data Access: Getting data from the outside world into your application
- Application Logic: Doing whatever unique thing your application is supposed to do
- Presentation Logic: Forming a response in a format acceptable to your client
Keeping these layers separate and restricting them to communicating through well-defined interfaces allows you maximum flexibility. If you need to change databases (say you just got venture capital money and can afford Oracle now), you can do so by changing only one layer. If you want to serve different flavors of HTML, or different markup languages altogether, or binary data, you can do so by changing only one layer. You can even strive for maximum distributability by enabling your layers to "live" on physically independent machines and communicate with XML-RPC or SOAP.
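A stripped-down sketch of the three layers might look like the following. It is my own illustration, assuming PHP 7.4 or later, and every name in it (ArticleStore, InMemoryArticleStore, Renderer, HtmlRenderer, TextRenderer, headlinePage) is hypothetical rather than taken from the book.

```php
<?php
// Data access layer: the only code that knows where articles come from.
interface ArticleStore
{
    /** @return string[] titles, most recent first */
    public function latestTitles(int $limit): array;
}

// Hypothetical in-memory store standing in for a database-backed one.
final class InMemoryArticleStore implements ArticleStore
{
    private array $titles;

    public function __construct(array $titles)
    {
        $this->titles = $titles;
    }

    public function latestTitles(int $limit): array
    {
        return array_slice($this->titles, 0, $limit);
    }
}

// Presentation layer: turns plain data into a response format.
interface Renderer
{
    public function render(array $titles): string;
}

final class HtmlRenderer implements Renderer
{
    public function render(array $titles): string
    {
        $items = array_map(
            function (string $title): string {
                return '<li>' . htmlspecialchars($title) . '</li>';
            },
            $titles
        );
        return '<ul>' . implode('', $items) . '</ul>';
    }
}

final class TextRenderer implements Renderer
{
    public function render(array $titles): string
    {
        return implode("\n", $titles);
    }
}

// Application logic: depends only on the two interfaces, never on a concrete class.
function headlinePage(ArticleStore $store, Renderer $renderer, int $limit = 10): string
{
    return $renderer->render($store->latestTitles($limit));
}
```

Swapping the database means writing another ArticleStore; serving a new format means writing another Renderer. The function in the middle does not change.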
Documenting your code is essential. Anybody who's been programming for over a year has gone back to code he or she's written and thought, "Now what the heck was this supposed to do?" It's even more essential when you write something and wish to distribute it for the benefit of others. You can expect them to grok your code at an even lower rate since they didn't write it the first time.
Luckily, scripted languages like PHP are excellent at parsing text files, including PHP scripts themselves. Using well-defined documentation formats akin to JavaDoc, you can embed documentation in your code inside comments, and use tools like phpDocumentor to extract these documentation blocks and format them as nice, cross-referenced HTML. In fact, writing doc blocks before your code is a good way to think ahead about how you want your classes and methods to work.
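For readers who haven't seen one, a doc block looks like this. The function pageOf is a made-up example of mine; the @param and @return tags are the standard ones that phpDocumentor reads.

```php
<?php
/**
 * Returns the slice of items that belongs on one page of results.
 *
 * @param array $items   All items, in display order.
 * @param int   $page    One-based page number.
 * @param int   $perPage Number of items per page.
 *
 * @return array The items for the requested page (empty if out of range).
 */
function pageOf(array $items, int $page, int $perPage): array
{
    return array_slice($items, ($page - 1) * $perPage, $perPage);
}
```

Writing the comment first forces the decision about whether pages are numbered from zero or one before any code depends on it.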
Unit Testing, one of the most digestible dogmas of Extreme Programming, is an awesome way to test your code for logic errors. You build up tiny test cases (using mock objects to isolate the class you're testing) and build as many as you like. Once you do this (PHPUnit and SimpleTest are two rich frameworks), you keep your tests and each time you add features, you run your tests to make sure you haven't added bugs as well.
Chapter 7: Design Patterns
Design Patterns is one of the modern classics in information technology. After having done OOP for a while, you will inevitably get the feeling of deja vu that you've solved a problem before. Not so concretely as "I need a database abstraction layer," or "I need a templating system," but as in "I need a way to create objects without specifying exactly what class they belong to," or "I'm tired of writing so many if statements." Design patterns are common object architectures which can be used to solve common (though unique) problems.
Many design patterns are more suited to state-equipped applications with GUIs, but there are plenty to assist the PHP coder. The Factory Method is a pattern through which an object can create other objects of varying classes. So instead of writing mysql_connect everywhere, then having to change every occurrence of that function, you can abstract all database interaction to a class, then instantiate a database connection through a class method of another class: $db = MyApp::getDatabaseConnection(). This is useful when the connection (not just the RDBMS, but the actual database) you want varies depending on whether you are developing, testing, or going live with your application. Factory methods are also a good way to avoid global configuration variables.
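Fleshing out that one-liner, a factory method along these lines would do the job. This is my sketch, assuming PHP 7.4 or later, and only the call MyApp::getDatabaseConnection() comes from the text above. The optional $environment argument, the APP_ENV variable, the connection classes, and the DSN strings are all hypothetical, and the connection classes are stand-ins that merely record a DSN instead of opening a real connection.

```php
<?php
interface DatabaseConnection
{
    public function dsn(): string;
}

// Hypothetical stand-ins; real classes would wrap an actual database driver.
final class MysqlConnection implements DatabaseConnection
{
    private string $dsn;

    public function __construct(string $dsn)
    {
        $this->dsn = $dsn;
    }

    public function dsn(): string
    {
        return $this->dsn;
    }
}

final class SqliteConnection implements DatabaseConnection
{
    private string $dsn;

    public function __construct(string $dsn)
    {
        $this->dsn = $dsn;
    }

    public function dsn(): string
    {
        return $this->dsn;
    }
}

final class MyApp
{
    // Factory method: callers ask for "a connection" and never name a class.
    public static function getDatabaseConnection(?string $environment = null): DatabaseConnection
    {
        // Assumption: the environment comes from an APP_ENV variable if not passed in.
        $environment = $environment ?? (getenv('APP_ENV') ?: 'development');

        switch ($environment) {
            case 'production':
                return new MysqlConnection('mysql:host=db.example.com;dbname=app');
            case 'testing':
                return new SqliteConnection('sqlite::memory:');
            default:
                return new SqliteConnection('sqlite:/tmp/app-dev.sqlite');
        }
    }
}

$db = MyApp::getDatabaseConnection('testing');
```

The decision about which database to use now lives in exactly one place, and nothing else in the application needs a global to find it.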
The Iterator Pattern and the Observer Pattern are two others mentioned in this chapter. Iterators are used often in paging through database results. Observers are used to let objects notify other objects of changes in their state. This chapter will make you want to go read the whole Gang-of-Four book if you haven't already.
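Both patterns are small enough to sketch here. The code is mine rather than the book's, assumes PHP 7.4 or later, and uses made-up names throughout (PagedResults, StockObserver, Stock, PriceLog). IteratorAggregate, ArrayIterator, and Traversable are standard PHP; PHP's SPL also ships ready-made SplObserver and SplSubject interfaces, which I have not used so as to keep the example self-explanatory.

```php
<?php
// Iterator: walk a result set one page at a time without exposing how it is stored.
final class PagedResults implements IteratorAggregate
{
    private array $rows;
    private int $perPage;

    public function __construct(array $rows, int $perPage)
    {
        $this->rows = $rows;
        $this->perPage = $perPage;
    }

    // Each step of the iteration yields one page (an array of rows).
    public function getIterator(): Traversable
    {
        return new ArrayIterator(array_chunk($this->rows, $this->perPage));
    }
}

// Observer: objects that want to hear about changes implement this interface.
interface StockObserver
{
    public function priceChanged(string $symbol, float $price): void;
}

final class Stock
{
    private string $symbol;
    private float $price;
    /** @var StockObserver[] */
    private array $observers = [];

    public function __construct(string $symbol, float $price)
    {
        $this->symbol = $symbol;
        $this->price = $price;
    }

    public function attach(StockObserver $observer): void
    {
        $this->observers[] = $observer;
    }

    public function setPrice(float $price): void
    {
        if ($price === $this->price) {
            return; // no change in state, so nobody is notified
        }
        $this->price = $price;
        foreach ($this->observers as $observer) {
            $observer->priceChanged($this->symbol, $price);
        }
    }
}

// One possible observer: keeps a running log of the changes it was told about.
final class PriceLog implements StockObserver
{
    public array $lines = [];

    public function priceChanged(string $symbol, float $price): void
    {
        $this->lines[] = sprintf('%s is now %.2f', $symbol, $price);
    }
}
```

A PagedResults object drops straight into a foreach loop, and a Stock never needs to know what its observers do with the news.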
My biggest beef with the book is that this wasn't presented earlier on, perhaps at the beginning of Volume II. As a climax, it leaves me flat, wondering how the rest of the volume could have been derived from this very cool concept. But most PHP books conclude with chapters on how to extend PHP on the C level, or giant case studies involving massive code dumps, and I'm often not satisfied with them. This is a nice philosophical note to go out on. And there's something to be said for the argument that books like these aren't written to be read cover-to-cover.
Appendices
The book closes with the same appendices as in Volume I. Since I don't know the URL of my review of that volume, I'll just copy: You can read about which configuration directives you're probably most interested in (the complete list you can get on PHP's web site), some common security breaches, and how to install PEAR, PHP's version of CPAN. My favorite appendix is the "Hosting Provider Checklist," a great reference for evaluating whether kewlhosting.com is going to give you the freedom and support you need to make a great hosted web site.
This volume was informative, well-written, and inspirational in that it made me want to go out and add cool and useful features to my web sites. Check it out if you can.
*Not really (not that I tried or anything), but they've always been a little bit smarter about it. You get my point, though. This did happen on an ESPN.com Page 2 mascot popularity contest, but they noticed through request headers that millions of votes were coming from the same place, and invalidated all those votes.
In real life, Matthew Leingang is Preceptor in Mathematics at Harvard University. He promises to review any book sent to him for free, and sometimes actually does it. Both volumes of The PHP Anthology are available from SitePoint. Slashdot welcomes readers' book reviews; to see your own review here, carefully read the book review guidelines, then visit the submission page.
The future of the web is the Semantic Web. XML is just 1/5th of the layer to fully implement it. Go look for things like OWL-S and study that as a supplement to PHP handoff.
100% Crunchier
And you profess to be a geek! Haven't you heard of sandals-and-socks?
What's another word for Thesaurus?
-Steve Wright
Does it have a chapter on how to talk to tech support at a hosting service to:
Kidding aside, I have a love-hate relationship with PHP because of having to support applications where I don't have total control over how PHP is installed.
(S(SKK)(SKK))(S(SKK)(SKK))
I just shudder when I hear the words PHP and enterprise. I simply would never write an app. that I absolutely depended on in PHP. It's simply too easy to make mistakes in, and thrives on mixing the display (view) of data with business logic. PHP was "designed" by people who never actually thought through exactly what the problem was that they were trying to solve.
...there's 400 posts from people who don't realise that register_globals has been turned OFF by default for years and only outdated old PHP scripts and guides need it turned on.
If memory serves me correctly:
1) Someone did exactly that, to try to put Nomar Garciaparra on the AL team.
2) I'm not 100% sure about this, so I'm not going to say the name but -- isn't the person responsible now one of the Slashdot editors?
Not sure if Timothy is obliquely referring to that incident or if he sincerely doesn't know about it. It makes me laugh, though, especially when we periodically get stories about Microsoft encouraging their employees to vote in a web poll and I read all the "WHY AREN'T THEY IN JAIL???!!??" comments. I'll also laugh when Nomar and his bogus injury hit .220 in Chicago.
What I'm listening to now on Pandora...
If this is offtopic, this -> http://developers.slashdot.org/comments.pl?sid=116764&cid=9923016
might also be offtopic, but it isn't! It's rather "ontopic".
Is the 'e' a typo in Harrys' name?
must... not... comment... on... Harrys... last... name... nnghhh!
I am waiting for a book about PHP and web services, going beyond the obvious google/amazon examples, a book that explores complex applications.
Posts so far:
offtopic 8
fuecks jokes 6
obvious flames 5
flamebites 6
agrees with flame 1
trolls 2
funnies 2
explanation of the funnies 1
refunny the funny 2
php related 3
Perl is better 2
misc 1
Doh! Not there!
Settings shouldn't all have to reside in a global config; that is stupid. And function semantics should not change between minor versions, especially with a package like PHP that has to be upgraded every couple of weeks because of security issues. I never had to notify anyone when I upgraded perl; in fact they never even noticed.
Not only that, but you don't need someone else to load an optional library for you, php lets you do that in your scripts.
In all honesty, writing a PHP enterprise-level application [declarative distributed transactions, etc.] with the same featureset as a well-written J2EE application would take many orders of magnitude longer...
And from a language standpoint, PHP's weak typing isn't the best thing in the world to deal with at that level - finding type errors at run time rather than compile time is [often] a bad thing.
This is not to say that PHP isn't suitable for good web applications - it's just that it's difficult to write and maintain robust distributed systems with it.
Think of a 200,000 line perl program vs. a 200,000 line java/c# program - which would [generally] be easier to write and maintain?
about Fuecks? I really expected more. I am disappointed, guys!
Sorry to be so on-topic, but does anyone have a take on how this book compares to George Schlossnagle's Advanced PHP Programming, which appears to cover a lot of the same ground?
Aren't you meant to have a link to goat.cx in there somewhere?
LOL...Funny because it's true
Harry Fuecks who?
Enough with the PHP articles already. I'd rather hear about Ruby, and I hate Ruby.
Actually this was hilarious. It's clearly meant to be humorous. Please mod parent up for humor.
----
How to Make Work Enjoyable