HTTP: The Definitive Guide
OK, so I answered "C". I am going to make the bold claim that HTTP: The Definitive Guide, the long-awaited O'Reilly book on HTTP, is ambitious enough in breadth and depth that if you answered "B," "C," or "D," you will find this book useful and informative. This is primarily due to the book's clear organization, as well as its friendly (even chummy) writing style.
Even if you are a technically-inclined sort from the Marketing department, and answered "A," you could get a good technical overview of the plumbing of the Web by skimming through this book; plus, having any O'Reilly book on the shelf in your cubicle would score you some street cred with the guys sitting over in Development -- this could be the one you've actually read. :-)
Breadth
Unless you answered "D," HTTP is more complicated than you think. This is especially true if, as the authors of a good technical book should do (and these authors do), one spends some time touching on matters one level down (to TCP/IP, and other areas, in this case), and one level up (to HTML, generally, in this case). Because the authors are particularly concerned with HTTP performance, details of the interactions between HTTP and adjacent levels can be important.
The book is divided into five main sections: 1) an overview of HTTP, URLs, and connection management; 2) HTTP Architecture, including Web servers, proxies, caches, gateways, tunnels, robots; 3) Identification, Authorization, and Security; 4) Entities, Encodings, and Internationalization; 5) Content Publishing and Distribution, including hosting, publishing, load balancing, logging. So, even if you classify yourself as a "D," or even if you are hacking on an extensible open-source router software platform (in that case, you are an "F"), you will find yourself pulling this book from the shelf from time to time to check on something in one of these areas. The modular organization of the book is good.
The full Table of Contents is available online.
Depth
One (unfortunate?) thing about the Web is that its "architecture" (if you can even call it that) evolved and grew piece by piece. The design goals people had in mind back in 1993, or even in 1999, have been blown away by what has happened on the ground. Inter-company politics have also been a big factor -- never helpful for promoting standardization, or sound design. (Perhaps another problem has been the lack of an O'Reilly book on HTTP to tie everything together!) Hence, not only do you have a confusing mass of obsolete and/or overlapping specification documents, you also have major differences between how different browsers, servers, and proxies adhere to these specifications in practice.
This is one place the book shines: sprinkled throughout the pages are little tidbits about compatibility or performance pitfalls, gleaned from much practical experience. (The authors were some of the architects of Inktomi's Traffic Server "enterprise class" Web cache. Think "proxy caching for all of AOL's Web traffic.") As one example: "Technically, any Connection header fields (including Connection: Keep-Alive) received from an HTTP/1.0 device should be ignored, because they may have been forwarded mistakenly by an older proxy server. In practice, some clients and servers bend this rule, although they run the risk of hanging on older proxies." I can just imagine the series of bug reports leading to the inclusion of that piece of advice in the book. There are many other such warnings and bits of advice, generally aimed at HTTP application developers, often with an eye to performance tuning.
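To make the quoted rule a little more concrete, here is a minimal Python sketch of its strict reading. It is not taken from the book: the function name and the plain-dict header representation are assumptions chosen for illustration, and the "bend the rule" behavior the authors mention is deliberately not modeled.

```python
def effective_connection_tokens(http_version, headers):
    """Return the Connection tokens a receiver should honor.

    Strict reading of the quoted advice: if the message came from an
    HTTP/1.0 device, ignore every Connection token (including
    Keep-Alive), since an older proxy may have forwarded it blindly.

    http_version: version string such as "HTTP/1.0" or "HTTP/1.1".
    headers: dict of header name -> value (hypothetical representation).
    """
    major, minor = (int(p) for p in http_version.split("/", 1)[1].split("."))
    if (major, minor) < (1, 1):
        return set()

    tokens = set()
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "connection":
            tokens.update(
                tok.strip().lower() for tok in value.split(",") if tok.strip()
            )
    return tokens
```

Under this sketch, `Connection: Keep-Alive` on an HTTP/1.0 message yields an empty set, while the same header on an HTTP/1.1 message yields `{"keep-alive"}`.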
Here again, appropriate depth of discussion for a variety of readers is handled by clear organization of the book. The basic background material is laid out, and as the authors dive deeper into detail they may make a suggestion like, "If you are [not] writing high-performance HTTP software... feel free to skip ahead." Then, at the end of every chapter, there is a section labelled, "For More Information," which is a collection of relevant references and links, for those who want to dig into the source documents themselves.
Cautions
This book review is addressed to the Slashdot crowd, a very technically savvy audience, so it's appropriate to mention what this book is not. It's not a detailed technical reference on all the topics mentioned in the table of contents (above); it would be tough to fit all that material into the book's 650-plus pages. However, the book is a good overview of HTTP and many related topics. The book does dip down into the grungy detail in many areas, but this won't be your only reference if you are a Web application developer.
Conclusion
Overall, this is one of the more accessible O'Reilly books I own. While experts will certainly seek out greater depth in their particular area of expertise, few people are expert in the whole range of topics related to HTTP that this book covers. The book also provides many tips drawn from practical experience, and references to more detailed material. HTTP, if not the heart and soul of the Web (perhaps that is Web content itself), could perhaps be called the Web's circulatory system. If you have a professional interest in Web content distribution, or Web application development, I believe this book deserves a spot on your shelf.
You can purchase HTTP: The Definitive Guide from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
It's nice to see a review like this. Many Slashdot reviews are short and light on detail, but this one is a good overview, which I like.
As much as I want to know about the underpinnings of HTTP, I find this one of those "books I'd like to HAVE read." If I buy it, which I may, I'm pretty sure it will be one of those books I just don't get around to reading because I personally don't have a huge need for it. I'd love to know the information, but I don't know I have the time to pull off actually reading it. Is it just me, or does everyone have a few of those books - the ones you wish you had actually read, but instead just look nice as part of your technical book collection?
I guess there's at least one positive about the Matrix - I can make a quick phone call and have my operator just load "The Complete HTTP" for me.
I figure XHTML 2 is going to require a big re-design of everything anyway, why not design an HTTP 2.0 to go with it?
But HTTP 1.1 has been out a while, and there isn't anything really new on the horizon. This book will probably have a longer life than many.
Have you ever noticed the 'X-Bender: something goes here' field in Slashdot HTTP responses? I sometimes make thousands of requests a day just to see how many there are. So maybe CowboyNeal did give good header.
this sig is deprecated
Standards should be lean and so easy to understand and so trivial to implement that one undergrad student can implement it to full compliance in one afternoon.
HTTP 1.1 has over 100 pages, most of them absolutely useless for implementors. Unnecessary verbiage, unnecessary optional parts, unnecessary warts, unnecessary "I'm working on a thesis about foo, let's put it in this standard and see what happens" crap.
Examples: chunked encoding -- absolutely superfluous! Amazingly useless. Or what about the range support? HTTP allows a client to request a byte range from a file. Normally you would use that to fetch the second half of an aborted download, or maybe for PDF reading you would fetch bytes 10 to 100 or so. HTTP 1.1 allows a client to specify several ranges in the same request, and the server is expected to construct some MIME abomination as an answer, if it supports this at all. If it doesn't, it is allowed to coalesce the ranges and just send the whole range. This makes the feature horrendously useless for clients: why bother with it if a) you have to implement some sort of complicated parser to understand the result, b) you won't even save bandwidth because the server isn't going to implement it in the first place, and c) it is not even cheaper than just using keepalive connections and asking for the parts one by one?
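For what it's worth, the range handling described above is easy to sketch. The Python below is a rough illustration only, not any real server's code: `parse_ranges` and `coalesce` are hypothetical names, only the `bytes` unit is handled, and the multipart response itself is left out entirely.

```python
def parse_ranges(header, size):
    """Parse a Range header value like "bytes=0-99,-100" for a
    resource of `size` bytes into inclusive (start, end) pairs.
    Ranges that cannot be satisfied are dropped."""
    unit, _, spec = header.partition("=")
    if unit.strip().lower() != "bytes":
        raise ValueError("only byte ranges are handled in this sketch")

    ranges = []
    for part in spec.split(","):
        first, sep, last = part.strip().partition("-")
        if not sep:
            raise ValueError("malformed range: %r" % part)
        if first == "":
            # Suffix form "-N": the last N bytes of the resource.
            n = int(last)
            start, end = max(size - n, 0), size - 1
        else:
            start = int(first)
            # Open-ended "N-" runs to the end; clamp overlong ends.
            end = min(int(last), size - 1) if last else size - 1
        if start <= end:
            ranges.append((start, end))
    return ranges


def coalesce(ranges):
    """Merge overlapping or adjacent inclusive ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

With a 1000-byte file, `bytes=0-99,50-149,-100` parses to three ranges and coalesces to two: bytes 0 to 149 and bytes 900 to 999.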
In short: HTTP needs to die quickly and be replaced by something sane.
Did I mention the monstrosity that is content negotiation? It is impossible to write a proxy that can cache content in the face of content negotiation. Luckily, nobody uses it on their servers, because it is a pig to implement and configure on the server. Clients tend to support it, but who cares.
That isn't best practice. That is saying "Do this, unless there are exceptional circumstances". That is part of the protocol. Best practice is where there is an appropriate algorithm that most implementations have settled upon. It's a subtle difference, but it's definitely there.
What complete and utter egotistical bollocks. I'm sure I could code up an implementation simply by following the RFC - but there's no way in hell a responsible developer would, given the choice. There's plenty of experience codified in this book - experience that my ego allows me to benefit from, unlike Real Programmers like you, who seem to be too macho to know what is good for them.
So you wrote an implementation "without issue", and then state "The problem comes in..."?
See what's wrong with this picture? Books like this include information on what brain damages are floating around out there. The RFC doesn't. You may have written a conformant implementation, but what most people are after has to be useful too. Perhaps if your ego allowed it, you would have bought this book and actually been productive.
Well, sometimes the X-Bender field is an X-Fry field. Did you notice?
Unselfish actions pay back better
The spec and books are both good sources of information on HTTP, but I find it difficult to actually apply the knowledge.
I recently interviewed for a position requiring intimate HTTP knowledge. Rather than try to understand every bit of the spec, I just captured all of my clear-text HTTP traffic for a night of surfing. I then looked at the actual HTTP exchanges between my web browser and various servers, and looked up anything I didn't understand in the spec and other sources.
I also learned some things that weren't in the spec and that were helpful in the interview, like how session keys are structured on various servers, etc.
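If you want to try the same exercise, a few lines of Python are enough to pull apart a captured response so you can look each header up. This is only a sketch: it assumes you already have the raw clear-text response as a string (how you capture it is up to you), and `parse_response_head` is a made-up helper name.

```python
from email.parser import Parser


def parse_response_head(raw):
    """Split a captured HTTP response into
    (version, status code, reason phrase, headers dict)."""
    # The head ends at the first blank line; the body is ignored here.
    head, _, _body = raw.partition("\r\n\r\n")
    status_line, _, header_block = head.partition("\r\n")

    parts = status_line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    # HTTP headers share the "Name: value" syntax of mail headers,
    # so the standard library's email parser can read them.
    message = Parser().parsestr(header_block, headersonly=True)
    return version, code, reason, dict(message.items())
```

Feed it one captured response at a time and print the resulting dict. Note that a plain dict keeps only one value for a header that appears more than once.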
(Score:-1, Wrong)
IOW the book is a good pocket reference but no substitute for the RFCs.
What about text-decoration: blink?
d) I use a trinary computer, you insensitive clod!
I think you meant a ternary computer.
I used that on a midrange programmer once. He was pissed. Claimed I made up base three and that there was nothing between binary and octal.