Slashdot Mirror


HTTP: The Definitive Guide

Michael Palmer writes "OK, how well you know HTTP? Here's a pop quiz: QUESTION: Did you know that the Keep-Alive header was valid in HTTP 1.0, but has been deprecated in HTTP 1.1? A) What does "deprecated" mean? B) What is the "Keep-Alive header?" C) That's too bad - I kind of thought Keep-Alive was handy! D) Get with the program... HTTP 1.1 came out in 1999. The Internet boom is over already! Persistent connections are the default in HTTP 1.1 anyway." Answer (not necessarily your answer) and the rest of Palmer's review follows.

HTTP: The Definitive Guide
author: David Gourley, Brian Totty
pages: 656
publisher: O'Reilly & Associates; 1st edition (September 2002)
rating: excellent overview, plus detail in core areas
reviewer: Michael Palmer
ISBN: 1565925092
summary: An overview of HTTP and related topics

OK, so I answered "C". I am going to make the bold claim that HTTP: The Definitive Guide, the long-awaited O'Reilly book on HTTP, is ambitious enough in breadth and depth that if you answered "B," "C," or "D," you will find this book useful and informative. This is primarily due to the clear organization of the book, as well as its friendly (even chummy) writing style.

Even if you are a technically-inclined sort from the Marketing department, and answered "A," you could get a good technical overview of the plumbing of the Web by skimming through this book; plus, having any O'Reilly book on the shelf in your cubicle would score you some street cred with the guys sitting over in Development -- this could be the one you've actually read. :-)

Breadth

Unless you answered "D," HTTP is more complicated than you think. This is especially true if, as the authors of a good technical book should do (and these authors do), one spends some time touching on matters one level down (to TCP/IP, and other areas, in this case), and one level up (to HTML, generally, in this case). Because the authors are particularly concerned with HTTP performance, details of the interactions between HTTP and adjacent levels can be important.

The book is divided into five main sections: 1) an overview of HTTP, URLs, and connection management; 2) HTTP Architecture, including Web servers, proxies, caches, gateways, tunnels, robots; 3) Identification, Authorization, and Security; 4) Entities, Encodings, and Internationalization; 5) Content Publishing and Distribution, including hosting, publishing, load balancing, logging. So, even if you classify yourself as a "D," or even if you are hacking on an extensible open-source router software platform (in that case, you are an "F"), you will find yourself pulling this book from the shelf from time to time to check on something in one of these areas. The modular organization of the book is good.

The full Table of Contents is available online.

Depth

One (unfortunate?) thing about the Web is that its "architecture" (if you can even call it that) evolved and grew piece by piece. The design goals people had in mind back in 1993, or even in 1999, have been blown away by what has happened on the ground. Inter-company politics have also been a big factor -- never helpful for promoting standardization, or sound design. (Perhaps another problem has been the lack of an O'Reilly book on HTTP to tie everything together!) Hence, not only do you have a confusing mass of obsolete and/or overlapping specification documents, you also have major differences between how different browsers, servers, and proxies adhere to these specifications in practice. This is one place the book shines: sprinkled throughout the pages are little tidbits about compatibility or performance pitfalls, gleaned from much practical experience. (The authors were some of the architects of Inktomi's Traffic Server "enterprise class" Web cache. Think "proxy caching for all of AOL's Web traffic.") As one example: "Technically, any Connection header fields (including Connection: Keep-Alive) received from an HTTP/1.0 device should be ignored, because they may have been forwarded mistakenly by an older proxy server. In practice, some clients and servers bend this rule, although they run the risk of hanging on older proxies." I can just imagine the series of bug reports leading to the inclusion of that piece of advice in the book. There are many other such warnings and bits of advice, generally aimed at HTTP application developers, often with an eye to performance tuning.

Here again, appropriate depth of discussion for a variety of readers is handled by clear organization of the book. The basic background material is laid out, and as the authors dive deeper into detail they may make a suggestion like, "If you are [not] writing high-performance HTTP software... feel free to skip ahead." Then, at the end of every chapter, there is a section labelled, "For More Information," which is a collection of relevant references and links, for those who want to dig into the source documents themselves.

Cautions

This book review is addressed to the Slashdot crowd, a very technically savvy audience, so it's appropriate to mention what this book is not. It's not a detailed technical reference on all the topics mentioned in the table of contents (above); it would be tough to fit all that material into the book's 650-plus pages. However, the book is a good overview of HTTP and many related topics. The book does dip down into the grungy detail in many areas, but this won't be your only reference if you are a Web application developer.

Conclusion

Overall, this is one of the more accessible O'Reilly books I own. While experts will certainly seek out greater depth in their particular area of expertise, few people are expert in the whole range of topics related to HTTP that this book covers. In addition, the book provides many tips drawn from practical experience, and references to more detailed material. HTTP, if not the heart and soul of the Web (perhaps that is Web content itself), could perhaps be called the Web's circulatory system. If you have a professional interest in Web content distribution, or Web application development, I believe this book deserves a spot on your shelf.

You can purchase HTTP: The Definitive Guide from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

74 of 283 comments

  1. Wow, long article by TopShelf · · Score: 4, Funny

    I think I'll download it to my PDA and go deprecate for a while...

    --
    Stop by my site where I write about ERP systems & more
  2. Invalid Question by Anonymous Coward · · Score: 5, Funny

    True or false questions should not be followed by a list of four choices, none of which are "true" or "false."

    1. Re:Invalid Question by ePhil_One · · Score: 4, Funny
      True or false questions should not be followed by a list of four choices, none of which are "true" or "false."

      True or False questions are always prepended with (T or F). Trust me, I tried putting True down for an essay question once and it didn't work.

      --
      You are in a maze of twisted little posts, all alike.
    2. Re:Invalid Question by Cromac · · Score: 2, Insightful
      The correct statement should be:

      The Keep-Alive header was valid in HTTP 1.0, but has been deprecated in HTTP 1.1. True or False

      By adding "did you know" there isn't a good answer since both True and False are correct depending on who answers the question.

      Tests in school would have been much easier if they all started out with "Did you know...".

  3. Missing poll option by Anonymous Coward · · Score: 5, Funny

    I choose:

    E) CowboyNeal gives good header

    1. Re:Missing poll option by winse · · Score: 4, Interesting

      have you even noticed the 'X-Bender: something goes here' field in slashdot http responses? I sometimes make thousands of requests a day just to see how many there are. So maybe CowboyNeal did give good header.

      --
      this sig is deprecated
    2. Re:Missing poll option by mutende · · Score: 2, Interesting
      have you even noticed the 'X-Bender: something goes here' field in slashdot http responses?

      Well, sometimes the X-Bender field is an X-Fry field. Did you notice?

      --
      Unselfish actions pay back better
    3. Re:Missing poll option by onomatomania · · Score: 3, Informative

      I don't know what is more pathetic: that you would make requests just to see X-Bender headers, or that I would know where to look in the slashcode CVS to see the list (scroll down to the end of that page.)

  4. well by Joe+the+Lesser · · Score: 5, Funny

    A) What does "deprecated" mean?

    deprecated: adj. In a state of having soiled oneself. Johnny was not efficient enough and failed to reach the restroom, and was thus deprecated.

    --
    "I only speak the truth"
    Karma: null(Mostly affected by an unassigned variable)
    1. Re:well by fryguy451 · · Score: 3, Informative

      The first and fully accepted meaning of deprecate is "to express disapproval of." But the word has steadily encroached on the meaning of depreciate. It is now used, almost to the exclusion of depreciate, in the sense "to belittle or mildly disparage,".

      http://dictionary.reference.com/search?q=deprecated

    2. Re:well by Wakkow · · Score: 2, Funny

      Hopefully the next version will fix this bug..

    3. Re:well by Anonymous Coward · · Score: 2, Funny

      If you wrote in a less impenetrable style, people might be able to tell if you made sense or not.

  5. Yes/No or Multiple choice? by JUSTONEMORELATTE · · Score: 3, Funny

    QUESTION: Did you know that the Keep-Alive header was valid in HTTP 1.0, but has been deprecated in HTTP 1.1?

    Uhh, my answer is "No"


    1. Re:Yes/No or Multiple choice? by RetroGeek · · Score: 4, Funny

      How exactly does one ask a yes/no question and then give a multiple choice answer?

      You sir, are NOT a marketing guy....

      --

      - - - - - - - - - - -
      I am a programmer. I am paid to produce syntax not grammar. Deal with it.
    2. Re:Yes/No or Multiple choice? by keli · · Score: 2, Funny

      How exactly does one ask a yes/no question and then give a multiple choice answer?

      Like this:

      Is this a binary question?
      a) Yes
      b) No

      Duhh.... :-

    3. Re:Yes/No or Multiple choice? by JUSTONEMORELATTE · · Score: 3, Funny

      d) I use a trinary computer, you insensitive clod!


    4. Re:Yes/No or Multiple choice? by TobiasSodergren · · Score: 3, Funny

      Is this an unary question?

      a) what?

    5. Re:Yes/No or Multiple choice? by charlieo88 · · Score: 2, Interesting

      d) I use a trinary computer, you insensitive clod!

      I think you meant a ternary computer.

      I used that on a midrange programmer once. He was pissed. Claimed I made up base three and that there was nothing between binary and octal.

  6. Keep-Alive... by Xerithane · · Score: 5, Informative
    The HTTP 1.1 specification does distinguish between Keep-Alive and Close. By default the connection is persistent (Keep-Alive), but you can still turn that off (Connection: close\n)

    Mozilla Sends:
    GET / HTTP/1.1
    ...
    Keep-Alive: 300
    Connection: keep-alive
    Which isn't necessarily a bad thing, but they have to be backwards compatible in case they hit a poorly implemented HTTP 1.1 server. Gets annoying to code hybrid httpd systems.

    HTTP isn't that complicated of a specification though, the RFC is easy enough to understand.
    --
    Dacels Jewelers can't be trusted.
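
The rule described in the Keep-Alive comment above (persistent by default in HTTP 1.1, opt-in through Connection: keep-alive in HTTP 1.0) can be sketched roughly as follows. This is an illustration only, not code from the book or from any real server; the function name is_persistent and the trust_http10_keep_alive flag are invented for the example. The flag reflects the caution quoted in the review that a Connection header arriving from an HTTP/1.0 device may have been forwarded by mistake.

```python
def is_persistent(http_version, connection_header=None, trust_http10_keep_alive=True):
    """Return True if the connection should stay open after this message.

    http_version is a (major, minor) tuple such as (1, 0) or (1, 1).
    connection_header is the raw value of the Connection header, or None.
    """
    # Connection carries a comma-separated list of case-insensitive tokens.
    tokens = set()
    if connection_header:
        tokens = {t.strip().lower() for t in connection_header.split(",") if t.strip()}

    if http_version >= (1, 1):
        # HTTP/1.1: persistent unless the peer explicitly asks to close.
        return "close" not in tokens

    # HTTP/1.0: closed by default; keep-alive is an opt-in extension.
    # Setting trust_http10_keep_alive=False applies the stricter reading
    # in which Connection headers from 1.0 peers are ignored (assumption:
    # this is how a cautious implementation might expose that choice).
    return trust_http10_keep_alive and "keep-alive" in tokens


print(is_persistent((1, 1)))                # HTTP/1.1 with no Connection header
print(is_persistent((1, 1), "close"))       # HTTP/1.1 asking to close
print(is_persistent((1, 0), "Keep-Alive"))  # HTTP/1.0 opting in
```

Sending both Keep-Alive: 300 and Connection: keep-alive on an HTTP/1.1 request, as in the Mozilla example, is redundant under this logic but harmless.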
  7. RFCs have all the info you need by Anonymous Coward · · Score: 5, Informative

    Honestly, save yourself ~ $50 for an O'Reilly book and go directly to the source of the information:

    HTTP 1.0
    HTTP 1.1

    It's remarkably easy to read for a technical document.

    1. Re:RFCs have all the info you need by bwalling · · Score: 2, Insightful

      Honestly, save yourself ~ $50 for an O'Reilly book and go directly to the source of the information:

      HTTP 1.0
      HTTP 1.1


      Well, the organization of the RFCs isn't exactly what I'm looking for, there is useful commentary in the book, there is an index in the book, and I like having things in print. Sure, it's not too expensive to print the RFC, but if you shop around, the book isn't $50.

    2. Re:RFCs have all the info you need by Anonymous Coward · · Score: 4, Insightful

      No, RFCs don't have all the information you need. Specifications should contain a succinct description of the protocol - not advice, best practices, informative examples, and so on. That is what books like this are for.

    3. Re:RFCs have all the info you need by Xerithane · · Score: 3, Insightful

      ...not advice, best practices, informative examples, and so on. That is what books like this are for.

      HTTP 1.1 does tell you the best practice. It says, "You SHOULD do XYZ in case ABC." If you need help coding something, you shouldn't be implementing HTTP 1.1. HTTP is not that complex, it doesn't need informative examples. What examples can you possibly need? "When using this header, the values are X, Y, or Z." Well.. it tells you that.

      I wrote a complete HTTP 1.1 implementation according to the RFC without issue. They are remarkably easy to write, and validate HTTP headers. The problem comes in from non-compliant browsers (which are non-compliant to handle non-compliant servers)

      --
      Dacels Jewelers can't be trusted.
    4. Re:RFCs have all the info you need by kazrak · · Score: 2, Insightful
      I've read the RFCs. I have the O'Reilly book as well. There is a lot of information in the O'Reilly book that is not in the RFCs. (Information on robots.txt, for example. A lot more proxy information than the RFCs contain. Some basic information on WebDAV. These are just a few things I found flipping through my copy.)

      Sure, you can find all this stuff online. You buy a book so you have a well-organized place to find it all together, though. This book succeeds marvelously at this task.

    5. Re:RFCs have all the info you need by iabervon · · Score: 2, Insightful

      A compliant browser SHOULD handle non-compliant servers, and a compliant server SHOULD handle non-compliant browsers. An important property of a good specification is that old and broken programs may be handled gracefully without violating the standard.

    6. Re:RFCs have all the info you need by Anonymous Coward · · Score: 2, Interesting

      HTTP 1.1 does tell you the best practice. It says, "You SHOULD do XYZ in case ABC."

      That isn't best practice. That is saying "Do this, unless there are exceptional circumstances". That is part of the protocol. Best practice is where there is an appropriate algorithm that most implementations have settled upon. It's a subtle difference, but it's definitely there.

      If you need help coding something, you shouldn't be implementing HTTP 1.1.

      What complete and utter egotistical bollocks. I'm sure I could code up an implementation simply by following the RFC - but there's no way in hell a responsible developer would, given the choice. There's plenty of experience codified in this book - experience that my ego allows me to benefit from, unlike Real Programmers like you, who seem to be too macho to know what is good for them.

      I wrote a complete HTTP 1.1 implementation according to the RFC without issue. They are remarkably easy to write, and validate HTTP headers. The problem comes in...

      So you wrote an implementation "without issue", and then state "The problem comes in..."?

      See what's wrong with this picture? Books like this include information on what brain damages are floating around out there. The RFC doesn't. You may have written a conformant implementation, but what most people are after has to be useful too. Perhaps if your ego allowed it, you would have bought this book and actually been productive.

    7. Re:RFCs have all the info you need by statusbar · · Score: 3, Informative
      This is what matters in the RFC:


      1.2 Requirements

      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [34].


      RFC 2119 says:


      1. MUST This word, or the terms "REQUIRED" or "SHALL", mean that the definition is an absolute requirement of the specification.

      2. MUST NOT This phrase, or the phrase "SHALL NOT", mean that the
      definition is an absolute prohibition of the specification.

      3. SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.


      So in this case should is not synonymous with must.

      --jeff++
      --
      ipv6 is my vpn
  8. Wow. by sethadam1 · · Score: 4, Interesting

    It's nice to see a review like this. Many slashdot reviews are short and detail-less, but this one is a good overview, which I like.

    As much as I want to know about the underpinnings of HTTP, I find this one of those "books I'd like to HAVE read." If I buy it, which I may, I'm pretty sure it will be one of those books I just don't get around to reading because I personally don't have a huge need for it. I'd love to know the information, but I don't know I have the time to pull off actually reading it. Is it just me, or does everyone have a few of those books - the ones you wish you had actually read, but instead just look nice as part of your technical book collection?

    I guess there's at least one positive about the Matrix - I can make a quick phone call and have my operator just load "The Complete HTTP" for me.

  9. When is HTTP 2.0 coming out? by Anonymous Coward · · Score: 3, Interesting

    I figure XHTML 2 is going to require a big re-design of everything anyway, why not design an HTTP 2.0 to go with it?

    1. Re:When is HTTP 2.0 coming out? by leighklotz · · Score: 2, Informative

      An AC Writes:
      > I figure XHTML 2 is going to require a big re-design of everything anyway, ...
      XHTML 2 has been working in many browsers since August, 2002, even though it's still a draft. Part of the point of XHTML 2 is to cleanly re-seat HTML on top of the stack of stuff that browsers are supposed to implement already (CSS, XML, linking, etc.).

    2. Re:When is HTTP 2.0 coming out? by shiflett · · Score: 4, Insightful

      Never.

      To quote the W3C:

      Now that both HTTP extensions and HTTP/1.1 are stable specifications, W3C has closed the HTTP Activity. The Activity has achieved its goals of creating a successful standard that addresses the weaknesses of earlier HTTP versions.

  10. problems with definitive guides by stonebeat.org · · Score: 4, Insightful

    The problem with definitive guides is that they get outdated very quickly :)

    so i wouldn't spend any money on them. instead i would just browse the W3C website or other reference web sites.

    1. Re:problems with definitive guides by Anonymous Coward · · Score: 2, Interesting

      But HTTP 1.1 has been out a while, and there isn't anything really new on the horizon. This book will probably have a longer life than many.

    2. Re:problems with definitive guides by mmcshane · · Score: 2, Interesting
      But HTTP 1.1 has been out a while, and there isn't anything really new on the horizon. This book will probably have a longer life than many.
      Actually, that's not true. Roy Fielding (co-creator of HTTP 1.1, former Chairman of apache.org) is working on WAKA (PPT, sorry).
  11. zeldman by Meeble · · Score: 5, Informative

    > One (unfortunate?) thing about the Web is that its "architecture" (if you can even call it that) evolved and grew piece by piece. The design goals people had in mind back in 1993, or even in 1999, have been blown away by what has happened on the ground. Inter-company politics have also been a big factor - never helpful for promoting standardization, or sound design. >

    I couldn't agree with this more from a web development area as well, so many designers are still using hack and slash methods from the early 90's it's sad[although not always their fault!]. It correlates to the same principles used to build the architecture itself.

    side note: if you're interested in learning more about forward compatible web design you should check out Jeffrey Zeldman's new book 'Designing With Web Standards' you can find him at www.zeldman.com - I just finished this book and it was well worth the $24.50 - all you nested table designers should pick this one up or those looking to bridge the gap from using tabled design. =)

    --
    Fear Breeds Knowledge
    1. Re:zeldman by Brummund · · Score: 3, Insightful

      I don't know about you. but I'd rather die or work in the advertising business than buy a book about web design by someone who uses light grey on white background on their homepage. Come on, he should know better than "It's hardly readable, but it SURE looks nice."

    2. Re:zeldman by Meeble · · Score: 2, Insightful

      sure maybe at 7000 x 7000 resolution it doesn't take up everything on your screen - however his compatible design works in all browsers and WAP out there currently - including Safari.

      --
      Fear Breeds Knowledge
  12. I do know this... by Otter · · Score: 3, Funny

    ===================
    QUESTION: Did you know that the Keep-Alive header was valid in HTTP 1.0, but has been deprecated in HTTP 1.1?
    A) What does "deprecated" mean?<br>
    B) What is the "Keep-Alive header?"
    C) That's too bad - I kind of thought Keep-Alive was handy!
    D) Get with the program... HTTP 1.1 came out in 1999. The Internet boom is over already! Persistent connections are the default in HTTP 1.1 anyway.
    ============

    Well, I'm no HTTP expert but I do know this -- that <br> tag doesn't belong there.

  13. I'm in management now... by AKAJack · · Score: 4, Funny

    ...I have someone I can fire if they don't know the answer to this question.

    1. Re:I'm in management now... by sandbagger · · Score: 3, Funny

      You can't fire them if they're in marketing and they insist on saying things like "This HTTP sounds interesting. Can it be put in the web?"

      Oh yeah, the same applies to human resources.

      --
      ---- The above post was generated by the Turing Institute. Maybe.
  14. Re:Or... by Zeinfeld · · Score: 5, Informative
    Or, you could just check out the W3C and read up on it without the need of someone making edits to the explanations of the actual specs.

    Where do you think you can find HTTP on the W3C site?

    HTTP was standardized in IETF process, not W3C. HTML started in IETF process and then we yanked it out and did it in W3C. IETF process is not the place to work on something where there are religious wars, the SGML folk were big on religious wars.

    The RFCs on HTTP are useful if you are writing a server or client, however they are less useful as a guide to how what is out there works. One of the big problems with the IETF is that the RFCs look like shit, they are designed to be printed in a fixed width font because that's the way they did things in Babbage's day. So not surprisingly engineers tend to go for documentation that is easier on the eye, even if it turns out to be wrong.

    The other issue with the specs is that they describe what the WG came up with. That does not necessarily represent reality, the group took seven years to complete. If you want to know what will work you need more information than is in the RFC.

    I wrote parts of the HTTP spec and even I would want more information than just the spec. I am not sure about the 'advice' about working around older broken proxies, I tend to think it's not a bad thing if folk running obsolete software lose every so often. But it is useful to know that it can be an issue.

    --
    Looking for an Information Security student project suggestion?
    Try http://dotcrimeManifesto.com/
  15. Re:Jesus Christ! Get with the program, grandpa! by Gibble · · Score: 2, Informative

    *psst*

    HTML != HTTP

    --
    Gibble: Descriptive of an emotional state in which one's mind is scrabbling for some purchase on reality
  16. Re:Jesus Christ! Get with the program, grandpa! by Fizzl · · Score: 2, Funny

    So your answer would be:

    e) I thought the HTTP standard would be 4.01 already!

    Which means you should definitely first read "internet protocols for dummies".

    Ok, I'm a bit mean here, but I just couldn't resist.

    *smug smirk*

  17. Re:answer e) by jc42 · · Score: 2, Funny

    everyone with any real cred still uses HTTP 1.0.

    Huh? To get real cred, you do:

    : telnet foo.bar.com 80
    GET /some/file.xml

    And you hit Return twice, of course, but you knew that.

    HTTP 0.9 is the Real Thing.

    Hey, anyone remember HTTP 0.5?

    --
    Those who do study history are doomed to stand helplessly by while everyone else repeats it.
  18. Re:It even answers by glenstar · · Score: 5, Funny
    Here are some other interesting codes, pulled directly from the RFC:

    402 -- Payment Required
    406 -- Not Acceptable
    300 -- Multiple Choices

  19. Re:Or... by weston · · Score: 4, Funny

    Where do you think you can find HTTP on the W3C site?

    And yet, as has been pointed out, you can indeed find it on the w3 site.

    The RFCs on HTTP are useful if you are writing a server or client, however they are less useful as a guide to how what is out there works.

    But, as anyone who's tried CSS or just about anything else knows, this is absolutely true. Differences between vendor implementations are one reason why many geeks are bald, sickly, and pale.

  20. Re:Jesus Christ! Get with the program, grandpa! by jointm1k · · Score: 2, Funny

    Actually, they would be at XHTTP1.1 by now ;)

    --
    You know it makes sense, a little reminder from jointm1k.
  21. "OK, how well you know HTTP?" by Anonymous Coward · · Score: 4, Funny

    Me know HTTP real good!

  22. Re:Or... by Kredal · · Score: 2, Funny

    Aww, I miss blink. I used to have a version of my webpage wherein every other word blinked. It was actually quite pretty, in a geeky epileptic sort of way.

    --
    Whoever stated that signature sizes should be limited to one hundred and twenty characters can just go ahead and kiss my
  23. /me too by DrSkwid · · Score: 2, Insightful

    until divs will auto resize we'll be stuck with pages like this one (light orange on white for them menus ffs!) that only go 20% to the width of my browser window.

    & his menus don't resize to fit the text if you turn up the size

    still, never mind, im sure he makes $ from his book, but not from me

    --
    There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
  24. HTTP is amazingly badly engineered by Fefe · · Score: 3, Interesting

    Standards should be lean and so easy to understand and so trivial to implement that one undergrad student can implement it to full compliance in one afternoon.

    HTTP 1.1 has over 100 pages, most of them absolutely useless for implementors. Unnecessary verbiage, unnecessary optional parts, unnecessary warts, unnecessary "I'm working on a thesis about foo, let's put it in this standard and see what happens" crap.

    Examples: chunked encoding -- absolutely superfluous! Amazingly useless. Or what about the range support? HTTP allows to request a byte range from a file. Normally you would use that to fetch the second half of an aborted download, or maybe for PDF reading you would fetch bytes 10 to 100 or so. HTTP 1.1 allows to specify several ranges in the same request, and the server is expected to construct some MIME abomination as answer, if it supports this at all. If it doesn't, it is allowed to coalesce the ranges and just send the whole range. This makes this feature horrendously useless for clients (why bother with it if you a) have to implement some sort of complicated parser to understand the result and b) won't even save bandwidth because the server isn't going to implement it in the first place and c) it is not even cheaper than just using keepalive connections and asking for the parts one by one).

    In short: HTTP needs to die quickly and be replaced by something sane.

    Did I mention the monstrosity that is content negotiation? It is impossible to write a proxy that can cache content in the face of content negotiation. Luckily, nobody uses it on their servers, because it is a pig to implement and configure on the server. Clients tend to support it, but who cares.

    1. Re:HTTP is amazingly badly engineered by cdipierr · · Score: 5, Informative

      Um...chunked encoding is not useless.

      If you've got dynamic output, and don't want to buffer the entire content so you can generate a Content-Length header, then chunked encoding is for you. There's no reason for a server to be buffering up a potentially huge reply if the client can accept it piece-meal instead.
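
As a rough illustration of the point above, the chunked transfer coding frames each piece of output with its length in hexadecimal, so a server can start sending before it knows the total size. The sketch below is an assumption-laden minimal version (the helper name encode_chunked is invented here, and chunk extensions and trailers are left out).

```python
def encode_chunked(pieces):
    """Yield the wire bytes of a chunked body for an iterable of byte strings."""
    for piece in pieces:
        if not piece:
            # A zero-length chunk would signal end-of-body, so skip empties.
            continue
        # Each chunk: hex length, CRLF, the data itself, CRLF.
        yield f"{len(piece):x}\r\n".encode("ascii") + piece + b"\r\n"
    # Final chunk of size zero, followed by an empty trailer section.
    yield b"0\r\n\r\n"


# Pieces can be produced lazily; nothing here needs the total length up front.
body = b"".join(encode_chunked([b"Hello, ", b"world!"]))
print(body)
```

Because the generator consumes its input one piece at a time, the reply never has to be held in memory as a whole, which is the trade-off this comment is arguing for.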

    2. Re:HTTP is amazingly badly engineered by mmcshane · · Score: 5, Informative

      Troll city. I'll bite.

      Chunked encoding is useful to me everyday. I use a protocol one level up from HTTP1.1 (AS2) where messages and their digests are transferred in the same request - in chunks.

      As for supporting ranges, this is why agents are encouraged to delegate difficult MIME handling to helper apps like a Flash plugin. Plenty of servers implement this, it's actually not even that hard. There is a separate issue related to what a range response actually represents (in the theoretical sense), but I won't touch that for now. Read www-tag @W3C for more info.

      Content negotiation works nicely. We serve French pages to agents that prefer French. We also serve unstyled xml to agents which we're sure are not browsers. It's not hard to do, we look at a header and then decide which representation to serve. Caches use the Vary header to choose which responses to serve from cache. It's not rocket science.
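
The "look at a header and then decide which representation to serve" step might look something like the sketch below for Accept-Language. It is a simplified reading under stated assumptions: the function name pick_language is made up, only the q parameter is understood, and a language range matches a tag either exactly or as a prefix followed by a hyphen.

```python
def pick_language(accept_language, available, default):
    """Choose one of `available` based on an Accept-Language header value."""
    prefs = []
    for item in accept_language.split(","):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                continue  # skip entries with a malformed q value
        prefs.append((quality, tag))

    # Highest quality first; sorted() is stable, so header order breaks ties.
    for quality, tag in sorted(prefs, key=lambda pref: -pref[0]):
        if quality <= 0:
            break  # q=0 means "not acceptable"
        for lang in available:
            candidate = lang.lower()
            if tag == "*" or candidate == tag or candidate.startswith(tag + "-"):
                return lang
    return default


print(pick_language("fr-CH, fr;q=0.9, en;q=0.8", ["en", "fr"], "en"))
```

A response selected this way would normally also carry Vary: Accept-Language, which is the signal caches use to keep the French and English variants apart.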

      My favorite part: "HTTP needs to die quickly and be replaced by something sane"

      Yeah, it'll never catch on.

    3. Re:HTTP is amazingly badly engineered by Fefe · · Score: 2, Insightful

      First of all, it's perfectly OK to serve the dynamic content without a content-length header.

      Second of all, the whole point of the content-length header is so that the client knows how much data will come and is thus able to allocate memory, see whether it will be able to process the whole content and display a progress bar. All of these are not possible with chunked encoding, so you get none of the benefits from content-length. Why not drop it in the first place?

      Not having a content-length header has only one drawback: it breaks keep-alive connections. But since sane sites are compressing their dynamic content anyway to save bandwidth cost and make it appear quicker on the client machine, and dynamic HTML pages of 100k typically compress down to below 10k because HTML is so bloated, there really is no point in not buffering those 10k. The system has larger buffers than that for TCP anyway, so memory consumption is not a valid excuse. Also, if you do the buffering, you can add the content-length header and get all the benefits.

      Oh, and one last point: we have had security problems caused by chunked encoding. We also have had a trillion security problems by idiots and static buffering, but so far nobody has been stupid enough to do compression and HTTP output buffering using a static buffer.
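
The disagreement in this sub-thread comes down to how a client finds the end of a response body. A minimal sketch of that decision, assuming headers are already parsed into a dict (the function name body_framing and the returned labels are invented for illustration; responses that never carry a body, such as replies to HEAD, are ignored here):

```python
def body_framing(headers):
    """Report how the end of a response body is signalled, given its headers."""
    normalized = {name.lower(): value.strip() for name, value in headers.items()}

    # Chunked transfer coding takes precedence over any Content-Length.
    codings = [c.strip().lower() for c in normalized.get("transfer-encoding", "").split(",")]
    if "chunked" in codings:
        return "chunked"

    # An explicit length lets the connection be reused after the body.
    if "content-length" in normalized:
        return "content-length"

    # Otherwise the body runs until the server closes the connection,
    # which is why omitting Content-Length breaks keep-alive.
    return "close"


print(body_framing({"Transfer-Encoding": "chunked"}))
print(body_framing({"Content-Length": "1024"}))
print(body_framing({}))
```

Only the first two outcomes leave the connection usable for a following request.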

    4. Re:HTTP is amazingly badly engineered by Fefe · · Score: 2, Interesting

      Keep-alive is important because it substantially reduces the amount of traffic used in servicing multiple small requests

      No. The traffic difference between using keep-alive and not is two TCP packets, 60 bytes each (unless you use a modem line with header compression, in which case it is even less).

      Keep-alive reduces the latency, though. The difference is big in benchmarks but small in practice. Without keep-alive I can still make over 2000 connections a second on my old notebook.

      Not all clients support dynamically compressed content.

      Oh, really? Which client are you talking about? wget? Mozilla, old Netscape, Opera and Internet Exploder support compressed content. That is about 99% of the market. Whom are you kidding here?

      Many servers do not provide dynamically compressed content because it is heavy on CPU load.

      No shit, Sherlock! So? They are generating dynamic content, which is inherently heavy on the CPU. Are you telling us that they didn't buy a large machine with powerful CPUs in the first place? Sorry but I call bullshit.

      The issue isn't memory allocation, it's speed of response. Dynamic pages can take a long time to generate.

      If the user has to wait a long time for your content, the latency win of having content-length is negligible in the first place. You just shot down your only argument.

  25. deprecated by ap0stle · · Score: 3, Informative
    From w3.org :

    Deprecated

    A deprecated element or attribute is one that has been outdated by newer
    constructs. Deprecated elements are defined in the reference manual in
    appropriate locations, but are clearly marked as deprecated. Deprecated
    elements may become obsolete in future versions of HTML.

    User agents should continue to support deprecated
    elements for reasons of backward compatibility.


    Definitions of elements and attributes clearly indicate which are
    deprecated.


    This specification includes examples that illustrate how to avoid using
    deprecated elements. In most cases these depend on user agent support for style
    sheets. In general, authors should use style sheets to achieve stylistic and
    formatting effects rather than HTML presentational attributes. HTML
    presentational attributes have been deprecated when style sheet alternatives
    exist.


  26. Re:Or... by Yunzil · · Score: 3, Funny

    One of the big problems with the IETF is that the RFCs look like shit, they are designed to be printed in a fixed width font because that's the way they did things in Babbage's day. So not surprisingly engineers tend to go for documentation that is easier on the eye, even if it turns out to be wrong.

    I don't know about that. I'm an engineer, and I'd rather have something printed in fixed-width font, on green-and-white fanfold paper. Less BS, more facts. :)

  27. Two minutes to midnight. by HarveyBirdman · · Score: 4, Funny
    A) What does "deprecated" mean?

    "Soon to be a Microsoft standard."

    --
    --- Ban humanity.
  28. Re:It even answers by _Bunny · · Score: 3, Informative

    Error 300 isn't as unusual as you might think.

    Apache's mod_speling module will correct small typos in URLs that are requested, and if it finds more than one possible match it returns an error 300 with the possible choices.

    For example:

    http://www.madriver.k12.oh.us/network/netware/wefs 1
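
    The behaviour described above can be imitated in a few lines. This is only a sketch and not mod_speling's actual algorithm: the resolve function and its similarity cutoff are hypothetical, the matching is done with Python's difflib, and answering a single match with a 301 redirect is an assumption of the sketch rather than something stated in the post.

```python
import difflib


def resolve(path, known_paths):
    """Return (status, candidates) for a requested path.

    200: exact hit; 301: exactly one close match (redirect target);
    300: several close matches (Multiple Choices); 404: nothing close.
    """
    if path in known_paths:
        return 200, [path]
    # The 0.8 cutoff is an arbitrary choice for this illustration.
    candidates = difflib.get_close_matches(path, known_paths, n=5, cutoff=0.8)
    if not candidates:
        return 404, []
    if len(candidates) == 1:
        return 301, candidates
    return 300, candidates
```

    A request for "/docs/indx.html" on a site that has both "/docs/index.html" and "/docs/index.htm" would come back as a 300 listing both.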

    - Bunny

  29. Other useful error codes by Mr_Silver · · Score: 2, Funny
    I find the error codes generated by here rather enlightening.

    (reload a couple of times)

    Yes, I did have something to do with it. Sorry.

    --
    Avantslash - View Slashdot cleanly on your mobile phone.
  30. Re:Jesus Christ! Get with the program, grandpa! by bigman2003 · · Score: 5, Funny

    Geez, I've been running Internet 6.0 for a long time. I don't know anyone still running 1.1. Some of the Netscape people are still running version 4, but I heard they can move up to seven.

    I hope that Microsoft comes out with version 8 of the Internet- but by then AOL will have Internet version 9. This is so hard to keep track!

    Who cares about Internet 1.1 though. Maybe you should get a new computer.

    --
    No reason to lie.
  31. Re:Or... by mmol_6453 · · Score: 3, Funny

    The blink tag works great for papers on quantum physics.

    (Credit to UserFriendly goes here)

    --
    What's this Submit thingy do?
  32. The only book you need... by spazoid12 · · Score: 2, Informative

    For the full-featured HTTP server that I designed and implemented at my last job...I found just one book to be all the help a person needs:

    "HTTP Pocket Reference", O'Reilly, maybe 4 bucks at Bookpool.

    75 pages, of which about 65 aren't necessary.

    656 pages on HTTP??? "It's not a detailed technical reference on all the topics mentioned in the table of contents (above); it would be tough to fit all that material into the book's 650-plus pages." ... good grief!!

  33. How well you know HTTP? by sharkey · · Score: 2, Funny

    I'm an IIS coder, you insensitive clod!

    --
    "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
  34. Lean vs Trivial by SnakeStu · · Score: 3, Insightful

    Standards should be lean and so easy to understand and so trivial to implement that one undergrad student can implement it to full compliance in one afternoon.

    I suppose that appeals to undergrads, and those who like extremely granular standards that only address small parts of a solution. Beyond that, it's an absurd overstatement. Standards should be lean in the sense that they should be focused, but to be trivial enough for full implementation by an undergrad in one afternoon ducks below the bar of general usefulness. It's somewhat analogous to what I've heard more than one teacher respond when asked by a student "how long" a paper should be: It should be like a skirt -- long enough to cover the important parts, short enough to keep it interesting. You're right that it should be lean (short enough to keep it interesting) but your criterion for that might not cover the important parts.

  35. what deprecated really means by marhar · · Score: 4, Funny

    A) What does "deprecated" mean?

    "No matter how much we pretend otherwise, this will stay around forever."

  36. Thou shalt not SHOULD? by fm6 · · Score: 2, Insightful
    A standards document should never use the word SHOULD.
    Don't you mean, "A standards document must never use the word SHOULD"? ;)

    Strictly speaking, RFCs are not standards -- only government-sanctioned bodies can issue standards. Of course, that's a distinction only of interest to compulsive nit-pickers (aka Tech Writers).

    In practical terms, I think a good RFC plays the role both of a standards document (MUST) and a best practices document (SHOULD). Given the ad hoc nature of the Internet, it makes a lot of sense to combine the two. It's the sort of informal process and documentation that has allowed the net to grow so quickly.

    And (to bring us back to the real topic) that's a good reason to not waste money on a book if there's a good RFC at hand.

  37. Re:Or... by shiflett · · Score: 3, Informative

    Your entire post could not be more untrue.

    HTTP was created long before it was handed off to be maintained by the IETF. It existed prior to the RFC that you claim to have co-written. The only reason that exchange was made is because HTTP is viewed as a piece of the Internet's infrastructure; in fact it is essentially where the Internet and the Web intersect.

    Also, HTTP is very useful as "a guide to how what is out there works." Check out a mailing list for mod_perl, PHP, etc. You will find countless questions being asked that would be answered by a simple understanding of HTTP - how the Web works. This is what real Web developers need; then maybe I can check my bank account balance or sell some stocks without having to interact with a poorly-constructed Web site.

    As the author of the HTTP Developer's Handbook, you might think that I would point out weaknesses in O'Reilly's effort. On the contrary, I think this work is very good, and I would highly recommend it to anyone involved in Web development. I think my book is more suited for the everyday reference that you carry with you that explains things specifically from a Web developer's perspective rather than focusing on clarifying the standards, and I think the two go well together.

    At any rate, I think this is a quality book on a very important topic.

  38. Learning HTTP by slagdogg · · Score: 2, Interesting

    The spec and books are both good sources of information on HTTP, but I find it difficult to actually apply the knowledge.

    I recently interviewed for a position requiring intimate HTTP knowledge. Rather than try to understand every bit of the spec, I just captured all of my clear-text HTTP traffic for a night of surfing. I then looked at the actual HTTP exchanges between my web browser and various servers, and looked up anything I didn't understand in the spec and other sources.

    I also learned some things that weren't in the spec, which were helpful in the interview, like how session keys are structured on various servers, etc.
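
    For reading captured traffic like this, a small parser that splits a raw message into its start line, headers and body can help. The sketch below assumes the capture is already available as bytes; the function name parse_http_message and the sample message (including the SESSIONID cookie) are made up for illustration, and folded header lines and chunked bodies are not handled.

```python
def parse_http_message(raw):
    """Split a raw HTTP message into (start_line, headers, body).

    headers maps lowercase header names to a list of values, since a
    header such as Set-Cookie may legitimately appear more than once.
    """
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return start_line, headers, body


# Hypothetical captured response, not taken from any real server.
captured = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Set-Cookie: SESSIONID=abc123; path=/\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
start_line, headers, body = parse_http_message(captured)
```

    Looking up headers["set-cookie"] in the parsed result is then a quick way to compare how different servers structure their session cookies.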

    --
    (Score:-1, Wrong)
  39. Most overlooked HTTP feature by KjetilK · · Score: 2, Insightful
    OK, so what are people's favorite overlooked HTTP features?

    Mine is definitely content negotiation, specifically language negotiation, since I develop multilingual websites (yeah, English is not my first language).

    I find that extremely useful, yet, nobody cares about it... It is really annoying when you get to a website and you have to choose the language, "Hey, I told you that in my accept-language header, just listen!"
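
    Listening to that header takes very little code on the server side. The following is a rough sketch, not a complete implementation of the matching rules: the function names parse_accept_language and choose_language are hypothetical, only the q parameter is understood, and a language range matches an available tag if it is equal to it or is a prefix of it (so "en" matches "en-gb").

```python
def parse_accept_language(header):
    """Return [(language_range, q), ...] sorted by descending q value."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0  # a range without an explicit q value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat an unparsable q value as "not acceptable"
        prefs.append((tag.strip().lower(), q))
    # The sort is stable, so equal q values keep their header order.
    prefs.sort(key=lambda p: p[1], reverse=True)
    return prefs


def choose_language(header, available, default):
    """Pick the best available language for an Accept-Language header."""
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        if tag == "*":
            return default
        for lang in available:
            low = lang.lower()
            if low == tag or low.startswith(tag + "-"):
                return lang
    return default
```

    For example, a browser sending "nb, no;q=0.8, en;q=0.5" to a site available in "en" and "no" would be served the "no" version without ever seeing a language-selection page.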

    Things are moving sooooo slowly...

    --
    Employee of Inrupt, Project Release Manager and Community Manager for Solid
  40. Useful book by Anonymous Coward · · Score: 2, Informative

    I used this book in addition to the RFC when writing my webserver software.

    It's a good addition to the RFCs but not a substitute. The introductory stuff is a bit too basic but the rest of the chapters clarify several things about the RFCs. 2616 can be a bit ambiguous at times.

    All in all, it was worth the money if you are planning to do any serious work with HTTP.

  41. Re:Or... by kill-1 · · Score: 2, Interesting
  42. Re:Don't give away the ending! by Graspee_Leemoor · · Score: 2

    " I hate it when people post spoilers before I've read the book!"

    Suck on this!

    The animal on the cover of HTTP: The Definitive Guide is a thirteen-lined ground squirrel (Spermophilus tridecemlineatus), common to central North America. True to its name, the thirteen-lined ground squirrel has thirteen stripes with rows of light spots that run the length of its back. Its color pattern blends into its surroundings, protecting it from predators. Thirteen-lined ground squirrels are members of the squirrel family, which includes chipmunks, ground squirrels, tree squirrels, prairie dogs, and woodchucks. They are similar in size to the eastern chipmunk but smaller than the common gray squirrel, averaging about 11 inches in length (including a 5-6 inch tail).

    Thirteen-lined ground squirrels go into hibernation in October and emerge in late March or early April. Each female usually produces one litter of 7-10 young each May. The young leave the burrows at four to five weeks of age and are fully grown at six weeks. Ground squirrels prefer open areas with short grass and well-drained sandy or loamy soils for burrows, and they avoid wooded areas; mowed lawns, golf courses, and parks are common habitats.

    Ground squirrels can cause problems when they create burrows, dig up newly planted seeds, and damage vegetable gardens. However, they are important prey to several predators, including badgers, coyotes, hawks, weasels, and various snakes, and they benefit humans directly by feeding on many harmful weeds, weed seeds, and insects.

    graspee

  43. Re:Or... by Zeinfeld · · Score: 2, Insightful
    HTTP was created long before it was handed off to be maintained by the IETF. It existed prior to the RFC that you claim to have co-written. The only reason that exchange was made is because HTTP is viewed as a piece of the Internet's infrastructure; in fact it is essentially where the Internet and the Web intersect.

    Well yes, before there was HTTP 1.1 there was HTTP 1.0. There was also an HTTP 0.9 that was around before that...

    HTTP was NOT handed off to the IETF by the W3C as your post appears to imply, there was no W3C at that time. HTTP was taken to the IETF to get recognition as a protocol standard. There was no 'handing off', the same people continued to work on the protocol as before. The only significant change was that the mailing list changed, www-talk had become very noisy by this time. The IETF has change control in a nominal sense, they can write new versions of the spec and call them HTTP, but so can anyone else, they just might have more difficulty getting others to recognise them...

    That is the reason there are two sets of acknowledgements in the spec. The first set is the original authors, the second the set of people who worked on the draft after the IETF process started.

    I don't seem to remember your name from any of the Web working groups I have been associated with. It is unlikely that if you know as much as you claim to about the Web that you don't know mine. I don't think that publishing a book about my work gives you the right to accuse me or for that matter anyone else of being a liar.

    Perhaps if you actually read what I wrote rather than what you think I wrote you might not have made such a fool of yourself.

    --
    Looking for an Information Security student project suggestion?
    Try http://dotcrimeManifesto.com/