Why IE Is So Fast ... Sometimes
safrit writes "Finally the scoop on how IE "cheats" a little to up its performance! Do RFCs mean nothing anymore? What's next, Riots in the streets, dogs and cats living together, mass hysteria!
From the blog story: 'Internet Explorer on Windows always seems either to run impossibly fast (page requests are fulfilled almost before the mouse button has returned to its original unclicked position), or ridiculously slow...' Now read to see why..."
Straight from the site.......
These pretzels are making me thirsty.
18:07 - What makes IE so fast?
Internet Explorer on Windows always seems either to run impossibly fast (page requests are fulfilled almost before the mouse button has returned to its original unclicked position), or ridiculously slow (as with the weird stalling-on-connect problem that many people, including myself, have noticed).
One possible explanation is something that my team and I noticed a couple of years ago, in analyzing packet traces of IE's connection setup procedure. Microsoft might have fixed this since then; I'm not sure. But it's a possible culprit.
First of all, for those rusty on their TCP/IP-- here's how a normal HTTP request over TCP should work:
Client                  Server
1. SYN ->
2.                      <- SYN/ACK
3. ACK ->
4. Request ->
This is how the client and server synchronize their sequence numbers, which is how a connection gets established. The client sends a synchronization request, the server acknowledges it and sends a synchronization request of its own, and the client acknowledges that. Only then can the HTTP request proceed reliably.
The server's SYN (synchronize) and ACK (acknowledgement) packets are combined for speed; there's no reason to send two separate packets, when you're trying to get a connection established as quickly as possible. Another speed enhancement that Mac OS 9's stack uses, by the way, is to combine the client's ACK and the HTTP request into a single packet; this is legal, but not frequently done. The idea is that within the structure of TCP/IP, you want to minimize the number of transactions that need to take place in setting up the two-way handshake necessary before you can send the HTTP request.
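To make the sequence-number bookkeeping concrete, here is a toy model of the three handshake segments in Python. It uses no real sockets; the function name and the dict layout are made up for illustration, and the initial sequence numbers are arbitrary example values.

```python
import random

SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three handshake segments as dicts (toy model, no network I/O)."""
    if client_isn is None:
        client_isn = random.getrandbits(32)
    if server_isn is None:
        server_isn = random.getrandbits(32)
    # 1. Client proposes its initial sequence number.
    syn = {"from": "client", "flags": "SYN", "seq": client_isn, "ack": None}
    # 2. Server acknowledges client_isn + 1 and proposes its own number in the same packet.
    syn_ack = {"from": "server", "flags": "SYN/ACK", "seq": server_isn,
               "ack": (client_isn + 1) % SEQ_MOD}
    # 3. Client acknowledges server_isn + 1; both sides are now synchronized.
    ack = {"from": "client", "flags": "ACK", "seq": (client_isn + 1) % SEQ_MOD,
           "ack": (server_isn + 1) % SEQ_MOD}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

Each side acknowledges the other's initial sequence number plus one, which is why a data packet that arrives without this exchange has nothing valid to refer to.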
When tearing down a connection, it looks like this:
Client                  Server
1.                      <- FIN
2. ACK ->
3. FIN ->
4.                      <- ACK
This generally takes four steps, and the FIN/ACK packets are usually not consolidated because connection teardown is nowhere near as speed-sensitive as startup is. (The FIN sequence can be initiated either by the client or the server.)
Many very stupid companies have tried to come up with overly clever ways to speed up TCP/IP. TCP, by its nature, is a stateful and bidirectional protocol that requires all data packets to be acknowledged; this makes the data flow reliable, by providing a mechanism for dropped packets to be retransmitted; but this also makes for a more strictly regimented flow structure involving more packets transmitted over the wire than in simpler, non-reliable protocols like UDP-- and therefore it's slower. One company that thought itself a lot smarter than it really was, called RunTCP, came up with the idea of "pre-acking" TCP packets; it would send out the acknowledgments for a whole pile of data packets in advance, thus freeing them from the onerous necessity of double-checking that each packet actually got there properly. And it worked great, speeding up TCP flows by a significant margin-- in the lab, under ideal test conditions. The minute you put RunTCP's products out onto the real Internet, everything stopped working. Which stands to reason-- their "solution" was to tear out all the infrastructure that made TCP work reliably, under competing load and in adverse conditions, in the first place. Dumbasses.
So then there's this thing we discovered in the lab. We noticed that when you entered a URL in Internet Explorer 5, its sequence of startup packets didn't look like the one shown above. Instead, it looked like this:
Client                  Server
1. Request ->
                        Uh... what? Dunno what the hell this is. I'll ignore it, or RST.
2. Oh, you're a standard server. Okay:
   SYN ->
3.                      <- SYN/ACK
4. ACK ->
5. Request ->
In other words, instead of sending a SYN packet like every other TCP/IP application in the world, IE would send out the request packet first of all. Just to check. Just in case the HTTP server was, oh, say, a Microsoft IIS server. Because IIS' HTTP teardown sequence looked like this:
Client                  Server
1.                      <- FIN
2. ACK ->
...And that's it. The client doesn't FIN, and the server doesn't ACK. In other words, the connection is kept "half-open" on the server end. The reason for this? Why, to make subsequent connections from IE clients faster. If the connection isn't torn down all the way, all IE has to do is send an HTTP request, with no preamble-- and the server will immediately respond. Ingenious!
They probably called it "Microsoft Active Web AccelerationX(TM)®" or something.
(I may be remembering this incorrectly; it might be that the client does FIN, and the server simply keeps the connection around after it ACKs it. Instead of shutting down the connection entirely, it just waits to see if that client will come back, so it can open the connection back up immediately instead of having to go through that whole onerous SYN-SYN/ACK procedure. Damn rules!)
Now, what does this mean for non-IIS servers? It means that if you use IE to connect to them, it first tries to send that initial request packet, without any SYNs-- and then it only proceeds with the standard TCP connection setup procedure if the request packet gets a RST or no response (either of which is a valid way for a legal stack to deal with an unsynchronized packet). But IIS, playing by its own rules, would respond to that packet with an HTTP response right away, without bothering to complete the handshake. So IE to IIS servers will be nice and snappy, especially on subsequent connections after the first one. But IE to non-IIS servers waste a packet at the beginning of each request-- and depending on how the server handles that illegal request, it might immediately RST it, or it might just time out... which would make the browser seem infuriatingly slow to connect to new websites.
This is only marginally less stupid than RunTCP's "solution"-- and I say "marginally" only because in the grand scheme of things, this probably makes sense to Microsoft's network engineers. After all, eventually all clients will be Windows platforms running IE, and all servers will be Windows platforms running IIS. And then we can break all kinds of rules! Rules are only there to hold us back and force us to play nice with other vendors. Well, once the other vendors are all gone, who cares about some stupid RFC?
I have to admire their arrogance and their confidence. But it'll be some time before I can bring myself to admire their technical integrity.
For Opera to get its "Fastest browser on earth" title, it caches EVERYTHING. Even things that aren't supposed to be cached like SSL pages.
No, it doesn't. In fact, it doesn't even cache any page that's protected by a password, nor does it add them to the list of recently visited addresses (which is nice both for security and privacy reasons).
They are only kept in the RAM cache (i.e., when you press "back" or "forward", it will usually show you a page's last state, down to the position of the scroll bars, without reloading it). This is quite useful, BTW; it means you can go back and forth between pages without losing what you were writing in a form (unlike MSIE, where forms are reset).
RMN
~~~
HTTP 1.1 allows for this - it's called a persistent connection.... and is exactly what Mozilla, Opera, IE, and every other browser is SUPPOSED to do...
The speed probably comes as a side effect of broadband and a well connected server... as a colleague of mine pointed out, Mozilla is just slow because they wait 1 second before they display the page, in case the layout changes...
Maybe the slashdot editors need to do a little bit of reading about a subject before they post about it.
Indeed, it makes spoofing much easier: no need to bother with sequence number guessing, just send your data packet right away, and pretend the connection is already open. This, combined with the fact that many IIS servers are often full of SQL-injectionable scripts should provide for great phun! Who needs open proxies when you can spoof so easily?
IE's other trick, or so it is assumed (since the source isn't available) is that it does full DOM and JS caching.
That is to say, if you visit a webpage with (say) Mozilla, the HTML is interpreted and the HTML tree is built in memory. Pages with advanced CSS have a more complicated tree, of course. However, when the user leaves the page, that tree is destroyed and has to be recreated each time the user visits the page.
The bug to correct this in Mozilla is bug 38486 - "[FEATURE] Keep DOM and JS context in memory to provide fast access when clicking back". You can also vote for it (free Bugzilla account required) though you'll have to copy-n-paste the URL into your browser window since Bugzilla doesn't accept referrers from Slashdot.
PS Threaded e-mail is handy, eh? It sure is, unless your mail reader doesn't remember that you want to see your mailboxes in threaded view and keeps reverting back to collapsed form. That one is bug 64426 (vote for it if you like).
Alex Bischoff
HTML/CSS coder for hire
It is forcing persistent connections rather than requesting them the HTTP/1.1 way! This means that these servers are stuck with tons of open sockets, causing them to refuse new ones!
I demand an Apache workaround by tomorrow! (j/k, apache is unaffected and takes longer to load in MSIE because of its COMPLIANCE TO A STANDARD that M$ is trying to bend to their own will).
Will benchmarking authors be blackmailed/bribed into making their software use this while testing MSIE and MSIIS?
btw, faster connections don't mean squat if your server takes itself down monthly!
You can't judge a book by the way it wears its hair.
4069902 TCP in 2.5.1 should have similar slow start mechanism as in 2.6 13 Aug 1997
/dev/tcp tcp_slow_start_initial 2
1) TCP BASICS - SLOW START AND DELAYED ACK
The TCP specification requires something known as "slow start". The
algorithm applies to the sender side and is described in RFC2001.
The intent of the slow start algorithm is to avoid a "congestion
collapse" in a network by ensuring that each TCP sender doesn't
overwhelm the network. The algorithm mandates that the first
transmission be a single packet. If the recipient acknowledges
the first packet successfully (i.e. the communication doesn't time
out and the recipient believes that the packet has arrived without
error), the sender sends two more packets. Successful transmission
results in the sender sending yet more packets in parallel, until
the capability of the underlying network is reached and one or more
packets are not acknowledged successfully. Essentially the sender
uses ACKs as a "clock" to regulate and gradually increase the
rate packets are injected into the network until it reaches an
equilibrium.
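The growth pattern described above can be sketched in a few lines of Python. This is an idealized model (no loss, no receiver window limit, every segment ACKed before the next round); the function name is made up for illustration.

```python
def slow_start_rounds(total_segments, initial_window=1):
    """Return how many segments go out in each round trip under ideal slow start."""
    rounds = []
    cwnd = initial_window       # congestion window, counted in segments
    remaining = total_segments
    while remaining > 0:
        sent = min(cwnd, remaining)
        rounds.append(sent)
        remaining -= sent
        cwnd += sent            # one extra segment per ACKed segment: the window doubles
    return rounds


print(slow_start_rounds(6, initial_window=1))
print(slow_start_rounds(6, initial_window=2))
```

A six-segment transfer takes three round trips when the sender starts at one segment and two round trips when it starts at two.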
The TCP specification describes another technique known as
"delayed ACK", which concerns the receive side. The technique
calls for an acknowledgement of a data packet to be delayed for a
short period of time - the delayed-ACK interval. Different TCP
implementations use different delay intervals. The TCP specification
(RFC1122) mandates that the delayed-ACK interval must be less than
0.5 second. Delayed ACK serves to give the application an opportunity
to send an immediate response, in which case the ACK can be
piggyback'ed with the packet carrying the response. This technique
is very useful, both in saving the network bandwidth and in reducing
the protocol processing overhead, and is widely adopted by TCP
implementations. The TCP standard also recommends that an ACK not
be delayed for more than two data packets. This is to keep the slow
start algorithm on the sender side flowing, which counts on the ACK
packets coming back from the receive side in order to strobe more
data packets into the network.
2) TCP SENDER/RECEIVER DEADLOCK - THE IDLE TIME
A simplistic implementation of delayed ACK can cause unnecessary
idle time during the initial data transfer phase in a client-server
network environment. The scenario is as follows. When a sender
request can't fit in one TCP packet, TCP will break it up into
multiple packets. During the initial slow start phase, the sender
is allowed to send only one packet. Therefore only a partial sender
request is sent. The receiver application, upon receiving the
data in the packet, is not able to respond because the data is
incomplete. In the mean time, the receiver TCP is holding back the
ACK, waiting for the second data packet to show up. But the sender
TCP is waiting for an ACK to come back before sending more data - a
temporary deadlock. Eventually, the receiver TCP will give up the
waiting after a delayed-ACK interval, and send back an ACK.
This interplay of a simplistic delayed-ACK implementation with
slow-start algorithm accounts for the idle time problem seen in a
number of WEB benchmarks. These benchmarks employ HTTP response
messages of at least 8KB and usually more. On a typical network,
this size of data requires more than one TCP packet to carry.
During the idle time, the client TCP holds back the acknowledgement
of the first packet while the client HTTP is waiting for the rest
of the response data from the server before it can issue the next
HTTP request. But the server is waiting for the client TCP to ACK
before it can send the rest of response data.
3) SOLARIS CLIENTS - NO DELAY ON INITIAL ACK
Only configurations with clients that use a simplistic delayed ACK
implementation, e.g. Windows/NT, will exhibit the idle time problem
when talking to a Solaris server. Configurations using Solaris
clients are not affected by this problem because Solaris uses a more
sophisticated delayed-ACK algorithm. It recognizes the initial data
transfer phase, and will not delay the acknowledgement of the first
data packet.
4) SLOW START BUG - NO MORE IDLE TIME
Configurations using a server running Windows/NT, or an OS with a
BSD derived TCP stack don't exhibit this idle time problem. This
is, rather ironically, due to a widespread bug in the slow start
implementation in both Windows/NT and BSD derived TCP stacks.
The bug in the server erroneously takes the last ACK in the TCP 3-way
connection handshake as an indication of a data packet successfully
going through the wire. Therefore, when the server is ready to send
back the first response, it is allowed to send TWO, instead of one
TCP packet. The client, upon receiving two packets, will ACK
immediately as suggested by the TCP specification.
5) BREAKING DEADLOCK - THE WORKAROUND
A new TCP tunable "tcp_slow_start_initial" has been added to the
Solaris 2.6 release. The default value is one (1), which gives the
same behavior as Solaris 2.x releases prior to 2.6, and is fully
compliant with the current TCP slow-start standard (RFC2001).
The amount of the extra delay described above depends on the
delayed-ACK interval of the client's TCP stack, and is usually on
the order of 200 milli-seconds. For a normal TCP connection, this
delay is hardly noticeable. Nevertheless, it may not be true in an
environment that employs many short-lived connections, or connections
transmitting only a small amount of data. A good example is a WEB
server. In those environments, one should consider changing
"tcp_slow_start_initial" from the default value of one (1) to two (2).
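As a rough illustration of why the tunable matters, the following Python sketch models the first-flight stall. It is a toy model, not Solaris code: the 1460-byte MSS is an assumed Ethernet-style value, the 200 ms delayed-ACK interval is the typical figure quoted above, and both function names are hypothetical.

```python
import math


def response_segments(response_bytes, mss=1460):
    """Number of TCP segments needed for a response (mss is an assumed value)."""
    return math.ceil(response_bytes / mss)


def first_flight_stall_ms(segments, initial_window, delayed_ack_ms=200):
    """Idle time caused by a simplistic delayed-ACK receiver during the first flight."""
    if segments <= initial_window:
        return 0   # whole response arrives at once; the client can act on it
    if initial_window >= 2:
        return 0   # receiver ACKs immediately once two data packets arrive
    return delayed_ack_ms  # lone first segment: receiver sits on its ACK


segments = response_segments(8 * 1024)  # the 8KB benchmark response mentioned earlier
for window in (1, 2):
    print(window, first_flight_stall_ms(segments, window))
```

Under these assumptions an 8KB response stalls for the full delayed-ACK interval with an initial window of one and not at all with an initial window of two.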
The potential downside of the change is that, with many clients all
starting at two packets instead of one, more network congestion
might be introduced. IETF (Internet Engineering Task Force, the
industry group that governs the Internet standards), after recognizing
the problem described here and the prevalence of the slow start bug
described in 4) only recently, conducted a preliminary study over the
global Internet on the effect of amending the slow start algorithm
to start at two packets instead of one. The study found no evidence
that the change caused more congestion. It's still conceivable,
although rare, that on a configuration that supports many clients on
very slow links, the change might induce more network congestion.
Therefore the change of "tcp_slow_start_initial" should be made with
caution.
Sun is actively participating in an effort in IETF to revise TCP
specification to allow more packets to be sent initially. Once the
revision is ratified, Sun will take the appropriate actions to
upgrade Solaris TCP accordingly.
6) COMMANDS FOR THE WORKAROUND (Solaris 2.6 only)
> su to root
> ndd -set /dev/tcp tcp_slow_start_initial 2
See ndd(1M) for an explanation of the tuning facility.
The way I understood it was there's 2 forms of communication going on between the client and server. For simplicity, I'll use an analogy.
It's sort of like making a telephone call in 1 of two ways:
The first way - Call a friend on the phone, and have an entire conversation, but never do the formality of a "Hello" or "Bye" at the beginning or end of the call and don't hang up even if you've run out of things to say.
The second way - Call a friend on the phone, but ring them individually for each and every word of the entire conversation, and be sure to include the formality of "Hello" and "Bye" with each and every call.
Maybe I have a weird way of reading this, but that's what I got out of it.
__________________________________
Free your mind - Flush your toilet
Sounds to me like this blog is describing pipelining, which is a standard part of HTTP 1.1...
What is HTTP pipelining?
Normally, HTTP requests are issued sequentially, with the next request being issued only after the response to the current request has been completely received. Depending on network latencies and bandwidth limitations, this can result in a significant delay before the next request is seen by the server.
HTTP/1.1 allows multiple HTTP requests to be written out to a socket together without waiting for the corresponding responses. The requestor then waits for the responses to arrive in the order in which they were requested. The act of pipelining the requests can result in a dramatic improvement in page loading times, especially over high latency connections.
Pipelining can also dramatically reduce the number of TCP/IP packets. With a typical MSS (maximum segment size) of 512 bytes, it is possible to pack several HTTP requests into one TCP/IP packet. Reducing the number of packets required to load a page benefits the internet as a whole, as fewer packets naturally reduces the burden on IP routers and networks.
HTTP/1.1 conforming servers are required to support pipelining. This does not mean that servers are required to pipeline responses, but that they are required to not fail if a client chooses to pipeline requests. This obviously has the potential to introduce a new category of evangelism bugs, since no other popular web browsers implement pipelining.
When should we pipeline requests?
Only idempotent requests can be pipelined, such as GET and HEAD requests. POST and PUT requests should not be pipelined. We also should not pipeline requests on a new connection, since it has not yet been determined if the origin server (or proxy) supports HTTP/1.1. Hence, pipelining can only be done when reusing an existing keep-alive connection.
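A minimal Python sketch of the rule above: several requests are serialized into one buffer for a single socket write, and anything other than GET or HEAD is refused. The function name and the example host are made up, and the sketch assumes the caller is already holding a keep-alive connection to an HTTP/1.1 server.

```python
IDEMPOTENT_METHODS = {"GET", "HEAD"}


def build_pipeline(host, requests):
    """Serialize (method, path) pairs into one byte string for a single write."""
    chunks = []
    for method, path in requests:
        method = method.upper()
        if method not in IDEMPOTENT_METHODS:
            # POST and PUT are not safe to replay if the connection drops mid-pipeline.
            raise ValueError(f"{method} must not be pipelined")
        chunks.append(f"{method} {path} HTTP/1.1\r\nHost: {host}\r\n\r\n")
    return "".join(chunks).encode("ascii")


payload = build_pipeline("example.org", [("GET", "/"), ("GET", "/style.css"), ("HEAD", "/logo.png")])
print(len(payload), "bytes in one write")
```

The responses then have to be read back in the same order the requests were written.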
How many requests should be pipelined?
Well, pipelining many requests can be costly if the connection closes prematurely because we would have wasted time writing requests to the network, only to have to repeat them on a new connection. Moreover, a longer pipeline can actually cause user-perceived delays if earlier requests take a long time to complete. The HTTP/1.1 spec does not provide any guidelines on the ideal number of requests to pipeline. It does, however, suggest a limit of no more than 2 keep-alive connections per server. Clearly, it depends on the application. A web browser probably doesn't want a very long pipeline for the reasons mentioned above. 2 may be an appropriate value, but this remains to be tested.
What happens if a request is canceled?
If a request is canceled, does this mean that the entire pipeline is canceled? Or, does it mean that the response for the canceled request should simply be discarded, so as not to be forced to repeat the other requests belonging to the pipeline? The answer depends on several factors, including the size of the portion of the response for the canceled request that has not been received. A naive approach may be to simply cancel the pipeline and re-issue all requests. This can only be done because the requests are idempotent. This naive approach may also make good sense since the requests being pipelined likely belong to the same load group (page) being canceled.
What happens if a connection fails?
If a connection fails or is dropped by the server partway into downloading a pipelined response, the web browser must be capable of restarting the lost requests. This case could be naively handled equivalently to the cancelation case discussed above.
The1Genius - Littera Scripta Manet
As the owner and operator of a small commercial web hosting outfit I wholeheartedly agree. Just two days ago one of my clients' sites got slashdotted.
/. traffic accounts for less than 1% of my servers' total traffic, it just happens to happen over a short period of time. It is not economical for me to have 99% idle bandwidth for the 0.01% of the time that it is needed. Also, you trolls aren't paying for it, I am.
It is extremely annoying to see posts about poor server configuration from the losers who post here. The server is seldom the issue, the bandwidth is. My server gets slashdotted about once a month and every time the server load is nominal, yet my two T1s get crushed. Of course I surcharge my clients responsible for this as it creates problems for the rest of my clients.
Some responsible behavior on the part of Slashdot editors/administrators is in order. It doesn't take a genius to figure out which sites may survive a slashdotting and which may not. When in doubt, ask.
As for the trolls that whine like little bitches about lack of bandwidth,
You obviously have no clue about networking. Keep-alives are implemented at a MUCH higher level, using a Keep-Alive header to keep the connection open.
The sequences described here are low level packet tweaks which are not RFC compliant at all. They leave connections in a half closed state in case another non RFC compliant request comes in.
SO what happens? It makes IE requests complete faster on IIS, but non IE requests slower due to an extra handshake due to the connection being half closed.
Top Most Bizarre/Disturbing Error Messages
Which is a standard. What is everyone complaining about?
hey, Linux users, for more Phoenix speed plug this into yr ~/.phoenix/default/xxxxxxxx.xlt/user.js file:
user_pref("nglayout.initialpaint.delay", 0);
CB
free ipod and free gmail!
The parent +5 post is flat out wrong. This is not about persistent connections, which are a high-level HTTP feature that keeps a connection open so that the browser can send more requests. This is about a low-level TCP hack that IE uses to get a small speed boost on IIS servers, while breaking TCP standards compliance.
If I read the article correctly, instead of creating a new TCP connection and then sending a request, IE sends the request immediately without bothering to finish the TCP handshake. Microsoft IIS web servers deal with it automatically, and it is faster because it saves a round-trip wait for the ACK and the following request.
The down side is that non-IIS servers have no clue what this incoming packet is. It must be invalid because it is not a SYN. So it gets thrown away, and the server might or might not reset the connection. If a non-IIS server resets the connection, IE goes with a standard TCP handshake and has wasted only the round trip time for the request packet and the RST. But if the server swallows the invalid packet and does not send a RST, then Internet Explorer will just sit around for a few seconds until it times out and falls back to a standard TCP connection.
The summary is that IE is breaking the TCP protocol for a small speed boost when connecting to IIS servers. It results in a small speed penalty when connecting to most non-IIS servers. When connecting to non-IIS servers that do not reset the connection, it results in a very noticeable delay.
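For a sense of scale, here is a back-of-the-envelope Python model of the three outcomes. It only restates the behavior as the blog alleges it, which nobody in this thread has confirmed; the 3000 ms timeout stands in for "a few seconds" and is an assumption, as is the function name.

```python
def time_to_first_response(rtt_ms, server_reaction, timeout_ms=3000):
    """Estimate the delay until the first response under the alleged IE behavior.

    server_reaction: 'answers' (the alleged IIS shortcut), 'rst', or 'drops'.
    """
    normal = 2 * rtt_ms  # one round trip for SYN / SYN-ACK, one for request / response
    if server_reaction == "answers":
        return rtt_ms                # request answered with no handshake at all
    if server_reaction == "rst":
        return rtt_ms + normal       # wasted request / RST round trip, then the normal path
    if server_reaction == "drops":
        return timeout_ms + normal   # client waits out its timer, then the normal path
    raise ValueError(f"unknown reaction: {server_reaction}")


for reaction in ("answers", "rst", "drops"):
    print(reaction, time_to_first_response(50, reaction), "ms")
```

With a 50 ms round trip, the model gives a saving of one round trip in the best case and a penalty dominated by the timeout in the worst case.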
It could also be a potential security risk, because if this is true, then it makes it very easy to IP-spoof a HTTP request against IIS (since the request is a self-contained packet instead of a long connection sequence).
Here's a tcpdump for www.microsoft.com, on an XP box:
03:47:16.259661 10.0.0.52.1328 > www.us.microsoft.com.http: S 2485226999:2485226999(0) win 16384 (DF)
03:47:16.279661 www.us.microsoft.com.http > 10.0.0.52.1328: S 631604626:631604626(0) ack 2485227000 win 65535 (DF)
03:47:16.289661 10.0.0.52.1328 > www.us.microsoft.com.http: . ack 1 win 17520 (DF)
03:47:16.289661 10.0.0.52.1328 > www.us.microsoft.com.http: P 1:398(397) ack 1 win 17520 (DF)
03:47:16.339661 www.us.microsoft.com.http > 10.0.0.52.1328: . ack 398 win 65139
And here's for www.msn.com:
03:50:22.169661 10.0.0.52.1397 > www.msn.com.http: S 2535664221:2535664221(0) win 16384 (DF)
03:50:22.199661 www.msn.com.http > 10.0.0.52.1397: S 3601141750:3601141750(0) ack 2535664222 win 65535 (DF)
03:50:22.209661 10.0.0.52.1397 > www.msn.com.http: . ack 1 win 17520 (DF)
03:50:22.209661 10.0.0.52.1397 > www.msn.com.http: P 1:391(390) ack 1 win 17520 (DF)
03:50:22.269661 www.msn.com.http > 10.0.0.52.1397: . ack 391 win 65146
These look like perfectly valid TCP handshakes. I did notice that when refreshing a site, IE reuses the previous connection, but that's legal (assuming it used Connection: Keep-Alive in the HTTP header. I didn't verify that.)
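If you want to check your own captures without eyeballing them, a small Python sketch like this can classify old-style tcpdump lines and confirm that a trace opens with SYN, SYN/ACK, ACK. The regular expression is written against the output format shown in the traces above and is an assumption about that format only; the sample lines are the first four packets of the www.microsoft.com trace with the wrapped fields rejoined.

```python
import re

# timestamp, "src > dst:", a flags token, then the rest of the line
LINE = re.compile(r"^\S+ (?P<src>\S+) > (?P<dst>\S+): (?P<flags>\S+) (?P<rest>.*)$")


def classify(line):
    """Map one tcpdump line to SYN, SYN/ACK, ACK, DATA, FIN, RST or OTHER."""
    m = LINE.match(line.strip())
    if not m:
        raise ValueError(f"unrecognized line: {line!r}")
    flags, rest = m["flags"], m["rest"]
    has_ack = re.search(r"\back \d+", rest) is not None
    if "S" in flags:
        return "SYN/ACK" if has_ack else "SYN"
    if "R" in flags:
        return "RST"
    if "F" in flags:
        return "FIN"
    if "P" in flags:
        return "DATA"
    return "ACK" if has_ack else "OTHER"


def starts_with_handshake(lines):
    """True if the first three packets are a normal three-way handshake."""
    return [classify(l) for l in lines[:3]] == ["SYN", "SYN/ACK", "ACK"]


TRACE = [
    "03:47:16.259661 10.0.0.52.1328 > www.us.microsoft.com.http: S 2485226999:2485226999(0) win 16384 (DF)",
    "03:47:16.279661 www.us.microsoft.com.http > 10.0.0.52.1328: S 631604626:631604626(0) ack 2485227000 win 65535 (DF)",
    "03:47:16.289661 10.0.0.52.1328 > www.us.microsoft.com.http: . ack 1 win 17520 (DF)",
    "03:47:16.289661 10.0.0.52.1328 > www.us.microsoft.com.http: P 1:398(397) ack 1 win 17520 (DF)",
]
print([classify(l) for l in TRACE], starts_with_handshake(TRACE))
```

A trace that began with a data packet, as the blog describes, would make `starts_with_handshake` return False.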
The samples were taken on my network's gateway, which is a Linux box, hence impartial :)
But don't take my word for it. Try it yourself!
Everyone catch their breath from MS bashing and think for a second. There's no way IE is using some custom TCP layer, and there's no way it could get away with not sending SYN's.
I've seen plenty of packet traces with several versions of IE and everything is perfectly legal TCP-wise. I'm guessing the author of the blog just caught part of a sequence that had used a persistent connection that had been closed. In fact, if the client didn't get the RST on that connection, it sounds more like a server issue.
No, what's described is a plain vanilla half-close of a standard TCP connection. The server calls shutdown() and sends a FIN, and the client stack ACKs it. The browser doesn't call shutdown(), hence it doesn't generate any FIN packet for its half of the connection. It's entirely acceptable from a TCP-protocol standpoint, although highly annoying.
As has been said countless times already, no. This is a violation of the TCP standard. Pipelining works within the HTTP standard, and part of that is keeping the connection open using standard TCP signalling technology, which this is definitely not.
Got time? Spend some of it coding or testing
Persistent connections work through the HTTP protocol layer over standard TCP, this is a violation of a much lower TCP protocol layer instead.
Got time? Spend some of it coding or testing
It is being set up properly. What happens is that the browser hasn't closed its half of the connection. When the next request happens it tries a TCP write, but since the server side has closed the connection the write fails. That's what's confusing the blog author; they're not familiar with the TCP protocol. A TCP connection has two halves and it's entirely legal to close one half but not the other, leaving a socket that can be read from but not written to (or vice versa). IE doesn't check for the server-side close like it should, treats the socket as if it's writable (which it is) and writes to it. Since the server's closed the socket on its end, that attempted write generates an RST (which is TCPly correct), the browser gets a write error and finally notices that its connection has been closed by the remote end, closes everything down like it should have much earlier and builds a completely new socket.
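The "two halves" point is easy to see on loopback. The Python sketch below is only a demonstration of a legal half-close, not a reproduction of IE or IIS: the server shuts down just its sending side, the client sees end-of-file, and the client can still write. In the scenario described above the server has closed both directions, which is why the late write draws an RST instead. Function names are made up.

```python
import socket
import threading


def serve_once(listener, received):
    """Accept one connection, reply, half-close, then read what the client sends."""
    conn, _ = listener.accept()
    with conn:
        conn.settimeout(5)
        conn.sendall(b"response")
        conn.shutdown(socket.SHUT_WR)      # close only the sending half (this sends a FIN)
        received.append(conn.recv(1024))   # the receiving half is still open


def half_close_demo():
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick one
    received = []
    worker = threading.Thread(target=serve_once, args=(listener, received))
    worker.start()
    with socket.create_connection(listener.getsockname()) as client:
        client.settimeout(5)
        chunks = []
        while True:
            chunk = client.recv(1024)
            if not chunk:                  # b"" means the server's FIN has arrived
                break
            chunks.append(chunk)
        client.sendall(b"still writable")  # our half of the connection is still open
    worker.join()
    listener.close()
    return b"".join(chunks), received[0]


print(half_close_demo())
```

A client that checked for the end-of-file before reusing the socket would know to reconnect without ever provoking the RST.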
You can get this same behavior between two Linux systems. The server side goes:
- socket(...)
- listen(...)
- accept(...)
- read(...)
- write(...)
- shutdown( SHUT_RDWR )
- close()
The client side goes:
- socket(...)
- connect(...)
- write(...)
- read(...)
- write(...)
- Note error
- close(...)
- socket(...)
- connect(...)
In IE, steps 3 and 4 in the client handle one request. Step 5 is an attempt to handle the next request assuming that the server handles persistent connections. Step 6 is where IE notices that the server doesn't do persistent connections. The right thing to do would be to notice the HTTP version and the lack of a Connection: header indicating support for persistent connections in the response, and close the connection upon receipt of the response. IE is stupid in not handling non-persistent-connection servers as it should, but it's not violating or even bending the TCP protocol spec in any way. It's just stupid coding.
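For anyone who wants to try the sequence listed above, here is a rough Python sketch of both sides running over loopback. It is an illustration under assumptions, not IE's actual code: the names `serve`, `read_exact` and `run_demo` are made up, the server sketch adds the bind() call that the list leaves out, and the exact error the client sees on the reused socket (a reset, a broken pipe, or a plain end-of-file) depends on the OS and on timing, so the sketch only checks that no second response comes back.

```python
import socket
import threading

RESPONSE = b"HTTP/1.0 200 OK\r\nContent-Length: 5\r\n\r\nhello"
REQUEST = b"GET / HTTP/1.0\r\n\r\n"


def serve(listener, connections):
    """Server side: accept, read, write, shutdown, close. No keep-alive."""
    for _ in range(connections):
        conn, _addr = listener.accept()
        conn.recv(1024)
        conn.sendall(RESPONSE)
        try:
            conn.shutdown(socket.SHUT_RDWR)
        except OSError:
            pass  # the connection may already be gone
        conn.close()


def read_exact(sock, length):
    """Read up to `length` bytes, stopping early on EOF or a socket error."""
    data = b""
    try:
        while len(data) < length:
            chunk = sock.recv(length - len(data))
            if not chunk:
                break  # the server's FIN: its half is closed
            data += chunk
    except OSError:
        pass  # reset, broken pipe, or timeout
    return data


def run_demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(2)
    addr = listener.getsockname()
    server = threading.Thread(target=serve, args=(listener, 2), daemon=True)
    server.start()

    # Client steps 1-4: socket, connect, write, read.
    client = socket.create_connection(addr, timeout=5)
    client.sendall(REQUEST)
    first = read_exact(client, len(RESPONSE))

    # Step 5: write again without checking whether the server closed.
    try:
        client.sendall(REQUEST)
    except OSError:
        pass
    # Step 6: note the error (nothing comes back on the old socket).
    reused = read_exact(client, len(RESPONSE))

    # Steps 7-9: close, then a fresh socket and connect.
    client.close()
    client = socket.create_connection(addr, timeout=5)
    client.sendall(REQUEST)
    retry = read_exact(client, len(RESPONSE))
    client.close()

    server.join(timeout=5)
    listener.close()
    return first, reused, retry
```

Calling `run_demo()` should give the full response for the first request, nothing for the request written to the stale socket, and the full response again after reconnecting.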
More like, the powers that keep slashdot editors from telling people they're going to be linked caused perl to suck up all our CPU. Nothing crashed, the proc was simply pegged until I woke up and fixed it.
It's back up with a redirect for slashdot.
What is described in the article is a bastard half-closed connection, which is completely unnecessary unless your goal is gratuitous violation of the TCP spec.
You know, I seem to recall some guy saying that Microsoft's long-term goal was to embrace, extend, extinguish TCP/IP. And that they'd start by making tiny little changes so that Microsoft programs talking to Microsoft programs worked much better than Microsoft/non-Microsoft. He got booed down quite loudly - everyone claimed that they could never try anything like that. It'd be noticed immediately and they'd have a PR disaster.
The odd thing? He was half-right. He was wrong only in saying that they hadn't done it yet.
I then fired up Windows XP Pro. XP sends lots of netbios stuff at startup and periodically. Very interesting. But again, nothing nearly as interesting as this article suggests. MSIE 6.0.2600.0000... also did not reproduce this non-RFC behavior.
Here is the packet log from tcpdump, with some comments. 192.168.194.211 is the Windows XP client. 192.168.194.1 is the nameserver, and 66.218.71.83 is the web server (www.yahoo.com).
First, XP asks the nameserver for the IP number of www.yahoo.com
15:19:50.426473 192.168.194.211.1026 > 192.168.194.1.domain: 2+ A? www.yahoo.com. (31)
The nameserver responds
15:19:50.702603 192.168.194.1.domain > 192.168.194.211.1026: 2 10/11/0 CNAME[|domain] (DF)
XP/MSIE sends a normal SYN packet. There is no non-RFC packet transmitted before this standard SYN packet, corresponding to an already-open connection before this as the article claims.
15:19:50.734980 192.168.194.211.1032 > 66.218.71.83.http: S 3861657940:3861657940(0) win 16384 <mss 1460,nop,nop,sackOK> (DF)
Yahoo responds with a normal SYN/ACK
15:19:50.797377 66.218.71.83.http > 192.168.194.211.1032: S 3674114276:3674114276(0) ack 3861657941 win 65535 <mss 1460> (DF)
XP/MSIE sends a normal ACK to finish the connection setup
15:19:50.802506 192.168.194.211.1032 > 66.218.71.83.http: . ack 1 win 17520 (DF)
XP/MSIE sends the HTTP request (196 bytes)
15:19:50.809064 192.168.194.211.1032 > 66.218.71.83.http: P 1:197(196) ack 1 win 17520 (DF)
Yahoo responds with the first 1460 bytes of data
15:19:50.907564 66.218.71.83.http > 192.168.194.211.1032: . 1:1461(1460) ack 197 win 65535 (DF)
XP/MSIE acks it
15:19:50.919180 192.168.194.211.1032 > 66.218.71.83.http: . ack 2921 win 17520 (DF)
Yahoo responds with another 1460 bytes
15:19:50.923751 66.218.71.83.http > 192.168.194.211.1032: . 2921:4381(1460) ack 197 win 65535 (DF)
XP/MSIE acks it
15:19:50.941174 192.168.194.211.1032 > 66.218.71.83.http: . ack 4381 win 17520 (DF)
Yahoo responds with two more packets
15:19:50.999791 66.218.71.83.http > 192.168.194.211.1032: . 4381:5841(1460) ack 197 win 65535 (DF)
15:19:51.007961 66.218.71.83.http > 192.168.194.211.1032: . 5841:7301(1460) ack 197 win 65535 (DF)
XP/MSIE acks that it has received up to 7301. Notice how Microsoft is properly delaying the ack until a second packet is received.
15:19:51.013652 192.168.194.211.1032 > 66.218.71.83.http: . ack 7301 win 17520 (DF)
So there are two tests, with the MSIE shipped (unpatched) with Windows 98 SE and Windows XP Pro. It looks like there just isn't a story here.
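If you want to check a capture like the one above without eyeballing it, something along these lines would do. It is a rough sketch that assumes the old-style tcpdump text format shown in this comment; the regular expression and the `classify` and `handshake_ok` helpers are made up for illustration and are not part of tcpdump.

```python
import re

# First four TCP lines from the trace above, copied verbatim.
TRACE = """\
15:19:50.734980 192.168.194.211.1032 > 66.218.71.83.http: S 3861657940:3861657940(0) win 16384 <mss 1460,nop,nop,sackOK> (DF)
15:19:50.797377 66.218.71.83.http > 192.168.194.211.1032: S 3674114276:3674114276(0) ack 3861657941 win 65535 <mss 1460> (DF)
15:19:50.802506 192.168.194.211.1032 > 66.218.71.83.http: . ack 1 win 17520 (DF)
15:19:50.809064 192.168.194.211.1032 > 66.218.71.83.http: P 1:197(196) ack 1 win 17520 (DF)
"""

# timestamp, source, ">", destination, flags, then the rest of the line
LINE = re.compile(r"^\S+ (?P<src>\S+) > (?P<dst>\S+): (?P<flags>\S+) (?P<rest>.*)$")


def classify(line):
    """Label one tcpdump line as SYN, SYN-ACK, ACK or DATA (None if unparsed)."""
    m = LINE.match(line)
    if m is None:
        return None
    flags, rest = m["flags"], m["rest"]
    has_ack = re.search(r"\back \d+", rest) is not None
    length = re.search(r"\((\d+)\)", rest)  # payload size, e.g. "(196)"
    if "S" in flags:
        return "SYN-ACK" if has_ack else "SYN"
    if length is not None and int(length.group(1)) > 0:
        return "DATA"
    return "ACK"


def handshake_ok(kinds):
    """True if the connection opens with SYN, SYN-ACK, ACK, in that order."""
    return kinds[:3] == ["SYN", "SYN-ACK", "ACK"]


kinds = [classify(line) for line in TRACE.splitlines()]
```

For the four lines above, `kinds` comes out as SYN, SYN-ACK, ACK, DATA, so the request only goes out after a complete three-way handshake.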
PJRC: Electronic Projects, 8051 Microcontroller Tools
Whoever wrote this and his 'team' are tards. What they were seeing was a keep-alive (persistent) connection. It's total BS that IE would ever send a request to a host without a connection already being open. IIS just allows persistent connections: when you hit blah.com, you open the socket, send your request and specify keep-alive. The socket then stays open, so when you hit another page on the same host, the request goes over the already-open socket without the initial 3-way handshake, since that's already been done. If it were true that IIS let IE get a page without a 3-way handshake first, it would allow total TCP hijacking and DoSes. (Not that the Windows TCP/IP stack would even _allow_ that packet to get through, because it's based off of the BSD TCP/IP stack and a 3-way handshake _must_ be done before any data can get to a user-land socket, and not like any NATed routers would let it through, either.) But it's always nice to see that people who don't know jack are able to post stuff to slashdot ;o
> no need to bother with sequence number guessing, just send your data packet right away, and pretend the connection is already open.
wrong. you still need to know the sequence number of the stream from the client to the server if you are going to send a packet. otherwise the server will drop the packet.
you dumbass. if you knew shit about protocols you would know that UDP will "just gimme the data", but you can't do that reliably. What if packets get lost on the way? What if the client loses connection? TCP ends up way better for this sort of thing than "just gimme the data". You want to actually get the data, you gotta pay the price for the reliability.
You really don't want to do that. HTTP over UDP is simply a bad idea... Why? In order to meet the most basic needs of a stream-reliant protocol (à la HTTP) you need a few things:
1. Reordering (No guarantee packets arrive in order)
2. Retransmitting (Detect lost packets and resend)
3. Speed throttling (Packets go too fast -> router interface buffers overflow -> packet dropped)
It is of course possible to write a protocol on top of UDP to do these things, but thankfully we don't have to as someone did the work for us... it is called TCP.
I suppose if you wanted to use an HTTP/UDP mechanism for very short communication only (read: one request packet, one reply packet) then those issues aren't relevant, but otherwise leave the heavy lifting to TCP or other stream-based IP protocols.
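To give a feel for how much bookkeeping even the first two items take, here is a toy receiver-side sketch in Python. It is purely illustrative: the `ReorderBuffer` class and its per-datagram sequence numbers are assumptions made for this example, it ignores speed throttling entirely, and real TCP numbers bytes rather than packets.

```python
class ReorderBuffer:
    """Deliver datagram payloads in sequence order and report the gaps."""

    def __init__(self):
        self.next_seq = 0   # next sequence number the application expects
        self.pending = {}   # out-of-order payloads, keyed by sequence number

    def receive(self, seq, payload):
        """Accept one datagram; return the payloads now deliverable in order."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate of something already seen
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready

    def missing(self):
        """Sequence numbers the sender would need to retransmit."""
        if not self.pending:
            return []
        highest = max(self.pending)
        return [s for s in range(self.next_seq, highest) if s not in self.pending]
```

Feed it packet 1 before packet 0 and it holds packet 1 back, lists 0 as missing, and releases both once 0 shows up. A sender-side retransmit timer and some form of rate control would still have to be built on top.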
IE doesn't exhibit this behavior with servers that don't support http pipelining/keepalives/whatever.
IIS isn't the only server that supports it, btw. Apache does, and I imagine Tux or whatever the current kernelspace webserver is supports it too.
Also, your second scenario, for a server that doesn't understand the keepalive, is, as you allow, completely wrong. If a server could be confused in such a manner, then it would be trivial to write a DoS attack for the server that would not require large amounts of bandwidth.
There are no trails. There are no trees out here.
Actually I am pretty sure that if you log out it will shut down IE.
Q.