Why IE Is So Fast ... Sometimes
safrit writes "Finally the scoop on how IE "cheats" a little to up its performance! Do RFCs mean nothing anymore? What's next, Riots in the streets, dogs and cats living together, mass hysteria!
From the blog story: 'Internet Explorer on Windows always seems either to run impossibly fast (page requests are fulfilled almost before the mouse button has returned to its original unclicked position), or ridiculously slow...' Now read to see why..."
Heck, IE still uses an HTTP Accept line with */* at the end without quality ratings rather than a more complete one, like Mozilla's. Reason? It saves a few bytes.
Example:
IE 6/Win: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Mozilla: application/x-shockwave-flash,text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1
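For anyone wondering what the quality ratings actually change, here is a rough sketch of how a server might rank the entries of an Accept line. The function name parse_accept is made up for illustration, and it only looks at the q parameter (missing q is treated as 1.0), so take it as a simplification rather than a full HTTP header parser.

```python
def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q; a missing q means 1.0."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 1.0  # assumption: treat a malformed q as the default
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


ie = parse_accept("image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, "
                  "application/x-shockwave-flash, */*")
moz = parse_accept("application/x-shockwave-flash,text/xml,application/xml,"
                   "application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,"
                   "video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1")
```

With the IE line every entry, including */*, ends up with the same weight, so the server learns nothing about preference. With the Mozilla line */* sorts last.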
Does anyone know if this sequence is there for security purposes? It looks like this might lead to a spoofing vulnerability.
ato
I think what we're seeing is the use of the HTTP Keep-Alive which is part of the HTTP 1.1 standard. Am I wrong?
I find this hard to believe, and also very un-newsworthy.
i wonder about the relationship between this and the standard keepalive protocol, which basically is a standard that keeps a connection open for a certain amount of time so the browser doesn't have to keep opening new tcp connections for each image or whatever.
i would assume that the keepalive protocol reduces the ill effects of this system, since once a connection is made it doesn't have to be torn down and reestablished, or at least not for each request.
This almost makes me want to break some other rules and hack my TCP stack to send back some other amusing responses to unsynchronized packets - perhaps a ping of death or an invalid OOB packet (WinNuke)?
great post, wish i had mod points... i'd love to see this question answered.
sig.
This is (was) often used to confuse IDS systems as they wouldn't look for data in a new connection.
A custom application we run at work makes use of the IE ftp client to make automated connects to our ftp server. Any other client, Linux or Windows, disconnects from the server on shutdown. IE or the IE-based ftp client don't, even if you exit IE. Because of this we've been forced to set a session idle timeout of 1 minute on the server to avoid hanging connections. Is this another example of the same technique, client-side?
The thing I don't understand... Isn't this somewhat like keepalive and pipelining?
I normally hate Microsoft, and think they are up to massive conspiracies. However, in this case, it seems more to me like a legitimate innovation, as opposed to some elaborate scheme. I fail to understand what is 'evil' about this: isn't this a good thing?
________________________________________________
suwain_2
that cannot be... surely there'd be an endless amount of problems with stateful firewalls. not to mention that isa and msproxy server would have to support this.
are we sure that the author just doesn't understand persistent connections???
a simple netstat -a would show you if the connection was kept open... i'm using squid as my proxy so can't test this.
No matter how fast or how slow IE is, a lot of people are still stuck using it because there are just some sites that are Windows-centric. Some sites just don't work or look like crap if you're using something else.
Granted, I cannot read the actual article as the site is down, but based on the above:
This is fuzzy math. I do not like IE/IIS any more than the next guy, but if the server were to leave a half-open connection on its side, two things would happen.
#1 The client's TCP stack (not IE, the stack) would have no idea that this connection was still open and would send a new SYN as soon as the user selected another link. This new request would have a different sequence number (probably source port as well) and would have to do the THREE-WAY handshake (SYN - SYN/ACK - ACK). Negating any benefit
#2. Those 1/2 open connections on the server use resources. Any host has a limit to the # of connections it can maintain, at which point it stops accepting new ones (check out how SYN floods work!). That means if this were true, IIS servers would need to be restarted constantly to clear these buffers.
..of Microsoft browser networking bugs which make it only work well with IIS. For example, this bug causes IE to fail to properly shut down SSL connections. IE browsers using SSL connections with standard Apache webserver configurations will have all kinds of errors due to this issue. You need to either disable keepalives or increase the keepalive timeout to something outrageous like 2 minutes. This "bug" has been around for ages, yet despite IE being in version 6, it has yet to be fixed. My guess is this is actually some kind of "feature" that makes IE work faster with IIS (since the connection never closes, subsequent requests go faster, assuming the webserver knows how to speak the broken protocol).
Err, I don't think so. From what I've read about HTTP KeepAlive, the connection should be kept alive by adding a "Connection: KeepAlive" header to the request or something like that. I can't imagine any reason why any protocol should want to interfere with the TCP handshaking sequence for keepalive purposes. That would mean crossing out of the application layer into the transport layer.
This issue caused me a lot of grief last year, and I am just figuring out why. We set up a webmail server using Apache/Vhosts and OpenSSL, and we had this recurring problem of links just suddenly breaking in IE
... It'd just return "Page could not be loaded" or something like that. The problem never cropped up in Mozilla or other browsers, and eventually I found out that if I added this line:
SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown
to the virtual host configuration, the problem went away. Now that I've read this article, I think I understand why. What I think is happening here is that Microsoft is trying to make the most out of keepalive/persistent connections by bending the rules. And it's not right.
Am I a hipster-doofus?
#1 The client's TCP stack (not IE, the stack) would have no idea that this connection was still open and would send a new SYN as soon as the user selected another link. This new request would have a different sequence number (probably source port as well) and would have to do the THREE-WAY handshake (SYN - SYN/ACK - ACK). Negating any benefit
True, but suppose they hacked their TCP stack to recognize a magic SYN number, and bypass the three-way handshake if the client sends this magic number. Improbable? IE is part of the Windows kernel, so what's to say it doesn't poke directly in the TCP/IP stack. Wild speculation, I know.
Bush Lies Watch
Before jumping on the let's-kill-MS-for-ignoring-RFCs bandwagon, maybe Linux needs to look at itself, as I'm sure there are places it does NOT adhere to RFCs.
One place of note in particular is the implementation of Nagle's algorithm in the TCP/IP stack.
Nagle's algorithm is specified as the algorithm to use for TCP/IP sockets that are not flagged as NoDelay, yet Linux blatantly ignores that and implements its own algorithm, which, while superior in some cases, is worse in others, and is definitely NOT standard.
----------
For reference.. Nagle's algorithm basically says that only one small TCP packet may be outstanding (sent but not yet ACKed) at a time. If no packets are outstanding, a packet is sent as soon as data is available. If a packet has been sent but not ACKed, just buffer the outgoing data and send it in the next packet when the ACK arrives.
Linux's implementation DOES NOT send data as soon as it is available, and adds a timed delay for the first packet, even though no packets are outstanding.
This leads to horrible performance in applications that do not enable NoDelay and do not send large amounts of data in one batch.
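For what it's worth, the NoDelay flag mentioned above is a per-socket option, so an application that sends small writes can opt out of the buffering itself. A minimal sketch in Python, with helper names (make_nodelay_socket, nagle_disabled) invented for this example:

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY set)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock):
    """Report whether TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


sock = make_nodelay_socket()
print(nagle_disabled(sock))
sock.close()
```

This says nothing about what any particular kernel does when the flag is left off; it only shows the switch the comment is referring to.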
I'm missing the point of this /. story. M$ bashing? An attempt to establish a position on the "moral highground" for the non-M$ community? Trying to convince IE users to switch to another browser because using IE would constitute a standards violation (say it isn't so!)?
Boring.
You might want to look into HTTP 1.1 as well. In fact, so should Microsoft, because (if the article is accurate) they've apparently re-invented the wheel in square form.
Standard HTTP 1.1 keepalive still uses a regular, plain, vanilla TCP connection. No FIN packets until the connection actually is finished. It simply doesn't close the connection, allowing further requests on the same connection (because the connection is still open). The connection is closed - using the standard methods - when one side decides to close the connection (eg. after a timeout).
What is described in the article is a bastard half-closed connection, which is completely unnecessary unless your goal is gratuitous violation of the TCP spec.
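To make the "regular, plain, vanilla" part concrete, here is a small self-contained sketch that starts a throwaway HTTP/1.1 server on localhost and sends two requests over one connection using only the Python standard library. The handler class, function name and paths are invented for the demo, and it has nothing to do with how IE or IIS behave; it only illustrates ordinary connection reuse.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 makes http.server keep the connection open by default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        # Content-Length tells the client where the response ends,
        # so the connection does not need to be closed to mark the end.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_twice_on_one_connection():
    """Return the client's local port as seen after each of two requests."""
    server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5)
        ports = []
        for path in ("/a", "/b"):
            conn.request("GET", path)
            conn.getresponse().read()
            ports.append(conn.sock.getsockname()[1])
        conn.close()  # normal close once the client is done
        return ports
    finally:
        server.shutdown()
        server.server_close()


ports = fetch_twice_on_one_connection()
```

Both requests report the same local port, i.e. one TCP connection and one handshake, and it is closed the normal way when the client is done.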
The blog describes the full HTTP transaction process as:
Which IE (allegedly) "hacks" and the transaction really goes like:
If this is true, then IE saves 2 round trips per connection. Clients generally open 4 connections per server, and keep them open (alive) until they've downloaded the page and all supporting files. So IE possibly saves 8 round trips per page with this (alleged) hack.
For domestic dialup connections, the average round-trip latency is 60ms. DSL is around 40, while cable is around 20. Ping slashdot.org to find out the latency of your connection.
So, for a domestic dialup user connecting to an IIS server, a straight request (with no handshake) would save 8*0.06s = 0.48s. The page mentions combining SYN/ACK packets, so this may even be less of a savings.
A 0.48s cheat in page load times hardly makes IE "impossibly fast" when page load times over a modem typically run > 20s.
Also, don't forget that this blog also talks about non-IIS servers balking at this non-standard connection setup with an RST packet. That adds 4*0.06 = 0.24s to page load times on, say, Apache servers. If true, that doesn't make IE "ridiculously slow," either.
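The arithmetic above is easy to redo for other connection types. A quick sketch using the figures quoted in this comment (2 round trips saved per connection, 4 connections per server, and the stated latencies); the function and constant names are just for illustration, and the numbers are the comment's estimates, not measurements.

```python
# Latencies as quoted in the comment, in milliseconds.
LATENCY_MS = {"dialup": 60, "dsl": 40, "cable": 20}


def handshake_savings_ms(latency_ms, connections=4, round_trips_saved=2):
    """Time saved per page if every connection skips the quoted round trips."""
    return connections * round_trips_saved * latency_ms


def rst_penalty_ms(latency_ms, connections=4):
    """Time lost per page if each connection costs one extra round trip (RST)."""
    return connections * latency_ms


for name, latency in LATENCY_MS.items():
    print(name, handshake_savings_ms(latency), rst_penalty_ms(latency))
```

For dialup this gives the same 480 ms saved and 240 ms lost as worked out above; against a page load of more than 20 seconds, 480 ms is under 2.5%.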
It all goes downhill from first post
I run a bunch of Apache/mod_ssl servers for WebCT, an online course management tool. We have to disable keepalives for IE users because otherwise their connections get hosed up.
I wonder if this habit of playing fast and loose with the protocol is responsible.
A dyslexic man walks into a bra.
Does IE have a custom TCP layer in it?
You're kidding, right? IE is not some stand-alone program. It has MANY links into low-level Microsoft stuff where it is 'part' of the Windows OS. This was the whole argument of M$'s lawyers, that IE couldn't be removed easily.
So it wouldn't surprise me if IE had access to some special stack API to pull stuff like this. Would not surprise me at all.
Top Most Bizarre/Disturbing Error Messages
The IIS team probably noticed this and just accepted the command even though there wasn't actually a valid TCP connection present. So if they receive a packet that looks enough like a HTTP request then do it. There's probably a stack of vulnerabilities here.
The interesting point is that IE and IIS must be using the network stack at a layer lower than the BSD style socket calls otherwise these packets would be rejected at the OS level and no, I don't believe Windows' networking stack is that crap. TCP processing is fiddly so cue more security holes.
This is also an easy in to hurt IE performance. Rather than responding to the dud packet with a RST, don't respond at all (which according to the article is an acceptable response). I'm not sure how linux handles this atm. The end result is IE is dog slow to start loading the page but every other browser is super quick.
And to all those people who posted saying this is HTTP pipelining, please don't talk about networking, ever. You lack a basic understanding of how network protocols are layered upon each other. It would be better if you just rub your chin and nod sagely, possibly saying "hmmmm" at the same time. That way you won't look so stupid.
Nerd: Derogatory term typically directed at anybody with a lower Slashdot ID than you.
So MS works things so their stuff is quicker -
Quite honestly I don't care a hoot about the standards, TCP or any of that techno-babble. If IE is faster then for everyday people that's a Good Thing.
Boo-hoo, waaahh, my 'zilla is slower - shame an army of developers didn't figure this out and hack it into the appropriate sources first, if they had then it would be; Yay for OSS!
Unless they are *not* using the standard sockets interface... rather using some undocumented hack inside win32 that does this (hey, linux has something like this, it's called the "Packet Generator" (LOL), but it is at least documented (and has its usage highly counter-recommended)). Man, if this is the case, I see some ppl getting pretty pissed off. That's why closed-source monopoly software is not a Good Thing (TM). Anyone remember the stories about M$ using undocumented windows APIs in Office to be faster than the competition?
cheers.
``If a program can't rewrite its own code, what good is it?'' - Mel
IE's influence on a windows box is (unfortunately) quite widespread, but AFAIK, it's mainly just a bunch of COM components. Does anyone know if IE installs a driver (.SYS) file? That's the only way it would be able to jump into kernel mode.
Perhaps it's because for 60%+ of the servers out there, it actually makes things slower and for 100% of the servers, it makes it less reliable.
it's documented here: "Object Moved Error"
something I can run on my apache server that rejects clients that do not follow the rfc for tcp/ip, and hence rejects ie
I've toyed with blocking based on agent string, but that seems cruel and stoops to the level of MS...(who do this regularly) and besides, it goes against my beliefs of software choice... however, it would be nice to redirect people to a page that says, "Your browser is not standards-compliant"
First of all, the article leaves out if the URL was visited previously in the browsing session, or it was a first visit to the URL. Secondly it Does NOT MENTION a specific test with IIS. The article says what IE's teardown sequence looks like, but does not cite a full trace of an IIS server. It is basically a bunch of speculation and conspiracy theory, or the authors were in a hurry. It definitely was NOT scientifically conducted.
I HIGHLY DOUBT that Microsoft would make the connection to an IIS server between Internet Explorer and IIS lose the reliability of TCP. Images and other large files would be a nightmare to get without acknowledgements.
What I think may be happening(ingenious YES and no, read on), assuming the article isn't all just conspiracy theory, is that Internet Explorer visits a website, and leaves the connection open. Then you hit the back button(which is used extensively in web browsing) or you go to other links on the same web server. Because the client doesn't close its end of the TCP connection(REMEMBER TCP IS FULL DUPLEX), half of the connection is open. The client can just send the request immediately, the sequence numbers were already assigned, because the connection was NOT CLOSED. What support is there for my view?
1. ".And that's it. The client doesn't FIN, and the server doesn't ACK. In other words, the connection is kept "half-open" on the server end. The reason for this? Why, to make subsequent connections from IE clients faster." What is this saying? Quite simply subsequent connections would be faster by leaving it open.
2. The article goes on to show IIS's server's sequence, how it shuts down but the client leaves its upstream to the server open.
It's definitely a trade off. It means YOUR computer as well as the IIS server keeps track of the connection. For a one shot deal on a server(frequently for slashdot users), the connection being left open is a waste. Eventually it must time out, but it does eat up some server resources and it would make the slash effect worse on the machine.
The article doesn't mention what non-IIS servers do. If the client (IE) doesn't FIN its end of the connection, the connection is not closed. Do other servers time out? I do notice that when web browsing there are a lot of open connections in a timed wait when I use Internet Explorer. I don't think they all use IIS. In this case, their speedup would work on any server. But I am not knowledgeable enough about what Apache/other http servers do to know for sure. In my TCP/IP class a year ago in college we discussed the FIN ACK sequence both ways, but didn't discuss what most standard servers do if the client decides not to FIN (aka doesn't follow the rules). It is probably a design decision. Also what does MS IIS do on the first connection? It would probably be best to respond immediately with that packet so the client could immediately open the connection. And the super slow pages could be on servers that don't answer at all so IE has to wait for the request to time out. Essentially what this article is saying is that Microsoft Internet Explorer is a resource hog. But we all knew that already, that's why we call it Internet Exploiter.
# HTML support
# URI parsing that's RFC-2396 compliant
# Cookies support, RFC-2965 compliant
# XHTML 1.0 rendering
# Plain text rendering
# Image formats support: PNG, JPEG and GIF (no animated GIFs)
# HTTP 1.x Compliance
RFC / W3C STANDARDS
You keep going until you die... "Me".
What is much more interesting is what IE does AFTER it sends that first request without opening the connection... You know the lovely MSN Search page it loves to pop up? Every time IE encounters (for the first time in each session) a non-IIS server, it promptly connects to MSN Search and submits the website address....
You are being watched, friends.
Cool! Amazing Toys.
In any sufficiently complex software, there are bugs. One should certainly entertain the fact that this behavior is accidental. Not everything MS does is deliberate, and I find it entertaining that so many folks assume their 'enemy' is malicious and organized.
First of all it is not "reproducsble" but "reproducible". Furthermore, the browser/client TCP/IP stack would have to start up without the "SYN" packet right away. Therefore it cannot know what kind of web server the server side is running. I must say that I have never seen one of these strange requests myself. I cannot believe this has ever been the case, or ever will be the case. Furthermore, imho Microsoft is quite nice about following HTTP standards. They have been more standardised than Netscape browsers until the 4.7 versions were replaced by - eh - mozilla 1.0 effectively :).
This is a good thing too. My firewall would probably not allow such strange goings-on in the TCP/IP connections (dunno, my NAT router will probably not even set up a connection without a SYN packet). HMMM. Maybe I should try this :).
Maarten
I'm sorry, but this person must not know how to perform a network trace.
The first thing you need to do when you start a network trace is start with a clean slate. I'm sure that if this trace was done properly, that the results would be different.
IE CANNOT do what is requested here without modifications not only to IE but to the entire IP stack of the client AND the server AND firewalls AND NAT routers AND proxy servers.
For someone to send a packet requesting data without first opening up a connection would force the connection to be RST not only by every known host on the planet, but also by every known firewall and NAT router, through which most of the world runs at some point these days.
Therefore, it would be impossible for this packet to get through without first going through the standard TCP setup and teardown procedures.
Michael (too lazy to create an account to post under)
Sometimes, when I enter a URL in OS X in IE, the first time I try it 404s but when I press return a second time it connects. Now I understand...and hate Microsoft more than ever for their flouting of the standards that bring us all together. Orwellian for sure. Big Brother Bill.
When trying to connect to an address of form 1.2.3.4, the program would halt for some twenty-thirty seconds before proceeding.
IIRC this was a problem with Sun's implementation of InetAddress.getByName() on Windows. When passed a string containing a dotted numeric quad, it stupidly tried to do a DNS lookup on it instead of simply filling in the four bytes and handing you an InetAddress. Because who knows- maybe someone registered "192.168.1.23" as a domain name! (Which would be akin to registering "microsoft.com" with a Cyrillic "o", but never mind.) Then of course your thread stalled inside InetAddress for half a minute while it waited for the DNS timeout. This makes me suspect that Sun's code was waiting for the successful DNS response and ignoring the failure response that actually arrived. Probably the same moron was responsible for both bugs. Editing the hosts file became the standard workaround.
I don't know when it got fixed but there's code in there to check for a dotted quad now.
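The check being described is cheap to do up front. A rough sketch of the idea in Python rather than Java, with function names (is_ipv4_literal, resolve) made up for illustration; this is not Sun's code, just the same shortcut: if the string already is a dotted quad, skip DNS entirely.

```python
import ipaddress
import socket


def is_ipv4_literal(host):
    """True if host is already a dotted-quad IPv4 address such as 1.2.3.4."""
    try:
        ipaddress.IPv4Address(host)
        return True
    except ValueError:
        return False


def resolve(host):
    """Return an IPv4 address string, only consulting DNS for real names."""
    if is_ipv4_literal(host):
        return host  # nothing to look up, so no DNS timeout to wait for
    return socket.gethostbyname(host)


print(resolve("192.168.1.23"))
```

Only names that fail the literal check ever reach the resolver, which is what avoids the half-minute stall.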
I think you would have to try a second request to see the behavior described. The article is talking about reusing a connection that is left half open by the client. It should be the second connection that would exhibit the behavior.
--
"What do you want me to do? Whack a guy? Off a guy? Whack off a guy? Cause I'm married."
(assuming it used Connection: KeepAlive in the HTTP header)
Never assume.
This is the key to this whole mess. The dude who wrote the blog doesn't know about the KeepAlive HTTP header.
This is about the funniest thing i've ever seen on Slashdot.
Read the whole fucking RFC people.
The 4 steps when the server finishes sending data and closes the connection, from the article:
Client        Server
   <-- FIN
   ACK -->
   FIN -->
   <-- ACK
When the server has no more data, it sends FIN.
The server should not be allowed to send more data after the FIN! This is a violation of the TCP spec. Otherwise, how would clients truly know whether or not the server had more data to send?
TCP does support something called "half close". It is possible to indicate that you have no more data to send, but that you are still willing to receive data. This is why both sides must send FIN, in order to cleanly close the connection. If one side sends FIN but the other doesn't, the connection remains open, but data can only flow in one direction (sending from the side that did not send the FIN). This is useful for cleanly shutting down connections and making sure that both sides receive all the data they were expecting.
In the example from the article, when the client receives a FIN but does not send a FIN of its own, this is legal: the TCP connection now is one way, and data can only be sent from the client to the server. The server is not allowed to send more data. So, if IIS is doing this, it is breaking the spec. It is important to note that the client is doing nothing wrong in this case.
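Half close is easy to see from the sockets API. A minimal sketch using a connected local socket pair from Python's socket.socketpair(), which on most Unix systems is a Unix-domain pair rather than real TCP, so treat it as an illustration of the shutdown() semantics and not a packet-level trace. The function name and the payloads are invented.

```python
import socket


def half_close_demo():
    """One side stops sending; the other side can still send afterwards."""
    server, client = socket.socketpair()
    try:
        server.sendall(b"response")
        # "I have no more data to send" (what a FIN signals in TCP).
        server.shutdown(socket.SHUT_WR)

        received = b""
        while True:
            chunk = client.recv(1024)
            if not chunk:  # empty read: the peer has finished sending
                break
            received += chunk

        # The client never shut down its side, so this direction still works.
        client.sendall(b"still here")
        late = server.recv(1024)
        return received, late
    finally:
        server.close()
        client.close()


print(half_close_demo())
```

After the shutdown the "server" end can still read but can no longer write, which matches the point above: the side that sent FIN is the one that must stay quiet.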
Dr. Demento On The 'Net!
This has nothing to do with antitrust cases or any other legal action whatsoever. It is simply a case of Microsoft trying their best to improve their customers' web surfing a little bit provided that the server admin runs IIS. Sure they are bending the rules a bit but hey, they are in a position which allows them to do it and IE users are the majority anyway. What I think should be done is to add a similar behavior pattern to Apache and whatever other web servers you might be using. If some RFC disallows this, then another RFC should be written to allow it.
The secret to a successful
How about because they are not increasing the connection speed. They are causing more traffic, because like it or not IIS is not the majority server out there. Therefore, more often than not they are creating more traffic and slowing down your browsing, not speeding it up.
Any other questions?