Slashdot Mirror


How to Work Around Broken Port-80 Routing?

Dr. Zowie writes "My ISP places an opaque (intended to be transparent) web proxy between me and the rest of the world. It is causing me problems due to misconfiguration or misdesign. My question is twofold. On the micro level, what can I do in the short term to work around the broken routing (in the long term, I switch ISPs if it's not fixed)? On the macro level, what can we as a community do to prevent breakage of the net on a global scale by poorly designed routing hacks?"

Dr. Zowie continues: "I use a regional ISP with otherwise-very-good policies. However, they seem to be intercepting anything that comes from my home net on port 80, so that they can ``transparently'' cache web requests based on the payload of those packets. The proxy seems to work rather well in most cases: I never noticed it until I started using OpenNIC. Then I found that some web pages that should have resolved OK through the OpenNIC system failed even though routing on different ports worked OK.

"I did some experimentation using ``telnet'' on port 80 directly, and found that packets are being routed based only on the payload regardless of the original destination address: I can (for example) retrieve the Slashdot front page by using ``telnet www.google.com 80'' and asking for "http://www.slashdot.org http/1.1". The tech support folks seem to be stonewalling me: the main contact tells me that the behavior is "not broken" even though it clearly violates RFC 1812, the standard set of rules for IP routing.

"The practice of ``transparent'' proxy routing seems to be growing more widespread. It appears to break the internet standard in a way that works for most folks for now, but that breaks port 80 usage in general. Looking ahead, this breakage seems like a growing nightmare waiting to happen. At the very least, I expect more instances of my particular problem to appear as folks give up on the corporate hegemony of ICANN. More insidiously, transparent proxy routers break the layered nature of the internet protocol and restrict the flexibility that made it work in the first place. One would hope that such proxies would at least act like routers when the fancier proxying fails, but at least my ISP's doesn't. What about your ISP's?"

30 of 323 comments

  1. Use netcat... by samrolken · · Score: 4, Informative

    You can use netcat to route your own port 80 traffic. Simply get a good UNIX shell account, and configure your router to direct to that. It becomes a real version of what you would be trying to do. However, I would bitch like crazy if my ISP did anything like that to me. If I want to connect to port 80 on something, I would want to be connecting to such port 80. Any fiddling with it would sure make me drop that ISP in an instant.

    --
    samrolken
  2. Tunneling by Matthaeus · · Score: 3, Interesting

    I recently had this problem with my university account...They route all resnet web traffic through an old 386 proxy server that can't handle the load. Find a free proxy out there and SSH tunnel to it. I'm sure there are more elegant means of getting through a poorly configured proxy, but this'll work as a quick fix.

  3. Re:Use netcat... or someone else's proxy server by samrolken · · Score: 4, Informative

    I should have posted all this in one comment... oh well...

    You could also use a third party proxy server. You can find gobs of them here:

    http://tools.rosinstrument.com/proxy/

    and here:

    http://directory.google.com/Top/Computers/Internet/Proxies/Free/?tc=1

    --
    samrolken
  4. Sounds like what my college does by AntiNorm · · Score: 3, Informative

    Onenet is the internet "service" provider to most state agencies within Oklahoma, including Oklahoma State University, where I am currently working on a BSEE. Neglecting Onenet's other issues (AOL's netadmins could do a better job than Onenet's), they have a "transparent" web cache proxy. More often than not, errors fetching a web page come not from the browser or the site itself as they should, but from the proxy. DNS errors from the proxy are not uncommon. As for switching ISPs, I can't, which really sucks. But for what I can reach on the net, I'm still getting ultra-cheap broadband :P.

    --

    I pledge allegiance to the flag...
    of the Corporate States of America...
  5. same problem by babycakes · · Score: 3, Interesting

    We had pretty much the exact same problem with our ISP, in that if we sent HTTP requests out without any proxy configuration, they would often take a couple of times to get through, since our ISP's transparent proxying didn't work. However, on setting the browser's proxy settings to the proxy itself, this seemed to solve the problem since it would ask the proxy directly.
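
    For anyone who wants to try the same thing from a script rather than a browser, here is a minimal Python sketch of asking the proxy directly. The proxy address in the comment is a made-up placeholder; you would need whatever host and port your ISP's proxy actually listens on.

```python
import urllib.request

def opener_via_proxy(proxy_url: str) -> urllib.request.OpenerDirector:
    """Build an opener that sends plain-HTTP requests to an explicit proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)

# Example (hypothetical proxy address, not a real server):
#   opener = opener_via_proxy("http://proxy.example.net:3128")
#   page = opener.open("http://www.slashdot.org/", timeout=10).read()
```

    This only changes who the client talks to; whether it cures the retries depends on what is actually wrong with the interception.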

    Don't ask me why :)

  6. Education by radoni · · Score: 3, Interesting

    At my high school, the current system for blocking webpages was introduced as a means to cache commonly used pages and make the District 225 intranet faster. The superintendent and members of the district board know very little about computers, so naturally it was approved. After the Columbine incident, a new feature was tacked on that blocked certain objectionable web sites. The recent WTC attack caused even more areas of the net to be restricted. Today, when I want to search "terrorism" for a paper on the war in Afghanistan, my results are blocked. Teachers have informed us that we must use the one non-blocked computer in the tech room, or do research at home.

    My friend set up an anonymous web surfing proxy on his home computer, and using this I can get whatever I want.

    There are publicly available anonymous port-80 proxies still around.

    --
    SIGERR: laziness exceeds quota
  7. My Experience on ISP with faulty service by Jucius+Maximus · · Score: 5, Interesting
    I used to have an ISP that, although they allowed you to have your own site (on their webspace,) loading the site was just damn SLOW for anyone who tried. It was much faster if the pages were hosted somewhere on another continent compared to an ISP with a server in the same city.

    The thing is, they probably won't listen to problems like this, or your proxy issue in most cases. But I found a way to make them listen to you:

    Phone them up saying that you want to cancel the service. Mention something about their web hosting being broken. They will probably say that they will have a management person phone you back to confirm the process.

    When they do phone back, for me, the call was like "Hello, there was a call earlier about a slow connection?" And at this point you have someone on the line who is interested in helping you, has power in the organisation to really fix things (because they're management or a senior tech) and they want to get your issue fixed so they don't lose your business. And THIS is when you really try to explain what's going on.

    This was my experience. Perhaps it will work for you.

  8. Lots of ways to work around your ISPs. by BrookHarty · · Score: 5, Informative

    Proxy servers - They might not be caching 8080 or other proxy ports. Check http://tools.rosinstrument.com/proxy/

    Bouncers - You set this program up on an external server on a port that's not filtered. You just point your browser at this IP/port and you're outside your filtered ISP. Check www.freshmeat.net

    SSH - tunnel or route from an external box.

    Really, if you can't go through it, go around it, either with software or networking.
    -
    Well, if crime fighters fight crime and fire fighters fight fire, what do freedom fighters fight? They never mention that part to us, do they? - George Carlin

  9. OpenNIC by glasn0st · · Score: 3, Informative

    The poster mentioned that he used OpenNIC, which is an alternative DNS root. His requests are proper HTTP, but a transparent proxy that does not "see" domains in this namespace effectively blocks you from viewing webpages under those domains.

    His own box is properly configured to do OpenNIC lookups, but the HTTP request to the (proper) webserver gets intercepted. Now the proxy has to do the real HTTP request, but the proxy does not know about the alternative domains and probably returns a "Host not found" error.

    I haven't heard of free proxy servers supporting one of the alternative NICs and I doubt the ISP will be interested in subscribing to such a service. I guess the only solution will be to convince a friend to set up a proxy on a box someplace else.

    Some alternative roots have their own "real" Internet domain which acts as a gateway domain, for instance name.space has http://name.space.xs2.net/ (regular hostname) which enables non-subscribers to view http://name.space/ (namespace only), making the domains available globally. If OpenNIC provides such a service, an alternative solution could be to run some proxy at home and let it rewrite OpenNIC urls into "regular" URLs.
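
    A rough Python sketch of that rewriting step follows. The only mapping taken from this comment is name.space to xs2.net; the `GATEWAYS` table and the function name are hypothetical, and an OpenNIC entry could only be added if such a gateway domain really exists.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table: alternative-root TLD -> gateway suffix in the regular DNS.
# Only the "space" -> "xs2.net" pairing comes from the example above.
GATEWAYS = {"space": "xs2.net"}

def rewrite_url(url: str, gateways: dict = GATEWAYS) -> str:
    """Append the gateway suffix to hosts under a known alternative TLD."""
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        return url
    tld = host.rsplit(".", 1)[-1]
    suffix = gateways.get(tld)
    if suffix is None:
        return url  # ordinary hostname, leave untouched
    new_host = f"{host}.{suffix}"
    # Keep an explicit port if there was one (userinfo is dropped in this sketch).
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

    A home proxy would call something like this on each requested URL before fetching it, so the ISP's proxy only ever sees hostnames it can resolve.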

    --
    ( ^_^)/
  10. Their way or the highway by Lumpish+Scholar · · Score: 3, Informative

    (1) Line up a serious alternative ISP. Talk to their sales department; see if they do the same thing.

    (2) Talk to your ISP's sales department. Tell them your problem. Tell them you're ready to move. (Perhaps ask what the hit rate of the cache is, that is, if the overhead is worth it for them.) See if they offer any accommodation.

    (3) Go with the ISP that does what you want.

    If you're using them for DSL, you may not have a lot of choice.

    (As others suggested, if host resolution is your issue, you could run a local proxy on your 127.0.0.1 interface that converts host names into addresses.)

    --
    Stupid job ads, weird spam, occasional insight at
    1. Re:Their way or the highway by Jerf · · Score: 4, Informative

      (As others suggested, if host resolution is your issue, you could run a local proxy on your 127.0.0.1 interface that converts host names into addresses.)

      Unfortunately, that's not a complete solution. Example: Compare my home page versus the IP address that hostname resolves to.

      Lots of servers do this.

  11. Re:Wasn't port 80 supposed to be HTTP? by Jerf · · Score: 5, Insightful

    I reply to this because I bet a lot of people are going to think this.

    The real problem is that you're probably using port 80 for something other than its explicit purpose.

    No, that's not it at all. Follow the openNIC link.

    What he's trying to do is resolve an address, via the perfectly standard and normal DNS protocol, with an alternative root server. This is also perfectly standard and normal. This is not a violation of DNS, nor any other protocol, nor is it a particularly weird thing to want to do. (Unusual, but perfectly normal.)

    The problem is that his ISP is catching all traffic to port 80, and redirecting it to their proxy. Thus, when he asks for "http://www.something.nonstandardroot", the web proxy is interfering with the request (presumably after his home computer correctly resolved the DNS address of www.something.nonstandardroot), catching the GET part of the HTTP request, extracting the server name, and attempting on its own to resolve the name.

    (Note this is a complete waste: The home computer has probably already resolved the address, now the proxy will resolve it again.)

    Unfortunately, the proxy is too ignorant to know how to resolve the alternate DNS address. It's not incapable in the technical sense, it just doesn't understand root servers it's not configured for. The problem is that this means that the perfectly normal and acceptable HTTP request, for an HTML document, on an IP address the client computer has already perfectly normally resolved, gets lost, because the proxy doesn't know how to resolve the address. Bad proxy!

    A workaround, albeit a sucky one, is to resolve the address on one's home computer, then go to that IP address manually. This still causes problems on subdomain-aware webservers, where several domains or subdomains may all come from the same IP address, and the server wants to use the host part of the HTTP GET request to differentiate what to serve. (You could code up a quick Python/TK script to do this, but it'll still suck.)
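
    Something along the lines of that quick script, minus the TK part, might look like the sketch below. It assumes the local resolver is already set up for the alternative root, and the function names are just illustrative.

```python
import http.client
import socket

def resolve_locally(hostname: str) -> str:
    """Resolve with this machine's own resolver (assumed to know the alternate root)."""
    return socket.gethostbyname(hostname)

def fetch_by_ip(ip: str, hostname: str, path: str = "/", timeout: float = 10.0):
    """Connect to the numeric address but still name the site in the Host header."""
    conn = http.client.HTTPConnection(ip, 80, timeout=timeout)
    try:
        # http.client only adds its own Host header when none is supplied.
        conn.request("GET", path, headers={"Host": hostname})
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()

# Example:
#   ip = resolve_locally("www.something.nonstandardroot")
#   status, body = fetch_by_ip(ip, "www.something.nonstandardroot")
```

    Keeping the Host header is what lets a subdomain-aware server pick the right site. If the proxy in the middle insists on re-resolving that header, though, this will fail the same way, and putting the bare IP in the header brings the virtual-host problem right back.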

    So, when you say a proxy is not required to route anything anywhere, you've accidentally hit on the exact problem: a proxy shouldn't be routing, because it may not know how. This proxy tries to. That's why it sucks.

    And to cover the last part of your post, there's absolutely nothing non-standard about any of this, except the behavior of the proxy, which is the only thing in this whole mess that hasn't "embrace[d] the DNS standard, HTTP standard and the routing standard". ICANN's root servers are not written into RFC's. They are merely common practice, one that many people, probably correctly, believe is an increasingly dangerous common practice. (You may not completely agree, but the opinions deserve consideration.)

  12. Re:Use netcat... or your own proxy server... by khuber · · Score: 3, Insightful
    You might try something like the port + 65536 rule.

    How could a number outside 16 bits make it to a router since TCP only holds 16 bits for ports? If you wrap around to 80, you have 80, not 65616.
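
    To make the arithmetic concrete, here is a small Python sketch (helper names are mine) showing that a 16-bit port field cannot carry 65616, and that wrapping it just gives plain port 80 again:

```python
import struct

def wrap_port(port: int) -> int:
    """Keep only the low 16 bits, which is all the TCP header has room for."""
    return port & 0xFFFF

def pack_port(port: int) -> bytes:
    """Encode a port as the 16-bit big-endian field used in the TCP header."""
    return struct.pack("!H", port)  # raises struct.error above 65535
```

    `wrap_port(65616)` comes out as 80, and `pack_port(65616)` refuses outright, so there is no way for a router to ever see the larger number on the wire.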

    -Kevin

  13. Look At It From the ISP's Standpoint by ocip · · Score: 5, Insightful

    If you look at it from your ISP's standpoint transparent proxies aren't as evil as you make it sound.

    99.9% of the ISP's clients aren't trying to do anything tricky, like this. Of those 99.9%, say, only 40% have a proxy server specified. These 40% get to enjoy faster web browsing--which is probably all they're doing anyway. The other 60% enjoy slightly less quick web browsing, but that's their own fault, right? They're the only ones losing out, right?

    Wrong. The ISP has to pay for bandwidth. The ISP doesn't like the proxy only because it makes browsing snappier, it likes the proxy because it also saves them on bandwidth costs! If the other 60% of the clients were using the proxy they might save 10%, or more, on total bandwidth costs.

    You could think of it like this, too: that's 10% more bandwidth available for the clients at no additional cost to the company (apart from the capital for the proxy server). Yes, they're not perfect, but they make a difference. When you weigh the pros and cons, well, it's obviously going to be worth it for the ISPs to have it installed.

    You could look around for an ISP that doesn't use a transparent proxy but, as you said, they're becoming more popular. Realise that they're not doing it to squash your freedom, but instead to provide better service and to save money.

    1. Re:Look At It From the ISP's Standpoint by jdavidb · · Score: 3, Insightful

      I agree with everything you say; proxy servers are a great thing for all involved and not a threat to freedom.

      But the problem is that this proxy server doesn't work right. My browser should look up the IP corresponding to the site, send a request on port 80, and get the response. In this case, it looks like the proxy is insisting on doing the lookup part, and so the user effectively can't change his DNS.

    2. Re:Look At It From the ISP's Standpoint by Anonymous Coward · · Score: 4, Insightful

      1. An HTTP proxy server is not a router.

      2. What is happening is that your *default gateway* (which really IS a router) is redirecting packets bound to port 80 to the proxy server. Your default gateway is doing the routing, NOT the proxy server. (Linux does a nice job at transparent proxying, btw.)

      3. The proxy server then tries to resolve the domain name using DNS.

      4. The DNS server the proxy server is configured to use, not knowing anything about these funky TLDs you're trying to access, can't find it. It tells the proxy server so.

      5. The proxy server comes back and gives you a nice, friendly error message telling you it can't resolve the host name.

      Look...transparent proxying is to bandwidth what NAT is to private networks. It works, it works very well, it's in widespread use (getting wider every day, probably), and it's here to stay. If you really want to do something constructive to solve your problem, ask your ISP to configure their DNS to resolve the OpenNIC TLDs. They're a lot more likely to do that than they are to stop using transparent proxying (I know I would be).

    3. Re:Look At It From the ISP's Standpoint by evilviper · · Score: 3, Insightful
      99.9% of the ISPs clients aren't trying to do anything tricky, like this.


      Well, in that case, they can stop supporting anything but windows, since it has a clear majority. Oh, and you can't use anything but IE since it's got a majority as well.

      The problem is that I don't pay for 'a service that allows me to view most web sites'. Rather, I pay for an 'internet service'. If anything that should work, doesn't, then they are violating their end of the contract... Not to mention probable false-advertising, etc.

      If it costs them 10% more bandwidth for those who choose not to use their optional proxy, then they should charge the customers 10% more.

      How about if the USPS decided to crush every package by 1cm because then they can fit more packages in each plane/truck. Besides, 99% of people have at least 1cm of padding to protect the package contents anyhow.

      It's exactly the same thing. Doing something that doesn't hurt too many people, in exchange for more profit. The fact that most people aren't going to be negatively affected doesn't make it right, or legal for that matter.
      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
  14. ISPs required by law to block port 80 in Singapore by tangent3 · · Score: 4, Interesting

    Here in Singapore, ISPs are required by law to block port 80, forcing all outgoing http requests to go through a proxy server (which filters out webpages which are deemed unsuitable for Singaporeans to view, including www.playboy.com), or to have a transparent proxy server blocking out such requests.

    This has caused me many problems before, when my IP gets determined wrongly by the remote site (which naturally takes the proxy server's IP for my IP address). Some applications don't like the transparent proxy either, for example FrontPage Extensions (not my choice to use!), and an autopatching program which refused to download the latest version of a file, insisting on downloading only the file cached in the proxy server until the cache gets flushed.

    The only real method of bypassing the proxy is to use another proxy server (since 8080 isn't blocked) outside the ISP's network. This tends to be really slow though.

    I guess I have to live with this until the government one day realises that proxy servers cannot stop the people from viewing pr0n, and it's probably not worth maintaining the proxy servers to meet the demands of all the net users in Singapore, not to mention maintaining the list of sites to block.

  15. corrections, suggestions, etc by MattW · · Score: 5, Informative
    First of all, the phrase "routing" is a misnomer. Web caching is something that happens on the application layer of the OSI model, layer 7, whereas "routing" refers to layer 3, which supplies IP routing for the TCP/IP protocol suite. What's broken is their caching, their cache server, or their proxying; pick a term.

    Second, there's a lot of ways around it which involve tunnelling.

    Tunnel to another box running a non-broken web cache. I used to tunnel my http traffic through ssh to my colocated boxes, which ran adzapper, and proxied through that.

    Tunnel at the IP layer by running any IP-in-IP encapsulation. If you have some version of windows, for example, you might convince someone with a server to run a PPTP server for you somewhere and you could tunnel through that. There are even Free PPTP Servers for Linux available to help.

    Find someone who runs a little proxier for their own net with socks, and bounce off their socks proxy. Someone you know on another ISP probably has Wingate or the like running, and if they allowed it (and on some older versions, it will permit this by default), you could set your browser's SOCKS settings to bounce off their proxy server, and since SOCKS isn't on port 80, your ISP will probably ignore it.

    There are also a number of things you might discuss with your ISP to resolve the issue.

    Suggest that they switch to a less broken cache server. (Squid, anyone?)

    Suggest that they exempt you specifically from the cache server by telling it to ignore your ip address.

    Note that they have an obligation to make sure their caching software doesn't interfere with your browsing; so it will be necessary (and not cost-effective for them) for you to call for every problem you notice.

    Obviously, you'll need to probably speak to a whole number of supervisors, and probably eventually get transferred to a "real engineer", and they will probably hack in a fix (like exempting you only) rather than truly deal with the problem.

    If all else fails, then you may want to try issuing ultimatums, like, "If you can't fix this problem, then you can cancel my service." Tech support people are lazy, however, in some cases, and may just opt to cancel you. This is a harsh reality in the world of consumer bandwidth -- and it will be worse, soon, with bells closing their DSL lines to competition, meaning unless someone else builds a telephony infrastructure to you, you'll probably pick Cable vs 1 DSL provider, and if you don't like something at either of them, you're just out of luck.

    1. Re:corrections, suggestions, etc by Phroggy · · Score: 5, Interesting

      Tech support people are lazy, however, in some cases, and may just opt to cancel you.

      Au contraire. Tech support people are tired of listening to customers whine about problems that tech support people cannot fix. If customers have unreasonable expectations, and refuse to listen to us, it's far better for the company if they just cancel service and go elsewhere (becoming somebody else's problem).

      Also, nonchalance about canceling service is sometimes the best way to make customers understand that we really are doing our best to help them, and we're not just blowing them off. Sounds weird, but here's an example:

      Customer has a problem with their DSL service. We've identified that the problem lies with the phone company. Phone company has given us a commit date of Tuesday by end of business day for repair to be complete. For whatever reason, the customer feels like they've been dragged around, and their service isn't getting fixed. Customer says if they're not up and running by 9:00am Monday morning, they're cancelling service.

      Customer expects us to bend over backwards to get them up and running by 9:00am Monday morning. We can't. There is absolutely nothing we can do. It's out of our hands. Customer needs to understand this. Customer will have the same problem at any competing DSL ISP, but we're the ones who have identified the problem and are getting it fixed.

      We respond by repeating to the customer that we have been given a commit time of Tuesday by end of business day, but that we cannot guarantee that the issue will be resolved by then. We then offer to the customer that if this is unacceptable and they'd prefer to cancel service, although we'd hate to lose them as a customer, we'd be more than happy to transfer them to someone who can take care of that.

      This has the effect of making it clear to the customer that we really mean what we say. Usually, they shut up, keep their account, and let us do our jobs. Often, they'll ask to be transferred to get the account cancelled, then hang up during the transfer.

      The alternative is to offer the customer incentives to try to convince them to stay with us, such as offering a free month of service, or a credit on their account. This costs us money, and gains nothing - if the customer has the expectation that we're willing to give him free service, he'll try to take advantage of it in the future. Far too many ISPs have failed for this very reason.

      At the last few ISPs I've worked for, nearly all my coworkers have been genuinely interested in helping the customer, and we've been fortunate to have management that allows us to do so. I understand that at some companies this is not the case; those are obviously the ones to avoid.

      Sorry for ranting. Getting back on track: ultimatums like "if you don't fix this problem, I'll cancel my service" sometimes are a good idea. That will tell you whether or not you can get the issue resolved. Be prepared to actually cancel, because if they can't resolve the issue, that's what will happen. If they can but just don't want to, threatening to cancel may just be the incentive they need to get it done.

      --
      $x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
      $x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;
  16. What brand of transparent proxy is it? by billstewart · · Score: 3, Interesting
    Do you know what brand of attempting-to-be-transparent proxy cache server they're using? Proxy caching is an important performance enhancer for ISPs, corporate firewalls, and other bottleneck network environments, and "transparent" proxies are less trouble for the ISP and for the users as well (especially since many users wouldn't bother configuring their browsers for them unless either they're pre-configured by the ISP or forced to use the proxy by firewall rules that block non-proxy access.)

    Of course, the problem with transparent servers is when they're not, and your ISP seems to have one that isn't. Is it possible to find out what kind it is, either by telnetting to the thing and looking at headers or by asking the ISP, and can you do bug reports to the vendor to get them to fix their product?
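
    If you capture a raw response (from telnet or a script), a few headers often give the box away. Below is a small Python sketch that pulls them out; `Via` and `Server` are standard HTTP headers, while `X-Cache` is a convention some caches such as Squid use, so treat the header list as an assumption rather than a guarantee.

```python
# Header names that tend to identify an intermediary (the list is an assumption).
INTERESTING = ("via", "server", "x-cache")

def proxy_hints(raw_response: bytes) -> dict:
    """Return the identifying headers found in a raw HTTP response."""
    head = raw_response.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    hints = {}
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        key = name.strip().lower()
        if sep and key in INTERESTING:
            hints[key] = value.strip()
    return hints
```

    A proxy that is trying hard to be invisible may add none of these, in which case asking the ISP is the only route left.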

    --

    Bill Stewart
    New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
  17. This saves LOTS of bandwidth by theCoder · · Score: 4, Insightful

    My college has a similar set up because it saves an incredible amount of bandwidth. It's not to be mean, or malicious, or spy on your browsing habits, it's just to save bandwidth. And it does (I wish I had numbers to back this up, but I don't run the proxy).

    There have been problems with the proxy in the past (it not returning any data) and there are still some minor issues, but on the whole it works well (in that you don't ever notice it).

    It sounds like the ISP in question has a bug in their web cache code. If the web cache doesn't have the particular URL cached, it forwards the request to the intended destination. I'd bet it's trying, but it can't look up whatever OpenNIC URL is being specified (because it doesn't use OpenNIC). The ISP really should report this bug to the manufacturer.

    My advice is this -- get the ISP on your side to fix the problem. They won't remove the proxy, and they shouldn't have to if the bug is fixed.

    --
    "Save the whales, feed the hungry, free the mallocs" -- author unknown
  18. AOL ignores ports by Anonymous Coward · · Score: 4, Interesting

    AOL's transparent proxy is a little worse. It ignores the port and proxies anything that looks like HTTP. Of course, they deny having a transparent proxy, but I was able to watch packets leaving our network headed for AOL and then watch altered packets come back from AOL.

    I stumbled across this when their proxy had some trouble with the cookies we were using and suddenly no one on AOL could use our service. A few minutes later they could again. Then they could not. During this time, I was running a packet logger on the outgoing traffic from our server and on the incoming traffic to a workstation I had connected to AOL. Everything worked fine until the server sent the cookie. Then AOL suddenly stopped sending more packets. This occurred on every port I tried, even ports reserved for other services.

  19. The behavior is correct. by xanthan · · Score: 5, Informative

    The web cache is exhibiting correct behavior. When a forward proxy cache (transparent or not) gets a request in the form of GET http://www.site.com/ http/1.1, it will use the www.site.com address instead regardless of what original dns name you went to (www.google.com in your example). In the transparent case where the GET statement looks more like GET /content.html http/1.1, it will use the original destination address.
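
    As a minimal Python sketch of that rule (the function name is made up, and real caches do a good deal more), the choice of where to fetch from depends only on the shape of the request line:

```python
from urllib.parse import urlsplit

def pick_upstream(request_line: str, original_dest: str) -> str:
    """Choose the host to contact for one request line.

    original_dest is the address the client's packets were really sent to.
    """
    method, target, version = request_line.split()
    if target.lower().startswith("http://"):
        # Absolute URL in the request line: the URL's host wins.
        host = urlsplit(target).hostname
        if host:
            return host
    # Plain path: fall back to where the client was actually connecting.
    return original_dest
```

    With the submitter's telnet test, `pick_upstream("GET http://www.slashdot.org/ HTTP/1.1", "www.google.com")` gives `www.slashdot.org`, while a request for `/content.html` stays with `www.google.com`.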

    In other words, it's your client that's broken. See RFC 2616 for details.

    The unfortunate truth is that more often than not, sites simply don't set their cache controls correctly. They forget that caches don't exist just on the server side but that they exist on the client side as well. Section 13 of RFC 2616 explains how they work in great detail and it really should be mandatory reading for any site administrator.

    If you're still looking for more information on web caching, check out Content Delivery Networks by Scot Hull. It was just released and is available on Amazon. There is an enlightening section on web caching that should clearly explain why what you're seeing is in fact correct behavior.

    1. Re:The behavior is correct. by GigsVT · · Score: 3, Insightful

      Well, yes and no, how could a proxy work with non-ICANN roots?

      It will try to resolve the address in the GET line, and fail, because it doesn't know about other TLDs.

      The only way to fix this broken proxy behavior is to have it ignore GET lines that it can't resolve, and instead forward the request intact to the IP address.
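
      In Python-ish terms, the fallback could be as small as the sketch below. The `resolve` callable stands in for whatever DNS lookup the proxy does and is purely hypothetical; it is assumed to return None when the name is unknown.

```python
from typing import Callable, Optional

def choose_target(host_name: str,
                  original_ip: str,
                  resolve: Callable[[str], Optional[str]]) -> str:
    """Use the proxy's own lookup when it works, else the client's destination."""
    resolved = resolve(host_name)
    if resolved is None:
        # The proxy's DNS has never heard of this name: trust the client's IP.
        return original_ip
    return resolved
```

      With a toy lookup table that only knows www.slashdot.org, a name under a non-ICANN TLD falls through to the address the client originally asked for instead of producing an error page.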

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
    2. Re:The behavior is correct. by Dr.+Zowie · · Score: 4, Informative
      Yep, the cache is behaving correctly for a cache. The problem is that it's behaving incorrectly for a router, because I can't send the http: requests I want to the hosts I want to send them to.

      I'm not familiar enough with the ins and outs of cache design to know whether RFC2616 is designed primarily for ``transparent'' or ``selected'' proxies, but using a DNS resolution on the destination host seems to break the layered structure of the IP stack. In this case, packets that I've (layer 3) addressed to a specific host never get there, because (layer 4) they're being directed to another machine based on port, and the other machine (the cache) is routing them based on a name (layer 7) contained in the packet payload.

      That is acceptable behavior for a proxy to which I'm explicitly routing my http requests, but not for a router down which I'm sending port-80 IP packets.

  20. It's in the layers. by Bender+Unit+22 · · Score: 4, Informative

    Normally what you do is layer 4 switching, but note that you can do switching on layer 7 as well, which means you can have the switch do URL-based switching so that a part of the URL determines whether it should get switched. This requires much more power and is mostly done for server switching like load balancing.

    What happens in your case might be that they have placed a switch that can do at least layer 4 switching, between you and the internet.
    What then is done is that all port 80 requests coming from the client's side (you) are redirected to the proxy, which means that HTTP requests on other ports will not be cached. Note that anonymous ftp can also be proxied.
    A "clever" proxy/switch solution can do IP spoofing so the webserver gets your IP address and sends the response back to you directly, but as there is a switch in between, it redirects the response to the proxy, which then sends it back to you.

    A way to avoid it is to get a gateway somewhere that can channel your HTTP traffic; you could set your browser to use this gateway as a proxy on any port. The switch will most likely not act on the traffic coming on this port and pass it through.

    The easy way would be installing a proxy server on a box that you have access to on the outside and configure it so that it won't cache anything.
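
    As a rough sketch of the client side of that workaround, here is how a script (rather than a browser) could be pointed at such an outside proxy using Python's standard `urllib.request`. The host name `gateway.example.net` and port 3128 are placeholders, not a real service, and the helper names `proxy_url` and `opener_via` are made up for this example.

```python
import urllib.request


def proxy_url(host: str, port: int) -> str:
    """Return the URL of an explicit HTTP proxy on a non-intercepted port."""
    if port == 80:
        # Port 80 is exactly what the ISP's switch redirects, so it
        # would defeat the purpose of the workaround.
        raise ValueError("pick a port other than 80 for the outside proxy")
    return f"http://{host}:{port}"


def opener_via(host: str, port: int) -> urllib.request.OpenerDirector:
    """Build an opener that sends all http:// requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url(host, port)})
    return urllib.request.build_opener(handler)
```

    With that, `opener_via("gateway.example.net", 3128).open("http://www.slashdot.org/")` would make the TCP connection to the gateway on port 3128, so a switch that only matches port 80 should leave it alone.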

  21. Re:Wasn't port 80 supposed to be HTTP? by Skapare · · Score: 3, Insightful

    If you connect to a specific IP address, a transparent proxy should connect to that very same IP address. If it connects to any other for any reason, it is applying a sort of "routing" logic. Apparently what happens is that, because the client includes an HTTP version 1.1 "Host" header, the proxy prefers to do a DNS lookup on the hostname given, and (if it finds it) connects there instead of to the client's original destination IP address.

    This is broken. If the proxy has a different idea of what domain names mean, it gets the wrong web site, or perhaps fails to get one at all. A correct transparent proxy implementation should always connect to the very same IP address the client tried to connect to without regard to the "Host" header (which must also be passed along). A DNS lookup can still be done to optimize the cache. If the destination IP address is in the list of A records from the DNS query, then it can simply be matched to the cache by name alone. However, if the IP address does not match any that DNS gets, then those pages can still be cached, but they must be cached under the tuple of both the destination IP address and the "Host" header name together (as this content can be different than any other for the same host name or the same IP address).
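
    The caching rule described above can be sketched as a small Python function. This is only an illustration of the idea, not how any particular proxy does it; the name `cache_key` and the tuple representation of the key are assumptions made for the example.

```python
from typing import Iterable, Tuple


def cache_key(dest_ip: str, host_header: str,
              a_records: Iterable[str]) -> Tuple[str, ...]:
    """Pick the cache key for a transparently intercepted request.

    dest_ip     -- the address the client actually connected to
    host_header -- value of the request's "Host" header
    a_records   -- addresses the proxy's own DNS returns for that host
    """
    host = host_header.strip().lower()
    if dest_ip in set(a_records):
        # The client's destination agrees with the proxy's DNS view,
        # so the entry can be shared by name alone.
        return (host,)
    # The client resolved the name differently (or typed an IP):
    # keep the entry separate under both IP and name.
    return (dest_ip, host)
```

    In either case the upstream connection would still go to `dest_ip`; the DNS lookup only decides how the cached copy is filed.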

    Maybe someone can provide a list of which transparent proxy cache programs do it wrong, and which do it right (as I have not examined these programs). I don't know if peakpeak.com will change out the software once they find something that does it right (or even make a configuration change if it turns out that's all that is needed). Ironically, if you find an outside proxy server which can do it right for you, you could connect directly to that service via a different TCP port and end up defeating the efforts of your ISP to save upstream bandwidth by caching.
    --
    now we need to go OSS in diesel cars
  22. Re:Hold on here! by Skapare · · Score: 4, Interesting

    If the user configured his browser to use a specific proxy, then I would agree with you regarding RFC2068. The client in essence is delegating DNS responsibility to the proxy server. However, what is happening here is called transparent proxy. There is no DNS delegation taking place. And RFC2068 requires that semantic transparency be preserved (although it does not seem to differentiate types of proxies). It says:

    semantically transparent
    A cache behaves in a "semantically transparent" manner, with
    respect to a particular response, when its use affects neither the
    requesting client nor the origin server, except to improve
    performance. When a cache is semantically transparent, the client
    receives exactly the same response (except for hop-by-hop headers)
    that it would have received had its request been handled directly
    by the origin server.

    In this case the origin server would have delivered a web page (I actually tried it and it works fine for me), and so the proxy has the responsibility to deliver the same thing. In that, it seems, it failed.

    --
    now we need to go OSS in diesel cars
  23. It's working correctly by davew · · Score: 3, Informative

    I see what you mean. You are sending traffic to a particular address based on your own DNS resolution, and if the traffic is proxied, you want it to be sent to your chosen destination, not that of the proxy.

    In my opinion, the ISP is exhibiting correct behaviour.

    Picture this: the object of the exercise with the transparent proxy is to cache pages and increase speed for the customer, right? I think it's already been agreed earlier in the thread that this is not entirely evil.

    Let's say the proxy honours the destination IP address that you chose (I'm not sure how this would work in practice, but I'll go with it for now). It returns the web page from the server that your DNS picked, and caches it for the next guy.

    Another customer requests a page with the same name. What if they're using a DNS root where the answer conflicts with yours? The customer gets the "wrong" web page. Because cached objects eventually expire, this means that the customer might get a completely different site depending only on the time and date they happened to request it.

    The ISP doesn't use the same DNS root you do, so they can't begin to troubleshoot the problem.

    I concede that the popular "alternate" DNS roots have few enough conflicts with the IANA-assigned roots at the minute, but even that is an irrelevancy - any solution that allows a customer to choose destination IP address on behalf of other customers opens up the ISP to a denial of service attack by a user less trustworthy than you or I. One could set up an arbitrary "root" server that resolves www.yahoo.com to my own site. Or google. Or some site that accepts credit card orders.
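
    To make the scenario concrete, here is a toy simulation in Python. It assumes a cache that honours the client's chosen destination IP but files pages by host name only; the class name `NameKeyedCache`, the addresses, and the page bodies are all invented for the example.

```python
class NameKeyedCache:
    """Toy cache: fetches from the client's chosen IP, but keys by name only."""

    def __init__(self, origins):
        # origins maps (ip, host) -> page body and stands in for real servers.
        self.origins = origins
        self.store = {}

    def fetch(self, dest_ip, host):
        if host in self.store:
            # Cache hit: the destination IP of this client is never consulted.
            return self.store[host]
        body = self.origins[(dest_ip, host)]
        self.store[host] = body
        return body


origins = {
    ("192.0.2.10", "www.example.com"): "real site",
    ("203.0.113.66", "www.example.com"): "impostor site",
}
cache = NameKeyedCache(origins)

# A customer whose private "root" points the name at 203.0.113.66 goes first...
first = cache.fetch("203.0.113.66", "www.example.com")
# ...and an ordinary customer asking for the usual address gets the same copy.
second = cache.fetch("192.0.2.10", "www.example.com")
```

    Here `second` ends up being the impostor's page even though that customer connected to the ordinary address, which is the exposure I mean.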

    I can't see any scalable way out of this without the ISP picking one root, and sticking with it. If that is so, then I think this is a fundamental problem with split roots and, if you really want to use them, be fully aware of what you're getting yourself into. Turning off the transparent proxy will help this time, but you won't be able to rely on being able to talk to any server on the internet that doesn't use the same root as yours, even the servers you don't (usually) need to know exist.

    Regards,
    Dave