Transparent Web Caching Patented

Posted by simoniker on Wednesday June 25, 2003 @11:32PM from the time-to-get-the-checkbooks-out dept.

JohnQPublic writes "BIND author and all-around Internet personality Paul Vixie and Mirror Image Internet have recently received US patent 6,581,090, specifically '..technology that efficiently stores and retrieves content requests and balances Web traffic between origin servers to improve performance and speed' - sounds an awful lot like what Akamai do. There's a press release from last week that gives some lovely 'details', including this little gem from CEO Alexander M. Vik: 'We anticipate that these patents and our technology solutions will encourage large groups of corporations to become customers of Mirror Image services. We also recognize that this technology is a critical component of other content delivery services and we'll be attempting to work cooperatively with our competitors and their customers to address this issue.' Can you say 'patent infringement suit'?"

...for the lazy by Anonymous Coward · 2003-06-25 23:42 · Score: 5, Informative

The patent
1. Re:...for the lazy by Syre · 2003-06-26 01:41 · Score: 4, Informative
  
  Oops.. that is, they filed in Sept. 1997, and I was using Squid before that.
  
  Here's much of the early revision history of Squid.
  
  Version 1.0beta1 was April 19, 1996, and that was based on Harvest which was even earlier.
2. Re:...for the lazy by yandros · 2003-06-26 03:31 · Score: 2, Informative
  
  Simply using Squid will not (necessarily) provide examples of prior art -- the patent (thankfully) covers something more specific than just `using squid'.
  
  Someone certainly could use Squid as part of a system to do what the patent claims, and I suspect that some people were. The average Squid user was NOT engaged in this sort of activity, however.
3. Re:...for the lazy by divisionbyzero · 2003-06-26 03:40 · Score: 2, Informative
  
  Ommm... Not to be rude, but don't you think they might have heard of squid before? It's not like it is some rare, esoteric technology. I'm sure the folks at MII are more than well aware of squid and its implication for their patent claim. Akamai also used squid at one time in conjunction with their penguin boxes. Anyone who has spent five seconds googling "web caching" knows about squid.
Write to your European Member of Parliament NOW! by chrestomanci · 2003-06-25 23:42 · Score: 5, Informative

This is an Excellent example of why software patents are bad

Right now, if you are a European citizen, like I am, then Write to your European Member of Parliament (MEP), and tell them that you think software patents are a bad thing, and that they should vote against them on June 30th.

The forthcoming European vote was covered here on slashdot a few days ago, but did not make the front page, so did not get much coverage.

You can find a list of European Members of Parliament here. To avoid annoying them, do write to your MEP, not to a party leader. If you have several, please take a look at which issues they cover, and choose the one that takes an interest in trade/technology etc.

Remember, Write NOW! we don't want this sort of cr*p in
Mirror Image is not Akamai by Sique · 2003-06-25 23:43 · Score: 5, Informative

Having worked with Mirror Image I have to say that the way Mirror Image is doing the caching differs strongly from Akamai's.

While Akamai is putting cache servers in many IP providers' locations (I think more than 5000 so far), Mirror Image is concentrating its caches in about 20 locations connected to the big exchange and peering points. The Mirror Image presenters were explicitly stressing this point and that this other approach is the key to Mirror Image's success. So I guess the patent covers the Mirror Image Way Of Doing Things rather than the idea to cache websites to speed up transfer rates.

--
.sig: Sique *sigh*
This is not "Transparent Web Caching" by tyagiUK · 2003-06-25 23:50 · Score: 5, Informative

The generally accepted term for this type of technology is "Content Distribution Networking" or "Content Delivery Networking". Akamai, Speedera, Digital Island etc. are Content Distribution companies which will (according to the necessary commercial agreements), take a customer's content and distribute it around their overlay CDNs. Generally speaking, these CDNs overlay the traditional Internet using co-located space in customer or exchange point datacentres. There are, however, some CDN organisations who take the approach of building their own infrastructure.

"Transparent Web Caching" on the other hand is generally a term applied to the transparent redirection of TCP port 80 IP traffic on access equipment through a set of HTTP proxy devices. This technique is used by many ISPs to force users to use their Webcaches even if the user thinks they are being clever by disabling the pre-defined HTTP Proxy settings in their Web browser.

Until recently, you could build your own CDN ($$$) using software from people such as Inktomi; you can still use devices from other manufacturers such as Network Appliance or Cisco Systems.
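
To make the transparent redirection of port 80 traffic described above a little more concrete: a browser configured to use a proxy sends the full URL in its request line, whereas an intercepted browser sends only a path, so the interception proxy has to rebuild the URL itself. The sketch below is illustrative only. The function name `intercepted_url` and the `original_dest` parameter (the address the client was really connecting to) are assumptions for this example, not anything taken from an ISP's actual setup, and it assumes a request line of the form "METHOD target VERSION".

```python
def intercepted_url(raw_head: bytes, original_dest: str) -> str:
    """Rebuild the absolute URL for a request captured by port-80 redirection.

    raw_head      -- request line plus headers, as read from the client
    original_dest -- hypothetical: the IP the client originally dialled,
                     used when the client sent no Host header
    """
    lines = raw_head.decode("latin-1").split("\r\n")
    method, target, _version = lines[0].split(" ", 2)

    # A browser with explicit proxy settings already sends an absolute URL.
    if target.startswith("http://"):
        return target

    # An intercepted browser sends only a path; look for a Host header.
    host = original_dest
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            host = value.strip()
            break
    return f"http://{host}{target}"
```

Falling back to the original destination address matters for older clients that omit the Host header entirely.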

--
Contribute to the online videogame encyclopedia: GamerWiki
Filed in 1996? by pgregg · 2003-06-25 23:51 · Score: 3, Informative

From the article: "Mirror Image developed the transparent Web caching patent in 1996"

From Mirror Image's "About Us":
1997: Mirror Image Internet Inc. is founded.

The earliest date on the Patent itself is September 30, 1997.

IIRC Squid also was around in '97.

The exact dates will be interesting.
1. Re:Filed in 1996? by mpsmps · 2003-06-26 00:19 · Score: 3, Informative
  
  This is a quirk in US patent law. Prior art needs to exist a year before the filing date to invalidate a patent, provided the patent holder can provide evidence that they developed the technology before then.
  
  By contrast, non-US patents can be invalidated by any art prior to the filing date.
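
The date arithmetic behind that comparison can be sketched as follows. This only encodes the rule as stated in the comment above, not patent law itself, and the function names are made up for illustration. The two dates are the ones quoted elsewhere in this thread: the earliest date on the patent (September 30, 1997) and Squid 1.0beta1 (April 19, 1996).

```python
from datetime import date


def one_year_before(d: date) -> date:
    """Same calendar day one year earlier (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        return d.replace(year=d.year - 1, day=28)


def predates_us_window(art_date: date, filing_date: date) -> bool:
    """US rule as described above: art must be at least a year older than the filing."""
    return art_date <= one_year_before(filing_date)


def predates_filing(art_date: date, filing_date: date) -> bool:
    """Non-US rule as described above: any art before the filing date counts."""
    return art_date < filing_date


FILING = date(1997, 9, 30)      # earliest date on the patent, per this thread
SQUID_BETA = date(1996, 4, 19)  # Squid 1.0beta1, per this thread
```

With those dates, `predates_us_window(SQUID_BETA, FILING)` is true, since April 1996 is more than a year before September 1997. Whether Squid actually did what the claims describe is a separate question.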
I think AOL might have been first by dschuetz · 2003-06-25 23:53 · Score: 3, Informative

They might want to watch out, because from what I understand AOL has the world's largest internet cache system (all running Linux, actually). And I'd bet that it's been in place since before 1996.
Re:Dammit by tomstdenis · 2003-06-26 00:19 · Score: 3, Informative

You think computers are the only area of patent crapola? Try looking up "multiplication" in the USPTO website thingy... You'll find tons of patents for blatantly obvious [to a math nerd] algorithms [I've even seen Karatsuba's 1962 multiplication algorithm patented].

I imagine the same shit happens in other fields.

The problem with patents isn't the law. Isn't the idea of patents. It's the enforcement. Too many people filing too many patents has caused the patent office to stop caring whether the patent is valid or not.

What I think would be fun though is up to a $100K fine for patents which can be proven to be blatant rip offs, fakes or incompletes [e.g. patents on things not yet invented fully just to stifle competitors].

Then you will see companies like this really feel some pain.

To make it even more fun, whoever can prove the patent is a ripoff gets 10% of the fine. Make it a sport for the average citizen!

Tom

--
Someday, I'll have a real sig.
Re:squid by wfrp01 · 2003-06-26 00:24 · Score: 4, Informative

Oops is a more than worthy alternative, that was developed outside the US. I'm not sure how patent law applies in such a situation.

--

--Lawrence Lessig for Congress!
15 minute Patent Summary & Analysis by aeaeae · 2003-06-26 00:35 · Score: 4, Informative
The patent is at Delphion (free registration required) and the USPTO. Paul Vixie is listed as an inventor but probably has no ownership rights, or even the ability to collect on royalties. So don't lynch him yet...
The first base (or independent) claim is:
1. A method for transferring information via the Internet, comprising the steps of:
  - intercepting a message from an Internet user directed to a content provider address;
  - determining whether or not the message is an information request;
  - sending the message to the Internet without being affected if the message is not an information request;
  - determining whether or not said information request relates to a content provider address having a corresponding alternative address, said alternative address providing at least part of the information provided at said content provider address; and
  - directing said information request to said corresponding alternative address, if existing, or sending said information request to the Internet without being affected, if not.
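Read as a decision procedure, the steps of claim 1 come out roughly like the sketch below. This is one reader's interpretation, not the patented implementation: the `Message` type, the `ALTERNATIVES` table and the example addresses (taken from the documentation ranges) are all hypothetical, and how a message gets classified as an "information request" is deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Message:
    dest_addr: str         # content provider address the user targeted
    is_info_request: bool  # classification step is not modelled here


# Hypothetical table: content provider address -> alternative address
ALTERNATIVES: Dict[str, str] = {"192.0.2.10": "198.51.100.5"}


def route(msg: Message, alternatives: Dict[str, str] = ALTERNATIVES) -> str:
    """Return the address an intercepted message is forwarded to."""
    # Not an information request: send it on unaffected.
    if not msg.is_info_request:
        return msg.dest_addr
    # Information request: use the alternative address if one exists,
    # otherwise send it on to the original destination.
    return alternatives.get(msg.dest_addr, msg.dest_addr)
```

For example, `route(Message("192.0.2.10", True))` yields the alternative address, while the same destination with `is_info_request=False` passes through unchanged.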
Doesn't sound much like my understanding of how Akamai works (I didn't think Akamai "intercepted" requests -- the origin servers actually pointed to the cache servers in their img src tags). It does sound an awful lot like a transparent proxy however.
There's 36 claims, but only 3 are independent -- the rest are derived from those 3 (dependent claims). It's only the claims that are worth reading and worth worrying about. Press releases, abstracts and summaries are all irrelevant to what a patent actually covers. I find them more confusing than useful.
Let's concentrate on the 3 independent claims then. Here's the other 2:
- 15. A system for transferring information via the Internet, comprising:
  - first means for intercepting a message from an Internet user directed to a content provider address;
  - second means for determining whether or not the message is an information request;
  - third means for sending the message to the Internet without being affected if the message is not an information request;
  - fourth means for determining whether or not said information request relates to a content provider address having a corresponding alternative address, said alternative address providing at least part of the information provided at said content provider address; and
  - fifth means for directing said information request to said corresponding alternative address, if such a corresponding alternative address exists, or sending said information request to the Internet without being affected, if not.
- 36. A method for efficiently delivering cached information to Internet users, comprising the steps of:
  - intercepting a message from an Internet user directed to a content provider, the message requesting specific information;
  - determining whether or not the message relates to a content provider address having a corresponding alternative address, the corresponding alternative address providing at least part of the information provided at the content provider address;
  - determining whether or not the specific information is within the at least part of the information provided at the corresponding alternative address; and
  - providing the at least part of the information to the Internet user, if the specific information is within the at least part of the information, or sending the message to the Internet, if not.
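Claim 36 adds a step the other two lack: checking whether the specific item requested is among the content held at the alternative address, and delivering it if so. A rough sketch under that reading follows. The `CACHE` table, its contents and the `serve` function are invented for illustration and are not drawn from the patent text.

```python
from typing import Dict, Optional

# Hypothetical store: content provider address -> {path: cached bytes}
CACHE: Dict[str, Dict[str, bytes]] = {
    "192.0.2.10": {"/index.html": b"<html>cached copy</html>"},
}


def serve(dest_addr: str, path: str,
          cache: Dict[str, Dict[str, bytes]] = CACHE) -> Optional[bytes]:
    """Return cached content, or None meaning 'send the message to the Internet'."""
    site = cache.get(dest_addr)
    if site is None:
        return None        # no alternative address for this provider
    return site.get(path)  # None if this specific item is not held
```

The extra per-item lookup is the visible difference from claim 1, which only decides where to direct the request.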
As you can see, the differences between these claims are very subtle. I'd need to spend more time reading those claims to understand
Re:akamai overseas ? by Groote+Ka · 2003-06-26 01:07 · Score: 2, Informative

My experience is that quite some US companies only file patent applications in the US. On the other hand, Japanese and European companies file at home AND in the US.
Guess who's really laughing...
Mirror Image Internet, Inc., since they were wise enough to file almost everywhere, contrary to quite some others... Go to the Espacenet, the European Patent Office search database and search for Mirror Image Internet as applicant.
The fat lady will be singing for quite a while in this case.
Mirror Image's original caching service vs. CDN by kriegsman · 2003-06-26 01:14 · Score: 5, Informative

It looks to me like Mirror Image's original "transparent supercache" system is what's described in this newest patent (not so much their Content Delivery Network). The patent looks like it's fairly broadly worded, and probably covers some similar models too, but on the other hand, they cite plenty of prior art in their own patent. So overall I would guess that "ordinary" transparent caching is not covered by this patent, but then again IANAL, and in particular IANAPA.

Mirror Image's original business plan was to provide a client-side supercache service to client-side ISPs in places where upstream bandwidth was scarce/expensive (ie, Europe in the 90s). MII would 'mirror' popular high traffic (American) content onto supercaches located just a few hops from the ISPs. ISPs subscribing to the MII service could then configure their proxies to do a "look aside" and access popular content from the local MII supercache rather than have to send requests across the ocean and pull the content all the way back. It worked nicely for ISPs that needed it, but there were fewer and fewer client-side ISPs willing to pay for access to the MII supercaches. So MII expanded into the server-side part of the caching business: "Content Delivery Networks".

In 2001, MII bought an existing CDN technology company (Clearway Technologies) and in the process acquired a nifty server-side software agent (your choice of Apache module or IIS plug-in) that automatically "Mirrorizes" *coughcoughlikeAkamizescough* all of the output from an origin Web server, so getting your server's content onto the MII CDN only takes a couple of minutes and you don't have to alter any of your Web content. That agent and its associated methods are covered by the other patents mentioned in MII's press release.

Personally, I believe that if MII wanted to sue Akamai for patent infringement, they probably could make a case for it these days, but --as always-- it's unclear that that would be the best use of their resources.

-Mark Kriegsman
Former Chief Scientist, Mirror Image Internet;
Founder, Clearway Technologies;
Inventor, US Patents 5,991,809, 6,370,580 and 6,480,893 (now assigned to MII)
Lots of prior art .. by LionsFate · 2003-06-26 01:42 · Score: 2, Informative

Ok, I'm not anywhere familiar with reading patents, but as far as I can guess, we have plenty of prior art.

From reading the basics of it, and having almost gone into convulsions for attempting to understand it, here's what I can gather.

Re-directing a user to an "alternate address" is covered. So it doesn't have to be transparent in the proxy sense, the client can be re-directed.

We all know CPAN, right?

CPAN redirects you to a mirror automatically. Thus CPAN is covered by this patent, if I read correctly that redirection is considered 'transparent'. CPAN also had a 'local copy' that you may have been redirected to. Further making it appear to be more of a 'proxy'. CPAN was created in 1995, two years prior to this patent.

There are hundreds of other sites that were using this method prior to that, all prior to the patent.

AOL uses proxies, as do many countries (China anyone?); anyone know when they were first set up?
Re:squid by cait56 · 2003-06-26 02:34 · Score: 5, Informative
I believe the patent predates Squid, so there could be a problem to whatever degree that Squid infringes. Just because a later developer is open source does not mean that the original claim was invalid.

However, reading the patent carefully, you realize that it actually only describes a very specific solution. Specific enough that it truly is describing a solution, not a problem. And specific enough that it might legitimately be considered novel for the time it was filed (I really don't have time to search the source code of all proxy servers in the 1996 time frame -- let someone with a financial stake do that).

Specifically the patent deals with websites that are identified by their IP Address and where certain content (by default all) is held in an alternate (and presumably closer) server.

There is nothing in this patent about determining if the content is fresh. The description presumes that the cached copies were pushed by the server.

So this would only seem to apply to proxy servers that are transparent to the user, but not to the servers. The proxy servers that are of most interest to an ISP would either be transparent to the server as well, or more of an akamai style strategy where the first-response page is localized to directly fetch pre-positioned material from edge caches.

Interestingly, the patent seems to be worded to cover a single box which handles both the intercept and the decision to proxy, but does not handle the actual proxy response. A firewall transparently redirecting a port to a proxy server is prior art. The basic claim to being novel here is that the client does not have to be configured to use the proxy, and diversions only take place if certain content is requested, non-proxied sites are passed through "unaffected" (which is a false claim, BTW, which I'll deal with in a moment).

There are some serious omissions in the description, which could undermine its enforceability.
- It speaks about identifying "requests" and forwarding those that are not "web requests" to their original destination "unmodified". It fails to disclose that TCP does not naturally delimit "requests", and that identification of a complete "web request" is a complex matter.
- It does not disclose that "other requests" are not amenable to the same parsing algorithms as for "web requests", and that in fact they must be dealt with at another protocol layer.
- It does not disclose that there can only be a single "request" per session, and a single "reply" from either the original source or the alternate. Specifically there is no disclosure on how to splice responses, which it obviously does not do, or on the lifespan of a session that makes the short-version possible.
Perhaps most importantly, the invention described here is working as an application level gateway. It is incapable of quickly identifying TCP connections that do not require proxying and leaving those connections truly unaltered. Terminating a TCP connection, examining the first request in it, and then deciding to actually forward the request to the real server is not "transparent".
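
A small sketch shows why spotting a "web request" on a TCP stream is not trivial. TCP hands over bytes in arbitrary chunks, so a gateway has to buffer until it can tell whether it is looking at a complete HTTP request head. The `RequestSniffer` class, its method list and its 8192-byte limit are assumptions made for this example. It is not how any particular product works.

```python
from typing import Optional

REQUEST_PREFIXES = (b"GET ", b"HEAD ", b"POST ")  # HTTP/1.0 methods
HEADER_END = b"\r\n\r\n"


class RequestSniffer:
    """Buffers bytes from one TCP connection until a verdict is possible."""

    def __init__(self, max_bytes: int = 8192):
        self.buf = bytearray()
        self.max_bytes = max_bytes

    def feed(self, chunk: bytes) -> Optional[bool]:
        """Add bytes from one read.

        Returns True for a complete HTTP request head, False when the
        stream cannot be HTTP, None when more bytes are needed.
        """
        self.buf += chunk
        data = bytes(self.buf)
        # Does what we have so far still match the start of any method?
        plausible = any(
            data[: min(len(data), len(p))] == p[: min(len(data), len(p))]
            for p in REQUEST_PREFIXES
        )
        if not plausible:
            return False
        if HEADER_END in data:
            return True
        if len(data) > self.max_bytes:
            return False  # give up on oversized or never-ending heads
        return None
```

By the time `feed` returns anything other than None, the gateway has already accepted the client's connection and read from it, which is the sense in which the handling is not truly "unaffected".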

The "preferred embodiment" either a) deferred establishing the connection until the "true source" was known (clearly unacceptable: what if the "true source" is not accepting connections?), or b) established the connection and then aborted it once the decision to substitute was made.

The implications are not discussed or disclosed. Which isn't surprising, because this patent describes techniques that only work for HTTP 1.0.

Caching for HTTP 1.1 is a new problem. You have to deal with caching hints, persistent connections, cookies that might affect the material supplied, etc.
Re:squid by yandros · 2003-06-26 03:23 · Score: 2, Informative

They claim to have designed their system in 1996, so your 1998 practices are unlikely to have much of an effect.

That said, I believe that numerous cases of prior art exist. I don't know if anyone will actually pursue such a claim, since doing so can be difficult and time-consuming.
Re:GPL'd patents by JoeBuck · 2003-06-26 04:36 · Score: 2, Informative
This is already being done. Here are some examples:
- IBM and Rice University have both licensed patents having to do with register allocation, so that GCC can use them.
- IBM has licensed its RCU patent, which is used by the Linux kernel (this is a case where SCO is claiming ownership of the technology even though IBM owns the patent!)
- Raph Levien, of Advogato and Ghostscript fame, has licensed a whole series of patents he holds with respect to printing technology for use in GPLed code.
In all cases, these patents are free to use by any GPLed software, but not by non-copylefted free software.