Transparent Web Caching Patented
JohnQPublic writes "BIND author and all-around Internet personality Paul Vixie and Mirror Image Internet have recently received US patent 6,581,090, specifically '..technology that efficiently stores and retrieves content requests and balances Web traffic between origin servers to improve performance and speed' - sounds an awful lot like what Akamai do. There's a press release from last week that gives some lovely 'details', including this little gem from CEO Alexander M. Vik: 'We anticipate that these patents and our technology solutions will encourage large groups of corporations to become customers of Mirror Image services. We also recognize that this technology is a critical component of other content delivery services and we'll be attempting to work cooperatively with our competitors and their customers to address this issue.' Can you say 'patent infringement suit'?"
The patent
This is an excellent example of why software patents are bad
Right now, if you are a European citizen, like I am, then write to your Member of the European Parliament (MEP), and tell them that you think software patents are a bad thing, and that they should vote against them on June 30th.
The forthcoming European vote was covered here on slashdot a few days ago, but did not make the front page, so did not get much coverage.
You can find a list of European Members of Parliament here. To avoid annoying them, do write to your MEP, not to a party leader. If you have several, please take a look at which issues they cover, and choose the one that takes an interest in trade/technology etc.
Remember, Write NOW! we don't want this sort of cr*p in
Having worked with Mirror Image I have to say that the way Mirror Image is doing the caching differs strongly from Akamai's.
While Akamai is putting cache servers in many IP providers' locations (I think more than 5000 so far), Mirror Image is concentrating its caches in about 20 locations connected to the big exchange and peering points. The Mirror Image presenters were explicitly stressing this point and that this other approach is the key to Mirror Image's success. So I guess the patent covers the Mirror Image Way Of Doing Things rather than the idea to cache websites to speed up transfer rates.
The generally accepted term for this type of technology is "Content Distribution Networking" or "Content Delivery Networking". Akamai, Speedera, Digital Island etc. are Content Distribution companies which will (according to the necessary commercial agreements), take a customer's content and distribute it around their overlay CDNs. Generally speaking, these CDNs overlay the traditional Internet using co-located space in customer or exchange point datacentres. There are, however, some CDN organisations who take the approach of building their own infrastructure.
"Transparent Web Caching" on the other hand is generally a term applied to the transparent redirection of TCP port 80 IP traffic on access equipment through a set of HTTP proxy devices. This technique is used by many ISPs to force users to use their Webcaches even if the user thinks they are being clever by disabling the pre-defined HTTP Proxy settings in their Web browser.
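To make the distinction concrete, here is a minimal Python sketch of the decision a transparent-caching access device makes for each new TCP connection. The names `PROXY_POOL` and `next_hop` and the addresses are hypothetical, made up for illustration; real deployments do this in router or firewall policy rather than in application code.

```python
import zlib

# Hypothetical pool of HTTP proxy devices run by the ISP (illustrative addresses).
PROXY_POOL = [("10.0.0.10", 3128), ("10.0.0.11", 3128)]

def next_hop(dst_ip: str, dst_port: int) -> tuple[str, int]:
    """Return where the access device should send a TCP connection.

    Port 80 traffic is diverted to a proxy regardless of the browser's
    proxy settings; everything else goes to its real destination.
    """
    if dst_port == 80:
        # Hash the destination so the same site always lands on the same cache.
        index = zlib.crc32(dst_ip.encode()) % len(PROXY_POOL)
        return PROXY_POOL[index]
    return (dst_ip, dst_port)
```

The user never sees this choice being made, which is why changing the browser's proxy settings has no effect on it.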
Until recently, you could build your own CDN ($$$) using software from people such as Inktomi, and you can still use devices from other manufacturers such as Network Appliance or Cisco Systems.
From the article: "Mirror Image developed the transparent Web caching patent in 1996"
From Mirror Image's "About Us":
1997: Mirror Image Internet Inc. is founded.
The earliest date on the Patent itself is September 30, 1997.
IIRC Squid also was around in '97.
The exact dates will be interesting.
They might want to watch out, because from what I understand AOL has the world's largest internet cache system (all running Linux, actually). And I'd bet that it's been in place since before 1996.
You think computers are the only area of patent crapola? Try looking up "multiplication" in the USPTO website thingy... You'll find tons of patents for blatantly obvious [to a math nerd] algorithms [I've even seen Karatsuba's 1962 multiplication algorithm patented].
I imagine the same shit happens in other fields.
The problem with patents isn't the law. Isn't the idea of patents. It's the enforcement. Too many people filing too many patents has caused the patent office to stop caring whether the patent is valid or not.
What I think would be fun though is up to a $100K fine for patents which can be proven to be blatant rip-offs, fakes or incompletes [e.g. patents on things not yet invented fully just to stifle competitors].
Then you will see companies like this really feel some pain.
To make it even more fun, whoever can prove the patent is a ripoff gets 10% of the fine. Make it a sport for the average citizen!
Tom
Someday, I'll have a real sig.
Oops is a more than worthy alternative, that was developed outside the US. I'm not sure how patent law applies in such a situation.
--Lawrence Lessig for Congress!
The patent is at Delphion (free registration required) and the USPTO. Paul Vixie is listed as an inventor but probably has no ownership rights, or even the ability to collect on royalties. So don't lynch him yet...
The first base (or independent) claim is:
Doesn't sound much like my understanding of how Akamai works (I didn't think Akamai "intercepted" requests -- the origin servers actually pointed to the cache servers in their img src tags). It does sound an awful lot like a transparent proxy however.
There's 36 claims, but only 3 are independent -- the rest are derived from those 3 (dependent claims). It's only the claims that are worth reading and worth worrying about. Press releases, abstracts and summaries are all irrelevant to what a patent actually covers. I find them more confusing than useful.
Let's concentrate on the 3 independent claims then. Here's the other 2:
15. A system for transferring information via the Internet, comprising:
36. A method for efficiently delivering cached information to Internet users, comprising the steps of:
As you can see, the differences between these claims are very subtle. I'd need to spend more time reading those claims to understand
Guess who's really laughing...
Mirror Image Internet, Inc., since they were wise enough to file almost everywhere, contrary to quite some others... Go to the Espacenet, the European Patent Office search database and search for Mirror Image Internet as applicant.
The fat lady will be singing for quite a while in this case.
It looks to me like Mirror Image's original "transparent supercache" system is what's described in this newest patent (not so much their Content Delivery Network). The patent looks like it's fairly broadly worded, and probably covers some similar models too, but on the other hand, they cite plenty of prior art in their own patent. So overall I would guess that "ordinary" transparent caching is not covered by this patent, but then again IANAL, and in particular IANAPA.
Mirror Image's original business plan was to provide a client-side supercache service to client-side ISPs in places where upstream bandwidth was scarce/expensive (i.e., Europe in the 90s). MII would 'mirror' popular high traffic (American) content onto supercaches located just a few hops from the ISPs. ISPs subscribing to the MII service could then configure their proxies to do a "look aside" and access popular content from the local MII supercache rather than have to send requests across the ocean and pull the content all the way back. It worked nicely for ISPs that needed it, but there were fewer and fewer client-side ISPs willing to pay for access to the MII supercaches. So MII expanded into the server-side part of the caching business: "Content Delivery Networks".
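For readers who have not configured a "look aside", a rough Python sketch of the decision an ISP proxy makes on a local miss follows. The host names, the `MIRRORED_HOSTS` set and the `choose_upstream` function are all assumptions made up for illustration; this is not MII's actual configuration or software.

```python
# Hypothetical set of hostnames the supercache has mirrored (illustrative only).
MIRRORED_HOSTS = {"www.example.com", "downloads.example.org"}

# Assumed name of the nearby supercache, a few hops from the ISP.
SUPERCACHE = "supercache.example.net"

def choose_upstream(requested_host: str) -> str:
    """Pick the host an ISP proxy should fetch from on a local cache miss."""
    if requested_host.lower() in MIRRORED_HOSTS:
        # Popular content: fetch from the nearby supercache.
        return SUPERCACHE
    # Everything else: go to the origin server as usual.
    return requested_host
```

Only requests for mirrored hosts take the short path; the rest still cross the expensive upstream link.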
In 2001, MII bought an existing CDN technology company (Clearway Technologies) and in the process acquired a nifty server-side software agent (your choice of Apache module or IIS plug-in) that automatically "Mirrorizes" *coughcoughlikeAkamizescough* all of the output from an origin Web server, so getting your server's content onto the MII CDN only takes a couple of minutes and you don't have to alter any of your Web content. That agent and its associated methods are covered by the other patents mentioned in MII's press release.
Personally, I believe that if MII wanted to sue Akamai for patent infringement, they probably could make a case for it these days, but --as always-- it's unclear that that would be the best use of their resources.
-Mark Kriegsman
Former Chief Scientist, Mirror Image Internet;
Founder, Clearway Technologies;
Inventor, US Patents 5,991,809, 6,370,580 and 6,480,893 (now assigned to MII)
Ok, I'm nowhere near familiar with reading patents, but as far as I can guess, we have plenty of prior art.
From reading the basics of it, and having almost gone into convulsions from attempting to understand it, here's what I can gather.
Re-directing a user to an "alternate address" is covered. So it doesn't have to be transparent in the proxy sense, the client can be re-directed.
We all know CPAN, right?
CPAN redirects you to a mirror automatically. Thus CPAN is covered by this patent, if I read correctly that redirection is considered 'transparent'. CPAN also had a 'local copy' that you may have been redirected to. Further making it appear to be more of a 'proxy'. CPAN was created in 1995, two years prior to this patent.
There are hundreds of other sites that were using this method, all prior to the patent.
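The kind of redirection described here can be sketched in a few lines of Python. This is only an illustration of the general redirect-to-a-mirror technique: the mirror URLs, region codes and the `redirect_response` function are hypothetical, and it is not a description of how CPAN actually selects mirrors.

```python
# Hypothetical mirror list keyed by a coarse region code; not CPAN's real data.
MIRRORS = {
    "eu": "http://mirror-eu.example.org/CPAN",
    "us": "http://mirror-us.example.org/CPAN",
}
DEFAULT_REGION = "us"

def redirect_response(path: str, region: str) -> str:
    """Build an HTTP/1.0 302 response pointing the client at a mirror."""
    # Unknown regions fall back to the default mirror.
    base = MIRRORS.get(region, MIRRORS[DEFAULT_REGION])
    location = base + "/" + path.lstrip("/")
    return f"HTTP/1.0 302 Moved Temporarily\r\nLocation: {location}\r\n\r\n"
```

Note that the client sees the redirect and makes a second request itself, which is the difference from a proxy that silently answers on the origin's behalf.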
AOL uses proxies, as do many countries (China anyone?). Anyone know when they were first set up?
I believe the patent predates Squid, so there could be a problem to whatever degree that Squid infringes. Just because a later developer is open source does not mean that the original claim was invalid.
However, reading the patent carefully, you realize that it actually only describes a very specific solution. Specific enough that it truly is describing a solution, not a problem. And specific enough that it might legitimately be considered novel for the time it was filed (I really don't have time to search the source code of all proxy servers in the 1996 time frame -- let someone with a financial stake do that).
Specifically the patent deals with websites that are identified by their IP Address and where certain content (by default all) is held in an alternate (and presumably closer) server.
There is nothing in this patent about determining if the content is fresh. The description presumes that the cached copies were pushed by the server.
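As I read the description, the core lookup could be sketched roughly like this in Python. The table contents and the `alternate_for` function are hypothetical illustrations of my reading, not text from the patent; note there is deliberately no freshness check, matching the point above.

```python
from typing import Optional

# Hypothetical table: origin IP -> (alternate server, path prefixes held there).
# A prefix of "/" means "all content", the default case.
ALTERNATES = {
    "192.0.2.10": ("198.51.100.5", ["/"]),
    "192.0.2.20": ("198.51.100.6", ["/images/", "/downloads/"]),
}

def alternate_for(origin_ip: str, path: str) -> Optional[str]:
    """Return the alternate server for this request, or None to pass through."""
    entry = ALTERNATES.get(origin_ip)
    if entry is None:
        return None  # Site is not in the table: leave the request alone.
    server, prefixes = entry
    if any(path.startswith(prefix) for prefix in prefixes):
        return server  # Content is assumed to have been pushed here already.
    return None
```

For example, `alternate_for("192.0.2.20", "/index.html")` returns `None`, so that request would go to the real server.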
So this would only seem to apply to proxy servers that are transparent to the user, but not to the servers. The proxy servers that are of most interest to an ISP would either be transparent to the server as well, or more of an Akamai-style strategy where the first-response page is localized to directly fetch pre-positioned material from edge caches.
Interestingly, the patent seems to be worded to cover a single box which handles both the intercept and the decision to proxy, but does not handle the actual proxy response. A firewall transparently redirecting a port to a proxy server is prior art. The basic claim to being novel here is that the client does not have to be configured to use the proxy, and diversions only take place if certain content is requested, non-proxied sites are passed through "unaffected" (which is a false claim, BTW, which I'll deal with in a moment).
There are some serious omissions in the description, which could undermine its enforceability.
Perhaps most importantly, the invention described here is working as an application level gateway. It is incapable of quickly identifying TCP connections that do not require proxying and leaving those connections truly unaltered. Terminating a TCP connection, examining the first request in it, and then deciding to actually forward the request to the real server is not "transparent".
The "preferred embodiment" either a) deferred establishing the connection until the "true source" was known (clearly unacceptable: what if the "true source" is not accepting connections?), or b) established the connection, and then aborted it, once the decision to substitute was made.
The implications are not discussed or disclosed. Which isn't surprising, because this patent describes techniques that only work for HTTP 1.0
Caching for HTTP 1.1 is a new problem. You have to deal with caching hints, persistent connections, cookies that might affect the material supplied, etc.
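To give a feel for what "caching hints" means in practice, here is a deliberately simplified Python sketch of a shared cache deciding whether it may store a response. The function name `may_store_in_shared_cache` is made up, and this covers only a couple of `Cache-Control` directives plus a conservative cookie rule; a real HTTP/1.1 cache has to handle much more (revalidation, `Vary`, expiry and so on).

```python
def may_store_in_shared_cache(response_headers: dict[str, str]) -> bool:
    """Conservative check of a few HTTP/1.1 caching hints for a shared cache."""
    # Header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value for name, value in response_headers.items()}

    # Collect directive names from Cache-Control, ignoring any "=value" part.
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in headers.get("cache-control", "").split(",")
        if part.strip()
    }
    if directives & {"no-store", "private"}:
        return False

    # Assumption: treat a cookie being set as a sign the body is personalised.
    if "set-cookie" in headers:
        return False
    return True
```

None of these checks exist in a design that simply assumes content was pushed to the cache ahead of time.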
They claim to have designed their system in 1996, so your 1998 practices are unlikely to have much of an effect.
That said, I believe that numerous cases of prior art exist. I don't know if anyone will actually pursue such a claim, since doing so can be difficult and time-consuming.
This is already being done. Here are some examples:
In all cases, these patents are free to use by any GPLed software, but not by non-copylefted free software.