If you want to know whether your own page is being modified in flight, we (the authors of the study) have an open source toolkit for adding a "web tripwire" to your page. It's just a piece of JavaScript code that runs an integrity check within the user's browser and can report any in-flight changes back to your server.
The toolkit requires you to run CGI scripts on your server to collect results, but we also have a web tripwire service that is easier to use (available on the same page above). Just add one line of JavaScript to your page, and our server will handle the integrity check and collect the results. We can then provide you with reports of the changes, much like Google Analytics.
We hope that by spreading web tripwires to other pages, we can at least deter ISPs from making further changes to web pages in-flight.
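To give a rough idea of what the integrity check in a web tripwire involves, here is a minimal sketch. It is not the toolkit's actual code: the function names `findModification` and `buildReport` and the report format are made up for illustration, and it assumes the script already has both a known-good copy of the page and the copy the browser received.

```javascript
// Hypothetical helper (not from the toolkit): compare the HTML the browser
// received against a known-good copy and describe the first difference.
function findModification(expected, received) {
  if (expected === received) {
    return null; // page arrived intact
  }
  // Walk forward to the first character that differs.
  const limit = Math.min(expected.length, received.length);
  let offset = 0;
  while (offset < limit && expected[offset] === received[offset]) {
    offset += 1;
  }
  return {
    offset: offset,
    expectedLength: expected.length,
    receivedLength: received.length,
    // A short excerpt of what actually arrived, for the report.
    excerpt: received.slice(offset, offset + 80),
  };
}

// Hypothetical report format: a JSON body a tripwire could send to the server.
function buildReport(pageUrl, modification) {
  return JSON.stringify({
    page: pageUrl,
    modified: modification !== null,
    detail: modification,
  });
}
```

For example, `findModification("<p>hi</p>", "<p>hi</p><script src=ad.js></script>")` returns an object whose `offset` points at the start of the injected script, and `buildReport` turns that into something a collection script could log.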
Actually, our test page happens to answer these questions, to some extent.
All of our test pages are marked with "Pragma: no-cache" and "Cache-Control: no-cache" in the HTTP response headers, but we're observing changes to the pages anyway.
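For anyone who wants to mark their own pages the same way, here is one way it could look. This is only a sketch and says nothing about how our server is set up: it assumes a Node-style response object with a `setHeader(name, value)` method, and `markNoCache` is a made-up name.

```javascript
// Hypothetical helper: add the two no-cache headers to a response.
// `res` is assumed to be any object with a setHeader(name, value) method,
// such as Node's http.ServerResponse.
function markNoCache(res) {
  res.setHeader("Pragma", "no-cache");        // HTTP/1.0-era directive
  res.setHeader("Cache-Control", "no-cache"); // HTTP/1.1 directive
  return res;
}
```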
Our integrity checking mechanism uses AJAX requests (XMLHttpRequest) to fetch the test page. ISPs can't distinguish between an AJAX request and a normal page request (i.e., they both look like normal HTTP requests), so they inject ads into both. However, we're only asking for a normal HTML file with the AJAX request, so I can't comment on whether they would also modify XML or other types of data.
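Roughly, a check like the one described above could look like the following. Treat it as a sketch under stated assumptions rather than our actual script: `checkPage` and its result shape are made up, the expected HTML is assumed to be embedded in the script already, and the `XhrCtor` parameter exists only so the function can be exercised outside a browser.

```javascript
// Hypothetical tripwire check: fetch `url` with XMLHttpRequest and compare
// the response body with `expectedHtml`. `XhrCtor` is an assumption added
// for testability; in a browser it falls back to the built-in constructor.
function checkPage(url, expectedHtml, onResult, XhrCtor) {
  const Ctor = XhrCtor || globalThis.XMLHttpRequest;
  const xhr = new Ctor();
  xhr.open("GET", url, true); // plain asynchronous GET
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) {
      return; // 4 means the request is done
    }
    if (xhr.status !== 200) {
      onResult({ ok: false, error: "HTTP status " + xhr.status });
      return;
    }
    // Any difference from the expected body counts as a modification.
    onResult({ ok: true, modified: xhr.responseText !== expectedHtml });
  };
  xhr.send();
}
```

Note that XMLHttpRequest is subject to the browser's same-origin policy, so the page being fetched would normally need to live on the same origin as the page running the check.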
We do have a set of scripts that we intend to make available as an integrity checking tool for others to easily use on their websites. We'll be refining them based on what we learn from this experiment, and we'll probably use some randomization to make it harder to detect the "tripwire."
We'll make them available in the not too distant future.
Charlie