Thousands of Sites Wrongly Blocked
Ben Edelman writes: "In the context of the ACLU's pending challenge to the Children's Internet Protection Act (PDF), I recently prepared a list of some 6000+ web sites that, by and large, fail to meet the category definitions of popular Internet filtering programs yet are blocked by at least one such program. This topic may be old hat, but my work is new: I have prepared an unusually large list of sites (including police departments, libraries, home-schooling sites, candidates for political office, and on and on), and I have retested these sites over a period of several months."
Troller_Park_Trash, if you're already knowledgeable about the means of operation of filtering software, you may find that the most new & interesting part of the http://cyber.law.harvard.edu/people/edelman/mul-v-us/ page is the Appendices listing specific sites that have been, by and large, wrongly classified by filtering programs.
For example, http://cyber.law.harvard.edu/people/edelman/mul-v-us/index-subset.html ("Blocked Site Archives - Subset with Linked Pages - Appendix A") gives information about 395 such URLs. You'll likely find yourself surprised that many of these are blocked -- I know I was.
Regarding the blacking out of certain text from my report: As http://cyber.law.harvard.edu/people/edelman/mul-v-us/ mentions, a protective order (from the court in which the underlying case is pending) limits distribution of certain portions of my report -- namely anything I learned from reviewing confidential documents from filtering companies, or from attending confidential portions of depositions of their employees. But the work you, and most others here, are likely to find of greatest interest is the listings of specific sites blocked. (I'm presently adding a bit of text and formatting to help folks find this content more quickly and easily.)
Ben Edelman
[Originally sent to a mailing-list]
In honor of the censorware material just released by ACLU, I thought I'd try a little experiment in distributed verification.
I took one example from Edelman's report:
16. Southern Alberta Fly Fishing Outfitters #6809 /Regional/Countries/Canada/Business and Economy/Shopping and /Regional/North America/Canada/Alberta/Recreation and
http://www.albertaflyfish.com
Blocked by: N2H2 (Pornography - Sep 11, Oct 7), Websense (Sex - Jul 5, Aug 18, Sep 11)
Yahoo:
Services/Outdoors/Fishing/Fly Fishing/Lodges/
Google:
Sports/Fishing
Fly fishing in Alberta Canada on the world famous Bow River.
Now, what does censorware have against this site? Maybe it doesn't like too many 'Fly' references in one place? No, it turns out that this site has the misfortune to be virtually hosted and share an internet address with:
http://clubexoticx.com - Club Exoticx
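If you want to check the shared-address claim yourself, here is a minimal Python sketch. It is not the method used for the report, and `group_by_address` is a hypothetical helper name of my own. It assumes the hostnames still resolve in DNS, and it only shows which names share an IPv4 address; it says nothing about how any filter classifies them.

```python
import socket
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def group_by_address(
    hostnames: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, List[str]]:
    """Map each resolved IPv4 address to the hostnames that point at it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in hostnames:
        try:
            address = resolve(name)
        except OSError:
            # The name does not resolve (socket.gaierror is an OSError); skip it.
            continue
        groups[address].append(name)
    return dict(groups)
```

Calling `group_by_address(["albertaflyfish.com", "clubexoticx.com"])` performs live DNS lookups, so what it returns depends on where those names point today. Any address that maps to more than one hostname is a virtually hosted group. The `resolve` parameter exists only so a stand-in resolver can be passed when testing offline.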
There's a bunch of other completely innocuous sites suffering the same collective guilt of the censorware blacklist. I'd like people to go to N2H2's lookup, at http://database.n2h2.com/cgi-perl/catrpt.pl and *verify* this for themselves by testing the following sites:
http://albertaflyfish.com - Southern Alberta Fly Fishing Outfitters
http://alistairbrown.com - Alistair Brown Folksinger
http://eclothing.com - 'The Game Is On Sportswear Company Ltd.'
http://effectivemanagementsolutions.com - Effective Solutions
http://eligh.com - Springboard Consulting
http://eyepowered.com - E Y E P O W E R E D - 360 Degree Panoramas
http://friendlyfacesonline.com - Create personalized family cartoon
http://gear4pickups.com - Gear4Trucks: HitchHoist Portable Truck Crane,
http://informationonhold.com - Information On-Hold
http://letsmakewine.com - Let's Make Wine
http://planetregister.com - Planet Registe
http://ppt-slides.com - 35mm Slides from your computer file
http://proteach.net - Pro Teach Main Page - Baseball instruction
http://rosiedonovan.com - Rosie Donovan Photography
http://springboardtoinnovation.com - Springboard Consulting
Here, I'll make this easy. Just click these URLs:
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://albertaflyfish.com
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://alistairbrown.com
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://eclothing.com
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://effectivemanagementsolutions.com
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://eligh.com
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://eyepowered.com
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://friendlyfacesonline.com
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://gear4pickups.com
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://informationonhold.com
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://letsmakewine.com
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://planetregister.com
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://ppt-slides.com
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://proteach.net
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://rosiedonovan.com
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://springboardtoinnovation.com
You should get
The Site: [all sites above]
is categorized by N2H2 as:
Pornography
If there's some error-message text in a red font, that means the N2H2 program itself wasn't working; try again.
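For anyone who would rather generate the lookup links than click them one by one, here is a rough Python sketch. It assumes the query parameter is `req_URL` and that it takes a plain, unencoded http:// address, as the links in this post suggest. `lookup_url` and `lookup_urls` are hypothetical helper names, and the sketch only builds the strings; it does not contact N2H2.

```python
from typing import Iterable, List

# Address of the lookup script as given in the post.
LOOKUP_BASE = "http://database.n2h2.com/cgi-perl/catrpt.pl"

# The first few sites from the list in the post; extend with the rest as needed.
SITES = ["albertaflyfish.com", "alistairbrown.com", "eclothing.com"]


def lookup_url(site: str) -> str:
    """Build the lookup link for one bare domain name."""
    # Assumption: the parameter is req_URL and the value is not percent-encoded.
    return f"{LOOKUP_BASE}?req_URL=http://{site}"


def lookup_urls(sites: Iterable[str]) -> List[str]:
    """Build lookup links for every site, preserving the input order."""
    return [lookup_url(site) for site in sites]
```

Printing each entry of `lookup_urls(SITES)` gives one link per line, ready to paste into a browser.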
Now, since I've publicized this, I expect it'll be changed very rapidly for this one item. I have a saying: "Alacrity varies directly with publicity". But this is just one example in a HUGE blacklist. What else is lurking in there?
Sig: What Happened To The Censorware Project (censorware.org)
But it seems that someone else disagreed with me, and now it is categorized as 'satire'. Exactly how a site with such poor standards of journalistic integrity is allowed in that category amazes me.
I have now added adequacy.org to my junkbuster file, so I never have to see it again.
I've visited some of the sites on the two lists, and can attest that many of them are flagged properly. Bottom line? I don't think the author of the paper spidered the sites as well as the censorware did. As an example, it takes some work to find the pr0n links at mulletsgalore.com, but they're there.